Author: Campillos-Llanos, Leonardo; Valverde-Mateos, Ana; Capllonch-Carrión, Adrián; Moreno-Sandoval, Antonio

Title: A clinical trials corpus annotated with UMLS entities to enhance the access to evidence-based medicine

Cord-id: 71k56tq8

Document date: 2021-02-22

Document: BACKGROUND: The large volume of medical literature makes it difficult for healthcare professionals to keep abreast of the latest studies that support Evidence-Based Medicine. Natural language processing enhances access to relevant information, and gold standard corpora are required to improve systems. To contribute a new dataset for this domain, we collected the Clinical Trials for Evidence-Based Medicine in Spanish (CT-EBM-SP) corpus. METHODS: We annotated 1200 texts about clinical trials with entities from the Unified Medical Language System semantic groups: anatomy (ANAT), pharmacological and chemical substances (CHEM), pathologies (DISO), and lab tests, diagnostic or therapeutic procedures (PROC). We doubly annotated 10% of the corpus and measured inter-annotator agreement (IAA) using F-measure. As a use case, we ran medical entity recognition experiments with neural network models. RESULTS: This resource contains 500 abstracts of journal articles about clinical trials and 700 announcements of trial protocols (292 173 tokens). We annotated 46 699 entities (13.98% are nested entities). Regarding IAA, we obtained an average F-measure of 85.65% (±4.79, strict match) and 93.94% (±3.31, relaxed match). In the use case experiments, we achieved recognition results ranging from 80.28% (±00.99) to 86.74% (±00.19) of average F-measure. CONCLUSIONS: Our results show that this resource is adequate for experiments with state-of-the-art approaches to biomedical named entity recognition. It is freely distributed at: http://www.lllf.uam.es/ESP/nlpmedterm_en.html. The methods are generalizable to other languages with similar available sources.
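
Illustration: the abstract reports IAA as an F-measure under strict and relaxed matching but does not define either criterion. The sketch below shows one common way such a pairwise score could be computed. It assumes that "strict" means identical character offsets and label, and that "relaxed" means overlapping offsets with the same label; these definitions, the `Entity` type, and the function names are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, NamedTuple


class Entity(NamedTuple):
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive
    label: str   # e.g. "ANAT", "CHEM", "DISO", "PROC"


def strict_match(a: Entity, b: Entity) -> bool:
    # Assumed definition: same boundaries and same label.
    return a == b


def relaxed_match(a: Entity, b: Entity) -> bool:
    # Assumed definition: same label and any character overlap.
    return a.label == b.label and a.start < b.end and b.start < a.end


def pairwise_f_measure(
    ann_a: List[Entity],
    ann_b: List[Entity],
    match: Callable[[Entity, Entity], bool],
) -> float:
    """F-measure treating annotator A as reference and B as response."""
    if not ann_a or not ann_b:
        return 0.0
    # Precision: share of B's entities that match some entity from A.
    precision = sum(any(match(a, b) for a in ann_a) for b in ann_b) / len(ann_b)
    # Recall: share of A's entities that match some entity from B.
    recall = sum(any(match(a, b) for b in ann_b) for a in ann_a) / len(ann_a)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy annotations (invented for illustration, not corpus data).
annotator_a = [Entity(0, 9, "CHEM"), Entity(20, 35, "DISO"), Entity(40, 48, "PROC")]
annotator_b = [Entity(0, 9, "CHEM"), Entity(22, 35, "DISO"), Entity(50, 55, "ANAT")]

strict_f = pairwise_f_measure(annotator_a, annotator_b, strict_match)
relaxed_f = pairwise_f_measure(annotator_a, annotator_b, relaxed_match)
print(f"strict F = {strict_f:.4f}, relaxed F = {relaxed_f:.4f}")
```

In the toy data the second entity differs only in its start offset, so it counts under the relaxed criterion but not the strict one. This is the kind of boundary disagreement that would make a relaxed score higher than a strict score, as in the figures reported in the abstract.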
