Author: Funkner, Anastasia A.; Kovalchuk, Sergey V.
Title: Time Expressions Identification Without Human-Labeled Corpus for Clinical Text Mining in Russian
Cord-id: kkzj7nvc
Document date: 2020_5_23
ID: kkzj7nvc
Document: Accurate predictive models in medicine require complete and relevant information about the patient. We propose an approach for extracting temporal expressions from unlabeled natural language texts. The approach can serve as a first analysis of a corpus, as the first stage of data labeling, or as a way to obtain linguistic constructions for a rule-based information retrieval system. The method applies several machine learning and natural language processing steps in sequence: classification of sentences, transformation of bag-of-words frequencies, clustering of sentences with time expressions, classification of new data into clusters, and construction of sentence profiles using feature importances. With this method, we derive a list of the most frequent time expressions and extract events and/or time events for 9801 sentences of anamnesis in Russian. The proposed approach is independent of the corpus language and can be used for other tasks, for example, identifying the experiencer of a disease.
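The middle steps of the pipeline (frequency transformation, clustering, assignment of new sentences, and cluster profiles) can be illustrated with a minimal standard-library sketch. Everything here is an assumption made for illustration: the toy English sentences, the function names, the smoothed TF-IDF weighting, and the seeded k-means are not taken from the paper. The initial sentence classifier is omitted, and the profile step ranks centroid weights as a simple stand-in for the classifier feature importances that the abstract mentions.

```python
import math
from collections import Counter

def tokenize(sentence):
    # Whitespace tokenizer; a real system would use a Russian-aware tokenizer.
    return sentence.lower().split()

def vectorize(counts, vocab, idf):
    # Weight raw counts by IDF, then L2-normalize; unknown tokens are ignored.
    vec = [counts.get(t, 0) * idf[t] for t in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def tfidf_vectors(sentences):
    # Bag-of-words counts transformed into smoothed TF-IDF vectors.
    docs = [Counter(tokenize(s)) for s in sentences]
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + sum(t in d for d in docs))) + 1
           for t in vocab}
    return vocab, idf, [vectorize(d, vocab, idf) for d in docs]

def nearest(vec, centroids):
    # Index of the centroid with the highest dot-product similarity.
    sims = [sum(a * b for a, b in zip(vec, c)) for c in centroids]
    return max(range(len(centroids)), key=sims.__getitem__)

def kmeans(vectors, seeds, iterations=10):
    # Plain k-means with fixed seed indices so the result is deterministic.
    centroids = [vectors[i][:] for i in seeds]
    labels = []
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for k in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == k]
            if members:
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids

def profile(centroid, vocab, top=2):
    # Highest-weighted terms of a cluster (stand-in for feature importances).
    ranked = sorted(zip(vocab, centroid), key=lambda p: -p[1])
    return [t for t, _ in ranked[:top]]

# Hypothetical sentences assumed to have passed the time-expression classifier.
sentences = [
    "hypertension diagnosed 5 years ago",
    "infarction 3 years ago",
    "stent placed in 2015",
    "surgery in 2012",
]
vocab, idf, vectors = tfidf_vectors(sentences)
labels, centroids = kmeans(vectors, seeds=[0, 2])

# Assign an unseen sentence to an existing cluster.
new_vec = vectorize(Counter(tokenize("stroke 2 years ago")), vocab, idf)
new_label = nearest(new_vec, centroids)

print(labels, new_label, profile(centroids[0], vocab))
```

In this toy setting the cluster profiles surface the recurring temporal construction of each group, which is the kind of output that could seed a rule-based extractor.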