Author: Lu, Hsin-Min; Chen, Hsinchun; Zeng, Daniel; King, Chwan-Chuen; Shih, Fuh-Yuan; Wu, Tsung-Shu; Hsiao, Jin-Yi
Title: Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints
Cord-id: ngow9w7p
Document date: 2008_10_5
ID: ngow9w7p
Document:
PURPOSE: Syndromic surveillance is aimed at early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which may be recorded in different languages. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories to facilitate subsequent data aggregation and analysis. Despite the fact that syndromic surveillance is largely an international effort, existing CC classification systems do not provide adequate support for processing CCs recorded in non-English languages. This paper reports a multilingual CC classification effort, focusing on CCs recorded in Chinese.
METHODS: We propose a novel Chinese CC classification system leveraging a Chinese-English translation module and an existing English CC classification approach. A set of 470 Chinese key phrases was extracted from about one million Chinese CC records using statistical methods. Based on the extracted key phrases, the system translates Chinese text into English and classifies the translated CCs to syndromic categories using an existing English CC classification system.
RESULTS: Compared to alternative approaches using a bilingual dictionary and a general-purpose machine translation system, our approach performs significantly better in terms of positive predictive value (PPV or precision), sensitivity (recall), specificity, and F measure (the harmonic mean of PPV and sensitivity), based on a computational experiment using real-world CC records.
CONCLUSIONS: Our design provides satisfactory performance in classifying Chinese CCs into syndromic categories for public health surveillance. The overall design of our system also points out a potentially fruitful direction for multilingual CC systems that need to handle languages beyond English and Chinese.
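The four evaluation measures named in RESULTS can be made concrete with a short sketch. This is an illustration of the standard definitions only, not the authors' evaluation code: the `Counts` class, the function names, and the toy records are all hypothetical, and each CC record is assumed to carry a set of syndrome labels.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Counts:
    """Confusion-matrix counts for one syndromic category (hypothetical helper)."""
    tp: int  # records correctly assigned to the syndrome
    fp: int  # records wrongly assigned to the syndrome
    tn: int  # records correctly left out of the syndrome
    fn: int  # records that belong to the syndrome but were missed


def count_outcomes(gold: List[Set[str]], predicted: List[Set[str]], syndrome: str) -> Counts:
    """Compare reference and predicted label sets, record by record."""
    tp = fp = tn = fn = 0
    for g, p in zip(gold, predicted):
        in_gold, in_pred = syndrome in g, syndrome in p
        if in_gold and in_pred:
            tp += 1
        elif in_pred:
            fp += 1
        elif in_gold:
            fn += 1
        else:
            tn += 1
    return Counts(tp, fp, tn, fn)


def _ratio(num: int, den: int) -> float:
    # Return 0.0 for an empty denominator instead of raising.
    return num / den if den else 0.0


def ppv(c: Counts) -> float:
    """Positive predictive value (precision): TP / (TP + FP)."""
    return _ratio(c.tp, c.tp + c.fp)


def sensitivity(c: Counts) -> float:
    """Sensitivity (recall): TP / (TP + FN)."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Counts) -> float:
    """Specificity: TN / (TN + FP)."""
    return _ratio(c.tn, c.tn + c.fp)


def f_measure(c: Counts) -> float:
    """Harmonic mean of PPV and sensitivity."""
    p, r = ppv(c), sensitivity(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Toy data, made up for illustration; not taken from the paper.
gold = [{"respiratory"}, {"gastrointestinal"}, {"respiratory"}, set()]
predicted = [{"respiratory"}, {"respiratory"}, set(), set()]
counts = count_outcomes(gold, predicted, "respiratory")
```

Computing the counts per syndromic category, as above, keeps the measures comparable across categories of very different prevalence; how the paper aggregates across categories is not stated in the abstract.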