Author: Chen, Liang-Ching; Chang, Kuei-Hu
Title: A novel corpus-based computing method for handling critical word-ranking issues: An example of COVID-19 research articles Cord-id: q7byywb8 Document date: 2021_3_11
ID: q7byywb8
Document: A corpus is a massive body of structured textual data that is stored and processed electronically. It is usually combined with statistics, machine learning algorithms, or artificial intelligence (AI) technologies to explore the semantic relationships between lexical units, and it is beneficial when applied to language learning, information processing, translation, and so forth. In the face of a novel disease such as COVID-19, establishing a medical-specific corpus will enhance frontline medical personnel's information acquisition efficiency, guiding them toward the right approaches to respond to and prevent the disease. To retrieve critical messages from the corpus effectively, handling word-ranking issues appropriately is crucial. However, traditional frequency-based approaches may introduce bias into word ranking because they neither optimize the corpus nor jointly take words' frequency dispersion and concentration criteria into consideration. Thus, this paper develops a novel corpus-based approach that combines corpus software with the Hirsch index (H-index) algorithm to handle both issues simultaneously, making word-ranking processes more accurate. This paper compiled 100 COVID-19-related research articles as an empirical example of the target corpus. To verify the proposed approach, this study compared its results with those of two traditional frequency-based approaches. The results indicate that the proposed approach can refine the corpus and simultaneously compute words' frequency dispersion and concentration criteria when handling word-ranking issues.
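The abstract does not spell out how the H-index is applied to words, so the following is only a minimal sketch under stated assumptions. It assumes that a word's H-index is the largest h such that the word occurs at least h times in at least h articles, that "optimizing the corpus" can be approximated by filtering a stop-word list, and that ties are broken by total frequency. The names `tokenize`, `h_index`, and `rank_words` are hypothetical and are not taken from the paper or from any corpus software.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def h_index(counts):
    """Return the largest h such that at least h values in counts are >= h."""
    h = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def rank_words(documents, stop_words=frozenset()):
    """Rank words by (H-index, total frequency), both descending.

    Assumption: the stop-word filter stands in for the corpus
    refinement step mentioned in the abstract.
    """
    per_doc = [
        Counter(tok for tok in tokenize(doc) if tok not in stop_words)
        for doc in documents
    ]
    vocabulary = set().union(*per_doc)
    scores = {}
    for word in vocabulary:
        # Per-article frequencies of the word, skipping articles without it.
        counts = [c[word] for c in per_doc if c[word] > 0]
        scores[word] = (h_index(counts), sum(counts))
    return sorted(scores.items(), key=lambda kv: (-kv[1][0], -kv[1][1], kv[0]))


if __name__ == "__main__":
    docs = [
        "virus virus virus patient",
        "virus virus patient patient",
        "the virus spreads",
        "vaccine vaccine vaccine vaccine vaccine vaccine",
    ]
    for word, (h, total) in rank_words(docs, stop_words={"the"}):
        print(word, h, total)
```

In this toy corpus "virus" and "vaccine" both occur six times, so a purely frequency-based ranking cannot separate them. Under the assumed H-index scoring, "virus" ranks higher because its occurrences are spread over several articles, whereas all occurrences of "vaccine" are concentrated in one.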