Author: Mengying Dong; Xiaojun Cao; Mingbiao Liang; Lijuan Li; Huiying Liang; Guangjian Liu
Title: Understand Research Hotspots Surrounding COVID-19 and Other Coronavirus Infections Using Topic Modeling
  • Document date: 2020_3_30
  • ID: 3wuh6k6g_19
    Document: We built a topic model using latent Dirichlet allocation (LDA) (20, 21) from the article abstracts in the corpus. LDA models each document as a random mixture over latent topics, with each topic in turn being a probability distribution over the words of the vocabulary, and assumes that topics are uncorrelated. A Gibbs sampling algorithm was used to estimate the topic distribution of each document and the multinomial distribution over the vocabulary for each topic. The coherence score, an evaluation metric from statistical language modeling, was used to find the optimal number of topics for the corpus. The coherence score is calculated from the co-occurrence frequency of words within a sliding window; it increases with sentence similarity, so higher is better. Although the coherence score can help to identify the most appropriate number of topics, that number can still be too large for manual interpretation. In such cases, models with different numbers of topics should be examined manually, with the aid of expert knowledge, to understand the corpus thoroughly. For each model, the top 15 words per topic were examined to assess the scientific coherence of the words as a set, the overlap of topic words across topics, and human understandability. The selected model was used for all subsequent analyses.
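
    The procedure described above (fit LDA by Gibbs sampling for several candidate topic numbers, then compare coherence scores) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the hyperparameters alpha and beta, the toy corpus, and the use of mean NPMI over sliding-window co-occurrence as the coherence measure are all choices made here, since the text does not name the exact coherence variant or software.

```python
import math
import random
from itertools import combinations


def gibbs_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns topic-word counts and vocab.

    alpha and beta are assumed symmetric Dirichlet priors (not from the source).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # tokens in doc d with topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # times word v has topic k
    topic_total = [0] * num_topics                      # tokens assigned to topic k
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(z_doc)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # p(z = t | rest) is proportional to
                # (n_dt + alpha) * (n_tv + beta) / (n_t + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][v] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    return topic_word, vocab


def top_words(topic_word, vocab, n=15):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]


def window_coherence(topics, docs, window=10):
    """Mean NPMI over top-word pairs, counting co-occurrence in sliding windows."""
    windows = []
    for doc in docs:
        if len(doc) <= window:
            windows.append(set(doc))
        else:
            windows.extend(set(doc[i:i + window])
                           for i in range(len(doc) - window + 1))
    n = len(windows)
    scores = []
    for words in topics:
        for a, b in combinations(words, 2):
            p_a = sum(a in w for w in windows) / n
            p_b = sum(b in w for w in windows) / n
            p_ab = sum(a in w and b in w for w in windows) / n
            if p_ab == 0:
                scores.append(-1.0)   # the pair never co-occurs
            elif p_ab == 1.0:
                scores.append(0.0)    # NPMI is undefined here; treated as 0
            else:
                scores.append(math.log(p_ab / (p_a * p_b)) / -math.log(p_ab))
    return sum(scores) / len(scores) if scores else 0.0


# Made-up toy corpus standing in for tokenized abstracts.
toy_docs = [
    "virus spike protein receptor binding".split(),
    "spike protein receptor cell entry".split(),
    "virus receptor binding cell entry".split(),
    "patients fever cough clinical symptoms".split(),
    "clinical symptoms patients hospital fever".split(),
    "cough fever hospital patients clinical".split(),
]
scores = {}
for k in (2, 3, 4):
    topic_word, vocab = gibbs_lda(toy_docs, num_topics=k, iters=50)
    scores[k] = window_coherence(top_words(topic_word, vocab, n=3), toy_docs, window=5)
best_k = max(scores, key=scores.get)  # candidate with the highest coherence
print(scores, best_k)
```

    On a real corpus the highest-scoring candidate would only be a starting point: as the text notes, the top words of each candidate model still need manual review by someone with domain knowledge.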
