Author: Yang, Y. Tony; Horneffer, Michael; DiLisio, Nicole
Title: Mining Social Media and Web Searches For Disease Detection
  • Document date: 2013_5_31
  • ID: k3ujatua_17_0
    Document: Blogs enable users to upload a linked series of postings about a particular topic in a forum setting. The blogosphere is the aggregation of these postings. Blogs may be considered a form of social networking. 12 Data mining of blogs for flu surveillance gives a uniform voice to the web-connected public for the purpose of disease reporting. Previously, those who contributed public health-related information to blogs and other online social media outlets did so only to communicate with other individuals who were using the same information interfaces. Data mining of these sources now allows these contributions to be aggregated and studied. What was previously an inchoate and fragmented forum becomes a single voice when contributions to online forums are collated and analysed. 13 Previous efforts to estimate flu prevalence in a population relied solely upon extrapolation from formally diagnosed cases. 13 Blogs provide additional information to better inform traditional epidemiological models. Incorporating these sources into their work, Corley et al. developed a system that identifies blog communities that share flu-related postings. 13 Trends in postings were correlated with CDC influenza-like illness (ILI) patient reporting at sentinel healthcare providers. 13 Patterns in flu-related postings were then further discerned via graph-based data mining to identify structural anomalies in the flu blogosphere that correspond to increases in ILI. 13 Spinn3r was used to automatically process and analyse the content of thousands of blogs. Spinn3r is a web and social media (WSM) indexing service that conducts real-time indexing with a throughput of 100,000 new blogs per hour. 13 Spinn3r collected, processed, and discriminated the blogs containing flu keywords. Those selected blogs were then further analysed by text mining.
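The workflow described above (select posts containing flu keywords, track posting trends over time, and correlate them with ILI reporting) can be illustrated with a minimal sketch. This is not the system built by Corley et al. and does not use the Spinn3r service. The keyword list, the function names, and the sample data are all hypothetical, and Pearson's r is used here simply as one common correlation measure.

```python
import re
from collections import Counter
from datetime import date
from math import sqrt

# Hypothetical keyword list: the source does not state which terms were used.
FLU_KEYWORDS = ("flu", "influenza")


def mentions_flu(text):
    """Return True if any keyword appears as a whole word in the post text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return any(keyword in words for keyword in FLU_KEYWORDS)


def weekly_counts(posts):
    """Count flu-related posts per ISO (year, week).

    `posts` is an iterable of (datetime.date, text) pairs.
    """
    counts = Counter()
    for day, text in posts:
        if mentions_flu(text):
            iso = day.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return counts


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented toy data, for illustration only (not real surveillance data).
    posts = [
        (date(2009, 1, 5), "Home sick with the flu again"),
        (date(2009, 1, 6), "The chimney flue needs cleaning"),
        (date(2009, 1, 13), "Influenza is going around the office"),
        (date(2009, 1, 14), "Got my flu shot today"),
    ]
    counts = weekly_counts(posts)
    weeks = sorted(counts)
    blog_series = [counts[week] for week in weeks]
    ili_series = [1.2, 2.9]  # hypothetical weekly ILI percentages
    print(weeks, blog_series, pearson_r(blog_series, ili_series))
```

Matching on whole words keeps a post about a chimney "flue" from being counted as flu-related. A real pipeline would need a far richer vocabulary and many more weeks of data before a correlation coefficient became meaningful.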
Text mining is the process of discovering information in large text collections and automatically identifying interesting patterns and relationships in textual data. 13 To further identify nuanced interrelationships between blogs, such as the influence of a particular blogger on flu-related material, the Subdue system was developed. Subdue was devised for general-purpose automated discovery, concept learning, and hierarchical clustering. 13 Whereas a human being might miss a larger pattern in aggregate data that indicates the beginning of a flu outbreak, Subdue can immediately recognise and flag these less visible phenomena in a larger data set. Public health officials could then analyse the aggregated and tamed data to determine the relevance of computer-indicated trends (Figure 1). To determine whether CDC ILI surveillance was correlated with WSM, Corley et al. compared the two data series using Pearson's correlation statistic. 13 The CDC ILI reports and the WSM were shown to correlate strongly, with a Pearson statistic of r = 0.545 at 95% confidence. 13 Despite the appeal of such a comprehensive WSM collation system, it has two limitations: (i) sample bias and (ii) the reliability of blogger statements. First, those who post on blogs tend to have greater access to technology and higher levels of education. Thus, only the literate and wired segments of society would be tapped as sources of flu intelligence. Further study must be conducted to determine whether this skews response variety. Second, the verisimilitude of blogger statements must also be verified. It is possible that bloggers may intentionally create
