Author: Park, Hyeoun-Ae; Jung, Hyesil; On, Jeongah; Park, Seul Ki; Kang, Hannah
Title: Digital Epidemiology: Use of Digital Data Collected for Non-epidemiological Purposes in Epidemiological Studies
  • Document date: 2018_10_31
  • ID: 1go3jjeu_45
    Document: Regarding study design, time series cross-sectional studies were the most frequent (77 studies, 70.6%), followed by single-point-in-time cross-sectional studies (25 studies, 22.9%). In traditional epidemiological studies, the most frequently used method is cross-sectional regression. However, the use of digital data collected across a time period enables the modeling of effects across time and space. A majority of the studies (76.1%) used external datasets as outcome variables for model development or outcome validation. Regarding outcome measures, incidence and prevalence were the most common measures used in digital epidemiological studies. Most studies used correlation analyses to examine the relationships among variables (55 studies), followed by various regression analyses (45 studies), such as linear regression, joinpoint regression, and Least Absolute Shrinkage and Selection Operator (LASSO) regression. We examined how digital data collected for non-epidemiological purposes are being used for epidemiological purposes. Digital epidemiological studies require large datasets and advanced analytics such as machine learning. Most machine learning algorithms are openly available due to the strong open source software movement. Thus, it is important to ensure that as much data as possible are openly accessible. As Salathe [2] explained in his review article, this goal is clearly at odds with the desire to keep as little personal data as possible publicly accessible in order to protect individual privacy. There is no straightforward solution to this conflict between open access to large datasets and privacy protection, but Salathe [2] proposed data cooperatives with restricted access as a solution.