Author: Estiri, Hossein; Strasser, Zachary H; Murphy, Shawn N.
Title: High-throughput Phenotyping with Temporal Sequences
  • Cord-id: 1uwydg3v
  • Document date: 2020_9_10
    Document:
    Objective: High-throughput electronic phenotyping algorithms can accelerate translational research using data from electronic health record (EHR) systems. The temporal information buried in EHRs is often underutilized in developing computational phenotypic definitions. The objective of this study is to develop a high-throughput phenotyping method that leverages temporal sequential patterns of discrete events from electronic health records.

    Materials and Methods: We develop a representation mining algorithm to extract five classes of representations from EHR diagnosis and medication records: the aggregated vector of the records (AVR), the traditional immediate sequential patterns (SPM), the transitive sequential patterns (tSPM), and two hybrid classes, SPM+AVR and tSPM+AVR. A final small set of representations was selected from each class using the MSMR dimensionality reduction algorithm. Using EHR data on 10 phenotypes from the Mass General Brigham Biobank, we trained regularized logistic regression algorithms, which we validated using labeled data.

    Results: Phenotyping with temporal sequences yielded superior classification performance across all 10 phenotypes compared with the AVR representations that are conventionally used in electronic phenotyping. Although this study uses only the diagnosis and medication records, the high-throughput algorithm’s classification performance was superior or similar to that of previously published electronic phenotyping algorithms. We characterize and evaluate the top transitive sequences of diagnosis records paired with the records of risk factors, symptoms, complications, medications, or vaccinations.

    Discussion: The proposed high-throughput phenotyping approach enables seamless discovery of sequential record combinations that may be difficult to anticipate from raw EHR data. A transitive sequence can offer a more accurate characterization of the phenotype than its individual components. Additionally, the identified transitive sequences of a given phenotype reflect the actual lived experiences of the patients with that particular disease.

    Conclusion: Sequential data representations provide a precise mechanism for incorporating raw EHR records into downstream machine learning.
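
The three base representation classes named in the abstract (AVR, SPM, tSPM) can be illustrated with a toy example. The abstract does not define them precisely, so the sketch below rests on stated assumptions: AVR is taken to be a per-patient count of each record code, an immediate sequence is a pair of codes that are adjacent in time, and a transitive sequence is any pair of codes in which the first occurs before the second. Only the first occurrence of each code is used. The function names, the `dx:`/`rx:` code labels, and the sample records are hypothetical, and the MSMR selection step and the regression model are not shown.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: (ISO date, record code) pairs for a single patient.
records = [
    ("2019-01-05", "dx:hypertension"),
    ("2019-02-11", "rx:lisinopril"),
    ("2019-06-30", "dx:heart_failure"),
    ("2019-07-02", "rx:lisinopril"),
]


def avr(records):
    """Assumed AVR: count of each code, ignoring temporal order."""
    return Counter(code for _, code in records)


def ordered_first_occurrences(records):
    """Return distinct codes ordered by the date each one first appears."""
    first_seen = {}
    for date, code in sorted(records):  # ISO dates sort chronologically
        first_seen.setdefault(code, date)
    return [code for code, _ in sorted(first_seen.items(), key=lambda kv: kv[1])]


def immediate_sequences(records):
    """Assumed SPM: pairs of codes that are adjacent in time."""
    ordered = ordered_first_occurrences(records)
    return set(zip(ordered, ordered[1:]))


def transitive_sequences(records):
    """Assumed tSPM: every pair (a, b) where a first occurs before b."""
    ordered = ordered_first_occurrences(records)
    return set(combinations(ordered, 2))


if __name__ == "__main__":
    print(dict(avr(records)))
    print(sorted(immediate_sequences(records)))
    print(sorted(transitive_sequences(records)))
```

Under these assumptions, the transitive set contains the pair (`dx:hypertension`, `dx:heart_failure`) even though a medication record falls between the two diagnoses, while the immediate set does not. A hybrid class such as tSPM+AVR could then be formed by combining the sequence features with the count features for each patient.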
