Author: Khalid, S.; Yang, C.; Blacketer, C.; Duarte-Salles, T.; Fernandez-Bertolin, S.; Kim, C.; Park, R. W.; Park, J.; Schuemie, M.; Sena, A. G.; Suchard, M. A.; You, S. C.; Rijnbeek, P.; Reps, J. M.
Title: A standardized analytics pipeline for reliable and rapid development and validation of prediction models using observational health data
  • Cord-id: n0ni37nh
  • Document date: 2021-03-26
  • ID: n0ni37nh
    Document: Background and Objective: As a response to the ongoing COVID-19 pandemic, several prediction models have been rapidly developed, with the aim of providing evidence-based guidance. However, no COVID-19 prediction model in the existing literature has been found to be reliable. Models are commonly assessed to have a risk of bias, often due to insufficient reporting, use of non-representative data, and lack of large-scale external validation. In this paper, we present the Observational Health Data Sciences and Informatics (OHDSI) analytics pipeline for patient-level prediction as a standardized approach for rapid yet reliable development and validation of prediction models. We demonstrate how our analytics pipeline and open-source software can be used to answer important prediction questions while limiting potential causes of bias (e.g., by validating phenotypes, specifying the target population, performing large-scale external validation, and publicly providing all analytical source code). Methods: We show step-by-step how to implement the pipeline for the question: In patients hospitalized with COVID-19, what is the risk of death 0 to 30 days after hospitalization? We develop models using six different machine learning methods in a US claims database containing over 20,000 COVID-19 hospitalizations and externally validate the models using data containing over 45,000 COVID-19 hospitalizations from South Korea, Spain, and the US. Results: Our open-source tools enabled us to efficiently go end-to-end from problem design to reliable model development and evaluation. When predicting death in patients hospitalized for COVID-19, AdaBoost, random forest, gradient boosting machine, and decision tree yielded similar or lower internal and external validation discrimination performance compared to L1-regularized logistic regression, whereas the MLP neural network consistently resulted in lower discrimination. L1-regularized logistic regression models were well calibrated. Conclusion: Our results show that following the OHDSI analytics pipeline for patient-level prediction can enable the rapid development of reliable prediction models. The OHDSI tools and pipeline are open source and available to researchers around the world.
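    Illustration: The abstract compares models on discrimination and calibration in internal and external validation but does not say how these are measured. The sketch below assumes discrimination is summarized by the area under the ROC curve (AUROC) and calibration by the gap between mean predicted risk and observed event rate; both choices are assumptions, not taken from the abstract. The function names and the toy labels and risks are hypothetical and are not study data or part of the OHDSI software.

```python
from typing import Sequence


def auroc(labels: Sequence[int], risks: Sequence[float]) -> float:
    """Probability that a randomly chosen patient who died received a higher
    predicted risk than a randomly chosen survivor (ties count as half)."""
    events = [r for y, r in zip(labels, risks) if y == 1]
    non_events = [r for y, r in zip(labels, risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def mean_calibration_gap(labels: Sequence[int], risks: Sequence[float]) -> float:
    """Mean predicted risk minus observed event rate.
    Zero means no average over- or under-prediction."""
    return sum(risks) / len(risks) - sum(labels) / len(labels)


# Hypothetical toy data: 1 = died within 30 days, 0 = survived.
dev_labels = [0, 0, 1, 0, 1, 1]              # development (internal) set
dev_risks = [0.1, 0.2, 0.7, 0.3, 0.8, 0.6]
ext_labels = [0, 1, 0, 1, 0, 0]              # external validation set
ext_risks = [0.2, 0.4, 0.5, 0.9, 0.1, 0.3]

internal_auroc = auroc(dev_labels, dev_risks)
external_auroc = auroc(ext_labels, ext_risks)
external_gap = mean_calibration_gap(ext_labels, ext_risks)

print(f"internal AUROC: {internal_auroc:.3f}")
print(f"external AUROC: {external_auroc:.3f}")
print(f"external calibration gap: {external_gap:+.3f}")
```

    A drop from internal to external AUROC, or a calibration gap far from zero in the external data, is the kind of signal large-scale external validation is meant to expose.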
