Author: Rinderknecht, M. D.; Klopfenstein, Y.
Title: Predicting Critical State after COVID-19 Diagnosis Using Real-World Data from 20152 Confirmed US Cases
Cord-id: 9igv2din
Document date: 2020_7_27
ID: 9igv2din
Document: The global COVID-19 pandemic caused by the virus SARS-CoV-2 has led to over 10 million confirmed cases and half a million deaths, and is challenging healthcare systems worldwide. With limited medical resources, early identification of patients with a high risk of progression to severe disease or a critical state is crucial. We present a prognostic model predicting critical state within 28 days following COVID-19 diagnosis, trained on data from US electronic health records (EHR) within IBM Explorys, including demographics, comorbidities, symptoms, laboratory test results, insurance types, and hospitalization. Our entire cohort included 20152 COVID-19 cases, of which 3160 patients went into critical state or died. Random, stratified train-test splits were repeated 100 times to obtain a distribution of performance. The median and interquartile range of the areas under the receiver operating characteristic curve (ROC AUC) and the precision-recall curve (PR AUC) were 0.863 [0.857, 0.866] and 0.539 [0.526, 0.550], respectively. Optimizing the decision threshold led to a sensitivity of 0.796 [0.775, 0.821] and a specificity of 0.784 [0.769, 0.805]. Good model calibration was achieved, with only a minor tendency to over-forecast probabilities above 0.6. The validity of the model was demonstrated by the interpretability analysis, which confirmed existing evidence on major risk factors (e.g., higher age and weight, male gender, diabetes, cardiovascular disease, and chronic kidney disease). The analysis also revealed higher risk for African Americans and "self-pay patients". To the best of our knowledge, this is the largest EHR-based dataset used to create a prognostic model for COVID-19. In contrast to large-scale statistics computing odds ratios for individual risk factors, the present model, which combines a rich set of covariates, can provide accurate personalized predictions, enabling early treatment to prevent patients from progressing to a severe or critical state.
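The evaluation protocol summarized above (repeated stratified train-test splits, reported as median and interquartile range) can be illustrated with a minimal sketch. This is not the authors' code. Everything here is an assumption made for illustration: the data are synthetic, the "model" is a fixed random score rather than a trained classifier, the 80/20 split ratio is arbitrary, and the decision threshold is chosen by maximizing Youden's J on the training part, which the abstract does not specify. All function names are hypothetical.

```python
import random
import statistics

def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx), preserving the class ratio in both parts."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test

def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, threshold):
    """Classify as positive when score >= threshold; return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if y == 1:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    return tp / (tp + fn), tn / (tn + fp)

def best_threshold(labels, scores):
    """Assumed criterion: pick the threshold maximizing Youden's J = sens + spec - 1."""
    def youden(t):
        sens, spec = sensitivity_specificity(labels, scores, t)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden)

def median_iqr(values):
    """Return (median, first quartile, third quartile)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return q2, q1, q3

# Synthetic cohort (assumption): about 16% positives, scores loosely separated by class.
rng = random.Random(0)
labels = [1] * 32 + [0] * 168
scores = [rng.gauss(0.6, 0.2) if y == 1 else rng.gauss(0.3, 0.2) for y in labels]

aucs, sensitivities, specificities = [], [], []
for _ in range(100):  # 100 repetitions, as in the abstract
    train, test = stratified_split(labels, 0.2, rng)
    y_train = [labels[i] for i in train]
    s_train = [scores[i] for i in train]
    y_test = [labels[i] for i in test]
    s_test = [scores[i] for i in test]
    threshold = best_threshold(y_train, s_train)  # tuned on the training part only
    sens, spec = sensitivity_specificity(y_test, s_test, threshold)
    aucs.append(roc_auc(y_test, s_test))
    sensitivities.append(sens)
    specificities.append(spec)

for name, values in [("ROC AUC", aucs), ("sensitivity", sensitivities),
                     ("specificity", specificities)]:
    med, q1, q3 = median_iqr(values)
    print(f"{name}: {med:.3f} [{q1:.3f}, {q3:.3f}]")
```

The printed lines follow the same "median [Q1, Q3]" layout used for the figures in the abstract. In a real analysis the classifier would be refit on each training split; the fixed scores here only keep the sketch short.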