Author: Doyle, R.
Title: Prediction of COVID-19 Mortality to Support Patient Prognosis and Triage and Limits of Current Open-Source Data
  • Cord-id: lr0a48it
  • Document date: 2021-03-24
  • ID: lr0a48it
    Document: This study examines the accuracy and applicability of machine learning methods for early prediction of mortality in COVID-19 patients. Patient symptoms, pre-existing conditions, age and sex were used as predictive attributes, drawn from data spanning 17 countries. On a semi-evenly balanced class sample of 212 patients, the models reached a high detection accuracy of 92.5%, with strong specificity and sensitivity. Performance on a larger sample of 5,121 patients with only age and mortality information was added as a measure of baseline discriminatory ability. A stratifying method (Random Forest) and a linear method (Logistic Regression) were applied to this sample, both achieving moderately strong performance, with 77.4%-79.3% sensitivity and 71.4%-72.6% accuracy, which highlights predictive power even on the basis of a single attribute. Mutual information was employed as a dimensionality reduction technique; it either greatly improved performance or had negligible impact, showing that a small number of easily retrievable attributes can provide timely and accurate predictions. This is relevant for datasets in which some attributes, such as laboratory results, become available only slowly. Limitations of the data were explored and detailed extensively: each results section includes a further investigation into one facet of the dataset's flaws. Future use of this dataset should be cautious and always accompanied by disclaimers on issues of real-life reproducibility. While its open-source nature is a credit to the wider research community and more such datasets should be published, in its current state it is imperfect for most statistical patient-level studies and can produce valid conclusions only for a limited set of applications.
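    The abstract reports accuracy, sensitivity and specificity, and describes mutual information as the criterion for keeping a small set of attributes. The following is a minimal sketch of those quantities in plain Python, not the study's implementation. The function names, the attribute names (`age_over_65`, `fever`) and the toy records are hypothetical and chosen only for illustration; they do not come from the dataset used in the paper.

```python
import math
from collections import Counter


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = died)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of actual deaths that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of survivors correctly left unflagged.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)), written with raw counts.
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


def rank_attributes(table, outcome):
    """Order attribute names by mutual information with the outcome."""
    scores = {name: mutual_information(col, outcome) for name, col in table.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy records: 1 = attribute present / patient died.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
table = {
    "age_over_65": [1, 1, 1, 0, 0, 0, 0, 1],
    "fever": [1, 0, 1, 0, 1, 0, 1, 0],
}
ranking = rank_attributes(table, outcome)
metrics = confusion_metrics(outcome, table["age_over_65"])
```

    In this toy data `fever` is statistically independent of the outcome, so it carries zero mutual information and ranks below `age_over_65`. Using the single age attribute as the prediction gives one example of how a lone, easily retrieved attribute can already yield non-trivial sensitivity and specificity.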
