BMC Med Inform Decis Mak. 2022 Nov 28;22(1):309.
doi: 10.1186/s12911-022-02057-4.

Comparison of machine learning methods with logistic regression analysis in creating predictive models for risk of critical in-hospital events in COVID-19 patients on hospital admission

Aaron W Sievering et al. BMC Med Inform Decis Mak. 2022.

Abstract

Background: Machine learning (ML) algorithms have been trained for early prediction of critical in-hospital events from COVID-19 using patient data at admission, but little is known about how their performance compares with each other and/or with statistical logistic regression (LR). This prospective multicentre cohort study compares the performance of an LR model and five ML models, and examines how influential predictors and predictor-to-event relationships contribute to prediction model performance.

Methods: We used 25 baseline variables from 490 COVID-19 patients admitted to 8 hospitals in Germany (March-November 2020) to develop and validate (75/25 random split) 3 linear (L1 penalty, L2 penalty, elastic net [EN]) and 2 non-linear (support vector machine [SVM] with radial kernel, random forest [RF]) ML approaches for predicting critical events, defined as intensive care unit transfer, invasive ventilation and/or death (composite end-point: 181 patients). Models were compared for performance (area under the receiver operating characteristic curve [AUC], Brier score) and predictor importance (performance-loss metrics, partial-dependence profiles).
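The two performance metrics and the 75/25 random split named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the function names (`auc`, `brier_score`, `random_split`) and the four-patient toy data are assumptions made for illustration, and only the Python standard library is used.

```python
import random


def auc(y_true, y_score):
    """Area under the ROC curve via its rank interpretation: the probability
    that a randomly chosen event case gets a higher score than a randomly
    chosen non-event case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome
    (0 = perfect; lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def random_split(n, train_fraction=0.75, seed=0):
    """Shuffle the indices 0..n-1 and cut them into train and test parts."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(n * train_fraction)
    return idx[:cut], idx[cut:]


# Toy data (made up): 1 = critical event, 0 = no event.
y = [1, 0, 1, 0]
p = [0.9, 0.2, 0.4, 0.6]
print(auc(y, p))          # 3 of 4 event/non-event pairs are ranked correctly -> 0.75
print(brier_score(y, p))  # approximately 0.1925

train_idx, test_idx = random_split(490)
print(len(train_idx) + len(test_idx))  # 490
```

AUC measures only how well a model ranks patients, whereas the Brier score also penalises poorly calibrated probabilities, which is one reason to report both.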

Results: Models performed similarly, with a small benefit for LR (using restricted cubic splines for non-linearity) and RF (mean AUC: 0.763-0.731 [RF-L1]; Brier scores: 0.184-0.197 [LR-L1]). Top-ranked predictor variables (consistently highest importance: C-reactive protein) were largely identical across models, except for creatinine, which exhibited marginal (L1, L2, EN, SVM) or high/non-linear effects (LR, RF) on events.

Conclusions: Although the LR and ML models analysed showed no strong differences in performance or in the most influential predictors for COVID-19-related event prediction, our results indicate a predictive benefit from accounting for non-linear predictor-to-event relationships and effects. Future efforts should focus on moving data-driven ML technologies from static towards dynamic modelling solutions that continuously learn and adapt to changes in data environments during the evolving pandemic.

Trial registration number: NCT04659187.

Keywords: COVID-19; Clinical decision-making; Critical event prediction; Machine learning; Predictive models.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Patient pathways and outcomes. Prediction models use clinical data of COVID-19-infected patients on hospital admission to predict at least one of three critical in-hospital events during the remainder of the hospital stay
Fig. 2
Box plots of goodness of fit, measured by AUC and Brier score, across 50 repeated data splits for each model approach developed to predict critical in-hospital events from COVID-19-infected patients' data on hospital admission. Box plots show the minimum (lower whisker), lower quartile (lower boundary of the box), median (vertical line in the box), upper quartile (upper boundary of the box), and maximum (upper whisker)
Fig. 3
Performance comparison of model approaches for prediction of critical in-hospital events using COVID-19-infected patients' clinical data on hospital admission, displayed as ROC curves. The dashed line indicates random prediction
Fig. 4
Importance of predictor variables in models predicting critical in-hospital events from COVID-19-infected patients' clinical data on hospital admission (A-F). Permutation-based performance loss of all variables for the LR model (A), the regularized regression models L1 (B), L2 (C) and EN (D), and the SVM (E) and RF (F) models
Fig. 5
Partial-dependence profiles of creatinine for each model for predicting critical in-hospital events in COVID-19-infected patients on hospital admission. Results from 50 data splits are aggregated using local regression (LOESS). 90% of the creatinine values lie within the area represented by the grey box. The median creatinine value is represented by the vertical line. Creatinine values are given in mg/dl
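The permutation-based performance loss (Fig. 4) and partial-dependence profiles (Fig. 5) can be sketched generically as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical fitted model that uses only the first of two synthetic features, the data are simulated, and loss is measured here as the drop in AUC, which may differ from the loss function used in the paper.

```python
import math
import random


def auc(y_true, y_score):
    """Rank-based AUC: share of event/non-event pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def toy_model(row):
    """Hypothetical fitted model: predicted risk depends on feature 0 only."""
    return 1.0 / (1.0 + math.exp(-2.0 * row[0]))


def with_value(row, column, value):
    """Copy of a row with one feature replaced."""
    return row[:column] + [value] + row[column + 1:]


def permutation_loss(model, X, y, column, n_repeats=20, seed=0):
    """Average drop in AUC after shuffling one column, which breaks its
    link to the outcome while keeping its marginal distribution."""
    rng = random.Random(seed)
    baseline = auc(y, [model(r) for r in X])
    losses = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in X]
        rng.shuffle(shuffled)
        X_perm = [with_value(r, column, v) for r, v in zip(X, shuffled)]
        losses.append(baseline - auc(y, [model(r) for r in X_perm]))
    return sum(losses) / n_repeats


def partial_dependence(model, X, column, grid):
    """For each grid value, set the column to that value in every row and
    average the model's predictions."""
    return [sum(model(with_value(r, column, v)) for r in X) / len(X) for v in grid]


# Simulated data: two standard-normal features, outcome driven by feature 0.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1 if rng.random() < toy_model(r) else 0 for r in X]

loss_informative = permutation_loss(toy_model, X, y, column=0)
loss_noise = permutation_loss(toy_model, X, y, column=1)
pd_informative = partial_dependence(toy_model, X, column=0, grid=[-1.0, 0.0, 1.0])
pd_noise = partial_dependence(toy_model, X, column=1, grid=[-1.0, 0.0, 1.0])

print(loss_informative, loss_noise)
print(pd_informative, pd_noise)
```

In this toy setting, shuffling the unused feature leaves the predictions unchanged, so its loss is zero and its partial-dependence profile is flat, while the informative feature shows a clear loss and a rising profile. A marginal versus a high, non-linear effect of a single predictor across models, as reported for creatinine, would appear as this kind of contrast.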
