This article focuses on:
- Data Sources and Processing
- Potential Predictive Variables
- Variable Selection and Score Construction
- Assessment of Accuracy
- Score Validation
- Characteristics of the Development Cohort
- Predictor Selection
- Construction of the Risk Score and Web-Based Calculator
- The Performance of COVID Risk Score
- Validation of COVID-GRAM
The study aimed to develop and validate a clinical score, applied at hospital admission, for predicting which patients with COVID-19 will develop critical illness, based on a nationwide cohort in China.
Data Sources and Processing
- Medical records were obtained from laboratory-confirmed hospitalized cases with COVID-19 reported to the China National Health Commission between November 21, 2019 and January 31, 2020.
- The National Health Commission of China requested that all 1855 hospitals in China that were designated to care for COVID-19 patients submit the clinical records of all hospitalized COVID-19 cases without selection to the database by January 31, 2020.
- Data was used from the 575 hospitals that contributed clinical data by the deadline.
- COVID-19 diagnoses were confirmed by positive high-throughput sequencing or real-time reverse-transcription polymerase-chain-reaction (RT-PCR) assay for nasal and pharyngeal swab specimens.
- A team of experienced respiratory clinicians reviewed, abstracted and cross-checked the data.
- Each record was checked independently by 2 clinicians.
- All patients with data on clinical status at hospitalization were included.
Potential Predictive Variables
- Potential predictive variables included the following patient characteristics at hospital admission: clinical signs and symptoms, imaging results, laboratory findings, demographic variables, and medical history.
- Demographic variables collected for the study included age, sex, smoking status, exposure to Wuhan, residency in Hubei province, and time from onset of symptoms to admission.
- Medical history included the number of comorbidities and the presence of chronic obstructive pulmonary disease, diabetes, hypertension, coronary artery disease, cerebrovascular disease, hepatitis B, cancer, chronic renal disease, immunodeficiency disease, and pregnancy.
- Clinical signs and symptoms included categorical and continuous variables: first body temperature, respiratory rate, heart rate, cardiac arrhythmia, systolic blood pressure, diastolic blood pressure, symptoms rating, fever, conjunctival congestion, nasal congestion, headache, cough, expectoration, sore throat, fatigue, hemoptysis, dyspnea, nausea and vomiting, diarrhea, arthralgia and myalgia, rigor, throat blockage, tonsillar enlargement, enlarged lymph nodes, skin rash, and unconsciousness.
- Imaging results included chest radiography (CXR) abnormality, the severity of CXR abnormality, chest computed tomographic (CT) imaging abnormality, and the severity of CT abnormality.
- Laboratory findings included partial arterial oxygen pressure, oxygen saturation, white blood cell, lymphocyte, and platelet counts, neutrophil to lymphocyte ratio, and levels of hemoglobin, C-reactive protein, procalcitonin, lactate dehydrogenase, aspartate transaminase, direct bilirubin, indirect bilirubin, total bilirubin, creatine kinase, creatinine, hypersensitive troponin I, albumin, serum sodium, serum potassium, serum chlorine, D-dimer levels, prothrombin time, and activated partial thromboplastin time.
- Critical COVID-19 illness was defined as a composite of admission to the intensive care unit (ICU), invasive ventilation, or death.
- This composite end point was adopted because admission to ICU, invasive ventilation, and death are serious outcomes of COVID-19 that have been adopted in previous studies to assess the severity of other serious infectious diseases.
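The composite end point described above is a logical OR over three events. A minimal sketch of how such a label could be derived per patient follows; the `Outcome` class and its field names are illustrative assumptions, not part of the study's data dictionary.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Hospital-course events for one patient (field names are illustrative)."""
    icu_admission: bool
    invasive_ventilation: bool
    death: bool


def is_critical(outcome: Outcome) -> bool:
    """Composite end point: true if any of the three events occurred."""
    return outcome.icu_admission or outcome.invasive_ventilation or outcome.death


# Three made-up patients, used only to exercise the function.
patients = [
    Outcome(icu_admission=False, invasive_ventilation=False, death=False),
    Outcome(icu_admission=True, invasive_ventilation=False, death=False),
    Outcome(icu_admission=False, invasive_ventilation=True, death=True),
]
labels = [is_critical(p) for p in patients]
```

A patient who experiences more than one of the events is still counted once, which is why the composite is a per-patient flag rather than a sum of events.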
Variable Selection and Score Construction
- All 1590 patients hospitalized with COVID-19 in the development cohort were included for variable selection and risk score development.
- 72 variables were entered into the selection process.
- Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied to minimize the potential collinearity of variables measured from the same patient and over-fitting of variables.
- Imputation for missing variables was considered if missing values were less than 20%.
- Predictive mean matching was used to impute numeric features, logistic regression to impute binary variables and Bayesian polytomous regression to impute factor features.
- L1-penalized least absolute shrinkage and selection regression was used for multivariable analyses, augmented with 10-fold cross validation for internal validation.
- With larger penalties, the estimates of weaker factors shrink toward zero, so that only the strongest predictors remain in the model.
- The most predictive covariates were selected at the penalty value that minimized the cross-validated error (λ min).
- The R package “glmnet” (R Foundation) was used to perform the LASSO regression.
- Subsequently, variables identified by LASSO regression analysis were entered into logistic regression models and those that were consistently statistically significant were used to construct the risk score (COVID-GRAM),7 which was then used to construct a web-based risk calculator.
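The shrinkage behaviour described above (larger penalties push weaker estimates to exactly zero) can be illustrated with the soft-thresholding operator. In the simplified case of a linear model with standardized, uncorrelated predictors, the LASSO estimate is the soft-thresholded least-squares estimate. The sketch below assumes that simplified setting and uses hypothetical coefficient values; it does not reproduce the study's logistic LASSO fit with glmnet.

```python
def soft_threshold(estimate: float, penalty: float) -> float:
    """Shrink an estimate toward zero by `penalty`; return 0.0 if it is not larger in magnitude."""
    if estimate > penalty:
        return estimate - penalty
    if estimate < -penalty:
        return estimate + penalty
    return 0.0


# Hypothetical unpenalized estimates for three standardized predictors.
unpenalized = {"strong": 1.20, "moderate": 0.45, "weak": 0.08}

for penalty in (0.0, 0.1, 0.5):
    shrunk = {name: round(soft_threshold(b, penalty), 2) for name, b in unpenalized.items()}
    kept = [name for name, b in shrunk.items() if b != 0.0]
    print(f"penalty={penalty}: {shrunk} kept={kept}")
```

With these made-up values, the weak predictor drops out at a penalty of 0.1 and the moderate one at 0.5, leaving only the strong predictor in the model. Cross-validation is then used to choose the penalty rather than fixing it by hand.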
Assessment of Accuracy
- The accuracy of the COVID risk score was assessed using the area under the receiver-operator characteristic curve (AUC).
- AUC was used to compare the accuracy of COVID-GRAM with that of the CURB-65 model, which has been used to classify the severity of community-acquired pneumonia.
- For internal validation of the accuracy estimates and to reduce overfit bias, we used 200 bootstrap resamples.
- Statistical analysis was performed with R software and P < .05 was considered statistically significant.
Score Validation
- To validate the generalizability of the COVID risk score, data were used from hospitals that were not included in the development cohort, comprising 710 patients.
- Data for the validation cohort were pooled from 4 sources:
(1) a multicenter cohort of hospitals from 10 cities in Hubei province that missed the deadline for data submission but subsequently submitted data on cases admitted before January 31, 2020
(2) Daye Hospital (near Wuhan)
(3) The First People’s Hospital of Foshan (Guangdong province)
(4) Nanhai People’s Hospital of Foshan (Guangdong province)
- The variables required for calculating the COVID risk score from the validation cohort were collected and cross-checked by 2 experienced physicians and the risk score was calculated as described herein for the development cohort.
Characteristics of the Development Cohort
- In the development cohort, data was collected from 1590 patients from 575 hospitals in 31 provincial administrative regions between November 21, 2019 and January 31, 2020.
- At hospital admission, 24 of 1590 patients were considered to be severe and the rest were considered to be mild.
- A total of 131 patients eventually developed critical illness.
- The overall mortality was 3.2% and 1334 patients had a history of exposure to Wuhan.
- Overall, the mean age of patients in the cohort was 48.9 years; 904 patients were men and 399 had at least 1 coexisting condition, with hypertension, diabetes, and cardiovascular disease as the 3 most common comorbidities.
- Fever, dry cough, fatigue, productive cough, and shortness of breath were the most common symptoms.
- Most patients had abnormal chest CT findings.
- Seventy-two variables measured at hospital admission were included in the LASSO regression.
- After LASSO regression selection, 19 variables, including clinical features and blood test results, remained significant predictors of critical illness: CXR abnormality, age, exposure to Wuhan, first and highest body temperature, respiratory rate, systolic blood pressure, hemoptysis, dyspnea, skin rash, unconsciousness, number of comorbidities, chronic obstructive pulmonary disease (COPD), cancer, oxygen saturation levels, neutrophils, neutrophil-to-lymphocyte ratio, lactate dehydrogenase, direct bilirubin, and creatinine levels.
- Inclusion of these 19 variables in a logistic regression model resulted in 10 variables that were independently statistically significant predictors of critical illness and were included in the risk score.
- These variables were CXR abnormality, age, hemoptysis, dyspnea, unconsciousness, number of comorbidities, cancer history, neutrophil-to-lymphocyte ratio, lactate dehydrogenase, and direct bilirubin.
Construction of the Risk Score and Web-Based Calculator
- The COVID risk score was constructed based on the coefficients from the logistic model.
- An online calculator based on COVID-GRAM was developed to allow clinicians to enter the values of the 10 variables required for the risk score with automatic calculation of the likelihood that a hospitalized patient with COVID-19 will develop critical illness.
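A score built from logistic-regression coefficients turns a patient's values into a probability through the logistic function. The sketch below shows only that mechanism for the 10 variables named in this summary. The intercept, every coefficient, the variable keys, and the units are hypothetical placeholders chosen so the example runs; they are not the published COVID-GRAM values and must not be used for clinical estimates.

```python
import math

# HYPOTHETICAL placeholders, NOT the published COVID-GRAM intercept or coefficients.
INTERCEPT = -6.0
COEFFICIENTS = {
    "cxr_abnormality": 1.0,                 # 1 if abnormal, else 0
    "age_years": 0.05,
    "hemoptysis": 1.0,                      # 1 if present, else 0
    "dyspnea": 1.0,                         # 1 if present, else 0
    "unconsciousness": 1.0,                 # 1 if present, else 0
    "number_of_comorbidities": 0.5,
    "cancer_history": 1.0,                  # 1 if present, else 0
    "neutrophil_to_lymphocyte_ratio": 0.1,
    "lactate_dehydrogenase": 0.001,
    "direct_bilirubin": 0.1,
}


def predicted_risk(patient: dict) -> float:
    """Return the logistic-model probability for one patient's admission values."""
    missing = set(COEFFICIENTS) - set(patient)
    if missing:
        raise ValueError(f"missing variables: {sorted(missing)}")
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Made-up admission values for illustration only.
example_patient = {
    "cxr_abnormality": 1, "age_years": 60, "hemoptysis": 0, "dyspnea": 1,
    "unconsciousness": 0, "number_of_comorbidities": 2, "cancer_history": 0,
    "neutrophil_to_lymphocyte_ratio": 5.0, "lactate_dehydrogenase": 300.0,
    "direct_bilirubin": 4.0,
}
print(f"estimated risk: {predicted_risk(example_patient):.3f}")
```

Raising an error on missing inputs mirrors the requirement that all 10 values be entered before a risk is calculated; a real calculator would use the fitted coefficients from the study rather than these placeholders.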
The Performance of COVID Risk Score
- By internal bootstrap validation, the mean AUC based on data from the development cohort was 0.88.
- The AUC of COVID risk score for patients in the epicenter at Hubei was 0.87 and outside Hubei was 0.82.
- The predictive value of COVID-GRAM was higher than that of the CURB-65 model, which had an AUC of 0.75 for correct prediction of development of critical illness (P < .001).
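The AUC values reported above can be read as the probability that a randomly chosen patient who became critically ill received a higher score than a randomly chosen patient who did not. The sketch below computes that quantity directly and averages it over bootstrap resamples. It is a simplified illustration on made-up scores: it resamples fixed scores only, whereas a full internal validation would typically refit the model in each resample, and the function names are assumptions.

```python
import random


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_mean_auc(scores, labels, n_resamples=200, seed=0):
    """Mean AUC over resamples drawn with replacement from the same patients."""
    rng = random.Random(seed)
    n = len(scores)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            aucs.append(auc([scores[i] for i in idx], ys))
    return sum(aucs) / len(aucs)


# Made-up predicted risks and outcomes (1 = critical illness), for illustration only.
toy_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.80, 0.90]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
print("AUC:", auc(toy_scores, toy_labels))
print("bootstrap mean AUC:", round(bootstrap_mean_auc(toy_scores, toy_labels), 3))
```

Comparing two scores, such as COVID-GRAM and CURB-65, amounts to computing `auc` for each on the same patients; testing whether the difference is statistically significant requires an additional procedure not shown here.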
Validation of COVID-GRAM
- The validation cohort included 710 patients with a mean age of 48.2 years, 382 were men and had at least 1 coexisting condition.
- Critical illness eventually developed in 87 of these patients and 8 died.
- The accuracy of the COVID risk score in the validation cohort was similar to that in the development cohort, with an AUC of 0.88.
- In this study, a clinical risk score and a web-based risk calculator were developed and validated to predict the development of critical illness among hospitalized patients with COVID-19.
- The performance of this risk score was satisfactory with accuracy based on AUCs in both the development and validation cohorts of 0.88.
- The web-based calculator can be used by clinicians to estimate an individual hospitalized patient’s risk of developing critical illness.
- The 10 variables required for calculation of the risk of developing critical illness are generally readily available at hospital admission, and the web-based calculator is easy to use.
- If the patient’s estimated risk for critical illness is low, the clinician may choose to monitor, whereas high-risk estimates might support aggressive treatment or admission to the ICU.
- In areas with high case volume and/or limited resources, the decision might be to provide less aggressive care to moderate-risk patients to maximize availability of ICU beds and ventilators.
- Chest radiographic abnormality, age, hemoptysis, dyspnea, unconsciousness, number of comorbidities, cancer history, neutrophil-to-lymphocyte ratio, lactate dehydrogenase, and direct bilirubin were included in the COVID risk score.
- Previous studies have found several of these variables to be risk factors for severe illness related to COVID-19.
- Wu et al3 found that older age and more comorbidities were associated with a higher risk of developing ARDS in patients infected with COVID-19.
- A previous study from our group found that patients with COVID-19 and cancer had a higher risk of severe events than patients without cancer.
- Zhou and colleagues found lower lymphocyte count, higher lactate dehydrogenase, and more imaging abnormalities in patients who died from COVID-19 disease.
- Limitations of the study include a modest sample size for constructing the risk score and a relatively small sample for validation.
- The data for score development and validation are entirely from China, which could potentially limit the generalizability of the risk score in other areas of the world.
- Additional validation studies of the COVID risk score from areas outside China should be completed.
- In this study, a risk score and a web-based calculator were developed to estimate the risk of developing critical illness among patients with COVID-19, based on 10 variables commonly measured on admission to the hospital.
- Estimating the risk of critical illness could help identify patients who are and are not likely to develop critical illness, thus supporting appropriate treatment and optimizing the use of medical resources.