Pneumonia scoring systems for severe COVID-19: which one is better?

This study investigated the predictive value of different pneumonia scoring systems for clinical severity and mortality risk in patients with severe novel coronavirus pneumonia. A total of 53 confirmed cases of severe novel coronavirus pneumonia were included. The APACHE II, MuLBSTA, and CURB-65 scores were calculated for each respiratory support treatment, and the predictive power of each score for respiratory support treatment and mortality risk was compared. The APACHE II score showed the largest area under the ROC curve for both noninvasive and invasive respiratory support, which was significantly different from that of CURB-65. The MuLBSTA score had the largest area under the ROC curve for mortality risk, which was also significantly different from that of CURB-65; however, it did not differ significantly from that of the APACHE II score. For patients with COVID-19, the APACHE II score is an effective predictor of disease severity and mortality risk, whereas the MuLBSTA score is a good predictor of mortality risk only.


Introduction
A novel coronavirus pneumonia outbreak that began in Wuhan, China, in December 2019 has had a major global impact. The World Health Organization named the disease "Corona Virus Disease 2019" (COVID-19), and the causative virus was named SARS-CoV-2. According to the 7th edition of the Chinese National Health Commission guidelines, patients can be categorized as mild, moderate, or severe depending on their clinical symptoms and test results [1]. However, an established method for grading disease severity and predicting disease progression is still lacking.
The CURB-65 (confusion, urea, respiratory rate, blood pressure, and age 65) scoring system [2] is being used as a measure of the severity of community-acquired pneumonia, which can be combined with other clinical parameters to assess if the patients need to be hospitalized or transferred to the intensive care unit (ICU). The APACHE II (acute physiology and chronic health evaluation II) scoring system [3,4] is being used to evaluate the condition of patients in ICU using 12 parameters. Currently, this method is being widely used clinically due to its capability of distinguishing the severity of the disease. The MuLBSTA (multilobular infiltration, hypo-lymphocytosis, bacterial coinfection, smoking history, hyper-tension, and age) scoring system is an easy-to-use clinical tool to predict the risk of mortality in high-risk and low-risk groups of patients with viral pneumonia. With the use of this method, hospitalized patients with viral pneumonia can be classified into relevant risk categories to acquire guidance for further clinical decision making [5]. However, the effectiveness of these three scoring systems in assessing COVID-19 has not yet been reported.
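As an illustration of how simple the CURB-65 tally is, the following minimal sketch assigns one point per criterion. The numeric thresholds (urea > 7 mmol/L, respiratory rate ≥ 30/min, systolic pressure < 90 mmHg or diastolic pressure ≤ 60 mmHg, age ≥ 65 years) are the commonly published CURB-65 definitions and are not stated in this text; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Curb65Inputs:
    """Hypothetical container for the five CURB-65 items."""
    confusion: bool          # new-onset confusion
    urea_mmol_per_l: float   # blood urea, mmol/L
    respiratory_rate: int    # breaths per minute
    systolic_bp: int         # mmHg
    diastolic_bp: int        # mmHg
    age: int                 # years


def curb65_score(p: Curb65Inputs) -> int:
    """Return the CURB-65 score (0-5), one point per criterion met.

    Thresholds follow the commonly published definition and are an
    assumption here, since the text above does not list them.
    """
    criteria = [
        p.confusion,
        p.urea_mmol_per_l > 7.0,
        p.respiratory_rate >= 30,
        p.systolic_bp < 90 or p.diastolic_bp <= 60,
        p.age >= 65,
    ]
    return sum(criteria)


# Illustrative (not real) patient: raised urea, fast breathing, age 71.
example = Curb65Inputs(False, 8.2, 32, 110, 70, 71)
print(curb65_score(example))  # 3
```

APACHE II and MuLBSTA follow the same additive idea but draw on more items with different weights, which is why they are not reproduced here.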

Inclusion criteria
This single-center, retrospective observational clinical study was approved by the Ethics Committee of the General Hospital of Central Theater Command of PLA (2020-008-1). A total of 53 cases of severe novel coronavirus pneumonia were confirmed in the General Hospital of Central Theater Command of People's Liberation Army between 1 January 2020 and 4 March 2020 (Fig. 1).
The inclusion criteria for the study were as follows: (1) All patients were confirmed positive by SARS-CoV-2 nucleic acid RT-PCR (Ct value ≤ 38.0, BGI, Shenzhen, China) using specimens derived from oropharyngeal swabs or sputum, prior to or during hospitalization; and (2) Patients were categorized as having the severe form of the disease based on the 7th edition of the Chinese National Health Commission guidelines, which required meeting any of the following criteria: (a) shortness of breath, respiratory rate ≥ 30 breaths/min; (b) oxygen saturation ≤ 93% in the resting state; (c) arterial blood oxygen partial pressure (PaO2)/oxygen concentration (FiO2) ≤ 300 mmHg (1 mmHg = 0.133 kPa); and (d) lung images showing obvious progression of lesions > 50% within 24-48 h.
The exclusion criteria for the study were as follows: (1) Age < 18 years; (2) Patients with definite diagnosis of cancer; (3) Long-term hospitalization ≥ 3 months before death; (4) Presence of unconsciousness before admission; and (5) Patients receiving renal replacement therapy.
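The "any of the following" severity rule can be sketched as a small predicate. This is only an illustration of the stated thresholds: the function and parameter names are hypothetical, and FiO2 is assumed to be given as a fraction (0.21 for room air) so that PaO2/FiO2 is in mmHg.

```python
MMHG_TO_KPA = 0.133  # conversion factor quoted in the inclusion criteria


def is_severe(respiratory_rate: float,
              spo2_percent: float,
              pao2_mmhg: float,
              fio2_fraction: float,
              lesion_progress_percent: float) -> bool:
    """Return True if any one of the four severity criteria is met.

    fio2_fraction is assumed to be a fraction (0.21-1.0), not a percentage.
    lesion_progress_percent is the imaging progression within 24-48 h.
    """
    pf_ratio = pao2_mmhg / fio2_fraction  # PaO2/FiO2 in mmHg
    return (
        respiratory_rate >= 30
        or spo2_percent <= 93
        or pf_ratio <= 300
        or lesion_progress_percent > 50
    )


# Illustrative values only, not study data.
print(is_severe(22, 96, 80, 0.21, 10))  # False: PaO2/FiO2 is about 381
print(is_severe(22, 96, 60, 0.21, 10))  # True: PaO2/FiO2 is about 286
```

In kPa, the 300 mmHg threshold corresponds to roughly 300 × 0.133 ≈ 39.9 kPa.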

Data acquisition
Data on demographics, underlying comorbidities, symptoms, physical and radiological findings, laboratory values, and respiratory and physiologic parameters of the subjects while receiving mechanical ventilation were collected from electronic and paper medical records. We used a positive bacterial culture of blood or sputum samples as the criterion for bacterial growth. The APACHE II, MuLBSTA, and CURB-65 scores were calculated at different treatment time points, and the predictive power of each score for clinical respiratory support treatment and mortality risk was compared.

Observational indicators
High-flow oxygen inhalation, noninvasive ventilator support, and invasive ventilator support were used as the three treatment methods. The APACHE II, MuLBSTA, and CURB-65 scoring systems were used to calculate the patient scores at each time point. The area under the receiver operating characteristic (ROC) curve was used to calculate the hierarchical boundary values of each scoring model for each treatment method [6]. The sensitivity and specificity of all the values were calculated, and the difference in area under ROC curve (AUROC) of each scoring model for the same treatment was compared. The patients were divided into high-flow oxygen inhalation group, noninvasive ventilator support group, and invasive ventilator support group. They were also categorized into death and non-death groups. The categorization was based on the severity of the patient's condition and the outcome. The APACHE II, MuLBSTA, and CURB-65 scores for the high-flow oxygen inhalation, noninvasive ventilator support, and invasive ventilation support groups prior to intubation were recorded. Further, the APACHE II, MuLBSTA, and CURB-65 scores in the death group were recorded on the day of death.
Fig. 1 Research process
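To make the idea of a boundary (cut-off) value with its sensitivity and specificity concrete, here is a minimal sketch. The text does not state which rule was used to pick the cut-offs; maximizing Youden's index (sensitivity + specificity - 1) over midpoints between adjacent observed scores is assumed here as one common choice. The data are synthetic and the function names are hypothetical.

```python
def sens_spec(scores, outcomes, cutoff):
    """Sensitivity and specificity when 'score > cutoff' predicts the event."""
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, y in pairs if y and s > cutoff)
    fn = sum(1 for s, y in pairs if y and s <= cutoff)
    tn = sum(1 for s, y in pairs if not y and s <= cutoff)
    fp = sum(1 for s, y in pairs if not y and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, outcomes):
    """Pick the midpoint cut-off with the largest Youden index (assumed rule)."""
    values = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def youden(c):
        sens, spec = sens_spec(scores, outcomes, c)
        return sens + spec - 1

    return max(candidates, key=youden)


# Synthetic example: 1 = needed the treatment, 0 = did not. Not study data.
example_scores = [6, 7, 8, 9, 9, 10, 11, 12, 14, 15]
example_outcomes = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

cutoff = best_cutoff(example_scores, example_outcomes)
print(cutoff)                                               # 10.5
print(sens_spec(example_scores, example_outcomes, cutoff))  # (0.8, 1.0)
```

Because candidate cut-offs are midpoints between integer scores, they end in .5, which matches the form of the boundary values reported in this study.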

Statistical methods
Software Package for Social Sciences (IBM SPSS 25.0) was used for statistical analysis. Data were described using the median and range for continuous variables and the frequency and percentage for categorical variables. The performance of each scoring system was evaluated by measuring the AUROC. Further, the χ² test was used to calculate sensitivity and specificity. The AUROCs of the different scoring models were then compared.
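For readers unfamiliar with the AUROC, it equals the probability that a randomly chosen patient with the event has a higher score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes it directly from that definition; it is not the SPSS procedure used in the study, the statistical test used to compare AUROCs is not specified in the text and is not reproduced here, and the data are synthetic.

```python
def auroc(scores, outcomes):
    """AUROC via pairwise comparison of event and non-event scores."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    # Each (event, non-event) pair counts 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic examples, not study data.
print(auroc([1, 2, 3, 4], [0, 0, 1, 1]))  # 1.0 (perfect separation)
print(auroc([1, 2, 2, 3], [0, 1, 0, 1]))  # 0.875 (one tied pair)
```

An AUROC of 0.5 corresponds to a score with no discriminative ability, and 1.0 to perfect discrimination.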

Basic information
Out of the 53 patients, 27 patients in the high-flow nasal catheter oxygen therapy group were cured and discharged. The remaining 26 patients underwent noninvasive ventilator support. Out of these, 20 patients further underwent endotracheal intubation; however, 16 patients could not be cured and eventually died. One of the patients who died had only received noninvasive ventilator treatment but not endotracheal intubation. The median time from onset to admission was 7 days, onset to noninvasive ventilator support was 12 days, onset to invasive ventilator support was 20 days, onset to death was 25 days, and onset to discharge was 35 days. The other demographic characteristics are listed in Table 1.
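The counts above can be converted into the proportions quoted in the Discussion (49.1%, 37.7%, and about 30%). The short check below uses only the numbers reported in this section; the variable names are illustrative.

```python
total = 53
high_flow_only = 27
counts = {"noninvasive": 26, "invasive": 20, "death": 16}

# The two initial treatment groups should add up to the whole cohort.
assert high_flow_only + counts["noninvasive"] == total

# Percentages of the full cohort, rounded to one decimal place.
rates = {name: round(100 * n / total, 1) for name, n in counts.items()}
print(rates)  # {'noninvasive': 49.1, 'invasive': 37.7, 'death': 30.2}
```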

Cut-off values of CURB-65, APACHE II, and MuLBSTA for predicting the risk of noninvasive ventilator support, invasive ventilator support, and mortality
In terms of the cut-off values of CURB-65, 1.5 points was used for noninvasive ventilator support, 2.5 points for invasive ventilator support and mortality. In terms of the cut-off values of APACHE II, 9.5 points was used for noninvasive ventilator support, 12.5 points for invasive ventilator support and 11.5 points for mortality. In terms of the cut-off values of MuLBSTA, 8.5 points was used for noninvasive ventilator support, 10.5 points for invasive ventilator support and 13.5 points for mortality. These have been listed in Table 2.

Comparison of area under ROC curve of three scoring models in each group
On evaluating the three scoring models for noninvasive ventilator support, the area under the ROC curve of the APACHE II scoring model was identified to be the largest and statistically different from that of the MuLBSTA and CURB-65 models (P = 0.0046 and 0.0059, respectively). Further, no statistical difference was identified between the MuLBSTA and CURB-65 models (P = 0.9369). The assessment of the need for invasive ventilator support revealed that the AUROC of the APACHE II scoring model was the largest; it was statistically different from that of the CURB-65 scoring model (P = 0.0372) but not from that of the MuLBSTA scoring model (P = 0.2708). When assessing mortality, the AUROC of the MuLBSTA scoring model was identified to be the largest, which was statistically different from CURB-65 (P = 0.0021). However, no difference was noted with APACHE II (P = 0.0549). These findings are listed in Table 3 and shown in Figs. 5, 6, and 7.

Multivariate analysis of individual risk factors in each model for DEATH and INTUBATION in patients with COVID-19
On evaluating the individual risk factors in each model for death in patients with COVID-19, bacterial coinfection and age ≥ 60 years from MuLBSTA scoring model, breathing rate ≥ 30/min and age ≥ 65 years from CURB-65 scoring model were considered to be statistically   Tables 4 and 5.

Discussion
In this study, we analyzed 53 patients with a severe form of the disease in our hospital. These patients tested positive on the nucleic acid test between January 2020 and February 2020. In terms of demographic characteristics, the patients were older, mostly male, and had underlying diseases, similar to those described by other scholars [7]. However, Wang et al. found that among patients with COVID-19, 54.3% were male and 45.7% were female, showing no gender difference [8]. Only severe cases were included in our study; the rates of noninvasive ventilator support, invasive ventilator support, and mortality were 49.1%, 37.7%, and 30%, respectively, which were similar to the results of other studies [9,10]. The median time from onset to admission was 7 days, onset to noninvasive ventilator treatment was 12 days, and onset to invasive ventilator treatment was 20 days. These data were similar to those of previous studies [11,12]. The median time from onset to discharge was 35 days and from onset to death was 25 days. According to current studies, early respiratory support treatment can improve the condition of patients with severe COVID-19 [1]. However, such treatments are normally administered based on a single test or the clinical experience of doctors, which has significant limitations. In our study, three scoring systems were used to calculate approximate score thresholds for each respiratory treatment method. This can help clinicians judge and perform reasonable and timely respiratory management. Our research suggests that high-flow oxygen inhalation can be considered when the APACHE II score < 9.5, MuLBSTA score < 8.5, or CURB-65 score < 1.5. Further, noninvasive ventilator support can be considered when the APACHE II score ranges from 9.5 to 12.5, MuLBSTA score ranges from 8.5 to 10.5, or CURB-65 score ranges from 1.5 to 2.5, and invasive ventilator support can be considered when the APACHE II score > 12.5, MuLBSTA score ≥ 10.5, or CURB-65 score ≥ 2.5. Patients may be at risk of death when the APACHE II score > 11.5, MuLBSTA score > 13.5, or CURB-65 score > 2.5.
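The score bands described above can be written as a simple lookup. This is a sketch of the reported cut-offs only, not a validated clinical decision tool; the dictionary and function names are hypothetical. It assumes whole-number scores, in which case "> 12.5" and "≥ 12.5" select the same patients because no score falls exactly on a half-point boundary.

```python
# Cut-off values as reported in this study (Table 2).
CUTOFFS = {
    "APACHE II": {"noninvasive": 9.5, "invasive": 12.5, "mortality": 11.5},
    "MuLBSTA": {"noninvasive": 8.5, "invasive": 10.5, "mortality": 13.5},
    "CURB-65": {"noninvasive": 1.5, "invasive": 2.5, "mortality": 2.5},
}


def suggested_support(system: str, score: int) -> str:
    """Map a whole-number score to the respiratory support band it falls in."""
    c = CUTOFFS[system]
    if score > c["invasive"]:
        return "invasive ventilator support"
    if score > c["noninvasive"]:
        return "noninvasive ventilator support"
    return "high-flow oxygen inhalation"


def mortality_flag(system: str, score: int) -> bool:
    """True when the score exceeds the reported mortality cut-off."""
    return score > CUTOFFS[system]["mortality"]


print(suggested_support("APACHE II", 11))  # noninvasive ventilator support
print(mortality_flag("APACHE II", 12))     # True
```

Note that for APACHE II the mortality cut-off (11.5) lies below the invasive support cut-off (12.5), so a score of 12 falls in the noninvasive band while already exceeding the mortality threshold.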
The APACHE II score is a classic tool for assessing the severity of the disease in patients in the ICU [4,13]. The higher the score, the more critical the situation, the worse the prognosis, and the higher the mortality [13,14]. Wang et al. determined that the median APACHE II score of patients with severe novel coronavirus pneumonia was 17 (10-22) [8], which is also consistent with our research. The APACHE II score is better than the scores of the other two methods when evaluating noninvasive respiratory support treatment (P = 0.0046 and 0.0059, respectively). In terms of invasive respiratory support therapy, the APACHE II score is better than the CURB-65 score (P = 0.0372). Further, the APACHE II score is also better than that of CURB-65 (P = 0.0150) in predicting mortality risk. Therefore, with comprehensive consideration, the APACHE II score is recommended first when assessing the overall condition of patients with COVID-19.
The MuLBSTA score assesses the risk of death from viral pneumonia [5,15]. Patients with a MuLBSTA score > 12 are categorized as the high-risk group [7]. In our study, patients were at risk of death when the MuLBSTA score was > 13.5. The MuLBSTA score has a sensitivity of 0.6364 and specificity of 0.9355 when assessing the risk of death, as reported through studies [5]. We also found the MuLBSTA score to be better than the CURB-65 score in assessing death risk (P = 0.0021). Therefore, we recommend the MuLBSTA score as the first choice when predicting only the risk of death.
The CURB-65 score is often used to assess the severity of community-acquired pneumonia and requires only a few assessment items [16]. Owing to its simplicity and low score range, the CURB-65 score has high sensitivity and low specificity when assessing a condition [17]. It is necessary to combine other parameters of the patient with the CURB-65 score to reach a final clinical judgment. Similarly, in our study, we found that the CURB-65 score was not as effective as the other two scoring systems in assessing the need for respiratory support and the risk of death. Therefore, based on our study results, the CURB-65 score is not recommended for the assessment of patients with COVID-19.
Our study has a few limitations. It is a single-center, retrospective study with a relatively small sample size. Further, some clinical data were missing.