[JAMA Editorial]: An Update to the Assessment of Organ Failure in Critically Ill Patients
November 9, 2025 | Research Commentary, Progress Updates

Editorial 

A Revision to Organ Failure Assessment in Critically Ill Patients

Christopher W. Seymour

JAMA Published Online: October 29, 2025

doi: 10.1001/jama.2025.20255

Modern intensive care of critically ill patients is evolving. Nearly 1 in 5 hospitalized patients receive intensive care, and there are new diseases, treatments, and approaches to organ support. Many intensive care unit (ICU) patients develop acute, vital organ failure—a nefarious and unremitting cause of death—and structured measures of vital organ function help to quantify illness severity.1 These scores were developed almost 50 years ago and were meant to be generic and independent of the cause of multiple organ failure.2,3 Now, organ failure scores are still incorporated into contemporary risk prediction models, syndromic criteria like Sepsis-3, and disaster triage tools and are also used to compare ICU populations and outcomes from randomized clinical trials.4,5 Organ failure scores are used widely by clinicians, researchers, and quality improvement teams. But how these scores should evolve with changes in modern intensive care is debated.

Among the many options for quantifying organ function, the Sequential Organ Failure Assessment (SOFA) score was meant to reconcile conflict and create a standardized tool for randomized trials.3 Developed in 1994 at a consensus meeting and published shortly thereafter, the SOFA score uses 6 domains to capture the severity of organ injury: respiratory, cardiovascular, hepatic, neurological, coagulation, and kidney. The total score ranges from 0 to 24 points, with 0 to 4 points assigned to each domain. The clearly stated principles guiding the SOFA score development were that (1) organ failure is on a continuum and is not a dichotomous process that is present or absent; (2) the score must be adaptable to changes during a patient’s course over time; (3) simplicity, generalizability, objectivity, and independence from treatment interventions are paramount; and most importantly, (4) the score was meant to describe organ function and not predict outcome. Its purpose is distinct from other scores in assessing critically ill patients, such as the Acute Physiology and Chronic Health Evaluation (APACHE) or multiple organ dysfunction score (MODS), which use multiple data points to generate accurate predictions of in-hospital mortality.2,6 The SOFA score, in contrast, may be more of an accessible bedside tool that is useful at different time points during intensive care to capture a patient’s response to therapeutic interventions. For example, Ferreira et al7 reported that a change in a patient’s SOFA score during intensive care was strongly associated with outcome and that a SOFA increase from baseline was associated with a more than 50% mortality rate.

The SOFA score has since been adapted for pediatric patients, modified for measurement in low-resource settings, made computable in the electronic health record, and even simplified to just 3 organ domains.4,8,9 Now, clinicians may also be aware of the SOFA score as part of the international consensus of definitions for sepsis and septic shock (Sepsis-3) after the SOFA score was evaluated in more than 2 million patients.4 In these criteria, a suspicion of infection and an increase of 2 or more SOFA points from baseline identified an in-hospital mortality of nearly 10% among patients with sepsis.
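The scoring structure described above (6 domains, 0 to 4 points each, a 0 to 24 total, and the Sepsis-3 rule of suspected infection plus an increase of 2 or more points from baseline) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the class and function names are hypothetical, and the sketch deliberately takes the per-domain points as given rather than deriving them from clinical variables, since the editorial does not list the thresholds.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SofaDomains:
    """Per-domain SOFA points (hypothetical container, 0 to 4 points each)."""
    respiratory: int
    cardiovascular: int
    hepatic: int
    neurological: int
    coagulation: int
    kidney: int

    def __post_init__(self) -> None:
        # Each domain contributes 0 to 4 points, as described in the text.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0 <= value <= 4:
                raise ValueError(f"{f.name} must be between 0 and 4, got {value}")

    def total(self) -> int:
        """Sum of the six domain scores, so the total lies between 0 and 24."""
        return sum(getattr(self, f.name) for f in fields(self))


def meets_sepsis3_sofa_criterion(
    baseline: SofaDomains, current: SofaDomains, suspected_infection: bool
) -> bool:
    """Suspected infection plus an increase of 2 or more points from baseline."""
    return suspected_infection and (current.total() - baseline.total() >= 2)


if __name__ == "__main__":
    baseline = SofaDomains(0, 0, 0, 0, 0, 0)
    current = SofaDomains(respiratory=2, cardiovascular=0, hepatic=0,
                          neurological=0, coagulation=0, kidney=1)
    print(current.total())
    print(meets_sepsis3_sofa_criterion(baseline, current, suspected_infection=True))
```

Keeping the domain points as inputs mirrors the editorial's distinction between the score's structure, which is fixed, and its thresholds, which are what SOFA-2 revisits.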

There are many issues with the SOFA score that deserve consideration. First, the organs included and number of score categories per organ were subjectively determined. The impact of critical illness on the gut or immune system was not included. Second, the SOFA variables specifying the loss of organ function may not in fact represent the true function of the organ. An example is how the Glasgow Coma Scale, used for the neurological domain of the SOFA score, measures the severity of brain injury due to trauma but is a less reliable measure of delirium or septic encephalopathy. Third, the thresholds for an increase of a single SOFA point in any organ were decided arbitrarily without data analysis. Fourth, the SOFA score did not distinguish so-called chronic organ dysfunction from what occurs acutely during critical illness. For example, 2 SOFA points for kidney dysfunction in an elderly patient with hypertension and chronic kidney disease is equivalent to that scored for a young, healthy patient with new acute kidney injury after a motor vehicle accident, despite different biological mechanisms and chances of recovery. Fifth, the cardiovascular and respiratory SOFA domains are measured by the treatment provided for organ support rather than by a biological measure of organ function. And notably, new types of organ support such as vasopressin and high-flow oxygen therapy are not even included.10 This subjectivity of the SOFA score is criticized because 2 intensivists, faced with the same patient, could make very different decisions about the type and amount of organ support to provide. With forethought, though, the SOFA authors noted at the outset that their proposed score was “not definitive” and should be revised with new data.3

In this issue of JAMA,11 and in an accompanying article in JAMA Network Open,12 Moreno and colleagues present the development and validation of the SOFA-2 score, a data-driven update to the SOFA (now SOFA-1) score from 1994. The update proceeded in 8 stages, beginning with expert selection for 2 modified Delphi meetings and systematic reviews (stages 1 to 5). A panel of 60 experts reaffirmed the original principles of the SOFA score, identified evidence gaps, and developed a theoretical framework for the score update. They proposed evaluating 8 organ domains, including the original 6 from SOFA-1 plus gastrointestinal and immune domains. New variables to reflect organ dysfunction were proposed (ie, the arterial oxygen saturation to fraction of inspired oxygen [Sao2:Fio2] ratio when the arterial oxygen tension [Pao2] to Fio2 ratio was not available) as well as the inclusion of new treatments in the shock domain (ie, vasopressin). With access to 8 national ICU registries and 2 multicenter electronic health record–based datasets from 1319 intensive care units in 9 countries, they conducted internal validation of the proposed score in 4 cohorts (stage 6). Their analysis of 2 098 356 patients revealed 2 key findings: (1) machine learning models found candidate variable thresholds for SOFA-2 similar to those proposed by experts in the modified Delphi process, and (2) measures of gastrointestinal and immune function were not strongly associated with outcome. An iterative Delphi (stage 7) then excluded these domains from the final score for external validation (stage 8). The authors then proceeded with a variety of confirmatory analyses across 2.5 million patients that included different methods for missing data, individual organ evaluations, and assessments of SOFA-2 distributions with ICU mortality. Not surprisingly, an increase in the SOFA-2 score is correlated with worse ICU outcomes. For example, a 1-unit increase in SOFA-2 was associated with an increase in the odds of ICU mortality of 38% (odds ratio, 1.378; 95% CI, 1.375-1.381).
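To make the reported odds ratio concrete, the short Python sketch below shows how a per-point odds ratio compounds over several points and how it translates into a probability. It rests on two assumptions that are not from the editorial: that the odds ratio of 1.378 applies uniformly to every 1-point step (a log-linear relationship), and a purely hypothetical baseline ICU mortality of 10%. The function names are illustrative only.

```python
PER_POINT_ODDS_RATIO = 1.378  # odds ratio per 1-point SOFA-2 increase, from the text


def odds_multiplier(points: int, per_point_or: float = PER_POINT_ODDS_RATIO) -> float:
    """Multiplier on the odds of ICU death for a given increase in points.

    Assumes the same odds ratio applies to every 1-point step.
    """
    return per_point_or ** points


def shifted_probability(
    baseline_prob: float, points: int, per_point_or: float = PER_POINT_ODDS_RATIO
) -> float:
    """Convert a baseline probability to odds, scale the odds, convert back."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_multiplier(points, per_point_or)
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    hypothetical_baseline = 0.10  # assumed for illustration, not a study result
    for points in (1, 2, 3):
        print(points,
              round(odds_multiplier(points), 3),
              round(shifted_probability(hypothetical_baseline, points), 3))
```

Under these assumptions a 2-point increase roughly doubles the odds (1.378 squared is about 1.9), but the change in probability depends on the baseline risk, which is why a 38% increase in odds is not the same as a 38% increase in mortality.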

But what has changed in SOFA-2? One observation is that the SOFA-2 score is different from the SOFA-1 score for roughly half of ICU patients. The SOFA-2 score was greater in 11% of patients (median difference, 2 [IQR, 1 to 3] points) and lower in 40% of patients (median difference, −3 [IQR, −4 to −1]) than the corresponding SOFA-1 score. What accounts for these differences? It is likely that new threshold values and category criteria in the respiratory, cardiovascular, and kidney systems resulted in intermediate score distributions that are more aligned with mortality. As evidence of these new thresholds, a cardiovascular score of 2 was assigned to a larger proportion of patients under SOFA-2 (9%) than under SOFA-1 (1%). An additional change is the inclusion of modern treatments in the SOFA-2 score criteria. Interventions like high-flow nasal oxygen, noninvasive ventilation, extracorporeal membrane oxygenation, sedative agents, and a more detailed stratification of vasopressors now inform point scores.

What did not change in SOFA-2 is also notable. The SOFA-2 score retained 6 organ system domains and forced a 5-category (4-point) framework onto each domain. Metaphorically, if each domain were a ladder, the authors have standardized the number of rungs on the ladder and assumed that they are equally spaced. Their analyses do not definitively establish whether 4 rungs is optimal, nor whether a 1-rung change in each organ corresponds to the same change in outcome. For example, an increase from 0 to 1 point in the hepatic domain (ie, total bilirubin increase from 1.1 mg/dL to 1.3 mg/dL [18.81-22.24 μmol/L]) is deemed equivalent to an increase from 1 to 2 points in the cardiovascular domain (ie, mean arterial pressure of 68 mm Hg for the patient requiring a norepinephrine dose of 0.19 µg/kg/min). The predictive validity was also essentially unchanged in the updated score. When evaluating all the data available in the 2-stage meta-analysis, the area under the receiver operating characteristic (AUROC) curve estimates were similar for SOFA-2 (AUROC, 0.79) and SOFA-1 (AUROC, 0.77) for ICU mortality.

What issues were left unresolved? First, the complicated issue of chronic organ dysfunction was left untouched. Although likely debated during the modified Delphi process, no consideration for chronic, failing organs is included in the final SOFA-2. It is possible that the SOFA-2 score will be unchanged whether organ dysfunction is acute, chronic, or acute-on-chronic disease, but that the meaning of that score will be different. A second issue is that of missing data. Often, when organs are functioning well, the clinical team will not measure laboratory tests, so data are not available to populate the SOFA-2 score. This nonrandom missingness is evident in the different SOFA-2 distributions using complete-case and normal-value substitution data. How missing data are ultimately handled in SOFA-2 likely depends on the user and use case. Third, some organ system domains continue to rely on organ-support interventions to categorize point scores—an approach that is subject to clinician practice variability. Fourth, to their credit, the authors involved more than 1000 ICUs across many countries. Yet the SOFA-2 score was not evaluated for patients outside the ICU—where its inclusion in Sepsis-3 criteria is recommended. Fifth, SOFA-2 assumes, without verification, that the failure of organs is independent across domains, such that complete failure in 1 organ (4 points) is equivalent to moderate failure (2 points) in 2 different organs. The variable combination of organ failures was not studied.
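Two of the unresolved issues above, missing data and the additive treatment of domains, are easy to see in a small Python sketch. The helper names and the example patients are hypothetical, and the two missing-data strategies shown (complete-case and normal-value substitution) are only simplified readings of the approaches the editorial mentions, not the authors' actual procedures.

```python
from typing import Mapping, Optional

DOMAINS = ("respiratory", "cardiovascular", "hepatic",
           "neurological", "coagulation", "kidney")

Scores = Mapping[str, Optional[int]]


def total_complete_case(scores: Scores) -> Optional[int]:
    """Return a total only when every domain was measured; otherwise None."""
    values = [scores.get(domain) for domain in DOMAINS]
    if any(value is None for value in values):
        return None
    return sum(values)


def total_normal_substitution(scores: Scores) -> int:
    """Treat an unmeasured domain as normal (0 points) and sum the rest."""
    return sum(scores.get(domain) or 0 for domain in DOMAINS)


if __name__ == "__main__":
    # Hypothetical patient whose bilirubin was never measured (hepatic missing).
    unmeasured = {"respiratory": 2, "cardiovascular": 1, "hepatic": None,
                  "neurological": 0, "coagulation": 0, "kidney": 1}
    print(total_complete_case(unmeasured))        # no total: patient is dropped
    print(total_normal_substitution(unmeasured))  # total with hepatic counted as 0

    # Additivity: complete failure of 1 organ and moderate failure of 2 organs
    # produce the same total, although the clinical pictures differ.
    single_failure = dict.fromkeys(DOMAINS, 0) | {"kidney": 4}
    paired_failure = dict.fromkeys(DOMAINS, 0) | {"kidney": 2, "respiratory": 2}
    print(total_normal_substitution(single_failure)
          == total_normal_substitution(paired_failure))
```

The complete-case version drops patients whose tests were never ordered, who tend to be the less sick ones, whereas substitution keeps them with a lower score. That difference is one plausible reason the two approaches yield different SOFA-2 distributions.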

So is the SOFA-2 score the ideal measure of organ dysfunction in critically ill patients? Thirty years ago, many of the same authors stated that a simple, easy, reliable, routinely available, continuous score independent of patients, case mix, and intervention was ideal.3 SOFA-2 comes close, but many future opportunities remain. The SOFA score, despite its many applications, is yet to become a bedside tool, a discussion topic on ICU rounds, or a score that directly influences clinical practice. Perhaps more parsimony or embedding in the electronic health record could move this agenda forward. The fundamental assumption that SOFA-2 has adequate construct validity is also worth evaluating. A firm link to the underlying biology of organ failure could help limit the reliance on organ support treatments for score categories. Finally, a greater understanding of how the SOFA-2 score relates to demographics, comorbidities, and types of critical illness is imperative. This will ensure that a similar change in a SOFA-2 score for an older patient with septic shock has the same meaning as for a young adult with hemorrhagic shock.

Taken together, SOFA-2 has been thoughtfully prepared and thoroughly studied. But like SOFA-1, it should not be considered definitive, and the measurement of organ function in critically ill patients should continue to evolve.
