Your Mileage May Vary: Toward Personalized Oxygen Supplementation

Derek C. Angus

JAMA. Published online March 19, 2024. doi:10.1001/jama.2024.0972

When car makers tout standard emission tests to demonstrate their car’s fuel efficiency, they must disclose that the results are just estimates and that a prospective buyer’s experience may differ.¹ The phrase “your mileage may vary” is used to acknowledge that the average result from a standard test may not reflect an individual user’s experience. This phrase is now so widely understood that it appears on more than 70 million web pages and is routinely shortened to YMMV in online reviews of topics far beyond fuel efficiency.

In medicine, things are different. The standard test is the randomized clinical trial (RCT), which also generates an average, not an individual, effect. Treatment guidelines acknowledge that physician judgment is needed to determine whether RCT results apply to an individual patient. However, guidelines typically recommend that physicians prescribe to all patients those interventions that worked in applicable populations and withhold those that did not. Quality improvement efforts further rely on the average effect when they set compliance targets. In other words, the emphasis is on the average effect and not the possibility that your mileage may vary.

Using the average effect as a guide for everyone works in many situations. If all that varies for a therapy with an average beneficial effect is the magnitude of the benefit, one could start that therapy at a low dose and adjust as needed, particularly if there were no downsides to titrating slowly and cessation of the therapy quickly resolved any untoward adverse effects. But, what if an intervention, even in applicable populations, helped some patients but hurt others? And what if, once one approach was selected, an alternative was no longer possible? In this issue of JAMA, Buell et al² consider such a potential situation: supplemental oxygen support for patients requiring mechanical ventilation in the intensive care unit (ICU).

Supplemental oxygen, listed as an essential medicine by the World Health Organization,³ is used to avert life-threatening tissue hypoxia in many conditions, and may be the world’s most prescribed therapy. Oxygen is provided to every patient receiving mechanical ventilation, set as a fraction of inspired gas. The fraction is adjusted to maintain an arterial oxygen saturation target, typically measured by continuous pulse oximetry (with or without intermittent arterial blood gas estimation). The challenge comes in choosing the target. One can set a target close to 100%, thereby giving lots of oxygen and providing the best chance of avoiding vital organ tissue hypoxia (a liberal oxygen approach). Alternatively, one can give just enough oxygen (a conservative oxygen approach) to avoid arterial hypoxemia (below 90%), thereby minimizing the likelihood of toxic hyperoxia.⁴ Unfortunately, there is no easy way to measure directly either tissue hypoxia or oxygen toxicity. Furthermore, any consequences may not be apparent until days or weeks later.

Liberal oxygen titration is traditionally used, but some studies suggested a conservative approach is safer, whereas other studies found the opposite or no difference between liberal and conservative targets. More recently, 2 large RCTs examined this question: the Intensive Care Unit Randomized Trial Comparing Two Approaches to Oxygen Therapy (ICU-ROX) trial,⁵ conducted in 21 ICUs in Australia and New Zealand (n = 965), and the Pragmatic Investigation of Optimal Oxygen Targets (PILOT) trial,⁶ conducted at Vanderbilt University Medical Center in a single ICU. PILOT was a cluster RCT, evaluating 2541 patients in 18 2-month clusters randomly assigned to deploy liberal, conservative, or intermediate approaches. Neither RCT found any (average) difference in outcome with any approach. Rather than be reassured, Buell and colleagues worried that the possibility remained that some patients were helped and others harmed. They conducted a secondary analysis of the 2 trials to determine whether there was important heterogeneity of treatment effects across individuals.

The authors used the liberal and conservative groups from PILOT as a development set for the construction of a machine-learning model to estimate individual effects, given a set of prerandomization predictor variables. In contrast to a traditional risk prediction model, which predicts a patient’s outcome, an effect model predicts the likely effect that therapy will have on that individual’s outcome. Once the authors selected the best performing effect model, they used ICU-ROX as a validation set. The model predicted that individual treatment effects of conservative oxygen in ICU-ROX patients ranged from a 27% reduction to a 34% increase in absolute mortality. Ranking patients by their individual treatment effect and grouping them in thirds, those in the group predicted most likely to benefit from conservative oxygen had a 6% lower mortality rate if exposed to conservative oxygen; in the group most likely to benefit from liberal oxygen, those receiving liberal oxygen had a 13% lower mortality. In the middle third, outcomes were similar with either strategy. If the model had been deployed for all patients, mortality would have been 6.4% lower than that observed when therapy was assigned randomly.
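
The develop, rank, and validate workflow described above can be made concrete with a toy example. What follows is only a minimal sketch on synthetic data, not the machine-learning method used by Buell et al: the single covariate `z`, the assumed linear effect, the sample sizes, and the binned effect estimator are all hypothetical choices made for illustration. It fits a crude effect model in a development trial, ranks patients in a separate validation trial by predicted effect, and compares observed mortality between arms within each third.

```python
import random

def make_trial(n, seed):
    """Synthetic two-arm trial; `z` is a hypothetical effect modifier in [0, 1)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random()
        conservative = rng.random() < 0.5  # 1:1 randomization
        # Assumed truth: conservative oxygen shifts mortality by 0.3 * (z - 0.5),
        # so it helps when z is low, harms when z is high, and averages to zero.
        p_death = 0.30 + (0.3 * (z - 0.5) if conservative else 0.0)
        rows.append((z, conservative, int(rng.random() < p_death)))
    return rows

def mortality(rows):
    return sum(died for _, _, died in rows) / len(rows)

def arm_difference(rows):
    """Observed mortality, conservative minus liberal."""
    cons = [r for r in rows if r[1]]
    lib = [r for r in rows if not r[1]]
    return mortality(cons) - mortality(lib)

def bin_index(z, bins):
    return min(int(z * bins), bins - 1)

def fit_effect_model(rows, bins=10):
    """Crude effect model: estimate the arm difference within each bin of z."""
    by_bin = [[] for _ in range(bins)]
    for row in rows:
        by_bin[bin_index(row[0], bins)].append(row)
    return [arm_difference(b) for b in by_bin]

def effect_thirds(model, rows):
    """Rank patients by predicted effect and split them into thirds."""
    bins = len(model)
    ranked = sorted(rows, key=lambda r: model[bin_index(r[0], bins)])
    k = len(ranked) // 3
    return ranked[:k], ranked[k:2 * k], ranked[2 * k:]

development = make_trial(20000, seed=1)   # plays the role of the development set
validation = make_trial(20000, seed=2)    # plays the role of the validation set
model = fit_effect_model(development)
thirds = effect_thirds(model, validation)
observed = [arm_difference(t) for t in thirds]
overall = arm_difference(validation)

print("overall difference:", round(overall, 3))
for label, diff in zip(("favors conservative", "middle", "favors liberal"), observed):
    print(label, round(diff, 3))
```

In this toy setting the overall difference is close to zero while the outer thirds diverge in opposite directions, which is the pattern an effect model is meant to detect. Real trial data are far noisier, and a single known modifier is an optimistic assumption.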

So what does this all mean? The authors’ methods were generally robust. Their approach is discussed further, together with a broader consideration of effect models, by Wang et al in this issue.⁷ The work of Buell and colleagues is somewhat at odds with a recent statement on how and when to conduct heterogeneity of treatment effects analyses, which recommends using risk models, not effect models, and avoiding exploration of individual treatment effect when no average effect was demonstrated.⁸ However, the field is young. Future best practice statements will likely evolve, and the authors’ rationale and methods appear well-justified in this instance.

If the results are true and generalizable, then the consequences are staggering. If one could instantly assign every patient into their appropriate group of predicted benefit or harm and assign their oxygen target accordingly, the intervention would theoretically yield the greatest single improvement in lives saved from critical illness in the history of the field. But, there are key issues to consider. Effect models tend to overfit data and find more heterogeneity than exists.⁸⁻¹⁰ In other words, the magnitude of heterogeneity may be exaggerated. The likelihood of overfitting is decreased, but not eliminated, through use of external validation.

A central issue in estimating individual effects is the impossibility of observing the outcome of interest (how one’s outcome is changed by therapy) because that would require access to 2 parallel universes. This dilemma is Holland’s¹¹ fundamental problem of causal inference. Consequently, information on the best approach to estimate heterogeneity of treatment effects comes from exploration in simulated trial data sets where one can construct parallel universes and create situations with and without heterogeneity of treatment effects. These simulation exercises demonstrate that model performance is highly dependent on the size of the sample and the degree to which variables that play an important role in modulating a therapy’s effect, or strong proxies for these variables, are captured in the data.⁹,¹⁰ Thus, whether models built on larger or richer trial data sets would yield similar results to those of Buell et al is not known.
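
The idea of constructing parallel universes in simulation can be illustrated with a short sketch. It is not drawn from the cited simulation studies; the risks, the binary `modifier` variable, and the sample size are assumptions chosen so that conservative oxygen helps half the patients, harms the other half, and has no average effect. Each simulated patient carries an outcome under both strategies, something no real trial can observe.

```python
import random

def simulate_patients(n=10000, seed=0):
    """Create patients whose outcome is known in BOTH 'universes'."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        modifier = rng.random() < 0.5   # hypothetical binary effect modifier
        p_liberal = 0.30                # assumed mortality risk with liberal oxygen
        # Assumed truth: conservative oxygen helps one half and harms the other.
        p_conservative = 0.20 if modifier else 0.40
        frailty = rng.random()          # one draw shared by both universes
        died_liberal = int(frailty < p_liberal)
        died_conservative = int(frailty < p_conservative)
        arm = rng.choice(["conservative", "liberal"])  # what a trial would assign
        patients.append((modifier, died_conservative, died_liberal, arm))
    return patients

def mean(values):
    values = list(values)
    return sum(values) / len(values)

patients = simulate_patients()

# Only possible in simulation: every patient's individual effect is known.
true_average = mean(c - l for _, c, l, _ in patients)
true_with_modifier = mean(c - l for m, c, l, _ in patients if m)
true_without_modifier = mean(c - l for m, c, l, _ in patients if not m)

# What a real trial sees: one outcome per patient, in the assigned arm only.
observed = (mean(c for _, c, _, a in patients if a == "conservative")
            - mean(l for _, _, l, a in patients if a == "liberal"))

print("true average effect:", round(true_average, 3))
print("true effect, modifier present:", round(true_with_modifier, 3))
print("true effect, modifier absent:", round(true_without_modifier, 3))
print("observed arm difference:", round(observed, 3))
```

Because the truth is known by construction, an effect model fitted to the observed column alone can be scored against the true individual effects. Varying the sample size or hiding `modifier` from the model shows how performance depends on both.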

Overinterpreting which variables the model selected should also be avoided. For example, the finding that patients with sepsis were more likely to be assigned to the third group that benefited most from liberal oxygen does not mean that all patients with sepsis should receive liberal oxygen, nor does it mean that sepsis caused conservative oxygen therapy to be harmful. Other variables that appeared important in the analysis of Buell and colleagues, such as prerandomization blood pressure and heart rate, change rapidly and their meaning is unclear. Although they were important in this instance, models built on larger, richer data sets may well select different variables.

Of course, even if we were confident in the model’s performance and choice of variables, there are many implementation considerations with the deployment of any decision rule. That said, the work by Buell et al sets up important next steps. Two far larger multicenter RCTs, UK-ROX and MEGA-ROX, are likely to complete enrollment in the next year or so. With more than 50 000 randomized patients anticipated in these 2 trials alone, more robust heterogeneity of treatment effect estimation will be possible and such an estimate will likely be more important than the primary results of either trial. Assuming qualitative heterogeneity of treatment effect is again shown in analyses that include these 2 trials, it will be imperative to design, implement, and evaluate practical bedside decision rules for the personalized use of one of the world’s most important therapies.
