Posted June 9, 2019, in Research Commentary, Progress Exchange

Editorial May 19, 2019

New Phenotypes for Sepsis: The Promise and Problem of Applying Machine Learning and Artificial Intelligence in Clinical Research

William A. Knaus, Richard D. Marks

JAMA. Published online May 19, 2019. doi:10.1001/jama.2019.5794

In this issue of JAMA, Seymour and his multidisciplinary team of coauthors1 aim to improve the current understanding of sepsis by identifying new clinical phenotypes using machine learning clustering techniques. The authors assume that current sepsis definitions are too broad and clinically imprecise to untangle complex clinical and biological interactions in sepsis and that better-defined phenotypes represent the key to ultimately identifying new and successful therapeutic approaches. Their analytic approaches, such as unsupervised clustering, may be unfamiliar to many readers. Unsupervised learning is a type of machine learning used to draw inferences from data sets consisting of input data without labeled responses or direction. The most common unsupervised learning method is cluster analysis, which is used in exploratory data analysis to find hidden patterns or groupings in the data.

The authors used this and other unsupervised techniques in their retrospective analyses of multiple data sets. They attempted to derive sepsis phenotypes from clinical data, determine the reproducibility and relationship of these phenotypes with biomarkers and clinical outcomes, and simulate the potential influence of the phenotypes on the results of previous randomized clinical trials (RCTs). The researchers started with clinical data from 20 189 patients who met Sepsis-3 criteria2 to serve as the derivation cohort, with data from an electronic health record database. The analyses were limited to 29 variables, mostly vital signs and laboratory tests, collected within the initial 6 hours following hospital presentation. This set of clinical variables meant that models such as clustering were the appropriate analytic choice.3

The investigators initially used a density-based clustering algorithm, OPTICS (Ordering Points to Identify the Clustering Structure), to provide an overall visualization of how the 29 clinical variables were structured.4 This analysis suggested that the data were better suited to a partitioning approach to clustering (akin to cutting a pizza into slices) than to a search for discrete natural groupings (eg, separating apples and oranges; eFigure 4 in the article1). The authors then used another unsupervised technique, consensus k-means clustering, to partition the data and to examine how often each of the 29 clinical variables was placed in the same cluster or grouping in order to reach a consensus matrix. From these analyses in the derivation cohort emerged 4 distinct clinical phenotypes labeled α, β, γ, and δ.
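The idea behind consensus clustering can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a generic version that assumes synthetic data, a plain Lloyd's k-means, and hypothetical function names (`kmeans_labels`, `consensus_matrix`). It repeatedly clusters random subsamples and records how often each pair of rows lands in the same cluster; values near 1 or 0 indicate stable groupings, while values near 0.5 indicate unstable ones.

```python
import numpy as np


def kmeans_labels(X, k, rng, n_iter=100):
    """Plain Lloyd's k-means; returns one integer cluster label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Distance from every row to every center, shape (n_rows, k).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def consensus_matrix(X, k, n_runs=50, frac=0.8, seed=0):
    """Fraction of subsampled runs in which each pair of rows co-clusters."""
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # times a pair shared a cluster
    sampled = np.zeros((n, n))   # times a pair was drawn in the same subsample
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        labels = kmeans_labels(X[idx], k, rng)
        same = (labels[:, None] == labels[None, :]).astype(float)
        sampled[np.ix_(idx, idx)] += 1
        together[np.ix_(idx, idx)] += same
    # Pairs never sampled together get 0 instead of a division error.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(sampled > 0, together / sampled, 0.0)


# Synthetic stand-in data (NOT patient data): two well-separated groups
# of 30 rows each, with 3 made-up variables per row.
demo_rng = np.random.default_rng(1)
X = np.vstack([
    demo_rng.normal(0.0, 0.5, size=(30, 3)),
    demo_rng.normal(5.0, 0.5, size=(30, 3)),
])
C = consensus_matrix(X, k=2)
print("mean consensus within first group:", round(C[:30, :30].mean(), 3))
print("mean consensus between groups:", round(C[:30, 30:].mean(), 3))
```

In practice the number of clusters k is itself varied, and the k whose consensus matrix is closest to a clean block structure is preferred; the sketch fixes k = 2 only to keep the example short.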

The 4 phenotypes were compared with commonly used approaches to identify subgroups in sepsis such as organ failure (the Sequential Organ Failure Assessment) and traditional severity of illness (the Acute Physiology and Chronic Health Evaluation) scores.5,6 The authors found that, although the 4 phenotypes have some relationship to existing approaches, they represent unique clinical groupings. The authors used chord plots to demonstrate how organ system groupings of the 29 variables contributed to these new phenotypes; the thickness of each band represents the strength of the contribution (Figure 1 in the article1).

The authors also investigated the differential contribution each of these variables makes to defining the phenotype (Figure 2 and eFigures 6-9 in the article1). For example, the α phenotype has few abnormal laboratory values, less organ system dysfunction, and the lowest mortality rate. Patients with the β phenotype were older and had frequent chronic illness and kidney disease. The γ phenotype was more likely to have measures of inflammation, lower albumin levels, and higher temperatures. The δ phenotype had high serum lactate levels, elevated levels of transaminases, hypotension, and the highest mortality rate.

Across the phenotypes, there were clear outcome differences in short-term (in-hospital) mortality (2% for the α phenotype, 5% for the β phenotype, 15% for the γ phenotype, and 32% for the δ phenotype) in the derivation cohort that were largely maintained in other data sets, including the validation cohort from a database of electronic health records (n = 43 086 patients), a cohort of patients with pneumonia (n = 583 patients), and patients with sepsis from 3 RCTs (n = 1706, 1690, and 1341 patients; Figure 5 in the article1). When examining the relationships of phenotypes with biomarker data, the authors found significant differences in the distribution of many biomarkers across the 4 phenotypes in the GenIMS (Genetic and Inflammatory Markers of Sepsis) cohort of patients with pneumonia and in the 3 RCTs (Figure 4 in the article1).

The authors also simulated how the results of the 3 RCTs would have varied as the proportions of α and δ phenotypes were increased among the trial study populations. Although none of the 3 trials ultimately resulted in a successful outcome, the balance of benefit to harm changed as the proportions of α and δ phenotypes were varied. These results reinforce the original assumption that current consensus definitions of sepsis are very broad and may encompass many potentially meaningful but unrecognized clinical phenotypes.
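The arithmetic behind such a simulation can be shown with a toy calculation. The sketch below is not the authors' method. It takes the in-hospital mortality rates reported for the derivation cohort as control-arm baselines, and it assumes invented per-phenotype relative risks (`RELATIVE_RISK`) and a hypothetical helper (`shift_mix`) that trades α share for δ share while holding β and γ fixed.

```python
# In-hospital mortality by phenotype in the derivation cohort (from the text).
BASELINE = {"alpha": 0.02, "beta": 0.05, "gamma": 0.15, "delta": 0.32}

# HYPOTHETICAL relative risks under treatment, invented for illustration only:
# benefit in alpha, no effect in beta and gamma, harm in delta.
RELATIVE_RISK = {"alpha": 0.8, "beta": 1.0, "gamma": 1.0, "delta": 1.1}


def overall_risk_difference(mix):
    """Treated-minus-control mortality for a trial with the given phenotype mix.

    `mix` maps each phenotype to its share of the trial population.
    """
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("phenotype shares must sum to 1")
    control = sum(mix[p] * BASELINE[p] for p in mix)
    treated = sum(mix[p] * BASELINE[p] * RELATIVE_RISK[p] for p in mix)
    return treated - control


def shift_mix(delta_share):
    """Hypothetical mix: beta and gamma fixed at 25% each, alpha gets the rest."""
    if not 0.0 <= delta_share <= 0.5:
        raise ValueError("delta_share must lie between 0 and 0.5")
    return {
        "alpha": 0.5 - delta_share,
        "beta": 0.25,
        "gamma": 0.25,
        "delta": delta_share,
    }


# Sweep the delta share from 0% to 50% and report the overall effect.
for step in range(6):
    share = step / 10
    diff = overall_risk_difference(shift_mix(share))
    print(f"delta share {share:.0%}: risk difference {diff:+.4f}")
```

Under these made-up effect sizes, the overall result moves from net benefit to net harm as the δ share grows, which is the kind of sensitivity to case mix that the simulation in the article was designed to probe.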

Seymour and colleagues1 also correctly acknowledge that future investigation will be needed to determine whether these 4 phenotypes will be useful in clinical research and practice. Assuming these phenotypes are unique and do not simply reflect variations in organ system failure or severity of illness, they might be used in combination with additional clinical data and with other manifestations of sepsis as measured by systems biology and novel gene expression patterns. This approach might identify new subsets of patients with sepsis who require different immunotherapeutic interventions. Yet this promise of combining advanced machine learning approaches with large clinical and biological data sets is not possible today.

A reason some of these more advanced machine learning or artificial intelligence techniques may not have been adequate in this study is that, despite having millions of patient records, the researchers could access only a small amount of the clinical data within them. For instance, additional details in patients’ longitudinal electronic health records may have been useful in the initial hours of the clinical evaluation, and might have included variables such as (1) the precise past clinical and pathological diagnoses; (2) the exact type and severity of individual comorbidities; and (3) recent interventions such as chemotherapy that might reduce a patient’s ability to mount a strong immune defense. However, these clinical variables are not readily accessible today, and, moreover, it is uncertain whether and how the addition of these variables would have improved the models the authors derived. Whether having additional data would improve the analysis and allow for the development of more effective interventions is unknown.

Thus, the study by Seymour and colleagues1 should serve as a reality check on the excitement and anticipation that large electronic clinical data sets combined with sophisticated machine or deep learning techniques will soon lead to the discovery of new therapeutic approaches and more precise, personalized clinical care. Much of the progress that artificial intelligence has made in medicine has been in the analysis of images, such as the deep learning network of artificial neurons that proved capable of detecting diabetic retinopathy when all the data needed are available.7

Researchers will adopt ever-improving artificial intelligence and machine learning techniques.8,9 But will they have opportunities to apply artificial intelligence and machine learning to immense stores of patient data in electronic health record systems across the United States; and what might they discover then? Without that access, can the search by artificial intelligence for novel patterns, leading to new therapeutic advances, reach what some anticipate to be its potential?

Liberating patient data in exchangeable, standardized clinical data sets is the problem of interoperability. Its engineering solution is a digital, updatable data exchange standard incorporating appropriate security and privacy protections. (In appropriate domains, such as GPS and electronic health record data exchange, government standards are essential.) Yet the US Department of Health and Human Services refuses to promulgate a health information exchange standard, even though it was required to do so by the 21st Century Cures Act.10,11 Consensus exchange standards, although far from perfect, exist, and industry groups are constantly improving them.12-14 The Act specifies that these standards must be used for new exchange protocols. In the interim, dependable exchange of clinical data for research remains unattainable.

The study by Seymour and colleagues1 represents the brave new world of attempting to apply patient data, machine learning, and artificial intelligence to better understand complex, serious clinical problems. However, the ultimate answer to the question “will this approach improve patient outcomes?” remains unknown.

