[Paper published in JAMA Netw Open]: Use of a Large Language Model to Assess Clinical Acuity of Adults in the Emergency Department
Posted August 5, 2024, in News Updates, Progress and Exchange

Original Investigation 

Emergency Medicine

May 7, 2024

Use of a Large Language Model to Assess Clinical Acuity of Adults in the Emergency Department

Christopher Y. K. Williams, Travis Zack, Brenda Y. Miao, et al

JAMA Netw Open. 2024;7(5):e248895. doi:10.1001/jamanetworkopen.2024.8895

Key Points

Question  Can a large language model (LLM) accurately assess clinical acuity in the emergency department (ED)?

Findings  This cross-sectional study of 251 401 adult ED visits investigated the potential for an LLM to classify acuity levels of patients in the ED based on the Emergency Severity Index across 10 000 patient pairs. The LLM demonstrated accuracy of 89% and was comparable with human physician classification in a 500-pair subsample.

Meaning  These findings suggest that LLMs could accurately identify higher-acuity patient presentation when given pairs of presenting histories extracted from patients’ first ED documentation.

Abstract

Importance  The introduction of large language models (LLMs), such as Generative Pre-trained Transformer 4 (GPT-4; OpenAI), has generated significant interest in health care, yet studies evaluating their performance in a clinical setting are lacking. Determination of clinical acuity, a measure of a patient’s illness severity and level of required medical attention, is one of the foundational elements of medical reasoning in emergency medicine.

Objective  To determine whether an LLM can accurately assess clinical acuity in the emergency department (ED).

Design, Setting, and Participants  This cross-sectional study identified all adult ED visits from January 1, 2012, to January 17, 2023, at the University of California, San Francisco, with a documented Emergency Severity Index (ESI) acuity level (immediate, emergent, urgent, less urgent, or nonurgent) and with a corresponding ED physician note. A sample of 10 000 pairs of ED visits with nonequivalent ESI scores, balanced for each of the 10 possible pairs of 5 ESI scores, was selected at random.
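
The balanced design above can be illustrated with a short sketch. With 5 ESI levels there are 10 unordered combinations of nonequivalent levels, so a 10 000-pair sample balanced across them holds 1000 pairs per combination. The function name `sample_balanced_pairs`, the `visits_by_esi` mapping, and sampling with replacement are assumptions made for illustration; the abstract does not describe the authors' sampling code.

```python
import random
from itertools import combinations

ESI_LEVELS = [1, 2, 3, 4, 5]  # 1 = immediate ... 5 = nonurgent


def sample_balanced_pairs(visits_by_esi, n_pairs, seed=0):
    """Draw visit pairs with nonequivalent ESI levels, balanced by level combination.

    visits_by_esi: hypothetical dict mapping ESI level -> list of visit ids.
    Returns a list of ((visit_id, esi), (visit_id, esi)) tuples.
    """
    rng = random.Random(seed)
    level_pairs = list(combinations(ESI_LEVELS, 2))  # the 10 unordered combinations
    per_combo, remainder = divmod(n_pairs, len(level_pairs))
    if remainder:
        raise ValueError("n_pairs must be divisible by the number of level combinations")

    pairs = []
    for level_x, level_y in level_pairs:
        for _ in range(per_combo):
            # Assumption: visits are drawn with replacement within each level.
            pair = [
                (rng.choice(visits_by_esi[level_x]), level_x),
                (rng.choice(visits_by_esi[level_y]), level_y),
            ]
            rng.shuffle(pair)  # randomize which patient is shown first
            pairs.append(tuple(pair))
    rng.shuffle(pairs)
    return pairs
```

Shuffling the order within each pair keeps the position of the higher-acuity patient from leaking the answer.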

Exposure  The potential of the LLM to classify acuity levels of patients in the ED based on the ESI across 10 000 patient pairs. Using deidentified clinical text, the LLM was queried to identify the patient with a higher-acuity presentation within each pair based on the patients’ clinical history. An earlier LLM was queried to allow comparison with this model.
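
The pairwise query described above might be sketched as follows. This is not the authors' prompt or code: the prompt wording, the function names, and the `model` callable (any function that takes a prompt string and returns a reply string) are hypothetical stand-ins. The only domain fact relied on is that a lower ESI number denotes higher acuity.

```python
from typing import Callable


def build_pair_prompt(history_a: str, history_b: str) -> str:
    """Assemble a hypothetical prompt asking which patient is higher acuity."""
    return (
        "Two patients present to the emergency department.\n"
        f"Patient A: {history_a}\n"
        f"Patient B: {history_b}\n"
        "Which patient has the higher-acuity presentation? Answer 'A' or 'B'."
    )


def pick_higher_acuity(model: Callable[[str], str], history_a: str, history_b: str) -> str:
    """Query the model (a stand-in callable) and parse its reply into 'A' or 'B'."""
    reply = model(build_pair_prompt(history_a, history_b)).strip().upper()
    if reply[:1] in ("A", "B"):
        return reply[0]
    raise ValueError(f"Unparseable model reply: {reply!r}")


def is_correct(choice: str, esi_a: int, esi_b: int) -> bool:
    """Compare the model's choice with the documented ESI levels.

    Lower ESI number means higher acuity; pairs are assumed nonequivalent.
    """
    if esi_a == esi_b:
        raise ValueError("Pairs must have nonequivalent ESI levels")
    truth = "A" if esi_a < esi_b else "B"
    return choice == truth
```

Passing the model in as a plain callable keeps the sketch independent of any particular vendor API, so the same scoring logic could be applied to both models being compared.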

Main Outcomes and Measures  Accuracy score was calculated to evaluate the performance of both LLMs across the 10 000-pair sample. A 500-pair subsample was manually classified by a physician reviewer to compare performance between the LLMs and human classification.

Results  From a total of 251 401 adult ED visits, a balanced sample of 10 000 patient pairs was created wherein each pair comprised patients with disparate ESI acuity scores. Across this sample, the LLM correctly inferred the patient with higher acuity for 8940 of 10 000 pairs (accuracy, 0.89 [95% CI, 0.89-0.90]). Performance of the comparator LLM (accuracy, 0.84 [95% CI, 0.83-0.84]) was below that of its successor. Among the 500-pair subsample that was also manually classified, LLM performance (accuracy, 0.88 [95% CI, 0.86-0.91]) was comparable with that of the physician reviewer (accuracy, 0.86 [95% CI, 0.83-0.89]).
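
As a quick check on the headline figure, the accuracy and an approximate 95% CI can be recomputed from the reported counts. The normal-approximation (Wald) interval used here is an assumption chosen for simplicity; the abstract does not state how the authors computed their intervals, and `accuracy_with_ci` is an illustrative name.

```python
import math


def accuracy_with_ci(correct: int, total: int, z: float = 1.96):
    """Return (accuracy, lower, upper) using a normal-approximation interval."""
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts reported in the Results: 8940 correct of 10 000 pairs.
acc, lower, upper = accuracy_with_ci(8940, 10_000)
print(f"{acc:.2f} [{lower:.2f}, {upper:.2f}]")  # prints "0.89 [0.89, 0.90]"
```

Under this approximation the interval agrees with the reported 0.89 (95% CI, 0.89-0.90) to two decimal places.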

Conclusions and Relevance  In this cross-sectional study of 10 000 pairs of ED visits, the LLM accurately identified the patient with higher acuity when given pairs of presenting histories extracted from patients’ first ED documentation. These findings suggest that the integration of an LLM into ED workflows could enhance triage processes while maintaining triage quality and warrants further investigation.
