Use of Artificial Intelligence Chatbots in Interpretation of Pathology Reports

Eric Steimetz, Jeremy Minkowitz, Elmer C. Gabutan, et al

JAMA Netw Open. 2024;7(5):e2412767. doi:10.1001/jamanetworkopen.2024.12767

Key Points

Question: Can artificial intelligence chatbots accurately simplify pathology reports so that patients can easily understand them?

Findings: In this cross-sectional study of 1134 pathology reports, 2 chatbots significantly decreased the reading grade level of pathology reports while interpreting most reports correctly. However, some reports contained significant errors or hallucinations.

Meaning: These findings suggest that chatbots have the potential to explain pathology reports to patients and extrapolate pertinent details; however, they are not flawless and should not be used without a review by a health care professional.

Abstract

Importance: Anatomic pathology reports are an essential part of health care, containing vital diagnostic and prognostic information. Currently, most patients have access to their test results online. However, the reports are complex and are generally incomprehensible to laypeople. Artificial intelligence chatbots could potentially simplify pathology reports.

Objective: To evaluate the ability of large language model chatbots to accurately explain pathology reports to patients.

Design, Setting, and Participants: This cross-sectional study used 1134 pathology reports from January 1, 2018, to May 31, 2023, from a multispecialty hospital in Brooklyn, New York. A new chat was started for each report, and both chatbots (Bard [Google Inc], hereinafter chatbot 1; GPT-4 [OpenAI], hereinafter chatbot 2) were asked in sequential prompts to explain the reports in simple terms and identify key information. Chatbot responses were generated between June 1 and August 31, 2023. The mean readability scores of the original and simplified reports were compared. Two reviewers independently screened and flagged reports with potential errors. Three pathologists reviewed the flagged reports and categorized them as medically correct, partially medically correct, or medically incorrect; they also recorded any instances of hallucinations.

Main Outcomes and Measures: Outcomes included improved mean readability scores and a medically accurate interpretation.
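For readers unfamiliar with the readability measures, the sketch below shows the standard Flesch Reading Ease and Flesch-Kincaid grade level formulas in Python. The abstract does not say which software computed the scores, so this is only an illustration: the function names, the vowel-group syllable counter, and the sentence splitter are assumptions and will not reproduce the study's numbers exactly.

```python
import re

def flesch_reading_ease(words, sentences, syllables):
    # Standard Flesch Reading Ease formula; higher means easier to read.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    # Standard Flesch-Kincaid formula; approximates a US school grade level.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def count_syllables(word):
    # Naive heuristic (an assumption, not the study's method):
    # count runs of vowels, with a minimum of one syllable per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_counts(text):
    # Rough tokenization: words are runs of letters,
    # sentences are non-empty chunks between ., !, or ?.
    words = re.findall(r"[A-Za-z]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables

if __name__ == "__main__":
    sample = "The biopsy shows no cancer. The tissue looks normal."
    w, s, syl = text_counts(sample)
    print(f"Reading ease: {flesch_reading_ease(w, s, syl):.2f}")
    print(f"Grade level:  {flesch_kincaid_grade(w, s, syl):.2f}")
```

Both formulas depend only on average sentence length and average syllables per word, which is why shorter sentences and plainer vocabulary lower the grade level and raise the ease score.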

Results: For the 1134 reports included, the Flesch-Kincaid grade level decreased from a mean of 13.19 (95% CI, 12.98-13.41) to 8.17 (95% CI, 8.08-8.25; t = 45.29; P < .001) by chatbot 1 and 7.45 (95% CI, 7.35-7.54; t = 49.69; P < .001) by chatbot 2. The Flesch Reading Ease score was increased from a mean of 10.32 (95% CI, 8.69-11.96) to 61.32 (95% CI, 60.80-61.84; t = −63.19; P < .001) by chatbot 1 and 70.80 (95% CI, 70.32-71.28; t = −74.61; P < .001) by chatbot 2. Chatbot 1 interpreted 993 reports (87.57%) correctly, 102 (8.99%) partially correctly, and 39 (3.44%) incorrectly; chatbot 2 interpreted 1105 reports (97.44%) correctly, 24 (2.12%) partially correctly, and 5 (0.44%) incorrectly. Chatbot 1 had 32 instances of hallucinations (2.82%), while chatbot 2 had 3 (0.26%).
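The reported percentages follow directly from the counts and the 1134-report denominator. As a quick check, the short Python sketch below recomputes them; the dictionary layout and the names `counts` and `percent` are illustrative only, while the counts themselves are taken from the Results.

```python
TOTAL_REPORTS = 1134

# Counts as reported in the Results for each chatbot.
counts = {
    "chatbot 1": {"correct": 993, "partially correct": 102,
                  "incorrect": 39, "hallucinations": 32},
    "chatbot 2": {"correct": 1105, "partially correct": 24,
                  "incorrect": 5, "hallucinations": 3},
}

def percent(count, total=TOTAL_REPORTS):
    # Percentage of all reports, rounded to two decimals as in the abstract.
    return round(100 * count / total, 2)

if __name__ == "__main__":
    for bot, tallies in counts.items():
        # The three accuracy categories should account for every report.
        graded = (tallies["correct"] + tallies["partially correct"]
                  + tallies["incorrect"])
        print(f"{bot}: {graded} reports graded")
        for label, n in tallies.items():
            print(f"  {label}: {n} ({percent(n)}%)")
```

For each chatbot the three accuracy categories sum to 1134, and hallucinations are tallied separately from the accuracy categories.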

Conclusions and Relevance: The findings of this cross-sectional study suggest that artificial intelligence chatbots were able to simplify pathology reports. However, some inaccuracies and hallucinations occurred. Simplified reports should be reviewed by clinicians before distribution to patients.

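Posted by dubin98 two years ago in the News Bulletin and Progress Exchange categories.
When reposting, please credit: [Published in JAMA Netw Open]: Use of Artificial Intelligence Chatbots in Interpretation of Pathology Reports | Critical Care Medicine Committee of the Chinese Association of Pathophysiology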
