[Published in JAMA Netw Open]: Accuracy of Artificial Intelligence vs Professionally Translated Discharge Instructions
October 25, 2025

Original Investigation 

Equity, Diversity, and Inclusion

Accuracy of Artificial Intelligence vs Professionally Translated Discharge Instructions

Melissa Martos, Blanca Fields, Samuel G. Finlayson, et al

JAMA Netw Open. 2025;8(9):e2532312. doi:10.1001/jamanetworkopen.2025.32312

Key Points

Question  How does the accuracy of artificial intelligence (AI)–based translation compare with professional translation of discharge instructions under routine practice conditions?

Findings  In this comparative effectiveness analysis of AI vs professional translation of 148 sections of 34 issued discharge instructions, AI translations were noninferior in some domains for Spanish instructions but consistently inferior in Chinese, Vietnamese, and Somali.

Meaning  These findings suggest that AI translation may have similar performance to professional translations for Spanish discharge instructions but requires further development and validation prior to implementation in other languages.

Abstract

Importance  Patients using languages other than English are a group at risk of poor health outcomes and encounter barriers to access of translated written materials. Although artificial intelligence (AI) may offer an opportunity to improve access, few studies have evaluated the accuracy and safety of AI translation for clinical care under routine practice conditions.

Objective  To investigate the accuracy of AI translation compared with professional human translation of patient-specific issued pediatric inpatient discharge instructions.

Design, Setting, and Participants  This comparative effectiveness analysis compared translations by a neural machine translation model vs professional translators using patient-specific pediatric inpatient discharge instructions received by families between May 18, 2023, and May 18, 2024, at a single center academic pediatric hospital. Instructions were translated to Simplified Chinese, Somali, Spanish, and Vietnamese by professional translators and the Azure AI system and then broken into scoring sections. Two professional translators per language evaluated translations (blinded to source) on an established 5-point scale for fluency, adequacy, meaning, and error severity, with 1 indicating worst performance and 5 indicating best performance.

Exposure  AI vs professional translation.

Main Outcome and Measure  Quality of discharge instruction translation, including fluency, adequacy, meaning, and severity of errors.

Results  A total of 148 sections from 34 discharge instructions were analyzed. When considering all 4 languages together, average fluency, adequacy, and meaning were lower among AI compared with professional human translations. Among all tested languages, mean (SD) fluency for AI translations was 2.98 (1.12) compared with 3.90 (0.96) for professional translations (difference, 0.92; 95% CI, 0.83-1.01; P < .001), adequacy was 3.81 (1.14) compared with 4.56 (0.70) (difference, 0.74; 95% CI, 0.65-0.83; P < .001), meaning was 3.38 (1.15) compared with 4.28 (0.84) (difference, 0.90; 95% CI, 0.80-0.99; P < .001), and error severity was 3.53 (1.28) compared with 4.48 (0.88) (difference, 0.95; 95% CI, 0.85-1.06; P < .001). Compared with professional translations, the Spanish AI translations were noninferior in adequacy (difference, 0.08; 95% CI, −0.02 to 0.19) and error severity (difference, 0.03; 95% CI, −0.09 to 0.14) but inferior in fluency (difference, 0.38; 95% CI, 0.23-0.53) and just crossed the inferiority threshold in meaning (difference, 0.08; 95% CI, −0.04 to 0.20). The Chinese, Vietnamese, and Somali AI translations were inferior to the professional translations across all metrics, with the greatest differences for Somali.
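The Spanish noninferiority results above can be read as a simple decision rule: a metric counts as noninferior when the upper bound of the 95% CI for the difference (professional minus AI) stays below a prespecified margin. The following is a minimal sketch of that rule, not the authors' analysis code. The abstract does not state the margin, so the value 0.20 used here is an assumption, chosen only because it is consistent with the rounded intervals reported; the names `ASSUMED_MARGIN`, `Difference`, `is_noninferior`, and `classify` are likewise hypothetical.

```python
from dataclasses import dataclass

# ASSUMPTION: the abstract does not report the noninferiority margin.
# 0.20 points on the 5-point scale is an illustrative value that happens to
# be consistent with the rounded confidence intervals in the Results.
ASSUMED_MARGIN = 0.20


@dataclass(frozen=True)
class Difference:
    """Mean score difference (professional minus AI) with its 95% CI."""
    metric: str
    estimate: float
    ci_low: float
    ci_high: float


def is_noninferior(d: Difference, margin: float = ASSUMED_MARGIN) -> bool:
    # AI is treated as noninferior only if the whole CI lies below the margin,
    # i.e. the upper bound is strictly less than the margin.
    return d.ci_high < margin


# Spanish results exactly as reported in the abstract.
SPANISH = [
    Difference("fluency", 0.38, 0.23, 0.53),
    Difference("adequacy", 0.08, -0.02, 0.19),
    Difference("meaning", 0.08, -0.04, 0.20),
    Difference("error severity", 0.03, -0.09, 0.14),
]


def classify(differences, margin: float = ASSUMED_MARGIN) -> dict:
    """Map each metric name to True (noninferior) or False."""
    return {d.metric: is_noninferior(d, margin) for d in differences}


if __name__ == "__main__":
    for metric, ok in classify(SPANISH).items():
        label = "noninferior" if ok else "not shown to be noninferior"
        print(f"{metric}: {label}")
```

Under this assumed margin, the sketch classifies adequacy and error severity as noninferior and fluency and meaning as not, which matches the abstract's description of meaning as having just crossed the threshold. A different margin, or unrounded intervals, could change the borderline cases.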

Conclusions and Relevance  In this comparative effectiveness analysis of AI- vs professionally translated issued discharge instructions, AI-translated instructions performed similarly for Spanish but worse for other languages tested. Validation and clinical implementation of AI-based translation will require special attention to languages of lesser diffusion to prevent creating new inequities.
