Original Investigation
AI in Surgery
Artificial Intelligence–Augmented Human Instruction and Surgical Simulation Performance: A Randomized Clinical Trial
Bianca Giglio, Abdulmajeed Albeloushi, Ahmad Kh. Alhaj, et al
JAMA Surg Published Online: August 6, 2025
doi: 10.1001/jamasurg.2025.2564
Key Points
Question Does artificial intelligence–augmented personalized expert instruction improve surgical performance, skill transfer, and affective-cognitive responses compared with intelligent tutoring alone?
Findings In this randomized clinical trial of 88 medical students, trainees achieved significantly higher performance scores when tutored by a human educator providing personalized feedback based on artificial intelligence error data than by an intelligent tutor alone.
Meaning Providing human educators with artificial intelligence performance data to tailor feedback improves learning outcomes in surgical simulation training.
Abstract
Importance How the Intelligent Continuous Expertise Monitoring System, an artificial intelligence tutoring system, might be best optimized for surgical training is unknown.
Objective To determine the effects of artificial intelligence–augmented personalized expert instruction vs intelligent tutoring alone on surgical performance, skill transfer, and affective-cognitive responses.
Design, Setting, and Participants This single-blinded randomized clinical trial was conducted among a volunteer sample of medical students in preparatory, first, or second year without prior use of a virtual reality surgical simulator (NeuroVR) at the McGill Neurosurgical Simulation and Artificial Intelligence Learning Centre in Montreal, Quebec, Canada. Cross-sectional data were collected from March to September 2024, and per-protocol data analysis was conducted in March 2025.
Intervention During simulated surgical procedures, trainees received 1 of 3 feedback methods. Group 1 received only intelligent tutor instruction (control). The 2 intervention arms included group 2, which received expert feedback in identical words to the intelligent tutor, and group 3, which received artificial intelligence data–informed personalized expert feedback.
Main Outcomes and Measures The coprimary outcomes included change in overall surgical performance across practice resections and skill transfer to a complex realistic scenario, measured by artificial intelligence–calculated composite expertise score (range, −1.00 [novice] to 1.00 [expert]). Secondary outcomes included emotional and cognitive demands, measured via questionnaires.
Results In this randomized clinical trial, the final analysis included 87 medical students (46 [53%] women; mean [SD] age, 22.7 [4.0] years), with 30, 29, and 28 participants in groups 1, 2, and 3, respectively. Group 3 achieved significantly higher scores than group 1 across several trials, including trial 5 (mean difference, 0.26; 95% CI, 0.09-0.43; P = .01) and the realistic task (mean difference, 0.20; 95% CI, 0.06-0.34; P = .02). Group 3 also achieved significantly better scores than the other 2 groups in certain metrics, such as bleeding and injury risk. Emotions and cognitive load demonstrated significant differences.
Conclusions and Relevance In this randomized clinical trial, personalized expert instruction resulted in enhanced surgical performance and skill transfer compared with intelligent tutor instruction, highlighting the importance of human input and participation in artificial intelligence–based surgical training.
Trial Registration ClinicalTrials.gov Identifier: NCT06273579