Viewpoint
March 27, 2023
AI-Generated Medical Advice—GPT and Beyond
Claudia E. Haupt, Mason Marks
JAMA. Published online March 27, 2023. doi:10.1001/jama.2023.5321
For years, experts have speculated about the future role of artificial intelligence (AI) in health care. Some AI tools can outperform physicians on specific tasks in radiology, dermatology, and other fields, raising concerns that AI might render certain specialists obsolete. Some feared AI might expose patients and clinicians to novel risks.1 Others wondered whether physicians could use AI in good conscience if they do not understand how it works, or whether clinicians who fail to adopt it might be accused of providing substandard care.2
These concerns have faded somewhat as high-profile AI platforms like IBM Watson failed to deliver on their promise.3 Moreover, lacking anything resembling general intelligence, AI bested humans only at narrowly defined tasks. However, AI-related fears reemerged with the rise of large language models (LLMs), exemplified by OpenAI’s GPT (now in its fourth version). This technology has left clinicians wondering how they might use LLMs and what risks the technology poses to patients and clinicians.
This Viewpoint surveys the medical applications of GPT and related technologies and considers whether new forms of regulation are necessary to minimize safety and legal risks to patients and clinicians. These risks depend largely on whether the software is used to assist health care practitioners or to replace them, and the degree to which clinicians maintain control.
What Is GPT?
A generative pretrained transformer (GPT) is an AI tool that produces text resembling human writing, allowing users to interact with AI almost as if they were communicating with another person. The sudden rise in popularity of LLMs was driven largely by GPT-3, OpenAI’s third iteration, which was described as the fastest-growing app of all time and the most innovative LLM.
People use GPT by entering prompts—text instructions in the form of questions or commands. Creating effective AI prompts is an art as much as a science, and the possibilities seem endless. One can use GPT like a search engine. However, GPT’s predictive algorithms can also answer questions that have never been posed.
If asked to write a haiku about the Krebs cycle, the software will provide one, even if nobody has written one before. GPT can explain complex topics like quantum mechanics in simple terms or provide a differential diagnosis for right upper quadrant pain. In this respect, its utility as a user-friendly scientific or medical encyclopedia is obvious. Researchers have even entered questions from the US Medical Licensing Examination into GPT-3 and claimed the software “approaches or exceeds the passing threshold.”4 One of GPT’s most impressive features is how it handles repetitive writing tasks. In seconds, it can reduce large text files to abstracts or bullet-point summaries. It can author first drafts of letters, presentations, and other documents.
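To make the notion of a prompt concrete, the sketch below shows how a researcher might submit one programmatically rather than through a chat interface. It is a minimal, illustrative example only: it assumes OpenAI’s Python client as it existed in early 2023 (the pre-1.0 ChatCompletion interface), and the model name, prompt wording, and temperature setting are hypothetical choices, not recommendations.

# Minimal sketch: submitting a prompt to a GPT model through OpenAI's Python
# client (pre-1.0 interface, circa early 2023). Model name, prompt wording,
# and temperature are illustrative assumptions.
import os
import openai

# Read the API key from the environment; never hard-code credentials.
openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = (
    "Provide a differential diagnosis for right upper quadrant abdominal "
    "pain in an adult, as a brief bulleted list."
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # assumed model; substitute as appropriate
    messages=[{"role": "user", "content": prompt}],
    temperature=0.2,  # lower temperature yields more conservative output
)

# The generated text is a draft for expert review, not a clinical answer.
print(response["choices"][0]["message"]["content"])

As the privacy discussion below makes clear, no patient-identifying information should be entered into such prompts with current versions of the software.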
However, in its current form, GPT is prone to errors and omissions. It can fail at simple tasks, such as basic arithmetic, or insidiously commit errors that go unnoticed without scrutiny by subject matter experts. Some users observe that when asked to provide references for its claims, GPT often makes them up. Educators fear students might be misinformed when relying on the software. Due to the risk of fabrication, academic publishers are requiring authors to disclose their use of the technology. Finally, algorithms are generally known to reproduce the biases of their training data, creating the potential for harmful discrimination.5
Potential Uses in Clinical Practice
Within health care, GPT could play roles in research, education, and clinical care. In research settings, it can help scientists formulate questions, develop study protocols, and summarize data. In medical education, GPT can serve as an interactive encyclopedia. It could simulate patient interactions to help students hone history-taking skills. GPT can even produce first drafts of progress notes, patient care plans, and other documents students must prepare for class or on the wards.
For clinicians, GPT can potentially ease burnout by taking on repetitive tasks. It could provide clinical decision support and be incorporated into electronic medical record platforms like Epic. GPT might augment or replace frequently used resources like UpToDate. In theory, physicians could enter patient information into the software and ask for a differential diagnosis or preliminary treatment plan. However, current versions of GPT are not HIPAA compliant and could jeopardize patient privacy. Until professional-grade versions with adequate safeguards are available, clinicians should avoid inputting protected health information.
Patients might benefit from using GPT as a medical resource. However, unless its advice is filtered through health care practitioners, false or misleading information could endanger their safety.
Legal Assessment
With all these potential uses for GPT and other LLMs, how should clinicians proceed? Does GPT pose novel legal risks or require new modes of regulation? The key to understanding ethical and legal questions surrounding new technologies is rarely about the technologies themselves. What matters most is how they affect social relationships between users.6 One must first identify these relationships, the values that define them, and the relevant legal frameworks.
We distinguish 3 GPT use cases: AI within the patient-physician relationship that augments rather than replaces clinician judgment, patient-facing AI in health care delivery that substitutes for clinician judgment, and direct-to-consumer health advice-giving. Clinicians have distinct concerns in each context.
Consider the heavily regulated patient-physician relationship, defined by values like competence, trust, and patient autonomy. The legal framework around this relationship is designed to ensure that medical advice-giving aligns with these principles. Specifically, rules regarding licensing and discipline, malpractice liability, informed consent, and fiduciary duties reflect these values.
If GPT is used to augment rather than replace professional judgment, its introduction to clinical practice is unlikely to affect the patient-physician relationship. Adopting LLMs in this way resembles physicians’ earlier adoption of smartphones, electronic medical records, or even old-fashioned desk references. Accordingly, GPT may be regulated as routine professional advice-giving.
From a liability perspective, it matters little whether clinicians understand how AI works or why it makes particular recommendations. What matters is whether they provide care that meets or exceeds accepted standards. When filtered through professional judgment, inaccurate advice generated by AI is no different from erroneous information disseminated by professionals that results in harm. In that case, existing legal frameworks can assign liability to professionals for their advice, regardless of its source, which means clinicians are responsible for the outcomes. Accordingly, they should not trust GPT any more than other medical tools that have not been thoroughly validated. Clinicians can use LLMs to offload repetitive tasks or generate new ideas, but they should scrutinize and verify the outputs to protect patients and themselves.
More worrisome and legally uncertain are patient-facing uses for GPT where AI advice becomes removed from the professional relationship. Examples include using LLMs to provide basic mental health care to patients or to replace clinic staff who usually perform triage. Here, the software may substitute for professional judgment, potentially undermining the competence and trust that define the patient-physician relationship. Due to liability stemming from the lack of professional oversight, and concerns regarding accuracy and reliability, GPT should not be used in these roles for the foreseeable future. Nevertheless, the risks have not stopped some companies from using LLMs on the front lines and fringes of health care.
Consumer-facing uses of GPT are the least regulated. In 2023, a company called Koko, which provides emotional support chat services for people in distress, substituted GPT-3 for its usual human volunteer counselors.7 Without asking permission, Koko provided AI-generated messages to about 4000 consumers seeking psychological support. When users learned of this unauthorized experiment, many felt betrayed. If Koko’s developers were licensed health care practitioners, they would have breached the duties of care and trust they owe to patients. However, this kind of consumer-facing wellness product lies in a legal gray area, and the responsibilities owed to users are unclear.
When consumers ask AI for emotional support or medical advice, they act outside the patient-physician relationship, and few guardrails exist. The same is true when consumers access other sources of health information, such as Google or Twitter. In those cases, the law imposes none of the safeguards that protect the patient-physician relationship. These consumer-facing products are not covered entities under HIPAA, and its privacy rules do not apply. Moreover, there is no malpractice liability for bad advice that causes harm. In fact, courts have held that the First Amendment’s free speech protections shield those who provide erroneous medical advice outside professional relationships.8
The risks may be unusually high when LLMs are involved because people tend to anthropomorphize them, and, as with Koko, they might not know they are communicating with software. GPT’s human-like conversational style may gain consumers’ trust, making them susceptible to manipulation, experimentation, and commercial exploitation.
Although most health care laws do not apply in the consumer context, the Federal Trade Commission (FTC) could frame AI manipulation and misleading AI-generated medical advice as unfair or deceptive business practices that violate the FTC Act. Furthermore, the US Food and Drug Administration could hold software developers responsible if GPT makes false medical claims. Much of this is uncharted legal territory.
Online intermediaries like Google and Twitter, which primarily disseminate user-generated content without alteration, are protected against liability by Section 230 of the Communications Decency Act. However, GPT, which synthesizes information to produce its own content, is unlikely to be immune, according to the lawmakers who drafted the act and to comments by Supreme Court Justice Neil Gorsuch during oral arguments in Gonzalez v Google.9 That means OpenAI could be liable for medical misinformation.
Regardless of regulation, clinicians should educate patients to be cautious when using LLMs like GPT outside the patient-physician relationship. They should explain that this kind of software is largely unregulated and potentially misleading, and that, unlike health care practitioners, it owes patients no duties of care, trust, or confidentiality.
With respect to AI-generated medical advice, as with other innovations, we suggest focusing on relevant social relationships and how the technology affects them. If clinicians use LLMs to aid decision-making, the models function like other medical resources or tools. However, using AI to replace human judgment poses safety risks to patients and may expose clinicians to legal liability. Until its accuracy and reliability are proven, GPT should not replace clinician judgment. Although clinicians are not responsible for harms caused by consumer-facing LLMs, they should educate patients about the risks. They might also advocate for FTC regulation that protects patients from false or misleading AI-generated medical advice.