Guide to Statistics and Methods
Surgical Education Research
January 3, 2024
Practical Guide to Pragmatic Clinical Trials in Surgical Education Research
Karl Y. Bilimoria, Jason S. Haukoos, Gerard M. Doherty
JAMA Surg. Published online January 3, 2024. doi:10.1001/jamasurg.2023.6690
Introduction
In 1967, Schwartz and Lellouch1 distinguished “explanatory” trials from “pragmatic” trials by the nature of study intent: explanatory trials aim to inform scientific understanding, whereas pragmatic trials aim to inform decision-making. Today, explanatory trials are synonymous with efficacy trials (ie, trials that assess whether the intervention produces an expected result under ideal circumstances); pragmatic trials are synonymous with effectiveness trials (ie, trials that measure the degree of beneficial effect in real-world settings). This article outlines key features of pragmatic trials within the context of surgical education research using illustrations from the Flexibility in Duty-Hour Requirements for Surgical Trainees (FIRST) trial. The FIRST trial was a cluster-randomized, pragmatic noninferiority trial that compared a flexible resident duty-hour policy with standard Accreditation Council for Graduate Medical Education (ACGME) duty-hour policies (usual care) with respect to patient outcomes and self-reported resident well-being.2,3
Using the Methodology
When and Why This Method Might Be Used
A pragmatic approach may be appropriate when the objective is to assess the effectiveness of an educational intervention under real-world conditions. A pragmatic approach is not appropriate when the objective is to establish efficacy or to understand causality.
The aim of the FIRST trial was to inform the ongoing policy debate concerning the relative merits of flexible vs standard ACGME duty-hour regimes for surgical residents—not to test causal hypotheses.
How This Methodology Should Be Used
The distinction between pragmatic and explanatory trials is one of degree.1,4,5 The Pragmatic Explanatory Continuum Indicator Summary (PRECIS-2) tool identifies 9 dimensions along which pragmatic trials may be assessed5:
- Eligibility criteria. Pragmatic study samples are typically intended to resemble real-world target populations of inferential interest and may involve variable eligibility criteria. Compared with efficacy trials, pragmatic trials tend to have larger sample sizes and less onerous data collection for each participant. In the FIRST trial, all US general surgery residency programs in good ACGME standing and affiliated with 1 or more hospitals participating in the National Surgical Quality Improvement Program (NSQIP) were eligible.
- Recruitment. Multiple, broad channels of recruitment may enhance pragmatism by reducing barriers to participation and extending recruitment reach. This may help reduce self-selection bias. With endorsement by key stakeholders, the FIRST trial investigators invited all eligible general surgery residency program directors and chairs to informational webinars.
- Setting. Pragmatic trial settings typically aim to resemble settings in which intervention implementation would occur in the real world. The FIRST trial was conducted entirely in naturalistic settings of policy interest: general surgery residency programs and affiliated hospitals.
- Organization. Study-specific resources beyond what may be expected with intervention implementation in real-world circumstances may render a trial less pragmatic. As an evaluation of a policy intervention, the FIRST trial required minimal additional resources.
- Flexibility of intervention delivery/implementation. Standardizing intervention implementation to conform to a uniform set of procedures will not reflect the heterogeneity of contexts in which an intervention may be implemented. FIRST trial programs randomized to flexible duty hours were able to implement modified institutional policies allowing flexibility (but not requiring extended duty hours).
- Flexibility in adherence to study conditions. Study oversight to ensure participant adherence to assigned trial conditions renders a trial less pragmatic because it neglects the heterogeneity of real-world adherence. Although FIRST trial participants were informed that consent to participate indicated assent to implement and adhere to assigned study arm conditions, no external enforcement was undertaken by investigators.
- Follow-up. Unless data collection relies on existing sources of data collected for purposes other than the study, frequent or obtrusive follow-up protocols render a trial less pragmatic. By relying on secondary data collected through NSQIP and the American Board of Surgery In-Training Examination, the FIRST trial minimized study-specific primary data collection.
- Primary outcome. The more meaningful an outcome is to real-world stakeholders, the more pragmatic a trial. The primary outcome in the FIRST trial was 30-day postoperative patient death or serious morbidity.
- Primary analysis. Intention-to-treat analyses that incorporate all data may be appropriate in pragmatic trials. Intention-to-treat was adopted as the primary analytic approach in the FIRST trial, with planned secondary analyses to address adherence.
Advantages of the Method
The chief advantage of pragmatic trials is enhanced external validity. Less stringent eligibility criteria and flexibility in implementation and adherence may lower barriers to trial participation.
Pitfalls or Limitations of the Method
Greater heterogeneity typically implies smaller effect sizes and may present challenges in designing a trial with adequate statistical power. Increasing power may not be as straightforward as increasing sample size because of the ceiling on the number of potential study units available for recruitment. Extending recruitment for sample accrual may be infeasible or prohibitively costly, or it may jeopardize other aspects of trial design. Alternative outcomes with larger expected effect sizes may be chosen provided they are still substantively meaningful to stakeholders. In cluster-randomized trials, power may depend more on the number of clusters than on the number of units within clusters (eg, in the FIRST trial, the number of residency programs [n = 119] largely drove the power calculation, not the number of patients [N = approximately 140 000]).
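The dependence of power on the number of clusters can be illustrated with the standard design effect for equal-sized clusters, 1 + (m - 1) x ICC, where m is the mean cluster size and ICC is the intracluster correlation coefficient. The sketch below is a rough illustration only: the program and patient counts are those quoted above, whereas the ICC values, the equal-cluster-size simplification, and the function names are assumptions made for this example and are not taken from the FIRST trial.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def effective_sample_size(n_units: int, n_clusters: int, icc: float) -> float:
    """Approximate number of independent observations in a clustered sample.

    Assumes all clusters have the same size (a simplification).
    """
    mean_cluster_size = n_units / n_clusters
    return n_units / design_effect(mean_cluster_size, icc)


# Counts quoted in the text; the ICC values are assumed for illustration only.
N_PROGRAMS = 119
N_PATIENTS = 140_000

for icc in (0.0, 0.01, 0.05):
    ess = effective_sample_size(N_PATIENTS, N_PROGRAMS, icc)
    print(f"ICC = {icc:.2f}: effective sample size ~ {ess:,.0f}")

# Compare two hypothetical ways of doubling the number of patients.
baseline = effective_sample_size(N_PATIENTS, N_PROGRAMS, 0.01)
more_patients = effective_sample_size(2 * N_PATIENTS, N_PROGRAMS, 0.01)
more_programs = effective_sample_size(2 * N_PATIENTS, 2 * N_PROGRAMS, 0.01)
print(f"baseline: {baseline:,.0f}")
print(f"twice the patients, same programs: {more_patients:,.0f}")
print(f"twice the programs, same patients per program: {more_programs:,.0f}")
```

Under these assumptions, doubling the patients within the same programs adds little effective sample size, whereas doubling the number of programs doubles it. This is why a ceiling on available clusters can limit power even when the number of individual observations is very large.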
Statistical Considerations
Analytic Approach
Analyses of pragmatic trials commonly follow an intention-to-treat approach: data from all study participants are analyzed without regard to adherence to assigned study conditions. This approach acknowledges real-world heterogeneity in implementation and adherence. In pragmatic trials lacking enforcement of adherence to assigned study conditions, the intention-to-treat estimate of the average treatment effect does not refer to the treatment but to the offer of treatment.
In the FIRST trial, the investigators evaluated the effect of offering programs the option of duty-hour flexibility on patient and resident outcomes. The intent was not to assess the effect of flexible or extended duty hours.
Stakeholders may be interested in understanding the treatment effect among participants who adhered to assigned study conditions. If data on adherence were collected, estimates of the average treatment effect on the treated can be obtained through as-treated or per-protocol analyses undertaken on the subsample of participants who adhered to assigned study conditions. These analyses, along with any sensitivity analyses, should be prespecified in the statistical analysis plan.
In the FIRST trial, programs adhered to a variety of duty-hour policy models. Some control arm programs were also contaminated by duty-hour violations. An as-treated analysis was therefore needed to compare programs that actually followed the duty-hour rules of their assigned study arm.
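The difference between the two estimates can be made concrete with a small sketch. The data below are invented toy numbers, not FIRST trial results, and the `Program` record, its field names, and the pooled risk-difference calculation are assumptions chosen for illustration. A real analysis would also account for clustering (for example, with a hierarchical model), which this sketch ignores.

```python
from typing import NamedTuple


class Program(NamedTuple):
    assigned: str   # arm assigned at randomization: "flexible" or "standard"
    adhered: bool   # whether the program followed its assigned policy
    events: int     # patients with the outcome (hypothetical)
    patients: int   # patients treated (hypothetical)


def event_rate(programs):
    """Pooled event rate across a group of programs."""
    total_patients = sum(p.patients for p in programs)
    return sum(p.events for p in programs) / total_patients


def risk_difference(programs, adherent_only=False):
    """Flexible minus standard event rate, grouped by assigned arm."""
    if adherent_only:
        # Per-protocol: keep only programs that adhered to their assignment.
        programs = [p for p in programs if p.adhered]
    flexible = [p for p in programs if p.assigned == "flexible"]
    standard = [p for p in programs if p.assigned == "standard"]
    return event_rate(flexible) - event_rate(standard)


# Invented toy data for illustration only.
toy = [
    Program("flexible", True, 80, 1000),
    Program("flexible", False, 100, 1000),
    Program("standard", True, 100, 1000),
    Program("standard", False, 100, 1000),
]

itt = risk_difference(toy)                               # all programs, as randomized
per_protocol = risk_difference(toy, adherent_only=True)  # adherent subsample only

print(f"intention-to-treat risk difference: {itt:+.3f}")
print(f"per-protocol risk difference: {per_protocol:+.3f}")
```

In this toy example the intention-to-treat estimate is closer to zero than the per-protocol estimate because it averages over programs that did not adhere. The per-protocol subsample is no longer protected by randomization, which is one reason such analyses are treated as secondary.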
Cluster Randomization
Cluster-randomized trials randomize clusters of individual units to study arm conditions. Pragmatic trials should be randomized at the level of the intervention target. For example, organizations should be cluster randomized when interventions are at the organization level, whereas individuals should be randomized if intervention implementation varies from individual to individual.
The FIRST trial was cluster randomized because all residents in a program were exposed to the same duty hours. It would have been impractical, unethical, and unrealistic to implement duty-hour regulations that varied from one resident to another within a program (Box).
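Randomizing at the cluster level can be sketched as follows. This is a minimal illustration of simple, unstratified cluster randomization and is not a description of the allocation procedure used in the FIRST trial. The program and resident identifiers, the arm labels, and the `randomize_clusters` function are hypothetical.

```python
import random


def randomize_clusters(cluster_ids, arms=("flexible", "standard"), seed=0):
    """Assign whole clusters to arms in near-equal numbers."""
    rng = random.Random(seed)  # fixed seed so the allocation is reproducible
    shuffled = list(cluster_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled clusters to the arms in turn.
    return {cid: arms[i % len(arms)] for i, cid in enumerate(shuffled)}


# Hypothetical residents mapped to their (hypothetical) programs.
residents = {
    "resident_1": "program_A",
    "resident_2": "program_A",
    "resident_3": "program_B",
    "resident_4": "program_C",
    "resident_5": "program_C",
    "resident_6": "program_D",
}

# Randomize programs, not residents.
allocation = randomize_clusters(sorted(set(residents.values())))

# Each resident inherits the arm of his or her program.
resident_arm = {res: allocation[prog] for res, prog in residents.items()}

for res, arm in sorted(resident_arm.items()):
    print(res, residents[res], arm)
```

Because the unit of randomization is the program, every resident in a program is exposed to the same policy, which matches how a duty-hour policy would be implemented in practice.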
Box. Summary
- Pragmatic trials may be suitable when evaluating the effectiveness of an intervention under real-world conditions and/or when the intent is to inform an implementation decision.
- Randomize participants at the level of intervention implementation in the real world.
- Minimize study-specific primary data collection. Use existing sources of secondary data when possible.
- Intention-to-treat analysis remains the most appropriate approach for estimating treatment effects. When possible, consider additional as-treated analyses.
Where to Find More Information
In addition to PRECIS-2,6 the National Institutes of Health Collaboratory on Rethinking Clinical Trials7 is an excellent resource for best practices in designing and conducting pragmatic trials.