{"id":25341,"date":"2024-02-20T04:56:00","date_gmt":"2024-02-19T20:56:00","guid":{"rendered":"http:\/\/csccm.org.cn\/?p=25341"},"modified":"2024-02-20T05:42:27","modified_gmt":"2024-02-19T21:42:27","slug":"jama-surg%e7%bb%9f%e8%ae%a1%e4%b8%8e%e6%96%b9%e6%b3%95%e5%ad%a6%e6%8c%87%e5%af%bc%ef%bc%9a%e5%a4%96%e7%a7%91%e6%95%99%e5%ad%a6%e7%a0%94%e7%a9%b6%e5%b8%b8%e8%a7%81%e9%94%99%e8%af%af%e7%9a%84%e5%ae%9e","status":"publish","type":"post","link":"https:\/\/csccm.org.cn\/?p=25341","title":{"rendered":"[JAMA Surg Guide to Statistics and Methods]: Practical Guide to Common Flaws With Surgical Education Research"},"content":{"rendered":"\n<p>Guide to Statistics and Methods<\/p>\n\n\n\n<p>Surgical Education Research<\/p>\n\n\n\n<p>January&nbsp;3,&nbsp;2024<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Practical Guide to Common Flaws With Surgical Education Research<\/h1>\n\n\n\n<h3 class=\"wp-block-heading\">Dimitrios&nbsp;Stefanidis,&nbsp;Laura&nbsp;Torbeck,&nbsp;Amy H.&nbsp;Kaji<\/h3>\n\n\n\n<h3 class=\"wp-block-heading\"><em>JAMA Surg.&nbsp;<\/em>Published online January 3, 2024. doi:10.1001\/jamasurg.2023.6675<\/h3>\n\n\n\n<p>Introduction<\/p>\n\n\n\n<p>Over the past 2 decades, surgical education literature has seen tremendous growth driven by changes in graduate medical education, such as work hour restrictions, a focus on competency-based education, the incorporation of simulation, and proper assessment. 
Some authors have criticized the quality of education research and indicated a need for improvement.<sup><a href=\"https:\/\/jamanetwork.com\/journals\/jamasurgery\/fullarticle\/2813498#sgm230002r1\">1<\/a><\/sup>&nbsp;Quality issues have been attributed to decreased funding for education research, limiting the ability to conduct rigorous, multicenter trials.<sup><a href=\"https:\/\/jamanetwork.com\/journals\/jamasurgery\/fullarticle\/2813498#sgm230002r2\">2<\/a><\/sup>&nbsp;The objective of this article is to describe common methodological flaws in surgical education research to help prospective authors avoid errors and to help reviewers better recognize them (<a href=\"https:\/\/jamanetwork.com\/journals\/jamasurgery\/fullarticle\/2813498#sgm230002b1\">Box<\/a>).<\/p>\n\n\n\n<p>Box.&nbsp;<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">Summary<\/h4>\n\n\n\n<ul>\n<li>Avoid type II errors by performing a power analysis and sample size calculation prior to study implementation.<\/li>\n\n\n\n<li>When comparing 2 groups, ensure that they are well matched at baseline with respect to the outcomes of interest to maximize the ability to identify intervention differences.<\/li>\n\n\n\n<li>Use contemporary validity frameworks to collect validity evidence on an assessment tool or simulator. Avoid referring to a tool or simulator as being validated.<\/li>\n\n\n\n<li>When studying the effect of an educational intervention, avoid pretest\u2013immediate posttest designs; rather, use retention tests and\/or control groups.<\/li>\n<\/ul>\n\n\n\n<p>Using the Methodology<\/p>\n\n\n\n<p>This article addresses design and statistical flaws that are common across methodologies. 
We recommend adhering to reporting guidelines as outlined by the Enhancing the Quality and Transparency of Health Research (EQUATOR) Network (<a href=\"https:\/\/www.equator-network.org\/\">https:\/\/www.equator-network.org\/<\/a>).<\/p>\n\n\n\n<p>Statistical Considerations<\/p>\n\n\n\n<p>Common Flaw 1: Type II Error<\/p>\n\n\n\n<p>What Is the Issue?<\/p>\n\n\n\n<p>Not unique to surgical education research, type II errors frequently plague papers in the field.<sup><a href=\"https:\/\/jamanetwork.com\/journals\/jamasurgery\/fullarticle\/2813498#sgm230002r3\">3<\/a><\/sup>&nbsp;A type II error occurs when one fails to reject a null hypothesis that is false, primarily due to insufficient power of a study, which is determined by sample size, effect size, significance level, and variance. The sample size is low in many education research studies due to the target population of learners typically being limited in a department of surgery. The effect size reflects the magnitude of the true underlying difference between groups that is clinically or practically important; the larger the effect size is, the more relevant the results are. It is critical to determine the clinically or educationally significant effect size (not just the statistically significant difference) prior to performing any sample size calculation or power analysis. The desired clinically or educationally important effect estimate (risk difference, risk ratio, odds ratio, mean or median difference) and the associated confidence interval determine the sample size. 
The statistical significance level (generally set at \u03b1\u2009=\u2009.05), the power of the study (generally set at 80%, corresponding to \u03b2\u2009=\u2009.20), and variance (usually not known at the study\u2019s beginning) are less under investigator control.<\/p>\n\n\n\n<p>How to Recognize It<\/p>\n\n\n\n<p>Determine if the authors identified a clinically significant effect estimate and calculated a sample size based on the desired power to detect this effect. If the authors have not described an effect size or determined a sufficient sample size, then failure to detect a statistically and clinically significant effect may not indicate a lack of a difference or association but rather a type II error due to insufficient power. Conclusions may not be valid if the power analysis and sample size calculation were performed post hoc.<\/p>\n\n\n\n<p>How to Avoid It<\/p>\n\n\n\n<p>Investigators must first identify what constitutes a minimum clinically significant effect and calculate a sample size based on the desired power to detect this effect. Rather than the&nbsp;<em>P<\/em>&nbsp;value, investigators should report the effect estimate and the associated confidence interval for the reader. Study participants should be representative of the population of interest and without selection bias.<\/p>\n\n\n\n<p>Common Flaw 2: Poor Participant Matching at Baseline for Studies Comparing 2 Cohorts<\/p>\n\n\n\n<p>What Is the Issue?<\/p>\n\n\n\n<p>Flaw 1 described earlier is amplified when study groups are not well balanced at baseline, especially as it relates to the primary outcome of the study or the factors that may directly affect the primary outcome. 
Systematic differences between comparison cohorts indicate selection bias.<\/p>\n\n\n\n<p>How to Recognize It<\/p>\n\n\n\n<p>Whether the study is observational or a randomized clinical trial, one should look at the baseline differences between the compared groups and determine whether the groups are similar with respect to the outcome of interest. Consider a study investigating the impact of laparoscopic simulator training on learner performance in 2 groups using 2 different approaches. At baseline, the mean (SD) scores were 40 (15) for group A and 15 (18) for group B (<em>P<\/em>\u2009=\u2009.15). After training completion, scores were 90 (9) and 100 (7), respectively (<em>P<\/em>\u2009=\u2009.09). The author\u2019s conclusion that there was no difference in the effectiveness of the 2 approaches (based on the&nbsp;<em>P<\/em>&nbsp;value) is flawed as group A\u2019s performance improved by 125% while group B\u2019s performance improved by 567%, reflecting an important effect size that should have been delineated a priori. In this instance, a difference-in-differences approach should have been described.<\/p>\n\n\n\n<p>How to Avoid It<\/p>\n\n\n\n<p>Randomized clinical trials will avoid a type II error if a sufficient sample size is used.<sup><a href=\"https:\/\/jamanetwork.com\/journals\/jamasurgery\/fullarticle\/2813498#sgm230002r4\">4<\/a><\/sup>&nbsp;However, in education research, which often has smaller study cohorts, one consideration may be to use stratified methods, such as those based on similar baseline experience or performance. For observational studies, adjusting for baseline differences using regression or matching may mitigate this error.<\/p>\n\n\n\n<p>Common Flaw 3: Inappropriate Use of Validity<\/p>\n\n\n\n<p>What Is the Issue?<\/p>\n\n\n\n<p>The term&nbsp;<em>validity<\/em>&nbsp;is frequently used in relation to assessment tools and simulators. 
Unfortunately, many researchers in surgical education have not followed the transformative changes in validity concepts that have occurred over time, and they continue using the term inappropriately by stating that they have validated an assessment tool, a curriculum, a simulator, etc. Validity is not currently regarded as an inherent property of a test but instead refers to the specified uses of a test for a particular purpose. Importantly, the evaluation of validity is neither static nor a 1-time event but a continuing process.<\/p>\n\n\n\n<p>How to Recognize It<\/p>\n\n\n\n<p>One should look for language that implies that a tool, simulator, curriculum, etc has been validated. Evidence for validity can be collected to support the use of a tool, simulator, curriculum, etc for a particular purpose, but the tool itself cannot be validated. Also, look for obsolete terms such as&nbsp;<em>face validity<\/em>&nbsp;or studies that purport to assess the performance of experts and novices on a simulator to demonstrate its construct validity. The difference in surgical skill between these 2 groups is so large that such a comparison is meaningless and should be avoided.<\/p>\n\n\n\n<p>How to Avoid It<\/p>\n\n\n\n<p>Authors should be aware of Messick and Kane\u2019s frameworks<sup><a href=\"https:\/\/jamanetwork.com\/journals\/jamasurgery\/fullarticle\/2813498#sgm230002r5\">5<\/a><\/sup><sup>,<a href=\"https:\/\/jamanetwork.com\/journals\/jamasurgery\/fullarticle\/2813498#sgm230002r6\">6<\/a><\/sup>&nbsp;and use these to guide studies that investigate validity. When collecting construct validity evidence for a tool or simulator, include several groups of variable skill in your comparison rather than just experts and novices.<\/p>\n\n\n\n<p>Common Flaw 4: Pre-Post Studies<\/p>\n\n\n\n<p>What Is the Issue?<\/p>\n\n\n\n<p>While pre-post study designs are an acceptable and useful methodology, in the context of education research these are less meaningful. 
Immediate improvements inevitably seen after an educational intervention should be expected and are not noteworthy; rather, retention of knowledge, attitude, or skills should be determined.<\/p>\n\n\n\n<p>How to Recognize It<\/p>\n\n\n\n<p>Be aware of the limitations of single-group pre-post study design when studying the impact of an educational intervention. Determine whether a meaningful control group has been included or whether retention data are reported to document the lasting outcomes of the intervention.<\/p>\n\n\n\n<p>How to Avoid It<\/p>\n\n\n\n<p>The benefit of an educational intervention should be judged not on immediate postintervention improvements but rather by including a retention assessment offered weeks or months after the intervention to detect knowledge, skill, or attitude decay. Another approach is to include a meaningful control group that allows comparisons with the intervention group.<\/p>\n\n\n\n<p>The aforementioned flaws are not the only ones that occur in the surgical education literature. We focused on these due to the frequency with which they occur and the fact that they are easily correctable with appropriate study design. 
We believe that preventing such flaws will improve the quality of surgical education research.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Guide to Statistics and Methods&nbsp; Surgical Educatio [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[13,14,32,23],"tags":[],"_links":{"self":[{"href":"https:\/\/csccm.org.cn\/index.php?rest_route=\/wp\/v2\/posts\/25341"}],"collection":[{"href":"https:\/\/csccm.org.cn\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/csccm.org.cn\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/csccm.org.cn\/index.php?rest_route=\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/csccm.org.cn\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=25341"}],"version-history":[{"count":2,"href":"https:\/\/csccm.org.cn\/index.php?rest_route=\/wp\/v2\/posts\/25341\/revisions"}],"predecessor-version":[{"id":25350,"href":"https:\/\/csccm.org.cn\/index.php?rest_route=\/wp\/v2\/posts\/25341\/revisions\/25350"}],"wp:attachment":[{"href":"https:\/\/csccm.org.cn\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=25341"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/csccm.org.cn\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=25341"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/csccm.org.cn\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=25341"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}