Part I of this series provided an overview of the importance of critically appraising research to understand if the results can be trusted, while Part II provided a framework for evaluating bias in clinical research.
A final component that needs to be assessed is whether the trial was sufficiently powered, meaning the sample size of the trial was large enough to detect a true difference between groups. Critical appraisal tools do not require appraisers to identify if the trial was powered correctly. However, power is a fundamental component of sound methodological design. Assuming the risk of bias of a trial is low, the results from a trial adequately powered to detect differences in the primary outcome(s) have a higher probability of being true.
Before a trial is conducted, researchers need to calculate the number of participants required to observe a minimally important clinical effect in the primary outcome. The sample size needs to be large enough to minimize the risk of false positive and false negative results. The minimally important clinical effect depends on the disease and outcome studied. If the trial is underpowered, meaning the number of participants included and analyzed fell short of the calculated sample size, the risk of false negatives is higher. If no statistically significant results were found but the trial was underpowered, it is inappropriate to conclude that there were no differences between groups, because the sample size was simply too small to detect a difference. Similarly, if statistically significant results were obtained but the trial was underpowered, the risk of false positives is high, meaning the significant effects may not be real and the results cannot be trusted. Given that power is such an important component, the expectation is that all researchers would calculate the required sample size before beginning a trial. Unfortunately, trials are often not powered at the start, or a large proportion of participants drop out before the study is complete, leaving the final analysis underpowered.
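To make this concrete, the standard textbook sample-size calculation for comparing two group means can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original article: it uses the normal-approximation formula with an assumed two-sided significance level (alpha), desired power, and a standardized effect size (Cohen's d) chosen by the researcher as the minimally important clinical effect.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate participants needed per group for a two-sided,
    two-sample comparison of means (normal approximation).

    effect_size: standardized difference (Cohen's d) worth detecting.
    alpha:       acceptable false-positive rate (type I error).
    power:       1 - beta, the chance of detecting a true effect.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z(power)           # quantile corresponding to desired power
    n = 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2
    return ceil(n)              # round up: partial participants don't exist

# A "medium" effect (d = 0.5) at the conventional alpha = 0.05 and
# 80% power requires roughly 63 participants per group.
print(sample_size_per_group(0.5))
```

Note how the required sample size grows as the effect worth detecting shrinks or as the desired power rises, which is why trials chasing small clinical effects need many participants and why dropouts so easily leave a trial underpowered.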
To determine if a trial was powered correctly, the calculated sample size for the trial should be compared to the number of participants who were actually enrolled and completed the study. If the number of participants who completed the trial and were included in the analysis was equal to or greater than the calculated sample size, then the trial was adequately powered to detect a difference in outcomes, and more faith can be invested in the truth of the results (assuming the risk of bias was also low).
The critical appraisal process may seem daunting. Knowledge consumers who are not able to perform critical appraisal themselves can instead read systematic reviews/meta-analyses (SR/MAs), which are research reports that assess the overall body of evidence around an intervention to determine whether it is safe and effective. Researchers conduct SR/MAs by summarizing the literature through the following steps: aggregating the totality of the literature on a subject; narratively and quantitatively pooling the results from each study to examine how the individual studies relate to one another; assessing the risk of bias of each study; and arriving at conclusions regarding the direction (beneficial or harmful) and significance (whether the results are statistically and clinically meaningful) of the intervention. When viewed from this “30,000 ft level,” researchers can make a data-driven assessment of how effective an intervention is likely to be.
Entire research fields are dedicated to conducting, assessing, and applying the results of clinical trials into practice. As demonstrated, there is no simple method to identify if a clinical trial’s results can be trusted. However, assessment tools can be used, and SR/MAs can be conducted and read, offering knowledge consumers access to an examination of the overall evidence base around a topic.
Science, health, and wellness are sold in a competitive market. Many health-promoting treatments and products either: 1) have no evidence behind them; 2) rely on minimal evidence, with claims originating from pre-clinical cell culture or animal studies that very rarely translate to humans; or 3) are based on N = 1 trials in which “biohackers” test new, unproven treatments on themselves on the advice of a guru, fueling an ill-founded movement. Even among the treatments and products that have been studied in clinical trials, any number of measurement, design, or systematic errors may be present, reducing the validity and reliability of the trial’s results.
Caution must be stressed before trusting any intervention. Generally, treatments that can be trusted originate from high-quality randomized controlled trials with a low risk of bias, where a well-conducted SR/MA has demonstrated the safety and effectiveness of the intervention across a range of related studies. Ultimately, the devil is in the details with any type of research, whether preclinical or clinical. When reading a clinical study, remember that a study’s conclusions are only as reliable as its methods. Never blindly trust the conclusions of a research article, no matter how much conviction the authors or other experts express about its outcomes. Trials need to be critically appraised to examine the validity of their claims, and the results should ideally be supported by additional studies that reproduce and replicate them.
The next time you read “A study shows…," do not accept the statement and study at face value. To find the truth - or at least the most likely truth - it is necessary to challenge everything.