Background: In medical practice, clinically unexpected measurements might be quite properly handled by the remeasurement, removal, or reclassification of patients. If these habits are not prevented during clinical research, how much of each is needed to sway an entire study?

Methods and Results: Believing there is a difference between groups, a well-intentioned clinician researcher addresses unexpected values. We tested how much removal, remeasurement, or reclassification of patients would be needed in most cases to turn an otherwise-neutral study positive. Remeasurement of 19 patients out of 200 per group was required to make most studies positive. Removal was more powerful: just 9 out of 200 was enough. Reclassification was most powerful, with 5 out of 200 enough. The larger the study, the smaller the proportion of patients that needed to be manipulated to make the study positive: the percentages needing to be remeasured, removed, or reclassified fell from 45%, 20%, and 10%, respectively, for a 20-patient-per-group study to 4%, 2%, and 1% for an 800-patient-per-group study. Dot plots, but not bar charts, make the perhaps-inadvertent manipulations visible. Detection is possible using statistical methods such as the Tadpole test.

Conclusions: Behaviours necessary for clinical practice are destructive to clinical research. Even small amounts of selective remeasurement, removal, or reclassification can produce false positive results. Size matters: larger studies are proportionately more vulnerable. If observational studies permit selective unblinded enrolment, malleable classification, or selective remeasurement, then their results are not credible. Clinical research is very vulnerable to “remeasurement, removal, and reclassification”, the 3 evil R's.
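The kind of selective handling quantified above is easy to reproduce in simulation. Below is a minimal Python sketch, not the authors' code: it assumes two groups of 200 drawn from the same distribution, a researcher who believes the treatment group should score higher, and a simple rule that removes the single most "unexpected" observation at a time until an unpaired t-test becomes significant in the believed direction. The counts it produces depend on these assumptions and will not exactly match the figures reported above.

```python
# Illustrative sketch only: selective *removal* turning a truly null study "positive".
# The removal rule and all parameters are assumptions, not the authors' method.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def removals_needed(n=200, max_removals=60):
    a = list(rng.normal(0, 1, n))   # control group
    b = list(rng.normal(0, 1, n))   # "treatment" group, truly identical
    for k in range(max_removals + 1):
        result = stats.ttest_ind(a, b)
        if result.pvalue < 0.05 and np.mean(b) > np.mean(a):
            return k                # study is now "positive" in the believed direction
        # Remove the most "unexpected" observation: the highest control value or the
        # lowest treatment value, whichever lies further from its own group mean.
        if max(a) - np.mean(a) > np.mean(b) - min(b):
            a.remove(max(a))
        else:
            b.remove(min(b))
    return None                     # never became "positive" within the budget

results = [removals_needed() for _ in range(200)]
positive = [r for r in results if r is not None]
print("median removals needed:", np.median(positive))
print("fraction of null studies made positive:", len(positive) / len(results))
```

Selective remeasurement or reclassification can be mimicked in the same loop by redrawing or relabelling the offending observation instead of deleting it.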
GFR, glomerular filtration rate; CCr, rate of creatinine clearance; ACEi, angiotensin-converting enzyme inhibitor; ARB, angiotensin receptor blocker; RASi, renin-angiotensin system inhibitor; DM, diabetes mellitus; PKD, polycystic kidney disease; HBP, high blood pressure (hypertension).
Background: Clinical trial results registries may contain relevant unpublished information. Our main aim was to investigate the potential impact of the inclusion of reports from industry results registries on systematic reviews (SRs).

Methods: We identified a sample of 150 eligible SRs in PubMed via backward selection. Eligible SRs investigated randomized controlled trials of drugs and included at least 2 bibliographic databases (original search date: 11/2009). We checked whether results registries of manufacturers and/or industry associations had also been searched. If not, we searched these registries for additional trials not considered in the SRs, as well as for additional data on trials already considered. We reanalysed the primary outcome and harm outcomes reported in the SRs and determined whether results had changed. A “change” was defined as either a new relevant result or a change in the statistical significance of an existing result. We performed a search update in 8/2013 and identified a sample of 20 eligible SRs to determine whether mandatory results registration from 9/2008 onwards in the public trial and results registry ClinicalTrials.gov had led to its inclusion as a standard information source in SRs, and whether the inclusion rate of industry results registries had changed.

Results: 133 of the 150 SRs (89%) in the original analysis did not search industry results registries. For 23 (17%) of these SRs we found 25 additional trials and additional data on 31 trials already included in the SRs. This additional information was found for more than twice as many SRs of drugs approved from 2000 onwards as of drugs approved beforehand. The inclusion of the additional trials and data yielded changes in existing results or the addition of new results for 6 of the 23 SRs. Of the 20 SRs retrieved in the search update, 8 considered ClinicalTrials.gov or a meta-registry linking to ClinicalTrials.gov, and 1 considered an industry results registry.

Conclusion: The inclusion of industry and public results registries as an information source in SRs is still insufficient and may result in publication and outcome reporting bias. In addition to an essential search of ClinicalTrials.gov, authors of SRs should consider searching industry results registries.
Background: Randomised controlled trials (RCTs) are widely accepted as the preferred study design for evaluating healthcare interventions. When the sample size is determined, a (target) difference is typically specified that the RCT is designed to detect. This provides reassurance that the study will be informative, i.e., should such a difference exist, it is likely to be detected with the required statistical precision. The aim of this review was to identify potential methods for specifying the target difference in an RCT sample size calculation.

Methods and Findings: A comprehensive systematic review of medical and non-medical literature was carried out for methods that could be used to specify the target difference for an RCT sample size calculation. The databases searched were MEDLINE, MEDLINE In-Process, EMBASE, the Cochrane Central Register of Controlled Trials, the Cochrane Methodology Register, PsycINFO, Science Citation Index, EconLit, the Education Resources Information Center (ERIC), and Scopus (for in-press publications); the search period was from 1966, or the earliest date covered, to between November 2010 and January 2011. Additionally, textbooks addressing the methodology of clinical trials and the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) tripartite guidelines for clinical trials were also consulted. A narrative synthesis of methods was produced. Studies that described a method that could be used for specifying an important and/or realistic difference were included. The search identified 11,485 potentially relevant articles from the databases searched. Of these, 1,434 were selected for full-text assessment, and a further nine were identified from other sources. Fifteen clinical trial textbooks and the ICH tripartite guidelines were also reviewed. In total, 777 studies were included, and within them, seven methods were identified: anchor, distribution, health economic, opinion-seeking, pilot study, review of the evidence base, and standardised effect size.

Conclusions: A variety of methods are available that researchers can use for specifying the target difference in an RCT sample size calculation. Appropriate methods may vary depending on the aim (e.g., specifying an important difference versus a realistic difference), context (e.g., research question and availability of data), and underlying framework adopted (e.g., Bayesian versus conventional statistical approach). Guidance on the use of each method is given. No single method provides a perfect solution for all contexts. Please see later in the article for the Editors' Summary.
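However the target difference is chosen, it enters the sample size calculation through a conventional formula. The short Python sketch below is a generic illustration, not material from the review: it uses the standard normal-approximation formula for comparing two means with equal group sizes, and the alpha, power, standard deviation, and target difference shown are assumed values.

```python
# Generic two-group sample size calculation for a continuous outcome.
# alpha, power, sd, and the target difference below are illustrative assumptions.
import math
from scipy.stats import norm

def n_per_group(target_diff, sd, alpha=0.05, power=0.80):
    """Approximate patients per group to detect `target_diff` between two means."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = norm.ppf(power)            # corresponds to 1 - beta
    n = 2 * ((z_alpha + z_beta) * sd / target_diff) ** 2
    return math.ceil(n)                 # round up to the next whole patient

# A standardised effect size of 0.5 corresponds to target_diff = 0.5 * sd:
print(n_per_group(target_diff=5, sd=10))   # roughly 63 patients per group
```

The point of the seven methods identified in the review is precisely to justify the target difference (and, where relevant, the assumed standard deviation) that such a calculation takes as given.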
RCI, reliable change index; VAS, visual analogue scale; WTP, willingness to pay per unit of effectiveness.
Assessment of the value of the methods.
When evaluating the real-world treatment effect, an analysis based on randomized clinical trials (RCTs) alone often introduces generalizability bias due to differences in risk factors between the trial participants and the real-world patient population. This lack of generalizability in RCT-only analyses can be addressed by leveraging observational studies with large sample sizes that are representative of the real-world population. A set of novel statistical methods, termed “genRCT”, for improving the generalizability of trial findings has been developed using calibration weighting, which enforces covariate balance between the RCT and the observational study. This paper aims to review statistical methods for generalizing RCT findings by harnessing information from large observational studies that represent real-world patients. Specifically, we discuss the choice of data sources and variables needed to meet key theoretical assumptions and principles. We introduce and compare estimation methods for continuous, binary, and survival endpoints. We showcase the use of the R package genRCT through a case study that estimates the average treatment effect of adjuvant chemotherapy for stage IB non-small cell lung cancer patients represented by a large cancer registry.
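The core calibration-weighting step can be illustrated in a few lines. The Python sketch below is a simplified illustration under stated assumptions, not the genRCT package's implementation: the covariates, sample sizes, and outcome model are hypothetical, and entropy-balancing-style weights are computed so that the weighted RCT covariate means match those of the observational sample before a weighted treatment effect is taken.

```python
# Illustrative sketch of calibration weighting (entropy-balancing style);
# NOT the genRCT implementation. All data below are simulated and hypothetical.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Hypothetical data: RCT participants are younger and healthier than the
# real-world (observational) population they are meant to represent.
n_rct, n_obs = 300, 3000
x_rct = np.column_stack([rng.normal(60, 8, n_rct), rng.normal(0.8, 0.3, n_rct)])
x_obs = np.column_stack([rng.normal(67, 10, n_obs), rng.normal(1.0, 0.4, n_obs)])
treat = rng.integers(0, 2, n_rct)                      # 1:1 randomisation
y = 0.05 * x_rct[:, 0] + 0.5 * treat * x_rct[:, 1] + rng.normal(0, 1, n_rct)

# Standardise covariates (helps the optimiser); the calibration targets are the
# observational sample's covariate means.
mu, sd = x_obs.mean(axis=0), x_obs.std(axis=0)
z_rct, z_obs = (x_rct - mu) / sd, (x_obs - mu) / sd
target = z_obs.mean(axis=0)

def dual(lam):
    # Entropy-balancing dual; its minimiser yields weights w_i proportional to
    # exp(lam' (z_i - target)), which balance the covariates by construction.
    return np.log(np.sum(np.exp((z_rct - target) @ lam)))

lam = minimize(dual, x0=np.zeros(z_rct.shape[1]), method="BFGS").x
w = np.exp((z_rct - target) @ lam)
w /= w.sum()

print("unweighted RCT covariate means:", x_rct.mean(axis=0))
print("calibrated RCT covariate means:", w @ x_rct)    # should match the obs means
print("observational covariate means: ", x_obs.mean(axis=0))

# Calibration-weighted difference in means as the generalised treatment effect
ate = (w[treat == 1] @ y[treat == 1]) / w[treat == 1].sum() \
      - (w[treat == 0] @ y[treat == 0]) / w[treat == 0].sum()
print("calibration-weighted treatment effect estimate:", round(ate, 3))
```

A complete analysis would also need variance estimation and endpoint-appropriate estimators for binary and survival outcomes, which the paper reviews.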
CVD = cardiovascular disease; No = number; NR = not reported; Rx = treatment; T2DM = type 2 diabetes mellitus; USA = United States of America; yrs = years. Note: In the case of multiple intervention groups, we selected the one pair of interventions (i.e., treatment and control) that was most relevant to this systematic review question.
*Upper 95% confidence limit for the hazard ratio was no greater than 1.18; **upper 95% confidence limit for the hazard ratio was less than 1.40.
Objectives: To test the inter-rater reliability of the RoB tool applied to physical therapy (PT) trials by comparing ratings from Cochrane review authors with those of blinded external reviewers.

Methods: Randomized controlled trials (RCTs) in PT were identified by searching the Cochrane Database of Systematic Reviews for meta-analyses of PT interventions. RoB assessments were conducted independently by 2 reviewers blinded to the RoB ratings reported in the Cochrane reviews. Data on RoB assessments from the Cochrane reviews and other characteristics of reviews and trials were extracted. Consensus assessments between the two reviewers were then compared with the RoB ratings from the Cochrane reviews. Agreement between Cochrane and blinded external reviewers was assessed using weighted kappa (κ).

Results: In total, 109 trials included in 17 Cochrane reviews were assessed. Inter-rater reliability on the overall RoB assessment between Cochrane review authors and blinded external reviewers was poor (κ = 0.02, 95% CI −0.06 to 0.06). Inter-rater reliability on individual domains of the RoB tool was poor (median κ = 0.19), ranging from κ = −0.04 (“Other bias”) to κ = 0.62 (“Sequence generation”). There was also no agreement (κ = −0.29, 95% CI −0.81 to 0.35) in the overall RoB assessment at the meta-analysis level.

Conclusions: Risk of bias assessments of RCTs using the RoB tool are not consistent across different research groups. Poor agreement was demonstrated not only at the trial level but also at the meta-analysis level. The results have implications for decision making, since different recommendations can be reached depending on the group analyzing the evidence. Improved guidelines for consistently applying the RoB tool, and revisions to the tool for different health areas, are needed.
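For readers unfamiliar with the agreement statistic used above, the short sketch below shows how a weighted kappa between two sets of ordinal risk-of-bias judgements could be computed with scikit-learn; the ratings are invented for illustration and are not the study data.

```python
# Illustrative weighted-kappa calculation; the ratings below are made up, not study data.
from sklearn.metrics import cohen_kappa_score

# Ordinal risk-of-bias judgements: 0 = low, 1 = unclear, 2 = high
cochrane = [0, 0, 1, 2, 1, 0, 2, 1, 1, 0, 2, 2]   # ratings from the published reviews
external = [0, 1, 1, 2, 0, 0, 1, 1, 2, 0, 2, 1]   # blinded external reviewers' ratings

# Linear weights penalise disagreements in proportion to how many categories apart they are
kappa = cohen_kappa_score(cochrane, external, weights="linear")
print(f"weighted kappa = {kappa:.2f}")
```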
Additional file 1: Search strategy. The document AdditionalFile1.odt contains the full search strategy used for each database. The PRISMA-S guidelines [46] were followed to report the search strategy.
RD = risk difference; MD = mean difference; HR = hazard ratio; PP = per protocol; ITT = intention-to-treat; NI = non-inferiority; NR = not reported; * = NI established; # = NI not established; $ = superior; % = study terminated early; ** = study inconclusive; ## = inferior; $$ = NI established by PP analysis, NI not established by ITT analysis.
Adverse reactions during study.
Descriptive Statistics of ITT Population.
Pharmaceutical obesity RCTs used to evaluate the scope of the missing data problem.pdf (0.27 MB DOC)
HRQoL = health-related quality of life; PT = physical therapy; RCT = randomized controlled trial; RoB = Risk of Bias; ROM = range of motion; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index.
Additional file 2: Information Sources. The document AdditionalFile2.odt contains the list of all sources searched, the dates of the searches and the interfaces used.
Cumulative meta-analyses of studies of the effects of healthcare interventions. (DOCX)
M (SD) unless otherwise stated. N = 797. a Sex and age standardized cut points [22]. b Household income lower than 50% of the sample median income. c Both parents native Danish. n varies due to different data sources.
Supplementary material including Nonlinear curve fitting and estimation of time to relapse, Figure S1 and Table S1. (DOC)