Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Frequency of reported types of studies and use of descriptive and inferential statistics (n = 216).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Introductory statistical inference texts and courses treat the point estimation, hypothesis testing, and interval estimation problems separately, with primary emphasis on large-sample approximations. Here, I present an alternative approach to teaching this course, built around p-values, emphasizing provably valid inference for all sample sizes. Details about computation and marginalization are also provided, with several illustrative examples, along with a course outline. Supplementary materials for this article are available online.
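The abstract's emphasis on inference that is provably valid at every sample size can be illustrated with an exact test, which avoids the large-sample normal approximation entirely. This is a generic sketch of one such test, not the article's specific p-value construction.

```python
from math import comb

def exact_binom_pvalue(k, n, p0=0.5):
    """One-sided exact p-value for H0: p = p0 vs H1: p > p0.

    Valid for every sample size n because it sums the exact binomial
    null distribution rather than using a normal approximation."""
    return sum(comb(n, j) * p0**j * (1 - p0)**(n - j) for j in range(k, n + 1))

# 9 successes in 10 trials under H0: p = 0.5
pval = exact_binom_pvalue(9, 10)
print(round(pval, 4))  # 11/1024, prints 0.0107
```

Because the null distribution is computed exactly, the test's size is controlled at every n, which is the kind of finite-sample validity the course design emphasizes.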
This article resulted from our participation in the session on the “role of expert opinion and judgment in statistical inference” at the October 2017 ASA Symposium on Statistical Inference. We present a strong, unified statement on roles of expert judgment in statistics with processes for obtaining input, whether from a Bayesian or frequentist perspective. Topics include the role of subjectivity in the cycle of scientific inference and decisions, followed by a clinical trial and a greenhouse gas emissions case study that illustrate the role of judgments and the importance of basing them on objective information and a comprehensive uncertainty assessment. We close with a call for increased proactivity and involvement of statisticians in study conceptualization, design, conduct, analysis, and communication.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
One of the newest types of multimedia involves body-connected interfaces, usually termed haptics. Haptics may use stylus-based tactile interfaces, glove-based systems, handheld controllers, balance boards, or other custom-designed body-computer interfaces. How well do these interfaces help students learn Science, Technology, Engineering, and Mathematics (STEM)? We conducted an updated review of learning STEM with haptics, applying meta-analytic techniques to 21 published articles reporting on 53 effects for factual, inferential, procedural, and transfer STEM learning. This deposit includes the data extracted from those articles and comprises the raw data used in the meta-analytic analyses.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This work is motivated by learning the individualized minimal clinically important difference, a vital concept for assessing clinical importance in various biomedical studies. We formulate the scientific question as a high-dimensional statistical problem in which the parameter of interest lies in an individualized linear threshold. The goal is to develop a hypothesis testing procedure for the significance of a single element of this parameter, as well as of a linear combination of its elements. The difficulty in developing such a testing procedure is due to the high-dimensional nuisance parameter, and also stems from the fact that this high-dimensional threshold model is nonregular, so the limiting distribution of the corresponding estimator is nonstandard. To deal with these challenges, we construct a test statistic via a new bias-corrected smoothed decorrelated score approach, and establish its asymptotic distributions under both null and local alternative hypotheses. We propose a double-smoothing approach to select the optimal bandwidth in our test statistic and provide theoretical guarantees for the selected bandwidth. We conduct simulation studies to demonstrate how our proposed procedure can be applied in empirical studies. We apply the proposed method to a clinical trial where the scientific goal is to assess the clinical importance of a surgical procedure. Supplementary materials for this article are available online.
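The smoothing idea at the heart of such approaches, replacing a sharp threshold indicator with a kernel-smoothed surrogate controlled by a bandwidth, can be sketched generically. The normal-CDF kernel and bandwidth values below are illustrative assumptions, not the article's specific choices.

```python
import math

def smoothed_indicator(u, h):
    """Kernel-smoothed surrogate for the indicator 1{u > 0}: the
    standard normal CDF evaluated at u / h. Smaller bandwidths h give
    a sharper threshold; h -> 0 recovers the nonsmooth indicator that
    makes threshold models nonregular."""
    return 0.5 * (1.0 + math.erf(u / (h * math.sqrt(2.0))))

print(smoothed_indicator(0.0, 0.1))  # exactly 0.5 at the threshold
```

The bandwidth trades off bias (large h flattens the step) against the nonsmoothness that breaks standard asymptotics, which is why a principled bandwidth selector such as the article's double-smoothing approach is needed.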
The Bland and Altman plot method is a widely cited and applied graphical approach for assessing the equivalence of quantitative measurement techniques, usually with the aim of replacing a traditional technique with a new, less invasive or less expensive one. However, the Bland and Altman plot is often misinterpreted due to a lack of suitable inferential statistical support. Usual alternatives, such as Pearson's correlation or ordinary least-squares linear regression, also fail to identify the weaknesses of each measurement technique. This is a package designed for the analysis of equivalence between measurement techniques. It should be noted that this package does not introduce another iteration of the Bland-Altman plot method. The package's name and our intention were simply inspired by the shared objective of establishing equivalence. This objective revolves around comparing single or repeated interval-scaled measures from two measurement techniques applied to the same subjects. We have developed a completely different inferential test, in contrast to the original Bland-Altman proposal. We have highlighted certain criticisms of the original Bland-Altman plot method, which relies heavily on visual inspection and subjectivity for determining equivalence. Our goal is to empower the reader to make an informed decision regarding the validity of a new measurement technique. Here, inferential statistical support for equivalence between measurement techniques is provided by three nested tests based on structural regressions, assessing the equivalence of structural means (accuracy), the equivalence of structural variances (precision), and concordance with the structural bisector line (agreement in measurements obtained from the same subject), by analytical methods and a robust bootstrapping approach. Graphical outputs are also implemented, following Bland and Altman's principles for easy communication.
The related publication shows that this approach was tested on five datasets from articles that used Bland and Altman's method. In one case, where authors concluded disagreement, the approach identified equivalence by addressing bias correction. In another case, it aligned with the original assessment but refined the original authors’ results. In a specific case, unnecessary numerical transformations led to a conclusion of equivalence, but this approach, which naturally generates slanted bands, found non-equivalence in precision and agreement. In one case where authors claimed disagreement, the approach revealed precision issues, rendering the comparison invalid.
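For readers unfamiliar with the baseline being critiqued, the classic Bland-Altman summary (bias and 95% limits of agreement) takes only a few lines. This sketch shows that visual baseline only, not the package's structural-regression tests, and the measurement data are made up for illustration.

```python
import statistics

def bland_altman_limits(a, b):
    """Classic Bland-Altman summary for paired measurements from two
    techniques: mean difference (bias) and 95% limits of agreement
    (bias +/- 1.96 * SD of the differences). This is the descriptive
    method the text critiques; it involves no inferential test."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired readings from techniques A and B on 5 subjects
a = [10.1, 9.8, 10.5, 10.0, 9.9]
b = [10.0, 9.9, 10.3, 10.1, 9.7]
bias, lo, hi = bland_altman_limits(a, b)
```

Deciding equivalence from whether these limits look "narrow enough" is exactly the visual, subjective step the package replaces with nested tests for accuracy, precision, and agreement.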
Networks are often characterized by node heterogeneity for which nodes exhibit different degrees of interaction and link homophily for which nodes sharing common features tend to associate with each other. In this article, we rigorously study a directed network model that captures the former via node-specific parameterization and the latter by incorporating covariates. In particular, this model quantifies the extent of heterogeneity in terms of outgoingness and incomingness of each node by different parameters, thus allowing the number of heterogeneity parameters to be twice the number of nodes. We study the maximum likelihood estimation of the model and establish the uniform consistency and asymptotic normality of the resulting estimators. Numerical studies demonstrate our theoretical findings and two data analyses confirm the usefulness of our model. Supplementary materials for this article are available online.
Models with intractable normalizing functions arise frequently in statistics. Common examples of such models include exponential random graph models for social networks and Markov point processes for ecology and disease modeling. Inference for these models is complicated because the normalizing functions of their probability distributions include the parameters of interest. In Bayesian analysis, they result in so-called doubly intractable posterior distributions which pose significant computational challenges. Several Monte Carlo methods have emerged in recent years to address Bayesian inference for such models. We provide a framework for understanding the algorithms, and elucidate connections among them. Through multiple simulated and real data examples, we compare and contrast the computational and statistical efficiency of these algorithms and discuss their theoretical bases. Our study provides practical recommendations for practitioners along with directions for future research for Markov chain Monte Carlo (MCMC) methodologists. Supplementary materials for this article are available online.
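One widely used algorithm in this family is the exchange algorithm, whose key move is an auxiliary draw from the model at the proposed parameter so that the intractable normalizing constants cancel in the acceptance ratio. The sketch below uses an exponential model whose normalizer is actually known, purely so the auxiliary draw is easy; the observed value, proposal scale, and flat prior are illustrative assumptions, not a recommendation from the article.

```python
import math
import random

random.seed(1)

def q(x, theta):
    # Unnormalized density exp(-theta * x); its normalizer Z(theta) = 1/theta
    # is treated as unknown here to mimic the doubly intractable setting.
    return math.exp(-theta * x)

def sample_exact(theta):
    # Exact sampling from the model, which the exchange algorithm requires.
    return random.expovariate(theta)

x_obs = 0.5          # a single observed data point
theta = 1.0
chain = []
for _ in range(20000):
    prop = abs(theta + random.gauss(0.0, 1.5))  # symmetric reflected random walk
    y_aux = sample_exact(prop)                  # auxiliary draw at the proposal
    # Acceptance ratio in which every Z(theta) cancels -- the key idea:
    ratio = (q(x_obs, prop) * q(y_aux, theta)) / (q(x_obs, theta) * q(y_aux, prop))
    if random.random() < min(1.0, ratio):
        theta = prop
    chain.append(theta)

# Under a flat prior the exact posterior here is Gamma(shape 2, rate 0.5), mean 4,
# so the chain mean should settle near 4.
post_mean = sum(chain[5000:]) / len(chain[5000:])
```

The same accept/reject structure applies when exact sampling is genuinely expensive (e.g., perfect sampling for an Ising model), which is where the computational trade-offs surveyed in the article arise.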
There is a growing interest in cell-type-specific analysis from bulk samples with a mixture of different cell types. A critical first step in such analyses is the accurate estimation of cell-type proportions in a bulk sample. Although many methods have been proposed recently, quantifying the uncertainties associated with the estimated cell-type proportions has not been well studied. Lack of consideration of these uncertainties can lead to missed or false findings in downstream analyses. In this article, we introduce a flexible statistical deconvolution framework that allows a general and subject-specific covariance of bulk gene expressions. Under this framework, we propose a decorrelated constrained least squares method called DECALS that estimates cell-type proportions as well as the sampling distribution of the estimates. Simulation studies demonstrate that DECALS can accurately quantify the uncertainties in the estimated proportions whereas other methods fail. Applying DECALS to analyze bulk gene expression data of post mortem brain samples from the ROSMAP and GTEx projects, we show that taking into account the uncertainties in the estimated cell-type proportions can lead to more accurate identifications of cell-type-specific differentially expressed genes and transcripts between different subject groups, such as between Alzheimer’s disease patients and controls and between males and females. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
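The estimation problem DECALS builds on can be illustrated with a plain constrained least squares fit: non-negative cell-type proportions that sum to one, given a signature matrix of reference expression profiles. This is a generic sketch with a made-up signature matrix, not the decorrelated DECALS estimator or its uncertainty quantification.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical signature matrix: rows = genes, columns = cell types,
# entries = reference expression of each gene in each cell type.
S = np.array([[5.0, 1.0],
              [1.0, 4.0],
              [3.0, 3.0],
              [0.5, 2.0]])
p_true = np.array([0.7, 0.3])
y = S @ p_true                  # noiseless bulk expression for this sketch

# Constrained least squares: non-negativity via NNLS, then renormalize
# so the estimated proportions sum to one.
p_hat, _ = nnls(S, y)
p_hat = p_hat / p_hat.sum()
```

A point estimate like `p_hat` is where many deconvolution methods stop; the article's contribution is attaching a sampling distribution to it so that downstream differential-expression analyses can propagate that uncertainty.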
A recent article reported evidence from a survey experiment indicating that Americans reward whites more than blacks for hard work but penalize blacks more than whites for laziness. However, the present study demonstrates that these inferences were based on an unrepresentative selection of possible analyses: the strength of inferences from results reported in the original article was weakened when those results were combined with results from equivalent or relevant analyses not reported in the original article. Moreover, newly reported evidence revealed heterogeneity in racial bias: respondents given a direct choice between equivalent targets of different races favored the black target over the white target. The results illustrate how researcher degrees of freedom can foster the production of inferences that are not representative of all inferences that could have been produced from a set of data, underscoring the value of preregistering research design protocols and requiring public posting of data.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Background. Contests are games in which players compete for a prize and exert effort to increase their probability of winning. For sport contests, analysts often use the Pythagorean model to estimate teams' expected wins (quality). We ask whether there are alternative contest models that minimize error or information loss from misspecification and outperform the Pythagorean model. Aim. This article uses simulated data to select the optimal expected-wins model from a set of relevant alternatives: the traditional Pythagorean model and difference-form contest success functions (CSF). Method. We simulate 1,000 iterations of the 2014 MLB season for the purpose of estimating and analyzing alternative models of expected wins (team quality). We use the open-source Strategic Baseball Simulator and develop an AutoHotKey script that programmatically executes the SBS application, chooses the correct settings for the 2014 season, enters a unique ID for the simulation data file, and iterates these steps 1,000 times. We estimate expected wins using the traditional Pythagorean model, as well as the difference-form CSF model used in game theory and public choice economics. Each model is estimated while accounting for fixed (team) effects. Result. We find that the difference-form CSF model outperforms the traditional Pythagorean model in terms of explanatory power and misspecification-based information loss as estimated by the Akaike Information Criterion. Through parametric estimation, we further confirm that the simulator yields realistic statistical outcomes.
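The two competing functional forms can be written down directly. The Pythagorean exponent of 2 is the textbook default, and the logistic difference-form CSF with its sensitivity parameter `beta` is an illustrative stand-in, not the article's fitted specification.

```python
import math

def pythagorean_wpct(rs, ra, gamma=2.0):
    """Classic Pythagorean expectation: win% = RS^g / (RS^g + RA^g).
    gamma = 2 is the traditional exponent."""
    return rs**gamma / (rs**gamma + ra**gamma)

def diff_csf_wpct(rs, ra, beta=0.005):
    """Difference-form contest success function with a logistic link:
    win probability depends on the run differential RS - RA. The value
    of beta here is a hypothetical sensitivity parameter chosen only
    for illustration."""
    return 1.0 / (1.0 + math.exp(-beta * (rs - ra)))

# Runs scored/allowed for a hypothetical MLB-scale team
rs, ra = 750, 680
print(round(pythagorean_wpct(rs, ra), 3))  # prints 0.549
```

Both forms map run production into expected winning percentage; the article's comparison asks which form, with team fixed effects, fits the simulated seasons with less information loss.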
License: etalab-2.0 (https://spdx.org/licenses/etalab-2.0.html)
Data and code for reproducing the MCMC inferences, tables, and main figures described in Kwame Adrakey et al. 2023, Bayesian inference for spatio-temporal stochastic transmission of plant disease in the presence of roguing: a case study to estimate the dispersal distance of Flavescence dorée. The code is written in C and R.
License: CC0-1.0 (https://spdx.org/licenses/CC0-1.0.html)
Nearly sixty years ago, in a publication whose citation rate has grown ever since, JR Platt presented “strong inference” (SI) as an accumulative method of inductive inference that would produce much more rapid progress than other methods. The article offered persuasive testimony for the use of multiple working hypotheses combined with disconfirmation, and it is often cited as an exemplar of scientific practice. However, the article provides no evidence of greater efficacy. Over a 34-year period, a total of 780 matched trials were completed in 56 labs in a university course in statistical science. The reduction from random (18.9 cards) to selected cards was 7.2 cards, compared to a further reduction of 0.3 cards from selected cards to SI. In 46% of the 780 trials, the number of cards needed to infer a rule was greater for strong inference than for a less rigid experimental method. Based on this evidence, strong inference added little strength beyond that of less rigidly structured experiments. Methods. Using inductive cards as a model (Gardner 1959, Inductive Cards, Scientific American 200:160), I devised a lab for a course in statistics for graduate and upper-level undergraduate university students. Students worked in groups of three or four. One person (Nature) devises a rule for placing cards in piles. The other students in the group work together to infer the rule for cards placed by Nature according to the unknown rule. On the first round, cards are drawn from a shuffled deck; this is an observational study with an uncontrolled random component. On the second round (selected cards), each rule moves to a different group, where students choose cards to present to Nature (an experimental study). On the third round, a new group applies the strong inference (SI) method to a rule. The lab required students to list multiple working hypotheses at each step, list one or more “crucial test” cards, present them to Nature for placement, and disconfirm one or more hypotheses.
The procedure is repeated until the rule is discovered. The number of cards to infer the rule is tallied by the instructor and distributed to students for their lab write-ups.
Background. Chocolate, as a cocoa-derived product rich in flavanols, has been used for medical and anti-inflammatory purposes. Therefore, the aim of this study was to investigate whether the ingestion of cocoa products with different cocoa percentages affects experimentally induced pain caused by intramuscular hypertonic saline injections in the masseter muscle of healthy men and women. Methods. This experimental randomized, double-blind, controlled study included 15 young, healthy, pain-free men and 15 age-matched women and involved three visits with at least a 1-week washout. Pain was induced twice at each visit with intramuscular injections of 0.2 mL hypertonic saline (5%), before and after intake of one of three chocolate types: white (30% cocoa content), milk (34% cocoa content), and dark (70% cocoa content). Pain duration, pain area, peak pain, and pressure pain threshold (PPT) were assessed every fifth minute after each injection, up until 30 min after the initial injection. Descriptive and inferential statistics were performed using IBM SPSS (Version 27); the significance level was set to p < 0.05. Results. Intake of chocolate, regardless of type, reduced the induced pain intensity significantly more than no intake of chocolate (p < 0.05, Tukey test). There were no differences between the chocolate types. Further, men showed a significantly greater pain reduction than women after intake of white chocolate (p < 0.05, Tukey test). No other differences between pain characteristics or sexes were revealed. Conclusion. Intake of chocolate before a painful stimulus had a pain-reducing effect regardless of cocoa concentration. The results indicate that it is perhaps not the cocoa concentration (e.g., flavanols) alone that explains the positive effect on pain, but likely a combination of preference and taste experience. Another possible explanation could be the composition of the chocolate, i.e., the concentration of the other ingredients such as sugar, soy, and vanilla. ClinicalTrials.gov Identifier: NCT05378984.
Abstract. The aim of this study was to compare gifted students (with academic and artistic talent) and non-gifted students regarding overexcitability, and to investigate the perceptions of teachers from a specialized educational program for the gifted about their students' emotional development. The study included 150 students and six teachers. As instruments, we used participant characterization questionnaires, an overexcitability scale, and a semi-structured interview script. Data were analyzed using inferential statistics and content analysis. The results indicated significant differences between gifted and non-gifted students in the patterns of intellectual and imaginative overexcitability, as well as a tendency for teachers to characterize gifted students emotionally with an emphasis on psychological disorders and weaknesses. Investing in educational strategies that use information derived from overexcitability patterns as tools to facilitate the learning process of the gifted can contribute to increasing student engagement at school and keeping students motivated.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This dataset supports the research article entitled ‘Supplementary Data for the Mediation Effect of Personality Functioning – Gender Differences, Separate Analyses of Depression and Anxiety Symptoms and Inferential Statistics of the Relationship Between Personality Functioning and Different Types of Child Maltreatment' (Freier et al., 2022). The data were collected from 2,510 participants. The data include participants' socio-demographic variables (age, gender, level of education, and personal income), the raw item responses, and the sum scores for the Childhood Trauma Questionnaire, the Four-Item Public Health Questionnaire, and the short version of the OPD Structure Questionnaire (OPD-SQS). A further description of the data can be found in the related article in Data in Brief.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
These are the data obtained from four evaluations of a sample of 630 students at a Peruvian pre-university center. First, two study variables (compliance with the teaching sequence and the student's educational need) were evaluated from the perspective of 315 students in the verbal reasoning course. Second, the same study variables were assessed in the remaining 315 students in the mathematical reasoning course. This information is being used in research toward an academic degree and later for publication of a scientific article.
For the treatment of these data, inferential statistics were computed using R version 3.4.4 (2018, The R Foundation for Statistical Computing).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Bayesian Classification and Regression Trees (BCART) and Bayesian Additive Regression Trees (BART) are popular Bayesian regression models widely applicable in modern regression problems. Their popularity is intimately tied to the ability to flexibly model complex responses depending on high-dimensional inputs while simultaneously being able to quantify uncertainties. This ability to quantify uncertainties is key, as it allows researchers to perform appropriate inferential analyses in settings that have generally been too difficult to handle using the Bayesian approach. However, surprisingly little work has been done to evaluate the sensitivity of these modern regression models to violations of modeling assumptions. In particular, we will consider influential observations, which one reasonably would imagine to be common—or at least a concern—in the big-data setting. In this article, we consider both the problem of detecting influential observations and adjusting predictions to not be unduly affected by such potentially problematic data. We consider three detection diagnostics for Bayesian tree models, one an analogue of Cook’s distance and the others taking the form of a divergence measure and a conditional predictive density metric, and then propose an importance sampling algorithm to re-weight previously sampled posterior draws so as to remove the effects of influential data in a computationally efficient manner. Finally, our methods are demonstrated on real-world data where blind application of the models can lead to poor predictions and inference. Supplementary materials for this article are available online.
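The reweighting idea, using importance sampling so that posterior draws from the full data can stand in for draws from a fit with an influential observation removed, can be sketched in a simple conjugate setting. The normal model, flat prior, and data below are illustrative assumptions only; the article applies this kind of reweighting to Bayesian tree models.

```python
import math
import random

random.seed(0)

# Toy data in which the last observation pulls the posterior mean upward
y = [0.1, -0.2, 0.05, 0.3, 3.0]
n, sigma2 = len(y), 1.0
ybar = sum(y) / n  # full-data posterior mean under a flat prior

# Draws from the full-data posterior N(ybar, sigma2 / n)
draws = [random.gauss(ybar, (sigma2 / n) ** 0.5) for _ in range(50000)]

# Case-deletion importance weights w_s proportional to 1 / p(y_i | theta_s):
# reweighting the full-posterior draws approximates the posterior with
# observation i removed, without re-running the sampler.
i = 4
w = [math.exp(0.5 * (y[i] - th) ** 2 / sigma2) for th in draws]
mean_without_i = sum(wi * th for wi, th in zip(w, draws)) / sum(w)
# mean_without_i should move toward the mean of the first four points
# (0.0625) and away from the full-data mean (0.65).
```

The appeal is computational: one posterior sample serves every deletion diagnostic. The known caveat is that these weights can be heavy-tailed when the deleted point is very influential, which is exactly the regime where careful diagnostics matter.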
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
A bootstrapped-evidence toolbox implementing the methods described in my article "Non-dichotomous inference using bootstrapped evidence" (currently unpublished).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Relative frequency of extracted maximum α levels < .05 in articles with p-values.