25 datasets found
  1. Power analysis for non-normaly distributed traits

    • researchdata.edu.au
    Updated Dec 17, 2021
    Cite
    Caitlin Jenvey (2021). Power analysis for non-normaly distributed traits [Dataset]. http://doi.org/10.26181/5FCDAB1677082
    Dataset updated
    Dec 17, 2021
    Dataset provided by
    La Trobe University
    Authors
    Caitlin Jenvey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An important step in the design of any research experiment is to determine the sample size required to detect an effect. Statistical power is the likelihood that a study will detect an effect when there is an effect to be detected.

    A power analysis will estimate the sample size required for a given statistical power, effect size and significance level, assuming that your data is normally distributed. But what if your data is skewed, or non-normally distributed? This workshop will show you how to use R to assess if your data is non-normally distributed, how to determine the distribution of your data and how to perform a power analysis appropriate for that distribution.
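
    When no closed-form power formula fits a skewed trait, power can be estimated by simulation. The following is a minimal sketch in Python rather than the workshop's R material. It assumes log-normal data, a rank-based Mann-Whitney test with a normal approximation, and illustrative function names and parameter values that do not come from the dataset.

```python
import random
from statistics import NormalDist


def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney p-value (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    # Pool both groups, sort, and sum the ranks that belong to group x.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulated_power(n, shift, n_sim=500, alpha=0.05, seed=1):
    """Fraction of simulated experiments that reject the null hypothesis.

    n     : sample size per group
    shift : difference between groups on the log scale (assumed effect size)
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        control = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
        treated = [rng.lognormvariate(shift, 1.0) for _ in range(n)]
        if mann_whitney_p(control, treated) < alpha:
            rejections += 1
    return rejections / n_sim


if __name__ == "__main__":
    for n in (10, 20, 30, 40):
        print(n, simulated_power(n, shift=1.0))
```

    Increasing n until the estimated power reaches the target (for example 0.8) gives the required sample size under the assumed distribution and effect.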

  2. Normalization of High Dimensional Genomics Data Where the Distribution of...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Mattias Landfors; Philge Philip; Patrik Rydén; Per Stenberg (2023). Normalization of High Dimensional Genomics Data Where the Distribution of the Altered Variables Is Skewed [Dataset]. http://doi.org/10.1371/journal.pone.0027942
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mattias Landfors; Philge Philip; Patrik Rydén; Per Stenberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Genome-wide analysis of gene expression or protein binding patterns using different array or sequencing based technologies is now routinely performed to compare different populations, such as treatment and reference groups. It is often necessary to normalize the data obtained to remove technical variation introduced in the course of conducting experimental work, but standard normalization techniques are not capable of eliminating technical bias in cases where the distribution of the truly altered variables is skewed, i.e. when a large fraction of the variables are either positively or negatively affected by the treatment. However, several experiments are likely to generate such skewed distributions, including ChIP-chip experiments for the study of chromatin, gene expression experiments for the study of apoptosis, and SNP-studies of copy number variation in normal and tumour tissues. A preliminary study using spike-in array data established that the capacity of an experiment to identify altered variables and generate unbiased estimates of the fold change decreases as the fraction of altered variables and the skewness increases. We propose the following work-flow for analyzing high-dimensional experiments with regions of altered variables: (1) Pre-process raw data using one of the standard normalization techniques. (2) Investigate if the distribution of the altered variables is skewed. (3) If the distribution is not believed to be skewed, no additional normalization is needed. Otherwise, re-normalize the data using a novel HMM-assisted normalization procedure. (4) Perform downstream analysis. Here, ChIP-chip data and simulated data were used to evaluate the performance of the work-flow. It was found that skewed distributions can be detected by using the novel DSE-test (Detection of Skewed Experiments). 
    Furthermore, applying the HMM-assisted normalization to experiments where the distribution of the truly altered variables is skewed results in considerably higher sensitivity and lower bias than can be attained using standard and invariant normalization methods.

  3. Supplementary data for the paper "Why psychologists should not default to...

    • data.4tu.nl
    zip
    Updated Apr 28, 2025
    + more versions
    Cite
    Joost de Winter (2025). Supplementary data for the paper "Why psychologists should not default to Welch’s t-test instead of Student’s t-test (and why the Anderson–Darling test is an underused alternative)" [Dataset]. http://doi.org/10.4121/e8e6861a-7ab0-4b6d-bd67-5f95029322c5.v3
    Available download formats: zip
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    4TU.ResearchData
    Authors
    Joost de Winter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper evaluates the claim that Welch’s t-test (WT) should replace the independent-samples t-test (IT) as the default approach for comparing sample means. Simulations involving unequal and equal variances, skewed distributions, and different sample sizes were performed. For normal distributions, we confirm that the WT maintains the false positive rate close to the nominal level of 0.05 when sample sizes and standard deviations are unequal. However, the WT was found to yield inflated false positive rates under skewed distributions, even with relatively large sample sizes, whereas the IT avoids such inflation. A complementary empirical study based on gender differences in two psychological scales corroborates these findings. Finally, we contend that the null hypothesis of unequal variances together with equal means lacks plausibility, and that empirically, a difference in means typically coincides with differences in variance and skewness. An additional analysis using the Kolmogorov-Smirnov and Anderson-Darling tests demonstrates that examining entire distributions, rather than just their means, can provide a more suitable alternative when facing unequal variances or skewed distributions. Given these results, researchers should remain cautious with software defaults, such as R favoring Welch’s test.
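
    The two tests compared above differ only in how they form the standard error and the degrees of freedom. The following minimal sketch in Python shows the two textbook formulas side by side. It is not the author's simulation code, and the function names and sample values are illustrative assumptions.

```python
from statistics import mean, variance


def student_t(x, y):
    """Student's independent-samples t statistic with pooled variance."""
    n1, n2 = len(x), len(y)
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    return t, n1 + n2 - 2


def welch_t(x, y):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    a, b = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / (a + b) ** 0.5
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 6, 8, 10]
    print("Student:", student_t(x, y))
    print("Welch:  ", welch_t(x, y))
```

    With equal group sizes the two t statistics coincide and only the degrees of freedom differ. With unequal group sizes and unequal variances both quantities diverge.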

  4. Data from: Evolution of quantitative traits under a migration-selection...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1 more
    Updated Sep 12, 2023
    Cite
    Florence Débarre; Sam Yeaman; Frédéric Guillaume (2023). Evolution of quantitative traits under a migration-selection balance: when does skew matter? [Dataset]. http://doi.org/10.5061/dryad.ms52b
    Dataset updated
    Sep 12, 2023
    Dataset provided by
    Dryad Digital Repository
    Authors
    Florence Débarre; Sam Yeaman; Frédéric Guillaume
    Time period covered
    Jun 20, 2020
    Description

    Quantitative-genetic models of differentiation under migration-selection balance often rely on the assumption of normally distributed genotypic and phenotypic values. When a population is subdivided into demes with selection toward different local optima, migration between demes may result in asymmetric, or skewed, local distributions. Using a simplified two-habitat model, we derive formulas without a priori assuming a Gaussian distribution of genotypic values, and we find expressions that naturally incorporate higher moments, such as skew. These formulas yield predictions of the expected divergence under migration-selection balance that are more accurate than models assuming Gaussian distributions, which illustrates the importance of incorporating these higher moments to assess the response to selection in heterogeneous environments. We further show with simulations that traits with loci of large effect display the largest skew in their distribution at migration-selection balance.

  5. Alternative technical efficiency measures: Skew, bias and scale (replication...

    • journaldata.zbw.eu
    • jda-test.zbw.eu
    txt
    Updated Dec 7, 2022
    Cite
    Qu Feng; William C. Horrace (2022). Alternative technical efficiency measures: Skew, bias and scale (replication data) [Dataset]. http://doi.org/10.15456/jae.2022320.0724524832
    Available download formats: txt(2357), txt(126320)
    Dataset updated
    Dec 7, 2022
    Dataset provided by
    ZBW - Leibniz Informationszentrum Wirtschaft
    Authors
    Qu Feng; William C. Horrace
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In the fixed-effects stochastic frontier model an efficiency measure relative to the best firm in the sample is universally employed. This paper considers a new measure relative to the worst firm in the sample. We find that estimates of this measure have smaller bias than those of the traditional measure when the sample consists of many firms near the efficient frontier. Moreover, a two-sided measure relative to both the best and the worst firms is proposed. Simulations suggest that the new measures may be preferred depending on the skewness of the inefficiency distribution and the scale of efficiency differences.

  6. Data and Code for: Intrinsic Information Preferences and Skewness

    • openicpsr.org
    Updated May 2, 2023
    Cite
    Yusufcan Masatlioglu; Yesim Orhun; Collin Raymond (2023). Data and Code for: Intrinsic Information Preferences and Skewness [Dataset]. http://doi.org/10.3886/E190641V1
    Dataset updated
    May 2, 2023
    Dataset provided by
    American Economic Association
    Authors
    Yusufcan Masatlioglu; Yesim Orhun; Collin Raymond
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 2014
    Area covered
    US
    Description

    This project examines whether people have an intrinsic preference for negatively skewed or positively skewed information structures and how these preferences relate to intrinsic preferences for informativeness. It reports results from 5 studies (3 lab experiments, 2 online studies).

  7. Data from: Food web interaction strength distributions are conserved by...

    • datadryad.org
    • zenodo.org
    • +1 more
    zip
    Updated Jul 22, 2019
    Cite
    Daniel L. Preston; Landon P. Falke; Jeremy S. Henderson; Mark Novak (2019). Food web interaction strength distributions are conserved by greater variation between than within predator-prey pairs [Dataset]. http://doi.org/10.5061/dryad.sr6888t
    Available download formats: zip
    Dataset updated
    Jul 22, 2019
    Dataset provided by
    Dryad
    Authors
    Daniel L. Preston; Landon P. Falke; Jeremy S. Henderson; Mark Novak
    Time period covered
    2019
    Description

    • Abiotic_Raw_Dryad: Raw data on abiotic variables from surveyed stream reaches.
    • Abiotic_Summarized_Dryad: Summarized data on abiotic variables from surveyed stream reaches.
    • FeedingRates_Dryad: Sculpin feeding rates from nine reaches in three Oregon streams across three seasons.
    • PreyItems_Dryad: Sculpin stomach contents data.
    • SculpinDensity_Dryad: Sculpin density data from electroshock surveys.
    • Surbers_Raw_Dryad: Raw benthic macroinvertebrate data from Surber samples.
    • Surbers_Summarized_Dryad: Summarized benthic macroinvertebrate data from Surber samples.

  8. Data from: The improbability of detecting trade-offs and some practical...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jul 19, 2024
    Cite
    Marc Johnson (2024). The improbability of detecting trade-offs and some practical solutions [Dataset]. http://doi.org/10.5061/dryad.xpnvx0kq5
    Available download formats: zip
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    University of Toronto
    Authors
    Marc Johnson
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Trade-offs are a fundamental concept in evolutionary biology because they are thought to explain much of nature’s biological diversity, from variation in life-histories to differences in metabolism. Despite the predicted importance of trade-offs, they are notoriously difficult to detect. Here we contribute to the existing rich theoretical literature on trade-offs by examining how the shape of the distribution of resources or metabolites acquired in an allocation pathway influences the strength of trade-offs between traits. We further explore how variation in resource distribution interacts with two aspects of pathway complexity (i.e., the number of branches and hierarchical structure) to affect trade-offs. We simulate variation in the shape of the distribution of a resource by sampling 10^6 individuals from a beta distribution with varying parameters to alter the resource shape. In a simple “Y-model” allocation of resources to two traits, any variation in a resource leads to slopes less than -1, with left skewed and symmetrical distributions leading to negative relationships between traits, and highly right skewed distributions associated with positive relationships between traits. Adding more branches further weakens negative and positive relationships between traits, and the hierarchical structure of pathways typically weakens relationships between traits, although in some contexts hierarchical complexity can strengthen positive relationships between traits. Our results further illuminate how variation in the acquisition and allocation of resources, and particularly the shape of a resource distribution and how it interacts with pathway complexity, makes it challenging to detect trade-offs. We offer several practical suggestions on how to detect trade-offs given these challenges.

    Methods

    Overview of Flux Simulations

    To study the strength and direction of trade-offs within a population, we developed a simulation of flux in a simple metabolic pathway, where a precursor metabolite emerging from node A may be converted to either metabolic product B1 or B2 (Fig. 1). This conception of a pathway is similar to De Jong and Van Noordwijk’s Y-model (Van Noordwijk & De Jong, 1986; De Jong & Van Noordwijk, 1992), but we used simulation instead of analytical statistical models to allow us to consider greater complexity in the distribution of variables and pathways. For a simple pathway (Fig. 1), the total flux Jtotal (i.e., the flux at node A, denoted as JA) for each individual (N = 10^6) was first sampled from a predetermined beta distribution as described below. The flux at node B1 (JB1) was then randomly sampled from this distribution with max = Jtotal = JA and min = 0. The flux at the remaining node, B2, was then simply the remaining flux (JB2 = JA - JB1). Simulations of more complex pathways followed the same basic approach as described above, with increased numbers of branches and hierarchical levels added to the pathway as described below under Question 2. The metabolic pathways were simulated using Python (v. 3.8.2) (Van Rossum & Drake Jr., 2009) where we could control the underlying distribution of metabolite allocation. The output flux at nodes B1 and B2 was plotted using R (v. 4.2.1) (R Core Team, 2022) with the resulting trade-off visualized as a linear regression using the ggplot2 R package (v. 3.4.2) (Wickham, 2016). While we have conceptualized the pathway as the flux of metabolites, it could be thought of as any resource being allocated to different traits.

    Question 1: How does variation in resource distribution within a population affect the strength and direction of trade-offs?

    We first simulated the simplest scenario where all individuals had the same total flux Jtotal = 1, in which case the phenotypic trade-off is expected to be most easily detected. We then modified this initial scenario to explore how variation in the distribution of resource acquisition (Jtotal) affected the strength and direction of trade-offs. Specifically, the resource distribution was systematically varied by sampling n = 10^3 total flux levels from a beta distribution, which has two parameters alpha and beta that control the size and shape of the distribution (Miller & Miller, 1999). When alpha is large and beta is small, the distribution is left skewed, whereas for small alpha and large beta, the distribution is right skewed. Likewise, for alpha = beta, the curve is symmetrical and approximately normal when the parameters are sufficiently large (>2). We can thus systematically vary the underlying resource distribution of a population by iterating through values of alpha and beta from 0.5 to 5 (in increments of 0.5), which was done using the NumPy Python package (v. 1.19.1) (Harris et al., 2020). The resulting slope of each linear regression of the flux at B1 and B2 (i.e., the two branching nodes) was then calculated using the lm function in R and plotted as a contour map using the latticeExtra R package (v. 0.6-30) (Sarkar, 2008).

    Question 2: How does the complexity of the pathway used to produce traits affect the strength and direction of trade-offs?

    Metabolic pathways are typically more complex than what is described above. Most pathways consist of multiple branch points and multiple hierarchical levels. To understand how complexity affects the ability to detect trade-offs when combined with variation in the distribution of total flux, we systematically manipulated the number of branch points and hierarchical levels within pathways (Fig. 1).

    We first explored the effect of adding branches to the pathway from the same node, such that instead of only branching off to nodes B1 and B2, the pathway branched to nodes B1 through to Bn (Fig. 1B), where n is the total number of branches (maximum n = 10 branches). Flux at a node was calculated as previously described, and the remaining flux was evenly distributed amongst the remaining nodes (i.e., nodes B2 through to Bn would each receive J2-n = (Jtotal - JB1)/(n - 1) flux). For each pathway, we simulated flux using a beta distribution of Jtotal with alpha = 5, beta = 0.5 to simulate a left skewed distribution, alpha = beta = 5 to simulate a normal distribution, and with alpha = 0.5, beta = 5 to simulate a right skewed distribution, as well as the simplest case where all individuals have total flux Jtotal = 1.

    We next considered how adding hierarchical levels to a metabolic pathway affected trade-offs. We modified our initial pathway with node A branching to nodes B1 and B2, and then node B2 further branched to nodes C1 and C2 (Fig. 1C). To compute the flux at the two new nodes C1 and C2, we simply repeated the same calculation as before, but using the flux at node B2, JB2, as the total flux. That is, the flux at node C1 was obtained by randomly sampling from the distribution at B2 with max = JB2 and min = 0, and the flux at node C2 is the remaining flux (JC2 = JB2 - JC1). Much like in the previous scenario with multiple branch points, we used three beta distributions (with the same parameters as before) to represent left, normal, and right skewed resource distributions, as well as the simplest case where Jtotal = 1 for all individuals.

    Quantile Regressions

    We performed quantile regression to understand whether this approach could help to detect trade-offs. Quantile regression is a form of statistical analysis that fits a curve through upper or lower quantiles of the data to assess whether an independent variable potentially sets a lower or upper limit to a response variable (Cade et al., 1999). This type of analysis is particularly useful when it is thought that an independent variable places a constraint on a response variable, yet variation in the response variable is influenced by many additional factors that add “noise” to the data, making a simple bivariate relationship difficult to detect (Thomson et al., 1996). Quantile regression is an extension of ordinary least squares regression, which regresses the best fitting line through the 50th percentile of the data. In addition to performing ordinary least squares regression for each pairwise comparison between the four nodes (B1, B2, C1, C2), we performed a series of quantile regressions using the ggplot2 R package (v. 3.4.2), where only the qth quantile was used for the regression (q = 0.99 and 0.95 to 0.5 in increments of 0.05, see Fig. S1) (Cade et al., 1999).
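
    The simple two-branch pathway described above can be sketched in a few lines of Python. This is an illustrative reconstruction and not the authors' code: it uses only the standard library instead of NumPy, it assumes the flux at B1 is drawn uniformly between 0 and the total flux, and the function names, sample size and seed are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the ordinary least squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_y_model(alpha=None, beta=None, n=10_000, seed=0):
    """Simulate the A -> (B1, B2) pathway and return the slope of JB2 on JB1.

    With alpha=None every individual has total flux 1. Otherwise the total
    flux of each individual is drawn from Beta(alpha, beta).
    """
    rng = random.Random(seed)
    jb1, jb2 = [], []
    for _ in range(n):
        j_total = 1.0 if alpha is None else rng.betavariate(alpha, beta)
        # Assumption: allocation to B1 is uniform on [0, j_total].
        b1 = rng.uniform(0.0, j_total)
        jb1.append(b1)
        jb2.append(j_total - b1)  # B2 receives whatever flux remains
    return ols_slope(jb1, jb2)


if __name__ == "__main__":
    print("constant total flux:", simulate_y_model())
    print("left skewed (5, 0.5):", simulate_y_model(5.0, 0.5))
    print("symmetrical (5, 5):", simulate_y_model(5.0, 5.0))
    print("right skewed (0.5, 5):", simulate_y_model(0.5, 5.0))
```

    Under these assumptions a constant total flux gives a slope of -1, the left skewed parameter set gives a weaker negative slope, and the right skewed parameter set gives a positive slope.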

  9. Uncertainty, skewness, and the business cycle through the MIDAS lens:...

    • journaldata.zbw.eu
    pdf, zip
    Updated Sep 4, 2024
    Cite
    Efrem Castelnuovo; Lorenzo Mori (2024). Uncertainty, skewness, and the business cycle through the MIDAS lens: replication data [Dataset]. http://doi.org/10.15456/jae.2024248.0759770713
    Available download formats: zip(130962166), pdf(109388)
    Dataset updated
    Sep 4, 2024
    Dataset provided by
    ZBW - Leibniz Informationszentrum Wirtschaft
    Authors
    Efrem Castelnuovo; Lorenzo Mori
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data and replication information for "Uncertainty, skewness, and the business cycle through the MIDAS lens" by Efrem Castelnuovo and Lorenzo Mori; published in Journal of Applied Econometrics, 2024. We employ a mixed-frequency quantile regression approach to model the time-varying conditional distribution of the US real GDP growth rate. We show that monthly information on financial conditions improves the predictive power of an otherwise quarterly-only model. We combine selected quantiles of the estimated conditional distribution to produce novel measures of uncertainty and skewness. Embedding these measures in a VAR framework, we show that unexpected changes in uncertainty are associated with an increase in (left) skewness and a downturn in real activity. Business cycle effects are significantly downplayed if we consider a quarterly-only quantile regression model. We find the endogenous response of skewness to substantially amplify the recessionary effects of uncertainty shocks. Finally, we construct a monthly-frequency version of our uncertainty measure and document the robustness of our findings.

  10. Value-at-risk for long and short trading positions (replication data)

    • jda-test.zbw.eu
    • journaldata.zbw.eu
    .data, txt
    Updated Nov 4, 2022
    Cite
    Pierre Giot; Sébastien Laurent (2022). Value-at-risk for long and short trading positions (replication data) [Dataset]. https://jda-test.zbw.eu/dataset/valueatrisk-for-long-and-short-trading-positions
    Available download formats: .data(45920), .data(102325), .data(106150), .data(164969), txt(2441)
    Dataset updated
    Nov 4, 2022
    Dataset provided by
    ZBW - Leibniz Informationszentrum Wirtschaft
    Authors
    Pierre Giot; Sébastien Laurent
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this paper we model Value-at-Risk (VaR) for daily asset returns using a collection of parametric univariate and multivariate models of the ARCH class based on the skewed Student distribution. We show that models that rely on a symmetric density distribution for the error term underperform with respect to skewed density models when the left and right tails of the distribution of returns must be modelled. Thus, VaR for traders having both long and short positions is not adequately modelled using usual normal or Student distributions. We suggest using an APARCH model based on the skewed Student distribution (combined with a time-varying correlation in the multivariate case) to fully take into account the fat left and right tails of the returns distribution. This allows for an adequate modelling of large returns defined on long and short trading positions. The performances of the univariate models are assessed on daily data for three international stock indexes and three US stocks of the Dow Jones index. In a second application, we consider a portfolio of three US stocks and model its long and short VaR using a multivariate skewed Student density.

  11. Replication data for: does “very” make a difference? effects of intensifiers...

    • service.tib.eu
    Updated May 16, 2025
    Cite
    (2025). Replication data for: does “very” make a difference? effects of intensifiers in item stems of employee attitude surveys on response behavior [Dataset]. https://service.tib.eu/ldmservice/dataset/osn-doi-10-26249-fk2-oexojh
    Dataset updated
    May 16, 2025
    Description

    Abstract: Employee attitude surveys are important tools for organizational development. To gain insights into employees’ attitudes, surveys most often use Likert-type items. Measures assessing these attitudes frequently use intensifiers (e.g., extremely, very) in item stems. To date, little is known about the effects of intensifiers in the item stem on response behavior. They are frequently used inconsistently, which potentially has implications for the comparability of results in the context of benchmarking. Also, results often suffer from left-skewed distributions that limit data quality, for which the use of intensifiers potentially offers a remedy. Therefore, we systematically examine the effects of intensifiers on response behavior in employee attitude surveys and their potential to remedy the issue of left-skewed distributions. In three studies, we assess effects on level, skewness and nomological structure. Study 1 examines the effects of intensifier strength in the item stem, while Studies 2 and 3 assess whether intensifier salience would increase these effects further. Interestingly, results did not show systematic effects. Future research ideas with regard to item design and processing as well as practical implications for the design of employee attitude surveys are discussed. Other: Does “very” make a difference? Effects of intensifiers in item stems of employee attitude surveys on response behavior - in preparation

  12. Impact of limited data availability on the accuracy of project duration...

    • data.mendeley.com
    Updated Nov 22, 2022
    + more versions
    Cite
    Naimeh Sadeghi (2022). Impact of limited data availability on the accuracy of project duration estimation in project networks [Dataset]. http://doi.org/10.17632/bjfdw6xbxw.3
    Dataset updated
    Nov 22, 2022
    Authors
    Naimeh Sadeghi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This database includes simulated data showing the accuracy of estimated probability distributions of project durations when limited data are available for the project activities. The base project networks are taken from PSPLIB. Then, various stochastic project networks are synthesized by changing the variability and skewness of project activity durations.

    Number of variables: 20
    Number of cases/rows: 114240

    Variable List:
    • Experiment ID: The ID of the experiment
    • Experiment for network: The ID of the experiment for each of the synthesized networks
    • Network ID: ID of the synthesized network
    • #Activities: Number of activities in the network, including start and finish activities
    • Variability: Variance of the activities in the network (high, low, medium or rand, where rand denotes a random combination of low, high and medium variance in the network activities)
    • Skewness: Skewness of the activities in the network (right, left, None or rand, where rand denotes a random combination of right, left and no skew in the network activities)
    • Fitted distribution type: Distribution type fitted to the sampled data
    • Sample size: Number of sampled data points used for the experiment, resembling the limited data condition
    • Benchmark 10th percentile: 10th percentile of project duration in the benchmark stochastic project network
    • Benchmark 50th percentile: 50th percentile of project duration in the benchmark stochastic project network
    • Benchmark 90th percentile: 90th percentile of project duration in the benchmark stochastic project network
    • Benchmark mean: Mean project duration in the benchmark stochastic project network
    • Benchmark variance: Variance of project duration in the benchmark stochastic project network
    • Experiment 10th percentile: 10th percentile of project duration distribution for the experiment
    • Experiment 50th percentile: 50th percentile of project duration distribution for the experiment
    • Experiment 90th percentile: 90th percentile of project duration distribution for the experiment
    • Experiment mean: Mean of project duration distribution for the experiment
    • Experiment variance: Variance of project duration distribution for the experiment
    • K-S: Kolmogorov–Smirnov test comparing the benchmark distribution and the project duration distribution of the experiment
    • P_value: The P-value based on the distance calculated in the K-S test
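
    The K-S variable is a distance between two distributions. As a rough illustration, the two-sample Kolmogorov-Smirnov statistic can be computed as the largest gap between two empirical distribution functions. This Python sketch assumes a two-sample comparison of raw duration samples, which the dataset description does not confirm, and the function name and example values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    distance = 0.0
    # The gap between step functions can only change at observed values.
    for value in a + b:
        cdf_a = bisect_right(a, value) / len(a)
        cdf_b = bisect_right(b, value) / len(b)
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


if __name__ == "__main__":
    benchmark = [30, 32, 35, 36, 40, 41, 45]   # hypothetical durations
    experiment = [33, 37, 38, 42, 44, 48, 50]  # hypothetical durations
    print(ks_statistic(benchmark, experiment))
```

    A value of 0 means the two empirical distributions coincide, and a value of 1 means the samples do not overlap at all.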

  13. Data from: Social contact patterns can buffer costs of forgetting in the...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    txt, zip
    Updated May 30, 2022
    Cite
    Jeffrey R. Stevens; Jan K. Woike; Lael J. Schooler; Stefan Lindner; Thorsten Pachur (2022). Data from: Social contact patterns can buffer costs of forgetting in the evolution of cooperation [Dataset]. http://doi.org/10.5061/dryad.4cd6042
    Available download formats: txt, zip
    Dataset updated
    May 30, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jeffrey R. Stevens; Jan K. Woike; Lael J. Schooler; Stefan Lindner; Thorsten Pachur
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Analyses of the evolution of cooperation often rely on two simplifying assumptions: (i) individuals interact equally frequently with all social network members and (ii) they accurately remember each partner's past cooperation or defection. Here, we examine how more realistic, skewed patterns of contact (in which individuals interact primarily with only a subset of their network's members) influence cooperation. In addition, we test whether skewed contact patterns can counteract the decrease in cooperation caused by memory errors (i.e., forgetting). Finally, we compare two types of memory error that vary in whether forgotten interactions are replaced with random actions or with actions from previous encounters. We use evolutionary simulations of repeated prisoner's dilemma games that vary agents' contact patterns, forgetting rates, and types of memory error. We find that highly skewed contact patterns foster cooperation and also buffer the detrimental effects of forgetting. The type of memory error used also influences cooperation rates. Our findings reveal previously neglected but important roles of contact patterns, type of memory error, and the interaction of contact pattern and memory on cooperation. Although cognitive limitations may constrain the evolution of cooperation, social contact patterns can counteract some of these constraints.

  14. Misleading characterization of data.

    • plos.figshare.com
    xls
    Updated Jun 4, 2023
    Cite
    Eckhard Limpert; Werner A. Stahel (2023). Misleading characterization of data. [Dataset]. http://doi.org/10.1371/journal.pone.0021403.t001
    Available download formats: xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Eckhard Limpert; Werner A. Stahel
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    a, Frequently, variation in data from across the sciences is characterized with the arithmetic mean and the standard deviation SD. Often, it is evident from the numbers that the data have to be skewed. This becomes clear if the lower end of the 95% interval of normal variation, mean − 2 SD, extends below zero, thus failing the “95% range check”, as is the case for all cited examples. Values in bold contradict the positive nature of the data. b, More often, variation is described with the standard error of the mean, SEM (SD = SEM · √n, with n = sample size). Such distributions are often even more skewed, and their original characterization as being symmetric is even more misleading. Original values are given in italics (°estimated from graphs). Most often, each reference cited contains several examples, in addition to the case(s) considered here. Table 2 collects further examples.
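
    The "95% range check" and the SD = SEM · √n conversion described above can be sketched in a few lines of Python. This is only an illustration: the function names and the numeric values are hypothetical and are not taken from the dataset's table.

```python
import math


def sd_from_sem(sem: float, n: int) -> float:
    """Recover the standard deviation from the standard error of the mean."""
    return sem * math.sqrt(n)


def fails_range_check(mean: float, sd: float) -> bool:
    """Return True when mean - 2*SD is below zero.

    For data that can only be positive, this signals that a symmetric
    (normal) description is implausible and the data are likely skewed.
    """
    return mean - 2 * sd < 0


if __name__ == "__main__":
    # Hypothetical reported values (assumed for illustration only).
    mean, sem, n = 10.0, 2.0, 16
    sd = sd_from_sem(sem, n)
    print("SD:", sd)
    print("lower end of 95% range:", mean - 2 * sd)
    print("fails 95% range check:", fails_range_check(mean, sd))
```

    With these assumed numbers the reported SEM looks small, yet the implied SD is large enough that the lower end of the range is negative, which is the situation part b of the description warns about.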

  15. Data from: Geographically skewed recruitment and COVID-19 seroprevalence...

    • data.niaid.nih.gov
    url
    Updated Feb 29, 2024
    Cite
    (2024). Geographically skewed recruitment and COVID-19 seroprevalence estimates: a cross-sectional serosurveillance study and mathematical modelling analysis [Dataset]. http://doi.org/10.21430/M3RQBC30KW
    Available download formats: url
    Dataset updated
    Feb 29, 2024
    License

    https://www.immport.org/agreement

    Description

    Objectives: Convenience sampling is an imperfect but important tool for seroprevalence studies. For COVID-19, local geographic variation in cases or vaccination can confound studies that rely on the geographically skewed recruitment inherent to convenience sampling. The objectives of this study were: (1) quantifying how geographically skewed recruitment influences SARS-CoV-2 seroprevalence estimates obtained via convenience sampling and (2) developing new methods that employ Global Positioning System (GPS)-derived foot traffic data to measure and minimise bias and uncertainty due to geographically skewed recruitment. Design: We used data from a local convenience-sampled seroprevalence study to map the geographic distribution of study participants' reported home locations and compared this to the geographic distribution of reported COVID-19 cases across the study catchment area. Using a numerical simulation, we quantified bias and uncertainty in SARS-CoV-2 seroprevalence estimates obtained using different geographically skewed recruitment scenarios. We employed GPS-derived foot traffic data to estimate the geographic distribution of participants for different recruitment locations and used this data to identify recruitment locations that minimise bias and uncertainty in resulting seroprevalence estimates. Results: The geographic distribution of participants in convenience-sampled seroprevalence surveys can be strongly skewed towards individuals living near the study recruitment location. Uncertainty in seroprevalence estimates increased when neighbourhoods with higher disease burden or larger populations were undersampled. Failure to account for undersampling or oversampling across neighbourhoods also resulted in biased seroprevalence estimates. GPS-derived foot traffic data correlated with the geographic distribution of serosurveillance study participants. 
    Conclusions: Local geographic variation in seropositivity is an important concern in SARS-CoV-2 serosurveillance studies that rely on geographically skewed recruitment strategies. Using GPS-derived foot traffic data to select recruitment sites and recording participants' home locations can improve study design and interpretation.

  16. Data from: Optimists or realists? How ants allocate resources in making...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Feb 2, 2019
    Cite
    Brittany L. Enzmann; Peter Nonacs (2019). Optimists or realists? How ants allocate resources in making reproductive investments. [Dataset]. http://doi.org/10.5061/dryad.b63r7v0
    Available download formats: zip
    Dataset updated
    Feb 2, 2019
    Dataset provided by
    University of California, Los Angeles
    Authors
    Brittany L. Enzmann; Peter Nonacs
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Eastern Sierra Nevada mountains
    Description
    1. Parents often face an investment trade-off between either producing many small or fewer large offspring. When environments vary predictably, the fittest parental solution matches available resources by varying only number of offspring and never optimal individual size. However, when mismatches occur often between parental expectations and true resource levels, dynamic models like multifaceted parental investment (MFPI) and parental optimism (PO) both predict offspring size can vary significantly. MFPI is a “realist” strategy: parents assume future environments of average richness. When resources exceed expectations and it is too late to add more offspring, the best-case solution increases investment per individual. Brood size distributions therefore track the degree of mismatch from right-skewed around an optimal size (slight underestimation of resources), to left-skewed around a maximal size (gross underestimation). Conversely, PO is an “optimist” strategy: parents assume maximally good resource futures and match numbers to that situation. Normal or lean years do not affect “core” brood as costs primarily fall on excess “marginal” siblings who die or experience stunted growth (producing left-skewed distributions).
    2. Investment patterns supportive of both MFPI and PO models have been observed in nature, but studies that directly manipulate food resources in order to test predictions are lacking. Ant colonies produce many offspring per reproductive cycle, and are amenable to experimental manipulation in ways that can differentiate between MFPI and PO investment strategies.
    3. Colonies in a natural population of a harvester ant (Pogonomyrmex salinus) were protein-supplemented over two years and mature sexual offspring were collected annually prior to their nuptial flight.
    4. Several results support either MFPI or PO in terms of patterns in offspring size distributions and how protein differentially affected male and female production. Unpredicted by either model, however, is that supplementation affected distributions more strongly across years than within (e.g., small females are significantly rarer in the year after colonies receive protein).
    5. Parental investment strategies in P. salinus vary dynamically across years and conditions. Finding that past conditions can more strongly affect reproductive decisions than current ones, however, is not addressed by models of parental investment.
  17. Data from: Enhancing Statistical Education in Chemistry and STEAM Using...

    • acs.figshare.com
    zip
    Updated Oct 18, 2024
    Cite
    Roberto Silva de Souza; Crissanto António Sequeira; Endler Marcel Borges (2024). Enhancing Statistical Education in Chemistry and STEAM Using JAMOVI. Part 1: Descriptive Statistics and Comparing Independent Groups [Dataset]. http://doi.org/10.1021/acs.jchemed.4c00563.s001
    Available download formats: zip
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    ACS Publications
    Authors
    Roberto Silva de Souza; Crissanto António Sequeira; Endler Marcel Borges
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    This laboratory experiment was devoted to teaching descriptive statistics and comparing independent groups to STEAM (Science, Technology, Engineering, Arts, and Mathematics) students using open-source and graphical user interface software. Students answered 21 questions using JAMOVI in previously published data sets to learn fundamental statistics concepts. It was divided into four parts. In the first part, descriptive statistics were carried out (mean, median, standard deviation, interquartile range, data normality, and skewness). In the second part, data normality was checked by using visual inspection of plots (histograms and Q–Q plots). In the third part, two independent groups were compared. In the fourth part, more than two independent groups were compared. Normally, comparisons between two or more groups are presented in many textbooks, and a normal and homogeneous distribution of the data is assumed. Only parametric tests were taught, while nonparametric tests were not presented. Thus, data normality was checked using hypothesis tests (Shapiro–Wilk, Kolmogorov–Smirnov, and Anderson–Darling tests). Then, homogeneity was checked using Levene’s and Bartlett’s tests. Normality and homogeneity were also checked using a visual inspection of plots. Once normality and homogeneity were checked, parametric tests were used (t test and ANOVA). If the normality of the data was not checked, nonparametric tests were used (Mann–Whitney and Kruskal–Wallis tests).
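
    The descriptive statistics listed in the first part of the experiment (mean, median, standard deviation, interquartile range, and skewness) can also be computed outside JAMOVI. The sketch below uses only Python's standard `statistics` module rather than JAMOVI itself; the `describe` helper and the sample values are hypothetical, and the quartile method and skewness formula (the simple moment coefficient) are assumptions that may differ from JAMOVI's defaults.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of at least two numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Quartiles with the module's default ("exclusive") method; other tools
    # may use a different interpolation rule and give slightly different IQRs.
    q1, _, q3 = statistics.quantiles(values, n=4)
    # Moment-based skewness g1 = m3 / m2^(3/2), using population moments.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    skewness = m3 / m2 ** 1.5
    return {
        "mean": mean,
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample standard deviation (n - 1)
        "iqr": q3 - q1,
        "skewness": skewness,
    }


if __name__ == "__main__":
    # Hypothetical measurements with one large value (right-skewed).
    sample = [1, 2, 3, 4, 10]
    for name, value in describe(sample).items():
        print(f"{name}: {value:.3f}")
```

    A positive skewness together with a mean well above the median is the kind of pattern that would prompt the normality checks and nonparametric tests mentioned in the description.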

  18. Data from: Complexity increases predictability in allometrically constrained...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    bin
    Updated May 28, 2022
    Cite
    Alison Catherine Iles; Mark Novak (2022). Data from: Complexity increases predictability in allometrically constrained food webs [Dataset]. http://doi.org/10.5061/dryad.m27p0
    Available download formats: bin
    Dataset updated
    May 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alison Catherine Iles; Mark Novak
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    All ecosystems are subjected to chronic disturbances, such as harvest, pollution, and climate change. The capacity to forecast how species respond to such press perturbations is limited by our imprecise knowledge of pairwise species interaction strengths and the many direct and indirect pathways along which perturbations can propagate between species. Network complexity (size and connectance) has thereby been seen to limit the predictability of ecological systems. Here we demonstrate a counteracting mechanism in which the influence of indirect effects declines with increasing network complexity when species interactions are governed by universal allometric constraints. With these constraints, network size and connectance interact to produce a skewed distribution of interaction strengths whose skew becomes more pronounced with increasing complexity. Together, the increased prevalence of weak interactions and the increased relative strength and rarity of strong interactions in complex networks limit disturbance propagation and preserve the qualitative predictability of net effects even when pairwise interaction strengths exhibit substantial variation or uncertainty.

  19. Data for Nova, N., Pagliara, R., Gordon, D. M. 2022. Individual variation...

    • purl.stanford.edu
    Updated Jan 4, 2022
    Cite
    Nicole Nova; Renato Pagliara; Deborah M. Gordon (2022). Data for Nova, N., Pagliara, R., Gordon, D. M. 2022. Individual variation does not regulate foraging response to humidity in harvester ants. Frontiers Ecology and Evolution [Dataset]. http://doi.org/10.25740/gr652yp4782
    Dataset updated
    Jan 4, 2022
    Authors
    Nicole Nova; Renato Pagliara; Deborah M. Gordon
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Differences among groups in collective behavior may arise from responses that all group members share, or instead from differences in the distribution of individuals of particular types. We examined whether the collective regulation of foraging behavior in colonies of the desert red harvester ant (Pogonomyrmex barbatus) depends on individual differences among foragers. Foragers lose water while searching for seeds in hot, dry conditions, so colonies regulate foraging activity in response to humidity. In the summer, foraging activity begins in the early morning when humidity is high, and ends at midday when humidity is low. We investigated whether individual foragers within a colony differ in the decision whether to leave the nest on their next foraging trip as humidity decreases, by tracking the foraging trips of marked individuals. We found that individuals did not differ in response to current humidity. No ants were consistently more likely than others to stop foraging when humidity is low. Each day there is a skewed distribution of trip number: only a few individuals make many trips, but most individuals make few trips. We found that from one day to the next, individual foragers do not show any consistent tendency to make a similar number of trips. These results suggest that the differences among colonies in response to humidity, found in previous work, are due to behavioral responses to current humidity that all workers in a colony share, rather than to the distribution within a colony of foragers that differ in response.

  20. Data for: Mechanisms that can cause population decline under heavily skewed...

    • datadryad.org
    • search.dataone.org
    zip
    Updated Jun 26, 2023
    Cite
    Susumu Chiba (2023). Data for: Mechanisms that can cause population decline under heavily skewed male-biased adult sex ratios [Dataset]. http://doi.org/10.5061/dryad.zw3r228d3
    Available download formats: zip
    Dataset updated
    Jun 26, 2023
    Dataset provided by
    Dryad
    Authors
    Susumu Chiba
    Time period covered
    2023
    Description

    While adult sex ratio (ASR) is a crucial component for population management, there is still a limited understanding of how its fluctuation affects population dynamics. To demonstrate mechanisms that hinder population growth under a biased ASR, we examined changes in reproductive success with ASR using a decapod crustacean exposed to female-selective harvesting.

    We examined the effect of ASR on the spawning success of females. A laboratory experiment showed that the number of eggs carried by females decreased as the proportion of males in the mating groups increased. Although the same result was not observed in data collected over 25 years in the wild, the negative effect of ASR was suggested when success in carrying eggs was considered as a spawning success. These results indicate that a surplus of males results in females failing to carry eggs, probably due to sexual coercion, and the negative effect of ASR can be detected at the population level only when the bias increases becaus...
