48 datasets found

Data from: Improving structured population models with more realistic...
zenodo.org
data.niaid.nih.gov
+3 more
Updated Jun 1, 2022
Megan L. Peterson; William Morris; Cristina Linares; Daniel Doak (2022). Data from: Improving structured population models with more realistic representations of non-normal growth [Dataset]. http://doi.org/10.5061/dryad.t6c3573
Unique identifier
https://doi.org/10.5061/dryad.t6c3573
Dataset updated
Jun 1, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Megan L. Peterson; William Morris; Cristina Linares; Daniel Doak
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
1. Structured population models are among the most widely used tools in ecology and evolution. Integral projection models (IPMs) use continuous representations of how survival, reproduction, and growth change as functions of state variables such as size, requiring fewer parameters to be estimated than projection matrix models (PPMs). Yet almost all published IPMs make an important assumption: that size-dependent growth transitions are or can be transformed to be normally distributed. In fact, many organisms exhibit highly skewed size transitions. Small individuals can grow more than they can shrink, and large individuals may often shrink more dramatically than they can grow. Yet the implications of such skew for inference from IPMs have not been explored, nor have general methods been developed to incorporate skewed size transitions into IPMs, or deal with other aspects of real growth rates, including bounds on possible growth or shrinkage.
2. Here we develop a flexible approach to modeling skewed growth data using a modified beta regression model. We propose that sizes first be converted to a (0,1) interval by estimating size-dependent minimum and maximum sizes through quantile regression. Transformed data can then be modeled using beta regression with widely available statistical tools. We demonstrate the utility of this approach using demographic data for a long-lived plant, gorgonians, and an epiphytic lichen. Specifically, we compare inferences of population parameters from discrete PPMs to those from IPMs that either assume normality or incorporate skew using beta regression or, alternatively, a skewed normal model.
3. The beta and skewed normal distributions accurately capture the mean, variance, and skew of real growth distributions. Incorporating skewed growth into IPMs decreases population growth and estimated lifespan relative to IPMs that assume normally distributed growth, and more closely approximates the parameters of PPMs that do not assume a particular growth distribution. A bounded distribution, such as the beta, also avoids the eviction problem caused by predicting some growth outside the modeled size range.
4. Incorporating biologically relevant skew in growth data has important consequences for inference from IPMs. The approaches we outline here are flexible and easy to implement with existing statistical tools.
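The rescaling step in the description above (converting sizes to a (0,1) interval between size-dependent minimum and maximum sizes) can be illustrated with a small Python sketch. This is not the authors' code: the linear bounds and their coefficients are hypothetical stand-ins for lines that would be fitted by quantile regression, and the clipping constant `eps` is an assumption added so that values stay strictly inside (0,1).

```python
def linear_bound(size_now, intercept, slope):
    """Size-dependent bound; a stand-in for a fitted quantile-regression line."""
    return intercept + slope * size_now


def to_unit_interval(size_next, lower, upper, eps=1e-6):
    """Rescale an observed size to the open interval (0, 1)."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    y = (size_next - lower) / (upper - lower)
    # Beta regression needs values strictly inside (0, 1), so clip the edges.
    return min(max(y, eps), 1.0 - eps)


def from_unit_interval(y, lower, upper):
    """Map a value on (0, 1) back to the original size scale."""
    return lower + y * (upper - lower)


# Hypothetical (intercept, slope) pairs for the lower and upper bounds.
LOWER_COEF = (0.0, 0.5)
UPPER_COEF = (2.0, 1.5)

size_now, size_next = 4.0, 5.0
lo = linear_bound(size_now, *LOWER_COEF)   # 0.0 + 0.5 * 4.0 = 2.0
hi = linear_bound(size_now, *UPPER_COEF)   # 2.0 + 1.5 * 4.0 = 8.0
y = to_unit_interval(size_next, lo, hi)    # (5 - 2) / (8 - 2) = 0.5
print(y, from_unit_interval(y, lo, hi))
```

Because every predicted value maps back to a size between `lo` and `hi`, a model on the rescaled variable cannot predict growth outside the bounds, which is the property the description credits with avoiding eviction.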
Skewness project raw data files and codes
figshare.com
xlsx
Updated Mar 14, 2022
Raunak Dey; Sreekanth K Manikandan (2022). Skewness project raw data files and codes [Dataset]. http://doi.org/10.6084/m9.figshare.17703269.v2
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.17703269.v2
Dataset updated
Mar 14, 2022
Dataset provided by
Figshare (http://figshare.com/)
Authors
Raunak Dey; Sreekanth K Manikandan
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This repository contains raw data files and base codes to analyze them.
A. The 'powerx_y.xlsx' files are the data files with the one-dimensional trajectory of optically trapped probes modulated by an Ornstein-Uhlenbeck noise of given amplitude 'x'. For the corresponding diffusion amplitude A = 0.1 × (0.6 × 10^-6)^2 m^2/s, x is labelled as '1'.
B. The codes are of three types. The skewness codes are used to calculate the skewness of the trajectory. The error_in_fit codes are used to calculate deviations from arcsine behavior. The sigma_exp codes point to the deviation of the mean from 0.5. All the codes are written three times to look at T+, Tlast and Tmax.
C. More information can be found in the manuscript.
Data from: Selection on skewed characters and the paradox of stasis
data.niaid.nih.gov
datadryad.org
zip
Updated Sep 8, 2017
Suzanne Bonamour; Céline Teplitsky; Anne Charmantier; Pierre-André Crochet; Luis-Miguel Chevin (2017). Selection on skewed characters and the paradox of stasis [Dataset]. http://doi.org/10.5061/dryad.pt07g
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.pt07g
Dataset updated
Sep 8, 2017
Dataset provided by
Centre National de la Recherche Scientifique
Authors
Suzanne Bonamour; Céline Teplitsky; Anne Charmantier; Pierre-André Crochet; Luis-Miguel Chevin
License
https://spdx.org/licenses/CC0-1.0.html
Description
Observed phenotypic responses to selection in the wild often differ from predictions based on measurements of selection and genetic variance. An overlooked hypothesis to explain this paradox of stasis is that a skewed phenotypic distribution affects natural selection and evolution. We show through mathematical modelling that, when a trait selected for an optimum phenotype has a skewed distribution, directional selection is detected even at evolutionary equilibrium, where it causes no change in the mean phenotype. When environmental effects are skewed, Lande and Arnold’s (1983) directional gradient is in the direction opposite to the skew. In contrast, skewed breeding values can displace the mean phenotype from the optimum, causing directional selection in the direction of the skew. These effects can be partitioned out using alternative selection estimates based on average derivatives of individual relative fitness, or additive genetic covariances between relative fitness and trait (Robertson-Price identity). We assess the validity of these predictions using simulations of selection estimation under moderate sample sizes. Ecologically relevant traits may commonly have skewed distributions, as we here exemplify with avian laying date (repeatedly described as more evolutionarily stable than expected), so this skewness should be accounted for when investigating evolutionary dynamics in the wild.
Results and analysis using the Lean Six-Sigma define, measure, analyze,...
researchdata.up.ac.za
docx
Updated Mar 12, 2024
Modiehi Mophethe (2024). Results and analysis using the Lean Six-Sigma define, measure, analyze, improve, and control (DMAIC) Framework [Dataset]. http://doi.org/10.25403/UPresearchdata.25370374.v1
Available download formats: docx
Unique identifier
https://doi.org/10.25403/UPresearchdata.25370374.v1
Dataset updated
Mar 12, 2024
Dataset provided by
University of Pretoria
Authors
Modiehi Mophethe
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This section presents a discussion of the research data. The data were received as secondary data; however, they were originally collected using time study techniques. Data validation is a crucial step in the data analysis process to ensure that the data are accurate, complete, and reliable. Descriptive statistics were used to validate the data. The mean, mode, standard deviation, variance and range provide a summary of the data distribution and assist in identifying outliers or unusual patterns. The data presented in the dataset show the measures of central tendency, which include the mean, median and mode. The mean signifies the average value of each of the factors presented in the tables. This is the balance point of the dataset, the typical value and behaviour of the dataset. The median is the middle value of the dataset for each of the factors presented. This is the point where the dataset is divided into two parts: half of the values lie below this value and the other half lie above it. This is important for skewed distributions. The mode shows the most common value in the dataset. It was used to describe the most typical observation. These values are important as they describe the central value around which the data are distributed. The mean, mode and median indicate a skewed distribution, as they are neither equal nor close to one another. In the dataset, the results and a discussion of the results are also presented. This section focuses on the customisation of the DMAIC (Define, Measure, Analyse, Improve, Control) framework to address the specific concerns outlined in the problem statement. To gain a comprehensive understanding of the current process, value stream mapping was employed, which is further enhanced by measuring the factors that contribute to inefficiencies. These factors are then analysed and ranked based on their impact, utilising factor analysis.
To mitigate the impact of the most influential factor on project inefficiencies, a solution is proposed using the EOQ (Economic Order Quantity) model. The implementation of the 'CiteOps' software facilitates improved scheduling, monitoring, and task delegation in the construction project through digitalisation. Furthermore, project progress and efficiency are monitored remotely and in real time. In summary, the DMAIC framework was tailored to suit the requirements of the specific project, incorporating techniques from inventory management, project management, and statistics to effectively minimise inefficiencies within the construction project.
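The comparison of mean, median and mode described above can be sketched in a few lines of Python using only the standard library. The task durations below are hypothetical illustration values, not taken from the dataset, and the ordering check is only a rough heuristic for right skew rather than a formal test.

```python
import statistics


def central_tendency(values):
    """Return the three measures of central tendency for a sample."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


# Hypothetical task durations with a long right tail.
durations = [2, 3, 3, 4, 5, 9, 15]
summary = central_tendency(durations)

# Heuristic: mean > median > mode is typical of a right-skewed sample.
right_skew_hint = summary["mean"] > summary["median"] > summary["mode"]
print(summary, right_skew_hint)
```

When the three values are far apart, as here, the median is usually the safer description of a typical observation, because a few large values pull the mean upward.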
Dataset for: Some Remarks on the R2 for Clustering
wiley.figshare.com
txt
Updated Jun 1, 2023
Nicola Loperfido; Thaddeus Tarpey (2023). Dataset for: Some Remarks on the R2 for Clustering [Dataset]. http://doi.org/10.6084/m9.figshare.6124508.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.6124508.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Wiley
Authors
Nicola Loperfido; Thaddeus Tarpey
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
A common descriptive statistic in cluster analysis is the $R^2$ that measures the overall proportion of variance explained by the cluster means. This note highlights properties of the $R^2$ for clustering. In particular, we show that generally the $R^2$ can be artificially inflated by linearly transforming the data by "stretching" and by projecting. Also, the $R^2$ for clustering will often be a poor measure of clustering quality in high-dimensional settings. We also investigate the $R^2$ for clustering for misspecified models. Several simulation illustrations are provided highlighting weaknesses in the clustering $R^2$, especially in high-dimensional settings. A functional data example is given showing that the $R^2$ for clustering can vary dramatically depending on how the curves are estimated.
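The inflation by stretching mentioned above can be seen in a toy calculation. The sketch below assumes the usual definition of the clustering $R^2$ as one minus the within-cluster sum of squares over the total sum of squares, with fixed cluster labels; the four points and the stretch factor are hypothetical and are not from the dataset.

```python
def cluster_r2(points, labels):
    """R^2 = 1 - (within-cluster sum of squares) / (total sum of squares)."""
    n = len(points)
    dim = len(points[0])
    grand = [sum(p[d] for p in points) / n for d in range(dim)]
    total_ss = sum((p[d] - grand[d]) ** 2 for p in points for d in range(dim))
    within_ss = 0.0
    for lab in set(labels):
        members = [p for p, l in zip(points, labels) if l == lab]
        centre = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        within_ss += sum((p[d] - centre[d]) ** 2 for p in members for d in range(dim))
    return 1.0 - within_ss / total_ss


# Two clusters separated along x, with equal spread along y.
points = [(0.0, -1.0), (0.0, 1.0), (2.0, -1.0), (2.0, 1.0)]
labels = [0, 0, 1, 1]

# Stretch the x-axis by a factor of 10; cluster membership is unchanged.
stretched = [(10.0 * x, y) for x, y in points]

r2_original = cluster_r2(points, labels)      # 1 - 4/8 = 0.5
r2_stretched = cluster_r2(stretched, labels)  # 1 - 4/404, about 0.99
print(r2_original, r2_stretched)
```

The grouping of the points is identical in both cases, yet the statistic nearly doubles, which is why a high clustering $R^2$ on its own says little about cluster quality.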
Annual peak-flow data and results of flood-frequency analysis for 76...
data.usgs.gov
s.cnmilf.com
+1 more
Updated Sep 3, 2024
Daniel Wagner; Jon Voss; Roger D.; David Heimann (2024). Annual peak-flow data and results of flood-frequency analysis for 76 selected streamflow gaging stations operated by the U.S. Geological Survey in the upper White River basin, Missouri and Arkansas, computed using an updated generalized (regional) flood skew [Dataset]. http://doi.org/10.5066/P9C3L7IN
Unique identifier
https://doi.org/10.5066/P9C3L7IN
Dataset updated
Sep 3, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Authors
Daniel Wagner; Jon Voss; Roger D.; David Heimann
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Time period covered
1904 - 2020
Area covered
Missouri, Arkansas
Description
This dataset contains site information, basin characteristics, results of flood-frequency analysis, and a generalized (regional) flood skew for 76 selected streamgages operated by the U.S. Geological Survey (USGS) in the upper White River basin (4-digit hydrologic unit 1101) in southern Missouri and northern Arkansas. The Little Rock District U.S. Army Corps of Engineers (USACE) needed updated estimates of streamflows corresponding to selected annual exceedance probabilities (AEPs) and a basin-specific regional flood skew. USGS selected 111 candidate streamgages in the study area that had 20 or more years of gaged annual peak-flow data available through the 2020 water year. After screening for regulation, urbanization, redundant/nested basins, drainage areas greater than 2,500 square miles, and streamgage basins located in the Mississippi Alluvial Plain (8-digit hydrologic unit 11010013), 77 candidate streamgages remained. After conducting the initial flood-frequency analysis ...
Reaction times and other skewed distributions: problems with the mean and...
figshare.com
pdf
Updated May 31, 2023
Guillaume Rousselet; Rand Wilcox (2023). Reaction times and other skewed distributions: problems with the mean and the median [Dataset]. http://doi.org/10.6084/m9.figshare.6911924.v4
Available download formats: pdf
Unique identifier
https://doi.org/10.6084/m9.figshare.6911924.v4
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Guillaume Rousselet; Rand Wilcox
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Reproducibility package for the article:
Reaction times and other skewed distributions: problems with the mean and the median
Guillaume A. Rousselet & Rand R. Wilcox
preprint: https://psyarxiv.com/3y54r
doi: 10.31234/osf.io/3y54r
This package contains all the code and data to reproduce the figures and analyses in the article.
Data from: Body temperature distributions of active diurnal lizards in three...
data.niaid.nih.gov
datadryad.org
zip
Updated Aug 4, 2018
Raymond B. Huey; Eric R. Pianka (2018). Body temperature distributions of active diurnal lizards in three deserts: skewed up or skewed down? [Dataset]. http://doi.org/10.5061/dryad.45g3s
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.45g3s
Dataset updated
Aug 4, 2018
Dataset provided by
The University of Texas at Austin
University of Washington
Authors
Raymond B. Huey; Eric R. Pianka
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
Australia, North America, Africa
Description
The performance of ectotherms integrated over time depends in part on the position and shape of the distribution of body temperatures (Tb) experienced during activity. For several complementary reasons, physiological ecologists have long expected that Tb distributions during activity should have a long left tail (left-skewed); but only infrequently have they quantified the magnitude and direction of Tb skewness in nature.

To evaluate whether left-skewed Tb distributions are general for diurnal desert lizards, we compiled and analyzed Tb (∑ = 9,023 temperatures) from our own prior studies of active desert lizards on three continents (25 species in Western Australia, 10 in the Kalahari Desert of Africa, and 10 species in western North America). We gathered these data over several decades, using standardized techniques.

Many species showed significantly left-skewed Tb distributions, even when records were restricted to summer months. However, magnitudes of skewness were always small, such that mean Tb were never more than 1°C lower than median Tb. The significance of Tb skewness was sensitive to sample size, and power tests reinforced this sensitivity.

The magnitude of skewness was not obviously related to phylogeny, desert, body size, or median body temperature. Moreover, formal phylogenetic analysis is inappropriate because geography and phylogeny are confounded (that is, are highly collinear).

Skewness might be limited if lizards pre-warm inside retreats before emerging in the morning, emerge only when operative temperatures are high enough to speed warming to activity Tb, or if cold lizards are especially wary and difficult to spot or catch. Telemetry studies may help evaluate these possibilities.
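The link between a long left tail, negative skewness, and a mean that falls below the median can be illustrated with a short Python sketch. It uses the moment-based sample skewness (third central moment divided by the 1.5 power of the second); whether the authors used this exact estimator is an assumption, and the temperatures are hypothetical values, not data from the study.

```python
import statistics


def sample_skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (negative means a longer left tail)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical body temperatures (degrees C) with one cool outlier.
body_temps = [30.0, 35.0, 36.0, 37.0, 37.0, 38.0, 38.0]

skew = sample_skewness(body_temps)
mean_tb = statistics.mean(body_temps)
median_tb = statistics.median(body_temps)
print(skew, mean_tb, median_tb)
```

With a single cool reading the skewness is negative and the mean sits below the median, which mirrors the direction of the pattern described above.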
Data from: Table 4
hepdata.net
Updated Feb 20, 2024
(2024). Table 4 [Dataset]. http://doi.org/10.17182/hepdata.147284.v1/t4
Unique identifier
https://doi.org/10.17182/hepdata.147284.v1/t4
Dataset updated
Feb 20, 2024
Description
Intensive skewness of $\langle p_\mathrm{T}\rangle$ as a function of $\langle\mathrm{d}N_\mathrm{ch}/\mathrm{d}\eta\rangle^{1/3}_{|\eta|<0.5}$ in pp collisions at $\sqrt{s}$ = 5.02 TeV.
Regional flood skew for the Tennessee and parts of the Ohio and Lower...
catalog.data.gov
Updated Jul 24, 2025
U.S. Geological Survey (2025). Regional flood skew for the Tennessee and parts of the Ohio and Lower Mississippi River basins (hydrologic unit codes 06, 05, and 08, respectively) in Tennessee, Kentucky, western Virginia, western West Virginia, far western Maryland and parts of North Carolina, Georgia, Alabama, and Mississippi [Dataset]. https://catalog.data.gov/dataset/regional-flood-skew-for-the-tennessee-and-parts-of-the-ohio-and-lower-mississippi-river-ba
Dataset updated
Jul 24, 2025
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Ohio River, Kentucky, Mississippi River, Tennessee, West Virginia, Alabama
Description
This dataset contains site information, basin characteristics, results of flood-frequency analysis, and results of Bayesian weighted least-squares/Bayesian generalized least-squares (B-WLS/B-GLS) analysis of regional skewness of the annual peak flows for 785 streamflow gaging stations (streamgages) operated by the U.S. Geological Survey (USGS) in the Tennessee and parts of the Ohio and Lower Mississippi River basins (hydrologic unit codes 06, 05, and 08, respectively) in Tennessee, Kentucky, western Virginia, western West Virginia, far western Maryland and parts of North Carolina, Georgia, Alabama, and Mississippi. Annual peak-flow data through the 2021 water year (a water year is defined as the period October 1-September 30 and named for the year in which it ends) were used in the study. For regional skew analysis, 283 of the 785 candidate streamgages were removed for pseudo record length (PRL; Veilleux and Wagner, 2021) less than 30 years, 108 were removed for redundancy, 4 were removed for regulation and 2 were removed for urbanization (see file "VAskew_Region2.csv" in this dataset). For the remaining 387 of 785 candidate streamgages, B-WLS/B-GLS regression (Veilleux and Wagner, 2021) was used to relate flood skew to a suite of 32 explanatory variables. None of the explanatory variables tested had sufficient predictive power in explaining the variability in skew in the region; thus, a constant model of regional skew, 0.048 (average variance of prediction 0.16, standard error 0.4) was selected for the study area (Messinger and others, 2025). For the 785 candidate streamgages, annual peak-flow data through the 2021 water year ("VAskew_region2.pkf") and specification ("VAskew_region2.psf"), output ("VASKEW_REGION2.PRT"), and export ("VASKEW_REGION2.EXP") files from flood-frequency analysis in version 7.4.1 of USGS PeakFQ software (hereafter referred to as "PeakFQ"; Veilleux and others, 2014; Flynn and others, 2006) are provided. 
Two .csv files are provided, one describing the basin characteristics tested ("BasinCharsTested.csv") and the other ("VAskew_Region2.csv") containing site information (U.S. Geological Survey, 2023), results of flood-frequency analysis in PeakFQ, and, for the 387 streamgages used in the B-WLS/B-GLS regression, PRL, unbiased at-site skew, unbiased mean squared error of the at-site skew, the B-WLS/B-GLS residual, and metrics of leverage and influence. A geographic information systems (GIS) shapefile ("VA_SkewRegion2.shp") containing a polygon representing the geographic extent of the skew region is also included.
Impact of limited data availability on the accuracy of project duration...
data.mendeley.com
Updated Nov 22, 2022
Naimeh Sadeghi (2022). Impact of limited data availability on the accuracy of project duration estimation in project networks [Dataset]. http://doi.org/10.17632/bjfdw6xbxw.3
Unique identifier
https://doi.org/10.17632/bjfdw6xbxw.3
Dataset updated
Nov 22, 2022
Authors
Naimeh Sadeghi
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This database includes simulated data showing the accuracy of estimated probability distributions of project durations when limited data are available for the project activities. The base project networks are taken from PSPLIB. Then, various stochastic project networks are synthesized by changing the variability and skewness of project activity durations.
Number of variables: 20
Number of cases/rows: 114240
Variable List:
• Experiment ID: The ID of the experiment
• Experiment for network: The ID of the experiment for each of the synthesized networks
• Network ID: ID of the synthesized network
• #Activities: Number of activities in the network, including start and finish activities
• Variability: Variance of the activities in the network (this value can be high, low, medium or rand, where rand denotes a random combination of low, high and medium variance in the network activities)
• Skewness: Skewness of the activities in the network (skewness can be right, left, None or rand, where rand denotes a random combination of right, left, and no skew in the network activities)
• Fitted distribution type: Distribution type used to fit the sampled data
• Sample size: Number of sampled data points used for the experiment, resembling the limited-data condition
• Benchmark 10th percentile: 10th percentile of project duration in the benchmark stochastic project network
• Benchmark 50th percentile: 50th percentile of project duration in the benchmark stochastic project network
• Benchmark 90th percentile: 90th percentile of project duration in the benchmark stochastic project network
• Benchmark mean: Mean project duration in the benchmark stochastic project network
• Benchmark variance: Variance of project duration in the benchmark stochastic project network
• Experiment 10th percentile: 10th percentile of project duration distribution for the experiment
• Experiment 50th percentile: 50th percentile of project duration distribution for the experiment
• Experiment 90th percentile: 90th percentile of project duration distribution for the experiment
• Experiment mean: Mean of project duration distribution for the experiment
• Experiment variance: Variance of project duration distribution for the experiment
• K-S: Kolmogorov–Smirnov test comparing the benchmark distribution and the project duration distribution of the experiment
• P_value: The P-value based on the distance calculated in the K-S test
Supplementary data for the paper "Why psychologists should not default to...
data.4tu.nl
zip
Updated Apr 28, 2025
Joost de Winter (2025). Supplementary data for the paper "Why psychologists should not default to Welch’s t-test instead of Student’s t-test (and why the Anderson–Darling test is an underused alternative)" [Dataset]. http://doi.org/10.4121/e8e6861a-7ab0-4b6d-bd67-5f95029322c5.v4
Available download formats: zip
Unique identifier
https://doi.org/10.4121/e8e6861a-7ab0-4b6d-bd67-5f95029322c5.v4
Dataset updated
Apr 28, 2025
Dataset provided by
4TU.ResearchData
Authors
Joost de Winter
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This paper evaluates the claim that Welch’s t-test (WT) should replace the independent-samples t-test (IT) as the default approach for comparing sample means. Simulations involving unequal and equal variances, skewed distributions, and different sample sizes were performed. For normal distributions, we confirm that the WT maintains the false positive rate close to the nominal level of 0.05 when sample sizes and standard deviations are unequal. However, the WT was found to yield inflated false positive rates under skewed distributions with unequal sample sizes. A complementary empirical study based on gender differences in two psychological scales corroborates these findings. Finally, we contend that the null hypothesis of unequal variances together with equal means lacks plausibility, and that empirically, a difference in means typically coincides with differences in variance and skewness. An additional analysis using the Kolmogorov-Smirnov and Anderson-Darling tests demonstrates that examining entire distributions, rather than just their means, can provide a more suitable alternative when facing unequal variances or skewed distributions. Given these results, researchers should remain cautious with software defaults, such as R favoring Welch’s test.
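The difference between the two tests compared above comes down to how the standard error and degrees of freedom are computed. The following Python sketch, using only the standard library, shows the textbook formulas for the pooled-variance (Student) statistic and the Welch statistic with Welch–Satterthwaite degrees of freedom. It is not the paper's simulation code, the sample values are hypothetical, and p-values are left out because the standard library has no t-distribution.

```python
import math
import statistics


def student_t(a, b):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(a, b):
    """Welch's t statistic with Welch-Satterthwaite degrees of freedom; returns (t, df)."""
    n1, n2 = len(a), len(b)
    q1 = statistics.variance(a) / n1
    q2 = statistics.variance(b) / n2
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(q1 + q2)
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical samples with equal sizes and equal variances: both tests agree.
a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]
print(student_t(a, b), welch_t(a, b))

# Hypothetical samples with unequal sizes and variances: the results diverge.
c = [0, 10, 20]
print(student_t(a, c), welch_t(a, c))
```

With equal group sizes and variances the two statistics coincide; they separate only when sizes or variances differ, which is the setting the simulations above examine.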
Data from: Using social parasitism to test reproductive skew models in a...
zenodo.org
data.niaid.nih.gov
+1 more
Updated May 30, 2022
Jonathan P. Green; Michael A. Cant; Jeremy Field (2022). Data from: Using social parasitism to test reproductive skew models in a primitively eusocial wasp [Dataset]. http://doi.org/10.5061/dryad.84mf4
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.84mf4
Dataset updated
May 30, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jonathan P. Green; Michael A. Cant; Jeremy Field
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Remarkable variation exists in the distribution of reproduction (skew) among members of cooperatively breeding groups, both within and between species. Reproductive skew theory has provided an important framework for understanding this variation. In the primitively eusocial Hymenoptera, two models have been routinely tested: concessions models, which assume complete control of reproduction by a dominant individual, and tug-of-war models, which assume on-going competition among group members over reproduction. Current data provide little support for either model, but uncertainty about the ability of individuals to detect genetic relatedness and difficulties in identifying traits conferring competitive ability mean that the relative importance of concessions versus tug-of-war remains unresolved. Here, we suggest that the use of social parasitism to generate meaningful variation in key social variables represents a valuable opportunity to explore the mechanisms underpinning reproductive skew within the social Hymenoptera. We present a direct test of concessions and tug-of-war models in the paper wasp Polistes dominulus by exploiting pronounced changes in relatedness and power structures that occur following replacement of the dominant by a congeneric social parasite. Comparisons of skew in parasitized and unparasitized colonies are consistent with a tug-of-war over reproduction within P. dominulus groups, but provide no evidence for reproductive concessions.
PeakFQ program input and output files for selected streamgages in...
catalog.data.gov
data.usgs.gov
+1 more
Updated Jul 6, 2024
U.S. Geological Survey (2024). PeakFQ program input and output files for selected streamgages in Connecticut, based on data through water year 2015 [Dataset]. https://catalog.data.gov/dataset/peakfq-program-input-and-output-files-for-selected-streamgages-in-connecticut-based-on-dat
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Connecticut
Description
Flood-frequency analyses for 141 streamgages in Connecticut were updated using the U.S. Geological Survey program PeakFQ, version 7.2 (https://water.usgs.gov/software/PeakFQ/; Veilleux and others, 2014). The PeakFQ program follows Bulletin 17C national guidelines for flood-frequency analysis (https://doi.org/10.3133/tm4B5). The input and output files to PeakFQ that were used in the Connecticut flood-frequency update are presented. Individual file folders for the 141 streamgages using the streamgage identification number as the folder name contain three files: ".TXT" file used as input to PeakFQ contains the annual peak flows for the streamgage in standard PeakFQ (WATSTORE) text format available from NWIS web at https://nwis.waterdata.usgs.gov/usa/nwis/peak; ".PRT" text file provides estimates of flood magnitudes and their corresponding variance for a range of annual exceedance probabilities, estimates of the parameters of the log-Pearson Type III frequency distribution, including the logarithmic mean, standard deviation, skew, and mean square error of the skew; and ".JPG" image file shows the fitted frequency curve, systematic peaks, confidence limits, and associated information on low outliers, censored peaks, interval peaks, historic peaks, and thresholds if applicable.
Quantifying Distribution of Flow Cytometric TCR-Vβ Usage with Economic...
plos.figshare.com
docx
Updated Jun 1, 2023
Kornelis S. M. van der Geest; Wayel H. Abdulahad; Gerda Horst; Pedro G. Lorencetti; Johan Bijzet; Suzanne Arends; Marieke van der Heiden; Anne-Marie Buisman; Bart-Jan Kroesen; Elisabeth Brouwer; Annemieke M. H. Boots (2023). Quantifying Distribution of Flow Cytometric TCR-Vβ Usage with Economic Statistics [Dataset]. http://doi.org/10.1371/journal.pone.0125373
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0125373
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Kornelis S. M. van der Geest; Wayel H. Abdulahad; Gerda Horst; Pedro G. Lorencetti; Johan Bijzet; Suzanne Arends; Marieke van der Heiden; Anne-Marie Buisman; Bart-Jan Kroesen; Elisabeth Brouwer; Annemieke M. H. Boots
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Measuring changes of the T cell receptor (TCR) repertoire is important to many fields of medicine. Flow cytometry is a popular technique to study the TCR repertoire, as it quickly provides insight into the TCR-Vβ usage among well-defined populations of T cells. However, the interpretation of the flow cytometric data remains difficult, and subtle TCR repertoire changes may go undetected. Here, we introduce a novel means for analyzing the flow cytometric data on TCR-Vβ usage. By applying economic statistics, we calculated the Gini-TCR skewing index from the flow cytometric TCR-Vβ analysis. The Gini-TCR skewing index, which is a direct measure of TCR-Vβ distribution among T cells, allowed us to track subtle changes of the TCR repertoire among distinct populations of T cells. Application of the Gini-TCR skewing index to the flow cytometric TCR-Vβ analysis will greatly help to gain better understanding of the TCR repertoire in health and disease.
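As an illustration of the economic statistic referred to above, the sketch below computes the standard Gini coefficient for a list of non-negative shares. Whether the Gini-TCR skewing index is computed exactly this way is an assumption; the description does not give the formula, and the family percentages used here are hypothetical.

```python
def gini(values):
    """Standard Gini coefficient: 0 for perfectly even shares, near 1 when one share dominates."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    # Rank-weighted form of the mean absolute difference (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical usage percentages across four receptor families.
even_usage = [25.0, 25.0, 25.0, 25.0]
skewed_usage = [0.0, 0.0, 0.0, 100.0]

print(gini(even_usage))    # evenly distributed usage
print(gini(skewed_usage))  # usage concentrated in a single family
```

A single number of this kind makes it possible to compare how unevenly usage is spread across families between samples, which is the role the description assigns to the index.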
Data and reproducible analysis files from: Latitudinal clines in floral...
search.dataone.org
datadryad.org
Updated Jan 9, 2025
Cite
Mia Akbar; Dale Moskoff; Spencer Barrett; Robert Colautti (2025). Data and reproducible analysis files from: Latitudinal clines in floral display associated with adaptive evolution during a biological invasion [Dataset]. http://doi.org/10.5061/dryad.jdfn2z3jz
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.jdfn2z3jz
Dataset updated
Jan 9, 2025
Dataset provided by
Dryad Digital Repository
Authors
Mia Akbar; Dale Moskoff; Spencer Barrett; Robert Colautti
Description
Premise: Flowering phenology strongly influences reproductive success in plants. Days to first flower is easy to quantify and widely used to characterize phenology, but reproductive fitness depends on the full schedule of flower production over time. Methods: We examined floral display traits associated with rapid adaptive evolution and range expansion among thirteen populations of Lythrum salicaria, sampled along a 10-degree latitudinal gradient in eastern North America. We grew these collections in a common garden field experiment at a mid-latitude site and quantified variation in flowering schedule shape using Principal Coordinates Analysis (PCoA) and quantitative metrics analogous to central moments of probability distributions (i.e., mean, variance, skew, and kurtosis). Key Results: Consistent with earlier evidence for adaptation to shorter growing seasons, we found that populations from higher latitudes had earlier start and mean flowering day, on average, when compared to popul...

# Data and analysis files from: Latitudinal clines in floral display associated with adaptive evolution during a biological invasion

https://doi.org/10.5061/dryad.jdfn2z3jz

Reference Information

Provenance for this README

File name: README.md

Authors: Mia Akbar

Other contributors: Dale Moskoff, Spencer C.H. Barrett, Robert I. Colautti

Date created: 2024-05-30

Dataset Version and Release History

Current Version:

Number: 1.0.0

Date: 2024-05-30

Persistent identifier: n/a

Summary of changes: n/a

Embargo Provenance: n/a

Scope of embargo: n/a

Embargo period: n/a

Description of the data and file structure

Methodological Information

Methods of data collection/generation: see publication for details

Data and File Overview

Summary Metrics

Data File count: 1

Total file size: 37 KB

File formats: .csv

Naming Conventions

File naming scheme: The data file within the "Data" f...
Cumulative viral load as a predictor of CD4+ T-cell response to...
plos.figshare.com
docx
Updated Jun 3, 2023
Cite
Joseph B. Sempa; Theresa M. Rossouw; Emmanuel Lesaffre; Martin Nieuwoudt (2023). Cumulative viral load as a predictor of CD4+ T-cell response to antiretroviral therapy using Bayesian statistical models [Dataset]. http://doi.org/10.1371/journal.pone.0224723
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0224723
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS ONE
Authors
Joseph B. Sempa; Theresa M. Rossouw; Emmanuel Lesaffre; Martin Nieuwoudt
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: There are challenges in statistically modelling immune responses to longitudinal HIV viral load exposure as a function of covariates. We define Bayesian Markov Chain Monte Carlo mixed effects models to incorporate priors and examine the effect of different distributional assumptions. We prospectively fit these models to as-yet-unpublished data from the Tshwane District Hospital HIV treatment clinic in South Africa, to determine if cumulative log viral load, an indicator of long-term viral exposure, is a valid predictor of immune response. Methods: Models are defined to express ‘slope’, i.e. mean annual increase in CD4 counts, and ‘asymptote’, i.e. the odds of having a CD4 count ≥500 cells/μL during antiretroviral treatment, as a function of covariates and random-effects. We compare the effect of using informative versus non-informative prior distributions on model parameters. Models with cubic splines or Skew-normal distributions are also compared using the conditional Deviance Information Criterion. Results: The data of 750 patients are analyzed. Overall, models adjusting for cumulative log viral load provide a significantly better fit than those that do not. An increase in cumulative log viral load is associated with a decrease in CD4 count slope (19.6 cells/μL (95% credible interval: 28.26, 10.93)) and a reduction in the odds of achieving a CD4 count ≥500 cells/μL (0.42 (95% CI: 0.236, 0.730)) during 5 years of therapy. Using informative priors improves the cumulative log viral load estimate, and a skew-normal distribution for the random-intercept and measurement error results in a better fit compared to using classical Gaussian distributions. Discussion: We demonstrate in an unpublished South African cohort that cumulative log viral load is a strong and significant predictor of both CD4 count slope and asymptote. 
We argue that Bayesian methods should be used more frequently for such data, given their flexibility to incorporate prior information and non-Gaussian distributions.
Data from: Sediment particle size analysis for stations from the Western...
data-search.nerc.ac.uk
http
Updated Jul 25, 2020
+ more versions
Cite
UK Polar Data Centre, Natural Environment Research Council, UK Research & Innovation (2020). Sediment particle size analysis for stations from the Western Barents Sea for summer 2017 and 2018 [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/api/records/GB_NERC_BAS_PDC_01373
Explore at:
Available download formats: http
Dataset updated
Jul 25, 2020
Dataset provided by
Natural Environment Research Council (https://www.ukri.org/councils/nerc)
Authors
UK Polar Data Centre, Natural Environment Research Council, UK Research & Innovation
Time period covered
Jul 19, 2018 - Jul 28, 2018
Area covered

Description
Sediment particle size frequency distributions from the USNL (United States Naval Laboratory) box cores were determined optically using a Malvern Mastersizer 2000 He-Ne LASER diffraction sizer and were used to resolve mean particle size, sorting, skewness and kurtosis.

Samples were collected on cruises JR16006 and JR17007.

Funding was provided by ''The Changing Arctic Ocean Seafloor (ChAOS) - how changing sea ice conditions impact biological communities, biogeochemical processes and ecosystems'' project (NE/N015894/1 and NE/P006426/1, 2017-2021), part of the NERC funded Changing Arctic Ocean programme.
Data from: SkewDB: A comprehensive database of GC and 10 other skews for...
data.niaid.nih.gov
search.dataone.org
+2 more
zip
Updated Oct 4, 2021
Cite
Bert Hubert (2021). SkewDB: A comprehensive database of GC and 10 other skews for over 28,000 chromosomes and plasmids [Dataset]. http://doi.org/10.5061/dryad.g4f4qrfr6
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.g4f4qrfr6
Dataset updated
Oct 4, 2021
Dataset provided by
Independent researcher
Authors
Bert Hubert
License
https://spdx.org/licenses/CC0-1.0.html
Description
GC skew denotes the relative excess of G nucleotides over C nucleotides on the leading versus the lagging replication strand of eubacteria. While the effect is small, typically around 2.5%, it is robust and pervasive. GC skew and the analogous TA skew are a localized deviation from Chargaff’s second parity rule, which states that G and C, and T and A occur with (mostly) equal frequency even within a strand.

Most bacteria also show the analogous TA skew. Different phyla show different kinds of skew and differing relations between TA and GC skew. This article introduces an open access database (https://skewdb.org) of GC and 10 other skews for over 28,000 chromosomes and plasmids. Further details like codon bias, strand bias, strand lengths and taxonomic data are also included.

The SkewDB database can be used to generate or verify hypotheses. Since the origins of both the second parity rule, as well as GC skew itself, are not yet satisfactorily explained, such a database may enhance our understanding of microbial DNA.

Methods: The SkewDB analysis relies exclusively on the tens of thousands of FASTA and GFF3 files available through the NCBI download service, which covers both GenBank and RefSeq. The database includes bacteria, archaea and their plasmids. Furthermore, to ease analysis, the NCBI Taxonomy database is sourced and merged so output data can quickly be related to (super)phyla or specific species. No other data is used, which greatly simplifies processing. Data is read directly in the compressed format provided by NCBI.

All results are emitted as standard CSV files. In the first step of the analysis, for each organism the FASTA sequence and the GFF3 annotation file are parsed. Every chromosome in the FASTA file is traversed from beginning to end, while a running total is kept for cumulative GC and TA skew. In addition, within protein coding genes, such totals are also kept separately for these skews on the first, second and third codon position. Furthermore, separate totals are kept for regions which do not code for proteins. In addition, to enable strand bias measurements, a cumulative count is maintained of nucleotides that are part of a positive or negative sense gene. The counter is increased for positive sense nucleotides, decreased for negative sense nucleotides, and left alone for non-genic regions.

A separate counter is kept for non-genic nucleotides. Finally, G and C nucleotides are counted, regardless of whether they are part of a gene or not. These running totals are emitted at 4096 nucleotide intervals, a resolution suitable for determining skews and shifts. In addition, one-line summaries are stored for each chromosome. Each line includes the RefSeq identifier of the chromosome, the full name mentioned in the FASTA file, plus counts of A, C, G and T nucleotides. Finally, five levels of taxonomic data are stored.
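The running-total step can be sketched in a few lines. This is a minimal illustration and not the SkewDB code: the sign convention (G and T count up, C and A count down) and the function name `cumulative_skews` are assumptions, and the codon-position, non-coding and strand-bias counters described above are left out.

```python
def cumulative_skews(sequence, interval=4096):
    """Return (position, cumulative GC skew, cumulative TA skew) rows.

    Assumed convention: G adds 1 and C subtracts 1 for GC skew;
    T adds 1 and A subtracts 1 for TA skew. A row is emitted every
    `interval` nucleotides, mirroring the 4096-nucleotide resolution.
    """
    gc_total = 0
    ta_total = 0
    rows = []
    for position, base in enumerate(sequence.upper(), start=1):
        if base == "G":
            gc_total += 1
        elif base == "C":
            gc_total -= 1
        elif base == "T":
            ta_total += 1
        elif base == "A":
            ta_total -= 1
        # Other symbols (e.g. N) leave both totals unchanged.
        if position % interval == 0:
            rows.append((position, gc_total, ta_total))
    return rows


# Tiny toy sequence with a small interval so the output stays readable.
toy_rows = cumulative_skews("GGCA" * 2, interval=4)
```

On a real chromosome the cumulative GC curve typically rises along one replication strand and falls along the other, which is the shape the fitting step then models.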

Chromosomes and plasmids of fewer than 100 thousand nucleotides are ignored, as these are too noisy to model faithfully. Plasmids are clearly marked in the database, enabling researchers to focus on chromosomes if so desired.

Fitting: Once the genomes have been summarised at 4096-nucleotide resolution, the skews are fitted to a simple model. The fits are based on four parameters. Alpha1 and alpha2 denote the relative excess of G over C on the leading and lagging strands. If alpha1 is 0.046, this means that for every 1000 nucleotides on the leading strand, the cumulative count of G excess increases by 46. The third parameter is div and it describes how the chromosome is divided over leading and lagging strands. If this number is 0.557, the leading replication strand is modeled to make up 55.7% of the chromosome. The final parameter is shift (the dotted vertical line), and denotes the offset of the origin of replication compared to the DNA FASTA file. This parameter has no biological meaning in itself, and is an artifact of the DNA assembly process.

The goodness-of-fit number consists of the root mean squared error of the fit, divided by the absolute mean skew. This latter correction is made to not penalize good fits for bacteria showing significant skew. GC skew tends to be defined very strongly, and it is therefore used to pick the div and shift parameters of the DNA sequence, which are then kept as a fixed constraint for all the other skews, which might not be present as clearly. The fitting process itself is a downhill simplex method optimization over the three dimensions, seeded with the average observed skew over the whole genome, and assuming there is no shift, and that the leading and lagging strands are evenly distributed. The simplex optimization is tuned so that it takes sufficiently large steps so it can reach the optimum even if some initial assumptions are off.
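The goodness-of-fit number described above can be sketched as follows. The text does not say whether "absolute mean skew" means the mean of the absolute values or the absolute value of the mean; this sketch assumes the former, and the function name `goodness_of_fit` is hypothetical.

```python
import math


def goodness_of_fit(observed, predicted):
    """Root mean squared error divided by the mean absolute observed skew.

    Assumption: "absolute mean skew" is read as mean(|observed|).
    Lower values indicate a better fit relative to the skew magnitude.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal length")
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    rmse = math.sqrt(squared_error / len(observed))
    mean_abs_skew = sum(abs(o) for o in observed) / len(observed)
    if mean_abs_skew == 0:
        raise ValueError("observed skew is zero everywhere; ratio is undefined")
    return rmse / mean_abs_skew


# Hypothetical cumulative skew samples and model predictions.
perfect = goodness_of_fit([10.0, 20.0, 30.0], [10.0, 20.0, 30.0])
imperfect = goodness_of_fit([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```

Dividing by the skew magnitude keeps the score comparable between genomes with weak and strong skew, which matches the stated aim of not penalizing good fits for strongly skewed bacteria.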
A database of 100 years (1915-2014) of coastal flooding in the UK
edmed.seadatanet.org
bodc.ac.uk
+1 more
nc
Updated Nov 21, 2024
Cite
University of Southampton School of Ocean and Earth Science (2024). A database of 100 years (1915-2014) of coastal flooding in the UK [Dataset]. https://edmed.seadatanet.org/report/6120/
Explore at:
Available download formats: nc
Dataset updated
Nov 21, 2024
Dataset authored and provided by
University of Southampton School of Ocean and Earth Science
License
https://vocab.nerc.ac.uk/collection/L08/current/UN/
Time period covered
Jan 1, 1915 - Dec 31, 2014
Area covered

Description
This database, and the accompanying website called ‘SurgeWatch’ (http://surgewatch.stg.rlp.io), provides a systematic UK-wide record of high sea level and coastal flood events over the last 100 years (1915-2014). Derived using records from the National Tide Gauge Network, a dataset of exceedence probabilities from the Environment Agency and meteorological fields from the 20th Century Reanalysis, the database captures information on 96 storm events that generated the highest sea levels around the UK since 1915. For each event, the database contains information about: (1) the storm that generated that event; (2) the sea levels recorded around the UK during the event; and (3) the occurrence and severity of coastal flooding as a consequence of the event. The data are presented to be easily accessible and understandable to a wide range of interested parties.

The database contains 100 files: four CSV files and 96 PDF files. Two CSV files contain the meteorological and sea level data for each of the 96 events. A third file contains the list of the top 20 largest skew surges at each of the 40 study tide gauge sites. In the file containing the sea level and skew surge data, the tide gauge sites are numbered 1 to 40. A fourth accompanying CSV file lists, for reference, the site name and location (longitude and latitude). A description of the parameters in each of the four CSV files is given in the table below. There are also 96 separate PDF files containing the event commentaries. For each event these contain a concise narrative of the meteorological and sea level conditions experienced during the event, and a succinct description of the evidence available in support of coastal flooding, with a brief account of the recorded consequences to people and property. In addition, these contain a graphical representation of the storm track and mean sea level pressure and wind fields at the time of maximum high water, the return period and skew surge magnitudes at sites around the UK, and a table of the date and time, offset return period, water level, predicted tide and skew surge for each site where the 1 in 5 year threshold was reached or exceeded for each event. A detailed description of how the database was created is given in Haigh et al. (2015).

Coastal flooding caused by extreme sea levels can be devastating, with long-lasting and diverse consequences. The UK has a long history of severe coastal flooding. The recent 2013-14 winter, in particular, produced a sequence of some of the worst coastal flooding the UK has experienced in the last 100 years. At present, 2.5 million properties and £150 billion of assets are potentially exposed to coastal flooding. Yet despite these concerns, there is no formal, national framework in the UK to record flood severity and consequences and thus benefit an understanding of coastal flooding mechanisms and consequences. Without a systematic record of flood events, assessment of coastal flooding around the UK coast is limited.

The database was created at the School of Ocean and Earth Science, National Oceanography Centre, University of Southampton with help from the Faculty of Engineering and the Environment, University of Southampton, the National Oceanography Centre and the British Oceanographic Data Centre. Collation of the database and the development of the website were funded through a Natural Environment Research Council (NERC) impact acceleration grant. The database contributes to the objectives of UK Engineering and Physical Sciences Research Council (EPSRC) consortium project FLOOD Memory (EP/K013513/1).

Megan L. Peterson; William Morris; Cristina Linares; Daniel Doak (2022). Data from: Improving structured population models with more realistic representations of non-normal growth [Dataset]. http://doi.org/10.5061/dryad.t6c3573

Data from: Improving structured population models with more realistic representations of non-normal growth

Explore at:

Unique identifier

https://doi.org/10.5061/dryad.t6c3573

Dataset updated

Jun 1, 2022

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Megan L. Peterson; William Morris; Cristina Linares; Daniel Doak

License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically

Description

1. Structured population models are among the most widely used tools in ecology and evolution. Integral projection models (IPMs) use continuous representations of how survival, reproduction, and growth change as functions of state variables such as size, requiring fewer parameters to be estimated than projection matrix models (PPMs). Yet almost all published IPMs make an important assumption: that size-dependent growth transitions are or can be transformed to be normally distributed. In fact, many organisms exhibit highly skewed size transitions. Small individuals can grow more than they can shrink, and large individuals may often shrink more dramatically than they can grow. Yet the implications of such skew for inference from IPMs have not been explored, nor have general methods been developed to incorporate skewed size transitions into IPMs, or deal with other aspects of real growth rates, including bounds on possible growth or shrinkage. 2. Here we develop a flexible approach to modeling skewed growth data using a modified beta regression model. We propose that sizes first be converted to a (0,1) interval by estimating size-dependent minimum and maximum sizes through quantile regression. Transformed data can then be modeled using beta regression with widely available statistical tools. We demonstrate the utility of this approach using demographic data for a long-lived plant, gorgonians, and an epiphytic lichen. Specifically, we compare inferences of population parameters from discrete PPMs to those from IPMs that either assume normality or incorporate skew using beta regression or, alternatively, a skewed normal model. 3. The beta and skewed normal distributions accurately capture the mean, variance, and skew of real growth distributions. 
Incorporating skewed growth into IPMs decreases population growth and estimated lifespan relative to IPMs that assume normally-distributed growth, and more closely approximates the parameters of PPMs that do not assume a particular growth distribution. A bounded distribution, such as the beta, also avoids the eviction problem caused by predicting some growth outside the modeled size range. 4. Incorporating biologically relevant skew in growth data has important consequences for inference from IPMs. The approaches we outline here are flexible and easy to implement with existing statistical tools.
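The size transformation mentioned in the abstract can be sketched as a simple rescaling. This is an illustrative assumption rather than the authors' code: the size-dependent bounds would come from quantile regression, but here they are passed in as plain numbers, and the function name `to_unit_interval` and the small clipping margin `eps` are hypothetical choices.

```python
def to_unit_interval(size_next, lower, upper, eps=1e-6):
    """Rescale a next-year size to the open interval (0, 1).

    `lower` and `upper` stand in for the size-dependent minimum and
    maximum sizes (assumed here to be supplied by a separate quantile
    regression). Values are clipped by `eps` because a beta
    distribution has no density at exactly 0 or 1; the clipping is an
    assumption of this sketch.
    """
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    scaled = (size_next - lower) / (upper - lower)
    return min(max(scaled, eps), 1.0 - eps)


# Hypothetical individual: bounds of 2.0 and 10.0 for its current size.
inside = to_unit_interval(5.0, lower=2.0, upper=10.0)
above = to_unit_interval(12.0, lower=2.0, upper=10.0)
```

Because the rescaled response is bounded, a model fitted to it cannot predict growth outside the size range, which is how a bounded distribution sidesteps the eviction problem noted in the abstract.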
