100+ datasets found

f
Data from: Clustering Spatial Data with a Mixture of Skewed Regression...
tandf.figshare.com
pdf
Updated May 12, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Junho Lee; Michael P. B. Gallaugher; Amanda S. Hering (2025). Clustering Spatial Data with a Mixture of Skewed Regression Models [Dataset]. http://doi.org/10.6084/m9.figshare.28454482.v1
Explore at:
pdfAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.28454482.v1
Dataset updated
May 12, 2025
Dataset provided by
Taylor & Francis
Authors
Junho Lee; Michael P. B. Gallaugher; Amanda S. Hering
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A single regression model is unlikely to hold throughout a large and complex spatial domain. A finite mixture of regression models can address this issue by clustering the data and assigning a regression model to explain each homogenous group. However, a typical finite mixture of regressions does not account for spatial dependencies. Furthermore, the number of components selected can be too high in the presence of skewed data and/or heavy tails. Here, we propose a mixture of regression models on a Markov random field with skewed distributions. The proposed model identifies the locations wherein the relationship between the predictors and the response is similar and estimates the model within each group as well as the number of groups. Overfitting is addressed by using skewed distributions, such as the skew-t or normal inverse Gaussian, in the error term of each regression model. Model estimation is carried out using an EM algorithm, and the performance of the estimators and model selection are illustrated through an extensive simulation study and two case studies.
f
Normalization of High Dimensional Genomics Data Where the Distribution of...
plos.figshare.com
tiff
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mattias Landfors; Philge Philip; Patrik Rydén; Per Stenberg (2023). Normalization of High Dimensional Genomics Data Where the Distribution of the Altered Variables Is Skewed [Dataset]. http://doi.org/10.1371/journal.pone.0027942
Explore at:
tiffAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0027942
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Mattias Landfors; Philge Philip; Patrik Rydén; Per Stenberg
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Genome-wide analysis of gene expression or protein binding patterns using different array or sequencing based technologies is now routinely performed to compare different populations, such as treatment and reference groups. It is often necessary to normalize the data obtained to remove technical variation introduced in the course of conducting experimental work, but standard normalization techniques are not capable of eliminating technical bias in cases where the distribution of the truly altered variables is skewed, i.e. when a large fraction of the variables are either positively or negatively affected by the treatment. However, several experiments are likely to generate such skewed distributions, including ChIP-chip experiments for the study of chromatin, gene expression experiments for the study of apoptosis, and SNP-studies of copy number variation in normal and tumour tissues. A preliminary study using spike-in array data established that the capacity of an experiment to identify altered variables and generate unbiased estimates of the fold change decreases as the fraction of altered variables and the skewness increases. We propose the following work-flow for analyzing high-dimensional experiments with regions of altered variables: (1) Pre-process raw data using one of the standard normalization techniques. (2) Investigate if the distribution of the altered variables is skewed. (3) If the distribution is not believed to be skewed, no additional normalization is needed. Otherwise, re-normalize the data using a novel HMM-assisted normalization procedure. (4) Perform downstream analysis. Here, ChIP-chip data and simulated data were used to evaluate the performance of the work-flow. It was found that skewed distributions can be detected by using the novel DSE-test (Detection of Skewed Experiments). Furthermore, applying the HMM-assisted normalization to experiments where the distribution of the truly altered variables is skewed results in considerably higher sensitivity and lower bias than can be attained using standard and invariant normalization methods.
d
Replication Data for: Accounting for Skewed or One-Sided Measurement Error...
dataone.org
dataverse.harvard.edu
Updated Nov 22, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Millimet, Daniel; Parmeter, Christopher (2023). Replication Data for: Accounting for Skewed or One-Sided Measurement Error in the Dependent Variable [Dataset]. http://doi.org/10.7910/DVN/IKSE2O
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/IKSE2O
Dataset updated
Nov 22, 2023
Dataset provided by
Harvard Dataverse
Authors
Millimet, Daniel; Parmeter, Christopher
Description
While classical measurement error in the dependent variable in a linear regression framework results only in a loss of precision, nonclassical measurement error can lead to estimates which are biased and inference which lacks power. Here, we consider a particular type of nonclassical measurement error: skewed errors. Unfortunately, skewed measurement error is likely to be a relatively common feature of many out- comes of interest in political science research. This study highlights the bias that can result even from relatively "small" amounts of skewed measurement error, particularly if the measurement error is heteroskedastic. We also assess potential solutions to this problem, focusing on the stochastic frontier model and nonlinear least squares. Simulations and three replications highlight the importance of thinking carefully about skewed measurement error, as well as appropriate solutions.
o
Data and Code for: Intrinsic Information Preferences and Skewness
openicpsr.org
Updated May 2, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yusufcan Masatlioglu; Yesim Orhun; Collin Raymond (2023). Data and Code for: Intrinsic Information Preferences and Skewness [Dataset]. http://doi.org/10.3886/E190641V1
Explore at:
Unique identifier
https://doi.org/10.3886/E190641V1
Dataset updated
May 2, 2023
Dataset provided by
American Economic Association
Authors
Yusufcan Masatlioglu; Yesim Orhun; Collin Raymond
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Nov 2014
Area covered
US
Description
This project examines whether people have an intrinsic preference for negatively skewed or positively skewed information structures and how these preferences relate to intrinsic preferences for informativeness. It reports results from 5 studies (3 lab experiments, 2 online studies).
U
Annual peak-flow data and PeakFQ output files for selected streamflow gaging...
data.usgs.gov
catalog.data.gov
Updated Feb 24, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Daniel Wagner; Andrea Veilleux (2024). Annual peak-flow data and PeakFQ output files for selected streamflow gaging stations operated by the U.S. Geological Survey in the New England region that were used to estimate regional skewness of annual peak flows [Dataset]. http://doi.org/10.5066/P9MC98OM
Explore at:
Unique identifier
https://doi.org/10.5066/P9MC98OM
Dataset updated
Feb 24, 2024
Dataset provided by
United States Geological Surveyhttp://www.usgs.gov/
Authors
Daniel Wagner; Andrea Veilleux
License
U.S. Government Workshttps://www.usa.gov/government-works
License information was derived automatically
Time period covered
Sep 30, 2011
Area covered
New England
Description
"NewEngland_pkflows.PRT" is a text file that contains results of flood-frequency analysis of annual peak flows from 186 selected streamflow gaging stations (streamgages) operated by the U.S. Geological Survey (USGS) in the New England region (Maine, Connecticut, Massachusetts, Rhode Island, New York, New Hampshire, and Vermont). Only streamgages in the region that were also in the USGS "GAGES II" database (https://water.usgs.gov/GIS/metadata/usgswrd/XML/gagesII_Sept2011.xml) were considered for use in the study. The file was generated by combining PeakFQ output (.PRT) files created using version 7.0 of USGS software PeakFQ (https://water.usgs.gov/software/PeakFQ/; Veilleux and others, 2014) to conduct flood-frequency analyses using the Expected Moments Algorithm (England and others, 2018). The peak-flow files used as input to PeakFQ were obtained from the USGS National Water Information System (NWIS) database (https://nwis.waterdata.usgs.gov/usa/nwis/peak) and contained annual ...
f
Experimental results for birth-death trees with the diameter of 2 and edge...
plos.figshare.com
xls
Updated May 31, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Seunghwa Kang; Jijun Tang; Stephen W. Schaeffer; David A. Bader (2023). Experimental results for birth-death trees with the diameter of 2 and edge lengths in a skewed distribution. [Dataset]. http://doi.org/10.1371/journal.pone.0022483.t004
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0022483.t004
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Seunghwa Kang; Jijun Tang; Stephen W. Schaeffer; David A. Bader
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We generate 10 model trees for a given number of genomes (). The number of false positives (FP), the number of false negatives (FN), and the execution time (time) in a cell are the average of the finished computations (finished: the number of finished computations within 24 hours) out of 10 trials using 10 different model trees. , , and in the tables are hours, minutes, and seconds, respectively. is the number of genes in a genome, which is 100 in our experiments.
Median and IQR of skewed data for CRP.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 3, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
John R. Balmes; Mehrdad Arjomandi; Philip A. Bromberg; Maria G. Costantini; Nicholas Dagincourt; Milan J. Hazucha; Danielle Hollenbeck-Pringle; David Q. Rich; Paul Stark; Mark W. Frampton (2023). Median and IQR of skewed data for CRP. [Dataset]. http://doi.org/10.1371/journal.pone.0222601.t003
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0222601.t003
Dataset updated
Jun 3, 2023
Dataset provided by
PLOShttp://plos.org/
Authors
John R. Balmes; Mehrdad Arjomandi; Philip A. Bromberg; Maria G. Costantini; Nicholas Dagincourt; Milan J. Hazucha; Danielle Hollenbeck-Pringle; David Q. Rich; Paul Stark; Mark W. Frampton
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Median and IQR of skewed data for CRP.
Dealing with highly skewed hospital length of stay distributions: The use of...
plos.figshare.com
datasetcatalog.nlm.nih.gov
docx
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Eva Williford; Valerie Haley; Louise-Anne McNutt; Victoria Lazariu (2023). Dealing with highly skewed hospital length of stay distributions: The use of Gamma mixture models to study delivery hospitalizations [Dataset]. http://doi.org/10.1371/journal.pone.0231825
Explore at:
docxAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0231825
Dataset updated
Jun 1, 2023
Dataset provided by
PLOShttp://plos.org/
Authors
Eva Williford; Valerie Haley; Louise-Anne McNutt; Victoria Lazariu
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The increased focus on addressing severe maternal morbidity and maternal mortality has led to studies investigating patient and hospital characteristics associated with longer hospital stays. Length of stay (LOS) for delivery hospitalizations has a strongly skewed distribution with the vast majority of LOS lasting two to three days in the United States. Prior studies typically focused on common LOSs and dealt with the long LOS distribution tail in ways to fit conventional statistical analyses (e.g., log transformation, trimming). This study demonstrates the use of Gamma mixture models to analyze the skewed LOS distribution. Gamma mixture models are flexible and, do not require data transformation or removal of outliers to accommodate many outcome distribution shapes, these models allow for the analysis of patients staying in the hospital for a longer time, which often includes those women experiencing worse outcomes. Random effects are included in the model to account for patients being treated within the same hospitals. Further, the role and influence of differing placements of covariates on the results is discussed in the context of distinct model specifications of the Gamma mixture regression model. The application of these models shows that they are robust to the placement of covariates and random effects. Using New York State data, the models showed that longer LOS for childbirth hospitalizations were more common in hospitals designated to accept more complicated deliveries, across hospital types, and among Black women. Primary insurance also was associated with LOS. Substantial variation between hospitals suggests the need to investigate protocols to standardize evidence-based medical care.
d
Data from: Selection on skewed characters and the paradox of stasis
datadryad.org
zip
Updated Sep 8, 2017
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Suzanne Bonamour; Céline Teplitsky; Anne Charmantier; Pierre-André Crochet; Luis-Miguel Chevin (2017). Selection on skewed characters and the paradox of stasis [Dataset]. http://doi.org/10.5061/dryad.pt07g
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5061/dryad.pt07g
Dataset updated
Sep 8, 2017
Dataset provided by
Dryad
Authors
Suzanne Bonamour; Céline Teplitsky; Anne Charmantier; Pierre-André Crochet; Luis-Miguel Chevin
Time period covered
Sep 6, 2017
Description
Observed phenotypic responses to selection in the wild often differ from predictions based on measurements of selection and genetic variance. An overlooked hypothesis to explain this paradox of stasis is that a skewed phenotypic distribution affects natural selection and evolution. We show through mathematical modelling that, when a trait selected for an optimum phenotype has a skewed distribution, directional selection is detected even at evolutionary equilibrium, where it causes no change in the mean phenotype. When environmental effects are skewed, Lande and Arnold’s (1983) directional gradient is in the direction opposite to the skew. In contrast, skewed breeding values can displace the mean phenotype from the optimum, causing directional selection in the direction of the skew. These effects can be partitioned out using alternative selection estimates based on average derivatives of individual relative fitness, or additive genetic covariances between relative fitness and trait (Robe...
Data for "Preference patterns for skewed gambles in rhesus monkeys"
figshare.com
datasetcatalog.nlm.nih.gov
+1more
bin
Updated Jan 18, 2016
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Caleb Strait; Benjamin Hayden (2016). Data for "Preference patterns for skewed gambles in rhesus monkeys" [Dataset]. http://doi.org/10.6084/m9.figshare.827292.v1
Explore at:
binAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.827292.v1
Dataset updated
Jan 18, 2016
Dataset provided by
figshare
Figsharehttp://figshare.com/
Authors
Caleb Strait; Benjamin Hayden
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data for the paper, "Preference patterns for skewed gambles in rhesus monkeys."
d
Randomized Battery Usage 7: Low-Temperature Left-Skewed Random Walk
catalog.data.gov
s.cnmilf.com
+1more
Updated Apr 11, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
PCoE (2025). Randomized Battery Usage 7: Low-Temperature Left-Skewed Random Walk [Dataset]. https://catalog.data.gov/dataset/randomized-battery-usage-7-low-temperature-left-skewed-random-walk
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
PCoE
Description
This dataset is part of a series of datasets, where batteries are continuously cycled with randomly generated current profiles. Reference charging and discharging cycles are also performed after a fixed interval of randomized usage to provide reference benchmarks for battery state of health. In this dataset, four 18650 Li-ion batteries (Identified as RW13, RW14, RW15 and RW16) were continuously operated by repeatedly charging them to 4.2V and then discharging them to 3.2V using a randomized sequence of discharging currents between 0.5A and 5A. This type of discharging profile is referred to here as random walk (RW) discharging. A customized probability distribution is used in this experiment to select a new load setpoint every 1 minute during RW discharging operation. The custom probability distribution was designed to be skewed towards selecting lower currents.
u
Results and analysis using the Lean Six-Sigma define, measure, analyze,...
researchdata.up.ac.za
docx
Updated Mar 12, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Modiehi Mophethe (2024). Results and analysis using the Lean Six-Sigma define, measure, analyze, improve, and control (DMAIC) Framework [Dataset]. http://doi.org/10.25403/UPresearchdata.25370374.v1
Explore at:
docxAvailable download formats
Unique identifier
https://doi.org/10.25403/UPresearchdata.25370374.v1
Dataset updated
Mar 12, 2024
Dataset provided by
University of Pretoria
Authors
Modiehi Mophethe
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This section presents a discussion of the research data. The data was received as secondary data however, it was originally collected using the time study techniques. Data validation is a crucial step in the data analysis process to ensure that the data is accurate, complete, and reliable. Descriptive statistics was used to validate the data. The mean, mode, standard deviation, variance and range determined provides a summary of the data distribution and assists in identifying outliers or unusual patterns. The data presented in the dataset show the measures of central tendency which includes the mean, median and the mode. The mean signifies the average value of each of the factors presented in the tables. This is the balance point of the dataset, the typical value and behaviour of the dataset. The median is the middle value of the dataset for each of the factors presented. This is the point where the dataset is divided into two parts, half of the values lie below this value and the other half lie above this value. This is important for skewed distributions. The mode shows the most common value in the dataset. It was used to describe the most typical observation. These values are important as they describe the central value around which the data is distributed. The mean, mode and median give an indication of a skewed distribution as they are not similar nor are they close to one another. In the dataset, the results and discussion of the results is also presented. This section focuses on the customisation of the DMAIC (Define, Measure, Analyse, Improve, Control) framework to address the specific concerns outlined in the problem statement. To gain a comprehensive understanding of the current process, value stream mapping was employed, which is further enhanced by measuring the factors that contribute to inefficiencies. These factors are then analysed and ranked based on their impact, utilising factor analysis. To mitigate the impact of the most influential factor on project inefficiencies, a solution is proposed using the EOQ (Economic Order Quantity) model. The implementation of the 'CiteOps' software facilitates improved scheduling, monitoring, and task delegation in the construction project through digitalisation. Furthermore, project progress and efficiency are monitored remotely and in real time. In summary, the DMAIC framework was tailored to suit the requirements of the specific project, incorporating techniques from inventory management, project management, and statistics to effectively minimise inefficiencies within the construction project.
Results for datasets with simulated error (derived from Datasets 1-5)
zenodo.org
zip
Updated Mar 29, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Elise Rivett; Elise Rivett (2024). Results for datasets with simulated error (derived from Datasets 1-5) [Dataset]. http://doi.org/10.5281/zenodo.10888931
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.10888931
Dataset updated
Mar 29, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Elise Rivett; Elise Rivett
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets w/ simulated error derived from Datasets 1-5. There are 1000 datasets per combination of error level and skewness type.

e1: Low error, no skew

e2: Medium error, no skew

e3: High error, no skew

e4: Low error, left skew

e5: Medium error, left skew

e6: High error, left skew

e7: Low error, right skew

e8: Medium error, right skew

e9: High error, right skew
Skewness project raw data files and codes
figshare.com
xlsx
Updated Mar 14, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Raunak Dey; Sreekanth K Manikandan (2022). Skewness project raw data files and codes [Dataset]. http://doi.org/10.6084/m9.figshare.17703269.v2
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.17703269.v2
Dataset updated
Mar 14, 2022
Dataset provided by
Figsharehttp://figshare.com/
Authors
Raunak Dey; Sreekanth K Manikandan
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains raw data files and base codes to analyze them.A. The 'powerx_y.xlsx' files are the data files with the one dimensional trajectory of optically trapped probes modulated by an Ornstein-Uhlenbeck noise of given 'x' amplitude. For the corresponding diffusion amplitude A=0.1X(0.6X10-6)2 m2/s, x is labelled as '1'B. The codes are of three types. The skewness codes are used to calculate the skewness of the trajectory. The error_in_fit codes are used to calculate deviations from arcsine behavior. The sigma_exp codes point to the deviation of the mean from 0.5. All the codes are written three times to look ar T+, Tlast and Tmax.C. More information can be found in the manuscript.
o
Data and Code for: Robot Hubs: The Skewed Distribution of Robots in U.S....
openicpsr.org
delimited
Updated May 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
ERIK BRYNJOLFSSON; CATHERINE BUFFINGTON; NATHAN GOLDSCHLAG; J. FRANK LI; JAVIER MIRANDA; ROBERT SEAMANS (2023). Data and Code for: Robot Hubs: The Skewed Distribution of Robots in U.S. Manufacturing [Dataset]. http://doi.org/10.3886/E190543V1
Explore at:
delimitedAvailable download formats
Unique identifier
https://doi.org/10.3886/E190543V1
Dataset updated
May 1, 2023
Dataset provided by
American Economic Association
Authors
ERIK BRYNJOLFSSON; CATHERINE BUFFINGTON; NATHAN GOLDSCHLAG; J. FRANK LI; JAVIER MIRANDA; ROBERT SEAMANS
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
New technologies drive productivity growth (Romer 1990). Preliminary evidence using national-level data from 17 countries between 1993 and 2007 suggests that robots, like prior generations of general-purpose technologies, are also driving productivity growth (Graetz & Michaels 2018). Moreover, according to data compiled by the International Federation of Robotics (IFR), since 2010 the number of robot shipments has nearly quadrupled from about 100,000 to almost 400,000 per year, so the impact of robots on the economy is likely even greater. However, while robotics may be contributing to GDP growth at a national level, scholars are still working to understand how robots affect employment and other outcomes at other levels of analysis, leading to calls for establishment-level measures of robots and other new technologies (Brynjolfsson & Mitchell 2017; Raj and Seamans 2019). To address this need, the U.S. Census Bureau, working with external researchers, developed a series of questions on the adoption and use of robots. These questions have subsequently been included in the Annual Survey of Manufactures and other Census surveys (Buffington, Miranda & Seamans 2018; Brynjolfsson et al. 2020; Acemoglu et al. 2022). In this paper we present results on the distribution of robots in U.S. manufacturing, using the new establishment-level microdata collected by the U.S. Census Bureau. We use the data to present several facts about the location and use of robots. We find that the distribution of robots is highly skewed across locations, even accounting for the different mix of industry and manufacturing employment. Some locations - which we call “Robot Hubs” - have many more robots than one would expect after accounting for industry and manufacturing employment. We characterize these Robot Hubs along several industry, demographic, and institutional dimensions, and find that the presence of robot integrators and union membership are distinguishing features of Robot Hubs.
n
Data from: Body temperature distributions of active diurnal lizards in three...
data.niaid.nih.gov
datadryad.org
zip
Updated Aug 4, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Raymond B. Huey; Eric R. Pianka (2018). Body temperature distributions of active diurnal lizards in three deserts: skewed up or skewed down? [Dataset]. http://doi.org/10.5061/dryad.45g3s
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5061/dryad.45g3s
Dataset updated
Aug 4, 2018
Dataset provided by
The University of Texas at Austin
University of Washington
Authors
Raymond B. Huey; Eric R. Pianka
License
https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
Area covered
North America, Africa, Australia
Description
The performance of ectotherms integrated over time depends in part on the position and shape of the distribution of body temperatures (Tb) experienced during activity. For several complementary reasons, physiological ecologists have long expected that Tb distributions during activity should have a long left tail (left-skewed); but only infrequently have they quantified the magnitude and direction of Tb skewness in nature.

To evaluate whether left-skewed Tb distributions are general for diurnal desert lizards, we compiled and analyzed Tb (∑ = 9,023 temperatures) from our own prior studies of active desert lizards on three continents (25 species in Western Australia, 10 in the Kalahari Desert of Africa, and 10 species in western North America). We gathered these data over several decades, using standardized techniques.

Many species showed significantly left-skewed Tb distributions, even when records were restricted to summer months. However, magnitudes of skewness were always small, such that mean Tb were never more than 1°C lower than median Tb. The significance of Tb skewness was sensitive to sample size, and power tests reinforced this sensitivity.

The magnitude of skewness was not obviously related to phylogeny, desert, body size, or median body temperature. Moreover, formal phylogenetic analysis is inappropriate because geography and phylogeny are confounded (that is, are highly collinear).

Skewness might be limited if lizards pre-warm inside retreats before emerging in the morning, emerge only when operative temperatures are high enough to speed warming to activity Tb, or if cold lizards are especially wary and difficult to spot or catch. Telemetry studies may help evaluate these possibilities.
n
Data from: Evolution of quantitative traits under a migration-selection...
data.niaid.nih.gov
search.dataone.org
+1more
zip
Updated Jul 21, 2015
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Florence Débarre; Sam Yeaman; Frédéric Guillaume (2015). Evolution of quantitative traits under a migration-selection balance: when does skew matter? [Dataset]. http://doi.org/10.5061/dryad.ms52b
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5061/dryad.ms52b
Dataset updated
Jul 21, 2015
Authors
Florence Débarre; Sam Yeaman; Frédéric Guillaume
License
https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
Description
Quantitative-genetic models of differentiation under migration-selection balance often rely on the assumption of normally distributed genotypic and phenotypic values. When a population is subdivided into demes with selection toward different local optima, migration between demes may result in asymmetric, or skewed, local distributions. Using a simplified two-habitat model, we derive formulas without a priori assuming a Gaussian distribution of genotypic values, and we find expressions that naturally incorporate higher moments, such as skew. These formulas yield predictions of the expected divergence under migration-selection balance that are more accurate than models assuming Gaussian distributions, which illustrates the importance of incorporating these higher moments to assess the response to selection in heterogeneous environments. We further show with simulations that traits with loci of large effect display the largest skew in their distribution at migration-selection balance.
t
Replication data for: does “very” make a difference? effects of intensifiers...
service.tib.eu
Updated May 16, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2025). Replication data for: does “very” make a difference? effects of intensifiers in item stems of employee attitude surveys on response behavior [Dataset]. https://service.tib.eu/ldmservice/dataset/osn-doi-10-26249-fk2-oexojh
Explore at:
Dataset updated
May 16, 2025
Description
Abstract: Employee attitude surveys are important tools for organizational development. To gain insights into employees’ attitudes, surveys most often use Likert-type items. Measures assessing these attitudes frequently use intensifiers (e.g., extremely, very) in item stems. To date little is known about the effects of intensifiers in the item stem on response behavior. They are frequently used inconsistently, which potentially has implications for the comparability of results in the context of benchmarking. Also, results often suffer from left-skewed distributions limiting data quality for which the use of intensifiers potentially offers a remedy. Therefore, we systematically examine the effects of intensifiers’ on response behavior in employee attitude surveys and their potential to remedy the issue of left-skewed distributions. In three studies, we assess effects on level, skewness and nomological structure. Study 1 examines the effects of intensifier strength in the item stem, while Studies 2 and 3 assess whether intensifier salience would increase these effects further. Interestingly, results did not show systematic effects. Future research ideas in regards to item design and processing as well as practical implications for the design of employee attitude surveys are discussed. Other: Does “very” make a difference? Effects of intensifiers in item stems of employee attitude surveys on response behavior - in preparation
o
Data from: Improving structured population models with more realistic...
explore.openaire.eu
search.dataone.org
+1more
Updated Jun 14, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Megan L. Peterson; William Morris; Cristina Linares; Daniel Doak (2019). Data from: Improving structured population models with more realistic representations of non-normal growth [Dataset]. http://doi.org/10.5061/dryad.t6c3573
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.t6c3573
Dataset updated
Jun 14, 2019
Authors
Megan L. Peterson; William Morris; Cristina Linares; Daniel Doak
Description
Structured population models are among the most widely used tools in ecology and evolution. Integral projection models (IPMs) use continuous representations of how survival, reproduction, and growth change as functions of state variables such as size, requiring fewer parameters to be estimated than projection matrix models (PPMs). Yet almost all published IPMs make an important assumption: that size-dependent growth transitions are or can be transformed to be normally distributed. In fact, many organisms exhibit highly skewed size transitions. Small individuals can grow more than they can shrink, and large individuals may often shrink more dramatically than they can grow. Yet the implications of such skew for inference from IPMs has not been explored, nor have general methods been developed to incorporate skewed size transitions into IPMs, or deal with other aspects of real growth rates, including bounds on possible growth or shrinkage. 2. Here we develop a flexible approach to modeling skewed growth data using a modified beta regression model. We propose that sizes first be converted to a (0,1) interval by estimating size-dependent minimum and maximum sizes through quantile regression. Transformed data can then be modeled using beta regression with widely available statistical tools. We demonstrate the utility of this approach using demographic data for a long-lived plant, gorgonians, and an epiphytic lichen. Specifically, we compare inferences of population parameters from discrete PPMs to those from IPMs that either assume normality or incorporate skew using beta regression or, alternatively, a skewed normal model. 3. The beta and skewed normal distributions accurately capture the mean, variance, and skew of real growth distributions. Incorporating skewed growth into IPMs decreases population growth and estimated lifespan relative to IPMs that assume normally-distributed growth, and more closely approximate the parameters of PPMs that do not assume a particular growth distribution. A bounded distribution, such as the beta, also avoids the eviction problem caused by predicting some growth outside the modeled size range. 4. Incorporating biologically relevant skew in growth data has important consequences for inference from IPMs. The approaches we outline here are flexible and easy to implement with existing statistical tools. Bistort raw dataDemographic data for Polygonum viviparum collected at Niwot Ridge, CO from 2001-2011. szs0 = size at time t, szs1 = size at time t+1, bulbs0 = number of bulbils produced at time t. Details of data collection given in the supporting information.Gorgonian raw dataDemographic data for Paramuricea clavata collected in the NW Mediterranean Sea from 1999-2004. Mortality = source of mortality, Site = site, Plot = plot, Year = annual transition from time t to time t+1, Ncol = colony id, Size = size at time t, Sizenext = size at time t+1, Survnext = dead (0) or alive (1) at time t+1. Details of data collection and reproduction given in supporting information.Vulpicida raw dataDemographic data for Vulpicida pinastri collected in Kennicott Valley, AK from 2004-2009. Year = year at time t, site = site, t0 = size at time t, t1 = size at time t+1, repro = number of offspring assigned based on thallus circumference, survival = dead (0) or alive (1) at time t+1. Details of data collection and reproduction are given in supporting information.Appendix 1R script for analyses in Figure 2Appendix 2R script for analyses of coralAppendix 3R script to simultaneously fit the minimum, maximum, mean, and precision parameters of the beta approach using maximum likelihood
u
Skew-T Plots: Boise
data.ucar.edu
image
Updated Aug 1, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Research Applications Laboratory (RAL); NCAR (2025). Skew-T Plots: Boise [Dataset]. http://doi.org/10.26023/B3PP-7VG2-5Z0S
Explore at:
imageAvailable download formats
Unique identifier
https://doi.org/10.26023/B3PP-7VG2-5Z0S
Dataset updated
Aug 1, 2025
Authors
Research Applications Laboratory (RAL); NCAR
Time period covered
Nov 8, 2007 - Jan 4, 2008
Area covered

Description
This dataset contains upper air Skew-T Log-P charts taken at Boise, Idaho during the ICE-L project. The imagery are in GIF format. The imagery cover the time span from 2007-11-08 12:00:00 to 2008-01-03 12:00:00.

Facebook

Twitter

Click to copy link

Link copied

Cite

Junho Lee; Michael P. B. Gallaugher; Amanda S. Hering (2025). Clustering Spatial Data with a Mixture of Skewed Regression Models [Dataset]. http://doi.org/10.6084/m9.figshare.28454482.v1

Data from: Clustering Spatial Data with a Mixture of Skewed Regression Models

Explore at:

pdfAvailable download formats

Unique identifier

https://doi.org/10.6084/m9.figshare.28454482.v1

Dataset updated

May 12, 2025

Dataset provided by

Taylor & Francis

Authors

Junho Lee; Michael P. B. Gallaugher; Amanda S. Hering

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

A single regression model is unlikely to hold throughout a large and complex spatial domain. A finite mixture of regression models can address this issue by clustering the data and assigning a regression model to explain each homogenous group. However, a typical finite mixture of regressions does not account for spatial dependencies. Furthermore, the number of components selected can be too high in the presence of skewed data and/or heavy tails. Here, we propose a mixture of regression models on a Markov random field with skewed distributions. The proposed model identifies the locations wherein the relationship between the predictors and the response is similar and estimates the model within each group as well as the number of groups. Overfitting is addressed by using skewed distributions, such as the skew-t or normal inverse Gaussian, in the error term of each regression model. Model estimation is carried out using an EM algorithm, and the performance of the estimators and model selection are illustrated through an extensive simulation study and two case studies.

Clear search

Close search

Google apps

Main menu

Data from: Clustering Spatial Data with a Mixture of Skewed Regression...

Normalization of High Dimensional Genomics Data Where the Distribution of...

Replication Data for: Accounting for Skewed or One-Sided Measurement Error...

Data and Code for: Intrinsic Information Preferences and Skewness

Annual peak-flow data and PeakFQ output files for selected streamflow gaging...

Experimental results for birth-death trees with the diameter of 2 and edge...

Median and IQR of skewed data for CRP.

Dealing with highly skewed hospital length of stay distributions: The use of...

Data from: Selection on skewed characters and the paradox of stasis

Data for "Preference patterns for skewed gambles in rhesus monkeys"

Randomized Battery Usage 7: Low-Temperature Left-Skewed Random Walk

Results and analysis using the Lean Six-Sigma define, measure, analyze,...

Results for datasets with simulated error (derived from Datasets 1-5)

Skewness project raw data files and codes

Data and Code for: Robot Hubs: The Skewed Distribution of Robots in U.S....

Data from: Body temperature distributions of active diurnal lizards in three...

Data from: Evolution of quantitative traits under a migration-selection...

Replication data for: does “very” make a difference? effects of intensifiers...

Data from: Improving structured population models with more realistic...

Skew-T Plots: Boise

Data from: Clustering Spatial Data with a Mixture of Skewed Regression Models