100+ datasets found
  1. Data from: Clustering Spatial Data with a Mixture of Skewed Regression...

    • tandf.figshare.com
    pdf
    Updated May 12, 2025
    Cite
    Junho Lee; Michael P. B. Gallaugher; Amanda S. Hering (2025). Clustering Spatial Data with a Mixture of Skewed Regression Models [Dataset]. http://doi.org/10.6084/m9.figshare.28454482.v1
    Available download formats: pdf
    Dataset updated
    May 12, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Junho Lee; Michael P. B. Gallaugher; Amanda S. Hering
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A single regression model is unlikely to hold throughout a large and complex spatial domain. A finite mixture of regression models can address this issue by clustering the data and assigning a regression model to explain each homogenous group. However, a typical finite mixture of regressions does not account for spatial dependencies. Furthermore, the number of components selected can be too high in the presence of skewed data and/or heavy tails. Here, we propose a mixture of regression models on a Markov random field with skewed distributions. The proposed model identifies the locations wherein the relationship between the predictors and the response is similar and estimates the model within each group as well as the number of groups. Overfitting is addressed by using skewed distributions, such as the skew-t or normal inverse Gaussian, in the error term of each regression model. Model estimation is carried out using an EM algorithm, and the performance of the estimators and model selection are illustrated through an extensive simulation study and two case studies.

  2. Normalization of High Dimensional Genomics Data Where the Distribution of...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Mattias Landfors; Philge Philip; Patrik Rydén; Per Stenberg (2023). Normalization of High Dimensional Genomics Data Where the Distribution of the Altered Variables Is Skewed [Dataset]. http://doi.org/10.1371/journal.pone.0027942
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mattias Landfors; Philge Philip; Patrik Rydén; Per Stenberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Genome-wide analysis of gene expression or protein binding patterns using different array or sequencing based technologies is now routinely performed to compare different populations, such as treatment and reference groups. It is often necessary to normalize the data obtained to remove technical variation introduced in the course of conducting experimental work, but standard normalization techniques are not capable of eliminating technical bias in cases where the distribution of the truly altered variables is skewed, i.e. when a large fraction of the variables are either positively or negatively affected by the treatment. However, several experiments are likely to generate such skewed distributions, including ChIP-chip experiments for the study of chromatin, gene expression experiments for the study of apoptosis, and SNP-studies of copy number variation in normal and tumour tissues. A preliminary study using spike-in array data established that the capacity of an experiment to identify altered variables and generate unbiased estimates of the fold change decreases as the fraction of altered variables and the skewness increases. We propose the following work-flow for analyzing high-dimensional experiments with regions of altered variables: (1) Pre-process raw data using one of the standard normalization techniques. (2) Investigate if the distribution of the altered variables is skewed. (3) If the distribution is not believed to be skewed, no additional normalization is needed. Otherwise, re-normalize the data using a novel HMM-assisted normalization procedure. (4) Perform downstream analysis. Here, ChIP-chip data and simulated data were used to evaluate the performance of the work-flow. It was found that skewed distributions can be detected by using the novel DSE-test (Detection of Skewed Experiments). 
    Furthermore, applying the HMM-assisted normalization to experiments where the distribution of the truly altered variables is skewed results in considerably higher sensitivity and lower bias than can be attained using standard and invariant normalization methods.

  3. Replication Data for: Accounting for Skewed or One-Sided Measurement Error...

    • dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Millimet, Daniel; Parmeter, Christopher (2023). Replication Data for: Accounting for Skewed or One-Sided Measurement Error in the Dependent Variable [Dataset]. http://doi.org/10.7910/DVN/IKSE2O
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Millimet, Daniel; Parmeter, Christopher
    Description

    While classical measurement error in the dependent variable in a linear regression framework results only in a loss of precision, nonclassical measurement error can lead to estimates which are biased and inference which lacks power. Here, we consider a particular type of nonclassical measurement error: skewed errors. Unfortunately, skewed measurement error is likely to be a relatively common feature of many outcomes of interest in political science research. This study highlights the bias that can result even from relatively "small" amounts of skewed measurement error, particularly if the measurement error is heteroskedastic. We also assess potential solutions to this problem, focusing on the stochastic frontier model and nonlinear least squares. Simulations and three replications highlight the importance of thinking carefully about skewed measurement error, as well as appropriate solutions.

  4. Data and Code for: Intrinsic Information Preferences and Skewness

    • openicpsr.org
    Updated May 2, 2023
    Cite
    Yusufcan Masatlioglu; Yesim Orhun; Collin Raymond (2023). Data and Code for: Intrinsic Information Preferences and Skewness [Dataset]. http://doi.org/10.3886/E190641V1
    Dataset updated
    May 2, 2023
    Dataset provided by
    American Economic Association
    Authors
    Yusufcan Masatlioglu; Yesim Orhun; Collin Raymond
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 2014
    Area covered
    US
    Description

    This project examines whether people have an intrinsic preference for negatively skewed or positively skewed information structures and how these preferences relate to intrinsic preferences for informativeness. It reports results from 5 studies (3 lab experiments, 2 online studies).

  5. Annual peak-flow data and PeakFQ output files for selected streamflow gaging...

    • data.usgs.gov
    • catalog.data.gov
    Updated Feb 24, 2024
    Cite
    Daniel Wagner; Andrea Veilleux (2024). Annual peak-flow data and PeakFQ output files for selected streamflow gaging stations operated by the U.S. Geological Survey in the New England region that were used to estimate regional skewness of annual peak flows [Dataset]. http://doi.org/10.5066/P9MC98OM
    Dataset updated
    Feb 24, 2024
    Dataset provided by
    United States Geological Survey: http://www.usgs.gov/
    Authors
    Daniel Wagner; Andrea Veilleux
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    Sep 30, 2011
    Area covered
    New England
    Description

    "NewEngland_pkflows.PRT" is a text file that contains results of flood-frequency analysis of annual peak flows from 186 selected streamflow gaging stations (streamgages) operated by the U.S. Geological Survey (USGS) in the New England region (Maine, Connecticut, Massachusetts, Rhode Island, New York, New Hampshire, and Vermont). Only streamgages in the region that were also in the USGS "GAGES II" database (https://water.usgs.gov/GIS/metadata/usgswrd/XML/gagesII_Sept2011.xml) were considered for use in the study. The file was generated by combining PeakFQ output (.PRT) files created using version 7.0 of USGS software PeakFQ (https://water.usgs.gov/software/PeakFQ/; Veilleux and others, 2014) to conduct flood-frequency analyses using the Expected Moments Algorithm (England and others, 2018). The peak-flow files used as input to PeakFQ were obtained from the USGS National Water Information System (NWIS) database (https://nwis.waterdata.usgs.gov/usa/nwis/peak) and contained annual ...

  6. Experimental results for birth-death trees with the diameter of 2 and edge...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    + more versions
    Cite
    Seunghwa Kang; Jijun Tang; Stephen W. Schaeffer; David A. Bader (2023). Experimental results for birth-death trees with the diameter of 2 and edge lengths in a skewed distribution. [Dataset]. http://doi.org/10.1371/journal.pone.0022483.t004
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Seunghwa Kang; Jijun Tang; Stephen W. Schaeffer; David A. Bader
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We generate 10 model trees for a given number of genomes (). The number of false positives (FP), the number of false negatives (FN), and the execution time (time) in a cell are the average of the finished computations (finished: the number of finished computations within 24 hours) out of 10 trials using 10 different model trees. , , and in the tables are hours, minutes, and seconds, respectively. is the number of genes in a genome, which is 100 in our experiments.

  7. Median and IQR of skewed data for CRP.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 3, 2023
    Cite
    John R. Balmes; Mehrdad Arjomandi; Philip A. Bromberg; Maria G. Costantini; Nicholas Dagincourt; Milan J. Hazucha; Danielle Hollenbeck-Pringle; David Q. Rich; Paul Stark; Mark W. Frampton (2023). Median and IQR of skewed data for CRP. [Dataset]. http://doi.org/10.1371/journal.pone.0222601.t003
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    John R. Balmes; Mehrdad Arjomandi; Philip A. Bromberg; Maria G. Costantini; Nicholas Dagincourt; Milan J. Hazucha; Danielle Hollenbeck-Pringle; David Q. Rich; Paul Stark; Mark W. Frampton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Median and IQR of skewed data for CRP.

  8. Dealing with highly skewed hospital length of stay distributions: The use of...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    docx
    Updated Jun 1, 2023
    Cite
    Eva Williford; Valerie Haley; Louise-Anne McNutt; Victoria Lazariu (2023). Dealing with highly skewed hospital length of stay distributions: The use of Gamma mixture models to study delivery hospitalizations [Dataset]. http://doi.org/10.1371/journal.pone.0231825
    Available download formats: docx
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Eva Williford; Valerie Haley; Louise-Anne McNutt; Victoria Lazariu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The increased focus on addressing severe maternal morbidity and maternal mortality has led to studies investigating patient and hospital characteristics associated with longer hospital stays. Length of stay (LOS) for delivery hospitalizations has a strongly skewed distribution with the vast majority of LOS lasting two to three days in the United States. Prior studies typically focused on common LOSs and dealt with the long LOS distribution tail in ways to fit conventional statistical analyses (e.g., log transformation, trimming). This study demonstrates the use of Gamma mixture models to analyze the skewed LOS distribution. Gamma mixture models are flexible and do not require data transformation or removal of outliers to accommodate many outcome distribution shapes. These models allow for the analysis of patients staying in the hospital for a longer time, which often includes those women experiencing worse outcomes. Random effects are included in the model to account for patients being treated within the same hospitals. Further, the role and influence of differing placements of covariates on the results is discussed in the context of distinct model specifications of the Gamma mixture regression model. The application of these models shows that they are robust to the placement of covariates and random effects. Using New York State data, the models showed that longer LOS for childbirth hospitalizations were more common in hospitals designated to accept more complicated deliveries, across hospital types, and among Black women. Primary insurance also was associated with LOS. Substantial variation between hospitals suggests the need to investigate protocols to standardize evidence-based medical care.

  9. Data from: Selection on skewed characters and the paradox of stasis

    • datadryad.org
    zip
    Updated Sep 8, 2017
    Cite
    Suzanne Bonamour; Céline Teplitsky; Anne Charmantier; Pierre-André Crochet; Luis-Miguel Chevin (2017). Selection on skewed characters and the paradox of stasis [Dataset]. http://doi.org/10.5061/dryad.pt07g
    Available download formats: zip
    Dataset updated
    Sep 8, 2017
    Dataset provided by
    Dryad
    Authors
    Suzanne Bonamour; Céline Teplitsky; Anne Charmantier; Pierre-André Crochet; Luis-Miguel Chevin
    Time period covered
    Sep 6, 2017
    Description

    Observed phenotypic responses to selection in the wild often differ from predictions based on measurements of selection and genetic variance. An overlooked hypothesis to explain this paradox of stasis is that a skewed phenotypic distribution affects natural selection and evolution. We show through mathematical modelling that, when a trait selected for an optimum phenotype has a skewed distribution, directional selection is detected even at evolutionary equilibrium, where it causes no change in the mean phenotype. When environmental effects are skewed, Lande and Arnold’s (1983) directional gradient is in the direction opposite to the skew. In contrast, skewed breeding values can displace the mean phenotype from the optimum, causing directional selection in the direction of the skew. These effects can be partitioned out using alternative selection estimates based on average derivatives of individual relative fitness, or additive genetic covariances between relative fitness and trait (Robe...

  10. Data for "Preference patterns for skewed gambles in rhesus monkeys"

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1 more
    bin
    Updated Jan 18, 2016
    Cite
    Caleb Strait; Benjamin Hayden (2016). Data for "Preference patterns for skewed gambles in rhesus monkeys" [Dataset]. http://doi.org/10.6084/m9.figshare.827292.v1
    Available download formats: bin
    Dataset updated
    Jan 18, 2016
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Caleb Strait; Benjamin Hayden
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data for the paper, "Preference patterns for skewed gambles in rhesus monkeys."

  11. Randomized Battery Usage 7: Low-Temperature Left-Skewed Random Walk

    • catalog.data.gov
    • s.cnmilf.com
    • +1 more
    Updated Apr 11, 2025
    + more versions
    Cite
    PCoE (2025). Randomized Battery Usage 7: Low-Temperature Left-Skewed Random Walk [Dataset]. https://catalog.data.gov/dataset/randomized-battery-usage-7-low-temperature-left-skewed-random-walk
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    PCoE
    Description

    This dataset is part of a series of datasets, where batteries are continuously cycled with randomly generated current profiles. Reference charging and discharging cycles are also performed after a fixed interval of randomized usage to provide reference benchmarks for battery state of health. In this dataset, four 18650 Li-ion batteries (Identified as RW13, RW14, RW15 and RW16) were continuously operated by repeatedly charging them to 4.2V and then discharging them to 3.2V using a randomized sequence of discharging currents between 0.5A and 5A. This type of discharging profile is referred to here as random walk (RW) discharging. A customized probability distribution is used in this experiment to select a new load setpoint every 1 minute during RW discharging operation. The custom probability distribution was designed to be skewed towards selecting lower currents.

  12. Results and analysis using the Lean Six-Sigma define, measure, analyze,...

    • researchdata.up.ac.za
    docx
    Updated Mar 12, 2024
    Cite
    Modiehi Mophethe (2024). Results and analysis using the Lean Six-Sigma define, measure, analyze, improve, and control (DMAIC) Framework [Dataset]. http://doi.org/10.25403/UPresearchdata.25370374.v1
    Available download formats: docx
    Dataset updated
    Mar 12, 2024
    Dataset provided by
    University of Pretoria
    Authors
    Modiehi Mophethe
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This section presents a discussion of the research data. The data was received as secondary data; however, it was originally collected using time study techniques. Data validation is a crucial step in the data analysis process to ensure that the data is accurate, complete, and reliable. Descriptive statistics were used to validate the data. The mean, mode, standard deviation, variance and range provide a summary of the data distribution and assist in identifying outliers or unusual patterns. The data presented in the dataset show the measures of central tendency, which include the mean, median and mode. The mean signifies the average value of each of the factors presented in the tables. This is the balance point of the dataset, the typical value and behaviour of the dataset. The median is the middle value of the dataset for each of the factors presented. This is the point where the dataset is divided into two parts: half of the values lie below this value and the other half lie above it. This is important for skewed distributions. The mode shows the most common value in the dataset. It was used to describe the most typical observation. These values are important as they describe the central value around which the data is distributed. The mean, mode and median give an indication of a skewed distribution as they are neither equal nor close to one another. In the dataset, the results and a discussion of the results are also presented. This section focuses on the customisation of the DMAIC (Define, Measure, Analyse, Improve, Control) framework to address the specific concerns outlined in the problem statement. To gain a comprehensive understanding of the current process, value stream mapping was employed, which is further enhanced by measuring the factors that contribute to inefficiencies. These factors are then analysed and ranked based on their impact, utilising factor analysis.
    To mitigate the impact of the most influential factor on project inefficiencies, a solution is proposed using the EOQ (Economic Order Quantity) model. The implementation of the 'CiteOps' software facilitates improved scheduling, monitoring, and task delegation in the construction project through digitalisation. Furthermore, project progress and efficiency are monitored remotely and in real time. In summary, the DMAIC framework was tailored to suit the requirements of the specific project, incorporating techniques from inventory management, project management, and statistics to effectively minimise inefficiencies within the construction project.
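
The description above reads skew from the gap between mean, median and mode. A minimal sketch of that check follows, assuming Python's standard `statistics` module; the function names and the sample values are hypothetical and are not taken from the dataset. Comparing mean and median is only a rough heuristic, not a formal skewness test.

```python
import statistics


def central_tendency(values):
    """Return the (mean, median, mode) of a sequence of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        statistics.mode(values),
    )


def skew_direction(values):
    """Give a rough skew hint from the mean-median gap (heuristic only)."""
    mean, median, _ = central_tendency(values)
    if mean > median:
        return "right"  # a long upper tail pulls the mean above the median
    if mean < median:
        return "left"   # a long lower tail pulls the mean below the median
    return "none"


# Hypothetical task durations in minutes; illustrative values only.
durations = [4, 5, 5, 6, 7, 9, 20]
print(central_tendency(durations))
print(skew_direction(durations))
```

When the three measures sit far apart, as in this made-up sample, the median is usually the safer summary of a typical value.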

  13. Results for datasets with simulated error (derived from Datasets 1-5)

    • zenodo.org
    zip
    Updated Mar 29, 2024
    Cite
    Elise Rivett (2024). Results for datasets with simulated error (derived from Datasets 1-5) [Dataset]. http://doi.org/10.5281/zenodo.10888931
    Available download formats: zip
    Dataset updated
    Mar 29, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Elise Rivett
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets w/ simulated error derived from Datasets 1-5. There are 1000 datasets per combination of error level and skewness type.

    • e1: Low error, no skew
    • e2: Medium error, no skew
    • e3: High error, no skew
    • e4: Low error, left skew
    • e5: Medium error, left skew
    • e6: High error, left skew
    • e7: Low error, right skew
    • e8: Medium error, right skew
    • e9: High error, right skew
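
The nine labels above follow a regular pattern: three skewness types, each crossed with three error levels. A small sketch of how the codes could be reconstructed is shown here; the constant and function names are hypothetical, and the ordering is inferred only from the list itself.

```python
from itertools import product

# Orderings inferred from the e1..e9 list; these names are hypothetical.
SKEW_TYPES = ["no", "left", "right"]
ERROR_LEVELS = ["Low", "Medium", "High"]


def build_labels():
    """Map each code e1..e9 to its error level and skewness type."""
    labels = {}
    pairs = product(SKEW_TYPES, ERROR_LEVELS)  # skew varies slowest
    for index, (skew, error) in enumerate(pairs, start=1):
        labels[f"e{index}"] = f"{error} error, {skew} skew"
    return labels


LABELS = build_labels()
print(LABELS["e5"])
```

A lookup like this can help when iterating over the 1000 datasets per combination, but the archive's actual file naming may differ.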

  14. Skewness project raw data files and codes

    • figshare.com
    xlsx
    Updated Mar 14, 2022
    Cite
    Raunak Dey; Sreekanth K Manikandan (2022). Skewness project raw data files and codes [Dataset]. http://doi.org/10.6084/m9.figshare.17703269.v2
    Available download formats: xlsx
    Dataset updated
    Mar 14, 2022
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Raunak Dey; Sreekanth K Manikandan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains raw data files and base codes to analyze them.
    A. The 'powerx_y.xlsx' files are the data files with the one-dimensional trajectory of optically trapped probes modulated by an Ornstein-Uhlenbeck noise of given 'x' amplitude. For the corresponding diffusion amplitude A = 0.1 × (0.6 × 10^-6)^2 m^2/s, x is labelled as '1'.
    B. The codes are of three types. The skewness codes are used to calculate the skewness of the trajectory. The error_in_fit codes are used to calculate deviations from arcsine behavior. The sigma_exp codes point to the deviation of the mean from 0.5. All the codes are written three times to look at T+, Tlast and Tmax.
    C. More information can be found in the manuscript.

  15. Data and Code for: Robot Hubs: The Skewed Distribution of Robots in U.S....

    • openicpsr.org
    delimited
    Updated May 1, 2023
    Cite
    ERIK BRYNJOLFSSON; CATHERINE BUFFINGTON; NATHAN GOLDSCHLAG; J. FRANK LI; JAVIER MIRANDA; ROBERT SEAMANS (2023). Data and Code for: Robot Hubs: The Skewed Distribution of Robots in U.S. Manufacturing [Dataset]. http://doi.org/10.3886/E190543V1
    Available download formats: delimited
    Dataset updated
    May 1, 2023
    Dataset provided by
    American Economic Association
    Authors
    ERIK BRYNJOLFSSON; CATHERINE BUFFINGTON; NATHAN GOLDSCHLAG; J. FRANK LI; JAVIER MIRANDA; ROBERT SEAMANS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    New technologies drive productivity growth (Romer 1990). Preliminary evidence using national-level data from 17 countries between 1993 and 2007 suggests that robots, like prior generations of general-purpose technologies, are also driving productivity growth (Graetz & Michaels 2018). Moreover, according to data compiled by the International Federation of Robotics (IFR), since 2010 the number of robot shipments has nearly quadrupled from about 100,000 to almost 400,000 per year, so the impact of robots on the economy is likely even greater. However, while robotics may be contributing to GDP growth at a national level, scholars are still working to understand how robots affect employment and other outcomes at other levels of analysis, leading to calls for establishment-level measures of robots and other new technologies (Brynjolfsson & Mitchell 2017; Raj and Seamans 2019). To address this need, the U.S. Census Bureau, working with external researchers, developed a series of questions on the adoption and use of robots. These questions have subsequently been included in the Annual Survey of Manufactures and other Census surveys (Buffington, Miranda & Seamans 2018; Brynjolfsson et al. 2020; Acemoglu et al. 2022). In this paper we present results on the distribution of robots in U.S. manufacturing, using the new establishment-level microdata collected by the U.S. Census Bureau. We use the data to present several facts about the location and use of robots. We find that the distribution of robots is highly skewed across locations, even accounting for the different mix of industry and manufacturing employment. Some locations - which we call “Robot Hubs” - have many more robots than one would expect after accounting for industry and manufacturing employment. We characterize these Robot Hubs along several industry, demographic, and institutional dimensions, and find that the presence of robot integrators and union membership are distinguishing features of Robot Hubs.

  16. Data from: Body temperature distributions of active diurnal lizards in three...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Aug 4, 2018
    Cite
    Raymond B. Huey; Eric R. Pianka (2018). Body temperature distributions of active diurnal lizards in three deserts: skewed up or skewed down? [Dataset]. http://doi.org/10.5061/dryad.45g3s
    Available download formats: zip
    Dataset updated
    Aug 4, 2018
    Dataset provided by
    The University of Texas at Austin
    University of Washington
    Authors
    Raymond B. Huey; Eric R. Pianka
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    North America, Africa, Australia
    Description
    1. The performance of ectotherms integrated over time depends in part on the position and shape of the distribution of body temperatures (Tb) experienced during activity. For several complementary reasons, physiological ecologists have long expected that Tb distributions during activity should have a long left tail (left-skewed); but only infrequently have they quantified the magnitude and direction of Tb skewness in nature.
    2. To evaluate whether left-skewed Tb distributions are general for diurnal desert lizards, we compiled and analyzed Tb (∑ = 9,023 temperatures) from our own prior studies of active desert lizards on three continents (25 species in Western Australia, 10 in the Kalahari Desert of Africa, and 10 species in western North America). We gathered these data over several decades, using standardized techniques.
    3. Many species showed significantly left-skewed Tb distributions, even when records were restricted to summer months. However, magnitudes of skewness were always small, such that mean Tb were never more than 1°C lower than median Tb. The significance of Tb skewness was sensitive to sample size, and power tests reinforced this sensitivity.
    4. The magnitude of skewness was not obviously related to phylogeny, desert, body size, or median body temperature. Moreover, formal phylogenetic analysis is inappropriate because geography and phylogeny are confounded (that is, are highly collinear).
    5. Skewness might be limited if lizards pre-warm inside retreats before emerging in the morning, emerge only when operative temperatures are high enough to speed warming to activity Tb, or if cold lizards are especially wary and difficult to spot or catch. Telemetry studies may help evaluate these possibilities.

  17. Data from: Evolution of quantitative traits under a migration-selection...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1 more
    zip
    Updated Jul 21, 2015
    Cite
    Florence Débarre; Sam Yeaman; Frédéric Guillaume (2015). Evolution of quantitative traits under a migration-selection balance: when does skew matter? [Dataset]. http://doi.org/10.5061/dryad.ms52b
    Available download formats: zip
    Dataset updated
    Jul 21, 2015
    Authors
    Florence Débarre; Sam Yeaman; Frédéric Guillaume
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Quantitative-genetic models of differentiation under migration-selection balance often rely on the assumption of normally distributed genotypic and phenotypic values. When a population is subdivided into demes with selection toward different local optima, migration between demes may result in asymmetric, or skewed, local distributions. Using a simplified two-habitat model, we derive formulas without a priori assuming a Gaussian distribution of genotypic values, and we find expressions that naturally incorporate higher moments, such as skew. These formulas yield predictions of the expected divergence under migration-selection balance that are more accurate than models assuming Gaussian distributions, which illustrates the importance of incorporating these higher moments to assess the response to selection in heterogeneous environments. We further show with simulations that traits with loci of large effect display the largest skew in their distribution at migration-selection balance.

  18.

    Replication data for: does “very” make a difference? effects of intensifiers...

    • service.tib.eu
    Updated May 16, 2025
    (2025). Replication data for: does “very” make a difference? effects of intensifiers in item stems of employee attitude surveys on response behavior [Dataset]. https://service.tib.eu/ldmservice/dataset/osn-doi-10-26249-fk2-oexojh
    Dataset updated
    May 16, 2025
    Description

    Abstract: Employee attitude surveys are important tools for organizational development. To gain insights into employees’ attitudes, surveys most often use Likert-type items. Measures assessing these attitudes frequently use intensifiers (e.g., extremely, very) in item stems. To date, little is known about the effects of intensifiers in the item stem on response behavior. They are frequently used inconsistently, which potentially has implications for the comparability of results in the context of benchmarking. Also, results often suffer from left-skewed distributions that limit data quality, for which the use of intensifiers potentially offers a remedy. Therefore, we systematically examine the effects of intensifiers on response behavior in employee attitude surveys and their potential to remedy the issue of left-skewed distributions. In three studies, we assess effects on level, skewness, and nomological structure. Study 1 examines the effects of intensifier strength in the item stem, while Studies 2 and 3 assess whether intensifier salience would increase these effects further. Interestingly, results did not show systematic effects. Future research ideas regarding item design and processing, as well as practical implications for the design of employee attitude surveys, are discussed.

    Other: Does “very” make a difference? Effects of intensifiers in item stems of employee attitude surveys on response behavior (in preparation)

  19.

    Data from: Improving structured population models with more realistic...

    • explore.openaire.eu
    • search.dataone.org
    Updated Jun 14, 2019
    Megan L. Peterson; William Morris; Cristina Linares; Daniel Doak (2019). Data from: Improving structured population models with more realistic representations of non-normal growth [Dataset]. http://doi.org/10.5061/dryad.t6c3573
    Dataset updated
    Jun 14, 2019
    Authors
    Megan L. Peterson; William Morris; Cristina Linares; Daniel Doak
    Description
    1. Structured population models are among the most widely used tools in ecology and evolution. Integral projection models (IPMs) use continuous representations of how survival, reproduction, and growth change as functions of state variables such as size, requiring fewer parameters to be estimated than projection matrix models (PPMs). Yet almost all published IPMs make an important assumption: that size-dependent growth transitions are or can be transformed to be normally distributed. In fact, many organisms exhibit highly skewed size transitions. Small individuals can grow more than they can shrink, and large individuals may often shrink more dramatically than they can grow. Yet the implications of such skew for inference from IPMs have not been explored, nor have general methods been developed to incorporate skewed size transitions into IPMs, or deal with other aspects of real growth rates, including bounds on possible growth or shrinkage.

    2. Here we develop a flexible approach to modeling skewed growth data using a modified beta regression model. We propose that sizes first be converted to a (0,1) interval by estimating size-dependent minimum and maximum sizes through quantile regression. Transformed data can then be modeled using beta regression with widely available statistical tools. We demonstrate the utility of this approach using demographic data for a long-lived plant, gorgonians, and an epiphytic lichen. Specifically, we compare inferences of population parameters from discrete PPMs to those from IPMs that either assume normality or incorporate skew using beta regression or, alternatively, a skewed normal model.

    3. The beta and skewed normal distributions accurately capture the mean, variance, and skew of real growth distributions. Incorporating skewed growth into IPMs decreases population growth and estimated lifespan relative to IPMs that assume normally distributed growth, and more closely approximates the parameters of PPMs that do not assume a particular growth distribution. A bounded distribution, such as the beta, also avoids the eviction problem caused by predicting some growth outside the modeled size range.

    4. Incorporating biologically relevant skew in growth data has important consequences for inference from IPMs. The approaches we outline here are flexible and easy to implement with existing statistical tools.

    Bistort raw data: Demographic data for Polygonum viviparum collected at Niwot Ridge, CO from 2001-2011. szs0 = size at time t, szs1 = size at time t+1, bulbs0 = number of bulbils produced at time t. Details of data collection given in the supporting information.

    Gorgonian raw data: Demographic data for Paramuricea clavata collected in the NW Mediterranean Sea from 1999-2004. Mortality = source of mortality, Site = site, Plot = plot, Year = annual transition from time t to time t+1, Ncol = colony id, Size = size at time t, Sizenext = size at time t+1, Survnext = dead (0) or alive (1) at time t+1. Details of data collection and reproduction given in supporting information.

    Vulpicida raw data: Demographic data for Vulpicida pinastri collected in Kennicott Valley, AK from 2004-2009. Year = year at time t, site = site, t0 = size at time t, t1 = size at time t+1, repro = number of offspring assigned based on thallus circumference, survival = dead (0) or alive (1) at time t+1. Details of data collection and reproduction are given in supporting information.

    Appendix 1: R script for analyses in Figure 2
    Appendix 2: R script for analyses of coral
    Appendix 3: R script to simultaneously fit the minimum, maximum, mean, and precision parameters of the beta approach using maximum likelihood
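
As a rough illustration of the rescaling step described in this abstract, the Python sketch below maps sizes onto the open (0, 1) interval given lower and upper size bounds, and computes a simple moment-based skewness. It is not the authors' code (their R scripts are listed in the appendices). The function names, the `eps` guard, and the assumption that the bounds are already supplied rather than estimated by quantile regression are all hypothetical simplifications.

```python
def rescale_to_unit_interval(sizes, lowers, uppers, eps=1e-6):
    """Map each size onto (0, 1) using its own lower and upper bound.

    Assumption: the bounds are given. In the abstract they are estimated
    by quantile regression, which this sketch does not attempt.
    """
    scaled = []
    for size, lo, hi in zip(sizes, lowers, uppers):
        if hi <= lo:
            raise ValueError("upper bound must exceed lower bound")
        value = (size - lo) / (hi - lo)
        # A beta distribution is defined on the open interval, so values
        # at or beyond the bounds are nudged inside by eps.
        scaled.append(min(max(value, eps), 1.0 - eps))
    return scaled


def sample_skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical sizes at time t+1 with constant bounds of 2 and 8.
scaled = rescale_to_unit_interval([2.0, 5.0, 8.0], [2.0] * 3, [8.0] * 3)
print(scaled)
print(sample_skewness([1.0, 1.0, 1.0, 10.0]))
```

The rescaled values could then be passed to any beta regression routine. A positive skewness value indicates a longer right tail, and a negative value a longer left tail.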
  20.

    Skew-T Plots: Boise

    • data.ucar.edu
    image
    Updated Aug 1, 2025
    Research Applications Laboratory (RAL); NCAR (2025). Skew-T Plots: Boise [Dataset]. http://doi.org/10.26023/B3PP-7VG2-5Z0S
    Available download formats: image
    Dataset updated
    Aug 1, 2025
    Authors
    Research Applications Laboratory (RAL); NCAR
    Time period covered
    Nov 8, 2007 - Jan 4, 2008
    Description

    This dataset contains upper-air Skew-T Log-P charts taken at Boise, Idaho during the ICE-L project. The images are in GIF format and cover the time span from 2007-11-08 12:00:00 to 2008-01-03 12:00:00.

