Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary Material 3: A supplementary file with examples of SAS script for all models that have been fitted in this paper.
"NewEngland_pkflows.PRT" is a text file that contains results of flood-frequency analysis of annual peak flows from 186 selected streamflow gaging stations (streamgages) operated by the U.S. Geological Survey (USGS) in the New England region (Maine, Connecticut, Massachusetts, Rhode Island, New York, New Hampshire, and Vermont). Only streamgages in the region that were also in the USGS "GAGES II" database (https://water.usgs.gov/GIS/metadata/usgswrd/XML/gagesII_Sept2011.xml) were considered for use in the study. The file was generated by combining PeakFQ output (.PRT) files created using version 7.0 of USGS software PeakFQ (https://water.usgs.gov/software/PeakFQ/; Veilleux and others, 2014) to conduct flood-frequency analyses using the Expected Moments Algorithm (England and others, 2018). The peak-flow files used as input to PeakFQ were obtained from the USGS National Water Information System (NWIS) database (https://nwis.waterdata.usgs.gov/usa/nwis/peak) and contained annual peak flows ending in water year 2011. Results of the flood-frequency analyses were used to estimate skewness of annual peak flows in the New England region using Bayesian Weighted Least Squares / Bayesian Generalized Least Squares (B-WLS / B-GLS) regression (Veilleux and others, 2019).
To improve flood-frequency estimates at rural streams in Mississippi, annual exceedance probability (AEP) flows at gaged streams in Mississippi, and regional-regression equations used to estimate AEP flows for ungaged streams in Mississippi, were developed by using current geospatial data, additional statistical methods, and annual peak-flow data through the 2013 water year. The regional-regression equations were derived from statistical analyses of peak-flow data, basin characteristics associated with 281 streamgages, the generalized skew from Bulletin 17B (Interagency Advisory Committee on Water Data, 1982), and a newly developed study-specific skew for select four-digit hydrologic unit code (HUC4) watersheds in Mississippi. Four flood regions were identified based on residuals from the regional-regression analyses. No analysis was conducted for streams in the Mississippi Alluvial Plain flood region because of a lack of long-term streamflow data and poorly defined basin characteristics. Flood regions containing sites with similar basin and climatic characteristics yielded better regional-regression equations with lower error percentages. The generalized least squares method was used to develop the final regression models for each flood region for AEP flows. The peak-flow statistics were estimated by fitting a log-Pearson type III distribution to records of annual peak flows and then applying two additional statistical methods: (1) the expected moments algorithm to help describe uncertainty in annual peak flows and to better represent missing and historical record; and (2) the generalized multiple Grubbs-Beck test to screen out potentially influential low outliers and to better fit the upper end of the peak-flow distribution. Standard errors of prediction of the generalized least-squares models ranged from 28 to 46 percent. Pseudo coefficients of determination of the models ranged from 91 to 96 percent.
Flood Region A, located in north-central Mississippi, contained 27 streamgages with drainage areas that ranged from 1.41 to 612 square miles. The 1-percent annual exceedance probability flow had a standard error of prediction of 31 percent, which was lower than the prediction errors in Flood Regions B and C.
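The core step described above, fitting a log-Pearson Type III distribution to the logarithms of annual peak flows, can be sketched as follows. This is a simplified method-of-moments illustration only; the study itself used the Expected Moments Algorithm and Grubbs-Beck low-outlier screening, and the function name and station-skew-only weighting here are assumptions.

```python
import numpy as np
from scipy import stats

def lp3_aep_flow(peaks, aep):
    """Flow estimate for a given annual exceedance probability (AEP)
    from a log-Pearson Type III fit by sample moments. A simplified
    sketch: it omits the Expected Moments Algorithm and the
    generalized multiple Grubbs-Beck test used in the study."""
    logq = np.log10(np.asarray(peaks, dtype=float))
    mean, std = logq.mean(), logq.std(ddof=1)
    skew = stats.skew(logq, bias=False)        # station skew only
    k = stats.pearson3.ppf(1.0 - aep, skew)    # frequency factor
    return 10.0 ** (mean + k * std)
```

For example, `lp3_aep_flow(peaks, 0.01)` would give the 1-percent AEP (100-year) flow for a record of annual peaks.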
While classical measurement error in the dependent variable in a linear regression framework results only in a loss of precision, nonclassical measurement error can lead to biased estimates and underpowered inference. Here, we consider a particular type of nonclassical measurement error: skewed errors. Unfortunately, skewed measurement error is likely to be a relatively common feature of many outcomes of interest in political science research. This study highlights the bias that can result even from relatively "small" amounts of skewed measurement error, particularly if the measurement error is heteroskedastic. We also assess potential solutions to this problem, focusing on the stochastic frontier model and nonlinear least squares. Simulations and three replications highlight the importance of thinking carefully about skewed measurement error, as well as appropriate solutions.
In the fixed-effects stochastic frontier model an efficiency measure relative to the best firm in the sample is universally employed. This paper considers a new measure relative to the worst firm in the sample. We find that estimates of this measure have smaller bias than those of the traditional measure when the sample consists of many firms near the efficient frontier. Moreover, a two-sided measure relative to both the best and the worst firms is proposed. Simulations suggest that the new measures may be preferred depending on the skewness of the inefficiency distribution and the scale of efficiency differences.
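One plausible formalization of these measures, assuming the firm effects alpha_i have already been estimated from the fixed-effects model, can be sketched as follows; the exact definitions in the paper may differ, and the geometric-mean combination is an illustrative stand-in for the proposed two-sided measure.

```python
import numpy as np

def efficiency_measures(alpha):
    """Technical-efficiency measures from fixed-effects stochastic
    frontier estimates alpha_i. 'rel_best' is the standard measure,
    exp(alpha_i - max alpha); 'rel_worst' compares each firm to the
    worst one; 'two_sided' is an illustrative geometric-mean
    combination, not necessarily the paper's exact definition."""
    alpha = np.asarray(alpha, dtype=float)
    rel_best = np.exp(alpha - alpha.max())    # in (0, 1], equals 1 for the best firm
    rel_worst = np.exp(alpha - alpha.min())   # >= 1, equals 1 for the worst firm
    two_sided = np.sqrt(rel_best * rel_worst)
    return rel_best, rel_worst, two_sided
```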
AUC of the ROC curves from the simulation on skew-normal distributed data.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
GC skew denotes the relative excess of G nucleotides over C nucleotides on the leading versus the lagging replication strand of eubacteria. While the effect is small, typically around 2.5%, it is robust and pervasive. GC skew and the analogous TA skew are a localized deviation from Chargaff's second parity rule, which states that G and C, and T and A occur with (mostly) equal frequency even within a strand.
Most bacteria also show the analogous TA skew. Different phyla show different kinds of skew and differing relations between TA and GC skew.
This article introduces an open access database (https://skewdb.org) of GC and 10 other skews for over 28,000 chromosomes and plasmids. Further details like codon bias, strand bias, strand lengths and taxonomic data are also included.
The SkewDB database can be used to generate or verify hypotheses. Since the origins of both the second parity rule, as well as GC skew itself, are not yet satisfactorily explained, such a database may enhance our understanding of microbial DNA.
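The skew statistic itself is simple to compute. A minimal windowed GC-skew calculation, (G - C)/(G + C) per window, might look like the following; the function name, window size, and step are illustrative, not SkewDB's actual parameters.

```python
def gc_skew(seq, window=1000, step=500):
    """Compute GC skew, (G - C) / (G + C), in sliding windows along a
    DNA sequence. A sign change in skew along the chromosome is what
    distinguishes the leading from the lagging replication strand."""
    seq = seq.upper()
    skews = []
    for start in range(0, max(len(seq) - window + 1, 1), step):
        win = seq[start:start + window]
        g, c = win.count("G"), win.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews
```

The analogous TA skew is obtained by counting T and A instead of G and C.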
This dataset contains upper-air Skew-T Log-P charts taken at Denver during the HIPPO-3 project. The imagery is in GIF format and covers the time span from 2010-03-17 00:00:00 to 2010-04-19 12:00:00.
Reproducibility package for the article "Reaction times and other skewed distributions: problems with the mean and the median", Guillaume A. Rousselet & Rand R. Wilcox. Preprint: https://psyarxiv.com/3y54r; doi: 10.31234/osf.io/3y54r. This package contains all the code and data to reproduce the figures and analyses in the article.
Observed phenotypic responses to selection in the wild often differ from predictions based on measurements of selection and genetic variance. An overlooked hypothesis to explain this paradox of stasis is that a skewed phenotypic distribution affects natural selection and evolution. We show through mathematical modelling that, when a trait selected for an optimum phenotype has a skewed distribution, directional selection is detected even at evolutionary equilibrium, where it causes no change in the mean phenotype. When environmental effects are skewed, Lande and Arnold’s (1983) directional gradient is in the direction opposite to the skew. In contrast, skewed breeding values can displace the mean phenotype from the optimum, causing directional selection in the direction of the skew. These effects can be partitioned out using alternative selection estimates based on average derivatives of individual relative fitness, or additive genetic covariances between relative fitness and trait (Robe...
https://www.usa.gov/government-works
This dataset is part of a series of datasets, where batteries are continuously cycled with randomly generated current profiles. Reference charging and discharging cycles are also performed after a fixed interval of randomized usage to provide reference benchmarks for battery state of health.
In this dataset, four 18650 Li-ion batteries (Identified as RW13, RW14, RW15 and RW16) were continuously operated by repeatedly charging them to 4.2V and then discharging them to 3.2V using a randomized sequence of discharging currents between 0.5A and 5A. This type of discharging profile is referred to here as random walk (RW) discharging. A customized probability distribution is used in this experiment to select a new load setpoint every 1 minute during RW discharging operation. The custom probability distribution was designed to be skewed towards selecting lower currents.
This paper uses extreme value theory to study the implications of skewness risk for nominal loan contracts in a production economy. Productivity and inflation innovations are drawn from generalized extreme value distributions. The model is solved using a third-order perturbation and estimated by the simulated method of moments. Results show that the data reject the hypothesis that innovations are drawn from normal distributions and favor instead the alternative that they are drawn from asymmetric distributions. Estimates indicate that skewness risk accounts for 12% of the risk premia and reduces bond yields by approximately 55 basis points. For a bond that pays 1 dollar at maturity, the adjustment factor associated with skewness risk ranges from 0.15 cents for a 3-month bond to 2.05 cents for a 5-year bond.
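The distributional assumption at the heart of the model, innovations drawn from a generalized extreme value (GEV) family, can be sketched with SciPy; the shape value below is illustrative, not an estimate from the paper.

```python
import numpy as np
from scipy import stats

# SciPy's genextreme uses shape c = -xi: c < 0 corresponds to the
# heavy-right-tailed (Frechet-type) case, producing positively
# skewed innovations. The value -0.2 is purely illustrative.
rng = np.random.default_rng(7)
c = -0.2
eps = stats.genextreme.rvs(c, loc=0.0, scale=1.0, size=10000, random_state=rng)

innovation_skew = stats.skew(eps)  # positive: shocks are asymmetric
```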
Conditional heteroskedasticity, skewness and leverage effects are well-known features of financial returns. The literature on factor models has often made assumptions that preclude the three effects from occurring simultaneously. In this paper I propose a conditionally heteroskedastic factor model that takes into account the presence of both the conditional skewness and leverage effects. This model is specified in terms of conditional moment restrictions, and unconditional moment conditions are proposed allowing inference by the generalized method of moments (GMM). The model is also shown to be closed under temporal aggregation. An application to daily excess returns on sectorial indices from the UK stock market provides strong evidence for dynamic conditional skewness and leverage, with a sharp efficiency gain resulting from accounting for both effects. The estimated volatility persistence from the proposed model is lower than that estimated from models that rule out such effects. I also find that the longer the returns' horizon, the fewer conditionally heteroskedastic factors may be required for suitable modeling and the less strong is the evidence for dynamic leverage. Some of these results are in line with the main findings of Harvey and Siddique (1999) and Jondeau and Rockinger (2003), namely that accounting for conditional skewness impacts the persistence in the conditional variance of the return process.
Psychological data often violate the normality assumptions made by commonly used statistical methods. These violations are addressed in a variety of ways such as transformations or assuming the employed method is robust to violations. Here we argue that data transformations are unnecessary at best and severely misleading at worst. An alternative approach is to use a Bayesian model that permits skewness and other perturbations to classical assumptions (e.g., heteroskedasticity). Through simulation, we demonstrate that a Bayesian skew-normal model has optimal frequentist properties (i.e., "type 1" error, "power", unbiasedness) compared to normal-assumptive models with or without transformation. Furthermore, the Bayesian skew-normal model has greater predictive utility, as indicated by posterior predictive checking and approximate leave-one-out cross-validation. After an applied example, we discuss practical implications of our findings for psychological science in general, and specifically how Bayesian modeling can improve reproducibility in psychology.
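An illustrative simulation of why the skewness matters here (not the paper's exact design; the shape value and sample size are arbitrary):

```python
import numpy as np
from scipy import stats

# Draw data from a skew-normal distribution; a normal-assumptive
# model ignores the asymmetry that is plainly present in such samples.
rng = np.random.default_rng(1)
a = 6.0                                  # skew-normal shape parameter
x = stats.skewnorm.rvs(a, size=5000, random_state=rng)

sample_skew = stats.skew(x)              # clearly positive for a > 0
```

A model that permits a nonzero shape parameter, like the Bayesian skew-normal model the abstract describes, can fit such data without transformation.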
Included is the supplementary data for Smith, B. T., Mauck, W. M., Benz, B., & Andersen, M. J. (2018). Uneven missing data skew phylogenomic relationships within the lories and lorikeets. BioRxiv, 398297.
The resolution of the Tree of Life has accelerated with advances in DNA sequencing technology. To achieve dense taxon sampling, it is often necessary to obtain DNA from historical museum specimens to supplement modern genetic samples. However, DNA from historical material is generally degraded, which presents various challenges. In this study, we evaluated how the coverage at variant sites and missing data among historical and modern samples impacts phylogenomic inference. We explored these patterns in the brush-tongued parrots (lories and lorikeets) of Australasia by sampling ultraconserved elements in 105 taxa. Trees estimated with low coverage characters had several clades where relationships appeared to be influenced by whether the sample came from historical or modern specimens, which were not observed when more stringent filtering was applied. To assess if the topologies were affected by missing data, we performed an outlier analysis of sites and loci, and a data reduction approach where we excluded sites based on data completeness. Depending on the outlier test, 0.15% of total sites or 38% of loci were driving the topological differences among trees, and at these sites, historical samples had 10.9x more missing data than modern ones. In contrast, 70% data completeness was necessary to avoid spurious relationships. Predictive modeling found that outlier analysis scores were correlated with parsimony informative sites in the clades whose topologies changed the most by filtering. After accounting for biased loci and understanding the stability of relationships, we inferred a more robust phylogenetic hypothesis for lories and lorikeets.
We construct a copula from the skew t distribution of Sahu et al. (2003). This copula can capture asymmetric and extreme dependence between variables, and is one of the few copulas that can do so and still be used in high dimensions effectively. However, it is difficult to estimate the copula model by maximum likelihood when the multivariate dimension is high, or when some or all of the marginal distributions are discrete-valued, or when the parameters in the marginal distributions and copula are estimated jointly. We therefore propose a Bayesian approach that overcomes all these problems. The computations are undertaken using a Markov chain Monte Carlo simulation method which exploits the conditionally Gaussian representation of the skew t distribution. We employ the approach in two contemporary econometric studies. The first is the modelling of regional spot prices in the Australian electricity market. Here, we observe complex non-Gaussian margins and nonlinear inter-regional dependence. Accurate characterization of this dependence is important for the study of market integration and risk management purposes. The second is the modelling of ordinal exposure measures for 15 major websites. Dependence between websites is important when measuring the impact of multi-site advertising campaigns. In both cases the skew t copula substantially outperforms symmetric elliptical copula alternatives, demonstrating that the skew t copula is a powerful modelling tool when coupled with Bayesian inference.
No description was included in this dataset collected from the OSF.
Review of Economics and Statistics: Forthcoming.
Regressors of interest: Z-scores and Talairach coordinates for peak activation foci. The Variance contrast compared high-variance versus low-variance gambles (High-Variance + Positive-Skew + Negative-Skew > Low-Variance). The Skewness contrast compared skewed versus symmetric gambles of equal variance (Positive-Skew + Negative-Skew > High-Variance). The Positive Skewness contrast compared positively skewed versus negatively skewed gambles (Positive-Skew > Negative-Skew). Regions surpassed a threshold of Z > 3.28 (p