Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains raw data files and base codes to analyze them.
A. The 'powerx_y.xlsx' files are the data files with the one-dimensional trajectory of optically trapped probes modulated by an Ornstein-Uhlenbeck noise of given 'x' amplitude. For the corresponding diffusion amplitude A = 0.1 × (0.6 × 10^-6)^2 m^2/s, x is labelled as '1'.
B. The codes are of three types. The skewness codes are used to calculate the skewness of the trajectory. The error_in_fit codes are used to calculate deviations from arcsine behavior. The sigma_exp codes point to the deviation of the mean from 0.5. All the codes are written three times to look at T+, Tlast and Tmax.
C. More information can be found in the manuscript.
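As a rough illustration of the quantities named in this description, the sketch below computes the sample skewness of a one-dimensional trajectory and two arcsine-law observables. It assumes T+ means the fraction of time spent above zero and Tmax the time at which the maximum is reached, which are the usual readings; the function names are hypothetical and the repository's own codes may differ.

```python
def sample_skewness(values):
    """Biased sample skewness g1 = m3 / m2**1.5 using central moments."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("trajectory has zero variance")
    return m3 / m2 ** 1.5


def fraction_time_positive(trajectory):
    """Assumed meaning of T+: fraction of samples with position > 0."""
    return sum(1 for x in trajectory if x > 0) / len(trajectory)


def fraction_time_of_maximum(trajectory):
    """Assumed meaning of Tmax: index of the first maximum, scaled to [0, 1]."""
    return trajectory.index(max(trajectory)) / (len(trajectory) - 1)


if __name__ == "__main__":
    demo = [0.0, 1.0, 3.0, 2.0, -1.0]  # made-up positions, not real data
    print(sample_skewness(demo))
    print(fraction_time_positive(demo))
    print(fraction_time_of_maximum(demo))
```

For a symmetric process, the mean of each of these fractions over many trajectories is expected to be 0.5, which is the reference value the sigma_exp codes are described as checking against.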
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the fixed-effects stochastic frontier model an efficiency measure relative to the best firm in the sample is universally employed. This paper considers a new measure relative to the worst firm in the sample. We find that estimates of this measure have smaller bias than those of the traditional measure when the sample consists of many firms near the efficient frontier. Moreover, a two-sided measure relative to both the best and the worst firms is proposed. Simulations suggest that the new measures may be preferred depending on the skewness of the inefficiency distribution and the scale of efficiency differences.
This dataset contains upper air Skew-T Log-P charts taken at Boise, Idaho during the ICE-L project. The imagery is in GIF format and covers the time span from 2007-11-08 12:00:00 to 2008-01-03 12:00:00.
https://spdx.org/licenses/CC0-1.0.html
Included is the supplementary data for Smith, B. T., Mauck, W. M., Benz, B., & Andersen, M. J. (2018). Uneven missing data skews phylogenomic relationships within the lories and lorikeets. BioRxiv, 398297. The resolution of the Tree of Life has accelerated with advances in DNA sequencing technology. To achieve dense taxon sampling, it is often necessary to obtain DNA from historical museum specimens to supplement modern genetic samples. However, DNA from historical material is generally degraded, which presents various challenges. In this study, we evaluated how the coverage at variant sites and missing data among historical and modern samples impacts phylogenomic inference. We explored these patterns in the brush-tongued parrots (lories and lorikeets) of Australasia by sampling ultraconserved elements in 105 taxa. Trees estimated with low coverage characters had several clades where relationships appeared to be influenced by whether the sample came from historical or modern specimens, which were not observed when more stringent filtering was applied. To assess if the topologies were affected by missing data, we performed an outlier analysis of sites and loci, and a data reduction approach where we excluded sites based on data completeness. Depending on the outlier test, 0.15% of total sites or 38% of loci were driving the topological differences among trees, and at these sites, historical samples had 10.9x more missing data than modern ones. In contrast, 70% data completeness was necessary to avoid spurious relationships. Predictive modeling found that outlier analysis scores were correlated with parsimony informative sites in the clades whose topologies changed the most by filtering. After accounting for biased loci and understanding the stability of relationships, we inferred a more robust phylogenetic hypothesis for lories and lorikeets.
"NewEngland_pkflows.PRT" is a text file that contains results of flood-frequency analysis of annual peak flows from 186 selected streamflow gaging stations (streamgages) operated by the U.S. Geological Survey (USGS) in the New England region (Maine, Connecticut, Massachusetts, Rhode Island, New York, New Hampshire, and Vermont). Only streamgages in the region that were also in the USGS "GAGES II" database (https://water.usgs.gov/GIS/metadata/usgswrd/XML/gagesII_Sept2011.xml) were considered for use in the study. The file was generated by combining PeakFQ output (.PRT) files created using version 7.0 of USGS software PeakFQ (https://water.usgs.gov/software/PeakFQ/; Veilleux and others, 2014) to conduct flood-frequency analyses using the Expected Moments Algorithm (England and others, 2018). The peak-flow files used as input to PeakFQ were obtained from the USGS National Water Information System (NWIS) database (https://nwis.waterdata.usgs.gov/usa/nwis/peak) and contained annual peak flows ending in water year 2011. Results of the flood-frequency analyses were used to estimate skewness of annual peak flows in the New England region using Bayesian Weighted Least Squares / Bayesian Generalized Least Squares (B-WLS / B-GLS) regression (Veilleux and others, 2019).
GC skew denotes the relative excess of G nucleotides over C nucleotides on the leading versus the lagging replication strand of eubacteria. While the effect is small, typically around 2.5%, it is robust and pervasive. GC skew and the analogous TA skew are a localized deviation from Chargaff’s second parity rule, which states that G and C, and T and A occur with (mostly) equal frequency even within a strand.
Most bacteria also show the analogous TA skew. Different phyla show different kinds of skew and differing relations between TA and GC skew. This article introduces an open access database (https://skewdb.org) of GC and 10 other skews for over 28,000 chromosomes and plasmids. Further details like codon bias, strand bias, strand lengths and taxonomic data are also included.
The SkewDB database can be used to generate or verify hypotheses. Since the origins of both the second parity rule, as well as GC skew itself, are not yet satisfactorily explained, such a database may enhance...
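The GC skew described above is conventionally computed as (G - C) / (G + C) over a stretch of sequence. The sketch below shows that calculation per window and cumulatively along one strand. The function names and the toy sequence are illustrative assumptions; this is not the SkewDB implementation.

```python
def gc_skew(sequence):
    """Return (G - C) / (G + C) for a sequence, or 0.0 if it has no G or C."""
    seq = sequence.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def windowed_gc_skew(sequence, window):
    """GC skew in consecutive non-overlapping windows of the given size."""
    return [
        gc_skew(sequence[start:start + window])
        for start in range(0, len(sequence) - window + 1, window)
    ]


def cumulative_gc_skew(sequence):
    """Running sum of +1 per G and -1 per C along the strand."""
    total = 0
    running = []
    for base in sequence.upper():
        if base == "G":
            total += 1
        elif base == "C":
            total -= 1
        running.append(total)
    return running


if __name__ == "__main__":
    toy = "GGCCGGGGATAT"  # made-up sequence, not real genomic data
    print(windowed_gc_skew(toy, 4))
    print(cumulative_gc_skew(toy))
```

On a real chromosome the cumulative curve typically changes direction near the replication origin and terminus, which is why it is a common way to visualize the leading and lagging strands.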
To improve flood-frequency estimates at rural streams in Mississippi, annual exceedance probability (AEP) flows at gaged streams in Mississippi and regional-regression equations, used to estimate annual exceedance probability flows for ungaged streams in Mississippi, were developed by using current geospatial data, additional statistical methods, and annual peak-flow data through the 2013 water year. The regional-regression equations were derived from statistical analyses of peak-flow data, basin characteristics associated with 281 streamgages, the generalized skew from Bulletin 17B (Interagency Advisory Committee on Water Data, 1982), and a newly developed study-specific skew for select four-digit hydrologic unit code (HUC4) watersheds in Mississippi. Four flood regions were identified based on residuals from the regional-regression analyses. No analysis was conducted for streams in the Mississippi Alluvial Plain flood region because of a lack of long-term streamflow data and poorly defined basin characteristics. Flood regions containing sites with similar basin and climatic characteristics yielded better regional-regression equations with lower error percentages. The generalized least squares method was used to develop the final regression models for each flood region for annual exceedance probability flows. The peak-flow statistics were estimated by fitting a log-Pearson type III distribution to records of annual peak flows and then applying two additional statistical methods: (1) the expected moments algorithm to help describe uncertainty in annual peak flows and to better represent missing and historical record; and (2) the generalized multiple Grubbs-Beck test to screen out potentially influential low outliers and to better fit the upper end of the peak-flow distribution. Standard errors of prediction of the generalized least-squares models ranged from 28 to 46 percent. Pseudo coefficients of determination of the models ranged from 91 to 96 percent. 
Flood Region A, located in north-central Mississippi, contained 27 streamgages with drainage areas that ranged from 1.41 to 612 square miles. The 1% annual exceedance probability had a standard error of prediction of 31 percent which was lower than the prediction errors in Flood Regions B and C.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A single regression model is unlikely to hold throughout a large and complex spatial domain. A finite mixture of regression models can address this issue by clustering the data and assigning a regression model to explain each homogenous group. However, a typical finite mixture of regressions does not account for spatial dependencies. Furthermore, the number of components selected can be too high in the presence of skewed data and/or heavy tails. Here, we propose a mixture of regression models on a Markov random field with skewed distributions. The proposed model identifies the locations wherein the relationship between the predictors and the response is similar and estimates the model within each group as well as the number of groups. Overfitting is addressed by using skewed distributions, such as the skew-t or normal inverse Gaussian, in the error term of each regression model. Model estimation is carried out using an EM algorithm, and the performance of the estimators and model selection are illustrated through an extensive simulation study and two case studies.
This dataset contains upper air Skew-T Log-P data collected at Denver during the HIPPO project. The imagery is in GIF format and covers the time span from 2009-01-06 00:00:00 to 2009-02-01 12:00:00.
skew_indices
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A permanent copy of a data repository resulting from project 0-6977, Update Texas Skew Coefficients, including interim reports and the final report (as PDF and source). When the data are final this sentence will be deleted.
https://spdx.org/licenses/CC0-1.0.html
While classical measurement error in the dependent variable in a linear regression framework results only in a loss of precision, nonclassical measurement error can lead to estimates which are biased and inference which lacks power. Here, we consider a particular type of nonclassical measurement error: skewed errors. Unfortunately, skewed measurement error is likely to be a relatively common feature of many outcomes of interest in political science research. This study highlights the bias that can result even from relatively "small" amounts of skewed measurement error, particularly if the measurement error is heteroskedastic. We also assess potential solutions to this problem, focusing on the stochastic frontier model and nonlinear least squares. Simulations and three replications highlight the importance of thinking carefully about skewed measurement error, as well as appropriate solutions.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset accompanies the paper titled "Unified Actuator Nonlinear Dynamic Inversion controller for the Variable Skew Quad Plane." It includes flight test data, simulation data, post-processing scripts, and derivations. The presented controller is demonstrated to be superior in tracking position and attitude trajectories compared to an Incremental Nonlinear Dynamic Inversion controller. The dataset contains simulation and real indoor testing data comparing the trajectory tracking performance of the two controllers.
Reproductive skew data for parasitised and unparasitised nests. Nest attributes (group size, nest size, presence of workers, site) come from field records. Numbers of eggs produced by dominants and subordinates were obtained through microsatellite genotyping of parents and offspring. Relatedness estimates were calculated based on microsatellite genotypes of nest-mates. Body size data and data on egg size and number were obtained through morphological measurements.
This dataset contains upper air Skew-T Log-P charts taken at Grand Junction, Colorado during the ICE-L project. The imagery is in GIF format and covers the time span from 2007-10-24 12:00:00 to 2008-01-03 12:00:00.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project examines whether people have an intrinsic preference for negatively skewed or positively skewed information structures and how these preferences relate to intrinsic preferences for informativeness. It reports results from 5 studies (3 lab experiments, 2 online studies).
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This data set contains the data and script used to produce the paper:
"Incremental Nonlinear Dynamic Inversion controller for a Variable Skew Quad Plane".
The data refer to the novel Variable Skew Quad Plane (VSQP), which was tested in the Open Jet Facility (OJF) wind tunnel of TU Delft. The objective of the experiments is to characterize the control capabilities of the VSQP. The data collected are the forces and moments exerted by the drone at different state combinations. The data were acquired through the OJF external moment balance.
Quantitative-genetic models of differentiation under migration-selection balance often rely on the assumption of normally distributed genotypic and phenotypic values. When a population is subdivided into demes with selection toward different local optima, migration between demes may result in asymmetric, or skewed, local distributions. Using a simplified two-habitat model, we derive formulas without a priori assuming a Gaussian distribution of genotypic values, and we find expressions that naturally incorporate higher moments, such as skew. These formulas yield predictions of the expected divergence under migration-selection balance that are more accurate than models assuming Gaussian distributions, which illustrates the importance of incorporating these higher moments to assess the response to selection in heterogeneous environments. We further show with simulations that traits with loci of large effect display the largest skew in their distribution at migration-selection balance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We construct a copula from the skew t distribution of Sahu et al. (2003). This copula can capture asymmetric and extreme dependence between variables, and is one of the few copulas that can do so and still be used in high dimensions effectively. However, it is difficult to estimate the copula model by maximum likelihood when the multivariate dimension is high, or when some or all of the marginal distributions are discrete-valued, or when the parameters in the marginal distributions and copula are estimated jointly. We therefore propose a Bayesian approach that overcomes all these problems. The computations are undertaken using a Markov chain Monte Carlo simulation method which exploits the conditionally Gaussian representation of the skew t distribution. We employ the approach in two contemporary econometric studies. The first is the modelling of regional spot prices in the Australian electricity market. Here, we observe complex non-Gaussian margins and nonlinear inter-regional dependence. Accurate characterization of this dependence is important for the study of market integration and risk management purposes. The second is the modelling of ordinal exposure measures for 15 major websites. Dependence between websites is important when measuring the impact of multi-site advertising campaigns. In both cases the skew t copula substantially outperforms symmetric elliptical copula alternatives, demonstrating that the skew t copula is a powerful modelling tool when coupled with Bayesian inference.