This dataset was created by sciencestoked
License: GPL-v2. The R script presents an advanced sampling approach for monitoring biodiversity on agricultural land by combining multiple objectives and integrating environmental and geographic space. The example demonstrates the first-stage selection of squares (km²) in the ALL-EMA sampling design using modern sampling techniques such as unequal probability sampling with fixed sample size, balanced sampling, stratified balancing and geographic spreading. Sampling is done with unequal probabilities and weights defined by power allocation to give equal weight to extrapolations to the total agricultural area of Switzerland and to two stratifications of predefined interest (regions and agricultural production zones). Calibration is used to limit the spread of the sampling weights. The sample sizes are almost fixed within the strata and evenly distributed across the years of a temporal rotation plan, which is favourable for the organisation of the field survey. Sampling also ensures an optimal (annual) distribution across geographic space, including altitude. Despite the complexity of the sampling, estimation based on probability theory is straightforward. Ecker, K.T., Meier, E.S. & Tillé, Y. 2023. Integrating spatial and ecological information into comprehensive biodiversity monitoring on agricultural land. Environmental Monitoring and Assessment 195.
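The power allocation mentioned above can be illustrated with a minimal sketch. It is not the ALL-EMA R script: the function name, the stratum labels and the exponent `q` are assumptions made for illustration, and the allocation is based on stratum size only. With `q = 1` the allocation is proportional to stratum size, with `q = 0` every stratum gets the same sample size, and intermediate values compromise between the national total and the individual strata.

```python
def power_allocation(stratum_sizes, total_n, q=0.5):
    """Split total_n sample units across strata proportionally to size**q.

    stratum_sizes: dict mapping a (hypothetical) stratum label to its size.
    q: power in [0, 1]; 1 = proportional allocation, 0 = equal allocation.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    weights = {h: size ** q for h, size in stratum_sizes.items()}
    total_w = sum(weights.values())
    raw = {h: total_n * w / total_w for h, w in weights.items()}

    # Largest-remainder rounding keeps the overall sample size fixed.
    alloc = {h: int(r) for h, r in raw.items()}
    leftover = total_n - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda h: raw[h] - alloc[h], reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical stratum sizes (number of candidate squares per stratum).
    sizes = {"lowland": 900, "hill": 400, "mountain": 100}
    for q in (0.0, 0.5, 1.0):
        print(q, power_allocation(sizes, 30, q))
```

The sketch ignores the case where an allocated sample size exceeds the stratum size, which a real design would have to cap.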
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by JORGE GARCIA-INIGUEZ
Released under MIT
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary statistics of population and samples taken at different sampling schemes for n = 4, r = 1.
Biostatistics Using R: A Laboratory Manual was created with the goals of bringing biological content to lab sessions through authentic research data and of introducing the R programming language. Chapter 2 introduces sampling, accuracy, and precision.
SSP (simulation-based sampling protocol) is an R package that uses simulations of ecological data and the dissimilarity-based multivariate standard error (MultSE) as an estimator of precision to evaluate the adequacy of different sampling efforts for studies that will test hypotheses using permutational multivariate analysis of variance. The procedure consists of simulating several extensive data matrices that mimic some of the relevant ecological features of the community of interest using a pilot data set. For each simulated data set, several sampling efforts are repeatedly executed and MultSE is calculated. The mean value and the 0.025 and 0.975 quantiles of MultSE for each sampling effort across all simulated data sets are then estimated and standardized relative to the lowest sampling effort. The optimal sampling effort is identified as that at which an increase in sampling effort does not improve the highest MultSE beyond a threshold value (e.g. 2.5 %). The performance of SSP was validated using real dat...
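The MultSE computation at the core of this procedure can be sketched as follows. This is a plain-Python illustration and not the SSP package API. It assumes the usual definition of the pseudo multivariate standard error for a single group: the sum of squared pairwise dissimilarities divided by n gives a sum of squares, dividing by n - 1 gives a pseudo variance V, and MultSE = sqrt(V / n). All function names are hypothetical, and the quantiles are taken crudely from the sorted replicate values.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two sample vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def mult_se(samples, dist):
    """Pseudo multivariate standard error for one group of samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    ss = sum(dist(samples[i], samples[j]) ** 2
             for i in range(n) for j in range(i + 1, n)) / n
    v = ss / (n - 1)          # pseudo multivariate variance
    return math.sqrt(v / n)


def mult_se_by_effort(data, efforts, dist, reps=100, seed=1):
    """Mean and approximate 0.025/0.975 quantiles of MultSE per sampling effort."""
    rng = random.Random(seed)
    summary = {}
    for n in efforts:
        vals = sorted(mult_se(rng.sample(data, n), dist) for _ in range(reps))
        lower = vals[int(0.025 * (reps - 1))]
        upper = vals[int(0.975 * (reps - 1))]
        summary[n] = (sum(vals) / reps, lower, upper)
    return summary


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical simulated community: 200 samples, 5 species counts each.
    community = [[rng.randint(0, 20) for _ in range(5)] for _ in range(200)]
    for n, stats in mult_se_by_effort(community, [5, 10, 20, 40], bray_curtis).items():
        print(n, [round(s, 4) for s in stats])
```

For univariate data with Euclidean distance, `mult_se` reduces to the classical standard error of the mean, which is a convenient sanity check.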
Generalized distance sampling (GDS) models are the distance sampling equivalent of temporary emigration N-mixture models. In addition to density and the perceptibility component of detection, both contain an additional parameter for availability for detection which becomes estimable when data from repeated 'visits' are available. GDS models thus account for open populations. This makes them more robust, since natural populations are hardly ever perfectly closed, arguably even over the course of a single breeding season. However, the performance of these models has not been tested thoroughly, and prior (unpublished) analyses suggested that biased estimates, especially for density (high) and availability (low), may typically occur under certain conditions. We conducted three simulation studies and found that bias arises in low-information scenarios, particularly with low sample sizes and low parameter values. Our simulations enable us to determine "estimation frontiers", which separate sa...
Title of Dataset: Performance of generalized distance sampling models with temporary emigration: a simulation study
The study was not based on real data. All data used in the study were generated using simulation code.
The dataset contains four R files with simulation codes:
First, run Code_1. The other codes are independent, but the first simul...
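As a rough illustration of the data-generating process such simulations rely on, the sketch below simulates repeated visits to one site under temporary emigration. It is not the authors' simulation code. The half-normal detection function, the uniform distribution of distances (as on a line transect) and all names (`m`, `phi`, `sigma`, `width`) are assumptions: `m` is the site's superpopulation size, `phi` the availability probability per visit, and `sigma` the scale of the detection function.

```python
import math
import random


def half_normal(d, sigma):
    """Half-normal detection probability at distance d."""
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def simulate_site(m, phi, sigma, width, visits, rng):
    """Return, for each visit, the distances of the detected individuals.

    Each of the m individuals is available with probability phi on a visit
    (temporary emigration); available individuals are placed at a uniform
    distance in [0, width] and detected with half-normal probability.
    """
    data = []
    for _ in range(visits):
        detected = []
        for _ in range(m):
            if rng.random() >= phi:
                continue  # temporarily unavailable on this visit
            d = rng.uniform(0.0, width)
            if rng.random() < half_normal(d, sigma):
                detected.append(d)
        data.append(detected)
    return data


if __name__ == "__main__":
    rng = random.Random(1)
    visits = simulate_site(m=30, phi=0.6, sigma=40.0, width=100.0, visits=3, rng=rng)
    print([len(v) for v in visits])  # detections per visit
```

In a full simulation `m` would itself be drawn per site from an abundance model such as a Poisson distribution. Low values of `m`, `phi` or the number of sites correspond to the low-information scenarios described above.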
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Default sim_abundance function call, with descriptions, default values and associated parameter symbols of key arguments.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset comprises analysis results (major ions, N2, and Ar concentrations) from groundwater samples taken in the Lower Rhine Embayment (Germany) to assess the impact of sampling methods on N2, Ar, and excess-N2 concentrations. The data is used in the manuscript "Comparing Groundwater Sampling Devices for Denitrification Assessment using the N2/Ar Method" by Felix Fahrenbach and Thomas R. Rüde, which is currently undergoing review by Groundwater.
The libraries tidyverse (Wickham et al. 2019), psych (Revelle 2014), car (Fox and Weisberg 2019), rstatix (Kassambara 2023), and PMCMRplus (Pohlert 2024) need to be installed to run the R scripts. Running the Python scripts requires the following packages: numpy (Harris et al. 2020), pandas (McKinney 2010), scipy (Virtanen et al. 2020), statsmodels (Seabold and Perktold 2010), and matplotlib (Hunter 2007).
References
Fox, J., and S. Weisberg. 2019. An R Companion to Applied Regression. 3rd ed. Thousand Oaks CA: Sage, https://www.john-fox.ca/Companion/.
Harris, C. R., K. J. Millman, S. J. van der Walt, R. Gommers, P. Virtanen, D. Cournapeau, E. Wieser, et al. 2020. Array programming with NumPy. Nature 585, no. 7825: 357–62, https://doi.org/10.1038/s41586-020-2649-2.
Hunter, J. D. 2007. Matplotlib: A 2D graphics environment. Computing in Science & Engineering 9, no. 3: 90–95, https://doi.org/10.1109/MCSE.2007.55.
Kassambara, A. 2023. rstatix: Pipe-Friendly Framework for Basic Statistical Tests, https://rpkgs.datanovia.com/rstatix/.
McKinney, W. 2010. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference, edited by S. van der Walt and J. Millman, 56–61, https://doi.org/10.25080/Majora-92bf1922-00a.
Pohlert, T. 2024. PMCMRplus: Calculate Pairwise Multiple Comparisons of Mean Rank Sums Extended, https://CRAN.R-project.org/package=PMCMRplus.
Revelle, W. 2014. psych: Procedures for Psychological, Psychometric, and Personality Research. Evanston, Illinois: Northwestern University, https://CRAN.R-project.org/package=psych.
Seabold, S., and J. Perktold. 2010. statsmodels: Econometric and statistical modeling with python. In Proceedings of the 9th Python in Science Conference.
Virtanen, P., R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, et al. 2020. SciPy 1.0: Fundamental algorithms for scientific computing in python. Nature Methods 17: 261–72, https://doi.org/10.1038/s41592-019-0686-2.
Wickham, H., M. Averick, J. Bryan, W. Chang, L. McGowan, R. François, G. Grolemund, et al. 2019. Welcome to the Tidyverse. Journal of Open Source Software 4, no. 43: 1686, https://doi.org/10.21105/joss.01686.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Default sim_distribution function call, with descriptions and associated parameter symbols of key arguments.
Lead concentrations in drinking water samples collected under various sampling protocols in homes with lead service lines and in homes without lead service lines in two US cities. This dataset is associated with the following publication: Lytle, D., M. Urbanic, A. Paul, R. Achtemeier, A. Lewis, S. Hammaker, A. Estep, M. Nadagouda, R. James, and S. Triantafyllidou. Alternative approaches to lead sampling in drinking water: A comparative study of homes with and without lead service lines in two cities. WATER RESEARCH. Elsevier Science Ltd, New York, NY, USA, 994: 180063, (2025).
*n is the number of days in which samples were collected at each site on the same day.
Inferences of population structure, and more precisely the identification of genetically homogeneous groups of individuals, are essential to the fields of ecology, evolutionary biology, and conservation biology. Such population structure inferences are routinely investigated via the program STRUCTURE, which implements a Bayesian algorithm to identify groups of individuals at Hardy-Weinberg and linkage equilibrium. While the method performs relatively well under various population models with even sampling between subpopulations, its robustness to uneven sample size between subpopulations and/or hierarchical levels of population structure has not yet been tested, even though both are commonly encountered in empirical datasets. In this study, I used simulated and empirical microsatellite datasets to investigate the impact of uneven sample size between subpopulations and/or hierarchical levels of population structure on the detected population structure. The results demonstrated that u...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Default sim_survey function call, with descriptions and associated parameter symbols of key arguments.
Fish caught on NOAA R/V Townsend Cromwell cruises from 1982 to 1998 and on NOAA R/V Oscar E Sette cruises in 2007 and 2009 were measured and/or weighed, and their sex was determined. Specimen samples were also preserved from selected fishes.
This scientific sampling event log was created using an early implementation of the Rolling Deck to Repository (R2R) event log application (ELOG with cruise-specific custom configuration files). The log includes a record of all scientific sampling events from the cruise. In addition to event identification numbers unique for the cruise, the scientific sampling event log includes date and time (GMT), position (latitude and longitude), station and cast identifier as appropriate to the sampling event, sampling instrument name (e.g. CTD, TM, MOC10), name of the person responsible for the sampling event, and a comment field to record additional information.
Emory University (analyzed the urine samples for pyrethroid metabolites). This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Contact Researcher. Format: Pyrethroid metabolite concentration data for 50 adults over six weeks. This dataset is associated with the following publication: Morgan, M., J. Sobus, D.B. Barr, C. Croghan, F. Chen, R. Walker, L. Alston, E. Andersen, and M. Clifton. Temporal variability of pyrethroid metabolite levels in bedtime, morning, and 24-hr urine samples for 50 adults in North Carolina. ENVIRONMENT INTERNATIONAL. Elsevier Science Ltd, New York, NY, USA, 144: 81-91, (2015).
This data release presents calculated accumulated wastewater (ACCWW, as a percent of total streamflow) values for 43 National Hydrologic Dataset Version 2.1 (NHDPlus V2.1) stream segments coinciding with long-term smallmouth bass sampling locations (Table 1) in the Shenandoah River Watershed (encompassing parts of Virginia and West Virginia, USA). Values are calculated for quarter-year (Quarter 1 [Q1], January - March; Quarter 2 [Q2], April - June; Quarter 3 [Q3], July-September; Quarter 4 [Q4], October-December) time scales (Table 2) and annual time scales (Table 3) for years 2000 to 2018. Estimates at a stream segment represent the combined total upstream wastewater discharges as well as direct discharges into the stream segment. Any users of these data should review the entire metadata record and the associated manuscript (see Larger Work Citation). See 'Distribution Liability' statements for more information.
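The arithmetic behind an accumulated-wastewater percentage can be sketched as below. This is only an illustration of the definition given in the description (upstream plus direct discharges, as a percent of total streamflow), not the method used to produce the data release. The function name and arguments are hypothetical, and all flows are assumed to share one unit and one time period.

```python
def accww_percent(upstream_discharges, direct_discharges, streamflow):
    """Accumulated wastewater as a percent of total streamflow at a segment.

    upstream_discharges: wastewater flows entering upstream of the segment.
    direct_discharges: wastewater flows discharged into the segment itself.
    streamflow: total streamflow at the segment (same unit and period).
    """
    if streamflow <= 0:
        raise ValueError("streamflow must be positive")
    wastewater = sum(upstream_discharges) + sum(direct_discharges)
    return 100.0 * wastewater / streamflow


if __name__ == "__main__":
    # Hypothetical quarterly mean flows for one segment.
    print(accww_percent([1.0, 2.0], [1.0], 40.0))
```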
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The case of size-biased sampling of known order from a finite population without replacement is considered. The behavior of such a sampling scheme is studied with respect to the sampling fraction. Based on a simulation study, it is concluded that such a sample cannot be treated either as a random sample from the parent distribution or as a random sample from the corresponding r-size weighted distribution. As the sampling fraction increases, the bias in the sample decreases, resulting in a transition from an r-size-biased sample to a random sample. A modified version of a likelihood-free method is adopted for making statistical inference for the unknown population parameters, as well as for the size of the population when it is unknown. A simulation study, which takes the sampling fraction into consideration, demonstrates that the proposed method behaves better and more robustly than approaches that treat the r-size-biased sample either as a random sample from the parent distribution or as a random sample from the corresponding r-size weighted distribution. Finally, a numerical example which motivates this study illustrates the results.
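The effect of the sampling fraction can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' code: units are drawn without replacement with probability proportional to x^r by giving each unit an exponential waiting time with rate x^r and keeping the n smallest, which is equivalent to drawing one unit at a time with probability proportional to the remaining weights. The function names and the test population are hypothetical, and all values must be positive.

```python
import random


def size_biased_sample(population, n, r=1.0, rng=random):
    """Draw n units without replacement with probability proportional to x**r.

    Each unit gets an exponential waiting time with rate x**r; the n units
    with the shortest times form the sample. Requires all x > 0.
    """
    timed = sorted((rng.expovariate(1.0) / (x ** r), x) for x in population)
    return [x for _, x in timed[:n]]


def mean_ratio(population, fraction, r=1.0, reps=300, seed=0):
    """Average sample mean divided by the population mean, over reps samples."""
    rng = random.Random(seed)
    n = max(1, round(fraction * len(population)))
    pop_mean = sum(population) / len(population)
    total = 0.0
    for _ in range(reps):
        sample = size_biased_sample(population, n, r, rng)
        total += sum(sample) / n
    return total / reps / pop_mean


if __name__ == "__main__":
    population = [float(x) for x in range(1, 201)]
    for fraction in (0.05, 0.25, 0.5, 0.75, 1.0):
        print(fraction, round(mean_ratio(population, fraction), 3))
```

At small sampling fractions the ratio sits well above 1, as expected for a size-biased sample, and it shrinks towards 1 as the fraction approaches 1, where the sample is the whole population.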
Seabird distance sampling data: This .R file contains all data needed to repeat the community distance sampling case study on seabird abundance and distribution off the mid-Atlantic coast presented in the associated paper. The data are in the form of an R list object; the R script to read in and analyze the data is part of Supplement 2, available with the paper. The ReadMe file contains a detailed description of the data. File: sbdata.R
This dataset was created by sciencestoked