Multiple sampling campaigns were conducted near Boulder, Colorado, to quantify constituent concentrations and loads in Boulder Creek and its tributary, South Boulder Creek. Diel sampling was initiated at approximately 1100 hours on September 17, 2019, and continued until approximately 2300 hours on September 18, 2019. During this time period, samples were collected at two locations on Boulder Creek approximately every 3.5 hours to quantify the diel variability of constituent concentrations at low flow. Synoptic sampling campaigns on South Boulder Creek and Boulder Creek were conducted October 15-18, 2019, to develop spatial profiles of concentration, streamflow, and load. Numerous main stem and inflow locations were sampled during each synoptic campaign using the simple grab technique (17 main stem and 2 inflow locations on South Boulder Creek; 34 main stem and 17 inflow locations on Boulder Creek). Streamflow at each main stem location was measured using acoustic Doppler velocimetry. Bulk samples from all sampling campaigns were processed within one hour of sample collection. Processing steps included measurement of pH and specific conductance, and filtration using 0.45-micron filters. Laboratory analyses were subsequently conducted to determine dissolved and total recoverable constituent concentrations. Filtered samples were analyzed for a suite of dissolved anions using ion chromatography. Filtered, acidified samples and unfiltered, acidified samples were analyzed by inductively coupled plasma-mass spectrometry and inductively coupled plasma-optical emission spectroscopy to determine dissolved and total recoverable cation concentrations, respectively. This data release includes three data tables, three photographs, and a KMZ file showing the sampling locations. Additional information on the data table contents, including the presentation of data below the analytical detection limits, is provided in a Data Dictionary.
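The release reports loads alongside concentrations and streamflow. As a hedged illustration only (not the release's processing code), an instantaneous constituent load is commonly computed as concentration times streamflow with a unit conversion; the function and values below are illustrative.

# Illustrative only: instantaneous constituent load from concentration and streamflow.
# Names and unit choices are assumptions, not those of the data release.
def instantaneous_load_kg_per_day(concentration_mg_per_L, streamflow_m3_per_s):
    """Load (kg/day) = concentration (mg/L) * streamflow (m^3/s) * 86.4."""
    # 1 mg/L * 1 m^3/s = 1 g/s = 86.4 kg/day
    return concentration_mg_per_L * streamflow_m3_per_s * 86.4

# Example: 12 mg/L of a dissolved constituent at 0.85 m^3/s
print(instantaneous_load_kg_per_day(12.0, 0.85))  # ~881 kg/day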
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Sample names, sampling descriptions and contextual data.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Despite the wide application of longitudinal studies, they are often plagued by missing data and attrition. The majority of methodological approaches focus on participant retention or modern missing data analysis procedures. This paper, however, takes a new approach by examining how researchers may supplement the sample with additional participants. First, refreshment samples use the same selection criteria as the initial study. Second, replacement samples identify auxiliary variables that may help explain patterns of missingness and select new participants based on those characteristics. A simulation study compares these two strategies for a linear growth model with five measurement occasions. Overall, the results suggest that refreshment samples lead to less relative bias, greater relative efficiency, and more acceptable coverage rates than replacement samples or not supplementing the missing participants in any way. Refreshment samples also have high statistical power. The comparative strengths of the refreshment approach are further illustrated through a real data example. These findings have implications for assessing change over time when researching at-risk samples with high levels of permanent attrition.
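As a rough sketch of the setup compared in this simulation study (our own construction under stated assumptions, not the authors' code), the snippet below generates five-occasion linear growth data, imposes permanent attrition on the initial sample, and adds a refreshment sample drawn from the same population at a later wave; all parameter values are arbitrary.

# Minimal sketch of a 5-occasion linear growth design with attrition and a
# refreshment sample; parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)

def simulate_panel(n, id_offset=0, waves=5, enroll_wave=0):
    """Random-intercept, random-slope growth data; returns (person, wave, outcome) rows."""
    intercepts = rng.normal(50.0, 5.0, n)
    slopes = rng.normal(2.0, 1.0, n)
    rows = []
    for i in range(n):
        for t in range(enroll_wave, waves):
            y = intercepts[i] + slopes[i] * t + rng.normal(0.0, 2.0)
            rows.append((id_offset + i, t, y))
    return rows

# Initial sample with permanent attrition: each participant drops out for good after a
# geometrically distributed number of waves (roughly 15% attrition per wave).
initial = simulate_panel(500)
dropout_wave = {i: rng.geometric(0.15) for i in range(500)}
observed = [(i, t, y) for (i, t, y) in initial if t < dropout_wave[i]]

# Refreshment sample: new participants selected with the same criteria as the initial
# study, entering the panel at wave 3 and observed at waves 3 and 4.
refreshment = simulate_panel(150, id_offset=1000, enroll_wave=3)

combined = observed + refreshment
print(f"person-wave records kept: initial={len(observed)}, refreshment={len(refreshment)}")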
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Location, presence of fibrous minerals (including asbestos), brief description of the sites and of the samples, extractable fraction of macro- and micronutrients (µg of ions/g of soil ± standard deviation), C%, N%, C/N (the statistical analysis was performed by ANOVA with Tukey as post-hoc test (P
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Tool support in software engineering often depends on relationships, regularities, patterns, or rules mined from sampled code. Examples are approaches to bug prediction, code recommendation, and code autocompletion. Sampling is needed to scale the analysis of such data. Many such samples consist of software projects taken from GitHub; however, the specifics of sampling may influence how well the mined patterns generalize.
In this paper, we focus on how to sample software projects that are clients of libraries and frameworks when mining for inter-library usage patterns. We observe that when the sample is limited to a very specific library, inter-library patterns in the form of implications from one library to another may not generalize well. Using a simulation and a real case study, we analyze different sampling methods. Most importantly, our simulation shows that only when sampling for the disjunction of both libraries involved in an implication does the implication generalize well. Second, we show that real empirical data sampled from GitHub does not behave as our simulation would predict. This points to a potential problem with using such an API to study inter-library usage patterns.
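As a toy illustration (not the paper's simulation, and with invented usage probabilities), the snippet below shows how the sampling frame changes the support measured for an inter-library implication "uses A -> uses B": restricting the frame to clients of one library yields very different numbers than sampling for the disjunction of both libraries or the whole population.

# Toy illustration of frame-dependent support for the implication A -> B.
import random

random.seed(0)

# Hypothetical population of client projects; usage probabilities are invented.
population = [{"A": random.random() < 0.10, "B": random.random() < 0.30}
              for _ in range(100_000)]

def support(frame):
    """Fraction of projects in the frame that use both library A and library B."""
    return sum(p["A"] and p["B"] for p in frame) / len(frame)

frames = {
    "clients of A only": [p for p in population if p["A"]],
    "clients of A or B": [p for p in population if p["A"] or p["B"]],
    "all projects":      population,
}
for name, frame in frames.items():
    print(f"{name:17} support(A -> B) = {support(frame):.3f}")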
Link to the ScienceBase Item Summary page for the item described by this metadata record. Application Profile: Web Browser. Link Function: information.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
These data are detections of reptiles in 2006 under 2 ft x 2 ft plywood coverboards at four of the 15 sample points at Spears and Didion Ranches, Placer County, California. There are 81 coverboards in a 9-board by 9-board array (on 15 m spacing) centered on each sample point. Coverboards were placed in oak woodland and annual grassland habitat in November-December 2006 and checked on a bi-weekly basis between March and July 2006. All 204 animals found under the coverboards were counted, identified to species, and aged and sexed.
Establishment specific sampling results for Raw Beef sampling projects. Current data is updated quarterly; archive data is updated annually. Data is split by FY. See the FSIS website for additional information.
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This collection contains small mammal vouchers collected during small mammal sampling (NEON sample classes: mam_pertrapnight_in.voucherSampleID). Small mammal sampling is based on the lunar calendar, with timing of sampling constrained to occur within 10 days before or after the new moon. Typically, core sites are sampled 6 times per year and gradient sites 4 times per year. Small mammals are sampled using box traps (models LFA and XLK, H.B. Sherman Traps, Inc., Tallahassee, FL, USA). Box traps are arrayed in three to eight 10 x 10 grids (depending on the size of the site) with 10 m spacing between traps at all sites. Small mammal trapping bouts consist of one or three nights of trapping, depending on whether a grid is designated for pathogen sample collection (3 nights) or not (1 night). Only mortalities and individuals that require euthanasia due to injuries are vouchered. The NEON Biorepository receives whole frozen specimens and prepares vouchers as either study skins with skulls (or full skeletons) or in 70-95% ethanol. Standard mammalian measurements are taken during specimen preparation (in mm: total length, tail length, hind foot length, ear length; in g: mass) and are accessible in downloaded records (note: field measurements are listed in parentheses after preparation measurements, when available). Additional notes about parasites and reproductive condition are also accessible in downloaded records. See related links below for protocols and NEON related data products.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Information on samples submitted for RNAseq
Rows are individual samples
Columns are: ID; Sample Name; Date sampled; Species; Sex; Tissue; Geographic location; Date extracted; Extracted by; Nanodrop Conc. (ng/µl); 260/280; 260/230; RIN; Plate ID; Position; Index name; Index Seq; Qubit BR kit Conc. (ng/ul); BioAnalyzer Conc. (ng/ul); BioAnalyzer bp (region 200-1200); Submission reference; Date submitted; Conc. (nM); Volume provided; PE/SE; Number of reads; Read length.
Dataset Card for "sampling-distill-train-data-kth-shift4"
Training data for sampling-based watermark distillation using the KTH s=4 watermarking strategy in the paper On the Learnability of Watermarks for Language Models. Llama 2 7B with decoding-based watermarking was used to generate 640,000 watermarked samples, each 256 tokens long. Each sample is prompted with a 50-token prefix from OpenWebText (prompts are not included in the samples).
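A hedged usage sketch for a dataset of this shape follows; the repository namespace and the field name are placeholders we introduce for illustration, not values confirmed by the card, so substitute the actual hosting organization and schema before running.

from datasets import load_dataset

# "ORG" is a placeholder namespace; replace it with the actual repository owner.
ds = load_dataset("ORG/sampling-distill-train-data-kth-shift4", split="train")
print(len(ds))   # expected: 640,000 watermarked samples
print(ds[0])     # each sample is 256 generated tokens (the 50-token prompt is excluded)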
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Sampling intervals highlighted in bold indicate the approximate vertical extent of the oxygen minimum zone (O2 ≤ 45 µmol kg⁻¹). D = Discovery cruise, MSM = Maria S. Merian cruises, UTC = Coordinated Universal Time, O2 min = lowest oxygen concentration at the respective station, O2 min depth = depth of the oxygen minimum at the respective station, SST = sea surface temperature, n.d. = no data, * = stations analysed for copepod abundance.
Sampling stations are installed to enhance water quality monitoring in a water distribution system. They are strategically placed to allow samples to be taken from a variety of locations around the city to determine whether local factors are having a negative effect on water quality.
Attribute Information (field name and description):
AssetID: A unique identifier for the asset class. Infor required field.
InstallDate: The date when the asset was installed. Typically pulled from the as-built cover sheet for consistency. Infor required field.
LocationDescription: Information related to the construction location or project name. Infor required field.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
I explore the sample size in qualitative research that is required to reach theoretical saturation. I conceptualize a population as consisting of sub-populations that contain different types of information sources that hold a number of codes. Theoretical saturation is reached after all the codes in the population have been observed at least once in the sample. I delineate three different scenarios to sample information sources: "random chance," which is based on probability sampling, "minimal information," which yields at least one new code per sampling step, and "maximum information," which yields the largest number of new codes per sampling step. Next, I use simulations to assess the minimum sample size for each scenario across systematically varied hypothetical populations. I show that theoretical saturation depends more on the mean probability of observing codes than on the number of codes in a population. Moreover, the minimal and maximum information scenarios are significantly more efficient than random chance, but yield fewer repetitions per code to validate the findings. I formulate guidelines for purposive sampling and recommend that researchers follow a minimal information scenario.
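As an illustrative sketch only (not the paper's simulation code), the "random chance" scenario can be mimicked by drawing information sources at random until every code in a hypothetical population has been observed at least once; all population parameters below are invented.

# Sketch of the "random chance" scenario: sample sources at random until saturation.
import random

random.seed(42)

def sample_until_saturation(num_codes=30, num_sources=1000, mean_codes_per_source=5):
    """Return the number of sources sampled before every code has appeared at least once."""
    # Each information source holds a random subset of the codes.
    sources = [set(random.sample(range(num_codes),
                                 k=min(num_codes, random.randint(1, 2 * mean_codes_per_source))))
               for _ in range(num_sources)]
    observed, n_sampled = set(), 0
    for source in random.sample(sources, num_sources):  # probability sampling without replacement
        observed |= source
        n_sampled += 1
        if len(observed) == num_codes:
            break
    return n_sampled

runs = [sample_until_saturation() for _ in range(200)]
print("mean sample size to reach saturation:", sum(runs) / len(runs))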
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
A point sampling methodology was employed using imagery available through Google Earth Pro to assess land cover changes in Buffalo, NY, Denver, CO, and San Diego, CA. Analysis began during the Great Recession and concluded in 2018. Point sampling was used to classify land cover into 11 land cover classes. Further, point pattern analysis was employed to determine the existence (or lack thereof) of patterns associated with land cover change in each city.
Biological sampling data are derived from biological samples of fish harvested in Virginia, collected for aging purposes to aid in coastal stock assessments.
Establishment specific sampling results for Siluriformes Product sampling projects. Current data is updated quarterly; archive data is updated annually. Data is split by FY. See the FSIS website for additional information.
Two different problems are considered: a low-dimensional (LD) problem and a high-dimensional (HD) problem. The LD problem has 2 variables for a 4-ply symmetric square composite laminate. Similarly, the HD problem consists of 16 variables for a 32-ply symmetric square composite laminate. The value of h for the LD and HD problems is taken as 0.005 and 0.04, respectively. For each problem, three different sampling techniques are adopted: random sampling (RS), Latin hypercube sampling (LHS) [1], and Hammersley sampling (HS) [2]. RS, LHS, and HS differ primarily in the uniformity of the sample points over the design space: RS yields the least uniform and HS the most uniform distribution of sample points. Based on the recommendations of Jin et al. [3] and Zhao and Xue [4], 72 and 612 sample points are considered in each training dataset of the LD and HD problems, respectively. Based on the FE formulation, several high-fidelity datasets for the LD and HD problems are generated, as presented in the Supplementary Material file "Predictive modelling of laminated composite plates.xlsx" in nine sheets that are organized as detailed in Table 1. References: 1. McKay, M. D.; Beckman, R. J.; Conover, W. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 2000, 42, 55-61. 2. Hammersley, J. M. Monte Carlo methods for solving multivariable problems. Annals of the New York Academy of Sciences, 1960, 86, 844-874. 3. Jin, R.; Chen, W.; Simpson, T. W. Comparative studies of metamodelling techniques under multiple modelling criteria. Structural and Multidisciplinary Optimization, 2001, 23, 1-13. 4. Zhao, D.; Xue, D. A comparative study of metamodeling methods considering sample quality merits. Structural and Multidisciplinary Optimization, 2010, 42, 923-938.
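As a hedged illustration of how the three sampling plans could be generated for the 2-variable LD case (not the authors' exact procedure), the sketch below uses NumPy for random sampling, SciPy's qmc module for Latin hypercube sampling, and a hand-rolled base-2 radical inverse for the Hammersley set; the discrepancy call gives a rough uniformity comparison on the unit square.

# Illustrative generation of RS, LHS, and HS designs with 72 points in 2 variables.
import numpy as np
from scipy.stats import qmc

def radical_inverse(i, base=2):
    """Van der Corput radical inverse of integer i in the given base."""
    inv, f = 0.0, 1.0 / base
    while i > 0:
        inv += f * (i % base)
        i //= base
        f /= base
    return inv

def hammersley(n, dim=2):
    """2-D Hammersley set: first coordinate i/n, second the base-2 radical inverse."""
    return np.array([[i / n, radical_inverse(i)] for i in range(n)])

n = 72  # LD training-set size used in the study
rs = np.random.default_rng(0).random((n, 2))            # random sampling
lhs = qmc.LatinHypercube(d=2, seed=0).random(n)          # Latin hypercube sampling
hs = hammersley(n)                                       # Hammersley sampling

for name, pts in [("RS", rs), ("LHS", lhs), ("HS", hs)]:
    print(name, "discrepancy:", qmc.discrepancy(pts))    # lower values indicate more uniform designs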
License: U.S. Government Works, https://www.usa.gov/government-works
The sampling of 41 hydrologically diverse rivers that are monitored through the National Water Quality Network (NWQN) by the U.S. Geological Survey (USGS) took place during water years 2008 through 2018. Water samples were collected and filtered in the field (unless otherwise noted) using 0.45-micrometer pre-rinsed capsule filters (Versapor membrane), silicone tubing, and a peristaltic pump. Water samples were then shipped on ice to the USGS in Boulder, Colorado, and chilled to approximately 4 to 6 degrees Celsius until analysis. Dissolved organic carbon (DOC) was measured on an OI Analytical 700 total organic carbon analyzer by wet oxidation; each sample was measured in replicate and the average was reported. Ultraviolet (UV) absorbance at the wavelength of 254 nanometers was measured with an Agilent H ...