Facebook
TwitterBiological sampling data is information that comes from biological samples of fish harvested in Virginia for aging purposes to aid in coastal stock assessments
Facebook
TwitterData collected to assess water quality conditions in the natural creeks, aquifers and lakes in the Austin area. This is raw data, provided directly from our Water Resources Monitoring database (WRM) and should be considered provisional. Data may or may not have been reviewed by project staff. A map of site locations can be found by searching for LOCATION.WRM_SAMPLE_SITES; you may then use those WRM_SITE_IDs to filter in this dataset using the field SAMPLE_SITE_NO.
Facebook
TwitterEstablishment specific sampling results for Raw Beef sampling projects. Current data is updated quarterly; archive data is updated annually. Data is split by FY. See the FSIS website for additional information.
Facebook
Twittercrumb/dummy-cot-sampling-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterA data set of cross-nationally comparable microdata samples for 15 Economic Commission for Europe (ECE) countries (Bulgaria, Canada, Czech Republic, Estonia, Finland, Hungary, Italy, Latvia, Lithuania, Romania, Russia, Switzerland, Turkey, UK, USA) based on the 1990 national population and housing censuses in countries of Europe and North America to study the social and economic conditions of older persons. These samples have been designed to allow research on a wide range of issues related to aging, as well as on other social phenomena. A common set of nomenclatures and classifications, derived on the basis of a study of census data comparability in Europe and North America, was adopted as a standard for recoding. This series was formerly called Dynamics of Population Aging in ECE Countries. The recommendations regarding the design and size of the samples drawn from the 1990 round of censuses envisaged: (1) drawing individual-based samples of about one million persons; (2) progressive oversampling with age in order to ensure sufficient representation of various categories of older people; and (3) retaining information on all persons co-residing in the sampled individual''''s dwelling unit. Estonia, Latvia and Lithuania provided the entire population over age 50, while Finland sampled it with progressive over-sampling. Canada, Italy, Russia, Turkey, UK, and the US provided samples that had not been drawn specially for this project, and cover the entire population without over-sampling. Given its wide user base, the US 1990 PUMS was not recoded. Instead, PAU offers mapping modules, which recode the PUMS variables into the project''''s classifications, nomenclatures, and coding schemes. Because of the high sampling density, these data cover various small groups of older people; contain as much geographic detail as possible under each country''''s confidentiality requirements; include more extensive information on housing conditions than many other data sources; and provide information for a number of countries whose data were not accessible until recently. Data Availability: Eight of the fifteen participating countries have signed the standard data release agreement making their data available through NACDA/ICPSR (see links below). Hungary and Switzerland require a clearance to be obtained from their national statistical offices for the use of microdata, however the documents signed between the PAU and these countries include clauses stipulating that, in general, all scholars interested in social research will be granted access. Russia requested that certain provisions for archiving the microdata samples be removed from its data release arrangement. The PAU has an agreement with several British scholars to facilitate access to the 1991 UK data through collaborative arrangements. Statistics Canada and the Italian Institute of statistics (ISTAT) provide access to data from Canada and Italy, respectively. * Dates of Study: 1989-1992 * Study Features: International, Minority Oversamples * Sample Size: Approx. 1 million/country Links: * Bulgaria (1992), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/02200 * Czech Republic (1991), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06857 * Estonia (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06780 * Finland (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06797 * Romania (1992), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06900 * Latvia (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/02572 * Lithuania (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/03952 * Turkey (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/03292 * U.S. (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06219
Facebook
TwitterSampling overview.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sample names, sampling descriptions and contextual data.
Facebook
TwitterMultiple sampling campaigns were conducted near Boulder, Colorado, to quantify constituent concentrations and loads in Boulder Creek and its tributary, South Boulder Creek. Diel sampling was initiated at approximately 1100 hours on September 17, 2019, and continued until approximately 2300 hours on September 18, 2019. During this time period, samples were collected at two locations on Boulder Creek approximately every 3.5 hours to quantify the diel variability of constituent concentrations at low flow. Synoptic sampling campaigns on South Boulder Creek and Boulder Creek were conducted October 15-18, 2019, to develop spatial profiles of concentration, streamflow, and load. Numerous main stem and inflow locations were sampled during each synoptic campaign using the simple grab technique (17 main stem and 2 inflow locations on South Boulder Creek; 34 main stem and 17 inflow locations on Boulder Creek). Streamflow at each main stem location was measured using acoustic doppler velocimetry. Bulk samples from all sampling campaigns were processed within one hour of sample collection. Processing steps included measurement of pH and specific conductance, and filtration using 0.45-micron filters. Laboratory analyses were subsequently conducted to determine dissolved and total recoverable constituent concentrations. Filtered samples were analyzed for a suite of dissolved anions using ion chromatography. Filtered, acidified samples and unfiltered acidified samples were analyzed by inductively coupled plasma-mass spectrometry and inductively coupled plasma-optical emission spectroscopy to determine dissolved and total recoverable cation concentrations, respectively. This data release includes three data tables, three photographs, and a kmz file showing the sampling locations. Additional information on the data table contents, including the presentation of data below the analytical detection limits, is provided in a Data Dictionary.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
(n): natural substratum; (a): artificial substratum; (0.5 m): 0.5 meter deep; (2.0 m): 2.0 meters deep; (5.0 m): 5.0 meters deep. Total number of samples: 72.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of MRSP+ samples found at the different sampling sites.
Facebook
TwitterThis study is an experiment designed to compare the performance of three methodologies for sampling households with migrants:
Researchers from the World Bank applied these methods in the context of a survey of Brazilians of Japanese descent (Nikkei), requested by the World Bank. There are approximately 1.2-1.9 million Nikkei among Brazil’s 170 million population.
The survey was designed to provide detail on the characteristics of households with and without migrants, to estimate the proportion of households receiving remittances and with migrants in Japan, and to examine the consequences of migration and remittances on the sending households.
The same questionnaire was used for the stratified random sample and snowball surveys, and a shorter version of the questionnaire was used for the intercept surveys. Researchers can directly compare answers to the same questions across survey methodologies and determine the extent to which the intercept and snowball surveys can give similar results to the more expensive census-based survey, and test for the presence of biases.
Sao Paulo and Parana states
Japanese-Brazilian (Nikkei) households and individuals
The 2000 Brazilian Census was used to classify households as Nikkei or non-Nikkei. The Brazilian Census does not ask ethnicity but instead asks questions on race, country of birth and whether an individual has lived elsewhere in the last 10 years. On the basis of these questions, a household is classified as (potentially) Nikkei if it has any of the following: 1) a member born in Japan; 2) a member who is of yellow race and who has lived in Japan in the last 10 years; 3) a member who is of yellow race, who was not born in a country other than Japan (predominantly Korea, Taiwan or China) and who did not live in a foreign country other than Japan in the last 10 years.
Sample survey data [ssd]
1) Stratified random sample survey
Two states with the largest Nikkei population - Sao Paulo and Parana - were chosen for the study.
The sampling process consisted of three stages. First, a stratified random sample of 75 census tracts was selected based on 2000 Brazilian census. Second, interviewers carried out a door-to-door listing within each census tract to determine which households had a Nikkei member. Third, the survey questionnaire was then administered to households that were identified as Nikkei. A door-to-door listing exercise of the 75 census tracts was then carried out between October 13th, 2006, and October 29th, 2006. The fieldwork began on November 19, 2006, and all dwellings were visited at least once by December 22, 2006. The second wave of surveying took place from January 18th, 2007, to February 2nd, 2007, which was intended to increase the number of households responding.
2) Intercept survey
The intercept survey was designed to carry out interviews at a range of locations that were frequented by the Nikkei population. It was originally designed to be done in Sao Paulo city only, but a second intercept point survey was later carried out in Curitiba, Parana. Intercept survey took place between December 9th, 2006, and December 20th, 2006, whereas the Curitiba intercept survey took place between March 3rd and March 12th, 2007.
Consultations with Nikkei community organizations, local researchers and officers of the bank Sudameris, which provides remittance services to this community, were used to select a broad range of locations. Interviewers were assigned to visit each location during prespecified blocks of time. Two fieldworkers were assigned to each location. One fieldworker carried out the interviews, while the other carried out a count of the number of people with Nikkei appearance who appeared to be 18 years old or older who passed by each location. For the fixed places, this count was made throughout the prespecified time block. For example, between 2.30 p.m. and 3.30 p.m. at the sports club, the interviewer counted 57 adult Nikkeis. Refusal rates were carefully recorded, along with the sex and approximate age of the person refusing.
In all, 516 intercept interviews were collected.
3) Snowball sampling survey
The questionnaire that was used was the same as used for the stratified random sample. The plan was to begin with a seed list of 75 households, and to aim to reach a total sample of 300 households through referrals from the initial seed households. Each household surveyed was asked to supply the names of three contacts: (a) a Nikkei household with a member currently in Japan; (b) a Nikkei household with a member who has returned from Japan; (c) a Nikkei household without members in Japan and where individuals had not returned from Japan.
The snowball survey took place from December 5th to 20th, 2006. The second phase of the snowballing survey ran from January 22nd, 2007, to March 23rd, 2007. More associations were contacted to provide additional seed names (69 more names were obtained) and, as with the stratified sample, an adaptation of the intercept survey was used when individuals refused to answer the longer questionnaire. A decision was made to continue the snowball process until a target sample size of 100 had been achieved.
The final sample consists of 60 households who came as seed households from Japanese associations, and 40 households who were chain referrals. The longest chain achieved was three links.
Face-to-face [f2f]
1) Stratified sampling and snowball survey questionnaire
This questionnaire has 36 pages with over 1,000 variables, taking over an hour to complete.
If subjects refused to answer the questionnaire, interviewers would leave a much shorter version of the questionnaire to be completed by the household by themselves, and later picked up. This shorter questionnaire was the same as used in the intercept point survey, taking seven minutes on average. The intention with the shorter survey was to provide some data on households that would not answer the full survey because of time constraints, or because respondents were reluctant to have an interviewer in their house.
2) Intercept questionnaire
The questionnaire is four pages in length, consisting of 62 questions and taking a mean time of seven minutes to answer. Respondents had to be 18 years old or older to be interviewed.
1) Stratified random sampling 403 out of the 710 Nikkei households were surveyed, an interview rate of 57%. The refusal rate was 25%, whereas the remaining households were either absent on three attempts or were not surveyed because building managers refused permission to enter the apartment buildings. Refusal rates were higher in Sao Paulo than in Parana, reflecting greater concerns about crime and a busier urban environment.
2) Intercept Interviews 516 intercept interviews were collected, along with 325 refusals. The average refusal rate is 39%, with location-specific refusal rates ranging from only 3% at the food festival to almost 66% at one of the two grocery stores.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tool support in software engineering often depends on relationships, regularities, patterns, or rules, mined from sampled code. Examples are approaches to bug prediction, code recommendation, and code autocompletion. Samples are relevant to scale the analysis of data. Many such samples consist of software projects taken from GitHub; however, the specifics of sampling might influence the generalization of the patterns.
In this paper, we focus on how to sample software projects that are clients of libraries and frameworks, when mining for interlibrary usage patterns. We notice that when limiting the sample to a very specific library, inter-library patterns in the form of implications from one library to another may not generalize well. Using a simulation and a real case study, we analyze different sampling methods. Most importantly, our simulation shows that only when sampling for the disjunction of both libraries involved in the implication, the implication generalizes well. Second, we show that real empirical data sampled from GitHub does not behave as we would expect it from our simulation. This identifies a potential problem with the usage of such API for studying inter-library usage patterns.
Facebook
Twitterhttps://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
Understanding population dynamics requires reliable estimates of population density, yet this basic information is often surprisingly difficult to obtain. With rare or difficult-to-capture species, genetic surveys from noninvasive collection of hair or scat has proved cost-efficient for estimating densities. Here, we explored whether noninvasive genetic sampling (NGS) also offers promise for sampling a relatively common species, the snowshoe hare (Lepus americanus Erxleben, 1777), in comparison with traditional live trapping. We optimized a protocol for single-session NGS sampling of hares. We compared spatial capture–recapture population estimates from live trapping to estimates derived from NGS, and assessed NGS costs. NGS provided population estimates similar to those derived from live trapping, but a higher density of sampling plots was required for NGS. The optimal NGS protocol for our study entailed deploying 160 sampling plots for 4 days and genotyping one pellet per plot. NGS laboratory costs ranged from approximately $670 to $3000 USD per field site. While live trapping does not incur laboratory costs, its field costs can be considerably higher than for NGS, especially when study sites are difficult to access. We conclude that NGS can work for common species, but that it will require field and laboratory pilot testing to develop cost-effective sampling protocols.
Facebook
TwitterIf the Substance Abuse and Mental Health Services Administration (SAMHSA) is to move NSDUH to a hybrid ABS/field-enumerated frame, several questions will need to be answered, procedures will need to be developed and tested, and costs and benefits will need to be weighed. This report outlines what is known to date, how it may be applied to NSDUH, and what additional considerations need to be addressed.
Facebook
Twitter
Facebook
TwitterThis TB describes how ACF will identify and finalize each cohort of youth in the NYTD follow-up population (or follow-up population sample for those States that opt to sample) for the purposes of assessing States' compliance with NYTD data collection and reporting requirements. The TB also specifies how States may opt to sample the baseline population for the purposes of collecting information on the follow-up population. Metadata-only record linking to the original dataset. Open original dataset below.
Facebook
TwitterThis view presents data selected from the geochemical mapping of North Greenland that are relevant for an evaluation of the potential for zinc mineralisation: CaO, K2O, Ba, Cu, Sr, Zn. The data represent the most reliable analytical values from 2469 stream sediment and 204 soil samples collected and analysed over a period from 1978 to 1999 plus a large number of reanalyses in 2011. The compiled data have been quality controlled and calibrated to eliminate bias between methods and time of analysis as described in Thrane et al., 2011. In the present dataset, all values below lower detection limit are indicated by the digit 0. Sampling The regional geochemical surveys undertaken in North Greenland follows the procedure for stream sediment sampling given in Steenfelt, 1999. Thrane et al., 2011 give more information on sampling campaigns in the area. The sample consists of 500 g sediment collected into paper bags from stream bed and banks, alternatively soil from areas devoid of streams. The sampling density is not consistent throughout the covered area and varies from regular with 1 sample per 30 to 50 km2 to scarce and irregular in other areas. Analyses were made on screened < 0.1 mm or <0.075 mm grain size fractions.
Facebook
TwitterThis survey was conducted in Slovak Republic between January 2013 and March 2014 as part of the fifth round of the Business Environment and Enterprise Performance Survey (BEEPS V), a joint initiative of the World Bank Group and the European Bank for Reconstruction and Development. The objective of the survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the survey assesses the constraints to private sector growth and creates statistically significant business environment indicators that are comparable across countries.
Data from 276 establishments was analyzed. Stratified random sampling was used to select the surveyed businesses.
The survey topics include firm characteristics, information about sales and suppliers, competition, infrastructure services, judiciary and law enforcement collaboration, security, government policies, laws and regulations, financing, overall business environment, bribery, capacity utilization, performance and investment activities, and workforce composition.
In 2011, the innovation module was added to the standard set of Enterprise Surveys questionnaires to examine in detail how introduction of new products and practices influence firms' performance and management.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or universe of the study, is the non-agricultural economy. It comprises: all manufacturing sectors according to the group classification of ISIC Revision 3.1: (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors.
Sample survey data [ssd]
The sample was selected using stratified random sampling. Three levels of stratification were used in this country: industry, establishment size, and region.
Industry stratification was designed in the way that follows: the universe was stratified into one manufacturing industry, and two service industries (retail, and other services).
Size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This seems to be an appropriate definition of the labor force since seasonal/casual/part-time employment is not common practice, apart from the construction and agriculture sectors which are not included in the survey.
Regional stratification was defined in four regions (city and the surrounding business area) throughout Slovak Republic.
The database "Albertina Company Monitor" was used as the frame for the selection of a sample with the aim of obtaining interviews at 270 establishments with five or more employees.
Given the impact that non-eligible units included in the sample universe may have on the results, adjustments may be needed when computing the appropriate weights for individual observations. The percentage of confirmed non-eligible units as a proportion of the total number of sampled establishments contacted for the survey was 1.9 % (31 out of 1,613 establishments).
In the dataset, the variables a2 (sampling region), a6a (sampling establishment's size), and a4a (sampling sector) contain the establishment's classification into the strata chosen for each country using information from the sample frame. Variable a4a is coded using ISIC Rev 3.1 codes for the chosen industries for stratification. These codes include most manufacturing industries (15 to 37), retail (52), and (45, 50, 51, 55, 60-64, 72) for other services.
Face-to-face [f2f]
Three different versions of the questionnaire were used. The basic questionnaire, the Core Module, includes all common questions asked to all establishments from all sectors. The second expanded variation, the Manufacturing Questionnaire, is built upon the Core Module and adds some specific questions relevant to manufacturing sectors. The third expanded variation, the Retail Questionnaire, is also built upon the Core Module and adds to the core specific questions.
The innovation module was added to the standard set of Enterprise Surveys questionnaires to examine how introduction of new products and practices influence firms' performance and management.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether, while the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond as a different option from don’t know. b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.
The number of contacted establishments per realized interview was 0.14. This number is the result of two factors: explicit refusals to participate in the survey, as reflected by the rate of rejection (which includes rejections of the screener and the main survey) and the quality of the sample frame, as represented by the presence of ineligible units. The number of rejections per contact was 0.56.
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
An increasing number of studies are using landscape genomics to investigate local adaptation in wild and domestic populations. The implementation of this approach requires the sampling phase to consider the complexity of environmental settings and the burden of logistic constraints. These important aspects are often underestimated in the literature dedicated to sampling strategies. In this study, we computed simulated genomic datasets to run against actual environmental data in order to trial landscape genomics experiments under distinct sampling strategies. These strategies differed by design approach (to enhance environmental and/or geographic representativeness at study sites), number of sampling locations and sample sizes. We then evaluated how these elements affected statistical performances (power and false discoveries) under two antithetical demographic scenarios. Our results highlight the importance of selecting an appropriate sample size, which should be modified based on the demographic characteristics of the studied population. For species with limited dispersal, sample sizes above 200 units are generally sufficient to detect most adaptive signals, while in random mating populations this threshold should be increased to 400 units. Furthermore, we describe a design approach that maximizes both environmental and geographical representativeness of sampling sites and show how it systematically outperforms random or regular sampling schemes. Finally, we show that although having more sampling locations (between 40 and 50 sites) increase statistical power and reduce false discovery rate, similar results can be achieved with a moderate number of sites (20 sites). Overall, this study provides valuable guidelines for optimizing sampling strategies for landscape genomics experiments.
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The data products are the sampling results from FSIS’ National Antimicrobial Resistance Monitoring System (NARMS) Cecal sampling program. Data for sampling results from NARMS Product sampling program is currently posted on the FSIS Website and are grouped by commodity (https://www.fsis.usda.gov/science-data/data-sets-visualizations/laboratory-sampling-data). The antimicrobials and bacteria tested under NARMS are selected are based on their importance to human health and use in food-producing animals (FDA Guidance for Industry # 152 (https://www.fda.gov/media/69949/download)). Cecal contents from cattle, swine, chicken, and turkeys were sampled as part of FSIS’s routine NARMS cecal sampling program for major species.
Facebook
TwitterBiological sampling data is information that comes from biological samples of fish harvested in Virginia for aging purposes to aid in coastal stock assessments