100+ datasets found

Data from: Evaluating Supplemental Samples in Longitudinal Research:...
tandf.figshare.com
txt
Updated Feb 9, 2024
Cite
Laura K. Taylor; Xin Tong; Scott E. Maxwell (2024). Evaluating Supplemental Samples in Longitudinal Research: Replacement and Refreshment Approaches [Dataset]. http://doi.org/10.6084/m9.figshare.12162072.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.12162072.v1
Dataset updated
Feb 9, 2024
Dataset provided by
Taylor & Francis
Authors
Laura K. Taylor; Xin Tong; Scott E. Maxwell
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Despite the wide application of longitudinal studies, they are often plagued by missing data and attrition. The majority of methodological approaches focus on participant retention or modern missing data analysis procedures. This paper, however, takes a new approach by examining how researchers may supplement the sample with additional participants. First, refreshment samples use the same selection criteria as the initial study. Second, replacement samples identify auxiliary variables that may help explain patterns of missingness and select new participants based on those characteristics. A simulation study compares these two strategies for a linear growth model with five measurement occasions. Overall, the results suggest that refreshment samples lead to less relative bias, greater relative efficiency, and more acceptable coverage rates than replacement samples or not supplementing the missing participants in any way. Refreshment samples also have high statistical power. The comparative strengths of the refreshment approach are further illustrated through a real data example. These findings have implications for assessing change over time when researching at-risk samples with high levels of permanent attrition.
Dataset for paper: How Twitter Data Sampling Biases U.S. Voter Behavior...
zenodo.org
zip
Updated May 14, 2022
Cite
Kai-Cheng Yang; Pik-Mai Hui; Filippo Menczer (2022). Dataset for paper: How Twitter Data Sampling Biases U.S. Voter Behavior Characterizations [Dataset]. http://doi.org/10.5281/zenodo.6547792
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.6547792
Dataset updated
May 14, 2022
Dataset provided by
Zenodo http://zenodo.org/
Authors
Kai-Cheng Yang; Pik-Mai Hui; Filippo Menczer
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
This repository contains the data and code for the paper "How Twitter Data Sampling Biases U.S. Voter Behavior Characterizations."
FSIS Laboratory Sampling Data - Raw Beef Sampling
catalog.data.gov
s.cnmilf.com
Updated May 8, 2025
Cite
Food Safety and Inspection Service (2025). FSIS Laboratory Sampling Data - Raw Beef Sampling [Dataset]. https://catalog.data.gov/dataset/fsis-raw-beef-sampling-data
Explore at:
Dataset updated
May 8, 2025
Dataset provided by
Food Safety and Inspection Service http://www.fsis.usda.gov/
Description
Establishment specific sampling results for Raw Beef sampling projects. Current data is updated quarterly; archive data is updated annually. Data is split by FY. See the FSIS website for additional information.
Data from: GIRT-Data: Sampling GitHub Issue Report Templates
data.niaid.nih.gov
zenodo.org
Updated Mar 13, 2023
Cite
Hinrich Schütze (2023). GIRT-Data: Sampling GitHub Issue Report Templates [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7724792
Explore at:
Dataset updated
Mar 13, 2023
Dataset provided by
Amir Hossein Kargaran
Nafiseh Nikeghbal
Abbas Heydarnoori
Hinrich Schütze
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
GIRT-Data is the first and largest dataset of issue report templates (IRTs) in both YAML and Markdown format. This dataset and its corresponding open-source crawler tool are intended to support research in this area and to encourage more developers to use IRTs in their repositories. The stable version of the dataset contains 1_084_300 repositories, of which 50_032 support IRTs.

For more details see the GitHub page of the dataset: https://github.com/kargaranamir/girt-data

The dataset has been accepted for the MSR 2023 conference under the title "GIRT-Data: Sampling GitHub Issue Report Templates".
Sampling data.
figshare.com
xls
Updated Jun 1, 2023
Cite
Lena Teuber; Anna Schukat; Wilhelm Hagen; Holger Auel (2023). Sampling data. [Dataset]. http://doi.org/10.1371/journal.pone.0077590.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0077590.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Lena Teuber; Anna Schukat; Wilhelm Hagen; Holger Auel
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sampling intervals highlighted in bold numbers indicate the approximate vertical extent of the oxygen minimum zone (O2≤45 µmol kg−1). D = Discovery cruise, MSM = Maria S. Merian cruises, UTC = universal time code, O2 min = lowest oxygen concentration at the respective station, O2 min depth = depth of the oxygen minimum at the respective station, SST = sea surface temperature, n.d. = no data, * = stations analysed for copepod abundance.
FSIS Laboratory Sampling Data - Siluriformes Product Sampling
catalog.data.gov
Updated May 8, 2025
Cite
Food Safety and Inspection Service (2025). FSIS Laboratory Sampling Data - Siluriformes Product Sampling [Dataset]. https://catalog.data.gov/dataset/fsis-raw-siluriformes-product-sampling-data
Explore at:
Dataset updated
May 8, 2025
Dataset provided by
Food Safety and Inspection Service http://www.fsis.usda.gov/
Description
Establishment specific sampling results for Siluriformes Product sampling projects. Current data is updated quarterly; archive data is updated annually. Data is split by FY. See the FSIS website for additional information.
Alabama Near Coastal Meteorological & Hydrographic Continuous Data Sampling...
gimi9.com
accession.nodc.noaa.gov
Cite
Alabama Near Coastal Meteorological & Hydrographic Continuous Data Sampling from 2003 to present [Dataset]. https://gimi9.com/dataset/data-gov_756237c2a0ad892b5eda5cdabcda495ce3389908
Explore at:
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Alabama
Description
The Alabama Real-time Coastal Observing System (ARCOS) with support of the Dauphin Island Sea Lab is a network of continuously sampling observing stations that collect observations of meteorological and hydrographic data from fixed stations operating across coastal Alabama. Data were collected from 2003 through the present and include parameters such as air temperature, relative humidity, solar and quantum radiation, barometric pressure, wind speed, wind direction, precipitation amounts, water temperature, salinity, dissolved oxygen, water height, and other water quality data. Stations, when possible, are designed to collect the same data in the same way, though there are exceptions given unique location needs (see individual accession abstracts for details). Stations are strategically placed to sample across salinity gradients, from delta to offshore, and the width of the coast.
Sample names, sampling descriptions and contextual data.
plos.figshare.com
xls
Updated May 30, 2023
Cite
Linda A. Amaral-Zettler; Elizabeth A. McCliment; Hugh W. Ducklow; Susan M. Huse (2023). Sample names, sampling descriptions and contextual data. [Dataset]. http://doi.org/10.1371/journal.pone.0006372.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0006372.t001
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Linda A. Amaral-Zettler; Elizabeth A. McCliment; Hugh W. Ducklow; Susan M. Huse
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sample names, sampling descriptions and contextual data.
Water Quality Sampling Data
catalog.data.gov
data.austintexas.gov
Updated Jul 25, 2025
Cite
data.austintexas.gov (2025). Water Quality Sampling Data [Dataset]. https://catalog.data.gov/dataset/water-quality-sampling-data
Explore at:
Dataset updated
Jul 25, 2025
Dataset provided by
data.austintexas.gov
Description
Data collected to assess water quality conditions in the natural creeks, aquifers and lakes in the Austin area. This is raw data, provided directly from our Water Resources Monitoring database (WRM) and should be considered provisional. Data may or may not have been reviewed by project staff. A map of site locations can be found by searching for LOCATION.WRM_SAMPLE_SITES; you may then use those WRM_SITE_IDs to filter in this dataset using the field SAMPLE_SITE_NO.
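The site-based filtering described above can be sketched in a few lines of Python. Only the field name SAMPLE_SITE_NO comes from the description; the other column names, the site IDs, and the values below are hypothetical placeholders, and a real export may use different types.

```python
def filter_by_site(rows, site_ids):
    """Keep only the rows whose SAMPLE_SITE_NO matches one of the given site IDs."""
    # Compare as strings so that integer and text IDs both match.
    wanted = {str(site_id) for site_id in site_ids}
    return [row for row in rows if str(row["SAMPLE_SITE_NO"]) in wanted]


# Hypothetical rows standing in for records from the dataset.
rows = [
    {"SAMPLE_SITE_NO": "101", "PARAMETER": "pH", "RESULT": "7.4"},
    {"SAMPLE_SITE_NO": "202", "PARAMETER": "pH", "RESULT": "7.9"},
    {"SAMPLE_SITE_NO": "101", "PARAMETER": "TEMP", "RESULT": "21.5"},
]

# Hypothetical WRM_SITE_ID looked up from the site-location map.
selected = filter_by_site(rows, [101])
print(len(selected))
```

With real data, the same function could be applied to rows read with `csv.DictReader`.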
Data from: Sample metadata
fairdomhub.org
xlsx
Updated Jul 1, 2021
Cite
Thomas Harvey (2021). Sample metadata [Dataset]. https://fairdomhub.org/data_files/1440
Explore at:
Available download formats: xlsx (43.9 KB)
Dataset updated
Jul 1, 2021
Authors
Thomas Harvey
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Information on samples submitted for RNAseq

Rows are individual samples

Columns are: ID; Sample Name; Date sampled; Species; Sex; Tissue; Geographic location; Date extracted; Extracted by; Nanodrop Conc. (ng/µl); 260/280; 260/230; RIN; Plate ID; Position; Index name; Index Seq; Qubit BR kit Conc. (ng/ul); BioAnalyzer Conc. (ng/ul); BioAnalyzer bp (region 200-1200); Submission reference; Date submitted; Conc. (nM); Volume provided; PE/SE; Number of reads; Read length
Demographic and Health Survey 1996-1997 - Bangladesh
microdata.worldbank.org
catalog.ihsn.org
Updated May 26, 2017
Cite
Mitra & Associates/ NIPORT (2017). Demographic and Health Survey 1996-1997 - Bangladesh [Dataset]. https://microdata.worldbank.org/index.php/catalog/1335
Explore at:
Dataset updated
May 26, 2017
Dataset provided by
National Institute of Population Research and Training http://niport.gov.bd/
Authors
Mitra & Associates/ NIPORT
Time period covered
1996 - 1997
Area covered
Bangladesh
Description
Abstract

The Bangladesh Demographic and Health Survey (BDHS) is part of the worldwide Demographic and Health Surveys program, which is designed to collect data on fertility, family planning, and maternal and child health.

The BDHS is intended to serve as a source of population and health data for policymakers and the research community. In general, the objectives of the BDHS are to:
- assess the overall demographic situation in Bangladesh,
- assist in the evaluation of the population and health programs in Bangladesh, and
- advance survey methodology.

More specifically, the objective of the BDHS is to provide up-to-date information on fertility and childhood mortality levels; nuptiality; fertility preferences; awareness, approval, and use of family planning methods; breastfeeding practices; nutrition levels; and maternal and child health. This information is intended to assist policymakers and administrators in evaluating and designing programs and strategies for improving health and family planning services in the country.

Geographic coverage

National

Analysis unit

Household

Children under five years

Women age 10-49

Men age 15-59

Kind of data

Sample survey data

Sampling procedure

Bangladesh is divided into six administrative divisions, 64 districts (zillas), and 490 thanas. In rural areas, thanas are divided into unions and then mauzas, a land administrative unit. Urban areas are divided into wards and then mahallas. The 1996-97 BDHS employed a nationally representative, two-stage sample that was selected from the Integrated Multi-Purpose Master Sample (IMPS) maintained by the Bangladesh Bureau of Statistics. Each division was stratified into three groups: 1) statistical metropolitan areas (SMAs), 2) municipalities (other urban areas), and 3) rural areas. In the rural areas, the primary sampling unit was the mauza, while in urban areas, it was the mahalla. Because the primary sampling units in the IMPS were selected with probability proportional to size from the 1991 Census frame, the units for the BDHS were sub-selected from the IMPS with equal probability so as to retain the overall probability proportional to size. A total of 316 primary sampling units were utilized for the BDHS (30 in SMAs, 42 in municipalities, and 244 in rural areas). In order to highlight changes in survey indicators over time, the 1996-97 BDHS utilized the same sample points (though not necessarily the same households) that were selected for the 1993-94 BDHS, except for 12 additional sample points in the new division of Sylhet. Fieldwork in three sample points was not possible (one in Dhaka Cantonment and two in the Chittagong Hill Tracts), so a total of 313 points were covered.

Since one objective of the BDHS is to provide separate estimates for each division as well as for urban and rural areas separately, it was necessary to increase the sampling rate for Barisal and Sylhet Divisions and for municipalities relative to the other divisions, SMAs and rural areas. Thus, the BDHS sample is not self-weighting and weighting factors have been applied to the data in this report.
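To illustrate why oversampling makes weights necessary, here is a minimal sketch that assumes each respondent's weight is the inverse of the sampling rate in their stratum. The stratum names, sampling rates, and observations are invented for illustration; they are not BDHS figures, and the actual BDHS weighting factors may be constructed differently.

```python
# Hypothetical strata: name -> (sampling rate, observed 0/1 outcomes).
strata = {
    "oversampled_stratum": (0.02, [1, 0, 1, 1]),
    "other_stratum": (0.01, [0, 0, 1, 0]),
}


def unweighted_mean(strata):
    """Plain mean over all observations, ignoring the sample design."""
    values = [v for _, obs in strata.values() for v in obs]
    return sum(values) / len(values)


def weighted_mean(strata):
    """Mean in which each observation counts 1 / sampling_rate times."""
    numerator = 0.0
    denominator = 0.0
    for rate, obs in strata.values():
        weight = 1.0 / rate  # assumed design weight: inverse sampling rate
        numerator += weight * sum(obs)
        denominator += weight * len(obs)
    return numerator / denominator


print(unweighted_mean(strata), weighted_mean(strata))
```

In this toy case the unweighted mean gives too much influence to the oversampled stratum, which is the distortion that weighting factors are meant to correct.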

Mitra and Associates conducted a household listing operation in all the sample points from 15 September to 15 December 1996. A systematic sample of 9,099 households was then selected from these lists. Every second household was selected for the men's survey, meaning that, in addition to interviewing all ever-married women age 10-49, interviewers also interviewed all currently married men age 15-59. It was expected that the sample would yield interviews with approximately 10,000 ever-married women age 10-49 and 3,000 currently married men age 15-59.

Note: See details in Appendix A of the survey report.

Mode of data collection

Face-to-face

Research instrument

Four types of questionnaires were used for the BDHS: a Household Questionnaire, a Women's Questionnaire, a Men's Questionnaire and a Community Questionnaire. The contents of these questionnaires were based on the DHS Model A Questionnaire, which is designed for use in countries with relatively high levels of contraceptive use. These model questionnaires were adapted for use in Bangladesh during a series of meetings with a small Technical Task Force that consisted of representatives from NIPORT, Mitra and Associates, USAID/Bangladesh, the International Centre for Diarrhoeal Disease Research, Bangladesh (ICDDR,B), Population Council/Dhaka, and Macro International Inc (see Appendix D for a list of members). Draft questionnaires were then circulated to other interested groups and were reviewed by the BDHS Technical Review Committee (see Appendix D for a list of members). The questionnaires were developed in English and then translated into and printed in Bangla (see Appendix E for the final version in English).

The Household Questionnaire was used to list all the usual members and visitors in the selected households. Some basic information was collected on the characteristics of each person listed, including his/her age, sex, education, and relationship to the head of the household. The main purpose of the Household Questionnaire was to identify women and men who were eligible for the individual interview. In addition, information was collected about the dwelling itself, such as the source of water, type of toilet facilities, materials used to construct the house, and ownership of various consumer goods.

The Women's Questionnaire was used to collect information from ever-married women age 10-49. These women were asked questions on the following topics:
- Background characteristics (age, education, religion, etc.),
- Reproductive history,
- Knowledge and use of family planning methods,
- Antenatal and delivery care,
- Breastfeeding and weaning practices,
- Vaccinations and health of children under age five,
- Marriage,
- Fertility preferences,
- Husband's background and respondent's work,
- Knowledge of AIDS,
- Height and weight of children under age five and their mothers.

The Men's Questionnaire was used to interview currently married men age 15-59. It was similar to that for women except that it omitted the sections on reproductive history, antenatal and delivery care, breastfeeding, vaccinations, and height and weight. The Community Questionnaire was completed for each sample point and included questions about the existence in the community of income-generating activities and other development organizations and the availability of health and family planning services.

Response rate

A total of 9,099 households were selected for the sample, of which 8,682 were successfully interviewed. The shortfall is primarily due to dwellings that were vacant or in which the inhabitants had left for an extended period at the time they were visited by the interviewing teams. Of the 8,762 households occupied, 99 percent were successfully interviewed. In these households, 9,335 women were identified as eligible for the individual interview (i.e., ever-married and age 10-49) and interviews were completed for 9,127 or 98 percent of them. In the half of the households that were selected for inclusion in the men's survey, 3,611 eligible ever-married men age 15-59 were identified, of whom 3,346 or 93 percent were interviewed.

The principal reason for non-response among eligible women and men was the failure to find them at home despite repeated visits to the household. The refusal rate was low.

Note: See summarized response rates by residence (urban/rural) in Table 1.1 of the survey report.
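The percentages quoted in this section follow directly from the counts. The short check below recomputes them from the figures given in the text; the function name is illustrative only.

```python
def response_rate(completed, eligible):
    """Response rate as a percentage of eligible units."""
    return 100.0 * completed / eligible


# Counts taken from the response-rate paragraph above.
households = response_rate(8682, 8762)  # interviewed / occupied households
women = response_rate(9127, 9335)       # completed / eligible women
men = response_rate(3346, 3611)         # interviewed / eligible men

print(round(households), round(women), round(men))  # prints 99 98 93
```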

Sampling error estimates

The estimates from a sample survey are affected by two types of errors: (1) non-sampling errors, and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the BDHS to minimize this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.

Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the BDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.

A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
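As a small illustration of the plus-or-minus-two-standard-errors rule, the sketch below builds an approximate 95 percent interval from a variance. The estimate and variance are made-up numbers, not BDHS results.

```python
import math


def confidence_interval(estimate, variance, multiplier=2.0):
    """Return (low, high) as estimate +/- multiplier * standard error."""
    standard_error = math.sqrt(variance)  # standard error is the square root of the variance
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical proportion of 0.50 with variance 0.0001 (standard error 0.01).
low, high = confidence_interval(0.50, 0.0001)
print(low, high)
```

For a design such as the BDHS, the variance passed in would have to come from a method that accounts for stratification and clustering rather than from a simple-random-sample formula.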

If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the BDHS sample is the result of a two-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the BDHS is the ISSA Sampling Error Module. This module used the Taylor
FSIS Laboratory Sampling Data - NARMS Cecal Sampling
gimi9.com
catalog.data.gov
Updated Aug 7, 2024
Cite
(2024). FSIS Laboratory Sampling Data - NARMS Cecal Sampling [Dataset]. https://gimi9.com/dataset/data-gov_fsis-narms-cecal-sampling/
Explore at:
Dataset updated
Aug 7, 2024
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The data products are the sampling results from FSIS' National Antimicrobial Resistance Monitoring System (NARMS) Cecal sampling program. Sampling results from the NARMS Product sampling program are currently posted on the FSIS website and are grouped by commodity (https://www.fsis.usda.gov/science-data/data-sets-visualizations/laboratory-sampling-data). The antimicrobials and bacteria tested under NARMS are selected based on their importance to human health and use in food-producing animals (FDA Guidance for Industry # 152 (https://www.fda.gov/media/69949/download)). Cecal contents from cattle, swine, chicken, and turkeys were sampled as part of FSIS's routine NARMS cecal sampling program for major species.
Data from: Data for Predictive Modelling of Laminated Composite Plates
explore.openaire.eu
data.niaid.nih.gov
Updated Jul 5, 2021
Cite
Kanak Kalita; Shankar Chakraborty; S Madhu; Manickam Ramachandran; Xiao-Zhi Gao (2021). Data for Predictive Modelling of Laminated Composite Plates [Dataset]. http://doi.org/10.5281/zenodo.5069421
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.5069421
Dataset updated
Jul 5, 2021
Authors
Kanak Kalita; Shankar Chakraborty; S Madhu; Manickam Ramachandran; Xiao-Zhi Gao
Description
Two different problems, i.e. a low-dimensional (LD) problem and a high-dimensional (HD) problem, are considered. The LD problem has 2 variables for a 4-ply symmetric square composite laminate. Similarly, the HD problem consists of 16 variables for a 32-ply symmetric square composite laminate. The value of h for the LD and HD problems is taken as 0.005 and 0.04 respectively. For each problem, three different sampling techniques, i.e. random sampling (RS), Latin hypercube sampling (LHS) [1] and Hammersley sampling (HS) [2], are adopted. RS, LHS and HS primarily differ in the uniformity of sample points over the design space: RS gives the least uniform and HS the most uniform distribution of sample points. Based on the recommendations of Jin et al. [3], and Zhao and Xue [4], 72 and 612 sample points are considered in each training dataset of the LD and HD problems respectively. Based on the FE formulation, several high-fidelity datasets for the LD and HD problems are generated, as presented in the Supplementary Material file "Predictive modelling of laminated composite plates.xlsx" in nine sheets that are organized as detailed in Table 1.

References:
1. McKay, M. D.; Beckman, R. J.; Conover, W. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 2000, 42, 55-61.
2. Hammersley, J. M. Monte Carlo methods for solving multivariable problems. Annals of the New York Academy of Sciences, 1960, 86, 844-874.
3. Jin, R.; Chen, W.; Simpson, T. W. Comparative studies of metamodelling techniques under multiple modelling criteria. Structural and Multidisciplinary Optimization, 2001, 23, 1-13.
4. Zhao, D.; Xue, D. A comparative study of metamodeling methods considering sample quality merits. Structural and Multidisciplinary Optimization, 2010, 42, 923-938.
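For readers unfamiliar with the three sampling techniques, the following is a generic textbook-style sketch of each on the unit hypercube. It is not the authors' code: the function names, the seed, and the scaling to [0, 1) are assumptions, and real design variables would still need to be mapped to their physical ranges.

```python
import random


def random_sampling(n, dims, rng):
    """RS: every coordinate is drawn independently and uniformly."""
    return [[rng.random() for _ in range(dims)] for _ in range(n)]


def latin_hypercube(n, dims, rng):
    """LHS: in each dimension, exactly one point falls in each of n equal strata."""
    columns = []
    for _ in range(dims):
        column = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(column)  # decouple the strata across dimensions
        columns.append(column)
    return [[columns[d][i] for d in range(dims)] for i in range(n)]


def radical_inverse(i, base):
    """Mirror the base-`base` digits of i about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while i > 0:
        result += fraction * (i % base)
        i //= base
        fraction /= base
    return result


def hammersley(n, dims):
    """HS: first coordinate i/n, remaining ones radical inverses in prime bases."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
    if dims - 1 > len(primes):
        raise ValueError("this sketch supports at most 16 dimensions")
    return [
        [i / n] + [radical_inverse(i, primes[d]) for d in range(dims - 1)]
        for i in range(n)
    ]


# 72 points in 2 dimensions, matching the LD training-set size in the text.
lhs_points = latin_hypercube(72, 2, random.Random(0))
hs_points = hammersley(72, 2)
print(len(lhs_points), len(hs_points))
```

The stratification in `latin_hypercube` and the deterministic low-discrepancy construction in `hammersley` are what give these methods more uniform coverage than plain random sampling.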
Sample data
figshare.com
application/gzip
Updated Oct 25, 2022
Cite
Christopher Bowden (2022). Sample data [Dataset]. http://doi.org/10.6084/m9.figshare.21395373.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.21395373.v1
Dataset updated
Oct 25, 2022
Dataset provided by
Figshare http://figshare.com/
Authors
Christopher Bowden
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Five years of data (1980-1984) that can be used as input (and represents the input format) for the associated RF code R script.

N.B. to use without any modifications to the R script, this dataset must be stored in a sub-directory named 'vars'.
Energy Consumption in Transport Survey 2014, Main Results - West Bank and...
pcbs.gov.ps
Updated Dec 12, 2021
Cite
Palestinian Central Bureau of Statistics (2021). Energy Consumption in Transport Survey 2014, Main Results - West Bank and Gaza [Dataset]. https://www.pcbs.gov.ps/PCBS-Metadata-en-v5.2/index.php/catalog/699
Explore at:
Dataset updated
Dec 12, 2021
Dataset authored and provided by
Palestinian Central Bureau of Statistics http://pcbs.gov.ps/
Time period covered
2015
Area covered
Palestine, West Bank
Description
Abstract

Most countries collect official statistics on energy use due to its vital role in the infrastructure, economy and living standards.

In Palestine, additional attention is warranted for energy statistics due to a scarcity of natural resources, the high cost of energy and high population density. These factors demand comprehensive and high quality statistics.

In this context, PCBS decided to conduct a special Energy Consumption in Transport Survey to provide high quality data about energy consumption by type, expenditure on maintenance and insurance for vehicles, and vehicle motor capacity and year of production.

The survey aimed to provide data on energy consumption by the transport sector and also on energy consumption by type of vehicle, motor capacity and year of production.

Geographic coverage

Palestine

Analysis unit

Vehicles

Universe

All the operating vehicles in Palestine in 2014.

Kind of data

Sample survey data [ssd]

Sampling procedure

Target Population: All the operating vehicles in Palestine in 2014.

2.1 Sample frame: A list of the number of operating vehicles in Palestine in 2014, broken down by governorate and vehicle type. This list was obtained from the Ministry of Transport.

2.2.1 Sample size: The sample size is 6,974 vehicles.

2.2.2 Sampling design: A stratified random sample; in some of the small strata, a quota sample was used to cover them.

The vehicle sample was reached by: 1) visiting all the dynamometers (the centers for testing vehicles), and 2) selecting a random sample of vehicles by type of vehicle, model, fuel type and engine capacity.
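A stratified random sample of this kind can be sketched as follows. This is a generic illustration rather than the PCBS procedure: the vehicle records, the choice of vehicle type as the only stratification variable, and the fixed per-stratum sample size are all assumptions, and small strata are simply taken in full instead of being covered by a quota sample.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Draw a simple random sample of up to n_per_stratum units within each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):  # fixed order keeps the result reproducible
        members = strata[name]
        size = min(n_per_stratum, len(members))  # small strata are taken in full
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical frame: 50 private cars, 20 taxis, 2 buses.
frame = (
    [{"id": i, "type": "private car"} for i in range(50)]
    + [{"id": 100 + i, "type": "taxi"} for i in range(20)]
    + [{"id": 200 + i, "type": "bus"} for i in range(2)]
)

sample = stratified_sample(frame, lambda v: v["type"], 5, random.Random(0))
print(len(sample))  # prints 12
```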

Mode of data collection

Face-to-face [f2f]

Research instrument

The design of the questionnaire was based on the experiences of similar countries in energy statistics, so as to cover the most important indicators for energy statistics in the transport sector while taking into account Palestine's particular situation.

Cleaning operations

The data processing stage consisted of the following operations: Editing and coding prior to data entry: all questionnaires were edited and coded in the office using the same instructions adopted for editing in the field.

Data entry: The survey questionnaire was uploaded on office computers. At this stage, data were entered into the computer using a data entry template developed in Access Database. The data entry program was prepared to satisfy a number of requirements:
- To prevent the duplication of questionnaires during data entry.
- To apply checks on the integrity and consistency of entered data.
- To handle errors in a user-friendly manner.
- To allow captured data to be transferred to another format for data analysis using statistical analysis software such as SPSS.

Audit after data entry: At this stage, the entered data were scrutinized by pulling the data file periodically, reviewing the data, examining abnormal values, and checking consistency between the different questions in the questionnaire. If any errors were found in the entered data, the questionnaire was withdrawn and the data were verified and corrected, until the final data file was obtained.

Extraction of results: The final results of the report were extracted using SPSS and then presented in tables in Excel format.

Response rate

80.7%

Sampling error estimates

Data of this survey may be affected by sampling errors due to use of a sample and not a complete enumeration. Therefore, certain differences are anticipated in comparison with the real values obtained through censuses. The variance was calculated for the most important indicators: the variance table is attached with the final report. There is no problem in the dissemination of results at national and regional level (North, Middle, South of West Bank, Gaza Strip).

Data appraisal

The survey sample consisted of around 6,974 vehicles, of which 5,631 vehicles completed the questionnaire, 3,652 vehicles from the West Bank and 1,979 vehicles in Gaza Strip.
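The reported response rate of 80.7% can be reproduced from these counts, as the short check below shows. The variable names are illustrative.

```python
# Figures taken from the text above.
sample_size = 6974
completed_west_bank = 3652
completed_gaza = 1979

completed_total = completed_west_bank + completed_gaza
response_rate = 100.0 * completed_total / sample_size

print(completed_total, round(response_rate, 1))  # prints 5631 80.7
```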
Field data for seasonal synoptic sampling of 100 urban streams in Miami,...
portal.edirepository.org
csv
Updated Mar 10, 2025
Cite
Liz Ortiz Muñoz; John Kominoski; Christopher Rizzie (2025). Field data for seasonal synoptic sampling of 100 urban streams in Miami, Florida (USA), 2021-2022 [Dataset]. http://doi.org/10.6073/pasta/e5c0e15e0c96eaaf9123e13727dbff4a
Explore at:
Available download formats: csv (49105 bytes)
Unique identifier
https://doi.org/10.6073/pasta/e5c0e15e0c96eaaf9123e13727dbff4a
Dataset updated
Mar 10, 2025
Dataset provided by
EDI
Authors
Liz Ortiz Muñoz; John Kominoski; Christopher Rizzie
Time period covered
Jul 8, 2021 - Jun 13, 2022
Variables measured
ph, city, rain, ORP_mv, curbid, do_mgl, season, stream, temp_c, TDS_g_L, and 15 more
Description
This dataset contains field measurements taken during water sampling from 100 urban stream locations in the greater Miami, Florida metropolitan area. Field collection took place during five synoptic sampling events: Summer 2021 (Wet; July 8 to July 27), Fall 2021 (Wet; September 27 to October 7), Winter 2022 (Dry; January 3 to January 13), Spring 2022 (Dry; April 7 to April 23), and Summer 2022 (Wet; June 1 to June 13) to capture spatial and seasonal variation in stream conditions (specific conductivity, water temperature, dissolved oxygen, pH). Filtered stream samples were analyzed for dissolved organic carbon concentration and characteristics, available in a separate dataset. These data were collected as part of the Carbon in Urban Rivers Biogeochemistry (CURB) Project. Detailed field data and site data are published separately and can be linked using the “curbid” and “synoptic_event” columns in each dataset.
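Linking the separately published tables on the two key columns might look like the sketch below. Only "curbid" and "synoptic_event" come from the description, and "ph" and "temp_c" from the variable list; the "doc_mgl" column, the key values, and all measurements are hypothetical.

```python
KEYS = ("curbid", "synoptic_event")


def link_records(field_rows, other_rows, keys=KEYS):
    """Inner-join two lists of row dicts on the shared key columns."""
    index = {tuple(row[k] for k in keys): row for row in other_rows}
    linked = []
    for row in field_rows:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            linked.append({**match, **row})  # merged row with columns from both tables
    return linked


# Hypothetical rows from the field dataset and from a companion dataset.
field_rows = [
    {"curbid": "C001", "synoptic_event": "Summer 2021", "ph": 7.6, "temp_c": 29.1},
    {"curbid": "C002", "synoptic_event": "Summer 2021", "ph": 7.9, "temp_c": 30.4},
]
other_rows = [
    {"curbid": "C001", "synoptic_event": "Summer 2021", "doc_mgl": 8.2},
]

linked = link_records(field_rows, other_rows)
print(linked)
```

An inner join is used here, so rows without a partner in the other table are dropped; a left join may be preferable when every field measurement must be retained.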
Sampling algorithms in statistical physics: a guide for statistics and...
data.bris.ac.uk
Updated Mar 1, 2023
Cite
(2023). Sampling algorithms in statistical physics: a guide for statistics and machine learning - Datasets - data.bris [Dataset]. https://data.bris.ac.uk/data/dataset/sju7uasr7e2b2n518hk72p3ur
Dataset updated
Mar 1, 2023
Description
This directory contains the research data published in the paper by Michael F. Faulkner and Samuel Livingstone entitled: Sampling algorithms in statistical physics: a guide for statistics and machine learning
Surface Water - Sampling Location Information
s.cnmilf.com
datasets.ai
+2 more
Updated Nov 27, 2024
Cite
California State Water Resources Control Board (2024). Surface Water - Sampling Location Information [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/surface-water-sampling-location-information
Dataset updated
Nov 27, 2024
Dataset provided by
California State Water Resources Control Board
Description
Information about sampling locations for data from the California Environmental Data Exchange Network (CEDEN). This set of station/project combinations can be combined with other data sets from CEDEN to provide more information. CEDEN is the California State Water Board's data system for surface water quality in California, and seeks to include all available statewide data (such as that produced by research and volunteer organizations). Data in CEDEN include field, sediment and water column data collected from freshwater, estuarine, and marine environments. Examples of data in CEDEN come from laboratory, physical and biological analyses and include data types associated with chemical, toxicological, field, bioassessment, invertebrate, fish, and bacteriological assay assessments.
UCI and OpenML Data Sets for Ordinal Quantification
data.niaid.nih.gov
zenodo.org
Updated Jul 25, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Moreo, Alejandro (2023). UCI and OpenML Data Sets for Ordinal Quantification [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8177301
Dataset updated
Jul 25, 2023
Dataset provided by
Bunse, Mirko
Senz, Martin
Sebastiani, Fabrizio
Moreo, Alejandro
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These four labeled data sets are targeted at ordinal quantification. The goal of quantification is not to predict the label of each individual instance, but the distribution of labels in unlabeled sets of data.

With the scripts provided, you can extract CSV files from the UCI machine learning repository and from OpenML. The ordinal class labels stem from a binning of a continuous regression label.

We complement this data set with the indices of data items that appear in each sample of our evaluation. Hence, you can precisely replicate our samples by drawing the specified data items. The indices stem from two evaluation protocols that are well suited for ordinal quantification. To this end, each row in the files app_val_indices.csv, app_tst_indices.csv, app-oq_val_indices.csv, and app-oq_tst_indices.csv represents one sample.
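Replicating a sample from one row of an index file, and then computing the label distribution that quantification aims to predict, might look roughly like the sketch below. Only the "class_label" column name comes from the description. The inline data, the "x" feature, the header-less layout of the index file, and the 1-based indexing (a guess based on the Julia tooling) are all assumptions to be checked against the real files.

```python
import csv
import io
from collections import Counter

# Made-up stand-ins for an extracted data file and an index file.
DATA_CSV = """class_label,x
1,0.3
2,0.5
2,0.1
3,0.9
1,0.7
"""
INDICES_CSV = """1,2,2,4
5,5,3,1
"""

def load_data(text):
    """Read data items; the first CSV row is the header."""
    return list(csv.DictReader(io.StringIO(text)))

def load_index_rows(text):
    """Each row of the index file is one sample (a list of item indices).

    Assumes the index file has no header row.
    """
    return [[int(v) for v in row] for row in csv.reader(io.StringIO(text)) if row]

def draw_sample(data, indices, index_base=1):
    """Select the listed items; index_base=1 assumes 1-based indices."""
    return [data[i - index_base] for i in indices]

def prevalence(sample, classes):
    """Fraction of items per class, in the given class order."""
    counts = Counter(item["class_label"] for item in sample)
    return [counts[c] / len(sample) for c in classes]

data = load_data(DATA_CSV)
classes = ["1", "2", "3"]
prevalences = [
    prevalence(draw_sample(data, row), classes)
    for row in load_index_rows(INDICES_CSV)
]
for p in prevalences:
    print(p)
```

An index may repeat within a row, so the sketch treats each row as a draw with replacement.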

Our first protocol is the artificial prevalence protocol (APP), where all possible distributions of labels are drawn with an equal probability. The second protocol, APP-OQ, is a variant thereof, where only the smoothest 20% of all APP samples are considered. This variant is targeted at ordinal quantification tasks, where classes are ordered and a similarity of neighboring classes can be assumed.
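The two protocols can be sketched as follows. This is an illustration rather than the authors' implementation: the function names are hypothetical, and the smoothness measure used here (sum of squared second differences of the label distribution) is an assumption, since the description does not say how smoothness is measured.

```python
import random

def draw_app_prevalences(n_classes, n_samples, seed=0):
    """Draw label distributions uniformly from the probability simplex."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_samples):
        # Normalized Exp(1) variates follow a Dirichlet(1, ..., 1) law,
        # i.e. every label distribution is equally likely.
        weights = [rng.expovariate(1.0) for _ in range(n_classes)]
        total = sum(weights)
        draws.append([w / total for w in weights])
    return draws

def jaggedness(p):
    """Sum of squared second differences; smaller means smoother.

    Assumed smoothness measure, not taken from the description.
    """
    return sum((p[i - 1] - 2 * p[i] + p[i + 1]) ** 2 for i in range(1, len(p) - 1))

def keep_smoothest(prevalences, fraction=0.2):
    """Keep the given fraction of distributions with the lowest jaggedness."""
    n_keep = max(1, int(len(prevalences) * fraction))
    return sorted(prevalences, key=jaggedness)[:n_keep]

app = draw_app_prevalences(n_classes=5, n_samples=1000)  # APP
app_oq = keep_smoothest(app, fraction=0.2)               # APP-OQ variant
print(len(app), len(app_oq))
```

Filtering after sampling mirrors the description, where APP-OQ is defined as a subset of the APP samples rather than as a separate sampling scheme.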

Usage

You can extract four CSV files through the provided script extract-oq.jl, which is conveniently wrapped in a Makefile. The Project.toml and Manifest.toml specify the Julia package dependencies, similar to a requirements file in Python.

Preliminaries: You need a working Julia installation. We used Julia v1.6.5 in our experiments.

Data Extraction: In your terminal, you can call either

make

(recommended), or

julia --project="." --eval "using Pkg; Pkg.instantiate()"
julia --project="." extract-oq.jl

Outcome: The first row in each CSV file is the header. The first column, named "class_label", is the ordinal class.

Further Reading

Implementation of our experiments: https://github.com/mirkobunse/regularized-oq
Data Cleaning Sample
borealisdata.ca
Updated Jul 13, 2023
Cite
Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.5683/SP3/ZCN177
Dataset updated
Jul 13, 2023
Dataset provided by
Borealis
Authors
Rong Luo
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Sample data for exercises in Further Adventures in Data Cleaning.

