100+ datasets found
  1. Data from: Evaluating Supplemental Samples in Longitudinal Research:...

    • tandf.figshare.com
    txt
    Updated Feb 9, 2024
    Cite
    Laura K. Taylor; Xin Tong; Scott E. Maxwell (2024). Evaluating Supplemental Samples in Longitudinal Research: Replacement and Refreshment Approaches [Dataset]. http://doi.org/10.6084/m9.figshare.12162072.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Laura K. Taylor; Xin Tong; Scott E. Maxwell
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Despite the wide application of longitudinal studies, they are often plagued by missing data and attrition. The majority of methodological approaches focus on participant retention or modern missing data analysis procedures. This paper, however, takes a new approach by examining how researchers may supplement the sample with additional participants. First, refreshment samples use the same selection criteria as the initial study. Second, replacement samples identify auxiliary variables that may help explain patterns of missingness and select new participants based on those characteristics. A simulation study compares these two strategies for a linear growth model with five measurement occasions. Overall, the results suggest that refreshment samples lead to less relative bias, greater relative efficiency, and more acceptable coverage rates than replacement samples or not supplementing the missing participants in any way. Refreshment samples also have high statistical power. The comparative strengths of the refreshment approach are further illustrated through a real data example. These findings have implications for assessing change over time when researching at-risk samples with high levels of permanent attrition.

  2. Dataset for paper: How Twitter Data Sampling Biases U.S. Voter Behavior...

    • zenodo.org
    zip
    Updated May 14, 2022
    + more versions
    Cite
    Kai-Cheng Yang; Pik-Mai Hui; Filippo Menczer (2022). Dataset for paper: How Twitter Data Sampling Biases U.S. Voter Behavior Characterizations [Dataset]. http://doi.org/10.5281/zenodo.6547792
    Explore at:
    Available download formats: zip
    Dataset updated
    May 14, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Kai-Cheng Yang; Pik-Mai Hui; Filippo Menczer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    This repository contains the data and code for the paper "How Twitter Data Sampling Biases U.S. Voter Behavior Characterizations."

  3. FSIS Laboratory Sampling Data - Raw Beef Sampling

    • catalog.data.gov
    • s.cnmilf.com
    Updated May 8, 2025
    + more versions
    Cite
    Food Safety and Inspection Service (2025). FSIS Laboratory Sampling Data - Raw Beef Sampling [Dataset]. https://catalog.data.gov/dataset/fsis-raw-beef-sampling-data
    Explore at:
    Dataset updated
    May 8, 2025
    Dataset provided by
    Food Safety and Inspection Service: http://www.fsis.usda.gov/
    Description

    Establishment specific sampling results for Raw Beef sampling projects. Current data is updated quarterly; archive data is updated annually. Data is split by FY. See the FSIS website for additional information.

  4. Data from: GIRT-Data: Sampling GitHub Issue Report Templates

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 13, 2023
    Cite
    Hinrich Schütze (2023). GIRT-Data: Sampling GitHub Issue Report Templates [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7724792
    Explore at:
    Dataset updated
    Mar 13, 2023
    Dataset provided by
    Amir Hossein Kargaran
    Nafiseh Nikeghbal
    Abbas Heydarnoori
    Hinrich Schütze
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    GIRT-Data is the first and largest dataset of issue report templates (IRTs) in both YAML and Markdown format. This dataset and its corresponding open-source crawler tool are intended to support research in this area and to encourage more developers to use IRTs in their repositories. The stable version of the dataset contains 1_084_300 repositories, of which 50_032 support IRTs.

    For more details see the GitHub page of the dataset: https://github.com/kargaranamir/girt-data

    The dataset has been accepted for the MSR 2023 conference under the title "GIRT-Data: Sampling GitHub Issue Report Templates".

  5. Sampling data.

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Lena Teuber; Anna Schukat; Wilhelm Hagen; Holger Auel (2023). Sampling data. [Dataset]. http://doi.org/10.1371/journal.pone.0077590.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Lena Teuber; Anna Schukat; Wilhelm Hagen; Holger Auel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sampling intervals highlighted in bold numbers indicate the approximate vertical extent of the oxygen minimum zone (O2≤45 µmol kg−1). D = Discovery cruise, MSM = Maria S. Merian cruises, UTC = universal time code, O2 min = lowest oxygen concentration at the respective station, O2 min depth = depth of the oxygen minimum at the respective station, SST = sea surface temperature, n.d. = no data, * = stations analysed for copepod abundance.

  6. FSIS Laboratory Sampling Data - Siluriformes Product Sampling

    • catalog.data.gov
    Updated May 8, 2025
    Cite
    Food Safety and Inspection Service (2025). FSIS Laboratory Sampling Data - Siluriformes Product Sampling [Dataset]. https://catalog.data.gov/dataset/fsis-raw-siluriformes-product-sampling-data
    Explore at:
    Dataset updated
    May 8, 2025
    Dataset provided by
    Food Safety and Inspection Service: http://www.fsis.usda.gov/
    Description

    Establishment specific sampling results for Siluriformes Product sampling projects. Current data is updated quarterly; archive data is updated annually. Data is split by FY. See the FSIS website for additional information.

  7. Alabama Near Coastal Meteorological & Hydrographic Continuous Data Sampling...

    • gimi9.com
    • accession.nodc.noaa.gov
    • +2more
    Cite
    Alabama Near Coastal Meteorological & Hydrographic Continuous Data Sampling from 2003 to present [Dataset]. https://gimi9.com/dataset/data-gov_756237c2a0ad892b5eda5cdabcda495ce3389908
    Explore at:
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Alabama
    Description

    The Alabama Real-time Coastal Observing System (ARCOS) with support of the Dauphin Island Sea Lab is a network of continuously sampling observing stations that collect observations of meteorological and hydrographic data from fixed stations operating across coastal Alabama. Data were collected from 2003 through the present and include parameters such as air temperature, relative humidity, solar and quantum radiation, barometric pressure, wind speed, wind direction, precipitation amounts, water temperature, salinity, dissolved oxygen, water height, and other water quality data. Stations, when possible, are designed to collect the same data in the same way, though there are exceptions given unique location needs (see individual accession abstracts for details). Stations are strategically placed to sample across salinity gradients, from delta to offshore, and the width of the coast.

  8. Sample names, sampling descriptions and contextual data.

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Linda A. Amaral-Zettler; Elizabeth A. McCliment; Hugh W. Ducklow; Susan M. Huse (2023). Sample names, sampling descriptions and contextual data. [Dataset]. http://doi.org/10.1371/journal.pone.0006372.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Linda A. Amaral-Zettler; Elizabeth A. McCliment; Hugh W. Ducklow; Susan M. Huse
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample names, sampling descriptions and contextual data.

  9. Water Quality Sampling Data

    • catalog.data.gov
    • data.austintexas.gov
    • +1more
    Updated Jul 25, 2025
    + more versions
    Cite
    data.austintexas.gov (2025). Water Quality Sampling Data [Dataset]. https://catalog.data.gov/dataset/water-quality-sampling-data
    Explore at:
    Dataset updated
    Jul 25, 2025
    Dataset provided by
    data.austintexas.gov
    Description

    Data collected to assess water quality conditions in the natural creeks, aquifers and lakes in the Austin area. This is raw data, provided directly from our Water Resources Monitoring database (WRM) and should be considered provisional. Data may or may not have been reviewed by project staff. A map of site locations can be found by searching for LOCATION.WRM_SAMPLE_SITES; you may then use those WRM_SITE_IDs to filter in this dataset using the field SAMPLE_SITE_NO.
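
    The filtering step described above can be sketched in a few lines. This is only an illustration under assumptions: the field name SAMPLE_SITE_NO comes from the description, while the other columns, the site numbers and the function name `filter_by_site` are hypothetical.

```python
import csv
import io


def filter_by_site(rows, site_ids, field="SAMPLE_SITE_NO"):
    """Keep only the rows whose site number is in site_ids."""
    wanted = {str(site_id) for site_id in site_ids}
    return [row for row in rows if row[field] in wanted]


# Hypothetical extract: only the SAMPLE_SITE_NO column name is taken
# from the dataset description; everything else is made up.
sample_csv = io.StringIO(
    "SAMPLE_SITE_NO,PARAMETER,RESULT\n"
    "101,PH,7.4\n"
    "202,PH,8.1\n"
    "101,TEMP_C,21.5\n"
)
rows = list(csv.DictReader(sample_csv))
selected = filter_by_site(rows, [101])
```

    In practice the site numbers would be the WRM_SITE_IDs looked up in LOCATION.WRM_SAMPLE_SITES.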

  10. Data from: Sample metadata

    • fairdomhub.org
    xlsx
    Updated Jul 1, 2021
    Cite
    Thomas Harvey (2021). Sample metadata [Dataset]. https://fairdomhub.org/data_files/1440
    Explore at:
    Available download formats: xlsx (43.9 KB)
    Dataset updated
    Jul 1, 2021
    Authors
    Thomas Harvey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Information on samples submitted for RNAseq

    Rows are individual samples

    Columns are: ID, Sample Name, Date sampled, Species, Sex, Tissue, Geographic location, Date extracted, Extracted by, Nanodrop Conc. (ng/µl), 260/280, 260/230, RIN, Plate ID, Position, Index name, Index Seq, Qubit BR kit Conc. (ng/ul), BioAnalyzer Conc. (ng/ul), BioAnalyzer bp (region 200-1200), Submission reference, Date submitted, Conc. (nM), Volume provided, PE/SE, Number of reads, Read length

  11. Demographic and Health Survey 1996-1997 - Bangladesh

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1more
    Updated May 26, 2017
    + more versions
    Cite
    Mitra & Associates/ NIPORT (2017). Demographic and Health Survey 1996-1997 - Bangladesh [Dataset]. https://microdata.worldbank.org/index.php/catalog/1335
    Explore at:
    Dataset updated
    May 26, 2017
    Dataset provided by
    National Institute of Population Research and Training: http://niport.gov.bd/
    Authors
    Mitra & Associates/ NIPORT
    Time period covered
    1996 - 1997
    Area covered
    Bangladesh
    Description

    Abstract

    The Bangladesh Demographic and Health Survey (BDHS) is part of the worldwide Demographic and Health Surveys program, which is designed to collect data on fertility, family planning, and maternal and child health.

    The BDHS is intended to serve as a source of population and health data for policymakers and the research community. In general, the objectives of the BDHS are to:

    • assess the overall demographic situation in Bangladesh,
    • assist in the evaluation of the population and health programs in Bangladesh, and
    • advance survey methodology.

    More specifically, the objective of the BDHS is to provide up-to-date information on fertility and childhood mortality levels; nuptiality; fertility preferences; awareness, approval, and use of family planning methods; breastfeeding practices; nutrition levels; and maternal and child health. This information is intended to assist policymakers and administrators in evaluating and designing programs and strategies for improving health and family planning services in the country.

    Geographic coverage

    National

    Analysis unit

    • Household
    • Children under five years
    • Women age 10-49
    • Men age 15-59

    Kind of data

    Sample survey data

    Sampling procedure

    Bangladesh is divided into six administrative divisions, 64 districts (zillas), and 490 thanas. In rural areas, thanas are divided into unions and then mauzas, a land administrative unit. Urban areas are divided into wards and then mahallas. The 1996-97 BDHS employed a nationally-representative, two-stage sample that was selected from the Integrated Multi-Purpose Master Sample (IMPS) maintained by the Bangladesh Bureau of Statistics. Each division was stratified into three groups: 1) statistical metropolitan areas (SMAs), 2) municipalities (other urban areas), and 3) rural areas. In the rural areas, the primary sampling unit was the mauza, while in urban areas, it was the mahalla. Because the primary sampling units in the IMPS were selected with probability proportional to size from the 1991 Census frame, the units for the BDHS were sub-selected from the IMPS with equal probability so as to retain the overall probability proportional to size. A total of 316 primary sampling units were utilized for the BDHS (30 in SMAs, 42 in municipalities, and 244 in rural areas). In order to highlight changes in survey indicators over time, the 1996-97 BDHS utilized the same sample points (though not necessarily the same households) that were selected for the 1993-94 BDHS, except for 12 additional sample points in the new division of Sylhet. Fieldwork in three sample points was not possible (one in Dhaka Cantonment and two in the Chittagong Hill Tracts), so a total of 313 points were covered.

    Since one objective of the BDHS is to provide separate estimates for each division as well as for urban and rural areas separately, it was necessary to increase the sampling rate for Barisal and Sylhet Divisions and for municipalities relative to the other divisions, SMAs and rural areas. Thus, the BDHS sample is not self-weighting and weighting factors have been applied to the data in this report.

    Mitra and Associates conducted a household listing operation in all the sample points from 15 September to 15 December 1996. A systematic sample of 9,099 households was then selected from these lists. Every second household was selected for the men's survey, meaning that, in addition to interviewing all ever-married women age 10-49, interviewers also interviewed all currently married men age 15-59. It was expected that the sample would yield interviews with approximately 10,000 ever-married women age 10-49 and 3,000 currently married men age 15-59.

    Note: See details in Appendix A of the survey report.
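
    As a rough illustration of selection with probability proportional to size, the sketch below uses systematic PPS selection. This is an assumption for illustration only: the survey description does not say which PPS procedure was used for the IMPS, and the function name `pps_systematic` and the example sizes are hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Pick n unit indices with probability proportional to size.

    Systematic PPS: lay the units end to end on a line of length
    sum(sizes), then take n equally spaced points from a random start.
    A unit larger than the sampling interval can be hit more than once.
    """
    total = sum(sizes)
    step = total / n
    start = rng.random() * step
    chosen, cumulative, idx = [], 0.0, 0
    for k in range(n):
        target = start + k * step
        # Advance to the unit whose segment contains the target point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        chosen.append(idx)
    return chosen


# Hypothetical household counts for five primary sampling units.
psu_sizes = [120, 300, 80, 500, 200]
selected_psus = pps_systematic(psu_sizes, 2, random.Random(1))
```

    Larger units occupy a longer stretch of the line and are therefore more likely to contain one of the selection points.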

    Mode of data collection

    Face-to-face

    Research instrument

    Four types of questionnaires were used for the BDHS: a Household Questionnaire, a Women's Questionnaire, a Men's Questionnaire and a Community Questionnaire. The contents of these questionnaires were based on the DHS Model A Questionnaire, which is designed for use in countries with relatively high levels of contraceptive use. These model questionnaires were adapted for use in Bangladesh during a series of meetings with a small Technical Task Force that consisted of representatives from NIPORT, Mitra and Associates, USAID/Bangladesh, the International Centre for Diarrhoeal Disease Research, Bangladesh (ICDDR,B), Population Council/Dhaka, and Macro International Inc (see Appendix D for a list of members). Draft questionnaires were then circulated to other interested groups and were reviewed by the BDHS Technical Review Committee (see Appendix D for list of members). The questionnaires were developed in English and then translated into and printed in Bangla (see Appendix E for final version in English).

    The Household Questionnaire was used to list all the usual members and visitors in the selected households. Some basic information was collected on the characteristics of each person listed, including his/her age, sex, education, and relationship to the head of the household. The main purpose of the Household Questionnaire was to identify women and men who were eligible for the individual interview. In addition, information was collected about the dwelling itself, such as the source of water, type of toilet facilities, materials used to construct the house, and ownership of various consumer goods.

    The Women's Questionnaire was used to collect information from ever-married women age 10-49. These women were asked questions on the following topics:

    • Background characteristics (age, education, religion, etc.)
    • Reproductive history
    • Knowledge and use of family planning methods
    • Antenatal and delivery care
    • Breastfeeding and weaning practices
    • Vaccinations and health of children under age five
    • Marriage
    • Fertility preferences
    • Husband's background and respondent's work
    • Knowledge of AIDS
    • Height and weight of children under age five and their mothers

    The Men's Questionnaire was used to interview currently married men age 15-59. It was similar to that for women except that it omitted the sections on reproductive history, antenatal and delivery care, breastfeeding, vaccinations, and height and weight. The Community Questionnaire was completed for each sample point and included questions about the existence in the community of income-generating activities and other development organizations and the availability of health and family planning services.

    Response rate

    A total of 9,099 households were selected for the sample, of which 8,682 were successfully interviewed. The shortfall is primarily due to dwellings that were vacant or in which the inhabitants had left for an extended period at the time they were visited by the interviewing teams. Of the 8,762 households occupied, 99 percent were successfully interviewed. In these households, 9,335 women were identified as eligible for the individual interview (i.e., ever-married and age 10-49) and interviews were completed for 9,127 or 98 percent of them. In the half of the households that were selected for inclusion in the men's survey, 3,611 eligible ever-married men age 15-59 were identified, of whom 3,346 or 93 percent were interviewed.

    The principal reason for non-response among eligible women and men was the failure to find them at home despite repeated visits to the household. The refusal rate was low.

    Note: See summarized response rates by residence (urban/rural) in Table 1.1 of the survey report.
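
    The response rates quoted above can be reproduced from the counts given in the text. The snippet below is a simple arithmetic check; the helper name `response_rate` is hypothetical.

```python
def response_rate(completed, eligible):
    """Completed interviews as a whole-number percentage of eligible units."""
    return round(100 * completed / eligible)


# Counts taken from the response-rate paragraph above.
household_rate = response_rate(8682, 8762)  # occupied households
women_rate = response_rate(9127, 9335)      # eligible women
men_rate = response_rate(3346, 3611)        # eligible men
unoccupied = 9099 - 8762                    # selected households not occupied
```

    Rounded to whole percentages, these come out at 99, 98 and 93 percent, matching the figures in the text.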

    Sampling error estimates

    The estimates from a sample survey are affected by two types of errors: (1) non-sampling errors, and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the BDHS to minimize this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.

    Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the BDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.

    A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
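
    The relationship between variance, standard error and the "plus or minus two standard errors" interval can be sketched as follows. The estimate and variance used here are hypothetical values, not BDHS results, and the function name `confidence_interval` is an assumption.

```python
import math


def confidence_interval(estimate, variance, multiplier=2.0):
    """Return (lower, upper) as estimate -/+ multiplier * standard error."""
    standard_error = math.sqrt(variance)  # SE is the square root of the variance
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


# Hypothetical example: a proportion of 0.49 with an estimated variance of 0.0001.
lower, upper = confidence_interval(0.49, 0.0001)
```

    With these made-up inputs the standard error is 0.01, so the interval runs from roughly 0.47 to 0.51. For a complex design such as the BDHS, the variance itself has to come from a design-based estimator rather than the simple random sampling formula.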

    If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the BDHS sample is the result of a two-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the BDHS is the ISSA Sampling Error Module. This module used the Taylor

  12. FSIS Laboratory Sampling Data - NARMS Cecal Sampling

    • gimi9.com
    • catalog.data.gov
    Updated Aug 7, 2024
    Cite
    (2024). FSIS Laboratory Sampling Data - NARMS Cecal Sampling [Dataset]. https://gimi9.com/dataset/data-gov_fsis-narms-cecal-sampling/
    Explore at:
    Dataset updated
    Aug 7, 2024
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The data products are the sampling results from FSIS’ National Antimicrobial Resistance Monitoring System (NARMS) Cecal sampling program. Sampling results from the NARMS Product sampling program are currently posted on the FSIS website, grouped by commodity (https://www.fsis.usda.gov/science-data/data-sets-visualizations/laboratory-sampling-data). The antimicrobials and bacteria tested under NARMS are selected based on their importance to human health and use in food-producing animals (FDA Guidance for Industry #152 (https://www.fda.gov/media/69949/download)). Cecal contents from cattle, swine, chicken, and turkeys were sampled as part of FSIS’s routine NARMS cecal sampling program for major species.

  13. Data from: Data for Predictive Modelling of Laminated Composite Plates

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1more
    Updated Jul 5, 2021
    Cite
    Kanak Kalita; Shankar Chakraborty; S Madhu; Manickam Ramachandran; Xiao-Zhi Gao (2021). Data for Predictive Modelling of Laminated Composite Plates [Dataset]. http://doi.org/10.5281/zenodo.5069421
    Explore at:
    Dataset updated
    Jul 5, 2021
    Authors
    Kanak Kalita; Shankar Chakraborty; S Madhu; Manickam Ramachandran; Xiao-Zhi Gao
    Description

    Two different problems are considered: a low-dimensional (LD) problem and a high-dimensional (HD) problem. The LD problem has 2 variables for a 4-ply symmetric square composite laminate. Similarly, the HD problem consists of 16 variables for a 32-ply symmetric square composite laminate. The value of h for the LD and HD problems is taken as 0.005 and 0.04 respectively. For each problem, three different sampling techniques are adopted: random sampling (RS), Latin hypercube sampling (LHS) [1] and Hammersley sampling (HS) [2]. RS, LHS and HS differ primarily in the uniformity of sample points over the design space, with RS giving the least uniform and HS the most uniform distribution of sample points. Based on the recommendations of Jin et al. [3], and Zhao and Xue [4], 72 and 612 sample points are considered in each training dataset of the LD and HD problems respectively. Based on the FE formulation, several high-fidelity datasets for the LD and HD problems are generated, as presented in the Supplementary Material file “Predictive modelling of laminated composite plates.xlsx” in nine sheets that are organized as detailed in Table 1. References: 1. McKay, M. D.; Beckman, R. J.; Conover, W. J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 2000, 42, 55-61. 2. Hammersley, J. M. Monte Carlo methods for solving multivariable problems. Annals of the New York Academy of Sciences, 1960, 86, 844-874. 3. Jin, R.; Chen, W.; Simpson, T. W. Comparative studies of metamodelling techniques under multiple modelling criteria. Structural and Multidisciplinary Optimization, 2001, 23, 1-13. 4. Zhao, D.; Xue, D. A comparative study of metamodeling methods considering sample quality merits. Structural and Multidisciplinary Optimization, 2010, 42, 923-938.
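
    To make the difference between random sampling and Latin hypercube sampling concrete, here is a minimal sketch on the unit hypercube. It is not the authors' code: the function names are hypothetical, and only the point count (72) and variable count (2) of the LD problem are taken from the description.

```python
import random


def random_sampling(n, dims, rng):
    """n points drawn independently and uniformly from the unit hypercube."""
    return [[rng.random() for _ in range(dims)] for _ in range(n)]


def latin_hypercube(n, dims, rng):
    """n points such that every variable has exactly one point per stratum.

    Each axis is cut into n equal strata [k/n, (k+1)/n); one value is drawn
    in each stratum and the strata are shuffled independently per axis.
    """
    columns = []
    for _ in range(dims):
        column = [(k + rng.random()) / n for k in range(n)]
        rng.shuffle(column)
        columns.append(column)
    return [[columns[d][i] for d in range(dims)] for i in range(n)]


rng = random.Random(42)
rs_points = random_sampling(72, 2, rng)
lhs_points = latin_hypercube(72, 2, rng)
```

    With plain random sampling some strata of a variable may receive several points and others none, whereas the Latin hypercube design covers each stratum of each variable exactly once.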

  14. Sample data

    • figshare.com
    application/gzip
    Updated Oct 25, 2022
    Cite
    Christopher Bowden (2022). Sample data [Dataset]. http://doi.org/10.6084/m9.figshare.21395373.v1
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Oct 25, 2022
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Christopher Bowden
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Five years of data (1980-1984) that can be used as input (and represents the input format) for the associated RF code R script.

    N.B. to use without any modifications to the R script, this dataset must be stored in a sub-directory named 'vars'.

  15. Energy Consumption in Transport Survey 2014, Main Results - West Bank and...

    • pcbs.gov.ps
    Updated Dec 12, 2021
    Cite
    Palestinian Central Bureau of Statistics (2021). Energy Consumption in Transport Survey 2014, Main Results - West Bank and Gaza [Dataset]. https://www.pcbs.gov.ps/PCBS-Metadata-en-v5.2/index.php/catalog/699
    Explore at:
    Dataset updated
    Dec 12, 2021
    Dataset authored and provided by
    Palestinian Central Bureau of Statistics: http://pcbs.gov.ps/
    Time period covered
    2015
    Area covered
    Palestine, West Bank
    Description

    Abstract

    Most countries collect official statistics on energy use due to its vital role in the infrastructure, economy and living standards.

    In Palestine, additional attention is warranted for energy statistics due to a scarcity of natural resources, the high cost of energy and high population density. These factors demand comprehensive and high quality statistics.

    In this context, PCBS decided to conduct a special Energy Consumption in Transport Survey to provide high quality data about energy consumption by type, expenditure on maintenance and insurance for vehicles, and vehicle motor capacity and year of production.

    The survey aimed to provide data on energy consumption by the transport sector and on energy consumption by vehicle type, motor capacity and year of production.

    Geographic coverage

    Palestine

    Analysis unit

    Vehicles

    Universe

    All the operating vehicles in Palestine in 2014.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Target Population: All the operating vehicles in Palestine in 2014.

    2.1 Sample Frame: A list of the operating vehicles in Palestine in 2014, broken down by governorate and vehicle type. This list was obtained from the Ministry of Transport.

    2.2.1 Sample Size: The sample size is 6,974 vehicles.

    2.2.2 Sampling Design: A stratified random sample; in some of the small strata, quota sampling was used to cover them.

    The vehicle sample was reached by: 1) visiting all the dynamometers (the centers for testing vehicles); 2) selecting a random sample of vehicles by vehicle type, model, fuel type and engine capacity.
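
    A stratified random sample of the kind named above can be sketched as follows. The description does not state how the sample was allocated across strata, so proportional allocation is an assumption here, as are the function name `stratified_sample`, the sampling fraction and the toy frame.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, fraction, rng):
    """Draw a simple random sample of about `fraction` within each stratum."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        units = strata[name]
        # At least one unit per stratum so that small strata are not dropped.
        k = max(1, round(fraction * len(units)))
        sample.extend(rng.sample(units, k))
    return sample


# Hypothetical frame: 100 private cars and 20 trucks.
frame = [{"id": i, "type": "car"} for i in range(100)]
frame += [{"id": 100 + i, "type": "truck"} for i in range(20)]
vehicles = stratified_sample(frame, lambda v: v["type"], 0.1, random.Random(7))
```

    In the survey itself the strata would be defined by governorate and vehicle type, and small strata were covered by quota sampling rather than random selection.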

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire design drew on the experience of similar countries in energy statistics, so as to cover the most important indicators for energy statistics in the transport sector, taking into account Palestine's particular situation.

    Cleaning operations

    The data processing stage consisted of the following operations: Editing and coding prior to data entry: all questionnaires were edited and coded in the office using the same instructions adopted for editing in the field.

    Data entry: The survey questionnaire was uploaded on office computers. At this stage, data were entered into the computer using a data entry template developed in an Access database. The data entry program was prepared to satisfy a number of requirements:

    • To prevent the duplication of questionnaires during data entry.
    • To apply checks on the integrity and consistency of entered data.
    • To handle errors in a user-friendly manner.
    • To allow captured data to be transferred to another format for analysis with statistical software such as SPSS.

    Audit after data entry: The entered data file was pulled periodically and reviewed. Abnormal values were examined and consistency between the different questions in the questionnaire was checked. If errors were found in the entered data, the questionnaire was withdrawn and the data verified and corrected, until the final data file was obtained.

    Extraction of results: The final results of the report were extracted using SPSS and then presented as tables in Excel format.

    Response rate

    80.7%

    Sampling error estimates

    Data of this survey may be affected by sampling errors due to use of a sample and not a complete enumeration. Therefore, certain differences are anticipated in comparison with the real values obtained through censuses. The variance was calculated for the most important indicators: the variance table is attached with the final report. There is no problem in the dissemination of results at national and regional level (North, Middle, South of West Bank, Gaza Strip).

    Data appraisal

    The survey sample consisted of 6,974 vehicles, for which 5,631 questionnaires were completed: 3,652 in the West Bank and 1,979 in the Gaza Strip.

  16. Field data for seasonal synoptic sampling of 100 urban streams in Miami,...

    • portal.edirepository.org
    csv
    Updated Mar 10, 2025
    Cite
    Liz Ortiz Muñoz; John Kominoski; Christopher Rizzie (2025). Field data for seasonal synoptic sampling of 100 urban streams in Miami, Florida (USA), 2021-2022 [Dataset]. http://doi.org/10.6073/pasta/e5c0e15e0c96eaaf9123e13727dbff4a
    Explore at:
    Available download formats: csv (49105 bytes)
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    EDI
    Authors
    Liz Ortiz Muñoz; John Kominoski; Christopher Rizzie
    Time period covered
    Jul 8, 2021 - Jun 13, 2022
    Area covered
    Variables measured
    ph, city, rain, ORP_mv, curbid, do_mgl, season, stream, temp_c, TDS_g_L, and 15 more
    Description

    This dataset contains field measurements taken during water sampling from 100 urban stream locations in the greater Miami, Florida metropolitan area. Field collection took place during five synoptic sampling events: Summer 2021 (Wet; July 8 to July 27), Fall 2021 (Wet; September 27 to October 7), Winter 2022 (Dry; January 3 to January 13), Spring 2022 (Dry; April 7 to April 23), and Summer 2022 (Wet; June 1 to June 13) to capture spatial and seasonal variation in stream conditions (specific conductivity, water temperature, dissolved oxygen, pH). Filtered stream samples were analyzed for dissolved organic carbon concentration and characteristics, available in a separate dataset. These data were collected as part of the Carbon in Urban Rivers Biogeochemistry (CURB) Project. Detailed field data and site data are published separately and can be linked using the “curbid” and “synoptic_event” columns in each dataset.
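
    Linking on the two key columns could look roughly like the sketch below. It uses only the Python standard library; the inline rows, the "note" column, and the function name link_rows are hypothetical stand-ins, not values or code from the published files. It also assumes each (curbid, synoptic_event) pair occurs at most once in the second file.

```python
import csv
import io

# Hypothetical stand-ins for the two published CSV files; all values are invented.
field_csv = """curbid,synoptic_event,temp_c,ph
S001,1,29.4,7.6
S001,2,27.9,7.4
S002,1,30.1,7.9
"""
other_csv = """curbid,synoptic_event,note
S001,1,clear
S002,1,turbid
"""

def link_rows(left_text, right_text, keys=("curbid", "synoptic_event")):
    """Inner-join two CSV texts on the given key columns."""
    # Index the right-hand rows by their key tuple (assumes keys are unique there).
    right_index = {}
    for row in csv.DictReader(io.StringIO(right_text)):
        right_index[tuple(row[k] for k in keys)] = row

    # Keep only left-hand rows that have a matching key, merging the columns.
    linked = []
    for row in csv.DictReader(io.StringIO(left_text)):
        match = right_index.get(tuple(row[k] for k in keys))
        if match is not None:
            linked.append({**row, **match})
    return linked

linked = link_rows(field_csv, other_csv)
for row in linked:
    print(row)
```

    With real files, the inline strings would be replaced by the contents of the downloaded CSVs; rows without a partner in the other file are dropped by this inner join.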

  17. Sampling algorithms in statistical physics: a guide for statistics and...

    • data.bris.ac.uk
    Updated Mar 1, 2023
    Cite
    (2023). Sampling algorithms in statistical physics: a guide for statistics and machine learning - Datasets - data.bris [Dataset]. https://data.bris.ac.uk/data/dataset/sju7uasr7e2b2n518hk72p3ur
    Explore at:
    Dataset updated
    Mar 1, 2023
    Description

    This directory contains the research data published in the paper by Michael F. Faulkner and Samuel Livingstone entitled: Sampling algorithms in statistical physics: a guide for statistics and machine learning

  18. Surface Water - Sampling Location Information

    • s.cnmilf.com
    • datasets.ai
    • +2 more
    Updated Nov 27, 2024
    + more versions
    Cite
    California State Water Resources Control Board (2024). Surface Water - Sampling Location Information [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/surface-water-sampling-location-information
    Explore at:
    Dataset updated
    Nov 27, 2024
    Dataset provided by
    California State Water Resources Control Board
    Description

    Information about sampling locations for data from the California Environmental Data Exchange Network (CEDEN). This set of station/project combinations can be combined with other data sets from CEDEN to provide more information. CEDEN is the California State Water Board's data system for surface water quality in California, and seeks to include all available statewide data (such as that produced by research and volunteer organizations). Data in CEDEN include field, sediment and water column data collected from freshwater, estuarine, and marine environments. Examples of data in CEDEN come from laboratory, physical and biological analyses and include data types associated with chemical, toxicological, field, bioassessment, invertebrate, fish, and bacteriological assay assessments.

  19. UCI and OpenML Data Sets for Ordinal Quantification

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 25, 2023
    Cite
    Moreo, Alejandro (2023). UCI and OpenML Data Sets for Ordinal Quantification [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8177301
    Explore at:
    Dataset updated
    Jul 25, 2023
    Dataset provided by
    Bunse, Mirko
    Senz, Martin
    Sebastiani, Fabrizio
    Moreo, Alejandro
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These four labeled data sets are targeted at ordinal quantification. The goal of quantification is not to predict the label of each individual instance, but the distribution of labels in unlabeled sets of data.

    With the scripts provided, you can extract CSV files from the UCI machine learning repository and from OpenML. The ordinal class labels stem from a binning of a continuous regression label.

    We complement this data set with the indices of data items that appear in each sample of our evaluation. Hence, you can precisely replicate our samples by drawing the specified data items. The indices stem from two evaluation protocols that are well suited for ordinal quantification. To this end, each row in the files app_val_indices.csv, app_tst_indices.csv, app-oq_val_indices.csv, and app-oq_tst_indices.csv represents one sample.

    Our first protocol is the artificial prevalence protocol (APP), where all possible distributions of labels are drawn with an equal probability. The second protocol, APP-OQ, is a variant thereof, where only the smoothest 20% of all APP samples are considered. This variant is targeted at ordinal quantification tasks, where classes are ordered and a similarity of neighboring classes can be assumed.
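
    The two protocols can be illustrated with a minimal sketch. It only draws label distributions and does not select data items, so it does not reproduce the published index files. The smoothness score used here (sum of squared second differences) is an assumption made for illustration, and all function names are hypothetical.

```python
import random

def sample_prevalences(n_classes, rng):
    """Draw one label distribution uniformly from the probability simplex."""
    # Normalised i.i.d. exponential draws follow Dirichlet(1, ..., 1),
    # which is the uniform distribution over all label distributions.
    draws = [rng.expovariate(1.0) for _ in range(n_classes)]
    total = sum(draws)
    return [d / total for d in draws]

def roughness(p):
    """Assumed smoothness score: sum of squared second differences (lower is smoother)."""
    return sum((p[i - 1] - 2 * p[i] + p[i + 1]) ** 2 for i in range(1, len(p) - 1))

def app_oq(samples, keep_fraction=0.2):
    """Keep only the smoothest fraction of the APP samples."""
    n_keep = max(1, int(len(samples) * keep_fraction))
    return sorted(samples, key=roughness)[:n_keep]

rng = random.Random(0)  # fixed seed so the run is repeatable
app_samples = [sample_prevalences(5, rng) for _ in range(1000)]
app_oq_samples = app_oq(app_samples)

print(len(app_samples), len(app_oq_samples))
# prints: 1000 200
```

    Sorting by the score and truncating keeps the 20% of distributions whose neighboring class prevalences vary least, which matches the idea that adjacent ordinal classes should look similar.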

    Usage

    You can extract four CSV files through the provided script extract-oq.jl, which is conveniently wrapped in a Makefile. The Project.toml and Manifest.toml specify the Julia package dependencies, similar to a requirements file in Python.

    Preliminaries: You have to have a working Julia installation. We have used Julia v1.6.5 in our experiments.

    Data Extraction: In your terminal, you can call either

    make

    (recommended), or

    julia --project="." --eval "using Pkg; Pkg.instantiate()"
    julia --project="." extract-oq.jl

    Outcome: The first row in each CSV file is the header. The first column, named "class_label", is the ordinal class.

    Further Reading

    Implementation of our experiments: https://github.com/mirkobunse/regularized-oq

  20. Data Cleaning Sample

    • borealisdata.ca
    Updated Jul 13, 2023
    Cite
    Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Borealis
    Authors
    Rong Luo
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Sample data for exercises in Further Adventures in Data Cleaning.

