100+ datasets found
  1. Data from: Probability waves: adaptive cluster-based correction by...

    • data.mendeley.com
    • narcis.nl
    Updated Feb 8, 2021
    Cite
    DIMITRI ABRAMOV (2021). Probability waves: adaptive cluster-based correction by convolution of p-value series from mass univariate analysis [Dataset]. http://doi.org/10.17632/rrm4rkr3xn.1
    Dataset updated
    Feb 8, 2021
    Authors
    DIMITRI ABRAMOV
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset and Octave/MATLAB code and scripts for data analysis.

    Background: Methods for p-value correction are criticized for either increasing Type II error or improperly reducing Type I error. The problem is worse when thousands, or even hundreds, of paired point-to-point comparisons are made between waves or images. This text considers patterns in the probability vectors that result from multiple point-to-point comparisons between two event-related potential (ERP) waves (mass univariate analysis) in order to correct p-values, where clusters of significant p-values may indicate true H0 rejection.

    New method: We used ERP data from normal subjects and from subjects with attention deficit hyperactivity disorder (ADHD) under a cued forced two-choice test to study attention. The decimal logarithm of the p-vector (p') was convolved with a Gaussian window whose length was set as the shortest lag above which the autocorrelation of each ERP wave may be assumed to have vanished. To verify the reliability of the correction method, we ran Monte Carlo (MC) simulations to (1) evaluate confidence intervals of rejected and non-rejected areas of our data, (2) evaluate differences between corrected and uncorrected p-vectors, or simulated ones, in terms of the distribution of significant p-values, and (3) empirically verify the Type I error rate (comparing 10,000 pairs of mixed samples with control and ADHD subjects).

    Results: The method reduced the range of p'-values that did not show covariance with neighbors (Type I and also Type II errors). The differences between the simulation or raw p-vector and the corrected p-vectors were, respectively, minimal and maximal for the window length set by autocorrelation in p-vector convolution.

    Comparison with existing methods: Our method was less conservative, while FDR methods rejected basically all significant p-values for the Pz and O2 channels. The MC simulations, the gold-standard method for error correction, differed from the corrected p-vector by 2.78±4.83% (all 20 channels), while the difference between the raw and corrected p-vectors was 5.96±5.00% (p = 0.0003).

    Conclusion: As a cluster-based correction, the new method seems to be biologically and statistically suitable for correcting p-values in mass univariate analysis of ERP waves, and it adopts adaptive parameters to set the correction.
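    The core step described in the abstract, convolving the decimal logarithm of a p-value series with a Gaussian window, can be sketched as follows. This is a minimal illustration and not the authors' Octave/MATLAB code: the function names, the default `sigma = length / 6`, and the edge padding are assumptions, and the window length is passed in directly rather than derived from the ERP autocorrelation.

```python
import math

def gaussian_window(length, sigma=None):
    """Normalized Gaussian window; length is forced odd (assumed convention)."""
    if length % 2 == 0:
        length += 1
    if sigma is None:
        sigma = length / 6.0  # assumed default, not taken from the source
    half = length // 2
    weights = [math.exp(-0.5 * ((i - half) / sigma) ** 2) for i in range(length)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_p_values(p_values, length):
    """Convolve log10(p) with a Gaussian window and map back to p-values."""
    window = gaussian_window(length)
    half = len(window) // 2
    log_p = [math.log10(max(p, 1e-300)) for p in p_values]  # guard against log(0)
    n = len(log_p)
    smoothed = []
    for t in range(n):
        acc = 0.0
        for k, w in enumerate(window):
            idx = min(max(t + k - half, 0), n - 1)  # repeat edge values at the borders
            acc += w * log_p[idx]
        smoothed.append(10.0 ** acc)
    return smoothed

# Hypothetical p-vector: one isolated significant point and one cluster.
p = [0.5] * 50
p[10] = 0.01
for i in range(30, 40):
    p[i] = 0.01
corrected = smooth_p_values(p, length=5)
```

    In this toy series the isolated point at index 10 is pulled above 0.05 by its non-significant neighbors, while points well inside the cluster keep their original value, which is the behavior a cluster-based correction aims for.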

  2. 3d printing errors

    • kaggle.com
    Updated Feb 20, 2024
    Cite
    NilsHagenBeyer (2024). 3d printing errors [Dataset]. https://www.kaggle.com/datasets/nimbus200/3d-printing-errors
    Dataset updated
    Feb 20, 2024
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    NilsHagenBeyer
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains images of 3d printed parts recorded while printing.

    The dataset contains 4 classes and 34 shapes:

    class     GOOD    STRINGING    UNDEREXTRUSION    SPAGHETTI
    images    5069    2798         2962              134


    Labels and metadata:

    image             image file name
    class             0: Good, 1: Under-Extrusion, 2: Stringing, 4: Spaghetti
    layer             layer of completion of the printed part
    ex_mul            global extrusion multiplier during print
    shape             identifier of the printed geometry (1-34)
    recording         datetime-coded name of the print/recording
    printbed_color    color of the printbed (black, silver)
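    As a rough illustration of how the label columns above could be used, the sketch below counts images per class from a label CSV. It assumes the CSV header uses exactly the column names listed above (`image`, `class`, `layer`, `ex_mul`, `shape`, `recording`, `printbed_color`); the sample rows and the `count_classes` helper are hypothetical and only stand in for a real file.

```python
import csv
import io
from collections import Counter

# Class codes as listed in the dataset description.
CLASS_NAMES = {0: "Good", 1: "Under-Extrusion", 2: "Stringing", 4: "Spaghetti"}

def count_classes(csv_file, printbed_color=None):
    """Count rows per class name, optionally keeping only one printbed color."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        if printbed_color is not None and row["printbed_color"] != printbed_color:
            continue
        label = int(row["class"])
        counts[CLASS_NAMES.get(label, f"unknown ({label})")] += 1
    return counts

# Hypothetical rows in the assumed layout; real data would come from a label CSV.
SAMPLE = (
    "image,class,layer,ex_mul,shape,recording,printbed_color\n"
    "img_a.png,0,3,1.0,1,rec_a,black\n"
    "img_b.png,1,5,0.6,1,rec_a,black\n"
    "img_c.png,2,7,1.0,2,rec_b,silver\n"
    "img_d.png,0,9,1.0,2,rec_b,silver\n"
)

all_counts = count_classes(io.StringIO(SAMPLE))
black_counts = count_classes(io.StringIO(SAMPLE), printbed_color="black")
```

    With a real file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle, for example `open(path, newline="")`.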

    Recording Process

    The dataset was recorded in the context of this work: https://github.com/NilsHagenBeyer/FDM_error_detection

    The images were recorded with an ELP-USB13MAFKV76 digital autofocus camera with the Sony IMX214 sensor chip at a resolution of 3264x2448 and were later downscaled to 256x256 px. All prints were carried out on a customized Creality Ender-3 Pro 3D printer.

    The images were mainly recorded with a black printbed from camera position 1. For testing purposes, the dataset also contains a few images from camera position 2 (oblique camera) with a black printbed (significant motion blur) and from camera position 1 with a silver printbed. The positions can be seen in the image below.

    [Image: exp_setup.png, experimental setup showing the camera positions]

    Folder Structure

    ├── general data
    │   ├── all_images_no_filter.csv    # Full dataset, unfiltered
    │   ├── all_images.csv              # Full dataset, no spaghetti error
    │   └── black_bed_all.csv           # Full dataset, no silver bed
    └── images
        ├── all_images
        │   └── ...                     # All images: full dataset + silver bed + oblique camera
        ├── test_images_silver265
        │   └── ...                     # Silver bed test images
        └── test_images_oblique256
            └── ...                     # Oblique camera test images
    
  3. Data from: Characterization of types of errors committed in the evaluation...

    • datasetcatalog.nlm.nih.gov
    • scielo.figshare.com
    Updated Jun 7, 2022
    Cite
    Zwetsch, Iuberi Carson; da Costa-Ferreira, Maria Inês Dornelles; Verdun, Nubia Maria (2022). Characterization of types of errors committed in the evaluation of auditory processing through Staggered Spondaic Word test [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000246417
    Dataset updated
    Jun 7, 2022
    Authors
    Zwetsch, Iuberi Carson; da Costa-Ferreira, Maria Inês Dornelles; Verdun, Nubia Maria
    Description

    ABSTRACT: Purpose: to characterize the types of errors committed in Staggered Spondaic Words testing by patients undergoing auditory processing evaluation, and to correlate these findings with age, gender, educational level and auditory processing disorder (APD) sub-profile. Methods: the Staggered Spondaic Words test results were obtained from a private database of patients aged 7 to 19 years who were evaluated between June 2011 and September 2013. Results: the most frequent types of errors detected were word omission (76.66%), word substitution (45%) and replacement by an adjacent word (20%). The APD sub-profiles observed were auditory decoding deficit coupled with integration deficit (38.33%), auditory decoding deficit (23.33%), normal result (20%), and others (18.34%). When the conditions were compared, we observed a greater number of errors in competing conditions. In relation to age and educational level, errors were more frequent among younger patients with lower levels of education. The correlation between the total number of errors and gender was not statistically significant. Conclusion: the types of errors made in the Staggered Spondaic Words test were characterized and correlated with the proposed variables (gender, age, educational level and APD sub-profile), emphasizing the importance of the test, which is frequently used in auditory processing evaluations for the diagnosis of human communication disorders and in the identification of children at risk for learning disorders.

  4. Data from: Do multiple outcome measures require p-value adjustment?

    • healthdata.gov
    • data.virginia.gov
    • +1 more
    csv, xlsx, xml
    Updated Jul 14, 2025
    Cite
    (2025). Do multiple outcome measures require p-value adjustment? [Dataset]. https://healthdata.gov/d/vkg6-kajc
    Available download formats: csv, xlsx, xml
    Dataset updated
    Jul 14, 2025
    Description

    Background

    Readers may question the interpretation of findings in clinical trials when multiple outcome measures are used without adjustment of the p-value. This question arises because of the increased risk of Type I errors (findings of false "significance") when multiple simultaneous hypotheses are tested at set p-values. The primary aim of this study was to estimate the need to make appropriate p-value adjustments in clinical trials to compensate for a possible increased risk in committing Type I errors when multiple outcome measures are used.

    Discussion

    The classicists believe that the chance of finding at least one test statistically significant due to chance and incorrectly declaring a difference increases as the number of comparisons increases. The rationalists have the following objections to that theory: 1) P-value adjustments are calculated based on how many tests are to be considered, and that number has been defined arbitrarily and variably; 2) P-value adjustments reduce the chance of making type I errors, but they increase the chance of making type II errors or needing to increase the sample size.

    Summary

    Readers should balance a study's statistical significance with the magnitude of effect, the quality of the study and with findings from other studies. Researchers facing multiple outcome measures might want to either select a primary outcome measure or use a global assessment measure, rather than adjusting the p-value.
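
    The classicist argument in the discussion can be made concrete with a small calculation. The sketch below assumes the tests are independent, which the source does not state, and uses the standard expressions for the familywise error rate, 1 - (1 - alpha)^m, and for the Bonferroni-adjusted threshold, alpha / m. The function names are illustrative only.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m

def bonferroni_threshold(alpha, m):
    """Per-test significance level that keeps the familywise rate at or below alpha."""
    return alpha / m

# Error rate growth for 1, 5, 10 and 20 outcome measures at alpha = 0.05.
for m in (1, 5, 10, 20):
    fwer = familywise_error_rate(0.05, m)
    print(f"m={m:2d}  FWER={fwer:.3f}  Bonferroni alpha={bonferroni_threshold(0.05, m):.4f}")
```

    The stricter per-test threshold is also what drives the rationalist objection: with a smaller alpha per test, power drops unless the sample size grows.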
    
  5. Data from: Types and frequency of preanalytical mistakes in the first Thai...

    • data.virginia.gov
    • healthdata.gov
    • +1 more
    html
    Updated Sep 6, 2025
    Cite
    National Institutes of Health (2025). Types and frequency of preanalytical mistakes in the first Thai ISO 9002:1994 certified clinical laboratory, a 6 – month monitoring [Dataset]. https://data.virginia.gov/dataset/types-and-frequency-of-preanalytical-mistakes-in-the-first-thai-iso-9002-1994-certified-clinica
    Available download formats: html
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background

    Reliability cannot be achieved in a clinical laboratory through the control of accuracy in the analytical phase of the testing process alone. Indeed, a "mistake" can be defined as any defect occurring during the testing process. In the analysis of clinical specimens, there are many possible preanalytical sources of error. Therefore, the application of a quality system to laboratory testing requires total quality management throughout the laboratory process, including the preanalytical and postanalytical phases. ISO 9002:1994 is a model for quality assurance in production, installation, and servicing, which includes a number of clauses providing guidance for implementation in clinical laboratories. Our laboratory at King Chulalongkorn Memorial Hospital, the largest Thai Red Cross Society hospital, is the first clinical laboratory in Thailand to be ISO 9002:1994 certified for the whole unit.

    Method

    In this study, we evaluated the frequency and types of preanalytical mistakes found in our laboratory by monitoring specimens requested for laboratory analyses from both in-patient and out-patient divisions for 6 months.

    Result

    Among a total of 935,896 specimens for 941,902 analyses, 1,048 findings were confirmed as preanalytical mistakes, a relative frequency of 0.11% (1,048/935,896). A total of 1,240 mistakes were identified during the study period. Comparing the preanalytical mistakes to other mistakes in the laboratory process monitored in the same setting and period, the distribution of mistakes was: preanalytical 84.52% (1,048 mistakes), analytical 4.35% (54 mistakes), and postanalytical 11.13% (138 mistakes). Of the 1,048 preanalytical mistakes, 998 (95.2%) originated in the care units. All preanalytical mistakes, except for 12 (1.15%) relating to the laboratory barcode reading machine, were due to human error.

    Conclusion

    Most mistakes occurred before samples were analysed, either during sampling or preparation for analysis. This suggests that co-operation with clinicians and personnel outside the laboratory is still the key to improvement of laboratory quality.
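
    The percentages reported in the results can be reproduced from the raw counts. The short sketch below uses only the counts given in the description; the helper name `share` and the rounding to two decimals are choices made for this illustration.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)

specimens = 935_896
mistakes = {"preanalytical": 1_048, "analytical": 54, "postanalytical": 138}

total_mistakes = sum(mistakes.values())
distribution = {phase: share(count, total_mistakes) for phase, count in mistakes.items()}
preanalytical_rate = share(mistakes["preanalytical"], specimens)
care_unit_share = share(998, mistakes["preanalytical"])

print(total_mistakes, distribution, preanalytical_rate, care_unit_share)
```

    The three phase counts sum to the 1,240 mistakes stated in the text, so the reported distribution is internally consistent.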
    
  6. Classes of errors in DOI names: output dataset

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    • +1 more
    Updated Jun 8, 2021
    Cite
    Boente, Ricarda; Massari, Arcangelo; Santini, Cristian; Tural, Deniz (2021). Classes of errors in DOI names: output dataset [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_4733646
    Dataset updated
    Jun 8, 2021
    Authors
    Boente, Ricarda; Massari, Arcangelo; Santini, Cristian; Tural, Deniz
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains a seven-column CSV file. The first column ("Valid_citing_DOI") contains the DOI of a citing entity retrieved in Crossref. The second column ("Invalid_cited_DOI") contains the invalid DOI of a cited entity, identified by looking at the field "reference" in the JSON document returned by querying the Crossref API with the citing DOI. The third column ("Valid_DOI") contains the corrected DOI if it has been identified, and an empty string otherwise. Finally, the last four columns ("Already_valid", "Prefix_error", "Suffix_error", "Other-type_error") contain a 1 if the error in the DOI was related to that class, and 0 otherwise.

    The citations to invalid DOIs have been retrieved from "Citations to invalid DOI-identified entities obtained from processing DOI-to-DOI citations to add in COCI" (Peroni, 2021), while the valid DOI names and the related classes of errors are the result of a process described in "Cleaning different types of DOI errors found in cited references on Crossref using automated methods", by the authors of this dataset.
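
    A reader working with the seven-column layout described above might tally the error classes along these lines. This is a sketch under assumptions: it presumes the CSV has a header row with exactly the column names quoted in the description, and the DOIs in the sample are invented placeholders, not rows from the dataset.

```python
import csv
import io
from collections import Counter

# The four indicator columns named in the dataset description.
ERROR_COLUMNS = ["Already_valid", "Prefix_error", "Suffix_error", "Other-type_error"]

def tally_error_classes(csv_file):
    """Sum the 0/1 indicator columns and count rows with a corrected DOI."""
    counts = Counter({name: 0 for name in ERROR_COLUMNS})
    corrected = 0
    for row in csv.DictReader(csv_file):
        for name in ERROR_COLUMNS:
            counts[name] += int(row[name])
        if row["Valid_DOI"]:  # empty string means no correction was found
            corrected += 1
    return counts, corrected

# Hypothetical rows for illustration only.
SAMPLE = (
    "Valid_citing_DOI,Invalid_cited_DOI,Valid_DOI,"
    "Already_valid,Prefix_error,Suffix_error,Other-type_error\n"
    "10.1234/citing.1,10.1234/cited.1.,10.1234/cited.1,0,0,1,0\n"
    "10.1234/citing.2,doi:10.1234/cited.2,10.1234/cited.2,0,1,0,0\n"
    "10.1234/citing.3,10.1234/unknown,,0,0,0,0\n"
)

counts, corrected = tally_error_classes(io.StringIO(SAMPLE))
```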

  7. Type 1 error in ERF data analysis.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Yakov A. Tsepilov; Janina S. Ried; Konstantin Strauch; Harald Grallert; Cornelia M. van Duijn; Tatiana I. Axenovich; Yurii S. Aulchenko (2023). Type 1 error in ERF data analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0081431.t004
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Yakov A. Tsepilov; Janina S. Ried; Konstantin Strauch; Harald Grallert; Cornelia M. van Duijn; Tatiana I. Axenovich; Yurii S. Aulchenko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    • for VIFGC corrected genotypic (2df) tests, we used the 1df based test by performing VIFGC-corrected tests for recessive and dominant models [3]. The abbreviations are as in Table 1. The values are given for all SNPs as well as for stratified frequency groups.
  8. Research data supporting "Dynamically Diagnosing Type Errors in Unsafe Code"...

    • repository.cam.ac.uk
    bin, txt
    Updated Jan 11, 2017
    Cite
    Kell, SR (2017). Research data supporting "Dynamically Diagnosing Type Errors in Unsafe Code" [Dataset]. http://doi.org/10.17863/CAM.7021
    Available download formats: txt (1130 bytes), bin (827094656 bytes)
    Dataset updated
    Jan 11, 2017
    Dataset provided by
    University of Cambridge
    Apollo
    Authors
    Kell, SR
    Description

    Buildable source code of the libcrunch system and related codebases, within a Debian 8.5 (Jessie) distribution, packaged as a VirtualBox virtual machine image.

  9. Supplementary materials including additional simulation data and R code

    • scidb.cn
    Updated Aug 22, 2024
    Cite
    Hongmei LIN; Yuanyuan TANG; Xiaorui WANG; Jianming ZHU; Yanlin TANG; Tiejun TONG (2024). Supplementary materials including additional simulation data and R code [Dataset]. http://doi.org/10.57760/sciencedb.j00207.00014
    Dataset updated
    Aug 22, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Hongmei LIN; Yuanyuan TANG; Xiaorui WANG; Jianming ZHU; Yanlin TANG; Tiejun TONG
    License

    https://api.github.com/licenses/cc0-1.0

    Description

    The supplementary materials present simulation data, including the biases and mean squared errors of the estimators, the type I error rate and power curves of the rank score test, and the estimated mean lengths and empirical coverage probabilities of the confidence intervals in various cases. The R code is included.

  10. Population and Family Health Survey 2002 - Jordan

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Jun 6, 2017
    Cite
    Department of Statistics (DOS) (2017). Population and Family Health Survey 2002 - Jordan [Dataset]. https://microdata.worldbank.org/index.php/catalog/1409
    Dataset updated
    Jun 6, 2017
    Dataset authored and provided by
    Department of Statistics (DOS)
    Time period covered
    2002
    Area covered
    Jordan
    Description

    Abstract

    The JPFHS is part of the worldwide Demographic and Health Surveys Program, which is designed to collect data on fertility, family planning, and maternal and child health. The primary objective of the Jordan Population and Family Health Survey (JPFHS) is to provide reliable estimates of demographic parameters, such as fertility, mortality, family planning, fertility preferences, as well as maternal and child health and nutrition that can be used by program managers and policy makers to evaluate and improve existing programs. In addition, the JPFHS data will be useful to researchers and scholars interested in analyzing demographic trends in Jordan, as well as those conducting comparative, regional or cross-national studies.

    The content of the 2002 JPFHS was significantly expanded from the 1997 survey to include additional questions on women’s status, reproductive health, and family planning. In addition, all women age 15-49 and children less than five years of age were tested for anemia.

    Geographic coverage

    National

    Analysis unit

    • Household
    • Children under five years
    • Women age 15-49
    • Men

    Kind of data

    Sample survey data

    Sampling procedure

    The estimates from a sample survey are affected by two types of errors: 1) nonsampling errors and 2) sampling errors. Nonsampling errors are the result of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2002 JPFHS to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.

    Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2002 JPFHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.

    A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
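
    The relationship between variance, standard error, and the "plus or minus two standard errors" range described above can be sketched in a few lines. The numbers in the example are hypothetical and are not taken from the survey; the function names are illustrative.

```python
import math

def standard_error(variance):
    """Standard error as the square root of the variance of the estimate."""
    return math.sqrt(variance)

def confidence_interval(estimate, se, multiplier=2.0):
    """Approximate 95 percent interval: estimate plus or minus two standard errors."""
    return estimate - multiplier * se, estimate + multiplier * se

# Hypothetical proportion of 0.40 with a variance of 0.000225.
se = standard_error(0.000225)
low, high = confidence_interval(0.40, se)
print(f"SE={se:.3f}, interval=({low:.2f}, {high:.2f})")
```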

    If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2002 JPFHS sample is the result of a multistage stratified design and, consequently, it was necessary to use more complex formulas. The computer software used to calculate sampling errors for the 2002 JPFHS is the ISSA Sampling Error Module (ISSAS). This module used the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
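
    To give a feel for the Jackknife repeated replication idea mentioned above, the sketch below drops one cluster at a time, recomputes the statistic, and combines the resulting pseudo-values into a standard error. It is one common delete-one-cluster formulation, written here as an assumption; it is not the ISSA Sampling Error Module, and it ignores sampling weights and stratification. The cluster values are invented.

```python
import math

def jackknife_se(clusters, estimator):
    """Delete-one-cluster jackknife standard error of estimator(all observations)."""
    k = len(clusters)
    if k < 2:
        raise ValueError("need at least two clusters")
    full_sample = [x for cluster in clusters for x in cluster]
    r = estimator(full_sample)
    total = 0.0
    for i in range(k):
        # Recompute the statistic with cluster i removed.
        reduced = [x for j, cluster in enumerate(clusters) if j != i for x in cluster]
        pseudo_value = k * r - (k - 1) * estimator(reduced)
        total += (pseudo_value - r) ** 2
    return math.sqrt(total / (k * (k - 1)))

def mean(values):
    return sum(values) / len(values)

# Hypothetical clusters of observations.
example_se = jackknife_se([[1.0], [2.0], [3.0], [4.0]], mean)
```

    For the mean of single-observation clusters, this reduces to the familiar standard error of the mean, which is a convenient sanity check on the formula.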

    Note: See detailed description of sample design in APPENDIX B of the survey report.

    Mode of data collection

    Face-to-face

    Research instrument

    The 2002 JPFHS used two questionnaires – namely, the Household Questionnaire and the Individual Questionnaire. Both questionnaires were developed in English and translated into Arabic. The Household Questionnaire was used to list all usual members of the sampled households and to obtain information on each member’s age, sex, educational attainment, relationship to the head of household, and marital status. In addition, questions were included on the socioeconomic characteristics of the household, such as source of water, sanitation facilities, and the availability of durable goods. The Household Questionnaire was also used to identify women who are eligible for the individual interview: ever-married women age 15-49. In addition, all women age 15-49 and children under five years living in the household were measured to determine nutritional status and tested for anemia.

    The household and women’s questionnaires were based on the DHS Model “A” Questionnaire, which is designed for use in countries with high contraceptive prevalence. Additions and modifications to the model questionnaire were made in order to provide detailed information specific to Jordan, using experience gained from the 1990 and 1997 Jordan Population and Family Health Surveys. For each ever-married woman age 15 to 49, information on the following topics was collected:

    1. Respondent’s background
    2. Birth history
    3. Knowledge and practice of family planning
    4. Maternal care, breastfeeding, immunization, and health of children under five years of age
    5. Marriage
    6. Fertility preferences
    7. Husband’s background and respondent’s employment
    8. Knowledge of AIDS and STIs

    In addition, information on births and pregnancies, contraceptive use and discontinuation, and marriage during the five years prior to the survey was collected using a monthly calendar.

    Cleaning operations

    Fieldwork and data processing activities overlapped. After a week of data collection, and after field editing of questionnaires for completeness and consistency, the questionnaires for each cluster were packaged together and sent to the central office in Amman where they were registered and stored. Special teams were formed to carry out office editing and coding of the open-ended questions.

    Data entry and verification started after one week of office data processing. The process of data entry, including one hundred percent re-entry, editing and cleaning, was done by using PCs and the CSPro (Census and Survey Processing) computer package, developed specially for such surveys. The CSPro program allows data to be edited while being entered. Data processing operations were completed by the end of October 2002. A data processing specialist from ORC Macro made a trip to Jordan in October and November 2002 to follow up data editing and cleaning and to work on the tabulation of results for the survey preliminary report. The tabulations for the present final report were completed in December 2002.

    Response rate

    A total of 7,968 households were selected for the survey from the sampling frame; among those selected households, 7,907 households were found. Of those households, 7,825 (99 percent) were successfully interviewed. In those households, 6,151 eligible women were identified, and complete interviews were obtained with 6,006 of them (98 percent of all eligible women). The overall response rate was 97 percent.
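
    The response rates quoted above can be checked from the counts. The sketch assumes that the overall rate is the product of the household and individual response rates; the source reports the result but does not state the formula, so that step is an assumption.

```python
def rate(numerator, denominator):
    """Response rate as a percentage."""
    return 100.0 * numerator / denominator

household_rate = rate(7_825, 7_907)   # interviewed households / households found
women_rate = rate(6_006, 6_151)       # completed interviews / eligible women
overall_rate = household_rate * women_rate / 100.0  # assumed: product of the two rates

print(round(household_rate), round(women_rate), round(overall_rate))
```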

    Note: See summarized response rates by place of residence in Table 1.1 of the survey report.


  11. Data from: Positional errors in species distribution modelling are not...

    • search.dataone.org
    • datadryad.org
    Updated May 8, 2025
    Cite
    Lukáš Gábor; Walter Jetz; Muyang Lu; Duccio Rocchini; Anna Cord; Marco Malavasi; Alejandra Zarzo-Arias; Vojtěch Barták; Vítězslav Moudrý (2025). Positional errors in species distribution modelling are not overcome by the coarser grains of analysis [Dataset]. http://doi.org/10.5061/dryad.79cnp5hx3
    Dataset updated
    May 8, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Lukáš Gábor; Walter Jetz; Muyang Lu; Duccio Rocchini; Anna Cord; Marco Malavasi; Alejandra Zarzo-Arias; Vojtěch Barták; Vítězslav Moudrý
    Time period covered
    Jan 1, 2022
    Description

    The performance of species distribution models is known to be affected by the analysis grain and the positional error of species occurrences. Coarsening of the spatial analysis grain has been suggested to compensate for positional errors. Nevertheless, this way of dealing with positional errors has never been thoroughly tested. With the increasing use of fine-scale environmental data in predictive models developed for conservation and climate change studies, it is increasingly important to test this assumption. Species distribution models using fine-scale environmental data are more likely to be negatively affected by positional error, as inaccurate species occurrences might more easily end up in an unsuitable environment, which can result in inappropriate conservation actions. Here, we examine the trade-offs between positional error and analysis grain and provide recommendations for best practice. We generated virtual species using tree canopy height, topography wetness index, and altitude deriv...

  12. Miscellaneous Tables (Standard Errors and P Values) - 6.1 to 6.107

    • catalog.data.gov
    • data.virginia.gov
    • +1 more
    Updated Sep 6, 2025
    Cite
    Substance Abuse and Mental Health Services Administration (2025). Miscellaneous Tables (Standard Errors and P Values) - 6.1 to 6.107 [Dataset]. https://catalog.data.gov/dataset/miscellaneous-tables-standard-errors-and-p-values-6-1-to-6-107
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    Substance Abuse and Mental Health Services Administration: https://www.samhsa.gov/
    Description

    These detailed tables present the standard errors for the totals and prevalence estimates of the number of days and types of substance use, poly-drug use, nicotine dependence, substance dependence by age of first use, source of substances, social context of substance use, and drunk/drugged driving from the 2010 National Survey on Drug Use and Health (NSDUH). Substances examined include illicit drugs, marijuana, cocaine, heroin, hallucinogens, inhalants, and the nonmedical use of prescription-type pain relievers, tranquilizers, stimulants, and sedatives, and alcohol. Standard errors are provided for totals and prevalence estimates of lifetime, past year, and past month use by age group, gender, race/ethnicity, education level, employment status, geographic area, pregnancy status, college enrollment status, and probation/parole status. Comparisons are made between 2010 and 2009.

  13. Demographic and Health Survey 1996-1997 - Bangladesh

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1more
    Updated May 26, 2017
    + more versions
    Cite
    Mitra & Associates/ NIPORT (2017). Demographic and Health Survey 1996-1997 - Bangladesh [Dataset]. https://microdata.worldbank.org/index.php/catalog/1335
    Explore at:
    Dataset updated
    May 26, 2017
    Dataset provided by
    National Institute of Population Research and Training (http://niport.gov.bd/)
    Authors
    Mitra & Associates/ NIPORT
    Time period covered
    1996 - 1997
    Area covered
    Bangladesh
    Description

    Abstract

    The Bangladesh Demographic and Health Survey (BDHS) is part of the worldwide Demographic and Health Surveys program, which is designed to collect data on fertility, family planning, and maternal and child health.

    The BDHS is intended to serve as a source of population and health data for policymakers and the research community. In general, the objectives of the BDHS are to: - assess the overall demographic situation in Bangladesh, - assist in the evaluation of the population and health programs in Bangladesh, and - advance survey methodology.

    More specifically, the objective of the BDHS is to provide up-to-date information on fertility and childhood mortality levels; nuptiality; fertility preferences; awareness, approval, and use of family planning methods; breastfeeding practices; nutrition levels; and maternal and child health. This information is intended to assist policymakers and administrators in evaluating and designing programs and strategies for improving health and family planning services in the country.

    Geographic coverage

    National

    Analysis unit

    • Household
    • Children under five years
    • Women age 10-49
    • Men age 15-59

    Kind of data

    Sample survey data

    Sampling procedure

    Bangladesh is divided into six administrative divisions, 64 districts (zillas), and 490 thanas. In rural areas, thanas are divided into unions and then mauzas, a land administrative unit. Urban areas are divided into wards and then mahallas. The 1996-97 BDHS employed a nationally representative, two-stage sample that was selected from the Integrated Multi-Purpose Master Sample (IMPS) maintained by the Bangladesh Bureau of Statistics. Each division was stratified into three groups: 1) statistical metropolitan areas (SMAs), 2) municipalities (other urban areas), and 3) rural areas. In the rural areas, the primary sampling unit was the mauza, while in urban areas, it was the mahalla. Because the primary sampling units in the IMPS were selected with probability proportional to size from the 1991 Census frame, the units for the BDHS were sub-selected from the IMPS with equal probability so as to retain the overall probability proportional to size. A total of 316 primary sampling units were utilized for the BDHS (30 in SMAs, 42 in municipalities, and 244 in rural areas). In order to highlight changes in survey indicators over time, the 1996-97 BDHS utilized the same sample points (though not necessarily the same households) that were selected for the 1993-94 BDHS, except for 12 additional sample points in the new division of Sylhet. Fieldwork in three sample points was not possible (one in Dhaka Cantonment and two in the Chittagong Hill Tracts), so a total of 313 points were covered.
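
    The reasoning behind the equal-probability sub-selection can be sketched in a few lines. This is only an illustration under simplifying assumptions: the unit sizes, frame total, and sample counts below are hypothetical, and the first-phase probability is approximated as (number of units drawn × unit size / frame total). It shows that multiplying a size-proportional probability by a constant sub-selection rate leaves the overall probability proportional to size; it also re-adds the sample-point counts quoted above.

```python
from fractions import Fraction

def overall_probability(size, frame_total, n_master, n_sub):
    """Chance that a unit of the given size ends up in the sub-sample."""
    # Phase 1 (master sample): probability proportional to size (PPS).
    p_master = Fraction(n_master * size, frame_total)
    # Phase 2 (survey sample): equal probability for every master-sample unit.
    p_sub = Fraction(n_sub, n_master)
    return p_master * p_sub

# Hypothetical unit sizes (households); the frame total (70,000) and the
# sample counts (50 master units, 20 sub-selected) are made up as well.
sizes = {"A": 100, "B": 200, "C": 400}
probs = {name: overall_probability(size, 70_000, 50, 20)
         for name, size in sizes.items()}

# Sample points reported in the text: SMAs + municipalities + rural areas.
points_selected = 30 + 42 + 244
points_covered = points_selected - 3  # three points could not be visited
```

    In this sketch a unit twice as large is twice as likely to be retained, which is the property the sampling description relies on.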

    Since one objective of the BDHS is to provide separate estimates for each division as well as for urban and rural areas separately, it was necessary to increase the sampling rate for Barisal and Sylhet Divisions and for municipalities relative to the other divisions, SMAs and rural areas. Thus, the BDHS sample is not self-weighting and weighting factors have been applied to the data in this report.
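
    To see why weighting matters when some groups are oversampled, consider the following minimal sketch. The respondents, outcomes, and sampling rates are hypothetical and are not taken from the BDHS; the only assumption is the common convention that a design weight is the inverse of the sampling rate.

```python
def weighted_proportion(outcomes, weights):
    """Weighted share of respondents whose outcome equals 1."""
    return sum(o * w for o, w in zip(outcomes, weights)) / sum(weights)

# Hypothetical respondents: (outcome, sampling rate of the respondent's stratum).
respondents = [
    (1, 1 / 50), (1, 1 / 50), (1, 1 / 50), (0, 1 / 50),  # oversampled stratum
    (0, 1 / 200), (0, 1 / 200),                          # lower sampling rate
]
outcomes = [o for o, _ in respondents]
# Design weight = inverse of the sampling rate (an illustrative convention).
weights = [1 / rate for _, rate in respondents]

unweighted = sum(outcomes) / len(outcomes)          # treats everyone alike
weighted = weighted_proportion(outcomes, weights)   # corrects for oversampling
```

    Here the unweighted share is 0.5 while the weighted share is about 0.25, because each respondent from the lightly sampled stratum stands for four times as many people.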

    Mitra and Associates conducted a household listing operation in all the sample points from 15 September to 15 December 1996. A systematic sample of 9,099 households was then selected from these lists. Every second household was selected for the men's survey, meaning that, in addition to interviewing all ever-married women age 10-49, interviewers also interviewed all currently married men age 15-59. It was expected that the sample would yield interviews with approximately 10,000 ever-married women age 10-49 and 3,000 currently married men age 15-59.

    Note: See details in Appendix A of the survey report.

    Mode of data collection

    Face-to-face

    Research instrument

    Four types of questionnaires were used for the BDHS: a Household Questionnaire, a Women's Questionnaire, a Men's Questionnaire and a Community Questionnaire. The contents of these questionnaires were based on the DHS Model A Questionnaire, which is designed for use in countries with relatively high levels of contraceptive use. These model questionnaires were adapted for use in Bangladesh during a series of meetings with a small Technical Task Force that consisted of representatives from NIPORT, Mitra and Associates, USAID/Bangladesh, the International Centre for Diarrhoeal Disease Research, Bangladesh (ICDDR,B), Population Council/Dhaka, and Macro International Inc. (see Appendix D for a list of members). Draft questionnaires were then circulated to other interested groups and were reviewed by the BDHS Technical Review Committee (see Appendix D for a list of members). The questionnaires were developed in English and then translated into and printed in Bangla (see Appendix E for the final version in English).

    The Household Questionnaire was used to list all the usual members and visitors in the selected households. Some basic information was collected on the characteristics of each person listed, including his/her age, sex, education, and relationship to the head of the household. The main purpose of the Household Questionnaire was to identify women and men who were eligible for the individual interview. In addition, information was collected about the dwelling itself, such as the source of water, type of toilet facilities, materials used to construct the house, and ownership of various consumer goods.

    The Women's Questionnaire was used to collect information from ever-married women age 10-49. These women were asked questions on the following topics: - Background characteristics (age, education, religion, etc.), - Reproductive history, - Knowledge and use of family planning methods, - Antenatal and delivery care, - Breastfeeding and weaning practices, - Vaccinations and health of children under age five, - Marriage, - Fertility preferences, - Husband's background and respondent's work, - Knowledge of AIDS, - Height and weight of children under age five and their mothers.

    The Men's Questionnaire was used to interview currently married men age 15-59. It was similar to that for women except that it omitted the sections on reproductive history, antenatal and delivery care, breastfeeding, vaccinations, and height and weight. The Community Questionnaire was completed for each sample point and included questions about the existence in the community of income-generating activities and other development organizations and the availability of health and family planning services.

    Response rate

    A total of 9,099 households were selected for the sample, of which 8,682 were successfully interviewed. The shortfall is primarily due to dwellings that were vacant or in which the inhabitants had left for an extended period at the time they were visited by the interviewing teams. Of the 8,762 households occupied, 99 percent were successfully interviewed. In these households, 9,335 women were identified as eligible for the individual interview (i.e., ever-married and age 10-49) and interviews were completed for 9,127 or 98 percent of them. In the half of the households that were selected for inclusion in the men's survey, 3,611 eligible ever-married men age 15-59 were identified, of whom 3,346 or 93 percent were interviewed.
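
    The percentages quoted above can be re-derived from the counts in the same paragraph. The helper name below is illustrative; the counts are those given in the text.

```python
def response_rate(completed, eligible):
    """Completed interviews as a percentage of eligible units."""
    return 100 * completed / eligible

rates = {
    "households": response_rate(8_682, 8_762),  # interviewed / occupied
    "women": response_rate(9_127, 9_335),       # interviewed / eligible women
    "men": response_rate(3_346, 3_611),         # interviewed / eligible men
}
rounded = {group: round(rate) for group, rate in rates.items()}
```

    Rounded to whole percentages, the results agree with the 99, 98, and 93 percent reported for households, women, and men.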

    The principal reason for non-response among eligible women and men was the failure to find them at home despite repeated visits to the household. The refusal rate was low.

    Note: See summarized response rates by residence (urban/rural) in Table 1.1 of the survey report.

    Sampling error estimates

    The estimates from a sample survey are affected by two types of errors: (1) non-sampling errors, and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the BDHS to minimize this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.

    Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the BDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.

    A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
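
    The rule of thumb above (estimate plus or minus two standard errors) can be written out as a short sketch. The estimate and variance used here are hypothetical values chosen for illustration, not figures from the BDHS.

```python
import math

def standard_error(variance):
    """Standard error is the square root of the variance of the estimate."""
    return math.sqrt(variance)

def confidence_interval(estimate, se, multiplier=2.0):
    """Approximate 95 percent interval: estimate +/- two standard errors."""
    return (estimate - multiplier * se, estimate + multiplier * se)

# Hypothetical example: an estimated proportion of 0.50 with variance 0.0004.
se = standard_error(0.0004)
low, high = confidence_interval(0.50, se)
```

    With these inputs the standard error is 0.02 and the interval runs from roughly 0.46 to 0.54.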

    If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the BDHS sample is the result of a two-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the BDHS is the ISSA Sampling Error Module. This module used the Taylor

  14. Household Survey on Information and Communications Technology, 2014 - West...

    • pcbs.gov.ps
    Updated Jan 28, 2020
    + more versions
    Cite
    Palestinian Central Bureau of Statistics (2020). Household Survey on Information and Communications Technology, 2014 - West Bank and Gaza [Dataset]. https://www.pcbs.gov.ps/PCBS-Metadata-en-v5.2/index.php/catalog/465
    Explore at:
    Dataset updated
    Jan 28, 2020
    Dataset provided by
    Palestinian Central Bureau of Statistics (https://pcbs.gov/)
    Authors
    Palestinian Central Bureau of Statistics
    Time period covered
    2014
    Area covered
    Gaza Strip, West Bank, Gaza
    Description

    Abstract

    As part of PCBS' efforts to provide official Palestinian statistics on different aspects of life in Palestinian society, and because of the wide spread of computers, the Internet and mobile phones among the Palestinian people and the important role they may play in spreading knowledge and culture and in shaping public opinion, PCBS conducted the Household Survey on Information and Communications Technology, 2014.

    The main objective of this survey is to provide statistical data on information and communication technology in Palestine, in addition to data on the following:

    • Prevalence of computers and access to the Internet.
    • Penetration and purpose of technology use.

    Geographic coverage

    Palestine (West Bank and Gaza Strip), type of locality (urban, rural, refugee camps) and governorate

    Analysis unit

    Household; persons aged 10 years and over.

    Universe

    All Palestinian households and individuals whose usual place of residence is in Palestine, with a focus on persons aged 10 years and over in 2014.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Sampling Frame: The sampling frame consists of a list of enumeration areas adopted in the Population, Housing and Establishments Census of 2007. Each enumeration area has an average size of about 124 households. These were used in the first phase as Preliminary Sampling Units in the process of selecting the survey sample.

    Sample Size: The total sample size of the survey was 7,268 households, of which 6,000 responded.

    Sample Design: The sample is a stratified clustered systematic random sample. The design comprised three phases:

    • Phase I: Random sample of 240 enumeration areas.
    • Phase II: Selection of 25 households from each enumeration area selected in phase one using systematic random selection.
    • Phase III: Selection of an individual (10 years or more) in the field from the selected households; Kish tables were used to ensure indiscriminate selection.
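
    Systematic random selection of households within an enumeration area, as in the second phase above, can be sketched as follows. This is one common variant (a random start followed by a fixed fractional interval) and is an assumption; the survey documentation does not state the exact procedure used. The household list is a stand-in of 124 numbered households, matching the average enumeration-area size quoted earlier.

```python
import random

def systematic_sample(units, n, rng):
    """Pick n units at a fixed interval after a random start."""
    total = len(units)
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the number of units")
    step = total / n               # sampling interval (may be fractional)
    start = rng.random() * step    # random start within the first interval
    # Guard the index so floating-point rounding can never run past the list.
    return [units[min(int(start + i * step), total - 1)] for i in range(n)]

# Stand-in listing of 124 households in one enumeration area.
households = list(range(1, 125))
sample = systematic_sample(households, 25, random.Random(0))
```

    Because the interval is at least one household wide, the 25 selected households are distinct and spread evenly through the listing.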

    Sample Strata: The sample was stratified by: 1) governorate (16 governorates, J1); 2) type of locality (urban, rural and camps).

    Sampling deviation

    -

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The survey questionnaire consists of identification data, quality controls and three main sections: Section I: Data on household members that include identification fields, the characteristics of household members (demographic and social) such as the relationship of individuals to the head of household, sex, date of birth and age.

    Section II: Household data include information regarding computer processing, access to the Internet, and possession of various media and computer equipment. This section includes information on topics related to the use of computer and Internet, as well as supervision by households of their children (5-17 years old) while using the computer and Internet, and protective measures taken by the household in the home.

    Section III: Data on persons (aged 10 years and over) about computer use, access to the Internet and possession of a mobile phone.

    Cleaning operations

    Preparation of Data Entry Program: This stage included preparation of the data entry programs using an ACCESS package and defining data entry control rules to avoid errors, plus validation inquiries to examine the data after it had been captured electronically.

    Data Entry: The data entry process started on 8 May 2014 and ended on 23 June 2014. The data entry took place at the main PCBS office and in field offices using 28 data clerks.

    Editing and Cleaning procedures: Several measures were taken to avoid non-sampling errors. These included editing of questionnaires before data entry to check field errors, using a data entry application that does not allow mistakes during the process of data entry, and then examining the data by using frequency and cross tables. This ensured that data were error free; cleaning and inspection of the anomalous values were conducted to ensure harmony between the different questions on the questionnaire.

    Response rate

    Response rate: 79%

    Sampling error estimates

    There are many aspects of the concept of data quality; this includes the initial planning of the survey to the dissemination of the results and how well users understand and use the data. There are three components to the quality of statistics: accuracy, comparability, and quality control procedures.

    Checks on data accuracy cover many aspects of the survey and include statistical errors due to the use of a sample, non-statistical errors resulting from field workers or survey tools, and response rates and their effect on estimations. This section includes:

    Statistical Errors: Data of this survey may be affected by statistical errors due to the use of a sample and not a complete enumeration. Therefore, certain differences can be expected in comparison with the real values obtained through censuses. Variances were calculated for the most important indicators.

    Variance calculations revealed that there is no problem in disseminating results nationally or regionally (the West Bank, Gaza Strip), but some indicators show high variance by governorate, as noted in the tables of the main report.

    Non-Statistical Errors: Non-statistical errors are possible at all stages of the project, during data collection or processing. These are referred to as non-response errors, response errors, interviewing errors and data entry errors. To avoid errors and reduce their effects, strenuous efforts were made to train the field workers intensively. They were trained on how to carry out the interview, what to discuss and what to avoid, and the training course included both practical and theoretical training. Training manuals were provided for each section of the questionnaire, along with practical exercises in class and instructions on how to approach respondents to reduce refusals. Data entry staff were trained on the data entry program, which was tested before starting the data entry process.

    Several measures were taken to avoid non-sampling errors. These included editing of questionnaires before data entry to check field errors, using a data entry application that does not allow mistakes during the process of data entry, and then examining the data by using frequency and cross tables. This ensured that data were error free; cleaning and inspection of the anomalous values were conducted to ensure harmony between the different questions on the questionnaire.

    The sources of non-statistical errors can be summarized as: 1. Some households were not at home and could not be interviewed, and some households refused to be interviewed. 2. In rare cases, errors occurred because of the way interviewers asked the questions, and respondents misunderstood some of the questions.

  15. Data from: Numerical experiments to "error analysis of multirate...

    • service.tib.eu
    • radar.kit.edu
    • +1more
    Updated Aug 4, 2023
    Cite
    (2023). Numerical experiments to "error analysis of multirate leapfrog-type methods for second-order semilinear odes" [Dataset]. https://service.tib.eu/ldmservice/dataset/rdr-doi-10-35097-1512
    Explore at:
    Dataset updated
    Aug 4, 2023
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Abstract: This code was used for the numerical experiments in the preprint (CRC Preprint 2021/26; URL: https://www.waves.kit.edu/downloads/CRC1173_Preprint_2021-26.pdf) and in the paper "Error analysis of multirate leapfrog-type methods for second-order semilinear ODEs" by C. Carle and M. Hochbruck.

    TechnicalRemarks: The scripts inside the subfolders are intended to reproduce the figures from the preprint "Error analysis of multirate leapfrog-type methods for second-order semilinear ODEs" by Constantin Carle and Marlis Hochbruck.

    Requirements: The codes are tested with Ubuntu 20.04.2 LTS and Python 3.8.5 and the following versions of its modules: numpy - 1.17.4

  16. Laptop Price Dataset Cleaned

    • kaggle.com
    zip
    Updated Jun 20, 2023
    + more versions
    Cite
    Pragati Kumari (2023). Laptop Price Dataset Cleaned [Dataset]. https://www.kaggle.com/datasets/pragatikumari928/cleaned-laptop-price-dataset/data
    Explore at:
    Available download formats: zip (42015 bytes)
    Dataset updated
    Jun 20, 2023
    Authors
    Pragati Kumari
    Description

    The Cleaned Laptop Price Dataset is a meticulously curated and refined collection of laptop price information, ensuring data accuracy and consistency. This dataset serves as a valuable resource for analyzing and understanding the pricing trends and specifications of various laptop models.

    Each entry in the dataset is associated with a unique index column, allowing for easy referencing and identification. The dataset provides comprehensive details about the laptops, including the manufacturer or company behind each product, the specific laptop model or type, and the screen size in inches.

    In terms of display quality, the dataset includes information about the resolution width and height, indicating the level of visual clarity and detail that can be expected from each laptop. Furthermore, the dataset indicates whether the laptops feature an IPS panel, which is known for its superior viewing angles and color reproduction.

    For users seeking a more interactive experience, the dataset highlights laptops with touchscreen functionality, allowing for intuitive navigation and control.

    The dataset also offers insights into the processing power of the laptops. It includes details about the CPU brand, CPU name or model, and CPU speed. These specifications provide a glimpse into the computational capabilities and performance potential of each laptop.

    Memory-related information is also available, with the dataset covering the RAM capacity and memory type (e.g., DDR3, DDR4). Furthermore, it provides details about primary and secondary storage capacities, enabling users to understand the available space for storing files, applications, and data.

    Graphics capabilities are a crucial aspect of laptops, and the dataset includes the GPU brand, shedding light on the visual processing prowess of each device.

    To facilitate compatibility and software considerations, the dataset includes information about the operating system installed on each laptop.

    Additional details, such as the weight of the laptops, are also provided, allowing users to evaluate the portability and convenience of different models.

    Lastly, the dataset includes the price of each laptop, enabling users to explore and compare the pricing landscape for informed purchasing decisions.

    Overall, the Cleaned Laptop Price Dataset provides a comprehensive and reliable collection of laptop information, offering valuable insights into pricing trends, specifications, and features. It serves as a valuable resource for market analysis, product comparisons, and decision-making processes related to laptops and consumer electronics.

    The Cleaned Laptop Price Dataset contains various columns that provide comprehensive information about laptop prices and specifications. Here's a brief overview of each column:

    • index_col: The index column representing a unique identifier for each laptop entry.
    • Company: The brand or company manufacturing the laptop.
    • TypeName: The laptop model or type.
    • Inches: The size of the laptop screen in inches.
    • resolution_width: The width resolution of the laptop display.
    • resolution_height: The height resolution of the laptop display.
    • ips_panel: Indicates whether the laptop has an IPS panel for better viewing angles and color reproduction.
    • touchscreen: Indicates whether the laptop has a touchscreen feature.
    • cpu_brand: The brand of the laptop's CPU (Central Processing Unit).
    • cpu_name: The name or model of the laptop's CPU.
    • cpu_speed: The clock speed of the laptop's CPU.
    • Ram: The amount of random-access memory (RAM) available in the laptop.
    • memory_type: The type of memory used in the laptop (e.g., DDR3, DDR4).
    • primary_storage: The primary storage capacity of the laptop, typically referring to the internal hard drive or solid-state drive.
    • secondary_storage: The secondary storage capacity of the laptop, if applicable.
    • gpu_brand: The brand of the laptop's GPU (Graphics Processing Unit).
    • OpSys: The operating system installed on the laptop.
    • Weight: The weight of the laptop.
    • price: The price of the laptop.
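
    As a rough illustration of how the columns listed above might be used, the sketch below averages `price` per `Company` with Python's standard `csv` module. The three rows are made-up placeholders (including the brand names and prices), only a subset of the columns is shown, and the assumption that `price` parses as a plain number is not confirmed by the dataset description.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder rows using a subset of the documented column names.
SAMPLE_CSV = """index_col,Company,TypeName,Inches,resolution_width,resolution_height,price
0,BrandA,Notebook,15.6,1920,1080,500.0
1,BrandA,Ultrabook,13.3,2560,1600,1500.0
2,BrandB,Notebook,14.0,1366,768,400.0
"""

def mean_price_by_company(csv_text):
    """Average the price column for each value of the Company column."""
    totals = defaultdict(lambda: [0.0, 0])  # company -> [price sum, row count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["Company"]]
        entry[0] += float(row["price"])
        entry[1] += 1
    return {company: total / count for company, (total, count) in totals.items()}

mean_prices = mean_price_by_company(SAMPLE_CSV)
```

    With the real file, the same function could be given the file's text instead of the placeholder string.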

    The dataset provides valuable insights into laptop pricing trends, specifications, and features, allowing for analysis and comparison of different laptops based on these attributes.

  17. Artifact: Privacy-Respecting Type Error Telemetry at Scale

    • zenodo.org
    • doi.org
    application/gzip
    Updated Dec 9, 2023
    Cite
    Ben Greenman; Alan Jeffrey; Shriram Krishnamurthi; Mitesh Shah (2023). Artifact: Privacy-Respecting Type Error Telemetry at Scale [Dataset]. http://doi.org/10.5281/zenodo.10313778
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Dec 9, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ben Greenman; Alan Jeffrey; Shriram Krishnamurthi; Mitesh Shah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This artifact packages the data for the paper: Privacy-Respecting Type Error Telemetry at Scale

    There are two files on Zenodo:

    • data.tar.gz has the original Luau telemetry data
    • artifact.tar.gz has a result PDF, intermediate data, and scripts for processing the data
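
    A downloaded `.tar.gz` file such as the two listed above can be inspected with Python's standard `tarfile` module before unpacking. The sketch below is generic: the layout of the real archives is not described here, so the demonstration builds a tiny throwaway archive (with a hypothetical `README.txt` member) and lists its contents.

```python
import io
import tarfile
import tempfile
from pathlib import Path

def list_members(archive_path):
    """Return the names of all members of a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()

# Demonstration on a throwaway archive, not on the real data.tar.gz.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.tar.gz"
    with tarfile.open(demo_path, "w:gz") as archive:
        payload = b"placeholder"
        info = tarfile.TarInfo("README.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    demo_members = list_members(demo_path)
```

    Calling `list_members` with the path to a downloaded copy of `data.tar.gz` would show its contents in the same way.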

    The artifact code and the source for the paper are also on GitHub:

    This artifact is primarily a dataset. It shows how we reached the conclusions in the paper.

    The scripts in this artifact are provided as-is for completeness. They may have bugs. They may not work as advertised.

  18. Expenditure and Consumption Survey, 2004 - West Bank and Gaza

    • catalog.ihsn.org
    Updated Mar 29, 2019
    + more versions
    Cite
    Palestinian Central Bureau of Statistics (2019). Expenditure and Consumption Survey, 2004 - West Bank and Gaza [Dataset]. https://catalog.ihsn.org/index.php/catalog/3085
    Explore at:
    Dataset updated
    Mar 29, 2019
    Dataset authored and provided by
    Palestinian Central Bureau of Statistics (https://pcbs.gov/)
    Time period covered
    2004 - 2005
    Area covered
    Gaza Strip, West Bank, Gaza
    Description

    Abstract

    The basic goal of this survey is to provide the necessary database for formulating national policies at various levels. It represents the contribution of the household sector to the Gross National Product (GNP). Household Surveys help as well in determining the incidence of poverty, and providing weighted data which reflects the relative importance of the consumption items to be employed in determining the benchmark for rates and prices of items and services. Generally, the Household Expenditure and Consumption Survey is a fundamental cornerstone in the process of studying the nutritional status in the Palestinian territory.

    The raw survey data provided by the Statistical Office was cleaned and harmonized by the Economic Research Forum, in the context of a major research project to develop and expand knowledge on equity and inequality in the Arab region. The main focus of the project is to measure the magnitude and direction of change in inequality and to understand the complex contributing social, political and economic forces influencing its levels. However, the measurement and analysis of the magnitude and direction of change in this inequality cannot be consistently carried out without harmonized and comparable micro-level data on income and expenditures. Therefore, one important component of this research project is securing and harmonizing household surveys from as many countries in the region as possible, adhering to international statistics on household living standards distribution. Once the dataset has been compiled, the Economic Research Forum makes it available, subject to confidentiality agreements, to all researchers and institutions concerned with data collection and issues of inequality. Data is a public good, in the interest of the region, and it is consistent with the Economic Research Forum's mandate to make micro data available, aiding regional research on this important topic.

    Geographic coverage

    The survey data covers urban, rural and camp areas in West Bank and Gaza Strip.

    Analysis unit

    1- Household/families. 2- Individuals.

    Universe

    The survey covered all Palestinian households that are usually resident in the Palestinian Territory.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Sample and Frame:

    The sampling frame consists of all enumeration areas which were enumerated in 1997; the enumeration area consists of buildings and housing units and is composed of an average of 120 households. The enumeration areas were used as Primary Sampling Units (PSUs) in the first stage of the sampling selection. The enumeration areas of the master sample were updated in 2003.

    Sample Design:

    The sample is a stratified cluster systematic random sample with two stages:

    • First stage: selection of a systematic random sample of 299 enumeration areas.
    • Second stage: selection of a systematic random sample of 12-18 households from each enumeration area selected in the first stage. A person (18 years and more) was selected from each household in the second stage.

    Sample strata:

    The population was divided by: 1) governorate; 2) type of locality (urban, rural, refugee camps).

    Sample Size:

    The calculated sample size is 3,781 households.

    Target cluster size:

    The target cluster size or "sample-take" is the average number of households to be selected per PSU. In this survey, the sample take is around 12 households.

    Detailed information/formulas on the sampling design are available in the user manual.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The PECS questionnaire consists of two main sections:

    First section: Certain articles / provisions of the form are filled in at the beginning of the month, and the remainder are filled in at the end of the month. The questionnaire includes the following provisions:

    Cover sheet: It contains detailed particulars of the family, date of visit, particulars of the field/office work team, and number/sex of the family members.

    Statement of the family members: Contains social, economic and demographic particulars of the selected family.

    Statement of the long-lasting commodities and income generation activities: Includes a number of basic and indispensable items (e.g., livestock or agricultural land).

    Housing Characteristics: Includes information and data pertaining to the housing conditions, including type of shelter, number of rooms, ownership, rent, water, electricity supply, connection to the sewer system, source of cooking and heating fuel, and remoteness/proximity of the house to education and health facilities.

    Monthly and Annual Income: Data pertaining to the income of the family is collected from different sources at the end of the registration / recording period.

    Second section: The second section of the questionnaire includes a list of 54 consumption and expenditure groups, itemized and serially numbered according to their importance to the family. Each of these groups contains important commodities. In total, the groups contain 667 commodity and service items. Groups 1-21 include food, drink, and cigarettes. Group 22 includes homemade commodities. Groups 23-45 include all items except for food, drink and cigarettes. Groups 50-54 include all of the long-lasting commodities. Data on each of these groups was collected over different intervals of time so as to reflect expenditure over a period of one full year.

    Cleaning operations

    Raw Data

    Both data entry and tabulation were performed using the ACCESS and SPSS software programs. The data entry process was organized in 6 files, corresponding to the main parts of the questionnaire. A data entry template was designed to reflect an exact image of the questionnaire, and included various electronic checks: logical check, range checks, consistency checks and cross-validation. Complete manual inspection was made of results after data entry was performed, and questionnaires containing field-related errors were sent back to the field for corrections.

    Harmonized Data

    • The Statistical Package for Social Science (SPSS) is used to clean and harmonize the datasets.
    • The harmonization process starts with cleaning all raw data files received from the Statistical Office.
    • Cleaned data files are then all merged to produce one data file on the individual level containing all variables subject to harmonization.
    • A country-specific program is generated for each dataset to generate/compute/recode/rename/format/label harmonized variables.
    • A post-harmonization cleaning process is run on the data.
    • Harmonized data is saved on the household as well as the individual level, in SPSS and converted to STATA format.

    Response rate

    The survey sample consists of about 3,781 households interviewed over a twelve-month period between January 2004 and January 2005. There were 3,098 households that completed the interview, of which 2,060 were in the West Bank and 1,038 were in the Gaza Strip. The response rate was 82% in the Palestinian Territory.

    Sampling error estimates

    The calculations of standard errors for the main survey estimations enable the user to identify the accuracy of estimations and the survey reliability. Total errors of the survey can be divided into two kinds: statistical errors, and non-statistical errors. Non-statistical errors are related to the procedures of statistical work at different stages, such as the failure to explain questions in the questionnaire, unwillingness or inability to provide correct responses, bad statistical coverage, etc. These errors depend on the nature of the work, training, supervision, and conducting all various related activities. The work team spared no effort at different stages to minimize non-statistical errors; however, it is difficult to estimate numerically such errors due to absence of technical computation methods based on theoretical principles to tackle them. On the other hand, statistical errors can be measured. Frequently they are measured by the standard error, which is the positive square root of the variance. The variance of this survey has been computed by using the “programming package” CENVAR.
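    The relationship between variance and standard error described above can be illustrated with a minimal Python sketch. It assumes simple random sampling and uses made-up expenditure values; the survey itself relied on CENVAR, which accounts for the complex sample design, so this is an illustration of the definition rather than a reproduction of the published calculation.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Standard error of the sample mean under simple random sampling.

    Assumption: independent, equally weighted observations. A stratified
    cluster design needs design-based variance estimation instead.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    sample_variance = statistics.variance(values)  # unbiased, divides by n - 1
    variance_of_mean = sample_variance / n
    return math.sqrt(variance_of_mean)  # positive square root of the variance


# Hypothetical monthly expenditure values, for illustration only.
expenditure = [410.0, 520.0, 480.0, 390.0, 450.0]
se = standard_error_of_mean(expenditure)
```

    Under a clustered design the true standard error is typically larger than this simple-random-sampling figure, which is why dedicated variance software is used for published estimates.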

  19. Household Energy Survey January 2011 - West Bank and Gaza

    • pcbs.gov.ps
    Updated Aug 31, 2020
    + more versions
    Palestinian Central Bureau of Statistics (2020). Household Energy Survey January 2011 - West Bank and Gaza [Dataset]. https://www.pcbs.gov.ps/PCBS-Metadata-en-v5.2/index.php/catalog/572
    Dataset updated
    Aug 31, 2020
    Dataset authored and provided by
    Palestinian Central Bureau of Statistics (https://pcbs.gov/)
    Time period covered
    2011
    Area covered
    Gaza Strip, West Bank, Gaza
    Description

    Abstract

    The energy statistics program has implemented many rounds of the Household Energy Survey during 1999-2011.

    Because of the importance of the household sector and due to its large contribution to energy consumption in the Palestinian Territory, PCBS decided to conduct a special Household Energy Survey to cover energy indicators in the household sector. To achieve this, a questionnaire was attached to the Labor Force Survey.

    This survey aimed to provide data on energy consumption in the household sector and to provide data on energy consumption behavior and patterns in the society by type of energy.

    The survey presents data on energy indicators pertaining to households in the Palestinian Territory. This includes statistical data on electricity and other fuel consumption by households covering type of fuel for different activities (cooking, baking, heating, lighting, and water heating).

    Geographic coverage

    Geographic Coverage

    Analysis unit

    households

    Universe

    The target population was all Palestinian households living in the Palestinian Territory.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The sample is a two-stage stratified cluster random sample.

    Target Population: The target population was all Palestinian households living in the Palestinian Territory.

    Sampling Frame: The sample of this survey is part of the main sample of the Labor Force Survey (LFS), which PCBS has implemented every quarter since 1995, with each quarterly round distributed over 13 weeks. This survey was attached to the LFS in the first quarter of 2011, and its sample covers six weeks, from the eighth to the thirteenth week, of LFS round 60. The sample was selected in two stages: in the first stage, a systematic random sample of 211 enumeration areas was selected for the half round; in the second stage, a random area sample averaging 16 households was selected from each enumeration area chosen in the first stage.

    Sampling Design: The sample of this survey is a sub-sample of the Labor Force Survey (LFS) sample, which has been conducted periodically since September 1995. The sample of LFS is distributed over 13 weeks. The sample of the survey occupies six weeks of the first quarter of 2011 within implementing LFS.

    Stratification by number of households: In designing the sample of the LFS, three levels of stratification were made, including stratification by place of residence, which comprises (a) urban, (b) rural and (c) refugee camps, and stratification by locality size.

    Sample Unit: In the first stage, the sampling units are the enumeration areas (clusters) from the master sample. In the second stage, the sampling units are households.

    Analysis Unit: The unit of analysis is the household.

    Sample Size: The sample comprises 3,313 Palestinian households in the West Bank and Gaza Strip, distributed according to locality type (urban, rural and refugee camps).
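    The two-stage selection described in this section can be sketched in Python as follows. This is a simplified illustration, not the PCBS procedure: it ignores stratification, draws exactly 16 households per area (the source gives 16 as an average), and uses a made-up frame. The function names and the frame layout are hypothetical.

```python
import random


def systematic_sample(units, n, rng):
    """Select n units by systematic sampling with a random start."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and len(units)")
    step = len(units) / n            # sampling interval (may be fractional)
    start = rng.random() * step      # random start within the first interval
    last = len(units) - 1
    return [units[min(int(start + i * step), last)] for i in range(n)]


def two_stage_sample(frame, n_areas, households_per_area, seed=0):
    """frame maps an enumeration-area id to its list of household ids."""
    rng = random.Random(seed)
    # Stage 1: systematic random sample of enumeration areas.
    selected_areas = systematic_sample(sorted(frame), n_areas, rng)
    # Stage 2: simple random sample of households within each selected area.
    sample = {}
    for area in selected_areas:
        households = frame[area]
        k = min(households_per_area, len(households))
        sample[area] = rng.sample(households, k)
    return sample


# Toy frame: 1,000 enumeration areas of 100 households each (made-up sizes).
frame = {ea: [f"{ea}-{h}" for h in range(100)] for ea in range(1000)}
sample = two_stage_sample(frame, n_areas=211, households_per_area=16)
total_households = sum(len(h) for h in sample.values())
```

    With a fixed 16 households per area the toy sample holds 211 x 16 = 3,376 households, slightly more than the 3,313 reported, which is consistent with 16 being an average rather than a fixed take.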

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The design of the questionnaire for the Household Energy Survey was based on the experiences of similar countries as well as on international standards and recommendations for the most important indicators, taking into account the special situation of the Palestinian Territory.

    Cleaning operations

    The data processing stage consisted of the following operations:

    Editing and coding before data entry: All questionnaires were edited and coded in the office using the same instructions adopted for editing in the field.

    Data entry: At this stage, data was entered into the computer using a data entry template developed in Access. The data entry program was prepared to satisfy a number of requirements, such as:
    • Preventing the duplication of questionnaires during data entry.
    • Applying integrity and consistency checks to entered data.
    • Handling errors in a user-friendly manner.
    • Transferring captured data to another format for data analysis using statistical analysis software such as SPSS.
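    The kinds of checks listed above can be illustrated with a short Python sketch. The actual template was built in Access and its rules are not given in this description, so every field name, range and rule below is a hypothetical stand-in.

```python
def validate_record(record, seen_ids):
    """Return a list of error messages for one questionnaire record."""
    errors = []
    qid = record.get("questionnaire_id")
    if qid is None:
        errors.append("missing questionnaire id")
    elif qid in seen_ids:
        errors.append(f"duplicate questionnaire id: {qid}")  # duplication check
    size = record.get("household_size")
    if not isinstance(size, int) or not 1 <= size <= 30:     # range check
        errors.append("household_size out of range")
    # Consistency check: no electricity source but consumption reported.
    if record.get("has_electricity") is False and record.get("electricity_kwh", 0) > 0:
        errors.append("electricity consumption reported without a source")
    return errors


def enter_records(records):
    """Split records into accepted ones and rejected ones with their errors."""
    accepted, rejected, seen_ids = [], [], set()
    for record in records:
        errors = validate_record(record, seen_ids)
        if errors:
            rejected.append((record, errors))
        else:
            seen_ids.add(record["questionnaire_id"])
            accepted.append(record)
    return accepted, rejected


# Made-up records: one valid, one duplicate, one out of range, one inconsistent.
records = [
    {"questionnaire_id": 1, "household_size": 5, "has_electricity": True, "electricity_kwh": 250},
    {"questionnaire_id": 1, "household_size": 6, "has_electricity": True, "electricity_kwh": 300},
    {"questionnaire_id": 2, "household_size": 0, "has_electricity": True, "electricity_kwh": 100},
    {"questionnaire_id": 3, "household_size": 4, "has_electricity": False, "electricity_kwh": 120},
]
accepted, rejected = enter_records(records)
```

    Returning messages instead of raising exceptions lets a data entry screen show every problem with a record at once, which is one way to read the "user-friendly" requirement.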

    Response rate

    The survey sample consists of 3,313 households, of which 3,029 completed the interview: 1,950 in the West Bank and 1,079 in the Gaza Strip. Weights were adjusted to account for non-response. The response rate in the West Bank reached 95. % while in the Gaza Strip it reached 98%.

    Non-response cases

    No. of cases    Case
    3029            Household completed
    22              Traveling households
    19              Unit does not exist
    56              No one at home
    22              Refused to cooperate
    139             Vacant housing unit
    1               No available information
    25              Other
    3313            Total sample size
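    As a consistency check on the counts above, the following Python sketch tallies the case counts and derives two rates. The source does not state how its regional response rates were computed, so the second rate, which drops non-existent and vacant units from the denominator, is only one plausible convention and is labeled as an assumption.

```python
# Case counts as listed in the non-response table.
dispositions = {
    "Household completed": 3029,
    "Traveling households": 22,
    "Unit does not exist": 19,
    "No one at home": 56,
    "Refused to cooperate": 22,
    "Vacant housing unit": 139,
    "No available information": 1,
    "Other": 25,
}

total = sum(dispositions.values())
completed = dispositions["Household completed"]

# Crude completion rate: completed interviews over the full sample.
completion_rate = completed / total

# Assumption: units that do not exist or are vacant are treated as
# ineligible and removed from the denominator.
ineligible = dispositions["Unit does not exist"] + dispositions["Vacant housing unit"]
eligible = total - ineligible
eligible_response_rate = completed / eligible
```

    The counts sum to the stated total of 3,313. The crude completion rate is about 91.4%, and about 96.0% once non-existent and vacant units are excluded.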

    Sampling error estimates

    This section covers several aspects of the survey, mainly statistical errors due to the sample and non-statistical errors relating to the workers and tools of the survey. It also covers the response rates in the survey and their effect on the assumptions. This section includes:

    Sampling Errors: These errors arise from studying a part of the population rather than all of it. Because this is a sample survey, the data are affected by sampling errors, and the estimates differ from the actual values that could be obtained through a census. For this survey, variance calculations were made for average household consumption and total consumption for the different types of energy in the Palestinian Territory.

    The results of gasoline, wood, charcoal and olive cake suffer from a high variance. This problem should be taken into consideration when dealing with the average household consumption of these types of fuel, keeping in mind that there are no problems in publishing the data at the geographical level (North of the West Bank, Middle of the West Bank, South of the West Bank and Gaza Strip). However, publishing data at the governorate level is not possible due to the high variance, especially for wood, charcoal and olive cake. The variances for the main indicators of this survey are as follows:

    Variable                           Unit    Estimate   Standard Error   C.V %   95% CI Lower   95% CI Upper
    Main Electricity Source            %       99.8       0.1              0.001   99.5           99.9
    Use of Solar Heaters               %       63.7       1.3              0.020   61.2           66.2
    Use of LPG                         %       98.1       0.2              0.003   97.5           98.5
    Average Electricity Consumption    KWh     266        3.44             0.013   259            273
    Average Wood Consumption           Kg      228        18.47            0.081   191            264
    Average Gasoline Consumption       Liter   46         2.19             0.047   41.8           50.5
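    The columns of this table are linked by simple formulas, which the Python sketch below illustrates for average electricity consumption. It assumes a normal approximation with z = 1.96 for the 95% interval and the coefficient of variation expressed as a ratio (standard error divided by estimate); the source does not state the exact method, so small differences from the published bounds are possible.

```python
def cv_and_confidence_interval(estimate, standard_error, z=1.96):
    """Coefficient of variation and normal-approximation confidence interval.

    Assumption: z = 1.96 (95% level) and a symmetric interval around the
    point estimate.
    """
    cv = standard_error / estimate
    half_width = z * standard_error
    return cv, estimate - half_width, estimate + half_width


# Average electricity consumption: estimate 266 KWh, standard error 3.44.
cv, lower, upper = cv_and_confidence_interval(266, 3.44)
```

    Rounded, this gives a coefficient of variation of 0.013 and an interval of 259 to 273 KWh, matching the electricity row of the table.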

    Non-Sampling Errors: These errors are due to non-response cases as well as the implementation of surveys. In this survey, these errors emerged because of (a) the special situation of the questionnaire itself, where some parts depend partially on estimation, and (b) the diversity of sources (e.g., interviewers, respondents, editors, coders and data entry operators).

    The sources of these errors can be summarized as:
    • Some households were not at home and the interviewers could not meet them.
    • Some households did not pay attention to the questions in the questionnaire.
    • Some errors occurred due to the way the questions were asked by interviewers.
    • Respondents misunderstood some questions.
    • Questions related to consumption were answered by making estimations.

    Data appraisal

    The survey data are comparable geographically and over time: data were compared across different geographical areas and against the data of previous surveys and the 2007 census.

  20. Data from: Multiple systems in macaques for tracking prediction errors and...

    • zenodo.org
    zip
    Updated Aug 21, 2020
    Jan Grohn (2020). Multiple systems in macaques for tracking prediction errors and other types of surprise [Dataset]. http://doi.org/10.5281/zenodo.3993117
    Available download formats: zip
    Dataset updated
    Aug 21, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jan Grohn
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data and code to reproduce the figures and major analyses in

    Grohn J, Schüffelgen U, Neubert F-X, Verhagen L, Sallet J, Kolling N, Rushworth MFS. Multiple systems in macaques for tracking prediction errors and other types of surprise. PLOS Biology. 2020.

DIMITRI ABRAMOV (2021). Probability waves: adaptive cluster-based correction by convolution of p-value series from mass univariate analysis [Dataset]. http://doi.org/10.17632/rrm4rkr3xn.1

Data from: Probability waves: adaptive cluster-based correction by convolution of p-value series from mass univariate analysis

Dataset updated
Feb 8, 2021
Authors
DIMITRI ABRAMOV
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Dataset and Octave/MATLAB code and scripts for data analysis.

Background: Methods for p-value correction are criticized for either increasing Type II error or improperly reducing Type I error. This problem is worse when dealing with hundreds or even thousands of paired comparisons between waves or images, performed point-to-point. This text considers patterns in the probability vectors that result from multiple point-to-point comparisons between two event-related potential (ERP) waves (mass univariate analysis) in order to correct p-values, where clusters of significant p-values may indicate true H0 rejection.

New method: We used ERP data from normal subjects and from subjects with attention deficit hyperactivity disorder (ADHD) performing a cued forced two-choice test to study attention. The decimal logarithm of the p-vector (p') was convolved with a Gaussian window whose length was set as the shortest lag above which the autocorrelation of each ERP wave may be assumed to have vanished. To verify the reliability of the correction method, we ran Monte Carlo (MC) simulations to (1) evaluate confidence intervals of rejected and non-rejected areas of our data, (2) evaluate differences between corrected and uncorrected or simulated p-vectors in terms of the distribution of significant p-values, and (3) empirically verify the rate of Type I error (comparing 10,000 pairs of mixed samples with control and ADHD subjects).

Results: The method reduced the range of p'-values that did not show covariance with neighbors (Type I and also Type II errors). The differences between the simulated or raw p-vector and the corrected p-vectors were, respectively, minimal and maximal for the window length set by autocorrelation in the p-vector convolution.

Comparison with existing methods: Our method was less conservative, while FDR methods rejected basically all significant p-values for the Pz and O2 channels. The MC simulations, the gold-standard method for error correction, differed by 2.78±4.83% (all 20 channels) from the p-vector after correction, while the difference between the raw and corrected p-vectors was 5.96±5.00% (p = 0.0003).

Conclusion: As a cluster-based correction, the new method seems to be biologically and statistically suitable for correcting p-values in mass univariate analysis of ERP waves, and it adopts adaptive parameters to set the correction.
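The core idea of convolving the decimal logarithm of a p-value series with a Gaussian window can be sketched in Python as follows. This is an illustrative sketch only, not the dataset's Octave/MATLAB code: the window's sigma, its normalization to unit sum, the renormalization at the edges and the back-transformation to the p scale are all assumptions, and the window length is passed in directly rather than derived from the ERP autocorrelation.

```python
import math


def gaussian_window(length, sigma=None):
    """Odd-length Gaussian window normalized to sum to 1."""
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd integer")
    if sigma is None:
        sigma = length / 6.0  # assumption: window spans roughly +/- 3 sigma
    half = length // 2
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_p_values(p_values, window_length):
    """Convolve log10(p) with a Gaussian window and return smoothed p-values."""
    log_p = [math.log10(p) for p in p_values]
    window = gaussian_window(window_length)
    half = window_length // 2
    n = len(log_p)
    smoothed = []
    for i in range(n):
        acc, norm = 0.0, 0.0
        for k, w in enumerate(window):
            j = i + k - half
            if 0 <= j < n:          # skip taps that fall outside the series
                acc += w * log_p[j]
                norm += w
        smoothed.append(acc / norm)  # renormalize near the edges (assumption)
    return [10 ** v for v in smoothed]


# Made-up series: one isolated small p-value versus a cluster of five.
isolated = [0.5] * 5 + [0.01] + [0.5] * 5
cluster = [0.5] * 3 + [0.01] * 5 + [0.5] * 3
smoothed_isolated = smooth_p_values(isolated, window_length=5)
smoothed_cluster = smooth_p_values(cluster, window_length=5)
```

In this toy example the isolated p-value of 0.01 is pulled above 0.05 by its non-significant neighbors, while the center of the five-point cluster stays at 0.01, which is the qualitative behavior a cluster-based correction aims for.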
