76 datasets found
  1. random-points-sampling-r

    • kaggle.com
    zip
    Updated Nov 18, 2023
    Cite
    JORGE GARCIA-INIGUEZ (2023). random-points-sampling-r [Dataset]. https://www.kaggle.com/datasets/jorgegarciainiguez/random-points-sampling-r
    Available download formats: zip (712850 bytes)
    Dataset updated
    Nov 18, 2023
    Authors
    JORGE GARCIA-INIGUEZ
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by JORGE GARCIA-INIGUEZ

    Released under MIT


  2. Summary statistics of population and samples taken at different sampling...

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Maria M; Ibrahim M. Almanjahie; Muhammad Ismail; Ammara Nawaz Cheema (2023). Summary statistics of population and samples taken at different sampling schemes for n = 4, r = 1. [Dataset]. http://doi.org/10.1371/journal.pone.0275340.t001
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Maria M; Ibrahim M. Almanjahie; Muhammad Ismail; Ammara Nawaz Cheema
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary statistics of population and samples taken at different sampling schemes for n = 4, r = 1.

  3. Data from: Correction for bias in meta-analysis of little-replicated studies...

    • zenodo.org
    • data.niaid.nih.gov
    • +3 more
    txt
    Updated May 30, 2022
    Cite
    C. Patrick Doncaster; Rebecca Spake (2022). Data from: Correction for bias in meta-analysis of little-replicated studies [Dataset]. http://doi.org/10.5061/dryad.5f4g6
    Available download formats: txt
    Dataset updated
    May 30, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    C. Patrick Doncaster; Rebecca Spake
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description
    1. Meta-analyses conventionally weight study estimates on the inverse of their error variance, in order to maximize precision. Unbiased variability in the estimates of these study-level error variances increases with the inverse of study-level replication. Here we demonstrate how this variability accumulates asymmetrically across studies in precision-weighted meta-analysis, to cause undervaluation of the meta-level effect size or its error variance (the meta-effect and meta-variance).
    2. Small samples, typical of the ecological literature, induce big sampling errors in variance estimation, which substantially bias precision-weighted meta-analysis. Simulations revealed that biases differed little between random- and fixed-effects tests. Meta-estimation of a one-sample mean from 20 studies, with sample sizes of 3 to 20 observations, undervalued the meta-variance by ~20%. Meta-analysis of two-sample designs from 20 studies, with sample sizes of 3 to 10 observations, undervalued the meta-variance by 15-20% for the log response ratio (lnR); it undervalued the meta-effect by ~10% for the standardised mean difference (SMD).
    3. For all estimators, biases were eliminated or reduced by a simple adjustment to the weighting on study precision. The study-specific component of error variance prone to sampling error and not parametrically attributable to study-specific replication was replaced by its cross-study mean, on the assumption of random sampling from the same population variance for all studies, and sufficient studies for averaging. Weighting each study by the inverse of this mean-adjusted error variance universally improved accuracy in estimation of both the meta-effect and its significance, regardless of number of studies. For comparison, weighting only on sample size gave the same improvement in accuracy, but could not sensibly estimate significance.
    4. For the one-sample mean and two-sample lnR, adjusted weighting also improved estimation of between-study variance by DerSimonian-Laird and REML methods. For random-effects meta-analysis of SMD from little-replicated studies, the most accurate meta-estimates obtained from adjusted weights following conventionally-weighted estimation of between-study variance.
    5. We recommend adoption of weighting by inverse adjusted-variance for meta-analyses of well- and little-replicated studies, because it improves accuracy and significance of meta-estimates, and it can extend the scope of the meta-analysis to include some studies without variance estimates.
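
    The adjustment described in point 3 can be sketched for the simplest case, a fixed-effect meta-estimate of a one-sample mean. This is an illustration under stated assumptions, not the authors' code: the function name `meta_estimate` is hypothetical, and the study error variance is read here as s²/n, with s² replaced by its cross-study mean in the adjusted case.

```python
from statistics import fmean


def meta_estimate(means, variances, sizes, adjusted=False):
    """Fixed-effect meta-estimate of a one-sample mean (illustrative sketch).

    means, variances, sizes: per-study sample mean, sample variance and n.
    adjusted=False: conventional weights, the inverse of s_i^2 / n_i.
    adjusted=True: s_i^2 is replaced by the cross-study mean variance
    (an assumed reading of the mean-adjusted error variance), so the
    weights depend on n_i only through a shared variance.
    """
    if adjusted:
        mean_var = fmean(variances)
        error_vars = [mean_var / n for n in sizes]
    else:
        error_vars = [s2 / n for s2, n in zip(variances, sizes)]
    weights = [1.0 / v for v in error_vars]
    total = sum(weights)
    effect = sum(w * m for w, m in zip(weights, means)) / total
    meta_variance = 1.0 / total  # fixed-effect variance of the meta-effect
    return effect, meta_variance


# Made-up inputs: two studies with equal n but different sample variances.
conventional = meta_estimate([1.0, 3.0], [1.0, 3.0], [4, 4])
mean_adjusted = meta_estimate([1.0, 3.0], [1.0, 3.0], [4, 4], adjusted=True)
```

    With conventional weights the study that happens to have the smaller sample variance dominates the estimate; with the adjusted weights, studies of equal size count equally.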
  4. Data from: Robust inference under r-size-biased sampling without replacement...

    • tandf.figshare.com
    xlsx
    Updated Nov 28, 2023
    Cite
    P. Economou; G. Tzavelas; A. Batsidis (2023). Robust inference under r-size-biased sampling without replacement from finite population [Dataset]. http://doi.org/10.6084/m9.figshare.11542974.v1
    Available download formats: xlsx
    Dataset updated
    Nov 28, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    P. Economou; G. Tzavelas; A. Batsidis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The case of size-biased sampling of known order from a finite population without replacement is considered. The behavior of such a sampling scheme is studied with respect to the sampling fraction. Based on a simulation study, it is concluded that such a sample cannot be treated either as a random sample from the parent distribution or as a random sample from the corresponding r-size weighted distribution. As the sampling fraction increases, the bias in the sample decreases, resulting in a transition from an r-size-biased sample to a random sample. A modified version of a likelihood-free method is adopted for making statistical inference for the unknown population parameters, as well as for the size of the population when it is unknown. A simulation study, which takes the sampling fraction into consideration, demonstrates that the proposed method presents better and more robust behavior compared to approaches that treat the r-size-biased sample either as a random sample from the parent distribution or as a random sample from the corresponding r-size weighted distribution. Finally, a numerical example which motivates this study illustrates our results.
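
    The sampling scheme itself can be sketched as successive draws in which each remaining unit is selected with probability proportional to its size raised to the power r. This is one plausible reading of r-size-biased sampling without replacement, offered as an assumption; the function name `size_biased_sample` is hypothetical and the code is not taken from the dataset.

```python
import random


def size_biased_sample(values, k, r=1, seed=None):
    """Draw k distinct indices, each draw weighted by values[i] ** r.

    Assumes every value is positive and k <= len(values).
    With r = 0 all weights are equal, which gives simple random sampling.
    """
    rng = random.Random(seed)
    remaining = list(range(len(values)))
    chosen = []
    for _ in range(k):
        weights = [values[i] ** r for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(pick)  # without replacement
        chosen.append(pick)
    return chosen


# Made-up population; a sampling fraction of 1 returns every unit once.
population = [1.0, 2.0, 3.0, 4.0]
full_sample = size_biased_sample(population, k=4, r=2, seed=0)
```

    When k equals the population size, every unit is returned exactly once, which matches the abstract's point that the size bias fades as the sampling fraction grows.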

  5. AWC to 60cm DSM data of the Roper catchment NT generated by the Roper River...

    • data.csiro.au
    • researchdata.edu.au
    Updated Apr 16, 2024
    Cite
    Ian Watson; Mark Thomas; Seonaid Philip; Uta Stockmann; Ross Searle; Linda Gregory; jason hill; Elisabeth Bui; John Gallant; Peter R Wilson; Peter Wilson (2024). AWC to 60cm DSM data of the Roper catchment NT generated by the Roper River Water Resource Assessment [Dataset]. http://doi.org/10.25919/y0v9-7b58
    Dataset updated
    Apr 16, 2024
    Dataset provided by
    CSIRO: http://www.csiro.au/
    Authors
    Ian Watson; Mark Thomas; Seonaid Philip; Uta Stockmann; Ross Searle; Linda Gregory; jason hill; Elisabeth Bui; John Gallant; Peter R Wilson; Peter Wilson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jul 1, 2020 - Jun 30, 2023
    Dataset funded by
    Northern Territory Department of Environment, Parks and Water Security
    CSIRO: http://www.csiro.au/
    Description

    AWC to 60cm is one of 18 attributes of soils chosen to underpin the land suitability assessment of the Roper River Water Resource Assessment (ROWRA) through the digital soil mapping process (DSM). AWC (available water capacity) indicates the ability of a soil to retain and supply water for plant growth. This AWC raster data represents a modelled dataset of AWC to 60cm (mm of water to 60cm of soil depth) and is derived from analysed site data, spline calculations and environmental covariates. AWC is a parameter used in land suitability assessments for rainfed cropping and for water use efficiency in irrigated land uses. This raster data provides improved soil information used to underpin and identify opportunities and promote detailed investigation for a range of sustainable regional development options and was created within the ‘Land Suitability’ activity of the CSIRO ROWRA.

    A companion dataset and statistics reflecting reliability of this data are also provided and are described in the lineage section of this metadata record. Processing information is supplied in ranger R scripts and attributes were modelled using a Random Forest approach. The DSM process is described in the CSIRO ROWRA published report ‘Soils and land suitability for the Roper catchment, Northern Territory’, a technical report from the CSIRO Roper River Water Resource Assessment to the Government of Australia. The Roper River Water Resource Assessment provides a comprehensive overview and integrated evaluation of the feasibility of aquaculture and agriculture development in the Roper catchment NT as well as the ecological, social and cultural (indigenous water values, rights and aspirations) impacts of development.

    Lineage: This AWC to 60cm dataset has been generated from a range of inputs and processing steps. Following is an overview. For more information refer to the CSIRO ROWRA published reports and in particular ‘Soils and land suitability for the Roper catchment, Northern Territory’, a technical report from the CSIRO Roper River Water Resource Assessment to the Government of Australia.

    1. Collated existing data (relating to: soils, climate, topography, natural resources, remotely sensed, of various formats: reports, spatial vector, spatial raster etc).
    2. Selection of additional soil and land attribute site data locations by a conditioned Latin hypercube statistical sampling method applied across the covariate data space.
    3. Fieldwork was carried out to collect new attribute data, soil samples for analysis and build an understanding of geomorphology and landscape processes.
    4. Database analysis was performed to extract the data to specific selection criteria required for the attribute to be modelled.
    5. The R statistical programming environment was used for the attribute computing. Models were built from selected input data and covariate data using predictive learning from a Random Forest approach implemented in the ranger R package.
    6. Create AWC to 60cm Digital Soil Mapping (DSM) attribute raster dataset. DSM data is a geo-referenced dataset, generated from field observations and laboratory data, coupled with environmental covariate data through quantitative relationships. It applies pedometrics - the use of mathematical and statistical models that combine information from soil observations with information contained in correlated environmental variables, remote sensing images and some geophysical measurements.
    7. Companion predicted reliability data was produced from the 500 individual Random Forest attribute models created.
    8. QA: Quality assessment of this DSM attribute data was conducted by three methods.
       Method 1: Statistical (quantitative) method of the model and input data. Testing the quality of the DSM models was carried out using data withheld from model computations and expressed as OOB and R squared results, giving an estimate of the reliability of the model predictions. These results are supplied.
       Method 2: Statistical (quantitative) assessment of the spatial attribute output data presented as a raster of the attribute's “reliability”. This used the 500 individual trees of the attribute's RF models to generate 500 datasets of the attribute to estimate model reliability for each attribute. For continuous attributes the method for estimating reliability is the Coefficient of Variation. This data is supplied.
       Method 3: Collecting independent external validation site data combined with on-ground expert (qualitative) examination of outputs during validation field trips. Across each of the study areas a two week validation field trip was conducted using a new validation site set which was produced by a random sampling design based on conditioned Latin Hypercube sampling using the reliability data of the attribute. The modelled DSM attribute value was assessed against the actual on-ground value. These results are published in the report cited in this metadata record.
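
    The reliability measure in Method 2 can be illustrated with a small sketch: for one raster cell, the Coefficient of Variation is the spread of the per-tree predictions divided by their mean. The function name `coefficient_of_variation`, the use of the sample standard deviation, and the cell values are all assumptions made here; the record does not state these details, and the actual processing was done in R.

```python
from statistics import fmean, stdev


def coefficient_of_variation(tree_predictions):
    """Standard deviation of per-tree predictions divided by their mean.

    Uses the sample standard deviation (an assumption) and expects a
    non-zero mean and at least two predictions.
    """
    return stdev(tree_predictions) / fmean(tree_predictions)


# Made-up predictions (mm of water) for two cells; the real workflow
# would hold 500 values per cell, one per tree.
cells = {
    "cell_a": [60.0, 61.0, 59.0, 60.0],
    "cell_b": [40.0, 80.0, 55.0, 65.0],
}
reliability = {name: coefficient_of_variation(p) for name, p in cells.items()}
```

    A lower value means the trees agree more closely for that cell, so `cell_a` would be reported as more reliable than `cell_b`.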

  6. Test_Files

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Dec 23, 2020
    Cite
    Buchowski, Maciej (2020). Test_Files [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000474626
    Dataset updated
    Dec 23, 2020
    Authors
    Buchowski, Maciej
    Description

    From the group of selected participant recordings (n=400), 200 were randomly allocated to the validation group using the random sampling function in R. The validation (test) group was used to validate DT prediction from
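
    The random allocation described above (200 of 400 recordings assigned to the validation group) was done in R. A rough Python equivalent is sketched below for illustration only; the function name `split_validation`, the integer recording IDs and the seed are hypothetical.

```python
import random


def split_validation(recording_ids, n_validation, seed=None):
    """Randomly allocate n_validation recordings to the validation group.

    Returns (validation, remainder); the two groups are disjoint and
    together cover every input ID.
    """
    rng = random.Random(seed)
    validation = set(rng.sample(recording_ids, n_validation))
    remainder = [rid for rid in recording_ids if rid not in validation]
    return sorted(validation), remainder


# Hypothetical IDs 0..399 standing in for the 400 selected recordings.
validation_group, other_group = split_validation(list(range(400)), 200, seed=42)
```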

  7. Additional file 3: of Aiming for a representative sample: Simulating random...

    • figshare.com
    txt
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Loan van Hoeven; Mart Janssen; Kit Roes; Hendrik Koffijberg (2023). Additional file 3: of Aiming for a representative sample: Simulating random versus purposive strategies for hospital selection [Dataset]. http://doi.org/10.6084/m9.figshare.c.3624569_D2.v1
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Loan van Hoeven; Mart Janssen; Kit Roes; Hendrik Koffijberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R code for simulating sampling strategies. Description: R code that creates an exemplary data set and simulates the sampling strategies. (R 26 kb)

  8. Ithaka S+R Faculty Survey, United States, 2021

    • icpsr.umich.edu
    ascii, delimited, r +3
    Updated Mar 9, 2023
    + more versions
    Cite
    Blankstein, Melissa (2023). Ithaka S+R Faculty Survey, United States, 2021 [Dataset]. http://doi.org/10.3886/ICPSR38593.v1
    Available download formats: spss, r, delimited, sas, stata, ascii
    Dataset updated
    Mar 9, 2023
    Dataset provided by
    Inter-university Consortium for Political and Social Research: https://www.icpsr.umich.edu/web/pages/
    Authors
    Blankstein, Melissa
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/38593/terms

    Time period covered
    Oct 6, 2021 - Dec 13, 2021
    Area covered
    United States
    Description

    The eighth cycle of the Ithaka S+R Faculty Survey queried a random sample of higher education faculty members in the United States to learn about their attitudes and practices related to their research and teaching. Respondents were asked about resource discovery and access; research topics and practices; research dissemination, including open access, data management, and preservation; instruction and perceptions of student research skills; the role and value of the academic library; and open-educational resources. Demographic variables include the respondent's age, gender, primary academic field, title or role, institution's Carnegie classification, how many years the respondent has worked at their current college or university, how many years the respondent has worked in their field, the format of the courses they are currently teaching, if any (synchronous, asynchronous, or a mix of both), and whether the respondent primarily identifies as a researcher, teacher, or somewhere in between.

  9. Dataset: Differences between Neurodivergent and Neurotypical Software...

    • zenodo.org
    bin, zip
    Updated Jan 31, 2025
    Cite
    Anonymous; Anonymous (2025). Dataset: Differences between Neurodivergent and Neurotypical Software Engineers: Analyzing the 2022 Stack Overflow Survey [Dataset]. http://doi.org/10.5281/zenodo.14779344
    Available download formats: bin, zip
    Dataset updated
    Jan 31, 2025
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Anonymous; Anonymous
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    # Analysis and Figure Generation Scripts for the Paper "Differences between Neurodivergent and Neurotypical Software Engineers: Analyzing the 2022 Stack Overflow Survey"

    This repository contains the necessary R scripts and session data to re-run all tests reported in the paper, as well as generate the figures we use (and a few others we did not use).

    ## Files
    - SO_analysis_EASE25.R: The main R script executing all other parts. Running this script will calculate the p-values of all tests, sorted by condition (lines 10 to 602). These are the same as reported in the paper in Tables 1 to 7. Additionally, figures will be generated (lines 603 onwards).
    - SO_data_filtered_sampled.RData: A RData file containing the filtered and sampled data and functions necessary to run the tests/generate the figures.
    - 01_SO_preprocessing.R and 02_SO_random_sampling.R: Scripts containing the filtering and sampling logic. If those files are executed within SO_analysis_EASE25.R instead of loading the RData file (lines 5 to 8), the filtering and sampling are re-run. Note that this results in different random samples, and therefore different results than in our paper. The effect size calculations then also have to be adjusted to the results that are significant (lines 596 to 601).
    - generatedGraphs.zip: The graph files generated by the main R script.
    - paperFigures.zip: The graphs we used in the paper. These are the generated graphs, but edited for readability.

    ## License
    The Public 2022 Stack Overflow Developer Survey Results is made available under the Open Database License (ODbL): http://opendatacommons.org/licenses/odbl/1.0/. Any rights in individual contents of the database are licensed under the Database Contents License: http://opendatacommons.org/licenses/dbcl/1.0/.
    We hereby would like to attribute Stack Overflow as the source of these results. All data we make available (i.e., the RData file) is a derivative work, and as such is also shared under the ODbL.

  10. Demographic Health Survey 2007 - Nauru

    • microdata.pacificdata.org
    Updated Aug 18, 2013
    + more versions
    Cite
    Nauru Bureau of Statistics (2013). Demographic Health Survey 2007 - Nauru [Dataset]. https://microdata.pacificdata.org/index.php/catalog/25
    Dataset updated
    Aug 18, 2013
    Dataset authored and provided by
    Nauru Bureau of Statistics
    Time period covered
    2007
    Area covered
    Nauru
    Description

    Abstract

    The main objective of a demographic household survey (DHS) is to provide estimates of a number of basic demographic and health variables. This is done through interviews with a scientifically selected probability sample that is chosen from a well-defined population.

    The 2007 Nauru Demographic and Health Survey (2007 NDHS) was one of four pilot demographic and health surveys conducted in the Pacific under an Asian Development Bank ADB/ Secretariat of the Pacific Community (SPC) Regional DHS Pilot Project. The primary objective of this survey was to provide up-to-date information for policy-makers, planners, researchers and programme managers, for use in planning, implementing, monitoring and evaluating population and health programmes within the country. The survey was intended to provide key estimates of Nauru's demographics and health situation. The findings of the 2007 NDHS are very important in measuring the achievements of family planning and other health programmes. To ensure better understanding and use of these data, the results of this survey should be widely disseminated at different planning levels. Different dissemination techniques will be used to reach different segments of society.

    The primary purpose of the 2007 NDHS was to furnish policy-makers and planners with detailed information on fertility, family planning, infant and child mortality, maternal and child health, nutrition, and knowledge of HIV and AIDS and other sexually transmitted infections.

    NOTE: The only dissemination used was wide distribution of the report. A planned data use workshop was not undertaken. Hence there are some misconceptions about, and a lack of awareness of, the results obtained from the survey. The report is provided on the NBOS website free for download.

    Geographic coverage

    National Coverage - Districts

    Analysis unit

    • Households
    • Children (0-14yrs)
    • Individual women of reproductive age (15-49 yrs)
    • Individual men of reproductive age (15yrs+)
    • Facilities providing reproductive and child health services

    Universe

    The survey covered all household members (usual residents):

    • All children (aged 0-14 years) resident in the household
    • All women of reproductive age (15-49 years) resident in all households
    • All males (15 years and above) in every second household (approx. 50%) resident in selected households

    Results: The 2007 Nauru Demographic Health Survey (2007 NDHS) is a nationally representative survey of 655 eligible women (aged 15-49) and 392 eligible men (aged 15 and above).

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    IDG NOTES: Locate sampling documentation with SPC (Graeme Brown) and internal files. Add in this section. Or, as a second option, dilute Appendix A Sampling and extract key issues.

    ESTIMATES OF SAMPLING ERRORS - Refer to Appendix A of final NDHS2007 report or; - External Resources - 2007 DHS - Appendix A and B Sampling (to be created separately by IDG, progress ongoing)

    Sampling deviation

    IDG NOTES: Locate sampling documentation with Macro and internal files. Add in this section. Or, as a second option, dilute Appendix B Sampling and extract key issues.

    ESTIMATES OF SAMPLING ERRORS - Refer to Appendix B of final NDHS2007 report or;

    • External Resources
      • 2007 DHS - Appendix A and B Sampling (to be created separately by IDG, progress ongoing)

    Extract:

    In the 2007 NDHS Report of the survey results, sampling errors for selected variables have been presented in a tabular format. The sampling error tables should include:

    • Variable name
    • R: Value of the estimate
    • SE: Sampling error of the estimate
    • N: Unweighted number of cases on which the estimate is based
    • WN: Weighted number of cases
    • DEFT: Design effect value that compensates for the loss of precision that results from using cluster rather than simple random sampling
    • SE/R: Relative standard error (i.e. ratio of the sampling error to the value estimate)
    • R-2SE: Lower limit of the 95% confidence interval
    • R+2SE: Upper limit of the 95% confidence interval (never >1.000 for a proportion)
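
    The derived columns of such a table follow directly from R and SE. The sketch below is illustrative only: the function names and example numbers are hypothetical, and DEFT is computed using its usual definition, the ratio of the standard error under the actual design to the standard error a simple random sample of the same size would give.

```python
def sampling_error_summary(r, se, is_proportion=False):
    """Derive SE/R and the approximate 95% limits R-2SE and R+2SE.

    For a proportion the upper limit is capped at 1.0, as the table
    notes require.
    """
    upper = r + 2 * se
    if is_proportion:
        upper = min(upper, 1.0)
    return {"SE/R": se / r, "R-2SE": r - 2 * se, "R+2SE": upper}


def deft(se_design, se_srs):
    """Design effect: SE under the actual design divided by SE under SRS."""
    return se_design / se_srs


# Made-up estimate: a proportion of 0.5 with a sampling error of 0.05.
row = sampling_error_summary(0.5, 0.05, is_proportion=True)
```

    A DEFT above 1 indicates that clustering has cost some precision relative to simple random sampling.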

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    DHS questionnaire for women cover the following sections:

    • Background characteristics (age, education, religion, etc)
    • Reproductive history
    • Knowledge and use of contraception methods
    • Antenatal care, delivery care and postnatal care
    • Breastfeeding and infant feeding
    • Immunization, child health and nutrition
    • Marriage and recent sexual activity
    • Fertility preferences
    • Knowledge about HIV/AIDS and other sexually transmitted infections
    • Husband's background and women's work

    The men's questionnaire covers the same except for sections 4, 5, 6 which are not applicable to men.

    It was also recognized that some countries have a need for special information that is not contained in the core questionnaire. Separate questionnaire modules were developed on a series of topics. These topics are optional and include:

    • maternal mortality
    • pill-taking behaviour
    • sterilization experience
    • children's education
    • women's status
    • domestic violence
    • health expenditures
    • consanguinity

    The Papua New Guinea (PNG) questionnaire was proposed for Nauru to adapt because, in comparison to the existing DHS model, it is not as lengthy or time-consuming. The PNG questionnaire also dealt with the high incidence of alcohol and tobacco use in Nauru. Questions on HIV/AIDS and STI knowledge were included in the men's questionnaire, although they were not included in the PNG questionnaire.

    Response rate

    IDG NOTES: Locate response rate documentation with SPC (Graeme Brown) and internal files. Add in this section.

  11. Multiple Indicator Cluster Survey 2010 - Roma Settlements - Serbia

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1 more
    Updated Sep 26, 2013
    Cite
    Statistical Office of the Republic of Serbia (2013). Multiple Indicator Cluster Survey 2010 - Roma Settlements - Serbia [Dataset]. https://microdata.worldbank.org/index.php/catalog/1307
    Dataset updated
    Sep 26, 2013
    Dataset provided by
    UNICEF: http://www.unicef.org/
    Statistical Office of the Republic of Serbia: http://www.stat.gov.rs/
    Time period covered
    2010
    Area covered
    Serbia
    Description

    Abstract

    The Serbia Multiple Indicator Cluster Survey (MICS) is a household survey programme conducted in 2010 by UNICEF and the Statistical Office of the Republic of Serbia (SORS). The survey provides valuable information on the situation of children, women and men in Serbia, and was based, in large part, on the needs to monitor progress towards goals and targets emanating from recent international agreements: the Millennium Declaration, and the Plan of Action of A World Fit For Children. Both of these commitments build upon promises made by the international community at the 1990 World Summit for Children.

    The fourth round of the Multiple Indicator Cluster Survey represents a large source of data for reporting on progress towards the aforementioned goals. The survey provides a rich foundation of comparative data for comprehensive progress reporting, especially regarding the situation of the most vulnerable children (children in the poorest households, Roma children or those living in rural areas). It also provides important information for the new UNICEF Country Programme 2011-2015 as well as the UNDAF 2011-2015. This final report presents the results of the indicators and topics covered in the survey.

    The datasets documented here cover the Roma settlements sample, which is representative of the population living in Roma settlements in Serbia. A total of 1,815 Roma households were selected: 1,311 households with children and 504 households without children. A stratified, two-stage random sampling approach was used for the selection of the survey sample.

    Geographic coverage

    National

    Analysis unit

    • individuals
    • households

    Universe

    The survey covered household members in Roma settlements, all women aged between 15-49 years, all children under 5 living in the household, and all men aged 15-29 years.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The primary objective of the sample design for the Roma settlements Multiple Indicator Cluster Survey was to produce statistically reliable estimates of most indicators, at the level of Serbia, and for urban and rural areas.

    A stratified, two-stage random sampling approach was used for the selection of the survey sample.

    The target sample size for the Roma settlements was calculated as 1,800 households and 100 enumeration areas, considering the proposed formula and budget available. For the calculation of the sample size, the key indicator used was the percentage of children aged 0-4 years who had had acute respiratory infections.

    The resulting number of households from this exercise was about 2,700 households, which is the sample size needed to provide a large number of children under 5 (about 1,300) for drawing reliable conclusions. Therefore, in order to reduce the number of households in the sample without losing estimation reliability, the sample was stratified into households with and without children aged 0-4 years. The required number of households in each category was obtained supposing an overall sample of 1,800 households, 100 clusters and the same number of households with children under 5 per cluster. Assuming one child under 5 per household and considering the required number of sample children, the total sample size was calculated as 1,300 households with children under 5 (13 per cluster) and 500 households without children under 5 (5 per cluster). Thus, the overall number of households to be selected per cluster was determined as 18 households.
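
    The allocation arithmetic in the paragraph above can be checked with a few lines. This is only a restatement of the published figures; the function and key names are hypothetical.

```python
def cluster_allocation(clusters, with_children_per_cluster, without_children_per_cluster):
    """Total households implied by a fixed per-cluster allocation."""
    per_cluster = with_children_per_cluster + without_children_per_cluster
    return {
        "with_children": clusters * with_children_per_cluster,
        "without_children": clusters * without_children_per_cluster,
        "per_cluster": per_cluster,
        "total": clusters * per_cluster,
    }


# Figures from the sample design: 100 clusters, 13 + 5 households each.
design = cluster_allocation(100, 13, 5)
```

    The result reproduces the 1,300 and 500 household targets, 18 households per cluster and the overall sample of 1,800.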

    Stratification of enumeration areas for Roma settlements was done according to type of settlement (urban and rural), and territory, to the three strata: Vojvodina, Belgrade and Central Serbia without Belgrade.

    Sample allocation of enumeration areas according to territory and type of settlement was not proportional to the number of Roma households. In order to produce estimates with better precision for territories and urban/rural domains, the number of enumeration areas for Vojvodina and rural domains was increased.

    The sampling procedures are more fully described in "Multiple Indicator Cluster Survey 2010 - Final Report" pp.261-263.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaires for Roma settlements are the Generic MICS questionnaires based on the MICS4 model questionnaire with some modifications and additions. Household questionnaires were administered in each household, which collected various information on household members including sex, age and relationship. The household questionnaire includes household listing form, education, water and sanitation, household characteristics, child discipline and hand washing.

    In addition to a household questionnaire, questionnaires were administered in each household for women age 15-49, children under age five and men age 15-29. For children, the questionnaire was administered to the mother or primary caretaker of the child.

    The women's questionnaire includes woman's background, access to mass media and ICT, child mortality, desire for last birth, maternal and newborn health, illness symptoms, contraception, unmet need, attitudes toward domestic violence, marriage/union, sexual behavior, HIV/AIDS, and life satisfaction.

    The children's questionnaire includes child's age, birth registration, early childhood development, breastfeeding, care of illness, and anthropometry.

    The men's questionnaire includes man's background, access to mass media and ICT, marriage/union, contraception, attitudes toward domestic violence, sexual behavior, HIV/AIDS, and life satisfaction.

    The questionnaires were developed in English from the MICS4 Model Questionnaires, and were translated into Serbian. The Serbian versions were pre-tested in Belgrade during September 2010 and modifications were made to the wording and translation of the questionnaires based on the results of the pre-test.

    Cleaning operations

    Data was entered using the CSPro software. The data entry was carried out on 10 microcomputers by 20 data entry operators and 4 data entry supervisors. In order to ensure quality control, all questionnaires were double entered and internal consistency checks were performed. Procedures and standard programmes developed under the global MICS4 programme and adapted to Serbia’s questionnaire were used throughout.

    Data processing began simultaneously with data collection and was completed in March 2011. Data was analysed using the Statistical Package for Social Sciences (SPSS) software programme, Version 18, and the model syntax and tabulation plans developed by UNICEF were used for this purpose.

    Response rate

    The response rate of households is 96 percent. (Of the 1815 households selected for the sample, 1782 were found to be occupied. Of these, 1711 were successfully interviewed.)

    The response rate of women is 95 percent within interviewed households. (In the interviewed households, 2234 women aged between 15-49 years were identified. Of these, 2118 were successfully interviewed.)

    The response rate of children is 99 percent within interviewed households. (1618 children under the age of five were listed in the household questionnaire. Questionnaires were completed for 1604 of these children.)

    The response rate of men is 78 percent within interviewed households. (1121 men aged between 15-29 years were identified. Of these, 877 were successfully interviewed.)

    Overall response rates of 91, 95 and 75 percent respectively are calculated for the women’s, under-5’s and men’s interviews.
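
    The quoted rates can be reproduced from the counts above. The sketch below assumes that each overall rate is the household response rate multiplied by the corresponding individual response rate. The text does not state this rule, but it is consistent with the reported figures.

```python
def rate(completed, eligible):
    """Response rate as a percentage."""
    return 100 * completed / eligible

# Counts quoted in the response-rate paragraphs.
household = rate(1711, 1782)
individual = {
    "women": rate(2118, 2234),
    "children under 5": rate(1604, 1618),
    "men": rate(877, 1121),
}

# Assumption: overall rate = household rate x individual rate.
overall = {group: household * r / 100 for group, r in individual.items()}

print(f"households: {household:.0f}%")
for group, r in individual.items():
    print(f"{group}: {r:.0f}% within households, {overall[group]:.0f}% overall")
```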

    Sampling error estimates

    Sampling errors are a measure of the variability between the estimates from all possible samples. The extent of variability is not known exactly, but can be estimated statistically from the survey data.

    The following sampling error measures are presented for each of the selected indicators:

    • Standard error (se): Sampling errors are usually measured in terms of standard errors for particular indicators (means, proportions etc). Standard error is the square root of the variance of the estimate. The Taylor linearization method is used for the estimation of standard errors.
    • Coefficient of variation (se/r) is the ratio of the standard error to the value of the indicator, and is a measure of the relative sampling error.
    • Design effect (deff) is the ratio of the actual variance of an indicator, under the sampling method used in the survey, to the variance calculated under the assumption of simple random sampling. The square root of the design effect (deft) is used to show the efficiency of the sample design in relation to the precision. A deft value of 1.0 indicates that the sample design is as efficient as a simple random sample, while a deft value above 1.0 indicates the increase in the standard error due to the use of a more complex sample design.
    • Confidence limits are calculated to show the interval within which the true value for the population can be reasonably assumed to fall, with a specified level of confidence. For any given statistic calculated from the survey, the value of that statistic will fall within a range of plus or minus two times the standard error (r + 2se or r - 2se) of the statistic in 95 percent of all possible samples of identical size and design.
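
    As a rough illustration of how these measures relate to one another, the sketch below derives them from an estimate r, its standard error, and the variance that simple random sampling would have given. The function name and the input values are hypothetical and are not survey results. The standard error is taken as given here, whereas the survey estimates it by Taylor linearization.

```python
import math

def sampling_error_summary(r, se, srs_variance):
    """Derive the measures defined above from an estimate and its standard error.

    r            -- value of the indicator (hypothetical input)
    se           -- standard error under the actual design (taken as given)
    srs_variance -- variance under simple random sampling (hypothetical input)
    """
    deff = se ** 2 / srs_variance      # ratio of actual to SRS variance
    return {
        "cv": se / r,                  # relative sampling error
        "deff": deff,
        "deft": math.sqrt(deff),       # square root of the design effect
        "lower": r - 2 * se,           # approximate 95 percent limits
        "upper": r + 2 * se,
    }

# Hypothetical values, for illustration only.
summary = sampling_error_summary(r=0.40, se=0.02, srs_variance=0.00025)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```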

    For the calculation of sampling errors from MICS data, the SPSS Version 18 Complex Samples module has been used. Sampling errors are calculated for indicators of primary interest, for the national level and for urban and rural areas. Five of the selected indicators are based on household members, 18 are based on women, 8 are based on men and 12 are based on children under 5. All

  12. S1 File -

    • plos.figshare.com
    zip
    Updated Nov 27, 2024
    Cite
    Constantino A. García; Sofía Bardají; Pablo Pérez-Tirador; Abraham Otero (2024). S1 File - [Dataset]. http://doi.org/10.1371/journal.pone.0309055.s004
    Available download formats: zip
    Dataset updated
    Nov 27, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Constantino A. García; Sofía Bardají; Pablo Pérez-Tirador; Abraham Otero
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Heart Rate Variability (HRV) analysis aims to characterize the physiological state affecting heart rate, and identify potential markers of underlying pathologies. This typically involves calculating various HRV indices for each recording of two or more populations. Then, statistical tests are used to find differences. The normality of the indices, the number of groups being compared, and the correction of the significance level should be considered in this step. Especially for large studies, this process is tedious and error-prone. This paper presents RHRVEasy, an R open-source package that automates all the steps of HRV analysis. RHRVEasy takes as input a list of folders, each containing all the recordings of the same population. The package loads and preprocesses heart rate data, and computes up to 31 HRV time, frequency, and non-linear indices. Notably, it automates the computation of non-linear indices, which typically demands manual intervention. It then conducts hypothesis tests to find differences between the populations, adjusting significance levels if necessary. It also performs a post-hoc analysis to identify the differing groups if there are more than two populations. RHRVEasy was validated using a database of healthy subjects, and another of congestive heart failure patients. Significant differences in many HRV indices are expected between these groups. Two additional groups were constructed by random sampling of the original databases. Each of these groups should present no statistically significant differences with the group from which it was sampled, and it should present differences with the other two groups. All tests produced the expected results, demonstrating the software’s capability in simplifying HRV analysis. Code is available on https://github.com/constantino-garcia/RHRVEasy.

  13. Radial sampling design versus winding stairs sampling design without random...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 25, 2023
    Cite
    Band, Leah R.; Owen, Markus R.; Jones, Matthew D.; Rutjens, Rik J. L. (2023). Radial sampling design versus winding stairs sampling design without random column permutations. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000940918
    Dataset updated
    Oct 25, 2023
    Authors
    Band, Leah R.; Owen, Markus R.; Jones, Matthew D.; Rutjens, Rik J. L.
    Description

    k inputs are considered here, resulting in k + 1 points in parameter space. The base point is given by (a1, a2, …, ak). In OT the ai are elements from a discrete set and bi = ai ± |δi|, whereas in the radial design (as in [30]) ai and bi can take any value in [0, 1]. Table adapted from [30].
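
    A minimal sketch of how the k + 1 points of a radial design might be generated. It assumes the common construction in which each additional point equals the base point with a single coordinate ai replaced by bi. The caption does not spell out this rule, and the function name is hypothetical.

```python
import random

def radial_points(k, rng):
    """Return k + 1 points in [0, 1]^k: the base point plus one step per input.

    Assumption: step i changes only coordinate i, from a_i to b_i.
    """
    a = [rng.random() for _ in range(k)]   # base point (a1, ..., ak)
    b = [rng.random() for _ in range(k)]   # auxiliary values (b1, ..., bk)
    points = [a]
    for i in range(k):
        step = list(a)
        step[i] = b[i]                     # move along input i only
        points.append(step)
    return points

rng = random.Random(30)                    # arbitrary seed for repeatability
points = radial_points(k=4, rng=rng)
for p in points:
    print([round(x, 3) for x in p])
```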

  14. Quarterly Labour Force Survey 2011 - St. Lucia

    • catalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Central Statistics Office of Saint Lucia (2019). Quarterly Labour Force Survey 2011 - St. Lucia [Dataset]. https://catalog.ihsn.org/index.php/catalog/4329
    Dataset updated
    Mar 29, 2019
    Dataset authored and provided by
    Central Statistics Office of Saint Lucia
    Time period covered
    2011
    Area covered
    Saint Lucia
    Description

    Abstract

    The 2011 Third Quarter Labour Force Survey aims to collect information on the supply side of the labour market. It provides information on the extent of available and unused labour time and on relationships between employment and income. Thus, the data collected can be used for:

    Macro-economic monitoring: From an economic point of view, a main objective of collecting data on the economically active population is to provide basic information on the size and structure of a country's workforce. The unemployment rate in particular is widely used as an overall indicator of the current performance of a country's economy.

    Human resources development: The economy is changing all the time. In order to meet the needs of the changing economy, people need to be trained. These areas of training must therefore be identified.

    Employment policies: For an economy to work at its maximum potential, all persons wanting to have work should have jobs. Some persons may wish to have full-time jobs, and can only find part-time work. We need to know what proportion of the labour force these people represent in order to assess the social effects of government employment policies.

    Income Support and social programmes: For the majority of people, employment income is their main means of support. People need not only jobs, but more importantly, productive jobs in order to receive reasonable incomes. We need to know what levels of income are being earned by different groups of persons.

    Geographic coverage

    National Coverage

    Analysis unit

    • Households;
    • Individuals.

    Universe

    The survey covered all de jure non-institutional household members (usual residents). It focuses on the employment, unemployment and current activity or inactivity status of all persons aged 15 years and over resident in the household.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Every quarter (three months) approximately 1,000 households are interviewed. There is a one-third overlap in the households interviewed between consecutive rounds of the survey.

    The Multi-Stage sampling procedure developed for the St. Lucia MS (Master Sample) Frame is used for the execution of the labour force survey:

    The two-stage process of sample selection in the ST. LUCIA MS entails the selection of the PSUs within the districts, followed by the systematic selection of the cluster of households, or USUs (Ultimate Sampling Units), within the selected PSUs. The two stages in the design are elaborated as follows:

    a. In the first stage, a sampling frame is constructed consisting of all of the enumeration districts (EDs) from the census of 2001. The size of each enumeration district is measured in units of clusters of households. In the case of the ST. LUCIA MS, approximately seven or eight households were allocated per cluster. The clusters allocated to the EDs all have an equal probability of selection within the geographic domain in which they are allocated. In addition, the number of clusters allocated to an ED is a measure of the size of the ED. Clusters therefore ensure the selection of EDs, or Primary Sampling Units (PSUs), with probability proportional to the size of the ED. The ST. LUCIA MS frame consists of nine sub-samples / replicates, with each replicate selected with a probability of (1 / (16 * 9)) or 1 / 144.

    b. In the second stage, a non-compact cluster of households is selected within the selected PSU using systematic random sampling. There are three elements to the selection of this non-compact cluster. Firstly, there is the sample interval, which is a measure of the size of the ED in terms of the total number of households it contains. The larger the ED or PSU, the larger the sample interval assigned and, consequently, the larger the number of clusters assigned to the ED. This approach ensures that the total number of households selected in any selected ED is approximately the same. In the case of "Castries" in the ST. LUCIA MS frame, the approximate number is five (5). Secondly, the random start is determined by use of a random number generator. In a Microsoft EXCEL spreadsheet the formula takes the form =ROUND(RAND()*E1,0)+1, where E1 is the cell containing the sample interval (or total number of clusters assigned) and RAND() is the function that generates the random number. The ROUND() function rounds the result to the nearest whole number. The third element of choosing the non-compact cluster is a combination of the above. A random number (r) is chosen between 1 and the sample interval value, I, inclusive; the sample interval is then added to this number repeatedly, working down the full list of households within the primary sampling unit. Thus, the list of selected households would be r, r + I, r + 2I, r + 3I, r + 4I, ..., r + (n - 1)I, where n is the cluster size assigned to the district; in the case of Castries, n is five.
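
    The second-stage selection can be sketched as follows. The function name and the interval of 20 are hypothetical. The sketch draws the random start uniformly from 1 to I inclusive, as the text describes, rather than reproducing the spreadsheet expression, which as written can also return I + 1.

```python
import random

def systematic_sample(interval, n, rng):
    """Positions r, r + I, ..., r + (n - 1)I on the household list (1-based)."""
    r = rng.randint(1, interval)           # random start, 1..I inclusive
    return [r + j * interval for j in range(n)]

rng = random.Random(2011)                  # arbitrary seed for repeatability
# Hypothetical interval; n = 5 as quoted for Castries.
selected = systematic_sample(interval=20, n=5, rng=rng)
print(selected)
```

    The household list in the PSU needs at least r + (n - 1)I entries for every selected position to exist.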

    A. Size of the Sample

    As explained before, the decision to use a sampling fraction of 1 : 16 and to assign nine replicates to each District (the geographic domain) was based on the need to take advantage of the small size of the countries covered by this MECOVI project. This was done by increasing the "spread" of the sample across EDs, thereby improving the precision of the estimates that can be obtained from it. In addition, attention was paid to ensuring that, were the CSO of ST. LUCIA to consider developing its Integrated Household Survey Programme further, the groundwork would have been laid through this Master Sample Frame design for periodic, ad hoc or continuous sample surveys. The achievement of this objective has already been demonstrated through the use of this Sample Frame in the conduct of St. Lucia's continuous Labour Force Survey.

    Therefore, for any one of the nine sub-samples, the sampling fraction is 1 / 16 by 1 / 9, or 1 / 144. If a periodic, ad hoc or quarterly survey used three replicates, the sampling fraction for these three replicates would be 1 / 16 by 3 / 9, or 3 / 144. In both cases the resultant sampling fraction is the product of the sampling probability for the Master Sampling frame and the probability of selection of a specific number of replicates.
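
    The same product can be written with exact fractions. The names below are illustrative.

```python
from fractions import Fraction

MASTER_FRACTION = Fraction(1, 16)   # sampling fraction of the Master Sample
TOTAL_REPLICATES = 9                # sub-samples per district

def sampling_fraction(replicates_used):
    """Master Sample fraction times the share of replicates used."""
    return MASTER_FRACTION * Fraction(replicates_used, TOTAL_REPLICATES)

one_replicate = sampling_fraction(1)
three_replicates = sampling_fraction(3)
print(one_replicate)      # 1/144
print(three_replicates)   # 1/48, which equals 3/144
```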

    B. Master Sample Domains of Study and Stratification

    1. Domains of Study:

    The Master Sample frame was subdivided into eleven areas for the purpose of the provision of estimates from samples selected from this frame. The following list of the ten domains or sub-populations is based on the Districts which formed the basis for the collection of information on the population in the 2001 Census.

    The total number of PSUs in the ST. LUCIA MS is 401; a breakdown of the number of PSUs by District is shown in the table above. The average size of the PSUs was approximately 118, with a standard deviation of approximately 47. This configuration does not in the near term present a major problem for sample implementation, since the size of the EDs/PSUs does not exceed 100 by too great an extent. In addition, while consideration must be given to splitting EDs which have grown in size to over 200, there is not, as there is in the case of St. Vincent and the Grenadines, a significant number of excessively large EDs. Continuous maintenance of this situation is required and can be done by splitting all EDs over 200 in size into smaller ones of approximate size 100. The main objective of controlling the size of the PSUs is to reduce variability and thereby improve the precision of estimates from the sample. The more equal the sizes of the PSUs, the more likely it is that the variance of characteristics between PSUs will be minimized and, conversely, that the precision of the estimates derived from samples from the Master Sample Frame will be increased.

    2. Stratification

    As shown in the table above, each of the domains of study was stratified according to specific criteria. In the more urban domains the criterion used was the percentage of managers, professionals and sub-professionals in the population. The PSUs or EDs were therefore arranged in descending order of the proportion of this group in the population of the ED. In the rural domains the PSUs were arranged in descending order of the proportion of agriculture workers in the population of the ED. In the case of Canaries and Anse-la-Raye, the sizes of the populations in these domains mandated joining the two to create a large enough domain for reporting purposes.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire is administered to all members of the household. Questions 1 through 6 are to be completed for all members of the household; these questions cover age, sex, relation to head of household, country of birth etc. All subsequent questions refer to persons 15 years of age and older. The questionnaire is divided into five parts:

    PART 1: For all members of the household (regardless of age) - Demographic and emigration questions
    PART 2: To be completed for persons 15 years and older - Education, Training, activities during the reference week or month, working at a job, on vacation, methods of seeking work, availability for employment
    PART 3: For persons employed during the reference week - Number of actual hours of work, number of usual hours of work, seeking additional work, status in employment, industry and occupation of employment
    PART 3A: For persons holding more than one job during the reference week - Number of actual hours of work, number of usual hours of work, seeking additional work, status in employment, industry and occupation of employment
    PART 4: For persons unemployed

  15. Quarterly Labour Force Survey 2012 - St. Lucia

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Central Statistics Office of Saint Lucia (2019). Quarterly Labour Force Survey 2012 - St. Lucia [Dataset]. https://datacatalog.ihsn.org/catalog/4333
    Dataset updated
    Mar 29, 2019
    Dataset authored and provided by
    Central Statistics Office of Saint Lucia
    Time period covered
    2012
    Area covered
    Saint Lucia
    Description

    Abstract

    The Labour Force Survey aims to collect information on the supply side of the labour market. It provides information on the extent of available and unused labour time and on relationships between employment and income. Thus, the data collected can be used for:

    Macro-economic monitoring: From an economic point of view, a main objective of collecting data on the economically active population is to provide basic information on the size and structure of a country's workforce. The unemployment rate in particular is widely used as an overall indicator of the current performance of a country's economy.

    Human resources development: The economy is changing all the time. In order to meet the needs of the changing economy, people need to be trained. These areas of training must therefore be identified.

    Employment policies: For an economy to work at its maximum potential, all persons wanting to have work should have jobs. Some persons may wish to have full-time jobs, and can only find part-time work. We need to know what proportion of the labour force these people represent in order to assess the social effects of government employment policies.

    Income Support and social programmes: For the majority of people, employment income is their main means of support. People need not only jobs, but more importantly, productive jobs in order to receive reasonable incomes. We need to know what levels of income are being earned by different groups of persons.

    Geographic coverage

    National Coverage

    Analysis unit

    • Households;
    • Individuals.

    Universe

    The survey covered all de jure non-institutional household members (usual residents). It focuses on the employment, unemployment and current activity or inactivity status of all persons aged 15 years and over resident in the household.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Every quarter (three months) approximately 1,000 households are interviewed. There is a one-third overlap in the households interviewed between consecutive rounds of the survey.

    The Multi-Stage sampling procedure developed for the St. Lucia MS (Master Sample) Frame is used for the execution of the labour force survey:

    The two-stage process of sample selection in the ST. LUCIA MS entails the selection of the PSUs within the districts, followed by the systematic selection of the cluster of households, or USUs (Ultimate Sampling Units), within the selected PSUs. The two stages in the design are elaborated as follows:

    a. In the first stage, a sampling frame is constructed consisting of all of the enumeration districts (EDs) from the census of 2001. The size of each enumeration district is measured in units of clusters of households. In the case of the ST. LUCIA MS, approximately seven or eight households were allocated per cluster. The clusters allocated to the EDs all have an equal probability of selection within the geographic domain in which they are allocated. In addition, the number of clusters allocated to an ED is a measure of the size of the ED. Clusters therefore ensure the selection of EDs, or Primary Sampling Units (PSUs), with probability proportional to the size of the ED. The ST. LUCIA MS frame consists of nine sub-samples / replicates, with each replicate selected with a probability of (1 / (16 * 9)) or 1 / 144.

    b. In the second stage, a non-compact cluster of households is selected within the selected PSU using systematic random sampling. There are three elements to the selection of this non-compact cluster. Firstly, there is the sample interval, which is a measure of the size of the ED in terms of the total number of households it contains. The larger the ED or PSU, the larger the sample interval assigned and, consequently, the larger the number of clusters assigned to the ED. This approach ensures that the total number of households selected in any selected ED is approximately the same. In the case of "Castries" in the ST. LUCIA MS frame, the approximate number is five (5). Secondly, the random start is determined by use of a random number generator. In a Microsoft EXCEL spreadsheet the formula takes the form =ROUND(RAND()*E1,0)+1, where E1 is the cell containing the sample interval (or total number of clusters assigned) and RAND() is the function that generates the random number. The ROUND() function rounds the result to the nearest whole number. The third element of choosing the non-compact cluster is a combination of the above. A random number (r) is chosen between 1 and the sample interval value, I, inclusive; the sample interval is then added to this number repeatedly, working down the full list of households within the primary sampling unit. Thus, the list of selected households would be r, r + I, r + 2I, r + 3I, r + 4I, ..., r + (n - 1)I, where n is the cluster size assigned to the district; in the case of Castries, n is five.

    A. Size of the Sample

    As explained before, the decision to use a sampling fraction of 1 : 16 and to assign nine replicates to each District (the geographic domain) was based on the need to take advantage of the small size of the countries covered by this MECOVI project. This was done by increasing the "spread" of the sample across EDs, thereby improving the precision of the estimates that can be obtained from it. In addition, attention was paid to ensuring that, were the CSO of ST. LUCIA to consider developing its Integrated Household Survey Programme further, the groundwork would have been laid through this Master Sample Frame design for periodic, ad hoc or continuous sample surveys. The achievement of this objective has already been demonstrated through the use of this Sample Frame in the conduct of St. Lucia's continuous Labour Force Survey.

    Therefore, for any one of the nine sub-samples, the sampling fraction is 1 / 16 by 1 / 9, or 1 / 144. If a periodic, ad hoc or quarterly survey used three replicates, the sampling fraction for these three replicates would be 1 / 16 by 3 / 9, or 3 / 144. In both cases the resultant sampling fraction is the product of the sampling probability for the Master Sampling frame and the probability of selection of a specific number of replicates.

    B. Master Sample Domains of Study and Stratification

    1. Domains of Study:

    The Master Sample frame was subdivided into eleven areas for the purpose of the provision of estimates from samples selected from this frame. The following list of the ten domains or sub-populations is based on the Districts which formed the basis for the collection of information on the population in the 2001 Census.

    The total number of PSUs in the ST. LUCIA MS is 401; a breakdown of the number of PSUs by District is shown in the table above. The average size of the PSUs was approximately 118, with a standard deviation of approximately 47. This configuration does not in the near term present a major problem for sample implementation, since the size of the EDs/PSUs does not exceed 100 by too great an extent. In addition, while consideration must be given to splitting EDs which have grown in size to over 200, there is not, as there is in the case of St. Vincent and the Grenadines, a significant number of excessively large EDs. Continuous maintenance of this situation is required and can be done by splitting all EDs over 200 in size into smaller ones of approximate size 100. The main objective of controlling the size of the PSUs is to reduce variability and thereby improve the precision of estimates from the sample. The more equal the sizes of the PSUs, the more likely it is that the variance of characteristics between PSUs will be minimized and, conversely, that the precision of the estimates derived from samples from the Master Sample Frame will be increased.

    2. Stratification

    As shown in the table above, each of the domains of study was stratified according to specific criteria. In the more urban domains the criterion used was the percentage of managers, professionals and sub-professionals in the population. The PSUs or EDs were therefore arranged in descending order of the proportion of this group in the population of the ED. In the rural domains the PSUs were arranged in descending order of the proportion of agriculture workers in the population of the ED. In the case of Canaries and Anse-la-Raye, the sizes of the populations in these domains mandated joining the two to create a large enough domain for reporting purposes.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire is administered to all members of the household. Questions 1 through 6 are to be completed for all members of the household; these questions cover age, sex, relation to head of household, country of birth etc. All subsequent questions refer to persons 15 years of age and older. The questionnaire is divided into five parts:

    PART 1: For all members of the household (regardless of age) - Demographic and emigration questions
    PART 2: To be completed for persons 15 years and older - Education, Training, activities during the reference week or month, working at a job, on vacation, methods of seeking work, availability for employment
    PART 3: For persons employed during the reference week - Number of actual hours of work, number of usual hours of work, seeking additional work, status in employment, industry and occupation of employment
    PART 3A: For persons holding more than one job during the reference week - Number of actual hours of work, number of usual hours of work, seeking additional work, status in employment, industry and occupation of employment
    PART 4: For persons unemployed during the reference

  16. Ithaka S+R US Faculty Survey 2012

    • icpsr.umich.edu
    ascii, delimited, r +3
    Updated Dec 21, 2016
    Cite
    Schonfeld, Roger C.; Wulfson, Kate; Housewright, Ross (2016). Ithaka S+R US Faculty Survey 2012 [Dataset]. http://doi.org/10.3886/ICPSR34651.v2
    Available download formats: delimited, spss, ascii, sas, r, stata
    Dataset updated
    Dec 21, 2016
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Schonfeld, Roger C.; Wulfson, Kate; Housewright, Ross
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/34651/terms

    Time period covered
    Sep 10, 2012 - Oct 15, 2012
    Area covered
    United States
    Description

    This collection represents the fifth cycle of the US Faculty Survey conducted by Ithaka S+R in fall 2012. Investigators surveyed a random sample of higher education faculty members to learn about their attitudes and practices related to research, teaching, and communicating. The fifth cycle differs from previous releases in two significant regards: the questionnaire was developed with input from an advisory committee of academic professionals, and the methodology was revised to take advantage of online distribution and response collection. Demographic and professional information collected includes respondent age, sex, title, primary academic field, number of years working in primary academic field, number of years working at current college or university, and whether the respondent primarily identifies as a researcher, a teacher, or some combination of roles.

  17. Multiple Indicator Cluster Survey 2012 - Mongolia

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    • +1more
    Updated Sep 19, 2018
    Cite
    Statistics Department of the Governor’s Office of Nalaikh District (2018). Multiple Indicator Cluster Survey 2012 - Mongolia [Dataset]. https://datacatalog.ihsn.org/catalog/7438
    Dataset updated
    Sep 19, 2018
    Dataset provided by
    UNICEF (http://www.unicef.org/)
    Statistics Department of the Governor’s Office of Nalaikh District
    Time period covered
    2012
    Area covered
    Mongolia
    Description

    Abstract

    The Child Development Survey (or MICS) 2012 provides valuable information on the situation of children, women and men in Nalaikh district, for measuring fulfilment of their rights. It was based largely on the need to monitor progress towards goals and targets pertinent to recent international agreements: the Millennium Declaration, adopted by all 191 United Nations Member States in September 2000, and the Plan of Action of A World Fit For Children, adopted by 189 Member States at the United Nations Special Session on Children in May 2002. Both of these commitments build upon promises made by the international community at the 1990 World Summit for Children.

    OBJECTIVES

    The Nalaikh district’s “Child Development Survey-2012” (Multiple Indicator Cluster Survey) has the following primary objectives:

    • To provide up-to-date information for assessing at the district level the following national and international level policies and programmes

    A) the World Fit for Children Declaration

    B) Millennium Development Goals

    C) National Reproductive Health Programme

    • To serve the baseline for UNICEF’s Country Programme 2012-2016

    • To build the capacity of the Statistics Department of the District.

    Geographic coverage

    District level.

    Analysis unit

    • Individuals

    • Households

    Universe

    The survey covered all households, women and men age 15-49 years, children under the age of 5, and children age 2-14 years.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The Child Development Survey is a household-based survey. Therefore, households are defined as the final sampling units. The sample for the survey was designed to provide estimates for a number of indicators on the situation of children, women and men at the district level. The total sample size was determined as 1,000 households for the district.

    In total for the Nalaikh district, 40 khesegs were selected systematically with probability proportional to size. After a household listing of the selected PSUs or the selected khesegs was carried out by the khoroo’s governor, 25 households were selected using systematic random sampling in each PSU.
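    The two selection stages described above can be sketched as follows. This is only an illustration under stated assumptions: the kheseg sizes and the household listing are made-up placeholders, the function names `systematic_pps` and `systematic_sample` are hypothetical, and the survey's actual selection procedure is documented in its Final Report, not here.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def systematic_pps(sizes, n_clusters, rng):
    """Systematic selection with probability proportional to size.

    Returns the indices of the selected clusters. A cluster larger than
    the sampling interval could be selected more than once.
    """
    total = sum(sizes)
    interval = total / n_clusters
    start = rng.random() * interval          # random start in [0, interval)
    cumulative = list(accumulate(sizes))     # running total of cluster sizes
    picks = []
    for k in range(n_clusters):
        point = start + k * interval
        # The cluster whose cumulative-size band contains the point.
        idx = min(bisect_left(cumulative, point), len(sizes) - 1)
        picks.append(idx)
    return picks


def systematic_sample(items, n, rng):
    """Systematic random sample of n items from a listing."""
    interval = len(items) / n
    start = rng.random() * interval
    return [items[min(int(start + k * interval), len(items) - 1)]
            for k in range(n)]


rng = random.Random(0)

# Hypothetical listing: 60 khesegs with 100..159 households each.
kheseg_sizes = [100 + i for i in range(60)]

# Stage 1: 40 khesegs (PSUs) selected systematically with PPS.
selected_psus = systematic_pps(kheseg_sizes, 40, rng)

# Stage 2: 25 households selected systematically within each PSU.
sample = {
    psu: systematic_sample(list(range(kheseg_sizes[psu])), 25, rng)
    for psu in selected_psus
}
total_households = sum(len(h) for h in sample.values())
```

    With 40 PSUs and 25 households per PSU, the sketch reproduces the total sample size of 1,000 households stated above.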

    Data were collected from the households in the sample, and for reporting the district level results, sample weights are used. A more detailed description of the sample design can be found in the Final Report (Appendix A) attached as a Related Material.

    Sampling deviation

    During the data collection fieldwork in July-August 2012, we encountered problems because some families were not found at their registered addresses and some family members were away during the seasonal resort and vacation period. Despite this, we managed to collect survey data from all selected PSUs.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Based on the contents of the five core questionnaires of the Mongolia Child Development Survey, conducted nationwide in 2010, a specific supplementary module and questions were added for the Nalaikh “Child Development Survey 2012”. Based on the current priorities and needs, the questionnaire for men age 15-49 years was taken in its entirety for this round of CDS.

    Altogether, five types of questionnaires were used:

    1. A Household Questionnaire

    2. A Questionnaire for Woman, age 15-49

    3. A Questionnaire for Child under 5

    4. A Questionnaire for Child, age 2-14

    5. A Questionnaire for Man, age 15-49

    In addition to the administration of the questionnaires, fieldwork teams tested the salt used for cooking in the households for iodine content, observed the place for hand washing and measured the weights and heights of children age under 5 years.

    In this round of CDS 2012, internal migration questions (a country-specific module in the household questionnaire) were asked for all household members listed in the household listing module (HL).

    The Questionnaire for Child under 5 was administered to mothers or caretakers of all children under 5 years of age living in the households. Normally, the questionnaire was administered to mothers of under-5 children; in cases when the mother was not listed in the household roster, a primary caretaker for the child was identified and interviewed.

    The Questionnaire for Child age 2-14 was administered to mothers or caretakers of children age 2-14 years living in the households. Normally, the questionnaire was administered to mothers of children age 2-14; in cases when the mother was not listed in the household roster, a primary caretaker for the child was identified and interviewed.

    All questionnaire modules are provided as Related Materials.

    Cleaning operations

    The data collected from the selected households were entered on computers using the CSPro 4.0 software program by one data entry supervisor and two data entry operators from 20 August to 10 September 2012. In order to ensure quality control, all data were double entered and internal consistency checks were performed before finalization of the database. The procedures and standard programs developed under the global MICS4 programme and adapted to the Nalaikh CDS's customized questionnaires with additional module and questions were used throughout.

    The data were analyzed using the standard SPSS 18.0 (Statistical Package for Social Sciences) software program and the model syntax and tabulation plans developed by UNICEF were customized for Nalaikh CDS 2012 questionnaires.

    Response rate

    In total, 1,000 households were selected for the sample, and of these 956 were found to be available for the survey. Of these, 949 households were successfully interviewed, for a household response rate of 99 percent. In the interviewed households, out of the total 799 men and 929 women age 15-49 years listed for the survey, 705 men and 889 women were successfully interviewed, yielding response rates of 88 and 96 percent respectively. In addition, 433 children under the age of 5 and 896 children age 2-14 years were listed in the household questionnaire. Questionnaires were completed with mothers/caretakers for 429 of these under-5 children and for 894 of the children age 2-14, which corresponds to response rates of 99 and 100 percent respectively, within interviewed households.

    Nalaikh district’s overall response rates stand at 88 percent for men and 95 percent for women age 15-49 years, and at 98 percent and 99 percent for mothers/caretakers of children under 5 and children age 2-14 respectively.

    However, the response rate for interviews with men age 15-49 years is lower than the response rates for other interviews because men, particularly young men, are highly mobile.
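    The arithmetic behind these figures can be checked with a short sketch. The counts are taken from the text above; treating the overall rate as the household response rate multiplied by the within-household rate is an assumption, although it is consistent with the percentages reported here. The variable names are illustrative only.

```python
# Counts reported in the response-rate paragraph above.
households_available = 956
households_interviewed = 949

# (completed, listed) per respondent group, within interviewed households.
groups = {
    "men 15-49": (705, 799),
    "women 15-49": (889, 929),
    "children under 5": (429, 433),
    "children 2-14": (894, 896),
}

household_rate = households_interviewed / households_available

# Response rate within interviewed households.
within_rate = {g: done / listed for g, (done, listed) in groups.items()}

# Assumed definition: overall rate = household rate x within-household rate.
overall_rate = {g: household_rate * r for g, r in within_rate.items()}

for g in groups:
    print(f"{g}: within {100 * within_rate[g]:.1f}%, "
          f"overall {100 * overall_rate[g]:.1f}%")
```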

    Sampling error estimates

    The sample of respondents selected in the Nalaikh District Multiple Indicator Cluster Survey 2012 is only one of the samples that could have been selected from the same population, using the same design and size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between the estimates from all possible samples. The extent of variability is not known exactly, but can be estimated statistically from the survey data.

    The sampling error measures for each of the selected indicators are:

    • Standard error (se): Sampling errors are usually measured in terms of standard errors for particular indicators (means, proportions etc). Standard error is the square root of the variance of the estimate. The Taylor linearization method is used for the estimation of standard errors.

    • Coefficient of variation (se/r) is the ratio of the standard error to the value of the indicator, and is a measure of the relative sampling error.

    • Design effect (deff) is the ratio of the actual variance of an indicator, under the sampling method used in the survey, to the variance calculated under the assumption of simple random sampling. The square root of the design effect (deft) is used to show the efficiency of the sample design in relation to the precision. A deft value of 1.0 indicates that the sample design is as efficient as a simple random sample, while a deft value above 1.0 indicates the increase in the standard error due to the use of a more complex sample design.

    • Confidence limits are calculated to show the interval within which the true value for the population can be reasonably assumed to fall, with a specified level of confidence. For any given statistic calculated from the survey, the value of that statistic will fall within a range of plus or minus two times the standard error (r + 2.se or r - 2.se) of the statistic in 95 percent of all possible samples of identical size and design.
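    As a rough illustration of how these four measures relate to one another, the sketch below derives them for a single proportion. The input values (r, se, n) are hypothetical, not survey results, and the simple-random-sampling variance r(1 - r)/n is an assumed approximation. The survey itself obtained its standard errors by Taylor linearization in SPSS Complex Samples, which this sketch does not reproduce.

```python
import math


def sampling_error_measures(r, se, n):
    """Derive relative error, design effect and confidence limits.

    r  : estimated proportion for an indicator
    se : standard error of r under the actual (complex) design
    n  : number of cases in the denominator
    """
    srs_variance = r * (1 - r) / n      # assumed SRS variance of a proportion
    deff = se ** 2 / srs_variance       # design effect (ratio of variances)
    deft = math.sqrt(deff)              # square root of the design effect
    return {
        "cv": se / r,                   # coefficient of variation (se/r)
        "deff": deff,
        "deft": deft,
        "lower": r - 2 * se,            # r - 2.se
        "upper": r + 2 * se,            # r + 2.se
    }


# Hypothetical indicator: r = 0.40, se = 0.02, n = 900.
measures = sampling_error_measures(0.40, 0.02, 900)
```

    For these made-up inputs the design effect is 1.5, meaning the variance is half again as large as it would be under simple random sampling with the same number of cases.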

    For the calculation of sampling errors from MICS data, SPSS Version 18 Complex Samples module has been used.

  18. Early Learning Outcomes Measure 2016, Age Validation Study - South Africa

    • datafirst.uct.ac.za
    Updated May 31, 2024
    Cite
    Innovation Edge (2024). Early Learning Outcomes Measure 2016, Age Validation Study - South Africa [Dataset]. https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/627
    Explore at:
    Dataset updated
    May 31, 2024
    Dataset authored and provided by
    Innovation Edge
    Time period covered
    2016
    Area covered
    South Africa
    Description

    Abstract

    In 2015, Innovation Edge commissioned the development of South Africa's first national-level preschool child assessment tool. Finalized in 2016, the primary purpose of the Early Learning Outcomes Measure (ELOM) (www.elom.org.za) is to provide the country with a national instrument to fairly assess children age 50-69 months from all socio-economic and cultural backgrounds. With this tool, we will be able to monitor early learning program outcomes, guide program improvement, and test program effectiveness. This study was conducted to assist Innovation Edge with the second phase in the development of ELOM: age validation. In this study, the ELOM tool was administered to children enrolled in public schools at the commencement of their Grade R year in 2016, in schools of all five quintiles in three provinces. The goal of the ELOM age validation process was to construct a sample that was likely to be as representative as possible of children eligible to enter Grade R in January 2016, drawn from across South Africa's socio-economic distribution, and including five major language groups.

    The ELOM includes both direct assessment of children's performance as well as an assessment of the child's social and emotional functioning and orientation to tasks. The ELOM Direct Assessment consists of 23 items measuring indicators of the child's early development in five domains:

    • Gross Motor Development

    • Fine Motor Coordination and Visual Motor Integration

    • Emergent Numeracy and Mathematics

    • Cognition and Executive Functioning

    • Emergent Literacy and Language

    Geographic coverage

    It is envisioned that the ELOM will be applied to children from a range of cultural and socio-economic settings. However, as finances did not permit a national sample, three provinces were chosen for the study. The sample for this study then aimed to be representative of public school Grade R students in the target language groups who were between the ages of 54-66 months in selected school districts in North West (Setswana speakers only), the Western Cape (English, isiXhosa and Afrikaans speakers) and KwaZulu Natal (isiZulu speakers only).

    Analysis unit

    Individuals and institutions

    Universe

    The desired target population is Grade R children in South African public schools between the ages of 54-66 months.

    Kind of data

    Observation data

    Sampling procedure

    A two-stage clustered sample design was employed. In the first stage, and in each district, sampling with probability proportional to Grade R population size was used to randomly select schools within each of the five School Quintile bands. Two schools in traditional, more rural areas in each of North West and KwaZulu-Natal were recruited independently of this exercise to explore the influence of more "traditional" approaches to child rearing. In the second stage, learners were selected within Grade R classes using simple random sampling. A minimum of nine children were selected per school (cluster).

    Mode of data collection

    Other

  19. Modeling set-up with 14 different combinations of presence and frequency...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Jana Michaelis; Martin R. Diekmann (2023). Modeling set-up with 14 different combinations of presence and frequency based on random sampling. [Dataset]. http://doi.org/10.1371/journal.pone.0183152.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jana Michaelis; Martin R. Diekmann
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Four different presence scenarios (number of randomly selected presences being 10, 25, 50 or 100) combined with four different frequency scenarios (by varying the number of absences) were modeled. Two combinations could not be applied due to model restrictions or data paucity.

  20. Multiple Indicator Cluster Survey 2010 - Gambia, The

    • catalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Gambia Bureau of Statistics (2019). Multiple Indicator Cluster Survey 2010 - Gambia, The [Dataset]. https://catalog.ihsn.org/index.php/catalog/6776
    Explore at:
    Dataset updated
    Mar 29, 2019
    Dataset authored and provided by
    Gambia Bureau of Statistics
    Time period covered
    2010
    Area covered
    The Gambia
    Description

    Abstract

    The Gambia Multiple Indicator Cluster Survey 2010 is a nationally representative survey of households, children and women. The main objective of the survey was to provide up-to-date information for assessing the situation of children and women in The Gambia. Another objective was to furnish data needed for monitoring progress towards the goals established at the World Summit for Children and the Millennium Development Goals (MDGs) as a basis for future action. The findings of this survey would also be utilized by government and development partners in planning and monitoring program implementation.

    The modules developed for the survey captured data on household characteristics, education, water and sanitation, insecticide-treated nets, indoor residual spraying, salt iodization, handwashing, birth registration, early childhood development, breastfeeding, care of illness, malaria, immunization, anthropometry, child mortality, desire for last birth, illness symptoms, maternal and newborn health, rehydration solutions, contraception, unmet need, female genital mutilation, attitudes toward domestic violence, marriage/union, sexual behavior, and HIV/AIDS. The survey was conducted through inter-agency collaboration, with The Gambia Bureau of Statistics (GBoS) acting as the lead agency.

    The Gambia's Multiple Indicator Cluster Survey 2010 has the following primary objectives:

    1. To provide up-to-date information for assessing the situation of children and women in The Gambia.

    2. To furnish data needed for monitoring progress towards the goals established in the Millennium Declaration, the goals of A World Fit for Children (WFFC) and other internationally agreed upon goals as a basis for future action.

    3. To contribute to the improvement of data and monitoring systems in The Gambia and to strengthen technical expertise in the design, implementation and analysis of such systems.

    4. To generate data on the situation of children and women, including the identification of vulnerable groups and of disparities, to inform policies and interventions.

    Geographic coverage

    National

    Analysis unit

    • Households (defined as a group of persons who usually live and eat together)
    • Women aged 15-49
    • Children aged 0-4

    Universe

    The survey covered all de jure household members (usual residents), all women aged 15-49 years living in the household, and all children aged 0-4 years (under age 5) living in the household.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The sample for The Gambia Multiple Indicator Cluster Survey (MICS4) was designed to provide estimates on a large number of indicators on the situation of children and women at the national level, for urban and rural areas, and for the eight Local Government Areas (LGAs): Banjul, Kanifing, Brikama, Mansakonko, Kerewan, Kuntaur, Janjanbureh and Basse. Other than Banjul and Kanifing, which are entirely urban settlements, urban and rural areas within each LGA were identified as the main sampling domains, and the sample was selected in two stages. Within each LGA, at least 44 and at most 60 census enumeration areas (EAs), or clusters, were selected systematically with Probability Proportional to Size (PPS).

    Sampling deviation

    No major deviations from the original sample design were made. All sample enumeration areas were accessed and successfully interviewed with good response rates.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaires are based on the MICS4 Model Questionnaire III. Given that the MICS4 model questionnaires were in English, the questionnaires were not translated into the local languages in writing for the training. The training program for staff conducting or supervising the interviews included detailed discussions of the contents of the questionnaires, how to complete the questionnaires, and interviewing techniques. In addition to taking the trainees through the questionnaires in English, the questions were also verbally translated into the three main local languages of The Gambia (Wollof, Mandinka and Fula). A participatory approach was adopted during these translation sessions to ensure that all participants had a common understanding of the translation of all the questions. The questionnaires were pre-tested in a few selected EAs in Greater Banjul in April 2010. Based on the results of the pre-test, modifications were made to the wording and translation of the questionnaires.

    Cleaning operations

    Data were entered into 20 microcomputers using the Census and Surveys Processing System (CSPro) software package. Data entry was carried out by forty entry operators and four supervisors. For quality assurance purposes, all questionnaires were double-entered and internal consistency checks performed. Procedures and standard programs developed under the global MICS program and adapted to The Gambia questionnaire were used throughout. Data processing began simultaneously with data collection in April 2010 and was completed in August 2010. Data were analyzed using the Statistical Package for Social Sciences (SPSS) software program, Version 18. Model syntax and tabulation plans developed by UNICEF were customized and used for this purpose.

    Response rate

    Of the 7,800 households selected for the sample survey, 7,799 households were found to be occupied. Of these, 7,791 were successfully interviewed, for a household response rate of 99.9 percent. In the interviewed households, the survey identified 15,138 women (age 15-49 years). Of these, 14,685 were successfully interviewed, resulting in a response rate of 97.0 percent within interviewed households. In addition, 11,807 children under age five were listed. Questionnaires were completed for 11,637 of these children, which corresponds to a response rate of 98.6 percent within interviewed households.

    Sampling error estimates

    The sample of respondents selected in the Gambia Multiple Indicator Cluster Survey is only one of the samples that could have been selected from the same population, using the same design and size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between the estimates from all possible samples. The extent of variability is not known exactly, but can be estimated statistically from the survey data.

    The following sampling error measures are presented in this appendix for each of the selected indicators:

    1. Standard error (se): Sampling errors are usually measured in terms of standard errors for particular indicators (means, proportions, etc.). Standard error is the square root of the variance of the estimate. The Taylor linearization method is used for the estimation of standard errors.

    2. Coefficient of variation (se/r) is the ratio of the standard error to the value of the indicator, and is a measure of the relative sampling error.

    3. Design effect (deff) is the ratio of the actual variance of an indicator, under the sampling method used in the survey, to the variance calculated under the assumption of simple random sampling. The square root of the design effect (deft) is used to show the efficiency of the sample design in relation to the precision. A deft value of 1.0 indicates that the sample design is as efficient as a simple random sample, while a deft value above 1.0 indicates the increase in the standard error due to the use of a more complex sample design.

    4. Confidence limits are calculated to show the interval within which the true value for the population can be reasonably assumed to fall, with a specified level of confidence. For any given statistic calculated from the survey, the value of that statistic will fall within a range of plus or minus two times the standard error (r + 2.se or r - 2.se) of the statistic in 95 percent of all possible samples of identical size and design.

    For the calculation of sampling errors from MICS data, the SPSS Version 18 Complex Samples module has been used. The results are shown in the tables that follow. In addition to the sampling error measures described above, the tables also include weighted and unweighted counts of denominators for each indicator. Sampling errors are calculated for indicators of primary interest, for the national level, for the regions, and for urban and rural areas. Three of the selected indicators are based on households, 8 are based on household members, 13 are based on women, and 15 are based on children under 5. All indicators presented here are in the form of proportions. Table SE.1 shows the list of indicators for which sampling errors are calculated, including the base population (denominator) for each indicator. Tables SE.2 to SE.12 show the calculated sampling errors for selected domains.
