48 datasets found
  1. Project for Statistics on Living Standards and Development 1993 - South...

    • catalog.ihsn.org
    • microdata.fao.org
    • +2more
    Updated Mar 29, 2019
    Cite
    Southern Africa Labour and Development Research Unit (2019). Project for Statistics on Living Standards and Development 1993 - South Africa [Dataset]. https://catalog.ihsn.org/catalog/4628
    Dataset updated
    Mar 29, 2019
    Dataset authored and provided by
    Southern Africa Labour and Development Research Unit
    Time period covered
    1993
    Area covered
    South Africa
    Description

    Abstract

    The Project for Statistics on Living Standards and Development was a countrywide World Bank Living Standards Measurement Survey. It covered approximately 9000 households, drawn from a representative sample of South African households. The fieldwork was undertaken during the nine months leading up to the country's first democratic elections at the end of April 1994. The purpose of the survey was to collect statistical information about the conditions under which South Africans live in order to provide policymakers with the data necessary for planning strategies. This data would aid the implementation of goals such as those outlined in the Government of National Unity's Reconstruction and Development Programme.

    Geographic coverage

    National coverage

    Analysis unit

    • Households
    • Individuals
    • Community

    Universe

    All Household members.

    Individuals in hospitals, old age homes, hotels and hostels of educational institutions were not included in the sample. Migrant labour hostels were included. In addition to those that turned up in the selected ESDs, a sample of three hostels was chosen from a national list provided by the Human Sciences Research Council and within each of these hostels a representative sample was drawn on a similar basis as described above for the households in ESDs.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Sample size is 9,000 households

    The sample design adopted for the study was a two-stage self-weighting design in which the first-stage units were Census Enumerator Subdistricts (ESDs, or their equivalent) and the second-stage units were households.

    The advantage of using such a design is that it provides a representative sample that need not be based on an accurate census population distribution. In the case of South Africa, the sample will automatically include many poor people, without the need to go beyond this and oversample the poor. Proportionate sampling, as in such a self-weighting sample design, offers the simplest possible data files for further analysis, as weights do not have to be added. However, in the end this advantage could not be retained and weights had to be added.

    The sampling frame was drawn up on the basis of small, clearly demarcated area units, each with a population estimate. The nature of the self-weighting procedure adopted ensured that this population estimate was not important for determining the final sample, however. For most of the country, census ESDs were used. Where some ESDs comprised relatively large populations as for instance in some black townships such as Soweto, aerial photographs were used to divide the areas into blocks of approximately equal population size. In other instances, particularly in some of the former homelands, the area units were not ESDs but villages or village groups.

    In the sample design chosen, the area stage units (generally ESDs) were selected with probability proportional to size, based on the census population. Systematic sampling was used throughout, that is, sampling at a fixed interval in a list of ESDs, starting at a randomly selected point. Given that sampling was self-weighting, the impact of stratification was expected to be modest. The main objective was to ensure that the racial and geographic breakdown approximated the national population distribution. This was done by listing the area stage units (ESDs) by statistical region and then within the statistical region by urban or rural. Within these sub-statistical regions, the ESDs were then listed in order of percentage African. The sampling interval for the selection of the ESDs was obtained by dividing the 1991 census population of 38,120,853 by the 300 clusters to be selected. This yielded 105,800. Starting at a randomly selected point, every 105,800th person down the cluster list was selected. This ensured both geographic and racial diversity (ESDs were ordered by statistical sub-region and proportion of the population African). In three or four instances, the ESD chosen was judged inaccessible and replaced with a similar one.
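The first-stage selection described above (systematic sampling with probability proportional to size) can be sketched as follows. This is a minimal illustration, not the survey team's actual program: the function name, the toy population figures, and the number of clusters are all hypothetical, and the list is assumed to be already sorted in the stratification order.

```python
import random
from itertools import accumulate


def pps_systematic_sample(sizes, n_clusters, rng):
    """Pick n_clusters area units with probability proportional to size.

    sizes: population estimate of each unit, already listed in the
           stratification order (hypothetical toy data in this sketch).
    Returns (interval, selected_indices). A unit larger than the interval
    can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n_clusters
    start = rng.random() * interval          # random start in [0, interval)
    cumulative = list(accumulate(sizes))     # running population total
    selected = []
    unit = 0
    for k in range(n_clusters):
        target = start + k * interval        # the k-th "selected person"
        # Advance to the unit whose cumulative range contains the target.
        while unit < len(cumulative) - 1 and cumulative[unit] <= target:
            unit += 1
        selected.append(unit)
    return interval, selected


# Toy example: six area units, four clusters to select.
esd_sizes = [1200, 800, 3000, 500, 1500, 1000]
interval, chosen = pps_systematic_sample(esd_sizes, 4, random.Random(0))
print(interval, chosen)
```

Because every selection step moves a fixed distance down the cumulative population list, larger units are proportionally more likely to be hit, and the ordering of the list spreads the selections across regions.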

    In the second sampling stage the unit of analysis was the household. In each selected ESD a listing or enumeration of households was carried out by means of a field operation. From the households listed in an ESD a sample of households was selected by systematic sampling. Even though the ultimate enumeration unit was the household, in most cases "stands" were used as enumeration units. However, when a stand was chosen as the enumeration unit all households on that stand had to be interviewed.

    Census population data, however, was available only for 1991. An assumption on population growth was thus made to obtain an approximation of the population size for 1993, the year of the survey. The sampling interval at the level of the household was determined in the following way: Based on the decision to have a take of 125 individuals on average per cluster (i.e. assuming 5 members per household to give an average cluster size of 25 households), the interval of households to be selected was determined as the census population divided by 118.1, i.e. allowing for population growth since the census. It was subsequently discovered that population growth was slightly over-estimated but this had little effect on the findings of the survey.
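One reading of the arithmetic in this paragraph, sketched under stated assumptions: if the target is 125 persons (25 households of 5) per cluster and the divisor applied to the 1991 census population is 118.1, the implied growth factor between the census and the survey is 125 / 118.1. The constant and function names below are hypothetical, and the example ESD population is invented for illustration.

```python
TARGET_PERSONS_PER_CLUSTER = 125   # stated in the text
PERSONS_PER_HOUSEHOLD = 5          # assumption stated in the text
CENSUS_DIVISOR = 118.1             # stated in the text

# Average number of households to interview per cluster.
households_per_cluster = TARGET_PERSONS_PER_CLUSTER / PERSONS_PER_HOUSEHOLD

# Growth factor implied by using 118.1 instead of 125 (roughly 1.058).
implied_growth = TARGET_PERSONS_PER_CLUSTER / CENSUS_DIVISOR


def household_interval(census_population):
    """Household sampling interval for one ESD, from its 1991 census count."""
    return census_population / CENSUS_DIVISOR


# Hypothetical ESD with 2,362 people in the 1991 census:
# roughly every 20th listed household would be selected.
print(households_per_cluster, round(implied_growth, 4), household_interval(2362))
```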

    Individuals in hospitals, old age homes, hotels and hostels of educational institutions were not included in the sample. Migrant labour hostels were included. In addition to those that turned up in the selected ESDs, a sample of three hostels was chosen from a national list provided by the Human Sciences Research Council and within each of these hostels a representative sample was drawn on a similar basis as described above for the households in ESDs.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The main instrument used in the survey was a comprehensive household questionnaire. This questionnaire covered a wide range of topics but was not intended to provide exhaustive coverage of any single subject. In other words, it was an integrated questionnaire aimed at capturing different aspects of living standards. The topics covered included demography, household services, household expenditure, educational status and expenditure, remittances and marital maintenance, land access and use, employment and income, health status and expenditure and anthropometry (children under the age of six were weighed and their heights measured). This questionnaire was available to households in two languages, namely English and Afrikaans. In addition, interviewers had in their possession a translation in the dominant African language/s of the region.

    In addition to the detailed household questionnaire referred to above, a community questionnaire was administered in each cluster of the sample. The purpose of this questionnaire was to elicit information on the facilities available to the community in each cluster. Questions related primarily to the provision of education, health and recreational facilities. Furthermore there was a detailed section for the prices of a range of commodities from two retail sources in or near the cluster: a formal source such as a supermarket and a less formal one such as the "corner cafe" or a "spaza". The purpose of this latter section was to obtain a measure of regional price variation both by region and by retail source. These prices were obtained by the interviewer. For the questions relating to the provision of facilities, respondents were "prominent" members of the community such as school principals, priests and chiefs.

    Cleaning operations

    All the questionnaires were checked when received. Where information was incomplete or appeared contradictory, the questionnaire was sent back to the relevant survey organization. As soon as the data was available, it was captured using the local development platform ADE. This was completed in February 1994. Following this, a series of exploratory programs was written to highlight inconsistencies and outliers. For example, all person level files were linked together to ensure that the same person code reported in different sections of the questionnaire corresponded to the same person. The error reports from these programs were compared to the questionnaires and the necessary alterations made. This was a lengthy process, as several files were checked more than once, and was completed at the beginning of August 1994. In some cases questionnaires would contain missing values, or comments that the respondent did not know, or refused to answer a question.

    These responses are coded in the data files with the following values (VALUE : MEANING):

    • -1 : The data was not available on the questionnaire or form
    • -2 : The field is not applicable
    • -3 : Respondent refused to answer
    • -4 : Respondent did not know answer to question
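When analysing a variable, these reserved codes need to be excluded before computing statistics. The sketch below shows one way to do that with plain Python; the function names are hypothetical, and it assumes the variable in question can never legitimately take the values -1 to -4.

```python
# Reserved codes as listed in the documentation above.
MISSING_CODES = {
    -1: "not available on the questionnaire or form",
    -2: "not applicable",
    -3: "respondent refused to answer",
    -4: "respondent did not know",
}


def clean_value(value):
    """Return None for a reserved missing-value code, else the value itself."""
    return None if value in MISSING_CODES else value


def mean_ignoring_missing(values):
    """Mean of the valid observations, or None if there are none."""
    valid = [v for v in values if clean_value(v) is not None]
    return sum(valid) / len(valid) if valid else None


# Hypothetical household-size responses with two coded gaps.
print(mean_ignoring_missing([3, -1, 5, -4, 4]))
```

Leaving the codes in place would pull averages downward, so the recoding step belongs before any summary statistic.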

    Data appraisal

    The data collected in clusters 217 and 218 should be viewed as highly unreliable and therefore removed from the data set. The data currently available on the web site has been revised to remove the data from these clusters. Researchers who have downloaded the data in the past should revise their data sets. For information on the data in those clusters, contact SALDRU http://www.saldru.uct.ac.za/.
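Researchers holding an older copy of the files could drop the two clusters themselves. A minimal sketch, assuming each record is a dictionary with a cluster identifier stored under a hypothetical key named `cluster` (the actual variable name in the data files is not given here):

```python
UNRELIABLE_CLUSTERS = {217, 218}   # clusters flagged in the data appraisal


def drop_unreliable(records, cluster_key="cluster"):
    """Return only the records that do not belong to a flagged cluster."""
    return [r for r in records if r[cluster_key] not in UNRELIABLE_CLUSTERS]


# Hypothetical records for illustration.
rows = [
    {"cluster": 216, "hhid": 1},
    {"cluster": 217, "hhid": 2},
    {"cluster": 218, "hhid": 3},
    {"cluster": 219, "hhid": 4},
]
print(drop_unreliable(rows))
```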

  2. Quantitative Research Methods and Data Analysis Workshop 2020

    • unisa.figshare.com
    pdf
    Updated Jun 12, 2025
    Cite
    Tracy Probert; Maxine Schaefer; Anneke Carien Wilsenach (2025). Quantitative Research Methods and Data Analysis Workshop 2020 [Dataset]. http://doi.org/10.25399/UnisaData.12581483.v1
    Available download formats: pdf
    Dataset updated
    Jun 12, 2025
    Dataset provided by
    University of South Africa
    Authors
    Tracy Probert; Maxine Schaefer; Anneke Carien Wilsenach
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We include the course syllabus used to teach quantitative research design and analysis methods to graduate Linguistics students using a blended teaching and learning approach. The blended course took place over two weeks and builds on a face to face course presented over two days in 2019. Students worked through the topics in preparation for a live interactive video session each Friday to go through the activities. Additional communication took place on Slack for two hours each week. A survey was conducted at the start and end of the course to ascertain participants' perceptions of the usefulness of the course. The links to online elements and the evaluations have been removed from the uploaded course guide.

    Participants who complete this workshop will be able to:

    • outline the steps and decisions involved in quantitative data analysis of linguistic data
    • explain common statistical terminology (sample, mean, standard deviation, correlation, nominal, ordinal and scale data)
    • perform common statistical tests using jamovi (e.g. t-test, correlation, anova, regression)
    • interpret and report common statistical tests
    • describe and choose from the various graphing options used to display data
    • use jamovi to perform common statistical tests and graph results

    Evaluation

    Participants who complete the course will use these skills and knowledge to complete the following activities for evaluation:

    • analyse the data for a project and/or assignment (in part or in whole)
    • plan the results section of an Honours research project (where applicable)

    Feedback and suggestions can be directed to M Schaefer schaemn@unisa.ac.za

  3. OLAF PROJECT DATA SET

    • ordo.open.ac.uk
    xlsx
    Updated Nov 20, 2020
    Cite
    Alexandra Okada (2020). OLAF PROJECT DATA SET [Dataset]. http://doi.org/10.21954/ou.rd.12670949.v2
    Available download formats: xlsx
    Dataset updated
    Nov 20, 2020
    Dataset provided by
    The Open University
    Authors
    Alexandra Okada
    License

    Attribution-ShareAlike 2.0 (CC BY-SA 2.0): https://creativecommons.org/licenses/by-sa/2.0/
    License information was derived automatically

    Description

    Subject: Education
    Specific: Online Learning and Fun
    Type: Questionnaire survey data (csv / excel)
    Date: February - March 2020
    Content: Students' views about online learning and fun
    Data Source: Project OLAF
    Value: These data provide students' beliefs about how learning occurs and correlations with fun. Participants were 206 students from the OU

  4. Project procurement for digital activities Indonesia 2014-2020, by...

    • statista.com
    Updated May 2, 2023
    Cite
    Statista (2023). Project procurement for digital activities Indonesia 2014-2020, by government agency [Dataset]. https://www.statista.com/statistics/1180116/indonesia-project-procurement-for-digital-activities-by-government-agency/
    Dataset updated
    May 2, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Indonesia
    Description

    During the period between 2014 and 2020, the Indonesian Ministry of Tourism had a total of 44 project procurements for its digital activities. Social media and influencer advertising were among the digital activities carried out by the Indonesian government. These types of advertising were seen as a tool to reach millennials in the country, who were predominantly active social media users.

  5. Impact of limited data availability on the accuracy of project duration...

    • data.mendeley.com
    Updated Nov 22, 2022
    Cite
    Naimeh Sadeghi (2022). Impact of limited data availability on the accuracy of project duration estimation in project networks [Dataset]. http://doi.org/10.17632/bjfdw6xbxw.3
    Dataset updated
    Nov 22, 2022
    Authors
    Naimeh Sadeghi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This database includes simulated data showing the accuracy of estimated probability distributions of project durations when limited data are available for the project activities. The base project networks are taken from PSPLIB. Then, various stochastic project networks are synthesized by changing the variability and skewness of project activity durations.

    Number of variables: 20
    Number of cases/rows: 114240

    Variable list:

    • Experiment ID: The ID of the experiment
    • Experiment for network: The ID of the experiment for each of the synthesized networks
    • Network ID: ID of the synthesized network
    • #Activities: Number of activities in the network, including start and finish activities
    • Variability: Variance of the activities in the network (this value can be either high, low, medium or rand, where rand shows a random combination of low, high and medium variance in the network activities)
    • Skewness: Skewness of the activities in the network (skewness can be either right, left, None or rand, where rand shows a random combination of right, left, and none skewed in the network activities)
    • Fitted distribution type: Distribution type used to fit on sampled data
    • Sample size: Number of sampled data used for the experiment resembling limited data condition
    • Benchmark 10th percentile: 10th percentile of project duration in the benchmark stochastic project network
    • Benchmark 50th percentile: 50th percentile of project duration in the benchmark stochastic project network
    • Benchmark 90th percentile: 90th percentile of project duration in the benchmark stochastic project network
    • Benchmark mean: Mean project duration in the benchmark stochastic project network
    • Benchmark variance: Variance of project duration in the benchmark stochastic project network
    • Experiment 10th percentile: 10th percentile of project duration distribution for the experiment
    • Experiment 50th percentile: 50th percentile of project duration distribution for the experiment
    • Experiment 90th percentile: 90th percentile of project duration distribution for the experiment
    • Experiment mean: Mean of project duration distribution for the experiment
    • Experiment variance: Variance of project duration distribution for the experiment
    • K-S: Kolmogorov–Smirnov test comparing benchmark distribution and project duration distribution of the experiment
    • P_value: the P-value based on the distance calculated in the K-S test
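For readers unfamiliar with the K-S variable, the sketch below computes the standard two-sample Kolmogorov–Smirnov distance, that is, the largest gap between two empirical cumulative distribution functions. It is an illustration of the general statistic only; the dataset description does not state how the authors computed their K-S and P_value columns, and the sample durations used here are invented.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S distance: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    distance = 0.0
    # Both ECDFs are step functions, so the maximum gap occurs at a data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)   # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)   # share of b that is <= x
        distance = max(distance, abs(cdf_a - cdf_b))
    return distance


# Hypothetical project durations: benchmark versus limited-data experiment.
benchmark = [40, 42, 45, 47, 50, 53]
experiment = [41, 44, 46, 52, 55, 60]
print(ks_statistic(benchmark, experiment))
```

A distance near 0 means the experiment's duration distribution tracks the benchmark closely; a distance near 1 means the two barely overlap.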

  6. Replication Data for: A Field Experiment in Motivating Employee Ideas

    • search.dataone.org
    Updated Nov 21, 2023
    Cite
    Gibbs, Michael; Neckermann, Susanne; Siemroth, Christoph (2023). Replication Data for: A Field Experiment in Motivating Employee Ideas [Dataset]. http://doi.org/10.7910/DVN/A7FXQU
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Gibbs, Michael; Neckermann, Susanne; Siemroth, Christoph
    Description

    Gibbs, Michael, Neckermann, Susanne, and Siemroth, Christoph, (2017) "A Field Experiment in Motivating Employee Ideas." Review of Economics and Statistics 99:4, 577-590.

  7. ‘Your Voice Your Choice Project Ideas’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 26, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Your Voice Your Choice Project Ideas’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-your-voice-your-choice-project-ideas-cd5f/255c66f6/?iid=004-172&v=presentation
    Dataset updated
    Jan 26, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Your Voice Your Choice Project Ideas’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/ec9400b9-59fd-4a96-8994-9867942296ea on 26 January 2022.

    --- Dataset description provided by original source is as follows ---

    A program of Seattle Department of Neighborhoods, this is a list of street or park improvement ideas submitted by community members as a part of Your Voice Your Choice Participatory Budgeting. Ideas were vetted by project development teams made up of community members who volunteered to evaluate each project. Seattle Parks and Recreation and Seattle Department of Transportation also reviewed the projects for feasibility. The results and evaluation, along with location are provided in the set. The list will be finalized and ready for the community to vote (by council district) beginning June 3.

    --- Original source retains full ownership of the source dataset ---

  8. Data from: Experimental Data Collection and Modeling for Nominal and Fault...

    • catalog.data.gov
    • data.nasa.gov
    • +1more
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). Experimental Data Collection and Modeling for Nominal and Fault Conditions on Electro-Mechanical Actuators [Dataset]. https://catalog.data.gov/dataset/experimental-data-collection-and-modeling-for-nominal-and-fault-conditions-on-electro-mech
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Being relatively new to the field, electromechanical actuators in aerospace applications lack a knowledge base comparable to those accumulated for other actuator types, especially when it comes to fault detection and characterization. Lack of health monitoring data from fielded systems and prohibitive costs of carrying out real flight tests push for the need of building system models and designing affordable but realistic experimental setups. This paper presents our approach to accomplish a comprehensive test environment equipped with fault injection and data collection capabilities. Efforts also include development of multiple models for EMA operations, both in nominal and fault conditions, that can be used along with measurement data to generate effective diagnostic and prognostic estimates. A detailed description has been provided about how various failure modes are inserted in the test environment and corresponding data is collected to verify the physics-based models under these failure modes that have been developed in parallel. A design of experiment study has been included to outline the details of experimental data collection. Furthermore, some ideas about how experimental results can be extended to real flight environments through actual flight tests and using real flight data have been presented. Finally, the roadmap leading from this effort towards developing successful prognostic algorithms for electromechanical actuators is discussed.

  9. Grant Giving Statistics for Idea Project

    • instrumentl.com
    Updated Dec 19, 2023
    Cite
    (2023). Grant Giving Statistics for Idea Project [Dataset]. https://www.instrumentl.com/990-report/idea-project
    Dataset updated
    Dec 19, 2023
    Variables measured
    Total Assets, Total Giving
    Description

    Financial overview and grant giving statistics of Idea Project

  10. American Time Use Survey (ATUS): Arts Activities, [United States], 2003-2023...

    • icpsr.umich.edu
    ascii, delimited, r +3
    Updated Mar 10, 2025
    Cite
    United States. Bureau of Labor Statistics (2025). American Time Use Survey (ATUS): Arts Activities, [United States], 2003-2023 [Dataset]. http://doi.org/10.3886/ICPSR36268.v8
    Available download formats: ascii, stata, sas, delimited, r, spss
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    United States. Bureau of Labor Statistics
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/36268/terms

    Time period covered
    2003 - 2023
    Area covered
    United States
    Description

    The American Time Use Survey (ATUS) is the Nation's first federally administered, continuous survey on time use in the United States. This multi-year data collection contains information on the amount of time (in minutes) that people spent doing various activities on a given day, including the arts activities, in the years 2003 through 2023. Data collection for the ATUS began in January 2003. Sample cases for the survey are selected monthly, and interviews are conducted continuously throughout the year. In 2023, approximately 9,000 individuals were interviewed. Estimates are released annually. ATUS sample households are chosen from the households that completed their eighth (final) interview for the Current Population Survey (CPS), the nation's monthly household labor force survey. ATUS sample households are selected to ensure that estimates will be nationally representative. One individual age 15 or over is randomly chosen from each sampled household. This "designated person" is interviewed by telephone once about his or her activities on the day before the interview--the "diary day." The ATUS Activity Coding Lexicon is a 3-tiered classification system with 17 first-tier categories. Each of the first-tier categories has two additional levels of detail. Respondents' reported activities are assigned 6-digit activity codes based on this classification system. Additionally, the study provides demographic information--including sex, age, ethnicity, race, education, employment, and children in the household. IMPORTANT: The 2020 ATUS was greatly affected by the coronavirus (COVID-19) pandemic. Data collection was suspended in 2020 from mid-March to mid-May. ATUS data files for 2020 contain all ATUS data collected in 2020--both before and after data collection was suspended. For more information, please visit BLS's ATUS page. The weighting method was changed for 2020 to account for the suspension of data collection in early 2020 due to the COVID-19 pandemic. 
    Respondents from 2020 will have missing values for the replicate weights on this data file. The Pandemic Replicate weights file for 2019-20 contains 160 replicate final weights for each ATUS final weight created using the 2020 weighting method. Chapter 7 of the ATUS User's Guide provides more information about the 2020 weighting method.
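The 3-tiered, 6-digit activity codes mentioned in the description can be split into their tiers when grouping activities. The sketch below assumes each tier occupies two digits (first tier, then second, then third), which is how the code length and tier count suggest they are laid out; the function name and the example code are illustrative only, and the lexicon itself should be consulted for the meaning of any code.

```python
def split_activity_code(code):
    """Split a 6-digit activity code into (tier1, tier2, tier3) strings.

    Assumes two digits per tier. Integer input is left-padded with zeros,
    since leading zeros are lost when codes are read as numbers.
    """
    text = str(code).zfill(6)
    if len(text) != 6 or not text.isdigit():
        raise ValueError(f"not a 6-digit activity code: {code!r}")
    return text[0:2], text[2:4], text[4:6]


# Example: a code read as text, and one read as an integer.
print(split_activity_code("120303"))
print(split_activity_code(10101))
```

Grouping on the first element of the result gives totals for the 17 first-tier categories.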

  11. Project Idea Notes

    • pacific-data.sprep.org
    • tonga-data.sprep.org
    docx
    Updated Feb 14, 2025
    Cite
    Department of Environment (2025). Project Idea Notes [Dataset]. https://pacific-data.sprep.org/dataset/project-idea-notes
    Available download formats: docx
    Dataset updated
    Feb 14, 2025
    Dataset provided by
    Tonga
    Department of Environment
    License

    https://pacific-data.sprep.org/resource/private-data-license-agreement-0

    Area covered
    Tonga
    Description

    Project Idea Notes based on the developed SoE and NEMS

  12. Data and Code for: The Effect of Teaching Economics with Classroom...

    • openicpsr.org
    Updated Apr 29, 2022
    Cite
    Sacha Gelfer; Jeffrey A. Livingston; Sutanuka Roy (2022). Data and Code for: The Effect of Teaching Economics with Classroom Experiments: Estimates from a Within-Subject Experiment [Dataset]. http://doi.org/10.3886/E169143V1
    Dataset updated
    Apr 29, 2022
    Dataset provided by
    American Economic Association
    Authors
    Sacha Gelfer; Jeffrey A. Livingston; Sutanuka Roy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Massachusetts, USA
    Description

    Classroom experiments are a commonly used teaching tool in economics classes. As their prevalence has grown, more and more services have been created that facilitate classroom experiments. For example, econport.org, games.moblab.com, classex.de, and Charles Holt's veconlab.econ.virginia.edu each offer numerous easy-to-implement online experiments, many of which are designed for use in Microeconomics or Macroeconomics principles classes.

    A large literature studies the impact of classroom experiments on student achievement with experiments that randomize treatment across sections of a course, yielding mixed results. We extend this literature with an experimental design inspired by Wozny, Balser, and Ives (2018), who recommend randomizing treatment across topics within a given course section. Each student is taught some topics using a classroom experiment and other topics without one in a within-subject design. We use this design to evaluate the impact of classroom experiments on overall student achievement and separately for male and female students. Avilova and Goldin (2018) show that the degree to which women have been underrepresented among economics majors has been steady for the last 25 years. If classroom experiments boost achievement in introductory classes differentially by gender, they might be used to help address this imbalance by encouraging more women to major in economics. We find our classroom experiments have little impact on student learning overall and for both male and female students, but these null results may mask heterogeneous effects across the various experiments used to teach the different topics involved in the study.

  13. Research Data Spring idea to project requirements

    • figshare.com
    pdf
    Updated Jun 2, 2023
    Cite
    Daniela Duca (2023). Research Data Spring idea to project requirements [Dataset]. http://doi.org/10.6084/m9.figshare.1336013.v1
    Available download formats: pdf
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    figshare
    Authors
    Daniela Duca
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Research Data Spring idea to project requirements document to consult in advance and during the sandpit workshop on 26-27 February 2015. For a list and description of ideas see links.

  14. Census Microdata Samples Project

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Jan 29, 2022
    Cite
    (2022). Census Microdata Samples Project [Dataset]. http://identifiers.org/RRID:SCR_008902
    Dataset updated
    Jan 29, 2022
    Description

    A data set of cross-nationally comparable microdata samples for 15 Economic Commission for Europe (ECE) countries (Bulgaria, Canada, Czech Republic, Estonia, Finland, Hungary, Italy, Latvia, Lithuania, Romania, Russia, Switzerland, Turkey, UK, USA) based on the 1990 national population and housing censuses in countries of Europe and North America to study the social and economic conditions of older persons. These samples have been designed to allow research on a wide range of issues related to aging, as well as on other social phenomena. A common set of nomenclatures and classifications, derived on the basis of a study of census data comparability in Europe and North America, was adopted as a standard for recoding. This series was formerly called Dynamics of Population Aging in ECE Countries. The recommendations regarding the design and size of the samples drawn from the 1990 round of censuses envisaged: (1) drawing individual-based samples of about one million persons; (2) progressive oversampling with age in order to ensure sufficient representation of various categories of older people; and (3) retaining information on all persons co-residing in the sampled individual's dwelling unit. Estonia, Latvia and Lithuania provided the entire population over age 50, while Finland sampled it with progressive over-sampling. Canada, Italy, Russia, Turkey, UK, and the US provided samples that had not been drawn specially for this project, and cover the entire population without over-sampling. Given its wide user base, the US 1990 PUMS was not recoded. Instead, PAU offers mapping modules, which recode the PUMS variables into the project's classifications, nomenclatures, and coding schemes.

    Because of the high sampling density, these data cover various small groups of older people; contain as much geographic detail as possible under each country's confidentiality requirements; include more extensive information on housing conditions than many other data sources; and provide information for a number of countries whose data were not accessible until recently.

    Data Availability: Eight of the fifteen participating countries have signed the standard data release agreement making their data available through NACDA/ICPSR (see links below). Hungary and Switzerland require a clearance to be obtained from their national statistical offices for the use of microdata, however the documents signed between the PAU and these countries include clauses stipulating that, in general, all scholars interested in social research will be granted access. Russia requested that certain provisions for archiving the microdata samples be removed from its data release arrangement. The PAU has an agreement with several British scholars to facilitate access to the 1991 UK data through collaborative arrangements. Statistics Canada and the Italian Institute of statistics (ISTAT) provide access to data from Canada and Italy, respectively.

    • Dates of Study: 1989-1992
    • Study Features: International, Minority Oversamples
    • Sample Size: Approx. 1 million/country

    Links:

    • Bulgaria (1992), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/02200
    • Czech Republic (1991), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06857
    • Estonia (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06780
    • Finland (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06797
    • Romania (1992), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06900
    • Latvia (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/02572
    • Lithuania (1989), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/03952
    • Turkey (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/03292
    • U.S. (1990), http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06219

  15. Convergent Aeronautics Solutions Project

    • catalog.data.gov
    • data.nasa.gov
    • +3more
    Updated Apr 11, 2025
    + more versions
    Cite
    Aeronautics Research Mission Directorate (2025). Convergent Aeronautics Solutions Project [Dataset]. https://catalog.data.gov/dataset/convergent-aeronautics-solutions-project
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Aeronautics Research Mission Directorate
    Description

    The Convergent Aeronautics Solutions (CAS) Project uses short-duration activities to establish early-stage concept and technology feasibility for high-potential solutions. Internal teams propose ideas for overcoming key barriers in large-scale aeronautics problems tied to ARMD’s six strategic thrusts. The teams will conduct initial feasibility studies, perform experiments, try out new ideas, identify failures, and try again. At the end of the cycle, a review determines whether the developed solutions have met their goals, established initial feasibility, and identified potential for future aviation impact. During these reviews, the most promising capabilities will be considered for further development by other ARMD programs or for direct transfer to the aviation community. In the dynamic environment of new ideas, ARMD also gains significant value from the knowledge gained in activities that do not proceed.

    In order to enable new capabilities in commercial aviation, the CAS Project’s focus is on merging traditional aeronautics disciplines with advancements driven by the non-aeronautics world.  The Project will draw on external collaborators to supplement in-house NASA expertise in technologies and disciplines that broadly support advancements in all ARMD strategic thrusts.

  16. Data from: Why Come to Class? Post-pandemic Perspectives from Students in an...

    • tandf.figshare.com
    pdf
    Updated Jun 18, 2025
    Cite
    Kelly Findley; Mehmet Aktas; Chloe Yang (2025). Why Come to Class? Post-pandemic Perspectives from Students in an Introductory Statistics Course [Dataset]. http://doi.org/10.6084/m9.figshare.29361137.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 18, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Kelly Findley; Mehmet Aktas; Chloe Yang
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As more university instructors continue making recordings of in-person classes available, educators should carefully consider how modality options may affect learners. While most existing studies that compare learning modalities rely on survey studies and broader correlations, we conducted an interventional qualitative study to glean more theory about how students experience different modalities. Nine students enrolled in a large introduction to biostatistics course volunteered to participate. For two different 50-minute class periods, participating students were randomly assigned to do one of the following: attend class in person, watch the class recording, or watch pre-recorded videos made by the instructor. Interviews revealed that students’ ability to self-regulate their learning was a key indicator of whether they could learn richly and successfully with video-based modalities. In-person class attendance had value for several, but typically as a vehicle for maintaining discipline and good habits rather than as an opportunity to learn more richly. We theorize that developing students’ ability to plan, monitor, and evaluate their own learning processes plays a crucial role in their success across multiple modalities. Furthermore, supporting students to notice and focus on conceptual ideas in statistics may better support reflective learning in courses where class recordings are available.

  17. Consumers witnessing false information on certain topics worldwide 2024

    • statista.com
    • ai-chatbox.pro
    Updated Jul 17, 2024
    Cite
    Statista (2024). Consumers witnessing false information on certain topics worldwide 2024 [Dataset]. https://www.statista.com/statistics/1317019/false-information-topics-worldwide/
    Explore at:
    Dataset updated
    Jul 17, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2024 - Feb 2024
    Area covered
    Worldwide
    Description

    A study held in early 2024 found that more than a third of surveyed consumers in selected countries worldwide had witnessed false news about politics in the week running up to the survey. Suspicious or false COVID-19 news was also a problem.

    False news

    False news is often at its most insidious when it distorts or misrepresents information about key topics, such as public health, global conflicts, and elections. With 2024 set to be a significant year of political change, with elections taking place worldwide, trustworthy and verifiable information will be crucial. In the U.S., trust in news sources for information about the 2024 presidential election is patchy. Republicans and Independents are notably less trusting of news about the topic than their Democrat-voting peers, with only around 40 percent expressing trust in most news sources in the survey. Social media fared the least well in this respect, with just a third of surveyed adults saying that they had faith in such sites to deliver trustworthy updates on the 2024 election. A separate survey revealed that older adults were the least likely to trust the news media for election news. This is something that publishers can bear in mind when targeting audiences with updates and campaign information.

    Distorting the truth: the impact of false news

    Aside from reading (and potentially believing) false information, consumers are also at risk of accidentally sharing false news and therefore contributing to its spread. One way in which the dissemination of false news could be stemmed is by consumers educating themselves on how to identify suspicious content; however, government intervention has also been tabled. Consumers are split on whether or not governments should take steps to restrict false news, partly due to concerns about the need to protect freedom of information.

  18. Community Survey 2007 - South Africa

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +2more
    Updated May 28, 2019
    + more versions
    Cite
    Statistics South Africa (2019). Community Survey 2007 - South Africa [Dataset]. https://microdata.worldbank.org/index.php/catalog/918
    Explore at:
    Dataset updated
    May 28, 2019
    Dataset authored and provided by
    Statistics South Africa (http://www.statssa.gov.za/)
    Time period covered
    2007
    Area covered
    South Africa
    Description

    Abstract

    The Community Survey (CS) is a nationally representative, large-scale household survey which was conducted from February to March 2007. The Community Survey is designed to provide information on the trends and levels of demographic and socio-economic data, such as population size and distribution; the extent of poor households; access to facilities and services, and the levels of employment/unemployment at national, provincial and municipality level. The data can be used to assist government and the private sector in the planning, evaluation and monitoring of programmes and policies. The information collected can also be used to assess the impact of socio-economic policies and provide an indication as to how far the country has gone in its strides to eradicate poverty.

    Censuses 1996 and 2001 are the only all-inclusive censuses that Statistics South Africa has thus far conducted under the new democratic dispensation. Demographic and socio-economic data were collected and the results have enabled government and all other users of this information to make informed decisions. When cabinet took a decision that Stats SA should not conduct a census in 2006, it created a gap in information or data between Census 2001 and the next Census scheduled to be carried out in 2011. A decision was therefore taken to carry out the Community Survey in 2007.

    The main objectives of the survey were:

    • To provide estimates at lower geographical levels than existing household surveys;
    • To build human, management and logistical capacities for Census 2011; and
    • To provide inputs into the preparation of the mid-year population projections.

    The wider project strategic theme is to provide relevant statistical information that meets user needs and aspirations. Some of the main topics that are covered by the survey include demography, migration, disability and social grants, educational levels, employment and economic activities.

    Geographic coverage

    The survey covered the whole of South Africa, including all nine provinces as well as the four settlement types - urban-formal, urban-informal, rural-formal (commercial farms) and rural-informal (tribal areas).

    Analysis unit

    Households

    Universe

    The Community Survey covered all de jure household members (usual residents) in South Africa. The survey excluded collective living quarters (institutions) and some households in EAs classified as recreational areas or institutions. However, an approximation of the out-of-scope population was made from the 2001 Census and added to the final estimates of the CS 2007 results.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    Sample Design

    The sampling procedure that was adopted for the CS was a two-stage stratified random sampling process. Stage one involved the selection of enumeration areas, and stage two was the selection of dwelling units.

    Since the data are required for each local municipality, each municipality was considered as an explicit stratum. The stratification is done for those municipalities classified as category B municipalities (local municipalities) and category A municipalities (metropolitan areas) as proclaimed at the time of Census 2001. However, the newly proclaimed boundaries as well as any other higher level of geography such as province or district municipality, were considered as any other domain variable based on their link to the smallest geographic unit - the enumeration area.

    The Frame

    The Census 2001 enumeration areas were used because they give a full geographic coverage of the country without any overlap. Although changes in settlement type, growth or movement of people have occurred, the enumeration areas assisted in getting a spatial comparison over time. Out of 80 787 enumeration areas countrywide, 79 466 were considered in the frame. A total of 1 321 enumeration areas were excluded (919 covering institutions and 402 recreational areas).

    On the second level, the listing exercise yielded the dwelling frame which facilitated the selection of dwellings to be visited. The dwelling unit is a structure or part of a structure or group of structures occupied or meant to be occupied by one or more households. Some of these structures may be vacant and/or under construction, but can be lived in at the time of the survey. A dwelling unit may also be within collective living quarters where applicable (examples of each are a house, a group of huts, a flat, hostels, etc.).

    The Community Survey universe at the second-level frame is dependent on whether the different structures are classified as dwelling units (DUs) or not. Structures where people stay/live were listed and classified as dwelling units. However, there are special cases of collective living quarters that were also included in the CS frame. These are religious institutions such as convents or monasteries, and guesthouses where people stay for an extended period (more than a month). Student residences - based on how long people have stayed (more than a month) - and old-age homes not similar to hospitals (where people are living in a communal set-up) were treated the same as hostels, thereby listing either the bed or room. In addition, any other family staying in separate quarters within the premises of an institution (like wardens' quarters, military family quarters, teachers' quarters and medical staff quarters) were considered as part of the CS frame. The inclusion of such group quarters in the frame is based on the living circumstances within these structures. Members are independent of each other with the exception that they sleep under one roof.

    The remaining group quarters were excluded from the CS frame because they are difficult to access and have no stable composition. Excluded dwelling types were prisons, hotels, hospitals, military barracks, etc. This is in addition to the exclusion on first level of the enumeration areas (EAs) classified as institutions (military bases) or recreational areas (national parks).

    The Selection of Enumeration Areas (EAs)

    The EAs within each municipality were ordered by geographic type and EA type. The selection was done by using systematic random sampling. The criteria used were as follows: In municipalities with fewer than 30 EAs, all EAs were automatically selected. In municipalities with 30 or more EAs, the sample selection used a fixed proportion of 19% of all sampled EAs. However, if the selected EAs in a municipality were less than 30 EAs, the sample in the municipality was increased to 30 EAs.
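    The EA allocation rule above can be sketched as a small function. This is a minimal illustration, not Stats SA's implementation: it assumes the 19% applies to the EAs in a municipality and that fractional targets are rounded up, neither of which the description states. The function name and parameters are hypothetical.

```python
def ea_sample_size(total_eas: int, percent: int = 19, minimum: int = 30) -> int:
    """Number of EAs to select in one municipality.

    Assumptions (not confirmed by the source): the 19% is applied to the
    municipality's EA count, and fractional targets are rounded up.
    """
    if total_eas < minimum:
        # Small municipalities: every EA is selected.
        return total_eas
    # Integer ceiling of total_eas * percent / 100 (avoids float rounding).
    target = (total_eas * percent + 99) // 100
    # Top up to the minimum when the fixed proportion yields too few EAs.
    return max(target, minimum)


for n in (12, 100, 1000):
    print(n, ea_sample_size(n))
```

    Under these assumptions, a municipality with 100 EAs is topped up to 30, while one with 1,000 EAs gets 190.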

    The Selection of Dwelling Units

    The second level of the frame required a full re-listing of dwelling units. The listing exercise was undertaken before the selection of DUs. The adopted listing methodology ensured that the listing route was determined by the lister. This approach facilitated the serpentine selection of dwelling units. The listing exercise provided a complete list of dwelling units in the selected EAs. Only those structures that were classified as dwelling units were considered for selection, whether vacant or occupied. This exercise yielded a total of 2 511 314 dwelling units.

    The selection of the dwelling units was also based on a fixed proportion of 10% of the total listed dwellings in an EA. A constraint was imposed on small-size EAs where, if the listed dwelling units were less than 10 dwellings, the selection was increased to 10 dwelling units. All households within the selected dwelling units were covered. There was no replacement of refusals, vacant dwellings or non-contacts owing to their impact on the probability of selection.
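    As a rough sketch of the dwelling-unit rule above (10% of listed dwellings, raised to 10 in small EAs), the function below draws an equal-interval sample from a listing. The equal-interval method with a random start, the rounding up, and the cap at the number of listed dwellings are assumptions made for illustration; the description gives only the proportion and the minimum.

```python
import random


def select_dwellings(listed, percent=10, minimum=10, seed=None):
    """Select dwelling units from one EA's listing (hypothetical helper).

    Takes `percent` of the listed dwellings (rounded up), but at least
    `minimum`, and never more than were listed.
    """
    n = len(listed)
    if n == 0:
        return []
    target = max((n * percent + 99) // 100, minimum)
    target = min(target, n)  # cannot select more dwellings than exist
    rng = random.Random(seed)
    interval = n / target            # fractional sampling interval, >= 1
    start = rng.random() * interval  # random start within the first interval
    # Walk down the listing at a fixed interval; clamp guards float edge cases.
    return [listed[min(int(start + k * interval), n - 1)] for k in range(target)]


listing = [f"DU-{i:03d}" for i in range(250)]  # hypothetical listing
print(len(select_dwellings(listing, seed=1)))
```

    Because no replacement of refusals, vacant dwellings or non-contacts was allowed, the selected list would be final once drawn.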

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Consultation on Questionnaire Design

    Ten stakeholder workshops were held across the country during August and September 2004. Approximately 367 stakeholders, predominantly from national, provincial and local government departments, as well as from research and educational institutions, attended. The workshops aimed to achieve two objectives, namely to better understand the type of information stakeholders need to meet their objectives, and to consider the proposed data items to be included in future household surveys. The output from this process was a set of data items relating to a specific, defined focus area and outcomes that culminated with the data collection instrument (see Annexure B for all the data items).

    Questionnaire Design

    The design of the CS questionnaire was household-based and intended to collect information on 10 people. It was developed in line with the household-based survey questionnaires conducted by Stats SA. The questions were based on the data items generated out of the consultation process described above. Both the design and questionnaire layout were pre-tested in October 2005 and adjustments were made for the pilot in February 2006. Further adjustments were done after the pilot results had been finalised.

    Cleaning operations

    Editing

    The automated cleaning was implemented based on an editing rules specification defined with reference to the approved questionnaire. Most of the editing rules were categorised into structural edits, which look into the relationship between different record types; minimum processability rules, which removed false positive readings or noise; logical edits, which determine inconsistencies between fields of the same statistical unit; and inferential edits, which search for similarities across the domain. The edit specifications document for the structural, population, mortality and housing edits was developed by a team of Stats SA subject-matter specialists, demographers, and programmers. The process was successfully

  19. Kickstarter Data, Global, 2009-2020

    • icpsr.umich.edu
    Updated Sep 13, 2022
    + more versions
    Cite
    Leland, Jonathan (2022). Kickstarter Data, Global, 2009-2020 [Dataset]. http://doi.org/10.3886/ICPSR38050.v2
    Explore at:
    Dataset updated
    Sep 13, 2022
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Leland, Jonathan
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/38050/terms

    Time period covered
    2009 - 2020
    Area covered
    Global
    Description

    Launched on April 28, 2009, Kickstarter is a Public Benefit Corporation based in Brooklyn, New York. It is a global crowdfunding platform that helps to fund new creative projects and ideas through direct support from individuals (backers) from around the world who pledge money to bring these projects and ideas to life. Kickstarter supports many different kinds of projects, from films, games, and music to art, design, and technology. Funding on Kickstarter is based on the all-or-nothing model: backers who pledge their support towards a particular project are not charged unless the funding goal has been reached. Successfully funded projects reward their backers with one-of-a-kind experiences, e.g., limited editions, or copies of the creative work being produced. This study includes three datasets: (1) Kickstarter Project (public-use file), (2) Backer Location file, and (3) Kickstarter Project (restricted-use file). The public-use Kickstarter Project dataset contains detailed information about all successful and unsuccessful Kickstarter projects (N=506,199) from 2009-2020, including the project category and subcategory, project location (city, state (for U.S.-based projects), and country), funding goal in original and U.S. currencies, amount pledged in dollars, and the number of backers for each project. The restricted file adds the project title, 150-character project description, and the URL for the project on the Kickstarter site. The Backer Location dataset includes information about backers' country and state and the total amount pledged for each geographic location.

  20. Data from: Advances in Differential Privacy Concepts and Methods

    • curate.nd.edu
    pdf
    Updated Nov 11, 2024
    Cite
    Xingyuan Zhao (2024). Advances in Differential Privacy Concepts and Methods [Dataset]. http://doi.org/10.7274/25565250.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Nov 11, 2024
    Dataset provided by
    University of Notre Dame
    Authors
    Xingyuan Zhao
    License

    https://www.law.cornell.edu/uscode/text/17/106

    Description

    Differential privacy (DP) formalizes privacy guarantees in a rigorous mathematical framework and is a state-of-the-art concept in data privacy research. The DP mechanisms ensure the privacy of each individual in a sensitive dataset while releasing useful information about the whole population in that dataset. Since its debut in 2006, significant advancements in DP theory, methodologies, and applications have been made; new research topics and questions have been proposed and studied. This dissertation aims to contribute to the advancement of DP concepts and methods in the robustness of DP mechanisms to privacy attacks, privacy amplification through subsampling, and DP guarantees of procedures with their intrinsic randomness. Specifically, this dissertation consists of three research projects on DP.

    The first project explores the protection potency of DP mechanisms against homogeneity attacks (HA) by providing analytical relations between measures of disclosure risk from HA and privacy loss parameters, which will assist practitioners in understanding the abstract concepts of DP by putting them in a concrete privacy attack model and offer a perspective for choosing privacy loss parameters.

    The second project proposes a class of subsampling methods, "MUltistage Sampling Technique (MUST)", for privacy amplification. It provides the privacy composition analysis over repeated applications of MUST via the Fourier accountant algorithm. The utility experiments show that MUST demonstrates comparable utility and stability in privacy-preserving outputs compared to one-stage subsampling methods at similar privacy loss while improving the computational efficiency of algorithms requiring complex function calculations on distinct data points. MUST can be seamlessly integrated into stochastic optimization algorithms or procedures involving parallel or simultaneous subsampling when DP guarantees are necessary.

    The third project investigates the inherent DP guarantees in Bayesian posterior sampling. It provides a new privacy loss bound in releasing a single posterior sample with any prior given a bounded log ratio of the likelihood kernels based on two neighboring data sets. The new bound is tighter than the existing bounds and consistent with the likelihood principle. Experiments show that the privacy-preserving synthetic data released from Bayesian models leveraging the inherently private posterior samples are of improved utility compared to those generated by sanitizing the original information through explicit DP mechanisms.

Cite
Southern Africa Labour and Development Research Unit (2019). Project for Statistics on Living Standards and Development 1993 - South Africa [Dataset]. https://catalog.ihsn.org/catalog/4628

Project for Statistics on Living Standards and Development 1993 - South Africa

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Mar 29, 2019
Dataset authored and provided by
Southern Africa Labour and Development Research Unit
Time period covered
1993
Area covered
South Africa
Description

Abstract

The Project for Statistics on Living Standards and Development was a countrywide World Bank Living Standards Measurement Survey. It covered approximately 9000 households, drawn from a representative sample of South African households. The fieldwork was undertaken during the nine months leading up to the country's first democratic elections at the end of April 1994. The purpose of the survey was to collect statistical information about the conditions under which South Africans live in order to provide policymakers with the data necessary for planning strategies. This data would aid the implementation of goals such as those outlined in the Government of National Unity's Reconstruction and Development Programme.

Geographic coverage

National coverage

Analysis unit

  • Households
  • Individuals
  • Community

Universe

All Household members.

Individuals in hospitals, old age homes, hotels and hostels of educational institutions were not included in the sample. Migrant labour hostels were included. In addition to those that turned up in the selected ESDs, a sample of three hostels was chosen from a national list provided by the Human Sciences Research Council and within each of these hostels a representative sample was drawn on a similar basis as described above for the households in ESDs.

Kind of data

Sample survey data [ssd]

Sampling procedure

Sample size is 9,000 households

The sample design adopted for the study was a two-stage self-weighting design in which the first stage units were Census Enumerator Subdistricts (ESDs, or their equivalent) and the second stage were households.

The advantage of using such a design is that it provides a representative sample that need not be based on accurate census population distribution. In the case of South Africa, the sample will automatically include many poor people, without the need to go beyond this and oversample the poor. Proportionate sampling as in such a self-weighting sample design offers the simplest possible data files for further analysis, as weights do not have to be added. However, in the end this advantage could not be retained and weights had to be added.

The sampling frame was drawn up on the basis of small, clearly demarcated area units, each with a population estimate. The nature of the self-weighting procedure adopted ensured that this population estimate was not important for determining the final sample, however. For most of the country, census ESDs were used. Where some ESDs comprised relatively large populations as for instance in some black townships such as Soweto, aerial photographs were used to divide the areas into blocks of approximately equal population size. In other instances, particularly in some of the former homelands, the area units were not ESDs but villages or village groups.

In the sample design chosen, the area stage units (generally ESDs) were selected with probability proportional to size, based on the census population. Systematic sampling was used throughout, that is, sampling at a fixed interval in a list of ESDs, starting at a randomly selected starting point. Given that sampling was self-weighting, the impact of stratification was expected to be modest. The main objective was to ensure that the racial and geographic breakdown approximated the national population distribution. This was done by listing the area stage units (ESDs) by statistical region and then within the statistical region by urban or rural. Within these sub-statistical regions, the ESDs were then listed in order of percentage African. The sampling interval for the selection of the ESDs was obtained by dividing the 1991 census population of 38,120,853 by the 300 clusters to be selected. This yielded 105,800. Starting at a randomly selected point, every 105,800th person down the cluster list was selected. This ensured both geographic and racial diversity (ESDs were ordered by statistical sub-region and proportion of the population African). In three or four instances, the ESD chosen was judged inaccessible and replaced with a similar one.
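The first-stage procedure described above (a fixed interval applied to cumulative population down an ordered list, from a random start) can be illustrated with a short sketch. The ESD populations below are made up, and the function name is hypothetical; this is a generic systematic probability-proportional-to-size selection, not the survey team's actual program.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n_clusters, seed=None):
    """Select units with probability proportional to size.

    `sizes` holds the population of each area unit in list order
    (the source orders ESDs by region, urban/rural and percentage African).
    Returns the index of the unit hit by each selection point.
    """
    total = sum(sizes)
    interval = total / n_clusters           # sampling interval in persons
    rng = random.Random(seed)
    start = rng.random() * interval         # random start in the first interval
    cumulative = list(accumulate(sizes))    # running population totals
    selected, i = [], 0
    for k in range(n_clusters):
        point = start + k * interval        # the "every interval-th person"
        # Advance to the unit whose cumulative range contains this person.
        while i < len(cumulative) - 1 and cumulative[i] <= point:
            i += 1
        selected.append(i)
    return selected


esd_populations = [500, 1500, 1000, 3000, 2000, 2000]  # hypothetical ESDs
print(pps_systematic(esd_populations, n_clusters=4, seed=7))
```

A unit larger than the interval can be hit more than once, which is one reason the source describes splitting very large ESDs into blocks of roughly equal population.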

In the second sampling stage the unit of analysis was the household. In each selected ESD a listing or enumeration of households was carried out by means of a field operation. From the households listed in an ESD a sample of households was selected by systematic sampling. Even though the ultimate enumeration unit was the household, in most cases "stands" were used as enumeration units. However, when a stand was chosen as the enumeration unit all households on that stand had to be interviewed.

Census population data, however, was available only for 1991. An assumption on population growth was thus made to obtain an approximation of the population size for 1993, the year of the survey. The sampling interval at the level of the household was determined in the following way: Based on the decision to have a take of 125 individuals on average per cluster (i.e. assuming 5 members per household to give an average cluster size of 25 households), the interval of households to be selected was determined as the census population divided by 118.1, i.e. allowing for population growth since the census. It was subsequently discovered that population growth was slightly over-estimated but this had little effect on the findings of the survey.
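The divisor of 118.1 can be read as the target take of 125 individuals deflated by the assumed population growth between the 1991 census and the 1993 survey. The sketch below works through that arithmetic. Reading 125 / 118.1 as a growth factor is an interpretation of the figures, not something the source states, and the ESD population used is hypothetical.

```python
TARGET_INDIVIDUALS = 125      # average take per cluster (from the text)
MEMBERS_PER_HOUSEHOLD = 5     # assumed household size (from the text)
DIVISOR = 118.1               # census population is divided by this (from the text)

# 125 individuals at 5 per household gives 25 households per cluster.
households_per_cluster = TARGET_INDIVIDUALS // MEMBERS_PER_HOUSEHOLD

# Interpretation (assumption): the divisor absorbs growth since the census.
implied_growth = TARGET_INDIVIDUALS / DIVISOR   # roughly 1.058


def household_interval(esd_census_population: float) -> float:
    """Select every k-th listed household, with k = census population / 118.1."""
    return esd_census_population / DIVISOR


esd_pop_1991 = 2362                              # hypothetical ESD
interval = household_interval(esd_pop_1991)      # about every 20th household
expected_households = esd_pop_1991 * implied_growth / MEMBERS_PER_HOUSEHOLD
expected_selected = expected_households / interval

print(round(implied_growth, 4), round(interval, 2), round(expected_selected, 2))
```

Under this reading, an ESD of any size yields about 25 selected households, which is what makes the design self-weighting when ESDs are drawn with probability proportional to size.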

Individuals in hospitals, old age homes, hotels and hostels of educational institutions were not included in the sample. Migrant labour hostels were included. In addition to those that turned up in the selected ESDs, a sample of three hostels was chosen from a national list provided by the Human Sciences Research Council and within each of these hostels a representative sample was drawn on a similar basis as described above for the households in ESDs.

Mode of data collection

Face-to-face [f2f]

Research instrument

The main instrument used in the survey was a comprehensive household questionnaire. This questionnaire covered a wide range of topics but was not intended to provide exhaustive coverage of any single subject. In other words, it was an integrated questionnaire aimed at capturing different aspects of living standards. The topics covered included demography, household services, household expenditure, educational status and expenditure, remittances and marital maintenance, land access and use, employment and income, health status and expenditure and anthropometry (children under the age of six were weighed and their heights measured). This questionnaire was available to households in two languages, namely English and Afrikaans. In addition, interviewers had in their possession a translation in the dominant African language/s of the region.

In addition to the detailed household questionnaire referred to above, a community questionnaire was administered in each cluster of the sample. The purpose of this questionnaire was to elicit information on the facilities available to the community in each cluster. Questions related primarily to the provision of education, health and recreational facilities. Furthermore, there was a detailed section for the prices of a range of commodities from two retail sources in or near the cluster: a formal source such as a supermarket and a less formal one such as the "corner cafe" or a "spaza". The purpose of this latter section was to obtain a measure of price variation both by region and by retail source. These prices were obtained by the interviewer. For the questions relating to the provision of facilities, respondents were "prominent" members of the community such as school principals, priests and chiefs.

Cleaning operations

All the questionnaires were checked when received. Where information was incomplete or appeared contradictory, the questionnaire was sent back to the relevant survey organization. As soon as the data was available, it was captured using the local development platform ADE. This was completed in February 1994. Following this, a series of exploratory programs was written to highlight inconsistencies and outliers. For example, all person-level files were linked together to ensure that the same person code reported in different sections of the questionnaire corresponded to the same person. The error reports from these programs were compared to the questionnaires and the necessary alterations made. This was a lengthy process, as several files were checked more than once, and it was completed at the beginning of August 1994. In some cases questionnaires would contain missing values, or comments that the respondent did not know, or refused to answer a question.

These responses are coded in the data files with the following values:

• -1: The data was not available on the questionnaire or form
• -2: The field is not applicable
• -3: Respondent refused to answer
• -4: Respondent did not know answer to question
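When analysing the data, these codes need to be separated from real values before any totals or averages are computed. The following is a minimal sketch of one way to do that; the function names are hypothetical, and it assumes that the values -1 to -4 never occur as genuine measurements in the variable being processed, which should be checked per variable.

```python
# Missing-data codes as listed in the survey documentation.
MISSING_CODES = {
    -1: "not available on the questionnaire or form",
    -2: "not applicable",
    -3: "refused to answer",
    -4: "did not know",
}


def split_missing(value):
    """Return (clean_value, reason). clean_value is None for a missing-data code."""
    if value in MISSING_CODES:
        return None, MISSING_CODES[value]
    return value, None


def mean_ignoring_missing(values):
    """Average of the values that are not missing-data codes, or None if none remain."""
    valid = [v for v in values if v not in MISSING_CODES]
    return sum(valid) / len(valid) if valid else None


if __name__ == "__main__":
    # Hypothetical responses to a single numeric question.
    responses = [10, -3, 20, -1]
    print(mean_ignoring_missing(responses))
    print(split_missing(-4))
```

Keeping the reason alongside the cleaned value makes it possible to report refusals and "did not know" responses separately from fields that were simply not applicable.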

Data appraisal

The data collected in clusters 217 and 218 should be viewed as highly unreliable and have therefore been removed from the data set. The data currently available on the web site has been revised to remove the data from these clusters. Researchers who downloaded the data in the past should revise their data sets accordingly. For information on the data in those clusters, contact SALDRU (http://www.saldru.uct.ac.za/).
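For researchers working from an older download, removing the two clusters could look like the sketch below. The record layout and the field name "cluster" are assumptions made for illustration; the actual variable name in the data files should be taken from the survey's own documentation.

```python
# Clusters flagged as highly unreliable in the data appraisal note.
UNRELIABLE_CLUSTERS = {217, 218}


def drop_unreliable_clusters(records, cluster_field="cluster"):
    """Return only the records whose cluster is not flagged as unreliable.

    `records` is a list of dicts; `cluster_field` is a hypothetical field name.
    """
    return [r for r in records if r[cluster_field] not in UNRELIABLE_CLUSTERS]


if __name__ == "__main__":
    # Hypothetical household records from an older copy of the data.
    households = [
        {"cluster": 216, "hhid": 1},
        {"cluster": 217, "hhid": 2},
        {"cluster": 218, "hhid": 3},
        {"cluster": 219, "hhid": 4},
    ]
    kept = drop_unreliable_clusters(households)
    print([h["hhid"] for h in kept])
```

The same filter would need to be applied to every file that carries a cluster identifier, so that household, individual and community records stay consistent with one another.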
