46 datasets found
  1. Dataset #1: Cross-sectional survey data

    • figshare.com
    txt
    Updated Jul 19, 2023
    Cite
    Adam Baimel (2023). Dataset #1: Cross-sectional survey data [Dataset]. http://doi.org/10.6084/m9.figshare.23708730.v1
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Adam Baimel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    N.B. This is not real data. Only here for an example for project templates.

    Project Title: Add title here

    Project Team: Add contact information for research project team members

    Summary: Provide a descriptive summary of the nature of your research project and its aims/focal research questions.

    Relevant publications/outputs: When available, add links to the related publications/outputs from this data.

    Data availability statement: If your data is not linked on figshare directly, provide links to where it is being hosted here (i.e., Open Science Framework, Github, etc.). If your data is not going to be made publicly available, please provide details here as to the conditions under which interested individuals could gain access to the data and how to go about doing so.

    Data collection details: 1. When was your data collected? 2. How were your participants sampled/recruited?

    Sample information: How many and who are your participants? Demographic summaries are helpful additions to this section.

    Research Project Materials: What materials are necessary to fully reproduce the contents of your dataset? Include a list of all relevant materials (e.g., surveys, interview questions) with a brief description of what is included in each file that should be uploaded alongside your datasets.

    List of relevant datafile(s): If your project produces data that cannot be contained in a single file, list the names of each of the files here with a brief description of what parts of your research project each file is related to.

    Data codebook: What is in each column of your dataset? Provide variable names as they are encoded in your data files, verbatim question associated with each response, response options, details of any post-collection coding that has been done on the raw-response (and whether that's encoded in a separate column).

    Examples available at: https://www.thearda.com/data-archive?fid=PEWMU17 https://www.thearda.com/data-archive?fid=RELLAND14
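
    As a rough illustration of the codebook fields listed above (variable name, verbatim question, response options, post-collection coding), the sketch below stores one entry per column. The class name, field names, and the two example variables are hypothetical and are not taken from any real data file.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CodebookEntry:
    """One column of a dataset, described as the template suggests."""
    variable: str                      # name exactly as encoded in the data file
    question: str                      # verbatim question wording
    responses: Dict[int, str] = field(default_factory=dict)  # code -> label
    recoded_from: Optional[str] = None  # source column, if post-collection coding
    recode_note: str = ""               # how the recode was derived


def render(entries: List[CodebookEntry]) -> str:
    """Format codebook entries as plain text, one block per variable."""
    blocks = []
    for e in entries:
        lines = [f"{e.variable}: {e.question}"]
        for code, label in sorted(e.responses.items()):
            lines.append(f"  {code} = {label}")
        if e.recoded_from:
            lines.append(f"  (recoded from {e.recoded_from}: {e.recode_note})")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# Hypothetical example entries, for illustration only.
codebook = [
    CodebookEntry(
        variable="attend",
        question="How often do you attend religious services?",
        responses={1: "Never", 2: "A few times a year", 3: "Monthly", 4: "Weekly"},
    ),
    CodebookEntry(
        variable="attend_binary",
        question="How often do you attend religious services?",
        responses={0: "Less than monthly", 1: "Monthly or more"},
        recoded_from="attend",
        recode_note="codes 1-2 -> 0, codes 3-4 -> 1",
    ),
]

print(render(codebook))
```

    Keeping the recode in a separate column, with a pointer back to the raw variable, matches the template's request to state whether post-collection coding is stored separately.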

  2. General Social Survey 2012 Cross-Section and Panel Combined - Instructional...

    • thearda.com
    Updated 2012
    Cite
    Tom W. Smith (2012). General Social Survey 2012 Cross-Section and Panel Combined - Instructional Dataset [Dataset]. http://doi.org/10.17605/OSF.IO/TH2CE
    Dataset provided by
    Association of Religion Data Archives
    Authors
    Tom W. Smith
    Dataset funded by
    National Science Foundation
    Description

    This file contains all of the cases and variables that are in the original 2012 General Social Survey, but is prepared for easier use in the classroom. Changes have been made in two areas. First, to avoid confusion when constructing tables or interpreting basic analysis, all missing data codes have been set to system missing. Second, many of the continuous variables have been categorized into fewer categories, and added as additional variables to the file.

    The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies. This data file has all cases and variables asked on the 2012 GSS. There are a total of 4,820 cases in the data set but their initial sampling years vary because the GSS now contains panel cases. Sampling years can be identified with the variable SAMPTYPE.

    The 2012 GSS featured special modules on religious scriptures, the environment, dance and theater performances, health care system, government involvement, health concerns, emotional health, financial independence and income inequality.

    The GSS has switched from a repeating, cross-section design to a combined repeating cross-section and panel-component design. This file has a rolling panel design, with the 2008 GSS as the base year for the first panel. A sub-sample of 2,000 GSS cases from 2008 was selected for reinterview in 2010 and again in 2012 as part of the GSSs in those years. The 2010 GSS consisted of a new cross-section plus the reinterviews from 2008. The 2012 GSS consists of a new cross-section of 1,974, the first reinterview wave of the 2010 panel cases with 1,551 completed cases, and the second and final reinterview of the 2008 panel with 1,295 completed cases. Altogether, the 2012 GSS had 4,820 cases (1,974 in the new 2012 panel, 1,551 in the 2010 panel, and 1,295 in the 2008 panel).
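
    The case counts quoted above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the published figures; the dictionary keys are descriptive labels chosen here, not variable names from the GSS file.

```python
# Completed cases in the 2012 GSS, as stated in the description above.
components = {
    "2012 new cross-section": 1974,
    "2010 panel, first reinterview": 1551,
    "2008 panel, second reinterview": 1295,
}

total = sum(components.values())
print(f"total cases: {total}")

# Share of the combined file contributed by each component, in percent.
for label, n in components.items():
    print(f"{label}: {n} cases ({100 * n / total:.1f}%)")
```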

    To download syntax files for the GSS that reproduce well-known religious group recodes, including RELTRAD, please visit the ARDA's Syntax Repository (/research/syntax-repository-list).

  3. BLM OR Water Quality and Quantity Cross Section Sample Publication Point Hub...

    • catalog.data.gov
    • datasets.ai
    Updated Nov 11, 2025
    Cite
    Bureau of Land Management (2025). BLM OR Water Quality and Quantity Cross Section Sample Publication Point Hub [Dataset]. https://catalog.data.gov/dataset/blm-or-water-quality-and-quantity-cross-section-sample-publication-point-hub-d1d6d
    Dataset provided by
    Bureau of Land Management (http://www.blm.gov/)
    Description

    CROSS_SECT_SAMPLE_PUB_PT: Cross-sectional surveys capture the shape of the stream channel at a specific location by measuring elevations at intervals across the channel. Cross-sections are used to determine bankfull width, mean bankfull depth, and entrenchment of a channel at a specific point. Cross-sections are usually installed and monitored to track geomorphic change in a stream before and after a physical alteration to the channel; these surveys can detect erosion and deposition of stream sediment as well as changes to the shape (profile) of stream bed and banks. The cross-section table defined in this data standard stores the summary measurements. Raw data can be stored in a spreadsheet or document and related to the record.
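
    To make the summary measurements concrete, the sketch below derives bankfull width and mean bankfull depth from surveyed station/elevation pairs. It assumes the bankfull elevation is already known and that points are ordered across the channel; the function name and the sample survey points are hypothetical and do not come from the BLM data standard.

```python
from typing import Sequence, Tuple


def bankfull_summary(
    stations: Sequence[float],
    elevations: Sequence[float],
    bankfull_elev: float,
) -> Tuple[float, float]:
    """Return (bankfull width, mean bankfull depth) for one cross-section.

    Width is the horizontal extent of the channel below the bankfull
    elevation; mean depth is cross-sectional area divided by that width.
    Bank intersections are located by linear interpolation.
    """
    width = 0.0
    area = 0.0
    points = list(zip(stations, elevations))
    for (x1, z1), (x2, z2) in zip(points, points[1:]):
        d1 = bankfull_elev - z1  # depth below bankfull at left point
        d2 = bankfull_elev - z2  # depth below bankfull at right point
        dx = x2 - x1
        if d1 <= 0 and d2 <= 0:
            continue  # segment lies entirely at or above bankfull
        if d1 > 0 and d2 > 0:
            width += dx
            area += dx * (d1 + d2) / 2  # trapezoid
        else:
            wet = max(d1, d2)     # depth at the submerged end
            dry = -min(d1, d2)    # height above bankfull at the other end
            w = dx * wet / (wet + dry)  # submerged part of the segment
            width += w
            area += w * wet / 2   # triangle
    if width == 0:
        raise ValueError("no part of the section lies below bankfull elevation")
    return width, area / width


# Hypothetical survey: distance across channel (m) and bed elevation (m).
stations = [0.0, 1.0, 3.0, 4.0]
elevations = [2.0, 0.0, 0.0, 2.0]
width, mean_depth = bankfull_summary(stations, elevations, bankfull_elev=1.0)
print(f"bankfull width = {width:.2f} m, mean depth = {mean_depth:.2f} m")
```

    Entrenchment is not computed here because it needs a second width measured at a flood-prone elevation, which the description does not specify.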

  4. S1 File -

    • plos.figshare.com
    bin
    Updated Feb 23, 2024
    Cite
    Tamrat Anbesaw; Amare Asmamaw; Kidist Adamu; Million Tsegaw (2024). S1 File - [Dataset]. http://doi.org/10.1371/journal.pone.0298406.s001
    Dataset provided by
    PLOS ONE
    Authors
    Tamrat Anbesaw; Amare Asmamaw; Kidist Adamu; Million Tsegaw
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Currently, the biggest issue facing the entire world is mental health. According to the Ethiopian Ministry of Health, nearly one-fourth of the community experiences some category of mental illness. Most cases are treated in religious and traditional institutions, where the community prefers to be treated. However, very few studies have examined the level of mental health literacy among traditional healers.

    Aims: The study aimed to assess the level of mental health literacy, and its associated factors, among traditional healers in Northeast Ethiopia from September 1-30, 2022.

    Method: A mixed-approach cross-sectional study was carried out from September 1-30, 2022, using simple random sampling with a total sample of 343. Pretested, structured questionnaires and face-to-face interviews were utilized for data collection. The level of Mental Health Literacy (MHL) was assessed using the 35 mental health literacy (35-MHLQ) scale. The semi-structured checklist was used for the in-depth interview and the FGD for the qualitative part. Data was entered using Epi-data version 4.6 and then exported to SPSS version 26 for analysis. The association between outcome and independent variables was analyzed with bivariate and multivariable linear regression. P-values < 0.05 were considered statistically significant. Thematic analysis was used to analyze the qualitative data, and the findings were then cross-referenced with the findings of the quantitative data.

    Results: The sample of traditional healers in Dessie City had a total mean mental health literacy score of 91.81 ± 10.53. Age (β = -0.215, 95% CI (-0.233, -0.05), p = 0.003), informal educational status (β = -5.378, 95% CI (-6.505, -0.350), p = 0.029), presence of a relative with a mental disorder (β = 6.030, 95% CI (0.073, 7.428), p = 0.046), getting information on mental illness (β = 6.565, 95% CI (3.432, 8.680), p =

  5. Replication Data for: A Three-Year Mixed Methods Study of Undergraduates’...

    • dataverse.no
    • dataverse.azure.uit.no
    Updated Oct 8, 2024
    Cite
    Ellen Nierenberg (2024). Replication Data for: A Three-Year Mixed Methods Study of Undergraduates’ Information Literacy Development: Knowing, Doing, and Feeling [Dataset]. http://doi.org/10.18710/SK0R1N
    Available download formats: txt(21865), txt(19475), csv(55030), txt(14751), txt(26578), txt(16861), txt(28211), pdf(107685), pdf(657212), txt(12082), txt(16243), text/x-fixed-field(55030), pdf(65240), txt(8172), pdf(634629), txt(31896), application/x-spss-sav(51476), txt(4141), pdf(91121), application/x-spss-sav(31612), txt(35011), txt(23981), text/x-fixed-field(15653), txt(25369), txt(17935), csv(15653)
    Dataset provided by
    DataverseNO
    Authors
    Ellen Nierenberg
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Aug 8, 2019 - Jun 10, 2022
    Area covered
    Norway
    Description

    This data set contains the replication data and supplements for the article "Knowing, Doing, and Feeling: A three-year, mixed-methods study of undergraduates’ information literacy development." The survey data is from two samples:

    • cross-sectional sample (different students at the same point in time)
    • longitudinal sample (the same students at different points in time)

    Surveys were distributed via Qualtrics during the students' first and sixth semesters. Quantitative and qualitative data were collected and used to describe students' IL development over 3 years. Statistics from the quantitative data were analyzed in SPSS. The qualitative data was coded and analyzed thematically in NVivo. The qualitative, textual data is from semi-structured interviews with sixth-semester students in psychology at UiT, both focus groups and individual interviews. All data were collected as part of the contact author's PhD research on information literacy (IL) at UiT.

    The following files are included in this data set:

    1. A README file which explains the quantitative data files. (2 file formats: .txt, .pdf)
    2. The consent form for participants (in Norwegian). (2 file formats: .txt, .pdf)
    3. Six data files with survey results from UiT psychology undergraduate students for the cross-sectional (n=209) and longitudinal (n=56) samples, in 3 formats (.dat, .csv, .sav). The data was collected in Qualtrics from fall 2019 to fall 2022.
    4. Interview guide for 3 focus group interviews. File format: .txt
    5. Interview guides for 7 individual interviews - first round (n=4) and second round (n=3). File format: .txt
    6. The 21-item IL test (Tromsø Information Literacy Test = TILT), in English and Norwegian. TILT is used for assessing students' knowledge of three aspects of IL: evaluating sources, using sources, and seeking information. The test is multiple choice, with four alternative answers for each item. This test is a "KNOW-measure," intended to measure what students know about information literacy. (2 file formats: .txt, .pdf)
    7. Survey questions related to interest - specifically students' interest in being or becoming information literate - in 3 parts (all in English and Norwegian): a) information and questions about the 4 phases of interest; b) interest questionnaire with 26 items in 7 subscales (Tromsø Interest Questionnaire - TRIQ); c) survey questions about IL and interest, need, and intent. (2 file formats: .txt, .pdf)
    8. Information about the assignment-based measures used to measure what students do in practice when evaluating and using sources. Students were evaluated with these measures in their first and sixth semesters. (2 file formats: .txt, .pdf)
    9. The Norwegian Centre for Research Data's (NSD) 2019 assessment of the notification form for personal data for the PhD research project. In Norwegian. (Format: .pdf)

  6. General Social Survey 2012 Cross-Section and Panel Combined

    • thearda.com
    Updated 2012
    Cite
    Tom W. Smith (2012). General Social Survey 2012 Cross-Section and Panel Combined [Dataset]. http://doi.org/10.17605/OSF.IO/5G3RJ
    Dataset provided by
    Association of Religion Data Archives
    Authors
    Tom W. Smith
    Dataset funded by
    National Science Foundation
    Description

    The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies. This data file has all cases and variables asked on the 2012 GSS. There are a total of 4,820 cases in the data set but their initial sampling years vary because the GSS now contains panel cases. Sampling years can be identified with the variable SAMPTYPE.

    The 2012 GSS featured special modules on religious scriptures, the environment, dance and theater performances, health care system, government involvement, health concerns, emotional health, financial independence and income inequality.

    The GSS has switched from a repeating, cross-section design to a combined repeating cross-section and panel-component design. This file has a rolling panel design, with the 2008 GSS as the base year for the first panel. A sub-sample of 2,000 GSS cases from 2008 was selected for reinterview in 2010 and again in 2012 as part of the GSSs in those years. The 2010 GSS consisted of a new cross-section plus the reinterviews from 2008. The 2012 GSS consists of a new cross-section of 1,974, the first reinterview wave of the 2010 panel cases with 1,551 completed cases, and the second and final reinterview of the 2008 panel with 1,295 completed cases. Altogether, the 2012 GSS had 4,820 cases (1,974 in the new 2012 panel, 1,551 in the 2010 panel, and 1,295 in the 2008 panel).

    To download syntax files for the GSS that reproduce well-known religious group recodes, including RELTRAD, please visit the ARDA's Syntax Repository (/research/syntax-repository-list).

  7. National Panel Survey 2008-2015, Uniform Panel Dataset - Tanzania

    • microdata.worldbank.org
    • datacatalog.ihsn.org
    Updated Mar 17, 2021
    Cite
    National Bureau of Statistics (2021). National Panel Survey 2008-2015, Uniform Panel Dataset - Tanzania [Dataset]. https://microdata.worldbank.org/index.php/catalog/3814
    Dataset authored and provided by
    National Bureau of Statistics
    Time period covered
    2008 - 2015
    Area covered
    Tanzania
    Description

    Abstract

    Panel data possess several advantages over conventional cross-sectional and time-series data, including their power to isolate the effects of specific actions, treatments, and general policies often at the core of large-scale econometric development studies. While the concept of panel data alone provides the capacity for modeling the complexities of human behavior, the notion of universal panel data – in which time- and situation-driven variances leading to variations in tools, and thus results, are mitigated – can further enhance exploitation of the richness of panel information.

    This Basic Information Document (BID) provides a brief overview of the Tanzania National Panel Survey (NPS), but focuses primarily on the theoretical development and application of panel data, as well as key elements of the universal panel survey instrument and datasets generated by the four rounds of the NPS. As this Basic Information Document (BID) for the UPD does not describe in detail the background, development, or use of the NPS itself, the round-specific NPS BIDs should supplement the information provided here.

    The NPS Uniform Panel Dataset (UPD) consists of both survey instruments and datasets, meticulously aligned and engineered with the aim of facilitating the use of and improving access to the wealth of panel data offered by the NPS. The NPS-UPD provides a consistent and straightforward means of conducting not only user-driven analyses using convenient, standardized tools, but also for monitoring MKUKUTA, FYDP II, and other national level development indicators reported by the NPS.

    The design of the NPS-UPD combines the four completed rounds of the NPS – NPS 2008/09 (R1), NPS 2010/11 (R2), NPS 2012/13 (R3), and NPS 2014/15 (R4) – into pooled, module-specific survey instruments and datasets. The panel survey instruments offer the ease of comparability over time, with modifications and variances easily identifiable as well as those aspects of the questionnaire which have remained identical and offer consistent information. By providing all module-specific data over time within compact, pooled datasets, panel datasets eliminate the need for user-generated merges between rounds and present data in a clear, logical format, increasing both the usability and comprehension of complex data.

    Geographic coverage

    Designed for analysis of key indicators at four primary domains of inference, namely: Dar es Salaam, other urban, rural, Zanzibar.

    Analysis unit

    • Households
    • Individuals

    Universe

    The universe includes all households and individuals in Tanzania with the exception of those residing in military barracks or other institutions.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    While the same sample of respondents was maintained over the first three rounds of the NPS, longitudinal surveys tend to suffer from bias introduced by households leaving the survey over time, i.e. attrition. Although the NPS maintains a highly successful recapture rate (roughly 96% retention at the household level), which limits the growth of this selection bias, the longitudinal cohorts were refreshed for the NPS 2014/15 to ensure proper representativeness of estimates while keeping a primary sample large enough to maintain cohesion within panel analysis. A newly completed Population and Housing Census (PHC) in 2012, providing updated population figures along with changes in administrative boundaries, created an opportunity to realign the NPS sample and reduce the collective bias potentially introduced through attrition.

    To maintain the panel concept of the NPS, the sample design for NPS 2014/2015 consisted of a combination of the original NPS sample and a new NPS sample. A nationally representative sub-sample was selected to continue as part of the “Extended Panel” while an entirely new sample, “Refresh Panel”, was selected to represent national and sub-national domains. Similar to the sample in NPS 2008/2009, the sample design for the “Refresh Panel” allows analysis at four primary domains of inference, namely: Dar es Salaam, other urban areas on mainland Tanzania, rural mainland Tanzania, and Zanzibar. This new cohort in NPS 2014/2015 will be maintained and tracked in all future rounds between national censuses.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The format of the NPS-UPD survey instrument is similar to previously disseminated NPS survey instruments. Each module has a questionnaire and clearly identifies if the module collects information at the individual or household level. Within each module-specific questionnaire of the NPS-UPD survey instrument, there are five distinct sections, arranged vertically: (1) the UPD - “U” on the survey instrument, (2) R4, (3) R3, (4) R2, and (5) R1 – the latter 4 sections presenting each questionnaire in its original form at time of its respective dissemination.

    The uppermost section of each module’s questionnaire (“U”) represents the model universal panel questionnaire, with questions generated from the comprehensive listing of questions across all four rounds of the NPS and codes generated from the comprehensive collection of codes. The following sections are arranged vertically by round, considering R4 as most recent. While not all rounds will have data reported for each question in the UPD and not each question will have reports for each of the UPD codes listed, the NPS-UPD survey instrument represents the visual, all-inclusive set of information collected by the NPS over time.

    The four round-specific sections (R4, R3, R2, R1) are aligned with their UPD-equivalent question, visually presenting their contribution to compatibility with the UPD. Each round-specific section includes the original round-specific variable names, response codes and skip patterns (corresponding to their respective round-specific NPS data sets, and despite their variance from other rounds or from the comprehensive UPD code listing).

  8. Health and Retirement Study (HRS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the health and retirement study (hrs) with r

    the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle.

    but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

    the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

    this new github repository contains five scripts:

    1992 - 2010 download HRS microdata.R
    • loop through every year and every file, download, then unzip everything in one big party

    import longitudinal RAND contributed files.R
    • create a SQLite database (.db) on the local disk
    • load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

    longitudinal RAND - analysis examples.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create two database-backed complex sample survey objects, using a taylor-series linearization design
    • perform a mountain of analysis examples with wave weights from two different points in the panel

    import example HRS file.R
    • load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
    • parse through the IF block at the bottom of the sas importation script, blank out a number of variables
    • save the file as an R data file (.rda) for fast loading later

    replicate 2002 regression.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create a database-backed complex sample survey object, using a taylor-series linearization design
    • exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

    click here to view these five scripts

    for more detail about the health and retirement study (hrs), visit:
    • michigan's hrs homepage
    • rand's hrs homepage
    • the hrs wikipedia page
    • a running list of publications using hrs

    notes:

    exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

    confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D

  9. European Union Statistics on Income and Living Conditions 2013 -...

    • catalog.ihsn.org
    • datacatalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Eurostat (2019). European Union Statistics on Income and Living Conditions 2013 - Cross-Sectional User Database - Netherlands [Dataset]. https://catalog.ihsn.org/index.php/catalog/7684
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    Time period covered
    2013
    Area covered
    Netherlands
    Description

    Abstract

    In 2013, the EU-SILC instrument covered all EU Member States plus Iceland, Turkey, Norway, Switzerland and Croatia. EU-SILC has become the EU reference source for comparative statistics on income distribution and social exclusion at European level, particularly in the context of the "Program of Community action to encourage cooperation between Member States to combat social exclusion" and for producing structural indicators on social cohesion for the annual spring report to the European Council. The first priority is to be given to the delivery of comparable, timely and high quality cross-sectional data.

    There are two types of datasets: 1) Cross-sectional data pertaining to fixed time periods, with variables on income, poverty, social exclusion and living conditions. 2) Longitudinal data pertaining to individual-level changes over time, observed periodically - usually over four years.

    Social exclusion and housing-condition information is collected at household level. Income at a detailed component level is collected at personal level, with some components included in the "Household" section. Labor, education and health observations only apply to persons aged 16 and over. EU-SILC was established to provide data on structural indicators of social cohesion (at-risk-of-poverty rate, S80/S20 and gender pay gap) and to provide relevant data for the two 'open methods of coordination' in the field of social inclusion and pensions in Europe.

    This is the 1st version of the 2013 Cross-Sectional User Database as released in July 2015.

    Geographic coverage

    The survey covers the following countries: Austria; Belgium; Bulgaria; Croatia; Cyprus; Czech Republic; Denmark; Estonia; Finland; France; Germany; Greece; Spain; Ireland; Italy; Latvia; Lithuania; Luxembourg; Hungary; Malta; Netherlands; Poland; Portugal; Romania; Slovenia; Slovakia; Serbia; Sweden; United Kingdom; Iceland; Norway; Turkey; Switzerland

    Small parts of the national territory amounting to no more than 2% of the national population and the national territories listed below may be excluded from EU-SILC: France - French Overseas Departments and territories; Netherlands - The West Frisian Islands with the exception of Texel; Ireland - All offshore islands with the exception of Achill, Bull, Cruit, Gorumna, Inishnee, Lettermore, Lettermullan and Valentia; United Kingdom - Scotland north of the Caledonian Canal, the Scilly Islands.

    Analysis unit

    • Households;
    • Individuals 16 years and older.

    Universe

    The survey covered all household members over 16 years old. Persons living in collective households and in institutions are generally excluded from the target population.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    On the basis of various statistical and practical considerations and the precision requirements for the most critical variables, the minimum effective sample sizes to be achieved were defined. Sample size for the longitudinal component refers, for any pair of consecutive years, to the number of households successfully interviewed in the first year in which all or at least a majority of the household members aged 16 or over are successfully interviewed in both the years.

    For the cross-sectional component, the plans are to achieve the minimum effective sample size of around 131.000 households in the EU as a whole (137.000 including Iceland and Norway). The allocation of the EU sample among countries represents a compromise between two objectives: the production of results at the level of individual countries, and production for the EU as a whole. Requirements for the longitudinal data will be less important. For this component, an effective sample size of around 98.000 households (103.000 including Iceland and Norway) is planned.

    Member States using registers for income and other data may use a sample of persons (selected respondents) rather than a sample of complete households in the interview survey. The minimum effective sample size in terms of the number of persons aged 16 or over to be interviewed in detail is in this case taken as 75% of the figures shown in columns 3 and 4 of Table I, for the cross-sectional and longitudinal components respectively.

    The reference is to the effective sample size, which is the size required if the survey were based on simple random sampling (design effect in relation to the 'risk of poverty rate' variable = 1.0). The actual sample sizes will have to be larger to the extent that the design effects exceed 1.0 and to compensate for all kinds of non-response. Furthermore, the sample size refers to the number of valid households which are households for which, and for all members of which, all or nearly all the required information has been obtained. For countries with a sample of persons design, information on income and other data shall be collected for the household of each selected respondent and for all its members.
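
    The relation between effective and actual sample size described above can be sketched as follows. This is a minimal illustration, assuming the usual approximation that the gross sample equals the effective size multiplied by the design effect and divided by the expected response rate; the function name and the example figures are hypothetical and are not taken from the EU-SILC documentation.

```python
import math


def required_sample_size(effective_size: int, design_effect: float,
                         response_rate: float) -> int:
    """Gross sample needed to reach a target effective sample size.

    Assumed approximation: n_gross = n_effective * deff / response_rate.
    """
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(effective_size * design_effect / response_rate)


# Hypothetical country: effective target of 4,000 households,
# design effect 1.5, expected response rate 75%.
gross = required_sample_size(4000, design_effect=1.5, response_rate=0.75)
print(gross)
```

    With a design effect of 1.0 and full response, the gross sample equals the effective sample, which is the simple random sampling case the text refers to.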

    At the beginning, a cross-sectional representative sample of households is selected. It is divided into, say, 4 sub-samples, each by itself representative of the whole population and similar in structure to the whole sample. One sub-sample is purely cross-sectional and is not followed up after the first round. Respondents in the second sub-sample are requested to participate in the panel for 2 years, in the third sub-sample for 3 years, and in the fourth for 4 years. From year 2 onwards, one new panel is introduced each year, with a request for participation for 4 years. In any one year, the sample consists of 4 sub-samples, which together constitute the cross-sectional sample. In year 1 they are all new samples; in all subsequent years, only one is a new sample. In year 2, three are panels in the second year; in year 3, one is a panel in the second year and two in the third year; in subsequent years, one is a panel for the second year, one for the third year, and one for the fourth (final) year.
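
    The rotation scheme described above can be traced with a short simulation. This is a sketch only: the labels (`Y1-S1`, `Y2-new`, and so on) and the function name are made up for illustration, and the 4-sub-sample setup follows the "say 4" example in the text.

```python
def rotation_schedule(n_years: int, n_subsamples: int = 4) -> dict[int, list[tuple[str, int]]]:
    """For each survey year, list the sub-samples in the field together with
    the year of participation each one has reached."""
    # Each panel is (label, first_year, duration_in_years).
    panels = []
    # Year 1: sub-sample k is followed for k years (1, 2, 3, 4).
    for k in range(1, n_subsamples + 1):
        panels.append((f"Y1-S{k}", 1, k))
    # From year 2 onwards: one new panel per year, followed for the full duration.
    for year in range(2, n_years + 1):
        panels.append((f"Y{year}-new", year, n_subsamples))

    schedule = {}
    for year in range(1, n_years + 1):
        schedule[year] = [
            (label, year - first + 1)
            for label, first, duration in panels
            if first <= year < first + duration
        ]
    return schedule


schedule = rotation_schedule(5)
for year, active in schedule.items():
    print(year, active)
```

    In every year the cross-sectional sample is made up of four sub-samples, and from year 4 onwards they are in their first, second, third and fourth year of participation respectively.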

    According to the Commission Regulation on sampling and tracing rules, the selection of the sample will be drawn according to the following requirements:

    1. For all components of EU-SILC (whether survey or register based), the cross-sectional and longitudinal (initial sample) data shall be based on a nationally representative probability sample of the population residing in private households within the country, irrespective of language, nationality or legal residence status. All private households and all persons aged 16 and over within the household are eligible for the operation.
    2. Representative probability samples shall be achieved both for households, which form the basic units of sampling, data collection and data analysis, and for individual persons in the target population.
    3. The sampling frame and methods of sample selection shall ensure that every individual and household in the target population is assigned a known and non-zero probability of selection.
    4. By way of exception, paragraphs 1 to 3 shall apply in Germany exclusively to the part of the sample based on probability sampling according to Article 8 of the Regulation of the European Parliament and of the Council (EC) No 1177/2003 concerning Community Statistics on Income and Living Conditions.

    Article 8 of the EU-SILC Regulation of the European Parliament and of the Council states: 1. The cross-sectional and longitudinal data shall be based on nationally representative probability samples. 2. By way of exception to paragraph 1, Germany shall supply cross-sectional data based on a nationally representative probability sample for the first time for the year 2008. For the year 2005, Germany shall supply data for one fourth based on probability sampling and for three fourths based on quota samples, the latter to be progressively replaced by random selection so as to achieve fully representative probability sampling by 2008. For the longitudinal component, Germany shall supply for the year 2006 one third of longitudinal data (data for year 2005 and 2006) based on probability sampling and two thirds based on quota samples. For the year 2007, half of the longitudinal data relating to years 2005, 2006 and 2007 shall be based on probability sampling and half on quota sample. After 2007 all of the longitudinal data shall be based on probability sampling.

    Detailed information about sampling is available in Quality Reports in Related Materials.

    Mode of data collection

    Mixed

  10. Survey of Income and Living Conditions-Cross-Sectional Database 2017 - North...

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Dec 5, 2019
    Cite
    State Statistical Office of the Republic of Macedonia (2019). Survey of Income and Living Conditions-Cross-Sectional Database 2017 - North Macedonia [Dataset]. https://datacatalog.ihsn.org/catalog/8325
    Dataset updated
    Dec 5, 2019
    Dataset authored and provided by
    State Statistical Office of the Republic of Macedonia
    Time period covered
    2017
    Area covered
    North Macedonia
    Description

    Abstract

    The Survey of Income and Living Conditions (EU-SILC) is the European Union reference source for comparative statistics on income distribution and social exclusion at the European level, particularly in the context of the 'Programme of Community action to encourage cooperation between Member States to combat social exclusion' and for producing key policy indicators on social cohesion for the follow up of the EU2020 main target on poverty and social inclusion and flagship initiatives in related domains, e.g. in the context of the European Semester. It provides two types of annual data: Cross-sectional data pertaining to a given time or a certain time period with variables on income, poverty, social exclusion and other living conditions, and Longitudinal data pertaining to individual-level changes over time, observed periodically over a four-year period. The first priority is to be given to the delivery of comparable, timely and high quality data. The cross-sectional data is collected in two stages: An early subset of variables collected by register or interview to assess as early as possible poverty trends. A full set of variables provided along with the longitudinal data to produce main key policy indicators on social cohesion.

    Geographic coverage

    National

    Universe

    The reference population of EU-SILC is all private households and their current members residing in the territory of the Member States (MS) at the time of data collection. Persons living in collective households and in institutions are generally excluded from the target population.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    According to the Commission Regulation on sampling and tracing rules, the selection of the sample will be drawn according to the following requirements: For all components of EU-SILC (whether survey or register based), the cross-sectional and longitudinal (initial sample) data shall be based on a nationally representative probability sample of the population residing in private households within the country, irrespective of language, nationality or legal residence status. All private households and all persons aged 16 and over within the household are eligible for the operation. Representative probability samples shall be achieved both for households, which form the basic units of sampling, data collection and data analysis, and for individual persons in the target population. The sampling frame and methods of sample selection shall ensure that every individual and household in the target population is assigned a known and non-zero probability of selection.

    Mode of data collection

    Face-to-face [f2f]

  11. Integrated Household Living Conditions Survey - Wave 5, Cross-Sectional...

    • microdata.fao.org
    Updated Mar 7, 2021
    + more versions
    Cite
    National Institute of Statistics Rwanda (NISR) (2021). Integrated Household Living Conditions Survey - Wave 5, Cross-Sectional Sample, 2016-2017. - Rwanda [Dataset]. https://microdata.fao.org/index.php/catalog/1839
    Dataset updated
    Mar 7, 2021
    Dataset authored and provided by
    National Institute of Statistics Rwanda (NISR)
    Time period covered
    2016 - 2017
    Area covered
    Rwanda
    Description

    Abstract

    The EICV5 survey (Enquête Intégrale sur les Conditions de Vie des ménages) was conducted over a 12-month cycle from October 2016 to October 2017. Data collection was divided into 10 cycles in order to represent seasonality in the income and consumption data. A main cross-sectional sample survey, a panel survey and a VUP sample survey were conducted simultaneously.

    The objectives of the EICV5 Panel Survey are to measure the trends in key socioeconomic indicators over time for a nationally representative panel of households. EICV5 aims to provide timely and updated statistics to facilitate monitoring progress on poverty reduction programmes and evaluation of different policies as stipulated in the First National Strategy for Transformation (NST1), the 2030 Sustainable Development Goals (SDGs), as well as the Vision 2020 and Vision 2050. The survey data are also very important for national accounts and updating the consumer price index (CPI).

    Geographic coverage

    National coverage.

    Analysis unit

    Households

    Universe

    All household members

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The sampling frame for the EICV5 cross-sectional survey is based on the NISR master sample data. More recently, the NISR used the 2012 Census frame to select a large master sample of 3,960 villages that can be used for the different national household surveys in Rwanda. The primary sampling units (PSUs) for the Master Sample are individual villages, or a combination of small villages, with the number of households tabulated from the 2012 Census data. A new listing of households was conducted in order to update the frame for the EICV5 cross-sectional survey. The sample households in the EICV5 sample villages were selected from the new listing.

    1) The EICV5 Cross-sectional survey sample size

    The sample size for the EICV5 cross-sectional survey depends on the level of precision that is required for key indicators at the district level, as well as on resource constraints and logistical considerations. It is very important to ensure good quality control in order to minimize the non-sampling errors. The estimates of the sampling errors for the poverty rate by district from the EICV4 data were examined in order to determine whether it would be necessary to adjust the sample size. For EICV4 the number of households selected per cluster was 9 for Kigali Province, which is mostly urban, and 12 for the remaining provinces, which are mostly rural. This sampling strategy has been consistent for all the EICV surveys because it is statistically efficient and is also effective for the EICV logistics of the fieldwork and the workload of the team of enumerators each cycle. The urban areas generally have a higher intraclass correlation for socioeconomic characteristics between households within a cluster compared to rural areas. There is also a different interviewing schedule for the sample households in Kigali Province, so only 9 households are interviewed in each cluster. In terms of the number of sample clusters allocated to each district, it should be a multiple of 10 so that the sample can be evenly distributed to the 10 cycles. In the case of EICV4 the districts in Kigali Province were assigned 5 sample clusters each month, and in the other provinces each district was assigned 4 sample clusters each month.

    In EICV5, the sample was increased for the districts in Kigali Province because the estimates of the poverty rate for those districts had higher coefficients of variation (CVs) or relative standard errors (RSEs) compared to the other districts. However, one reason why the RSEs for the districts of Kigali Province were higher is that the value of the poverty rate is lower for these districts. It was pointed out that in the case of estimates of percentages or proportions, it is more effective to use the margin of error to study the sample size. The margin of error is equal to half of the width of the 95% confidence interval, or 1.96 times the standard error. Therefore, the margins of error for the estimates of the poverty rate by district were also examined. In this case the margins of error were also higher for the districts of Kigali Province, given the relatively higher design effects (especially for Gasabo District), and considering that the number of sample households for these districts in EICV4 was only 450, compared to 480 sample households in the districts of the other provinces. For these reasons, it was decided to increase the number of sample PSUs for each district in Kigali Province from 50 to 60, for a total increase of 30 sample clusters and 270 sample households. For the districts in the other provinces it was decided to have the same sample size of 40 clusters and 480 households each cycle, since the level of precision of the EICV4 results for these districts was considered satisfactory.
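
    The distinction between the relative standard error and the margin of error can be illustrated with a small calculation. This is a sketch under stated assumptions: the standard error uses the common approximation sqrt(deff * p * (1 - p) / n), and the poverty rates and the design effect below are hypothetical. Only the district sample sizes of 450 and 480 households come from the text.

```python
import math

Z_95 = 1.96  # multiplier for a 95% confidence interval, as stated in the text


def proportion_precision(p: float, n: int, design_effect: float = 1.0) -> dict:
    """Standard error, relative standard error and margin of error
    for an estimated proportion p from n sample households."""
    se = math.sqrt(design_effect * p * (1 - p) / n)
    return {"se": se, "rse": se / p, "moe": Z_95 * se}


# Hypothetical poverty rates: a low-poverty district with 450 households
# and a higher-poverty district with 480 households, same design effect.
low = proportion_precision(p=0.15, n=450, design_effect=2.0)
high = proportion_precision(p=0.45, n=480, design_effect=2.0)

print(f"low-poverty district:  RSE={low['rse']:.3f}  MoE={low['moe']:.3f}")
print(f"high-poverty district: RSE={high['rse']:.3f}  MoE={high['moe']:.3f}")
```

    With these made-up inputs the low-poverty district has the larger RSE even though its margin of error is smaller, which is why the RSE alone can overstate imprecision where the rate is low. In the actual EICV4 results, the text reports that the margins of error were also higher for the Kigali districts because of their larger design effects and smaller samples.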

    The sample PSUs in each district were allocated to the urban and rural strata proportionately to the number of households in the 2012 Census frame. In the case of districts where the proportional number of sample PSUs was only 1 for the urban stratum, the number of sample PSUs was increased to 2. For the selection of sample villages for EICV5, it was assumed that the Master Sample villages for each district were explicitly stratified by urban and rural areas. A separate subsample of villages was selected within each stratum from the Master Sample.

    At the national level, there are 1,260 sample villages and 14,580 sample households. In the urban strata there are 245 sample villages and 2,526 sample households, and in the rural strata there are 1,015 sample villages and 12,054 sample households. The sample size for the EICV5 cross-sectional survey has 30 more sample PSUs and 270 more sample households than the corresponding sample for EICV4.
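
    The national totals can be reproduced from the per-district figures given in the text. The check below is a sketch: the number of Kigali districts (3) is not stated directly but is implied by the increase of 30 clusters at 10 extra clusters per district, and 27 is the number of districts outside Kigali Province mentioned in the text.

```python
KIGALI_DISTRICTS = 3   # implied: 30 extra clusters / 10 extra clusters per district
OTHER_DISTRICTS = 27   # districts outside Kigali Province

KIGALI_CLUSTERS_EICV4, KIGALI_CLUSTERS_EICV5 = 50, 60
OTHER_CLUSTERS = 40
KIGALI_HH_PER_CLUSTER, OTHER_HH_PER_CLUSTER = 9, 12

# Total sample villages (clusters) and households for EICV5.
villages = (KIGALI_DISTRICTS * KIGALI_CLUSTERS_EICV5
            + OTHER_DISTRICTS * OTHER_CLUSTERS)
households = (KIGALI_DISTRICTS * KIGALI_CLUSTERS_EICV5 * KIGALI_HH_PER_CLUSTER
              + OTHER_DISTRICTS * OTHER_CLUSTERS * OTHER_HH_PER_CLUSTER)

# Increase relative to EICV4, which affects the Kigali districts only.
extra_clusters = KIGALI_DISTRICTS * (KIGALI_CLUSTERS_EICV5 - KIGALI_CLUSTERS_EICV4)
extra_households = extra_clusters * KIGALI_HH_PER_CLUSTER

print(villages, households, extra_clusters, extra_households)
```

    The results agree with the reported urban and rural breakdown: 245 + 1,015 villages and 2,526 + 12,054 households.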

    In the case of EICV4 the national sample of 177 villages selected from EICV3 for the Panel Survey were also used as part of the EICV4 cross-sectional survey. However, for EICV5 it was decided to select a completely separate sample of villages for the cross-sectional survey.

    2) Assignment of sample villages to cycles and sub-cycles

    Similar to the EICV4 methodology, a nationally-representative sample of clusters will be assigned for the EICV5 data collection each cycle, so that the sample is geographically representative over time. A subsample serial number from 1 to 10 can be assigned systematically to the geographically ordered list of all sample clusters in each district. In order to assign the cycles to the EICV5 cross-sectional sample villages, random cycle numbers from 1 to 10 were generated to identify the selection sequence. For the 27 districts outside of Kigali Province, the sub-cycle numbers of 1 or 2 were assigned systematically with a random start. This process ensured that the final distribution of the sample clusters to cycles and sub-cycles was geographically representative within each district.

    Mode of data collection

    Face-to-face paper [f2f]

    Research instrument

    The same questionnaire was used for cross-sectional, panel and VUP samples. Part A of the questionnaire contains modules on household and individual information. Part B is on agriculture and consumption. The questionnaire was developed in English, and translated into Kinyarwanda.

    Questionnaire design took into account the requests raised by major data users and stakeholders, as well as consistency with the previous EICV questionnaires. In addition to methodological improvements, some simplifications were made:

    - The major changes introduced in this survey were to Section 6, Economic Activity. Further questions were added on unemployment and underemployment in response to questions from users, and also to comply with international standards. The section was simplified to enable the analysis to be undertaken by local analysts.

    - The section on VUP participation was expanded to provide more information, a better classification of beneficiaries, and greater consistency within the questionnaire. The same questionnaire is to be used on the separate VUP sample, which runs in parallel with EICV5.

    The questionnaire was tested in pilot surveys and amended before fieldwork started in October 2016. The complete questionnaire is provided as an external resource.

    Cleaning operations

    The day before interviews started, the enumerator, accompanied by a controller, introduced the survey to the household, explained how often the team would visit, and delivered a letter indicating that the household (HH) had been selected.

    During the field work, after each cycle, the data processing team produced tables and reports of inconsistencies, which were checked by the field supervisor. The data entry system also contained consistency checks that alerted the data entry operators. In case of an alert, the questionnaire was sent back to the supervisor of data entry for correction.

    Response rate

    The response rate for EICV5 (cross-sectional) is 100%. All 14,580 sampled households were interviewed, with no refusals.

  12. Pricing example and sample data for "Cross-Sectional Variation of...

    • dataverse.harvard.edu
    Updated Nov 10, 2025
    Cite
    LIUREN WU (2025). Pricing example and sample data for "Cross-Sectional Variation of Risk-targeting Option Portfolios" [Dataset]. http://doi.org/10.7910/DVN/G2YIUR
    Dataset updated
    Nov 10, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    LIUREN WU
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The excel file contains one day's data on one stock and shows how to construct risk-targeting option portfolios and estimate the market price of risk for each risk dimension. The Internet Appendix describes the operations in the excel file.

  13. UKHLS

    • beta.ukdataservice.ac.uk
    Updated Oct 21, 2022
    + more versions
    Cite
    UK Data Service (2022). UKHLS [Dataset]. http://doi.org/10.5255/UKDA-SN-9019-1
    Dataset updated
    Oct 21, 2022
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    Area covered
    United Kingdom
    Description

    As the UK went into the first lockdown of the COVID-19 pandemic, the team behind the biggest social survey in the UK, Understanding Society (UKHLS), developed a way to capture these experiences. From April 2020, participants from this Study were asked to take part in the Understanding Society COVID-19 survey, henceforth referred to as the COVID-19 survey or the COVID-19 study.

    The COVID-19 survey regularly asked people about their situation and experiences. The resulting data gives a unique insight into the impact of the pandemic on individuals, families, and communities. The COVID-19 Teaching Dataset contains data from the main COVID-19 survey in a simplified form. It covers topics such as

    • Socio-demographics
    • Whether working at home and home-schooling
    • COVID symptoms
    • Health and well-being
    • Social contact and neighbourhood cohesion
    • Volunteering

    The resource contains two data files:

    • Cross-sectional: contains data collected in Wave 4 in July 2020 (with some additional variables from other waves);
    • Longitudinal: contains mainly data from Waves 1, 4 and 9 with key variables measured at three time points.

    Key features of the dataset

    • Missing values: in the web survey, participants clicking "Next" but not answering a question were given further options such as "Don't know" and "Prefer not to say". Missing observations like these are recorded using negative values such as -1 for "Don't know". In many instances, users of the data will need to set these values as missing. The User Guide includes Stata and SPSS code for setting negative missing values to system missing.
    • The Longitudinal file is a balanced panel and is in wide format. A balanced panel means it only includes participants that took part in every wave. In wide format, each participant has one row of information, and each measurement of the same variable is a different variable.
    • Weights: both the cross-sectional and longitudinal files include survey weights that adjust the sample to represent the UK adult population. The cross-sectional weight (betaindin_xw) adjusts for unequal selection probabilities in the sample design and for non-response. The longitudinal weight (ci_betaindin_lw) adjusts for the sample design and also for the fact that not all those invited to participate in the survey do participate in all waves.
    • Both the cross-sectional and longitudinal datasets include the survey design variables (psu and strata).
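
    The handling of negative missing codes and survey weights described above can be sketched in plain Python. This is only an illustration: the User Guide provides Stata and SPSS code, not the code below, and the `wellbeing` variable and the three records are invented. `betaindin_xw` is the cross-sectional weight named in the description.

```python
def negative_to_missing(value):
    """Map negative codes (e.g. -1 for "Don't know") to None."""
    return None if value is not None and value < 0 else value


def weighted_mean(values, weights):
    """Weighted mean over the observations that are not missing."""
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no non-missing observations with positive weight")
    return sum(v * w for v, w in pairs) / total_weight


# Hypothetical records; "wellbeing" is a made-up variable name.
records = [
    {"wellbeing": 3, "betaindin_xw": 1.0},
    {"wellbeing": -1, "betaindin_xw": 2.0},   # "Don't know"
    {"wellbeing": 5, "betaindin_xw": 3.0},
]

cleaned = [negative_to_missing(r["wellbeing"]) for r in records]
weights = [r["betaindin_xw"] for r in records]
estimate = weighted_mean(cleaned, weights)
print(cleaned, estimate)
```

    This gives a weighted point estimate only. Standard errors that respect the sample design would also need the psu and strata variables, which this sketch does not use.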

    A full list of variables in both files can be found in the User Guide appendix.

    Who is in the sample?

    All adults (16 years old and over as of April 2020), in households who had participated in at least one of the last two waves of the main study Understanding Society, were invited to participate in this survey. From the September 2020 (Wave 5) survey onwards, only sample members who had completed at least one partial interview in any of the first four web surveys were invited to participate. From the November 2020 (Wave 6) survey onwards, those who had only completed the initial survey in April 2020 and none since were no longer invited to participate.

    The User guide accompanying the data adds to the information here and includes a full variable list with details of measurement levels and links to the relevant questionnaire.

  14. Russia Longitudinal Monitoring Survey - Higher School of Economics 1995 -...

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Mar 29, 2019
    + more versions
    Cite
    National Research University Higher School of Economics (2019). Russia Longitudinal Monitoring Survey - Higher School of Economics 1995 - Russian Federation [Dataset]. https://datacatalog.ihsn.org/catalog/6193
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    National Research University Higher School of Economics
    Carolina Population Center
    ZAO "Demoscope"
    Time period covered
    1995
    Area covered
    Russia
    Description

    Abstract

    The Russia Longitudinal Monitoring Survey (RLMS) is a household-based survey designed to measure the effects of Russian reforms on the economic well-being of households and individuals. In particular, determining the impact of reforms on household consumption and individual health is essential, as most of the subsidies provided to protect food production and health care have been or will be reduced, eliminated, or at least dramatically changed. These effects are measured by a variety of means: detailed monitoring of individuals' health status and dietary intake, precise measurement of household-level expenditures and service utilization, and collection of relevant community-level data, including region-specific prices and community infrastructure data. Data have been collected since 1992.

    As its name implies, the RLMS is a longitudinal study of populations of dwelling units. Rounds V-VII are designed to provide a repeated cross-section sampling. Barring the construction of major new housing structures, renewed contact with a fixed national probability sample of dwelling units provides high coverage cross-sectional representation. The repeat visit at each round to a static sample of dwelling units also introduces a correlation between successive samples that leads to improved efficiency in longitudinal analyses comparing aggregate statistics.

    The repeated cross-section design is far and away the simplest alternative for the RLMS. The sampling is cost efficient, easy to maintain, and easy to update when needed. The design supports both efficient cross-sectional and aggregate longitudinal analyses of change in the Russian household population. Updates to the sample, including a full replenishment of the probability sample of dwelling units, will not seriously disrupt the longitudinal data series.

    Geographic coverage

    National

    Analysis unit

    Households and individuals.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The goal was to develop a sample of households (excluding institutionalized people) that would meet accepted scientific standards of a true probability sample to the greatest extent possible, while taking into account the severe operational constraints of Goskomstat. With the advice of William Kalsbeek [a sampling expert at the University of North Carolina at Chapel Hill (UNC-CH)] and later with help from Leslie Kish, the project developed a replicated three-stage stratified cluster sample of residential addresses, excluding military, penal, and other institutionalized populations. Replication was designated for Stage 1 of sampling so that the number of primary sampling units (PSUs) could be kept manageable, with the understanding that later they would be expanded. The sample size of each replicate was set at 20 PSUs. The quality of this sample was statistically analyzed.

    Sample attrition due to nonresponse cannot be avoided. Table 1 summarizes RLMS Round V interview completion rates for the original sample of dwelling units in the eight regions that comprise the survey population. These are not response rates; each denominator includes dwelling units that were vacant or uninhabitable at the time of the Round V interviews. Overall, interviews were completed in 84.3% of the original national probability sample of n=4718 dwelling units.

    Interview completion rates outside St. Petersburg, Moscow City, and Moscow Oblast range from 84.8% in the combined Central/Central Black Earth region to 92.6% in Western Siberia. Rates in the highly urban Moscow/St. Petersburg region are much lower. In part, these rates may reflect higher vacancy rates in metropolitan areas, but clearly lower household contact and response rates also come into play. Lower rates in Moscow and St. Petersburg were anticipated at the design stage, and initial allocations to these strata were increased to offset expected losses from refusal and noncontact. This is one form of what we might call "designing for nonresponse." The over-sampling strategy is beneficial in that it means reduced variability in the final analysis weights (due to the offset in the product of higher sample selection probability and lower response propensity); however, over-sampling eliminates the potential for bias only if attrition is occurring at random within the final weighting adjustment cells.
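
    The offsetting effect of over-sampling on the final weights can be shown with a toy calculation. This is a sketch under stated assumptions: the weight is taken as the inverse of the product of selection probability and response propensity, and all probabilities and response rates below are hypothetical rather than RLMS values.

```python
def analysis_weight(selection_prob: float, response_propensity: float) -> float:
    """Inverse of the overall probability of ending up in the responding sample."""
    return 1.0 / (selection_prob * response_propensity)


def spread(weights: dict) -> float:
    """Ratio of the largest to the smallest weight (1.0 means no variability)."""
    return max(weights.values()) / min(weights.values())


# Hypothetical expected response rates by stratum.
response = {"metropolitan": 0.60, "other": 0.90}
BASE_PROB = 0.001   # hypothetical common selection probability
TARGET = 0.90       # response rate used as the reference level

# Design A: same selection probability everywhere.
equal = {s: analysis_weight(BASE_PROB, r) for s, r in response.items()}

# Design B: selection probability raised where response is expected to be low.
oversampled = {
    s: analysis_weight(BASE_PROB * TARGET / r, r) for s, r in response.items()
}

print(spread(equal), spread(oversampled))
```

    Raising the selection probability in the low-response stratum equalizes the final weights, which is the reduced variability the text describes. It says nothing about bias, which depends on whether nonresponse is random within the adjustment cells.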

    If independent samples were developed for each round of the repeated cross-section design, attrition in one round would be independent of (although possibly similar in nature to) that in other rounds. However, since the RLMS uses a static sample of dwellings across multiple rounds, the impact of nonresponse and attrition is the net effect of several factors. Round V attrition bias can arise only from differential nonresponse and noncontact for subclasses of households that occupy the original sample of dwelling units. The potential for nonresponse bias in cross-sectional analysis or contrasts involving the Rounds VI and VII data is a complex function of: (1) initial nonresponse in Round V; (2) net difference in characteristics of households and individuals who move out of or into sample dwellings; (3) nonresponse on the part of old households continuing to reside in sample dwelling units; and (4) nonresponse on the part of new households currently living in sample dwelling units.

    Time did not permit analysis of each of these factors. Instead, I performed several simple analyses of the net effect of household turnover and nonresponse on the marginal sample distributions (unweighted) of population characteristics that should not change significantly over time.

    The general observation is that the combined influence of nonresponse attrition and household turnover does not seriously distort the geographic distribution of the sample or its size or household-head characteristics. The distributions for the geographic variables indicate that, between Round V and Round VII, there is a decline in the nominal representation of households in the Moscow/St. Petersburg region, reflected in a decline in the proportion of sample households from the urban domain. Households with a male head aged 18-59 may be subject to slightly higher than average attrition/net loss in replacement. If we focus only on these characteristics, the problem is not serious.

    In summary, the net effect of nonresponse attrition and change in dwelling unit occupants across rounds on the marginal characteristics of the observed cross-sectional samples is modest. Loss in nominal "sample share" between Rounds V and VII is greatest for residents of Moscow/St. Petersburg--a loss in representation that is readily corrected with the combined sample selection/nonresponse adjustment factors that have been computed for each round. It is important to note that the simple analysis described here cannot demonstrate that no uncorrected attrition bias remains. The potential for uncorrected nonresponse bias can be specific to the dependent variable under study. Nevertheless, it appears that, with the nonresponse and post-stratification adjustments developed by Michael Swafford, the potential for serious attrition bias in repeated cross-section analysis is small.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaires are English-language translations of the original Russian questionnaires. The English versions have been translated as literally as possible. The order of the questions and the layout of the pages have been preserved in the English versions.

    The questionnaires are also designed to function as codebooks. The variable names, as they appear in the data sets, are usually listed below or to the left of the questions. If the abbreviation (char) appears with a variable name, then the responses to that question are stored in a character variable. If there is no variable name associated with a particular question, then the responses to that question do not appear in the data set. Some questions in the questionnaires are color coded. Pink means that the question was added. Green indicates changes from the previous round (e.g., year). Gray means that the questions were asked, but the data are not available for public use - the questions were added at the request of the Pension Office and are for their use only.

    Cleaning operations

    In Phase II (Rounds V - XX), when questionnaires were returned to local supervisors, those supervisors were required to examine them to locate problems that could best be remedied in the field, e.g., by returning to get key demographic information or cleaning ID numbers so that the roster of individuals located in the household questionnaire matched those on the individual questionnaires from that household. The questionnaires were then transported to Moscow, where yet another ID check was performed.

    In Moscow, coders looked through all questionnaires to code so-called "other: specify" responses. However, open-ended questions (e.g., occupation questions) were not coded at this time. Instead, their texts were fully entered as long string variables. Entering the open-ended answers as character variables offered several advantages. First, it allowed data entry to begin immediately, with no delay for coding. Second, it permitted the use of computer programs to assist in coding the string variables. Third, the method allowed any user of the original data sets to recode the character variables to suit his or her purposes without going back to the paper copies of the questionnaires.

    All data entry was handled in-house using the SPSS data entry program on PCs.

  15. Data from: The validity of self-reported weight in US adults: a population...

    • catalog.data.gov
    • data.virginia.gov
    Updated Sep 6, 2025
    National Institutes of Health (2025). The validity of self-reported weight in US adults: a population based cross-sectional study [Dataset]. https://catalog.data.gov/dataset/the-validity-of-self-reported-weight-in-us-adults-a-population-based-cross-sectional-study
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Area covered
    United States
    Description

    Background: Investigating the validity of the self-reported values of weight allows for the proper assessment of studies using questionnaire-derived data. The study examined the accuracy of gender-specific self-reported weight in a sample of adults. The effects of age, education, race and ethnicity, income, general health and medical status on the degree of discrepancy (the difference between self-reported weight and measured weight) are similarly considered.

    Methods: The analysis used data from the US Third National Health and Nutrition Examination Survey. Self-reported and measured weights were abstracted and analyzed according to sex, age, measured weight, self-reported weight, and body mass index (BMI). A proportional odds model was applied.

    Results: The weight discrepancy was positively associated with age, and negatively associated with measured weight and BMI. Ordered logistic regression modeling showed age, race-ethnicity, education, and BMI to be associated with the degree of discrepancy in both sexes. In men, additional predictors were consumption of more than 100 cigarettes and the desire to change weight. In women, marital status, income, activity level, and the number of months since the last doctor's visit were important.

    Conclusions: Predictors of the degree of weight discrepancy are gender-specific, and require careful consideration when examined.

  16. Global Indicators 2015 Dataset (Cross-Sectional)

    • dataverse.harvard.edu
    Updated Dec 18, 2017
    Miguel Centellas (2017). Global Indicators 2015 Dataset (Cross-Sectional) [Dataset]. http://doi.org/10.7910/DVN/ZN6MWY
    Dataset updated
    Dec 18, 2017
    Dataset provided by
    Harvard Dataverse
    Authors
    Miguel Centellas
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This is a small dataset of various global indicators developed for use in a course teaching research methods at the Croft Institute for International Studies at the University of Mississippi. The data is ready to be directly imported into SPSS, Stata, or other statistical packages. A brief codebook includes descriptions of each variable, the indicator's reference year(s), and links to the original sources. The data is cross-sectional, country-level data centered on 2015 as the primary reference year. Some data come from the most recent election or averages from a handful of years. The dataset includes socioeconomic and political data drawn from sources and indicators from the World Bank, the UNDP, and International IDEA. It also includes popular indexes (and some key components) from Freedom House, Polity IV, the Economist's Democracy Index, the Heritage Foundation's Index of Economic Freedom, and the Fund for Peace's Fragile States Index. The dataset also includes various types of data (nominal, ordinal, interval, and ratio), useful for pedagogical examples of how to handle statistical data.

  17. General Social Survey 2010 Cross-Section and Panel Combined

    • thearda.com
    Updated 2010
    The Association of Religion Data Archives (2010). General Social Survey 2010 Cross-Section and Panel Combined [Dataset]. http://doi.org/10.17605/OSF.IO/C6G27
    Dataset updated
    2010
    Dataset provided by
    Association of Religion Data Archives
    Dataset funded by
    National Science Foundation
    Description

    The General Social Surveys (GSS) have been conducted by the National Opinion Research Center (NORC) annually since 1972, except for the years 1979, 1981, and 1992 (a supplement was added in 1992), and biennially beginning in 1994. The GSS are designed to be part of a program of social indicator research, replicating questionnaire items and wording in order to facilitate time-trend studies. This data file has all cases and variables asked on the 2010 GSS. There are a total of 4,901 cases in the data set but their initial sampling years vary because the GSS now contains panel cases. Sampling years can be identified with the variable SAMPTYPE.

    The 2010 GSS featured special modules on aging, the Internet, shared capitalism, gender roles, intergroup relations, immigration, meeting spouse, knowledge about and attitudes toward science, religious identity, religious trends, genetics, veterans, crime and victimization, social networks and group membership, and sexual behavior (continuing the series started in 1988).

    The GSS has switched from a repeating, cross-section design to a combined repeating cross-section and panel-component design. The 2006 GSS was the base year for the first panel. A sub-sample of 2,000 GSS cases from 2006 was selected for reinterview in 2008 and again in 2010 as part of the GSSs in those years. The 2008 GSS consists of a new cross-section plus the reinterviews from 2006. The 2010 GSS consists of a new cross-section of 2,044, the first reinterview wave of the 2,023 2008 panel cases with 1,581 completed cases, and the second and final reinterview of the 2006 panel with 1,276 completed cases. Altogether, the 2010 GSS had 4,901 cases (2,044 in the new 2010 panel, 1,581 in the 2008 panel, and 1,276 in the 2006 panel). The 2010 GSS is the first round to fully implement the new, rolling panel design. In 2012 and later GSSs, there will likewise be a fresh cross-section (wave one of a new panel), wave two panel cases from the immediately preceding GSS, and wave three panel cases from the next earlier GSS.
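The case counts in the paragraph above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the dictionary keys and variable names are invented here, and the two shares simply divide completed reinterviews by the 2,023 cases in the 2008 panel and by the 2,000 cases selected from 2006, as quoted in the description.

```python
# Case counts quoted in the GSS 2010 description.
cases_2010 = {
    "new 2010 cross-section": 2044,
    "2008 panel, first reinterview": 1581,
    "2006 panel, second reinterview": 1276,
}
total_cases = sum(cases_2010.values())

# Completed reinterviews as a share of the cases in each panel.
share_2008_panel = 1581 / 2023  # completed cases / 2008 panel cases
share_2006_panel = 1276 / 2000  # completed cases / cases selected in 2006

print(total_cases)
print(f"{share_2008_panel:.1%} of the 2008 panel, {share_2006_panel:.1%} of the 2006 panel")
```

The total matches the 4,901 cases stated for the combined file.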

    To download syntax files for the GSS that reproduce well-known religious group recodes, including RELTRAD, please visit the ARDA's Syntax Repository (/research/syntax-repository-list).

  18. High Frequency Phone Survey, Continuous Data Collection 2023 - Papua New...

    • microdata.pacificdata.org
    Updated Apr 30, 2025
    William Seitz (2025). High Frequency Phone Survey, Continuous Data Collection 2023 - Papua New Guinea [Dataset]. https://microdata.pacificdata.org/index.php/catalog/877
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    Darian Naidoo
    William Seitz
    Time period covered
    2023 - 2025
    Area covered
    Papua New Guinea
    Description

    Abstract

    Access to up-to-date socio-economic data is a widespread challenge in Papua New Guinea and other Pacific Island Countries. To increase data availability and promote evidence-based policymaking, the Pacific Observatory provides innovative solutions and data sources to complement existing survey data and analysis. One of these data sources is a series of High Frequency Phone Surveys (HFPS), which began in 2020 as a way to monitor the socio-economic impacts of the COVID-19 Pandemic, and since 2023 has grown into a series of continuous surveys for socio-economic monitoring. See https://www.worldbank.org/en/country/pacificislands/brief/the-pacific-observatory for further details.

    For PNG, after five rounds of data collection from 2020-2022, in April 2023 a monthly HFPS data collection commenced and continued for 18 months (ending September 2024), on topics including employment, income, food security, health, food prices, assets and well-being. This followed an initial pilot of the data collection from January 2023-March 2023. Data for April 2023-September 2023 were a repeated cross section, while October 2023 established the first month of a panel, which is ongoing as of March 2025. For each month, approximately 550-1000 households were interviewed. The sample is representative of urban and rural areas but is not representative at the province level. This dataset contains combined monthly survey data for all months of the continuous HFPS in PNG. There is one data file for household-level data with a unique household ID, and separate files for individual-level data within each household and for household food price data, both of which can be matched to the household file using the household ID. A unique individual ID within the household data can be used to track individuals over time within households.

    Geographic coverage

    Urban and rural areas of Papua New Guinea

    Analysis unit

    Household, Individual

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The initial sample was drawn through Random Digit Dialing (RDD) with geographic stratification from a large random sample of Digicel’s subscribers. As an objective of the survey was to measure changes in household economic wellbeing over time, the HFPS sought to contact a consistent number of households across each province month to month. This was initially a repeated cross section from April 2023-Dec 2023. The resulting overall sample has a probability-based weighted design, with a proportionate stratification to achieve a proper geographical representation. More information on sampling for the cross-sectional monthly sample can be found in previous documentation for the PNG HFPS data.

    A monthly panel was established in October 2023, that is ongoing as of March 2025. In each subsequent round of data collection after October 2024, the survey firm would first attempt to contact all households from the previous month, and then attempt to contact households from earlier months that had dropped out. After previous numbers were exhausted, RDD with geographic stratification was used for replacement households.

    Mode of data collection

    Computer Assisted Telephone Interview [cati]

    Research instrument

    The questionnaire, which can be found in the External Resources of this documentation, is in English with a Pidgin translation.

    The survey instrument for Q1 2025 consists of the following modules: 1. Basic Household information, 2. Household Roster, 3. Labor, 4a. Food security, 4b. Food prices, 5. Household income, 6. Agriculture, 8. Access to services, 9. Assets, 10. Wellbeing and shocks, 10a. WASH

    Cleaning operations

    The raw data were cleaned by the World Bank team using STATA. This included formatting and correcting errors identified through the survey’s monitoring and quality control process. The data are presented in two datasets: a household dataset and an individual dataset. The individual dataset contains information on individual demographics and labor market outcomes of all household members aged 15 and above, and the household data set contains information about household demographics, education, food security, food prices, household income, agriculture activities, social protection, access to services, and durable asset ownership. The household identifier (hhid) is available in both the household dataset and the individual dataset. The individual identifier (id_member) can be found in the individual dataset.
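The linkage described above (individual records matched to household records through the household identifier) can be sketched as follows. This is a minimal illustration using only the Python standard library and toy records. The identifiers hhid and id_member come from the documentation, while the other field names (urban, age) and the function name attach_household are hypothetical.

```python
# Toy records; only hhid and id_member are names taken from the documentation.
households = [
    {"hhid": "H001", "urban": 1},
    {"hhid": "H002", "urban": 0},
]
individuals = [
    {"hhid": "H001", "id_member": 1, "age": 34},
    {"hhid": "H001", "id_member": 2, "age": 17},
    {"hhid": "H002", "id_member": 1, "age": 52},
]

def attach_household(individuals, households):
    """Return individual rows extended with the fields of their household."""
    by_hhid = {h["hhid"]: h for h in households}  # one row per household
    merged = []
    for person in individuals:
        household = by_hhid.get(person["hhid"])
        if household is None:
            continue  # individual without a matching household row is skipped
        merged.append({**household, **person})
    return merged

linked = attach_household(individuals, households)
for row in linked:
    print(row["hhid"], row["id_member"], row["urban"], row["age"])
```

Within one survey month, the pair (hhid, id_member) is what identifies a person in this sketch; following individuals across months would also need a month or round variable, whose name is not given in the description.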

  19. Labour Force Survey Two-Quarter Longitudinal Dataset, July - December, 2024

    • datacatalogue.cessda.eu
    Updated Feb 28, 2025
    Office for National Statistics (2025). Labour Force Survey Two-Quarter Longitudinal Dataset, July - December, 2024 [Dataset]. http://doi.org/10.5255/UKDA-SN-9348-1
    Dataset updated
    Feb 28, 2025
    Authors
    Office for National Statistics
    Time period covered
    Jul 1, 2024 - Dec 31, 2024
    Area covered
    United Kingdom
    Variables measured
    Individuals
    Measurement technique
    Compilation or synthesis of existing material. The datasets were created from existing LFS data. They do not contain all records, but only those of respondents of working age who have responded to the survey in all the periods being linked. The data therefore comprise a subset of variables representing approximately one third of all QLFS variables. Cases were linked using the QLFS panel design.
    Description

    Abstract copyright UK Data Service and data collection copyright owner.

    Background
    The Labour Force Survey (LFS) is a unique source of information using international definitions of employment and unemployment and economic inactivity, together with a wide range of related topics such as occupation, training, hours of work and personal characteristics of household members aged 16 years and over. It is used to inform social, economic and employment policy. The LFS was first conducted biennially from 1973-1983. Between 1984 and 1991 the survey was carried out annually and consisted of a quarterly survey conducted throughout the year and a 'boost' survey in the spring quarter (data were then collected seasonally). From 1992 quarterly data were made available, with a quarterly sample size approximately equivalent to that of the previous annual data. The survey then became known as the Quarterly Labour Force Survey (QLFS). From December 1994, data gathering for Northern Ireland moved to a full quarterly cycle to match the rest of the country, so the QLFS then covered the whole of the UK (though some additional annual Northern Ireland LFS datasets are also held at the UK Data Archive). Further information on the background to the QLFS may be found in the documentation.

    Longitudinal data
    The LFS retains each sample household for five consecutive quarters, with a fifth of the sample replaced each quarter. The main survey was designed to produce cross-sectional data, but the data on each individual have now been linked together to provide longitudinal information. The longitudinal data comprise two types of linked datasets, created using the weighting method to adjust for non-response bias. The two-quarter datasets link data from two consecutive waves, while the five-quarter datasets link across a whole year (for example January 2010 to March 2011 inclusive) and contain data from all five waves. A full series of longitudinal data has been produced, going back to winter 1992. Linking together records to create a longitudinal dimension can, for example, provide information on gross flows over time between different labour force categories (employed, unemployed and economically inactive). This will provide detail about people who have moved between the categories. Also, longitudinal information is useful in monitoring the effects of government policies and can be used to follow the subsequent activities and circumstances of people affected by specific policy initiatives, and to compare them with other groups in the population. There are however methodological problems which could distort the data resulting from this longitudinal linking. The ONS continues to research these issues and advises that the presentation of results should be carefully considered, and warnings should be included with outputs where necessary.
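The rotation scheme described above (five consecutive quarters per household, one fifth of the sample replaced each quarter) determines how much of the sample two quarters have in common. The sketch below is a simplified illustration, assuming equal-sized entry cohorts and no attrition or non-response; the function names and the integer quarter index are hypothetical.

```python
WAVES = 5  # each household stays in the sample for five consecutive quarters

def cohorts_in_sample(quarter):
    """Entry quarters of the cohorts interviewed in the given quarter."""
    return set(range(quarter - WAVES + 1, quarter + 1))

def overlap_share(q1, q2):
    """Share of the quarter-q1 sample that is also interviewed in quarter q2."""
    shared = cohorts_in_sample(q1) & cohorts_in_sample(q2)
    return len(shared) / WAVES

print(overlap_share(10, 11))  # consecutive quarters (two-quarter datasets)
print(overlap_share(10, 14))  # first and fifth wave (five-quarter datasets)
```

Under these assumptions, consecutive quarters share four of five cohorts, while quarters a year apart share one. Because the linked files keep only respondents who answered in every linked period, the realised shares would be smaller.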

    New reweighting policy
    Following the new reweighting policy, ONS has reviewed the latest population estimates made available during 2019 and has decided not to carry out a 2019 LFS and APS reweighting exercise. Therefore, the next reweighting exercise will take place in 2020. This will incorporate the 2019 Sub-National Population Projection data (published in May 2020) and 2019 Mid-Year Estimates (published in June 2020). It is expected that reweighted Labour Market aggregates and microdata will be published towards the end of 2020/early 2021.

    LFS Documentation
    The documentation available from the Archive to accompany LFS datasets largely consists of the latest version of each user guide volume alongside the appropriate questionnaire for the year concerned. However, volumes are updated periodically by ONS, so users are advised to check the latest documents on the ONS Labour Force Survey - User Guidance pages before commencing analysis. This is especially important for users of older QLFS studies, where information and guidance in the user guide documents may have changed over time.

    Additional data derived from the QLFS
    The Archive also holds further QLFS series: End User Licence (EUL) quarterly data; Secure Access datasets; household datasets; quarterly, annual and ad hoc module datasets compiled for Eurostat; and some additional annual Northern Ireland datasets.

    Variables DISEA and LNGLST
    Dataset A08 (Labour market status of disabled people) which ONS suspended due to an apparent discontinuity between April to June 2017 and July to September 2017 is now available. As a result of this apparent discontinuity and the inconclusive...

  20. Enterprise Survey 2009-2014, Panel Data - Malawi

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Oct 7, 2015
    World Bank (2015). Enterprise Survey 2009-2014, Panel Data - Malawi [Dataset]. https://microdata.worldbank.org/index.php/catalog/2360
    Dataset updated
    Oct 7, 2015
    Dataset provided by
    World Bank Group (http://www.worldbank.org/)
    Authors
    World Bank
    Time period covered
    2009 - 2014
    Area covered
    Malawi
    Description

    Abstract

    The documented dataset covers Enterprise Survey (ES) panel data collected in Malawi in 2009 and 2014, as part of Africa Enterprise Surveys roll-out, an initiative of the World Bank.

    New Enterprise Surveys target a sample consisting of longitudinal (panel) observations and new cross-sectional data. Panel firms are prioritized in the sample selection, comprising up to 50% of the sample in the current wave. For all panel firms, regardless of the sample, current eligibility or operating status is determined and included in panel datasets.

    Malawi ES 2014 was conducted between April 2014 and February 2015; Malawi ES 2009 was carried out between May and July 2009. The objective of the Enterprise Survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the survey assesses the constraints to private sector growth and creates statistically significant business environment indicators that are comparable across countries.

    Stratified random sampling was used to select the surveyed businesses. The data was collected using face-to-face interviews.

    Data from 673 establishments was analyzed: 436 were from the 2014 ES only, 63 from the 2009 ES only, and 174 from both the 2009 and 2014 rounds.

    The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs and labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90 percent of the questions objectively measure characteristics of a country’s business environment. The remaining questions assess the survey respondents’ opinions on what are the obstacles to firm growth and performance.

    Geographic coverage

    National

    Analysis unit

    The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.

    Universe

    The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural private economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors. Companies with 100% government ownership are not eligible to participate in the Enterprise Surveys.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    For the Malawi ES, multiple sample frames were used: a sample frame was built using data compiled from local and municipal business registries. Because the 2009 survey sample used different stratification criteria, the presence of panel firms was limited to a maximum of 50% of the achieved interviews in each stratum. That sample is referred to as the panel.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The following survey instruments were used for Malawi ES 2009 and 2014: - Manufacturing Module Questionnaire - Services Module Questionnaire

    The survey is fielded via manufacturing or services questionnaires in order not to ask questions that are irrelevant to specific types of firms, e.g. a question that relates to production and nonproduction workers should not be asked of a retail firm. In addition to questions that are asked across countries, all surveys are customized and contain country-specific questions. An example of customization would be including tourism-related questions that are asked in certain countries when tourism is an existing or potential sector of economic growth. There is a skip pattern in the Service Module Questionnaire for questions that apply only to retail firms.

    Cleaning operations

    Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.

    Response rate

    Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.

    Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
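When analysing such data, the two non-response codes named above, -8 ("Refusal to respond") and -9 ("Don't know"), usually need to be separated from substantive answers before computing any statistics. The following is a minimal sketch under that assumption; the constant and function names are hypothetical, and the example responses are made up.

```python
REFUSED = -8     # "Refusal to respond", as coded in the survey
DONT_KNOW = -9   # "Don't know", as coded in the survey

def recode_missing(value):
    """Map the two non-response codes to None and keep every other value."""
    return None if value in (REFUSED, DONT_KNOW) else value

def item_nonresponse(values):
    """Count refusals, don't-knows and valid answers for one question."""
    counts = {"refused": 0, "dont_know": 0, "valid": 0}
    for value in values:
        if value == REFUSED:
            counts["refused"] += 1
        elif value == DONT_KNOW:
            counts["dont_know"] += 1
        else:
            counts["valid"] += 1
    return counts

responses = [3, -8, 5, -9, -9, 1]  # made-up answers to a single question
recoded = [recode_missing(v) for v in responses]
summary = item_nonresponse(responses)
print(recoded)
print(summary)
```

Keeping the two codes apart in the tally preserves the distinction the enumerators were instructed to record, which a single generic missing value would lose.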

    Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.

Adam Baimel (2023). Dataset #1: Cross-sectional survey data [Dataset]. http://doi.org/10.6084/m9.figshare.23708730.v1

Dataset #1: Cross-sectional survey data

Available download formats: txt
Dataset updated
Jul 19, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Adam Baimel
License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Description

N.B. This is not real data. Only here for an example for project templates.

Project Title: Add title here

Project Team: Add contact information for research project team members

Summary: Provide a descriptive summary of the nature of your research project and its aims/focal research questions.

Relevant publications/outputs: When available, add links to the related publications/outputs from this data.

Data availability statement: If your data is not linked on figshare directly, provide links to where it is being hosted here (e.g., Open Science Framework, GitHub, etc.). If your data is not going to be made publicly available, please provide details here as to the conditions under which interested individuals could gain access to the data and how to go about doing so.

Data collection details: 1. When was your data collected? 2. How were your participants sampled/recruited?

Sample information: How many and who are your participants? Demographic summaries are helpful additions to this section.

Research Project Materials: What materials are necessary to fully reproduce the contents of your dataset? Include a list of all relevant materials (e.g., surveys, interview questions) with a brief description of what is included in each file that should be uploaded alongside your datasets.

List of relevant datafile(s): If your project produces data that cannot be contained in a single file, list the names of each of the files here with a brief description of what parts of your research project each file is related to.

Data codebook: What is in each column of your dataset? Provide variable names as they are encoded in your data files, verbatim question associated with each response, response options, details of any post-collection coding that has been done on the raw-response (and whether that's encoded in a separate column).

Examples available at: https://www.thearda.com/data-archive?fid=PEWMU17 https://www.thearda.com/data-archive?fid=RELLAND14
