Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Calculation strategy for survey and population weighting of the data.
A random sample of households was invited to participate in this survey. The dataset contains respondent-level data, one row per respondent, with the questions in the columns. The numbers represent scale options from the survey, such as 1=Excellent, 2=Good, 3=Fair, 4=Poor. The question stem, response options, and scale information for each field can be found in the "variable labels" and "value labels" sheets. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as those in the report delivered to the City of Bloomington. The easiest way to replicate the reported results is to create pivot tables that sum the "wt" field rather than count responses.
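For example, with the data loaded into a pandas DataFrame, the weighted distribution of a question is the sum of "wt" within each response option rather than a count of rows. This is a minimal sketch; the question column name and the values are illustrative, only the "wt" field comes from the dataset description.

```python
import pandas as pd

# Illustrative respondent-level data: q1 uses a 1=Excellent ... 4=Poor
# scale; "wt" is the survey weight described in the dataset notes.
df = pd.DataFrame({
    "q1": [1, 1, 2, 3, 4, 2],
    "wt": [0.8, 1.3, 1.0, 0.7, 1.5, 0.9],
})

# Weighted distribution: sum the "wt" field within each response
# option instead of counting rows, then convert to percentages.
weighted = df.groupby("q1")["wt"].sum()
weighted_pct = 100 * weighted / weighted.sum()

# Unweighted percentages, for comparison.
unweighted_pct = 100 * df["q1"].value_counts(normalize=True)
```

The weighted and unweighted percentages will generally differ; the weighted ones are what the delivered report is based on.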
The People and Nature Survey for England gathers information on people’s experiences and views about the natural environment, and its contributions to our health and wellbeing.
This publication reports a set of weighted national indicators (Official Statistics) from the survey, which have been generated using data collected in the first year (April 2020 - March 2021) from approx. 25,000 adults (16+).
These updated indicators have been generated using the specific People and Nature weight and can be directly compared with monthly indicators published from April 2021 onwards. See Technical methods and limitations for more information.
Survey weighting allows researchers to account for bias in survey samples, due to unit nonresponse or convenience sampling, using measured demographic covariates. Unfortunately, in practice, it is impossible to know whether the estimated survey weights are sufficient to alleviate concerns about bias due to unobserved confounders or incorrect functional forms used in weighting. In the following paper, we propose two sensitivity analyses for the exclusion of important covariates: (1) a sensitivity analysis for partially observed confounders (i.e., variables measured across the survey sample, but not the target population), and (2) a sensitivity analysis for fully unobserved confounders (i.e., variables not measured in either the survey or the target population). We provide graphical and numerical summaries of the potential bias that arises from such confounders, and introduce a benchmarking approach that allows researchers to quantitatively reason about the sensitivity of their results. We demonstrate our proposed sensitivity analyses using state-level 2020 U.S. Presidential Election polls.
The National Longitudinal Study of Adolescent to Adult Health (Add Health) (https://addhealth.cpc.unc.edu/) is a longitudinal study of a nationally representative sample of adolescents in grades 7-12 in the United States. The Add Health cohort has been followed into young adulthood with four in-home interviews, the most recent in 2008, when the sample was aged 24-32*. Add Health combines longitudinal survey data on respondents' social, economic, psychological and physical well-being with contextual data on the family, neighborhood, community, school, friendships, peer groups, and romantic relationships, providing unique opportunities to study how social environments and behaviors in adolescence are linked to health and achievement outcomes in young adulthood. The fourth wave of interviews expanded the collection of biological data in Add Health to understand the social, behavioral, and biological linkages in health trajectories as the Add Health cohort ages through adulthood. The fifth wave of data collection is planned to begin in 2016.
Initiated in 1994 and supported by three program project grants from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) (https://www.nichd.nih.gov/) with co-funding from 23 other federal agencies and foundations, Add Health is the largest, most comprehensive longitudinal survey of adolescents ever undertaken. Beginning with an in-school questionnaire administered to a nationally representative sample of students in grades 7-12, the study followed up with a series of in-home interviews conducted in 1995, 1996, 2001-02, and 2008. Other sources of data include questionnaires for parents, siblings, fellow students, and school administrators and interviews with romantic partners. Preexisting databases provide information about neighborhoods and communities.
Add Health was developed in response to a mandate from the U.S. Congress to fund a study of adolescent health, and Waves I and II focus on the forces that may influence adolescents' health and risk behaviors, including personal traits, families, friendships, romantic relationships, peer groups, schools, neighborhoods, and communities. As participants have aged into adulthood, however, the scientific goals of the study have expanded and evolved. Wave III, conducted when respondents were between 18 and 26** years old, focuses on how adolescent experiences and behaviors are related to decisions, behavior, and health outcomes in the transition to adulthood. At Wave IV, respondents were ages 24-32* and assuming adult roles and responsibilities. Follow up at Wave IV has enabled researchers to study developmental and health trajectories across the life course of adolescence into adulthood using an integrative approach that combines the social, behavioral, and biomedical sciences in its research objectives, design, data collection, and analysis.
* 52 respondents were 33-34 years old at the time of the Wave IV interview.
** 24 respondents were 27-28 years old at the time of the Wave III interview.
Included here are weights to remove any differences between the composition of the sample and the estimated composition of the population. See the attached codebook for information regarding how these weights were calculated.
https://www.gesis.org/en/institute/data-usage-terms
Stata module that implements Potter's (1990) weight distribution approach to trim extreme sampling weights. The basic idea is that the sampling weights are assumed to follow a beta distribution. The parameters of the distribution are estimated from the moments of the observed sampling weights and the resulting quantiles are used as cut-off points for extreme sampling weights. The process is repeated a specified number of times (10 by default) or until no sampling weights are more extreme than the specified quantiles.
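A rough Python sketch of the weight-distribution idea (the Stata module is the reference implementation; the quantile level, the method-of-moments beta fit, and the proportional redistribution of the trimmed excess below are illustrative assumptions):

```python
import numpy as np
from scipy.stats import beta

def trim_weights(w, quantile=0.99, max_iter=10):
    """Potter-style trimming sketch: assume the normalised weights follow
    a beta distribution, estimate its parameters from the moments of the
    observed weights, cut at the given quantile, redistribute the trimmed
    excess, and repeat until no weight exceeds the cut-off."""
    w = np.asarray(w, dtype=float)
    for _ in range(max_iter):
        p = w / w.sum()                  # normalise weights to [0, 1]
        m, v = p.mean(), p.var()
        if v == 0:                       # all weights equal: nothing to trim
            break
        common = m * (1 - m) / v - 1     # method-of-moments for Beta(a, b)
        a, b = m * common, (1 - m) * common
        cut = beta.ppf(quantile, a, b) * w.sum()
        if not np.any(w > cut):
            break
        excess = np.where(w > cut, w - cut, 0.0).sum()
        w = np.minimum(w, cut)
        w += excess * w / w.sum()        # spread excess, preserving the total
    return w
```

Redistributing the excess keeps the weighted total constant, which is why the loop may need several passes: redistribution can push other weights back over the cut-off.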
Demographic information for a Utah survey of mental health stigma used to generate different survey weights to match census data.
The City of Bloomington contracted with National Research Center, Inc. to conduct the 2019 Bloomington Community Survey. This was the second time a scientific citywide survey had been completed covering resident opinions on service delivery satisfaction by the City of Bloomington and quality of life issues. The first was in 2017. The survey captured the responses of 610 households from a representative sample of 3,000 residents of Bloomington who were randomly selected to complete the survey. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as can be found in the report delivered to the City of Bloomington. The easiest way to replicate these results is likely to create pivot tables, and use the sum of the "wt" field rather than a count of responses.
analyze the current population survey (cps) annual social and economic supplement (asec) with r the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
- download the fixed-width file containing household, family, and person records
- import by separating this file into three tables, then merge 'em together at the person-level
- download the fixed-width file containing the person-level replicate weights
- merge the rectangular person-level file with the replicate weights, then store it in a sql database
- create a new variable - one - in the data table

2012 asec - analysis examples.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- perform a boatload of analysis examples

replicate census estimates - 2011.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- match the sas output shown in the png file below

2011 asec replicate weight sas output.png
- statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
- the census bureau's current population survey page
- the bureau of labor statistics' current population survey page
- the current population survey's wikipedia article

notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
Background
The Annual Population Survey (APS) Household datasets are produced annually and are available from 2004 (Secure Access) and 2006 (End User Licence). They allow production of family and household labour market statistics at local areas and for small sub-groups of the population across the UK. The data comprise key variables from the Labour Force Survey (LFS) (held at the UK Data Archive under GN 33246) and the APS (person) datasets (held at the Data Archive under GN 33357). The former is a quarterly survey of households living at private addresses in the UK. The latter is created by combining individuals in waves one and five from four consecutive LFS quarters with the English, Welsh and Scottish Local Labour Force Surveys (LLFS). The APS Household datasets therefore contain results from four different sources.
The APS Household datasets include all the variables on the LFS and APS person datasets except for the income variables. They also include key family and household level derived variables. These variables allow for an analysis of the combined economic activity status of the family or household. In addition they also include more detailed geographical, industry, occupation, health and age variables.
For information on the main (person) APS datasets, for which EUL and Secure Access versions are available, please see GNs 33357 and 33427, respectively.
New reweighting policy
Following the new reweighting policy, ONS reviewed the latest population estimates made available during 2019 and decided not to carry out a 2019 LFS and APS reweighting exercise. The next reweighting exercise will therefore take place in 2020. It will incorporate the 2019 Sub-National Population Projection data (published in May 2020) and the 2019 Mid-Year Estimates (published in June 2020). It is expected that reweighted Labour Market aggregates and microdata will be published in 2021.
Secure Access APS Household data
Secure Access datasets for the APS Household survey include additional variables not included in the EUL versions (GN 33455). Extra variables that may be found in the Secure Access version but not in the EUL version relate to:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set contains estimates of the base rates of 550 food safety-relevant food handling practices in European households. The data are representative for the population of private households in the ten European countries in which the SafeConsume Household Survey was conducted (Denmark, France, Germany, Greece, Hungary, Norway, Portugal, Romania, Spain, UK).
Sampling design
In each of the ten EU and EEA countries where the survey was conducted (Denmark, France, Germany, Greece, Hungary, Norway, Portugal, Romania, Spain, UK), the population under study was defined as the private households in the country. Sampling was based on a stratified random design, with the NUTS2 statistical regions of Europe and the education level of the target respondent as stratum variables. The target sample size was 1000 households per country, with selection probability within each country proportional to stratum size.
Fieldwork
The fieldwork was conducted between December 2018 and April 2019 in ten EU and EEA countries (Denmark, France, Germany, Greece, Hungary, Norway, Portugal, Romania, Spain, United Kingdom). The target respondent in each household was the person with main or shared responsibility for food shopping in the household. The fieldwork was sub-contracted to a professional research provider (Dynata, formerly Research Now SSI). Complete responses were obtained from a total of 9,996 households.
Weights
In addition to the SafeConsume Household Survey data, population data from Eurostat (2019) were used to calculate weights. These were calculated with NUTS2 region as the stratification variable and assigned an influence to each observation in each stratum that was proportional to how many households in the population stratum a household in the sample stratum represented. The weights were used in the estimation of all base rates included in the data set.
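The weight calculation described above reduces to the classic stratum ratio: each sampled household carries the number of population households it represents, N_h/n_h for stratum h. A minimal sketch with illustrative region names and counts (not the actual Eurostat figures):

```python
import pandas as pd

# Illustrative counts: population households per NUTS2 stratum and
# completed interviews per stratum (numbers are made up).
strata = pd.DataFrame({
    "nuts2": ["A", "B", "C"],
    "pop_households": [400_000, 250_000, 350_000],
    "sample_households": [420, 260, 320],
})

# Each sampled household represents N_h / n_h population households,
# so that weighted sample counts reproduce the population counts.
strata["weight"] = strata["pop_households"] / strata["sample_households"]
```

By construction, multiplying the weight by the number of sampled households in a stratum recovers that stratum's population total, which is what makes the weighted base rates representative.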
Transformations
All survey variables were normalised to the [0,1] range before the analysis. Responses to food frequency questions were transformed into the proportion of all meals consumed during a year where the meal contained the respective food item. Responses to questions with 11-point Juster probability scales as the response format were transformed into numerical probabilities. Responses to questions with time (hours, days, weeks) or temperature (°C) as response formats were discretised using supervised binning. The thresholds best separating between the bins were chosen on the basis of five-fold cross-validated decision trees. The binned versions of these variables, and all other input variables with multiple categorical response options (either with a check-all-that-apply or forced-choice response format) were transformed into sets of binary features, with a value 1 assigned if the respective response option had been checked, 0 otherwise.
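Two of these transformations can be sketched in pandas. The column names are hypothetical, and dividing the 11-point scale by 10 is only a stand-in for the published Juster probability correspondences; the supervised binning step is omitted here.

```python
import pandas as pd

# Illustrative responses: an 11-point Juster scale item (0-10) and a
# check-all-that-apply item stored as a list of checked options.
df = pd.DataFrame({
    "juster": [0, 5, 10, 7],
    "storage": [["fridge"], ["fridge", "freezer"], ["freezer"], ["fridge"]],
})

# Map Juster scale points onto numerical probabilities in [0, 1]
# (a simple /10 stand-in for the actual Juster mapping).
df["juster_prob"] = df["juster"] / 10

# Expand the categorical response options into binary 0/1 features:
# 1 if the respective option was checked, 0 otherwise.
binary = df["storage"].str.join("|").str.get_dummies()
```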
Treatment of missing values
In many cases, a missing value on a feature logically implies that the respective data point should have a value of zero. If, for example, a participant in the SafeConsume Household Survey had indicated that a particular food was not consumed in their household, the participant was not presented with any other questions related to that food, which automatically results in missing values on all features representing the responses to the skipped questions. However, zero consumption would also imply a zero probability that the respective food is consumed undercooked. In such cases, missing values were replaced with a value of 0.
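A minimal pandas sketch of this rule, with hypothetical column names standing in for the survey's actual features:

```python
import numpy as np
import pandas as pd

# Illustrative features: households that never consume the food were
# never shown the follow-up question, so the follow-up is missing.
df = pd.DataFrame({
    "consumes_food": [1, 1, 0, 1],
    "undercooked_prob": [0.2, 0.0, np.nan, 0.5],
})

# Zero consumption logically implies zero probability of consuming the
# food undercooked, so structural missings become 0 rather than being
# imputed or dropped.
mask = df["consumes_food"] == 0
df.loc[mask, "undercooked_prob"] = df.loc[mask, "undercooked_prob"].fillna(0.0)
```

Only the structurally missing values are filled; genuinely unanswered questions in other rows would stay missing.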
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Replication materials for the forthcoming publication entitled "Worth Weighting? How to Think About and Use Weights in Survey Experiments."
The Associated Press is sharing data from the COVID Impact Survey, which provides statistics about physical health, mental health, economic security and social dynamics related to the coronavirus pandemic in the United States.
Conducted by NORC at the University of Chicago for the Data Foundation, the probability-based survey provides estimates for the United States as a whole, as well as in 10 states (California, Colorado, Florida, Louisiana, Minnesota, Missouri, Montana, New York, Oregon and Texas) and eight metropolitan areas (Atlanta, Baltimore, Birmingham, Chicago, Cleveland, Columbus, Phoenix and Pittsburgh).
The survey is designed to allow for an ongoing gauge of public perception, health and economic status to see what is shifting during the pandemic. When multiple sets of data are available, it will allow for the tracking of how issues ranging from COVID-19 symptoms to economic status change over time.
The survey is focused on three core areas of research:
To weight the data, use our queries linked below or statistical software such as R or SPSS.
If you'd like to create a table to see how people nationally or in your state or city feel about a topic in the survey, use the survey questionnaire and codebook to match a question (the variable label) to a variable name. For instance, "How often have you felt lonely in the past 7 days?" is variable "soc5c".
Nationally: Go to this query and enter soc5c as the variable. Hit the blue Run Query button in the upper right hand corner.
Local or State: To find figures for that response in a specific state, go to this query and type in a state name and soc5c as the variable, and then hit the blue Run Query button in the upper right hand corner.
The resulting sentence you could write out of these queries is: "People in some states are less likely to report loneliness than others. For example, 66% of Louisianans report feeling lonely on none of the last seven days, compared with 52% of Californians. Nationally, 60% of people said they hadn't felt lonely."
The margin of error for the national and regional surveys is found in the attached methods statement. You will need the margin of error to determine if the comparisons are statistically significant. If the difference is:
The survey data will be provided under embargo in both comma-delimited and statistical formats.
Each set of survey data will be numbered and have the date the embargo lifts in front of it in the format of: 01_April_30_covid_impact_survey. The survey has been organized by the Data Foundation, a non-profit non-partisan think tank, and is sponsored by the Federal Reserve Bank of Minneapolis and the Packard Foundation. It is conducted by NORC at the University of Chicago, a non-partisan research organization. (NORC is not an abbreviation; it is part of the organization's formal name.)
Data for the national estimates are collected using the AmeriSpeak Panel, NORC’s probability-based panel designed to be representative of the U.S. household population. Interviews are conducted with adults age 18 and over representing the 50 states and the District of Columbia. Panel members are randomly drawn from AmeriSpeak with a target of achieving 2,000 interviews in each survey. Invited panel members may complete the survey online or by telephone with an NORC telephone interviewer.
Once all the study data have been made final, an iterative raking process is used to adjust for any survey nonresponse as well as any noncoverage or under- and oversampling resulting from the study-specific sample design. Raking variables include age, gender, census division, race/ethnicity, education, and county groupings based on county-level counts of the number of COVID-19 deaths. Demographic weighting variables were obtained from the 2020 Current Population Survey. The count of COVID-19 deaths by county was obtained from USA Facts. The weighted data reflect the U.S. population of adults age 18 and over.
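The raking step can be sketched as iterative proportional fitting: the weights are repeatedly rescaled so that the weighted margin of each raking variable matches its population target. The toy version below uses two illustrative variables and made-up targets; a production weighting system like NORC's uses more variables plus trimming and convergence checks.

```python
import pandas as pd

def rake(df, weight_col, targets, n_iter=50):
    """Iterative proportional fitting: cycle over the raking variables,
    scaling the weights so each weighted margin hits its target total."""
    w = df[weight_col].astype(float).copy()
    for _ in range(n_iter):
        for var, target in targets.items():
            margins = w.groupby(df[var]).transform("sum")
            w = w * df[var].map(target) / margins
    return w

# Illustrative sample with two raking variables and population targets
# (targets are percentages, so the weights will sum to 100).
df = pd.DataFrame({
    "age": ["18-44", "18-44", "45+", "45+", "18-44", "45+"],
    "sex": ["F", "M", "F", "M", "M", "F"],
    "wt":  [1.0] * 6,
})
targets = {
    "age": {"18-44": 55.0, "45+": 45.0},
    "sex": {"F": 52.0, "M": 48.0},
}
df["wt"] = rake(df, "wt", targets)
```

After convergence the weighted age and sex margins both match their targets simultaneously, which a single one-pass adjustment cannot guarantee.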
Data for the regional estimates are collected using a multi-mode address-based (ABS) approach that allows residents of each area to complete the interview via web or with an NORC telephone interviewer. All sampled households are mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Interviews are conducted with adults age 18 and over with a target of achieving 400 interviews in each region in each survey. Additional details on the survey methodology and the survey questionnaire are attached below or can be found at https://www.covid-impact.org.
Results should be credited to the COVID Impact Survey, conducted by NORC at the University of Chicago for the Data Foundation.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Changes in Final Population Estimates and Personal Network Size between Original/MoS and Unweighted/Weighted Procedures after Recursive Trimming.
One focus of post-apartheid research in South Africa is change. Questions include the progress of South Africa in the economic, social and political arena. National datasets such as the October Household Surveys (OHS) and Labour Force Surveys (LFS) provide a rich source of information on both economic and social variables in a cross-sectional framework. These datasets are repeated annually or biannually and therefore have the potential to highlight changes over time. Yet to treat the cross-sectional national data as a time series requires that, when stacked side by side, the data produce realistic trends. Since these data were not designed to be used as a time series, there are changes in sample design, the interview process and shifts in the sampling frame which can cause unrealistic changes in aggregates over a short period of time. This raises concerns about the validity of using these datasets as a time series to examine change.
The aggregate trends calculated from the OHS and LFS show the data to be both temporally and internally inconsistent. Examining the weights given in the datasets, in addition to the public documentation, it is clear that the Statistics South Africa (StatsSA) household and person weights are not simple design weights, i.e. inverse inclusion probability weights. StatsSA post-stratifies the person design weight to external population totals. Since the data are cross-sectional, the intention of the post-stratification adjustment is to produce the best estimates of the population given the information available at the time, and temporal consistency is not considered. This creates problems when the data are used as a time series.
A project was thus undertaken by Nicola Branson at the University of Cape Town, with a scholarship from DataFirst as part of DataFirst's Data Quality Project, funded by the Mellon Foundation, to design a new set of person and household weights for the OHS 1994-1999 and the LFS 2000-2007. These weights are generated using an entropy estimation technique. The new weights result in consistent demographic and geographic trends and greater consistency between person- and household-level analysis.
This dataset consists of the cross-entropy weights and the research resources used to construct them, including the syntax files, as well as background documentation on the project and other research output. These should be used with the OHS and LFS data available from the data portal.
National coverage
Sample survey data [ssd]
Face-to-face [f2f]
The purpose of survey weights is to inflate the sample to represent the entire population. These weights therefore play an important role in creating consistent aggregates over time. Statistics South Africa's (StatsSA) household and person weights are not simple design weights, i.e. inverse inclusion probability weights. The weights presented in the StatsSA National Household surveys are the design weight post-stratified to external population totals. Since the data are cross-sectional, the intention of the post-stratification adjustment is to produce the best estimates of the population given the information available at the time, and temporal consistency is not considered. These cross-entropy weights have been provided to render the OHS and LFS series consistent over time.
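The post-stratification step described above can be sketched as scaling design weights so that they sum to the external population total within each post-stratum. This is a minimal illustration of the general technique, not StatsSA's actual procedure; the province codes and counts are made up.

```python
import pandas as pd

# Illustrative person records: the design weight is the inverse of the
# inclusion probability for each sampled person.
df = pd.DataFrame({
    "province": ["WC", "WC", "GP", "GP", "GP"],
    "design_wt": [1000.0, 1200.0, 900.0, 1100.0, 1000.0],
})

# External population totals per post-stratum (e.g. census counts).
pop_totals = {"WC": 2_400_000, "GP": 3_600_000}

# Scale design weights so they sum to the external total in each stratum.
stratum_sum = df.groupby("province")["design_wt"].transform("sum")
df["post_wt"] = df["design_wt"] * df["province"].map(pop_totals) / stratum_sum
```

Within each stratum the post-stratified weights reproduce the external total exactly, which is why successive cross-sections calibrated to different population estimates need not form a consistent time series.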
The original cross entropy weights created by Nicola Branson did not include weights for OHS 1996. These have now been created by DataFirst, using a later version of the OHS 1996 data provided by Statistics South Africa.
Learn about the techniques used to create weights for the 2022 National Survey on Drug Use and Health (NSDUH) at the pair and questionnaire dwelling unit (QDU) levels. NSDUH is designed so that some of the sampled households have both an adult and a youth respondent who are paired. Because of this, NSDUH allows for estimating characteristics at the person level, pair level, or QDU level. This report describes pair selection probabilities, the generalized exponential model (including the predictor variables used), and the multiple weight components that are used for pair- or QDU-level analysis. An evaluation of the calibration weights is also included. The chapters introduce the report; discuss the probability of selection for pairs and QDUs; briefly describe the generalized exponential model; describe the predictor variables for the model calibration; define extreme weights; discuss weight calibrations; and evaluate the calibration weights. Appendices include technical details about the model and the evaluations that were performed.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
To interpret these datasets, it is essential that you have a copy of the hand-annotated survey instrument, "Boulder OSMP Codebook Version A.pdf," along with all three of the csv files described below. In addition, these datasets were released in coordination with a detailed report describing the survey results, which will be helpful in providing more context. The report can be found on the City of Boulder Open Space and Mountain Parks (OSMP) website. Note, open-ended comments have been removed from this dataset, consistent with the Open Data Policy.
OSMP Master Plan Survey Data (this file): Contains the data (survey responses) for three different surveys: (1) the "Scientific Survey," in which a random sample of households was invited to participate; (2) an "Open Participation (Opt-In) Survey," an online survey to which all residents were invited; and (3) a special effort made to reach Boulder's Latino population through the promotoras network (Promotoras). The data field "type" corresponds to the survey type. The files listed below contain supporting information that is necessary to interpret the dataset.
OSMP Master Plan Survey - Survey Question Labels: Lists the variable names found in the previous csv and gives a few words describing what each variable means, with reference to the survey. You will need to refer to "Boulder OSMP Codebook Version A.pdf" for the full name of the survey question being referenced.
OSMP Master Plan Survey - Survey Response Option Labels: For each variable, the numeric values that are possible and their associated labels.
VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Boulder from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile.
This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as can be found in the report delivered to the City of Boulder. Please read the Instructions for Working with Survey Weights document for more information. This survey was implemented by Erin Caldwell of the National Research Center, under contract with the City of Boulder's Open Space and Mountain Parks Department. Note: The data file contains survey responses from three different surveys. Use the data column "Type" to distinguish among the surveys: Type=1 is the statistically valid survey, Type=2 is the open participation survey, and Type=3 is the promotoras survey.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The WHO standards are available on the website of WHO (http://www.who.int/childgrowth/en/), where you will also find the Anthro2005 software for calculating individual data. The z-scores of the new standards are calculated with the LMS procedure. The principle is to take, from a table for a given age and sex, three values (L, M, S) and put them into a formula, for example weight-for-age:

WAZ = ((weight/M)^L - 1) / (L*S)

For example, for a female of 9 kg at 365 days, L = -0.2022, M = 8.9462, S = 0.12267, so WAZ = ((9/8.9462)^-0.2022 - 1) / (-0.2022 * 0.12267) = 0.048847. Note that the WHO website gives the LMS values only for each month. To calculate the actual weight at any cut-off point from the LMS statistics, use the reverse of the formula:

wt = M * ((Z*S*L) + 1)^(1/L)

where Z is the z-score you are trying to obtain. This gives the standard to more decimal places for interpolation purposes. I have taken 10 boys and 10 girls at 0.5 cm intervals from 60 to 110 cm in height, with weight-for-height mean at -1Z and SD of 1.0. These are in the sheet WFH_Population.
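The two formulas can be round-tripped directly; a minimal Python check of the LMS arithmetic using the worked example (this is a sketch, not part of the WHO Anthro2005 software):

```python
def lms_zscore(value, L, M, S):
    """LMS formula: convert a measurement to a z-score."""
    return ((value / M) ** L - 1) / (L * S)

def lms_value(Z, L, M, S):
    """Inverse LMS formula: the measurement at a given z-score cut-off."""
    return M * ((Z * S * L) + 1) ** (1 / L)

# Worked example from the text: female, 9 kg, 365 days.
L, M, S = -0.2022, 8.9462, 0.12267
waz = lms_zscore(9, L, M, S)   # approximately 0.0488
```

Applying `lms_value` to a computed z-score recovers the original measurement, which is what makes the inverse form useful for interpolating the standards at arbitrary cut-offs.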
For the simulation I have now added a random variable to the weight or the height in the POP_weight and the POP_height sheets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Change in Three Population Estimates and Personal Network Size over the Original and MoS Estimator.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Conventional survey tools such as weighting do not address non-ignorable nonresponse, which occurs when nonresponse depends on the variable being measured. This paper describes non-ignorable nonresponse weighting and imputation models using randomized response instruments, which are variables that affect response but not the outcome of interest (Sun et al. 2018). The paper uses a doubly robust estimator that is valid if one, but not necessarily both, of the weighting and imputation models is correct. When applied to a national 2019 survey, these tools produce estimates that suggest there was non-trivial non-ignorable nonresponse related to turnout and, for subgroups, Trump approval and policy questions. For example, the conventional MAR-based weighted estimates of Trump support in the Midwest were 10 percentage points lower than the MNAR-based estimates. Data to replicate the estimation are described in "Countering Non-Ignorable Nonresponse in Survey Models with Randomized Response Instruments and Doubly Robust Estimation".
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Calculation strategy for survey and population weighting of the data.