Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Calculation strategy for survey and population weighting of the data.
A random sample of households was invited to participate in this survey. In the dataset, each row contains one respondent's answers and each column contains one question. The numbers represent a scale option from the survey, such as 1=Excellent, 2=Good, 3=Fair, 4=Poor. The question stem, response options, and scale information for each field can be found in the "variable labels" and "value labels" sheets. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as those found in the report delivered to the City of Bloomington. The easiest way to replicate these results is likely to create pivot tables and use the sum of the "wt" field rather than a count of responses.
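The note above suggests replicating the report by summing "wt" within each response category instead of counting rows. A minimal pandas sketch of that approach, using hypothetical column names ("q1" for a survey item, "wt" for the weight), might look like this:

```python
import pandas as pd

# Hypothetical respondent-level file: one row per respondent, coded responses
# plus the survey weight in column "wt" (column names assumed from the description).
df = pd.DataFrame({
    "q1": [1, 2, 2, 3, 4, 1, 2],          # e.g. 1=Excellent, 2=Good, 3=Fair, 4=Poor
    "wt": [0.8, 1.2, 1.0, 0.9, 1.1, 1.3, 0.7],
})

# Unweighted percentages: a simple count of responses.
unweighted = df["q1"].value_counts(normalize=True).sort_index() * 100

# Weighted percentages: sum the "wt" field within each response category,
# mirroring a pivot table that sums "wt" instead of counting rows.
weighted = df.groupby("q1")["wt"].sum() / df["wt"].sum() * 100

print(unweighted)
print(weighted)
```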
The People and Nature Survey for England gathers information on people’s experiences and views about the natural environment, and its contributions to our health and wellbeing.
This publication reports a set of weighted national indicators (Official Statistics) from the survey, which have been generated using data collected in the first year (April 2020 - March 2021) from approx. 25,000 adults (16+).
These updated indicators have been generated using the specific People and Nature weight and can be directly compared with monthly indicators published from April 2021 onwards. See Technical methods and limitations for more information.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Replication materials for the forthcoming publication entitled "Worth Weighting? How to Think About and Use Weights in Survey Experiments."
Demographic information for a Utah survey of mental health stigma used to generate different survey weights to match census data.
Learn about the techniques used to create weights for the 2022 National Survey on Drug Use and Health (NSDUH) at the pair and questionnaire dwelling unit (QDU) levels. NSDUH is designed so that some of the sampled households have both an adult and a youth respondent who are paired. Because of this, NSDUH allows for estimating characteristics at the person level, pair level, or QDU level. This report describes pair selection probabilities, the generalized exponential model (including the predictor variables used), and the multiple weight components that are used for pair- or QDU-level analyses. An evaluation of the calibration weights is also included. Chapters: introduces the report; discusses the probability of selection for pairs and QDUs; briefly describes the generalized exponential model; describes the predictor variables for the model calibration; defines extreme weights; discusses weight calibrations; evaluates the calibration weights. Appendices include technical details about the model and the evaluations that were performed.
This report describes the person-level sampling weight calibration procedures used on the 2012 National Survey on Drug Use and Health (NSDUH). It covers the practical aspects of implementing the generalized exponential model (GEM) for the NSDUH.
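GEM itself is a flexible exponential model with bounds on the adjustment factors; as a rough illustration of the simpler idea underlying weight calibration, the sketch below rakes a set of base weights to external control totals. All variable names and totals are invented for the example.

```python
import numpy as np
import pandas as pd

# Illustrative raking (iterative proportional fitting): repeatedly scale weights so
# weighted margins match external control totals. The NSDUH GEM is a more general
# exponential model with bounded adjustment factors; this is only a simplified sketch.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sex": rng.choice(["F", "M"], size=500),
    "age": rng.choice(["18-34", "35-64", "65+"], size=500),
})
df["wt"] = 1.0  # start from base (design) weights

# Hypothetical population control totals for each margin.
controls = {
    "sex": {"F": 260.0, "M": 240.0},
    "age": {"18-34": 150.0, "35-64": 250.0, "65+": 100.0},
}

for _ in range(25):  # a few raking cycles are usually enough to converge
    for var, totals in controls.items():
        current = df.groupby(var)["wt"].sum()
        factors = {cat: totals[cat] / current[cat] for cat in totals}
        df["wt"] *= df[var].map(factors)

print(df.groupby("sex")["wt"].sum())   # matches the sex control totals
print(df.groupby("age")["wt"].sum())   # matches the age control totals
```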
Survey weighting allows researchers to account for bias in survey samples, due to unit nonresponse or convenience sampling, using measured demographic covariates. Unfortunately, in practice, it is impossible to know whether the estimated survey weights are sufficient to alleviate concerns about bias due to unobserved confounders or incorrect functional forms used in weighting. In the following paper, we propose two sensitivity analyses for the exclusion of important covariates: (1) a sensitivity analysis for partially observed confounders (i.e., variables measured across the survey sample, but not the target population), and (2) a sensitivity analysis for fully unobserved confounders (i.e., variables not measured in either the survey or the target population). We provide graphical and numerical summaries of the potential bias that arises from such confounders, and introduce a benchmarking approach that allows researchers to quantitatively reason about the sensitivity of their results. We demonstrate our proposed sensitivity analyses using state-level 2020 U.S. Presidential Election polls.
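The benchmarking idea can be illustrated informally: re-weight with and without a measured covariate and treat the shift in the estimate as a yardstick for what an unobserved confounder of comparable strength could do. The sketch below is not the paper's estimator; the data, population shares, and variable names are simulated for illustration.

```python
import numpy as np
import pandas as pd

# Benchmarking-style illustration (not the estimators from the paper): drop one
# measured covariate from the weighting and see how much the estimate moves.
rng = np.random.default_rng(1)
n = 2000
educ = rng.choice(["no_college", "college"], size=n, p=[0.4, 0.6])  # sample over-represents college
y = np.where(educ == "college", 0.45, 0.60) + rng.normal(0, 0.05, n)  # outcome related to education

# Hypothetical population shares (e.g., from a census benchmark).
pop_share = {"no_college": 0.65, "college": 0.35}
samp_share = pd.Series(educ).value_counts(normalize=True)

# Post-stratification weights that use education vs. weights that ignore it.
w_full = pd.Series(educ).map(lambda g: pop_share[g] / samp_share[g]).to_numpy()
w_naive = np.ones(n)

est_full = np.average(y, weights=w_full)
est_naive = np.average(y, weights=w_naive)
print(f"with education in weights:    {est_full:.3f}")
print(f"without education in weights: {est_naive:.3f}")
print(f"benchmark shift:              {est_full - est_naive:+.3f}")
```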
The page contains materials from the PHS Seminar on Weighting Techniques for Large Private Claims Data held on October 24, 2024, as well as some additional documentation and the weights themselves.
On October 24, 2024, PHS hosted a Seminar on Weighting Techniques for Large Private Claims Data. Using the MarketScan Commercial Database as an example case, Social Scientist Sarah Hirsch discussed three schemes for weighting private claims data using US census-based surveys, and the associated methods and techniques. She provided researchers with the tools to implement these methodologies, or to formulate their own for other datasets.
We invite you to view the Recording of the Seminar to learn more about this topic! The slide deck and transcript are also available for reference.
We have also added some code scripts, a written description of the weighting process, and the final MarketScan weights, along with some additional related materials.
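As a rough illustration of one way claims data can be weighted to census-based totals (not necessarily one of the three schemes presented in the seminar), the sketch below post-stratifies hypothetical enrollee counts to hypothetical population counts within age-by-sex cells.

```python
import pandas as pd

# One simple weighting scheme (a sketch only): post-stratify enrollee counts in the
# claims data to census-based population totals within age-by-sex cells.
# Column names and totals are hypothetical.
claims = pd.DataFrame({
    "age_group": ["18-44", "18-44", "45-64", "45-64"],
    "sex":       ["F",     "M",     "F",     "M"],
    "enrollees": [1200,    1100,    900,     800],
})
census = pd.DataFrame({
    "age_group": ["18-44", "18-44", "45-64", "45-64"],
    "sex":       ["F",     "M",     "F",     "M"],
    "population": [55_000, 54_000, 41_000, 39_000],
})

# Each cell's weight scales its enrollees up to the census population count.
merged = claims.merge(census, on=["age_group", "sex"])
merged["weight"] = merged["population"] / merged["enrollees"]
print(merged[["age_group", "sex", "weight"]])
```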
In the November 2016 U.S. presidential election, many state level public opinion polls, particularly in the Upper Midwest, incorrectly predicted the winning candidate. One leading explanation for this polling miss is that the precipitous decline in traditional polling response rates led to greater reliance on statistical methods to adjust for the corresponding bias, and that these methods failed to adjust for important interactions between key variables like educational attainment, race, and geographic region. Finding calibration weights that account for important interactions remains challenging with traditional survey methods: raking typically balances the margins alone, while post-stratification, which exactly balances all interactions, is only feasible for a small number of variables. In this paper, we propose multilevel calibration weighting, which enforces tight balance constraints for marginal balance and looser constraints for higher-order interactions. This incorporates some of the benefits of post-stratification while retaining the guarantees of raking. We then correct for the bias due to the relaxed constraints via a flexible outcome model; we call this approach Double Regression with Post-stratification (DRP). We use these tools to re-assess a large-scale survey of voter intention in the 2016 U.S. presidential election, finding meaningful gains from the proposed methods. The approach is available in the multical R package. Contains replication materials for "Multilevel calibration weighting for survey data", including raw data, scripts to clean the raw data, scripts to replicate the analysis, and scripts to replicate the simulation study.
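To illustrate the contrast drawn above (without using the multical package), the sketch below computes post-stratification weights over the full cross-classification of two hypothetical variables, which balances their interaction exactly; raking, by comparison, would only match the two margins. All counts and names are invented for the example.

```python
import numpy as np
import pandas as pd

# Post-stratification on the full education-by-region cross-classification:
# one weight per cell, so every interaction of the two variables is balanced exactly.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "educ":   rng.choice(["hs", "college"], size=1000, p=[0.3, 0.7]),
    "region": rng.choice(["midwest", "other"], size=1000, p=[0.2, 0.8]),
})

# Hypothetical population counts for every cell of the cross-classification.
pop_cells = pd.DataFrame({
    "educ":   ["hs", "hs", "college", "college"],
    "region": ["midwest", "other", "midwest", "other"],
    "pop":    [120, 380, 110, 390],
})
samp_cells = df.groupby(["educ", "region"]).size().reset_index(name="n")

cells = pop_cells.merge(samp_cells, on=["educ", "region"])
cells["cell_wt"] = cells["pop"] / cells["n"]   # one adjustment factor per cell
df = df.merge(cells[["educ", "region", "cell_wt"]], on=["educ", "region"])

print(df.groupby(["educ", "region"])["cell_wt"].sum())  # reproduces the population cell counts
```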
This report describes the person-level sampling weight calibration procedures used on the 2011 National Survey on Drug Use and Health (NSDUH).
Learn about the techniques used to create weights for the 2022 National Survey on Drug Use and Health (NSDUH) at the person level. The report reviews the generalized exponential model (GEM) used in weighting, discusses potential predictor variables, and details the practical steps used to implement GEM. The report also details the weight calibrations and presents the evaluation measures of the calibrations, as well as a sensitivity analysis. Chapters: introduces the survey and the remainder of the report; reviews the impact of multimode data collection on weighting; briefly describes the generalized exponential model; describes the predictor variables for the model calibration; defines extreme weights; discusses control totals for poststratification adjustments; discusses weight calibration at the dwelling unit level; discusses weight calibration at the person level; presents the evaluation measures of calibrated weights and a sensitivity analysis of selected prevalence estimates; explains the break-off analysis weights. Appendices include technical details about the model and the evaluations that were performed.
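A generic diagnostic often used when evaluating calibrated weights (not the report's full evaluation suite) is the unequal-weighting design effect and the Kish effective sample size, together with a simple check for extreme weights:

```python
import numpy as np

# Hypothetical calibrated weights for a handful of respondents.
w = np.array([0.6, 0.9, 1.0, 1.1, 1.3, 2.8, 0.7, 1.2])

n = len(w)
deff_w = n * np.sum(w**2) / np.sum(w)**2        # unequal-weighting design effect
n_eff = np.sum(w)**2 / np.sum(w**2)             # Kish effective sample size
extreme = w > np.median(w) * 3                  # one simple extreme-weight rule

print(f"design effect from weighting: {deff_w:.3f}")
print(f"effective sample size:        {n_eff:.1f} of {n}")
print(f"extreme weights flagged:      {w[extreme]}")
```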
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Weighted sample proportions and demographic weighting values obtained from 2010 U.S. Census data.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Conventional survey tools such as weighting do not address non-ignorable nonresponse that occurs when nonresponse depends on the variable being measured. This paper describes non-ignorable nonresponse weighting and imputation models using randomized response instruments, which are variables that affect response but not the outcome of interest (Sun et al. 2018). The paper uses a doubly robust estimator that is valid if one, but not necessarily both, of the weighting and imputation models is correct. When applied to a national 2019 survey, these tools produce estimates that suggest there was non-trivial non-ignorable nonresponse related to turnout, and, for subgroups, Trump approval and policy questions. For example, the conventional MAR-based weighted estimates of Trump support in the Midwest were 10 percentage points lower than the MNAR-based estimates. Data to replicate estimation described in "Countering Non-Ignorable Nonresponse in Survey Models with Randomized Response Instruments and Doubly Robust Estimation"
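The doubly robust logic can be sketched with a generic AIPW-style estimator of a mean under nonresponse; this omits the paper's randomized response instruments and simply shows how a weighting (response) model and an imputation (outcome) model combine so the estimate survives misspecification of either one. Data are simulated and scikit-learn is used for the two working models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Simulate a population where response depends on an observed covariate x.
rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=(n, 1))                          # observed covariate
y = 1.0 + 0.8 * x[:, 0] + rng.normal(0, 1, n)        # outcome of interest
p_resp = 1 / (1 + np.exp(-(0.2 + 0.9 * x[:, 0])))    # response probability rises with x
r = rng.random(n) < p_resp                           # response indicator

# Weighting model: estimated response propensities.
p_hat = LogisticRegression().fit(x, r).predict_proba(x)[:, 1]
# Imputation model: predicted outcomes for everyone, fit on respondents only.
y_hat = LinearRegression().fit(x[r], y[r]).predict(x)

# AIPW combination: model prediction plus inverse-propensity-weighted residual.
dr_terms = y_hat + r * (y - y_hat) / p_hat
print(f"doubly robust mean estimate: {dr_terms.mean():.3f}")
print(f"respondent-only mean:        {y[r].mean():.3f}")
print(f"true population mean:        {y.mean():.3f}")
```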
The City of Bloomington contracted with National Research Center, Inc. to conduct the 2019 Bloomington Community Survey. This was the second scientific citywide survey covering resident opinions on satisfaction with City of Bloomington service delivery and on quality-of-life issues; the first was conducted in 2017. The survey captured the responses of 610 households from a representative sample of 3,000 Bloomington residents who were randomly selected to complete the survey. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as those found in the report delivered to the City of Bloomington. The easiest way to replicate these results is likely to create pivot tables and use the sum of the "wt" field rather than a count of responses.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Demographic report of weighted survey data (online).
This data package contains quality measures such as Air Quality, Austin Airport, LBB Performance Report, School Survey, Child Poverty, System International Units, Weight Measures, etc.
https://dbk.gesis.org/dbksearch/sdesc2.asp?no=7467
Media use related to crime. Weighting of criminal offenses. Perception of safety.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Indices are created by consolidating multidimensional data into a single representative measure, known as an index, using a fundamental mathematical model. Most existing indices are essentially averages or weighted averages of the variables under study and ignore multicollinearity among the variables, with the exception of the existing OLS-PCA index methodology based on the Ordinary Least Squares (OLS) estimator. Many existing surveys adopt designs that incorporate survey weights, aiming to obtain a representative sample of the population while minimizing costs. Survey weights play a crucial role in addressing the unequal probabilities of selection inherent in complex survey designs, ensuring accurate and representative estimates of population parameters. However, the existing OLS-PCA based index methodology is designed for simple random sampling and cannot incorporate survey weights, leading to biased estimates and erroneous rankings that can result in flawed inferences and conclusions for survey data. To address this limitation, we propose a novel Survey Weighted PCA (SW-PCA) based index methodology tailored for survey-weighted data. SW-PCA incorporates survey weights, facilitating the development of unbiased and efficient composite indices and improving the quality and validity of survey-based research. Simulation studies demonstrate that the SW-PCA based index outperforms the OLS-PCA based index that neglects survey weights, indicating its higher efficiency. To validate the methodology, we applied it to Household Consumer Expenditure Survey (HCES) data from the NSS 68th Round to construct a Food Consumption Index for different states of India, which produced significant improvements in state rankings when survey weights were considered. In conclusion, this study highlights the crucial importance of incorporating survey weights in index construction from complex survey data. The SW-PCA based index provides a valuable solution, enhancing the accuracy and reliability of survey-based research and ultimately contributing to more informed decision-making.
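The general idea behind a survey-weighted PCA index can be sketched in a few lines (this is not the authors' SW-PCA implementation): use the survey weights when computing the means and the covariance matrix, then score each unit on the first principal component. The data, weights, and indicators below are simulated.

```python
import numpy as np

# Weighted PCA sketch: weighted centering, weighted covariance, first-PC index scores.
rng = np.random.default_rng(4)
n = 300
X = rng.normal(size=(n, 4))                 # four consumption-related indicators (hypothetical)
w = rng.uniform(0.5, 2.0, size=n)           # survey weights
w = w / w.sum()                             # normalize so weights sum to one

mu = w @ X                                  # weighted means
Xc = X - mu
cov_w = (Xc * w[:, None]).T @ Xc            # weighted covariance matrix

eigvals, eigvecs = np.linalg.eigh(cov_w)
pc1 = eigvecs[:, -1]                        # loadings of the first principal component
index = Xc @ pc1                            # composite index score for each unit

print("first-PC loadings:", np.round(pc1, 3))
print("share of weighted variance explained:", round(eigvals[-1] / eigvals.sum(), 3))
```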
This report contains a brief review of the sampling weight calibration methodology used for the 2018 National Survey on Drug Use and Health (NSDUH), along with detailed documentation on the implementation steps and evaluation results from the weight calibration application. The constrained exponential modeling (CEM) method used in the surveys before 1999 (referred to in this report as the generalized exponential model [GEM]) was modified to provide more flexibility in dealing internally with extreme weights and to allow bounds to be set directly on the weight adjustment factors so that they are suitable for nonresponse (nr) and poststratification (ps) adjustments.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Calculation strategy for survey and population weighting of the data.