Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Calculation strategy for survey and population weighting of the data.
A random sample of households was invited to participate in this survey. In the dataset, each row contains one respondent's answers and each column contains one question. The numbers represent a scale option from the survey, such as 1=Excellent, 2=Good, 3=Fair, 4=Poor. The question stem, response options, and scale information for each field can be found in the "variable labels" and "value labels" sheets. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as those found in the report delivered to the City of Bloomington. The easiest way to replicate these results is likely to create pivot tables and use the sum of the "wt" field rather than a count of responses.
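The note above suggests replicating the report by summing "wt" within each response category instead of counting rows. A minimal pandas sketch of that approach, using hypothetical column names ("q1" for a survey item, "wt" for the weight), might look like this:

```python
import pandas as pd

# Hypothetical respondent-level file: one row per respondent, coded responses
# plus the survey weight in column "wt" (column names assumed from the description).
df = pd.DataFrame({
    "q1": [1, 2, 2, 3, 4, 1, 2],          # e.g. 1=Excellent, 2=Good, 3=Fair, 4=Poor
    "wt": [0.8, 1.2, 1.0, 0.9, 1.1, 1.3, 0.7],
})

# Unweighted percentages: a simple count of responses.
unweighted = df["q1"].value_counts(normalize=True).sort_index() * 100

# Weighted percentages: sum the "wt" field within each response category,
# mirroring a pivot table that sums "wt" instead of counting rows.
weighted = df.groupby("q1")["wt"].sum() / df["wt"].sum() * 100

print(unweighted)
print(weighted)
```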
The People and Nature Survey for England gathers information on people’s experiences and views about the natural environment, and its contributions to our health and wellbeing.
This publication reports a set of weighted national indicators (Official Statistics) from the survey, which have been generated using data collected in the first year (April 2020 - March 2021) from approx. 25,000 adults (16+).
These updated indicators have been generated using the specific People and Nature weight and can be directly compared with monthly indicators published from April 2021 onwards. See Technical methods and limitations for more information.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Replication materials for the forthcoming publication entitled "Worth Weighting? How to Think About and Use Weights in Survey Experiments."
Demographic information for a Utah survey of mental health stigma used to generate different survey weights to match census data.
Learn about the techniques used to create weights for the 2022 National Survey on Drug Use and Health (NSDUH) at the pair and questionnaire dwelling unit (QDU) levels. NSDUH is designed so that some of the sampled households have both an adult and a youth respondent who are paired. Because of this, NSDUH allows for estimating characteristics at the person level, pair level, or QDU level. This report describes pair selection probabilities, the generalized exponential model (including the predictor variables used), and the multiple weight components that are used for pair- or QDU-level analyses. An evaluation of the calibration weights is also included. Chapters: introduces the report; discusses the probability of selection for pairs and QDUs; briefly describes the generalized exponential model; describes the predictor variables for the model calibration; defines extreme weights; discusses weight calibrations; evaluates the calibration weights. Appendices include technical details about the model and the evaluations that were performed.
This report describes the person-level sampling weight calibration procedures used on the 2012 National Survey on Drug Use and Health (NSDUH). It covers the practical aspects of implementing the generalized exponential model (GEM) for the NSDUH.
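GEM itself is a flexible exponential model with bounds on the adjustment factors; as a rough illustration of the simpler idea underlying weight calibration, the sketch below rakes a set of base weights to external control totals. All variable names and totals are invented for the example.

```python
import numpy as np
import pandas as pd

# Illustrative raking (iterative proportional fitting): repeatedly scale weights so
# weighted margins match external control totals. The NSDUH GEM is a more general
# exponential model with bounded adjustment factors; this is only a simplified sketch.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "sex": rng.choice(["F", "M"], size=500),
    "age": rng.choice(["18-34", "35-64", "65+"], size=500),
})
df["wt"] = 1.0  # start from base (design) weights

# Hypothetical population control totals for each margin.
controls = {
    "sex": {"F": 260.0, "M": 240.0},
    "age": {"18-34": 150.0, "35-64": 250.0, "65+": 100.0},
}

for _ in range(25):  # a few raking cycles are usually enough to converge
    for var, totals in controls.items():
        current = df.groupby(var)["wt"].sum()
        factors = {cat: totals[cat] / current[cat] for cat in totals}
        df["wt"] *= df[var].map(factors)

print(df.groupby("sex")["wt"].sum())   # matches the sex control totals
print(df.groupby("age")["wt"].sum())   # matches the age control totals
```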
Survey weighting allows researchers to account for bias in survey samples, due to unit nonresponse or convenience sampling, using measured demographic covariates. Unfortunately, in practice, it is impossible to know whether the estimated survey weights are sufficient to alleviate concerns about bias due to unobserved confounders or incorrect functional forms used in weighting. In the following paper, we propose two sensitivity analyses for the exclusion of important covariates: (1) a sensitivity analysis for partially observed confounders (i.e., variables measured across the survey sample, but not the target population), and (2) a sensitivity analysis for fully unobserved confounders (i.e., variables not measured in either the survey or the target population). We provide graphical and numerical summaries of the potential bias that arises from such confounders, and introduce a benchmarking approach that allows researchers to quantitatively reason about the sensitivity of their results. We demonstrate our proposed sensitivity analyses using state-level 2020 U.S. Presidential Election polls.
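The benchmarking idea can be illustrated informally: re-weight with and without a measured covariate and treat the shift in the estimate as a yardstick for what an unobserved confounder of comparable strength could do. The sketch below is not the paper's estimator; the data, population shares, and variable names are simulated for illustration.

```python
import numpy as np
import pandas as pd

# Benchmarking-style illustration (not the estimators from the paper): drop one
# measured covariate from the weighting and see how much the estimate moves.
rng = np.random.default_rng(1)
n = 2000
educ = rng.choice(["no_college", "college"], size=n, p=[0.4, 0.6])  # sample over-represents college
y = np.where(educ == "college", 0.45, 0.60) + rng.normal(0, 0.05, n)  # outcome related to education

# Hypothetical population shares (e.g., from a census benchmark).
pop_share = {"no_college": 0.65, "college": 0.35}
samp_share = pd.Series(educ).value_counts(normalize=True)

# Post-stratification weights that use education vs. weights that ignore it.
w_full = pd.Series(educ).map(lambda g: pop_share[g] / samp_share[g]).to_numpy()
w_naive = np.ones(n)

est_full = np.average(y, weights=w_full)
est_naive = np.average(y, weights=w_naive)
print(f"with education in weights:    {est_full:.3f}")
print(f"without education in weights: {est_naive:.3f}")
print(f"benchmark shift:              {est_full - est_naive:+.3f}")
```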
The page contains materials from the PHS Seminar on Weighting Techniques for Large Private Claims Data held on October 24, 2024, as well as some additional documentation and the weights themselves.
On October 24, 2024, PHS hosted a Seminar on Weighting Techniques for Large Private Claims Data. Using the MarketScan Commercial Database as an example case, Social Scientist Sarah Hirsch discussed three schemes for weighting private claims data using US census-based surveys, and the associated methods and techniques. She provided researchers with the tools to implement these methodologies, or to formulate their own for other datasets.
We invite you to view the Recording of the Seminar to learn more about this topic! The slide deck and transcript are also available for reference.
We have also added some code scripts, a written description of the weighting process, and the final MarketScan weights, along with some additional related materials.
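As a rough illustration of one way claims data can be weighted to census-based totals (not necessarily one of the three schemes presented in the seminar), the sketch below post-stratifies hypothetical enrollee counts to hypothetical population counts within age-by-sex cells.

```python
import pandas as pd

# One simple weighting scheme (a sketch only): post-stratify enrollee counts in the
# claims data to census-based population totals within age-by-sex cells.
# Column names and totals are hypothetical.
claims = pd.DataFrame({
    "age_group": ["18-44", "18-44", "45-64", "45-64"],
    "sex":       ["F",     "M",     "F",     "M"],
    "enrollees": [1200,    1100,    900,     800],
})
census = pd.DataFrame({
    "age_group": ["18-44", "18-44", "45-64", "45-64"],
    "sex":       ["F",     "M",     "F",     "M"],
    "population": [55_000, 54_000, 41_000, 39_000],
})

# Each cell's weight scales its enrollees up to the census population count.
merged = claims.merge(census, on=["age_group", "sex"])
merged["weight"] = merged["population"] / merged["enrollees"]
print(merged[["age_group", "sex", "weight"]])
```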
In the November 2016 U.S. presidential election, many state level public opinion polls, particularly in the Upper Midwest, incorrectly predicted the winning candidate. One leading explanation for this polling miss is that the precipitous decline in traditional polling response rates led to greater reliance on statistical methods to adjust for the corresponding bias, and that these methods failed to adjust for important interactions between key variables like educational attainment, race, and geographic region. Finding calibration weights that account for important interactions remains challenging with traditional survey methods: raking typically balances the margins alone, while post-stratification, which exactly balances all interactions, is only feasible for a small number of variables. In this paper, we propose multilevel calibration weighting, which enforces tight balance constraints for marginal balance and looser constraints for higher-order interactions. This incorporates some of the benefits of post-stratification while retaining the guarantees of raking. We then correct for the bias due to the relaxed constraints via a flexible outcome model; we call this approach Double Regression with Post-stratification (DRP). We use these tools to re-assess a large-scale survey of voter intention in the 2016 U.S. presidential election, finding meaningful gains from the proposed methods. The approach is available in the multical R package. Contains replication materials for "Multilevel calibration weighting for survey data", including raw data, scripts to clean the raw data, scripts to replicate the analysis, and scripts to replicate the simulation study.
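To illustrate the contrast drawn above (without using the multical package), the sketch below computes post-stratification weights over the full cross-classification of two hypothetical variables, which balances their interaction exactly; raking, by comparison, would only match the two margins. All counts and names are invented for the example.

```python
import numpy as np
import pandas as pd

# Post-stratification on the full education-by-region cross-classification:
# one weight per cell, so every interaction of the two variables is balanced exactly.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "educ":   rng.choice(["hs", "college"], size=1000, p=[0.3, 0.7]),
    "region": rng.choice(["midwest", "other"], size=1000, p=[0.2, 0.8]),
})

# Hypothetical population counts for every cell of the cross-classification.
pop_cells = pd.DataFrame({
    "educ":   ["hs", "hs", "college", "college"],
    "region": ["midwest", "other", "midwest", "other"],
    "pop":    [120, 380, 110, 390],
})
samp_cells = df.groupby(["educ", "region"]).size().reset_index(name="n")

cells = pop_cells.merge(samp_cells, on=["educ", "region"])
cells["cell_wt"] = cells["pop"] / cells["n"]   # one adjustment factor per cell
df = df.merge(cells[["educ", "region", "cell_wt"]], on=["educ", "region"])

print(df.groupby(["educ", "region"])["cell_wt"].sum())  # reproduces the population cell counts
```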
This report describes the person-level sampling weight calibration procedures used on the 2011 National Survey on Drug Use and Health (NSDUH).
Learn about the techniques used to create weights for the 2022 National Survey on Drug Use and Health (NSDUH) at the person level. The report reviews the generalized exponential model (GEM) used in weighting, discusses potential predictor variables, and details the practical steps used to implement GEM. The report also details the weight calibrations and presents the evaluation measures of the calibrations, as well as a sensitivity analysis. Chapters: introduces the survey and the remainder of the report; reviews the impact of multimode data collection on weighting; briefly describes the generalized exponential model; describes the predictor variables for the model calibration; defines extreme weights; discusses control totals for poststratification adjustments; discusses weight calibration at the dwelling unit level; discusses weight calibration at the person level; presents the evaluation measures of calibrated weights and a sensitivity analysis of selected prevalence estimates; explains the break-off analysis weights. Appendices include technical details about the model and the evaluations that were performed.
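A generic diagnostic often used when evaluating calibrated weights (not the report's full evaluation suite) is the unequal-weighting design effect and the Kish effective sample size, together with a simple check for extreme weights:

```python
import numpy as np

# Hypothetical calibrated weights for a handful of respondents.
w = np.array([0.6, 0.9, 1.0, 1.1, 1.3, 2.8, 0.7, 1.2])

n = len(w)
deff_w = n * np.sum(w**2) / np.sum(w)**2        # unequal-weighting design effect
n_eff = np.sum(w)**2 / np.sum(w**2)             # Kish effective sample size
extreme = w > np.median(w) * 3                  # one simple extreme-weight rule

print(f"design effect from weighting: {deff_w:.3f}")
print(f"effective sample size:        {n_eff:.1f} of {n}")
print(f"extreme weights flagged:      {w[extreme]}")
```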
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Weighted sample proportions and demographic weighting values obtained from 2010 U.S. Census data.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Conventional survey tools such as weighting do not address non-ignorable nonresponse that occurs when nonresponse depends on the variable being measured. This paper describes non-ignorable nonresponse weighting and imputation models using randomized response instruments, which are variables that affect response but not the outcome of interest (Sun et al. 2018). The paper uses a doubly robust estimator that is valid if one, but not necessarily both, of the weighting and imputation models is correct. When applied to a national 2019 survey, these tools produce estimates that suggest there was non-trivial non-ignorable nonresponse related to turnout, and, for subgroups, Trump approval and policy questions. For example, the conventional MAR-based weighted estimates of Trump support in the Midwest were 10 percentage points lower than the MNAR-based estimates. Data to replicate estimation described in "Countering Non-Ignorable Nonresponse in Survey Models with Randomized Response Instruments and Doubly Robust Estimation"
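The doubly robust logic can be sketched with a generic AIPW-style estimator of a mean under nonresponse; this omits the paper's randomized response instruments and simply shows how a weighting (response) model and an imputation (outcome) model combine so the estimate survives misspecification of either one. Data are simulated and scikit-learn is used for the two working models.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Simulate a population where response depends on an observed covariate x.
rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=(n, 1))                          # observed covariate
y = 1.0 + 0.8 * x[:, 0] + rng.normal(0, 1, n)        # outcome of interest
p_resp = 1 / (1 + np.exp(-(0.2 + 0.9 * x[:, 0])))    # response probability rises with x
r = rng.random(n) < p_resp                           # response indicator

# Weighting model: estimated response propensities.
p_hat = LogisticRegression().fit(x, r).predict_proba(x)[:, 1]
# Imputation model: predicted outcomes for everyone, fit on respondents only.
y_hat = LinearRegression().fit(x[r], y[r]).predict(x)

# AIPW combination: model prediction plus inverse-propensity-weighted residual.
dr_terms = y_hat + r * (y - y_hat) / p_hat
print(f"doubly robust mean estimate: {dr_terms.mean():.3f}")
print(f"respondent-only mean:        {y[r].mean():.3f}")
print(f"true population mean:        {y.mean():.3f}")
```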
The City of Bloomington contracted with National Research Center, Inc. to conduct the 2019 Bloomington Community Survey. This was the second scientific citywide survey covering resident opinions on satisfaction with City of Bloomington service delivery and on quality-of-life issues; the first was conducted in 2017. The survey captured the responses of 610 households from a representative sample of 3,000 Bloomington residents who were randomly selected to complete the survey. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as those found in the report delivered to the City of Bloomington. The easiest way to replicate these results is likely to create pivot tables and use the sum of the "wt" field rather than a count of responses.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Demographic report of weighted survey data (online).
This data package contains quality measures such as Air Quality, Austin Airport, LBB Performance Report, School Survey, Child Poverty, System International Units, Weight Measures, etc.
https://dbk.gesis.org/dbksearch/sdesc2.asp?no=7467
Media use related to crime. Weighting of criminal offenses. Perception of safety.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Indices are created by consolidating multidimensional data into a single representative measure, known as an index, using a fundamental mathematical model. Most existing indices are essentially averages or weighted averages of the variables under study and ignore multicollinearity among the variables, with the exception of the existing OLS-PCA index methodology based on the Ordinary Least Squares (OLS) estimator. Many existing surveys adopt designs that incorporate survey weights, aiming to obtain a representative sample of the population while minimizing costs. Survey weights play a crucial role in addressing the unequal probabilities of selection inherent in complex survey designs, ensuring accurate and representative estimates of population parameters. However, the existing OLS-PCA based index methodology is designed for simple random sampling and cannot incorporate survey weights, leading to biased estimates and erroneous rankings that can result in flawed inferences and conclusions for survey data. To address this limitation, we propose a novel Survey Weighted PCA (SW-PCA) based index methodology tailored for survey-weighted data. SW-PCA incorporates survey weights, facilitating the development of unbiased and efficient composite indices and improving the quality and validity of survey-based research. Simulation studies demonstrate that the SW-PCA based index outperforms the OLS-PCA based index that neglects survey weights, indicating its higher efficiency. To validate the methodology, we applied it to Household Consumer Expenditure Survey (HCES) data from the NSS 68th Round to construct a Food Consumption Index for different states of India, which produced significant improvements in state rankings when survey weights were considered. In conclusion, this study highlights the crucial importance of incorporating survey weights in index construction from complex survey data. The SW-PCA based index provides a valuable solution, enhancing the accuracy and reliability of survey-based research and ultimately contributing to more informed decision-making.
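The general idea behind a survey-weighted PCA index can be sketched in a few lines (this is not the authors' SW-PCA implementation): use the survey weights when computing the means and the covariance matrix, then score each unit on the first principal component. The data, weights, and indicators below are simulated.

```python
import numpy as np

# Weighted PCA sketch: weighted centering, weighted covariance, first-PC index scores.
rng = np.random.default_rng(4)
n = 300
X = rng.normal(size=(n, 4))                 # four consumption-related indicators (hypothetical)
w = rng.uniform(0.5, 2.0, size=n)           # survey weights
w = w / w.sum()                             # normalize so weights sum to one

mu = w @ X                                  # weighted means
Xc = X - mu
cov_w = (Xc * w[:, None]).T @ Xc            # weighted covariance matrix

eigvals, eigvecs = np.linalg.eigh(cov_w)
pc1 = eigvecs[:, -1]                        # loadings of the first principal component
index = Xc @ pc1                            # composite index score for each unit

print("first-PC loadings:", np.round(pc1, 3))
print("share of weighted variance explained:", round(eigvals[-1] / eigvals.sum(), 3))
```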
This report contains a brief review of the sampling weight calibration methodology used for the 2018 National Survey on Drug Use and Health (NSDUH), along with detailed documentation on the implementation steps and evaluation results from the weight calibration application. The constrained exponential modeling (CEM) method used in the surveys before 1999 (referred to in this report as the generalized exponential model [GEM]) was modified to provide more flexibility in dealing internally with extreme weights and to allow bounds to be set directly on the weight adjustment factors so that they are suitable for nonresponse (nr) and poststratification (ps) adjustments.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Calculation strategy for survey and population weighting of the data.