100+ datasets found

Comparison of variable weights from 2 datasets for grizzly bear local...
figshare.com
xls
Updated Jun 1, 2023
Cite
Tabitha A. Graves; J. Andrew Royle; Katherine C. Kendall; Paul Beier; Jeffrey B. Stetz; Amy C. Macleod (2023). Comparison of variable weights from 2 datasets for grizzly bear local abundance. [Dataset]. http://doi.org/10.1371/journal.pone.0049410.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0049410.t003
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Tabitha A. Graves; J. Andrew Royle; Katherine C. Kendall; Paul Beier; Jeffrey B. Stetz; Amy C. Macleod
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Importance (weight) of variables influencing grizzly bear abundance in northwestern Montana, USA, in 2004. Only candidate variables for abundance, not detection, are shown. Weights for variables that were in the model ≥50% of iterations are in bold. Data include only cells with both types of sampling. HT = Hair Trap, BR = Bear Rub. See Graves et al. (In Review) for more details on specific variables. We did not include further details to maintain focus on the influence of different detection methods.
¹ Experts assigned a value 1–10 to ownership categories based on efforts to protect bears including 1) attractant storage management, 2) enforcement of food storage regulations, and 3) road density and use management. Glacier National Park = 10, US Forest Service = 7, other public land = 3, and private = 1.
Health Survey for England, 2002: Teaching Dataset
datacatalogue.cessda.eu
beta.ukdataservice.ac.uk
Updated Nov 28, 2024
Cite
University of Manchester, Cathie Marsh Centre for Census and Survey Research (2024). Health Survey for England, 2002: Teaching Dataset [Dataset]. http://doi.org/10.5255/UKDA-SN-5033-1
Explore at:
Unique identifier
https://doi.org/10.5255/UKDA-SN-5033-1
Dataset updated
Nov 28, 2024
Dataset provided by
ESDS Government
Authors
University of Manchester, Cathie Marsh Centre for Census and Survey Research
Time period covered
Jan 1, 2002 - Mar 1, 2002
Area covered
England
Variables measured
Individuals, National
Measurement technique
Face-to-face interview, Self-completion, Clinical measurements, Physical measurements, - original data; transcription of existing materials - teaching dataset
Description
Abstract copyright UK Data Service and data collection copyright owner.

The Health Survey for England (HSE), 2002: Teaching Dataset has been prepared solely for the purpose of teaching and student use. The dataset will help class tutors to incorporate empirical data into their courses and thus to develop students’ skills in quantitative methods of analysis.

All the variables and value labels are those used in the original HSE files, with one exception (New-wt) which is a new weighting variable.

Users may be interested in the Guide to using SPSS for Windows available from Online statistical guides and which explores this dataset.

The original HSE 2002 dataset is held at the UK Data Archive under SN 4912.

Main Topics:

The HSE, 2002: Teaching Dataset includes 60 variables, and only the 9,281 cases from the general population sample; the boost sample cases of young people aged 0-24 and mothers of children aged under one year are excluded. Most of the variables contained within the dataset are individual-level ones and require individual-based analysis. However, a number of household-level variables are included, such as ‘TenureB’ and ‘Hhsize’. The dataset contains a mix of discrete and continuous variables and all, apart from the weighting variable 'New_wt', are taken directly from the HSE 2002 dataset deposited at UKDA. The variable names on the Teaching Dataset correspond directly to those on the 2002 HSE dataset.

Topics covered include demographic characteristics, illness and general health, recent periods of sickness, medication used, contraception, smoking, alcohol use, consumption of fruit and vegetables, General Health Questionnaire (GHQ12) score, height, weight, body mass index (BMI), waist-hip ratio and blood pressure measurement.

Standard Measures
The General Health Questionnaire (GHQ12), which has 12 items, is used widely to screen for psycho-social disorders. It asks questions about general level of happiness, depression, anxiety and self-confidence. A score of four or more has been used to identify potential psychological disorder.
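The screening rule described above can be sketched in a few lines. This is only an illustration: the function name `ghq12_case` and the sample scores are hypothetical, and the 0 to 12 score range assumes one point per item, which the description does not state explicitly.

```python
def ghq12_case(score, threshold=4):
    """Return True when a GHQ12 score meets the screening threshold.

    Assumption: scores run from 0 to 12 (one point per item).
    The threshold of four comes from the description above.
    """
    if not 0 <= score <= 12:
        raise ValueError("GHQ12 score must be between 0 and 12")
    return score >= threshold


# Hypothetical scores, not taken from the teaching dataset.
scores = [0, 3, 4, 12]
flags = [ghq12_case(s) for s in scores]
```

A flag of this kind identifies potential cases for further analysis; it is a screening indicator, not a diagnosis.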
Replication Data for: Countering Non-Ignorable Nonresponse in Survey Models...
search.dataone.org
dataverse.harvard.edu
Updated Sep 24, 2024
Cite
Bailey, Michael (2024). Replication Data for: Countering Non-Ignorable Nonresponse in Survey Models with Randomized Response Instruments and Doubly Robust Estimation [Dataset]. http://doi.org/10.7910/DVN/L2NVRD
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/L2NVRD
Dataset updated
Sep 24, 2024
Dataset provided by
Harvard Dataverse
Authors
Bailey, Michael
Description
Conventional survey tools such as weighting do not address non-ignorable nonresponse that occurs when nonresponse depends on the variable being measured. This paper describes non-ignorable nonresponse weighting and imputation models using randomized response instruments, which are variables that affect response but not the outcome of interest (Sun et al., 2018). The paper uses a doubly robust estimator that is valid if one, but not necessarily both, of the weighting and imputation models is correct. When applied to a national 2019 survey, these tools produce estimates that suggest there was non-trivial non-ignorable nonresponse related to turnout, and, for subgroups, Trump approval and policy questions. For example, the conventional MAR-based weighted estimates of Trump support in the Midwest were 10 percentage points lower than the MNAR-based estimates. Data to replicate estimation described in "Countering Non-Ignorable Nonresponse in Survey Models with Randomized Response Instruments and Doubly Robust Estimation".
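To make the doubly robust idea concrete, here is a minimal sketch of the textbook augmented inverse-probability-weighted mean. It is not the paper's estimator: the paper fits non-ignorable (MNAR) weighting and imputation models with randomized response instruments, whereas this sketch simply takes response propensities and imputation predictions as given inputs. The function name and the toy values are hypothetical.

```python
def doubly_robust_mean(y, responded, propensity, prediction):
    """Augmented inverse-probability-weighted estimate of a population mean.

    y          -- observed outcomes (ignored for nonrespondents)
    responded  -- 1 if the unit responded, else 0
    propensity -- estimated response probability from a weighting model
    prediction -- predicted outcome from an imputation model
    """
    total = 0.0
    for y_i, r_i, p_i, m_i in zip(y, responded, propensity, prediction):
        # Respondents add a propensity-weighted residual correction;
        # nonrespondents contribute only the imputation-model prediction.
        correction = (y_i - m_i) / p_i if r_i else 0.0
        total += m_i + correction
    return total / len(y)


# Toy inputs (assumed for illustration only).
y = [1.0, None, 0.0, None]
responded = [1, 0, 1, 0]
propensity = [0.5, 0.5, 0.5, 0.5]
prediction = [0.5, 0.5, 0.5, 0.5]
estimate = doubly_robust_mean(y, responded, propensity, prediction)
```

The correction term averages to zero when the imputation model is right, and the inverse-propensity weighting repairs the estimate when only the weighting model is right, which is the sense in which one correct model suffices.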
American Time Use Survey, 2005 - Archival Version
search.gesis.org
Updated Jan 14, 2008
Cite
ICPSR - Interuniversity Consortium for Political and Social Research (2008). American Time Use Survey, 2005 - Archival Version [Dataset]. http://doi.org/10.3886/ICPSR04709
Explore at:
Unique identifier
https://doi.org/10.3886/ICPSR04709
Dataset updated
Jan 14, 2008
Dataset provided by
GESIS search
ICPSR - Interuniversity Consortium for Political and Social Research
License
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de438965
Description
Abstract (en): The American Time Use Survey (ATUS) collects information on how people living in the United States spend their time. Data collected in this study measured the amount of time that people spent doing various activities in 2005, such as paid work, child care, religious activities, volunteering, and socializing. Respondents were randomly selected from households that had completed their final month of the Current Population Survey (CPS), and were interviewed two to five months after their household's last CPS interview. Respondents were interviewed only once and reported their activities for the 24-hour period from 4 a.m. on the day before the interview until 4 a.m. on the day of the interview. Respondents indicated the total number of minutes spent on each activity, including where they were and whom they were with. Except for secondary child care, data on activities done simultaneously with primary activities were not collected. Part 1, Respondent and Activity Summary File, contains demographic information about respondents and a summary of the total amount of time they spent doing each activity that day. Part 2, Roster File, contains information about household members and nonhousehold children under the age of 18. Part 3, Activity File, includes additional information on activities in which respondents participated, including the location of each activity and the total time spent on secondary child care. Part 4, Who File, includes data on who was present during each activity. Part 5, ATUS-CPS 2005 File, contains data on respondents and members of their household collected two to five months prior to the ATUS interviews during their participation in the Current Population Survey (CPS). Parts 6-10 contain supplemental data files that can be used for further analysis of the data. Part 6, Case History File, contains information about the interview process, such as identifiers and interview outcome codes. 
Part 7, Call History File, gives information about each call attempt, including the call date and outcome. Part 8, Trips File, provides information about the number, duration, and purpose of overnight trips away from home for two or more nights in a row. Part 9, Replicate Weights File I, contains base weights, replicated base weights, and replicate final weights for each case that was selected to be interviewed for ATUS, while Part 10, Replicate Weights File II, contains replicate weights that were generated using the 2006 weighting method. Demographic variables include sex, age, race, ethnicity, education level, income, employment status, occupation, citizenship status, country of origin, relationship to household members, and the ages and number of children in the household. The data contain weight variables which should be used in analyzing the data. Unweighted data are not representative of the population due to differences between population groups in both sampling and nonresponse. ATUS weight variables include the ATUS final weight (TUFINLWGT), which indicates the number of person-days the respondent represents, the ATUS base weight (TUBWGT), and an ATUS final weight based on 2006 weighting methodology (TU06FWGT). ATUS weights were selected from the Current Population Survey (CPS), and CPS weights (after the first-stage adjustment) are the basis for the ATUS weights. These base weights were adjusted to account for the fact that less populous states were not oversampled in ATUS, as they were in the CPS. Further adjustments were made to account for the probability of selecting each household within the ATUS sampling strata and the probability of selecting each person from each sample household. Part 9 contains replicate weights for the variable TUFINLWGT, as well as base weights, while Part 10 contains replicate weights for the variable TU06FWGT. ATUS replicate weights were based on the replicate weights developed for the CPS. 
ATUS began with the CPS replicate weight after the first-stage ratio adjustment, and each replicate was processed through all of the stages of the ATUS weighting procedure. The CPS replicate weights were based on a modified balanced half-sample method of replication, developed in the 1980s by Robert Fay. For more information about the replicate weights, see the publication, Technical Paper 63RV: Current Population Survey -- Design and Methodology, available via the Bureau of Labor Statistics Web site. More information on the weighting variables used in this study can be found in t...
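As a rough illustration of how a final weight and its replicate weights are used together, the sketch below computes a weighted mean and a replicate-based variance under a Fay-modified balanced half-sample scheme. The Fay coefficient of 0.5, the function names, and the toy values are assumptions made for this example; the study documentation should be consulted for the actual replicate count and variance formula.

```python
def weighted_mean(values, weights):
    """Weighted mean, e.g. average minutes per day using a final weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def fay_brr_variance(values, final_weights, replicate_weights, fay_k=0.5):
    """Variance of a weighted mean from Fay-modified half-sample replicates.

    replicate_weights is a list of weight vectors, one per replicate.
    fay_k is the Fay coefficient (0.5 is assumed here, not taken from the text).
    """
    full = weighted_mean(values, final_weights)
    reps = [weighted_mean(values, w) for w in replicate_weights]
    factor = 1.0 / (len(reps) * (1.0 - fay_k) ** 2)
    return factor * sum((r - full) ** 2 for r in reps)


# Toy data: minutes spent on an activity by four hypothetical respondents.
minutes = [10.0, 20.0, 30.0, 40.0]
final_w = [1.0, 1.0, 1.0, 1.0]           # stands in for TUFINLWGT
replicate_w = [
    [1.5, 0.5, 1.5, 0.5],                # stands in for one replicate weight
    [0.5, 1.5, 0.5, 1.5],                # and a second one
]
estimate = weighted_mean(minutes, final_w)
variance = fay_brr_variance(minutes, final_w, replicate_w)
```

The same pattern applies to any statistic: recompute it once per replicate weight, then scale the squared deviations from the full-sample estimate.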
Replication Data for: Multilevel calibration weighting for survey data
dataone.org
dataverse.harvard.edu
Updated Nov 8, 2023
Cite
Ben-Michael, Eli; Feller, Avi; Hartman, Erin (2023). Replication Data for: Multilevel calibration weighting for survey data [Dataset]. http://doi.org/10.7910/DVN/J7BSXQ
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/J7BSXQ
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Ben-Michael, Eli; Feller, Avi; Hartman, Erin
Description
In the November 2016 U.S. presidential election, many state level public opinion polls, particularly in the Upper Midwest, incorrectly predicted the winning candidate. One leading explanation for this polling miss is that the precipitous decline in traditional polling response rates led to greater reliance on statistical methods to adjust for the corresponding bias, and that these methods failed to adjust for important interactions between key variables like educational attainment, race, and geographic region. Finding calibration weights that account for important interactions remains challenging with traditional survey methods: raking typically balances the margins alone, while post-stratification, which exactly balances all interactions, is only feasible for a small number of variables. In this paper, we propose multilevel calibration weighting, which enforces tight balance constraints for marginal balance and looser constraints for higher-order interactions. This incorporates some of the benefits of post-stratification while retaining the guarantees of raking. We then correct for the bias due to the relaxed constraints via a flexible outcome model; we call this approach Double Regression with Post-stratification (DRP). We use these tools to re-assess a large-scale survey of voter intention in the 2016 U.S. presidential election, finding meaningful gains from the proposed methods. The approach is available in the multical R package. Contains replication materials for "Multilevel calibration weighting for survey data", including raw data, scripts to clean the raw data, scripts to replicate the analysis, and scripts to replicate the simulation study.
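For readers unfamiliar with raking, the baseline method the abstract contrasts against, here is a minimal sketch of iterative proportional fitting on two margins. It is plain raking, not the multilevel calibration method and not the multical package's interface; the function names, category labels, and target shares are all hypothetical.

```python
def rake(rows, targets, n_iter=50):
    """Rake unit weights so weighted margins match target population shares.

    rows    -- list of dicts mapping variable name to category
    targets -- {variable: {category: population share}}, shares sum to 1
    """
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        for var, shares in targets.items():
            total = sum(weights)
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale each unit so its category hits the target share.
            weights = [
                w * shares[row[var]] * total / current[row[var]]
                for row, w in zip(rows, weights)
            ]
    return weights


def weighted_share(rows, weights, var, category):
    """Weighted proportion of units falling in one category."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == category)
    return hit / sum(weights)


# Toy sample that over-represents college-educated respondents.
sample = [
    {"educ": "college", "region": "midwest"},
    {"educ": "college", "region": "other"},
    {"educ": "college", "region": "other"},
    {"educ": "no_college", "region": "midwest"},
    {"educ": "no_college", "region": "other"},
]
targets = {
    "educ": {"college": 0.4, "no_college": 0.6},
    "region": {"midwest": 0.3, "other": 0.7},
}
raked = rake(sample, targets)
```

Note that only the two margins are matched; the education-by-region interaction is left wherever the sample puts it, which is the limitation the abstract describes.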
IVFNs of linguistic variables for importance weights of criteria.
figshare.com
plos.figshare.com
xls
Updated May 31, 2023
Cite
Sanghoon Lee; Daekook Kang (2023). IVFNs of linguistic variables for importance weights of criteria. [Dataset]. http://doi.org/10.1371/journal.pone.0219739.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0219739.t003
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Sanghoon Lee; Daekook Kang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
IVFNs of linguistic variables for importance weights of criteria.
Weights to different model variables by experts (E1–E5).
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Chinmaya S. Rathore; Yogesh Dubey; Anurag Shrivastava; Prasad Pathak; Vinayak Patil (2023). Weights to different model variables by experts (E1–E5). [Dataset]. http://doi.org/10.1371/journal.pone.0039996.t009
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0039996.t009
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Chinmaya S. Rathore; Yogesh Dubey; Anurag Shrivastava; Prasad Pathak; Vinayak Patil
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Weights to different model variables by experts (E1–E5).
Data from: Viral Communication Phase I-II
data.niaid.nih.gov
Updated Dec 2, 2021
Cite
Wagoner, Brady (2021). Viral Communication Phase I-II [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4946139
Explore at:
Dataset updated
Dec 2, 2021
Dataset provided by
Wagoner, Brady
Jensen, Eric Allen
Herbig, Lisa
Lorenz, Lars
Pfleger, Axel
Watzlawik, Meike
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset represents the anonymised data collected as part of the Viral Communication (Understand-ELSED) project. It includes the two measurements, Phase I (30 October 2020 and 14 December 2020) and Phase II (2 March 2021 and 22 March 2021).

As the dataset involves two measurements (i.e., Phase I and Phase II), it is split into two sections, each of which can be identified by looking at the variable names. Variables corresponding to the Phase I survey will have the prefix “PHASE1_”, while variables from the Phase II survey will have the prefix “PHASE2_”. Exceptions to this are the socio-demographic variables from the main section of the Phase I survey.

Each questionnaire was split into a main and an opt-in section, the cut-off points of which are located after the variables PHASE1_OI_AQ and PHASE2_OI_AQ, respectively. Furthermore, two sets of two weighting variables were calculated. The first set includes weights for analyses only involving Phase I variables, while the second set contains weights for analyses involving any Phase II variables. The corresponding variable labels specify how to use any of the weighting variables, which are located in positions 2 through 5.

In the Phase II survey, we included two experimental setups. For the vaccination origin experiment, we included a grouping variable, PHASE2_HM_VACC_GROUP. The same was done for the risk assessment experiment, with PHASE2_RA_INF_GROUP being the grouping variable.
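The naming convention described above lends itself to a simple column split. The sketch below is illustrative only: the function name `split_by_phase` and the unprefixed column `AGE` are hypothetical, while the prefixed names are the ones mentioned in the description.

```python
def split_by_phase(columns):
    """Group variable names by their survey-phase prefix."""
    groups = {"PHASE1": [], "PHASE2": [], "other": []}
    for name in columns:
        if name.startswith("PHASE1_"):
            groups["PHASE1"].append(name)
        elif name.startswith("PHASE2_"):
            groups["PHASE2"].append(name)
        else:
            # Unprefixed names, such as socio-demographic variables.
            groups["other"].append(name)
    return groups


columns = [
    "AGE",  # hypothetical unprefixed socio-demographic variable
    "PHASE1_OI_AQ",
    "PHASE2_OI_AQ",
    "PHASE2_HM_VACC_GROUP",
    "PHASE2_RA_INF_GROUP",
]
groups = split_by_phase(columns)
```

Any analysis that touches a variable in the `PHASE2` group would then use the second set of weighting variables, as the description specifies.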
Health Survey Northern Ireland, 2010-2011: Adult BMI Dataset
datacatalogue.cessda.eu
beta.ukdataservice.ac.uk
Updated Nov 29, 2024
Cite
Department of Health (Northern Ireland) (2024). Health Survey Northern Ireland, 2010-2011: Adult BMI Dataset [Dataset]. http://doi.org/10.5255/UKDA-SN-9221-1
Explore at:
Unique identifier
https://doi.org/10.5255/UKDA-SN-9221-1
Dataset updated
Nov 29, 2024
Dataset provided by
Department of Health: http://www.health-ni.gov.uk/
Authors
Department of Health (Northern Ireland)
Time period covered
Mar 31, 2010 - Mar 30, 2011
Area covered
Northern Ireland
Variables measured
Individuals, National
Measurement technique
Physical measurements and tests, Face-to-face interview: Computer-assisted (CAPI/CAMI), Telephone interview: Computer-assisted (CATI)
Description
Abstract copyright UK Data Service and data collection copyright owner.
The Health Survey Northern Ireland (HSNI) was commissioned by the Department of Health in Northern Ireland and the Central Survey Unit (CSU) of the Northern Ireland Statistics and Research Agency (NISRA) carried out the survey on their behalf. This survey series has been running on a continuous basis since April 2010 with separate modules for different policy areas included in different financial years. It covers a range of health topics that are important to the lives of people in Northern Ireland. The HSNI replaces the previous Northern Ireland Health and Social Wellbeing Survey (available under SNs 4589, 4590 and 5710).
Adult BMI, height and weight measurements, accompanying demographic and derived variables, geography, and a BMI weighting variable, are available in separate datasets for each survey year.
Further information is available from the Northern Ireland Statistics and Research Agency and the Department of Health (Northern Ireland) survey webpages.

Data gathered in the HSNI 2010-2011. Variables include measured height and weight, calculated BMI including groupings, age, sex and geography.

Main Topics:
Active Lives Children and Young People Survey, 2018-2019
datacatalogue.cessda.eu
beta.ukdataservice.ac.uk
Updated Nov 29, 2024
Cite
Sport England (2024). Active Lives Children and Young People Survey, 2018-2019 [Dataset]. http://doi.org/10.5255/UKDA-SN-8854-2
Explore at:
Unique identifier
https://doi.org/10.5255/UKDA-SN-8854-2
Dataset updated
Nov 29, 2024
Authors
Sport England
Area covered
England
Variables measured
Individuals, National
Measurement technique
Web-based interview
Description
Abstract copyright UK Data Service and data collection copyright owner.
The Active Lives Children and Young People Survey, which was established in September 2017, provides a world-leading approach to gathering data on how children engage with sport and physical activity. This school-based survey is the first and largest established physical activity survey with children and young people in England. It gives anyone working with children aged 5-16 key insight to help understand children's attitudes and behaviours around sport and physical activity. The results will shape and influence local decision-making as well as inform government policy on the PE and Sport Premium, Childhood Obesity Plan and other cross-departmental programmes. More general information about the study can be found on the Sport England Active Lives Survey webpage and the Active Lives Online website, including reports and data tables.

The Active Lives Children and Young People Survey, 2018-2019 was conducted during school academic year 2018 / 2019. It ran from autumn term 2018 to summer term 2019 and excludes school holidays. The survey identifies how participation varies across different activities and sports, by regions of England, between school types and terms, and between different demographic groups in the population. The survey measures levels of activity (active, fairly active and less active), attitudes towards sport and physical activity, swimming capability, the proportion of children and young people that volunteer in sport, sports spectating, and wellbeing measures such as happiness and life satisfaction. The questionnaire was designed to enable analysis of the findings by a broad range of variables, such as gender, family affluence and school year.

The following datasets are available:
Main dataset: includes responses from children and young people from school years 3 to 11, as well as responses from parents of children in years 1-2. The parents of children in years 1-2 provide behavioural answers about their child’s activity levels; they do not provide attitudinal information. Using this main dataset, full analyses can be carried out into sports and physical activity participation, levels of activity, volunteering (years 5 to 11), etc. Weighting is required when using this dataset (wt_gross / wt_set1.csplan files are available for SPSS users who can utilise them).
Year 1-2 pupil dataset: includes responses from children in school years 1-2 directly, providing their attitudinal responses (e.g. whether they like playing sport and find it easy). Analysis can be carried out into feelings towards swimming, enjoyment for being active, happiness etc. Weighting is required when using this dataset (wt_gross / wt_set1.csplan files are available for SPSS users who can utilise them).
Teacher dataset: includes responses from the teachers at schools selected for the survey. Analysis can be carried out into school facilities available, length of PE lessons, whether swimming lessons are offered, etc. Weighting was formerly not available; however, as Sport England have started to publish the Teacher data, weighting has been applied to the data since December 2023. The Teacher dataset can now be weighted using the ‘wt_teacher’ weighting variable.
For further information about the variables available for analysis, and the relevant school years asked survey questions, please see the supporting documentation. Please read the documentation before using the datasets.
Latest edition information
For the second edition (January 2024), the Teacher dataset now includes a weighting variable (‘wt_teacher’). Previously, weighting was not available for these data.

Main Topics:
Topics covered in the Active Lives Children and Young People Survey include:
Sport and physical activity participation
Well-being
Health
Replication Data for: "Sensitivity Analysis for Survey Weights"
search.dataone.org
dataverse.harvard.edu
Updated Nov 8, 2023
Cite
Hartman, Erin; Huang, Melody (2023). Replication Data for: "Sensitivity Analysis for Survey Weights" [Dataset]. http://doi.org/10.7910/DVN/YJSJEX
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/YJSJEX
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Hartman, Erin; Huang, Melody
Description
Survey weighting allows researchers to account for bias in survey samples, due to unit nonresponse or convenience sampling, using measured demographic covariates. Unfortunately, in practice, it is impossible to know whether the estimated survey weights are sufficient to alleviate concerns about bias due to unobserved confounders or incorrect functional forms used in weighting. In the following paper, we propose two sensitivity analyses for the exclusion of important covariates: (1) a sensitivity analysis for partially observed confounders (i.e., variables measured across the survey sample, but not the target population), and (2) a sensitivity analysis for fully unobserved confounders (i.e., variables not measured in either the survey or the target population). We provide graphical and numerical summaries of the potential bias that arises from such confounders, and introduce a benchmarking approach that allows researchers to quantitatively reason about the sensitivity of their results. We demonstrate our proposed sensitivity analyses using state-level 2020 U.S. Presidential Election polls.
All ESI ACS Matched Weights
redivis.com
Updated Apr 7, 2025
Cite
Stanford Center for Population Health Sciences (2025). All ESI ACS Matched Weights [Dataset]. https://redivis.com/datasets/6f7e-cxanam2b8/tables
Explore at:
Dataset updated
Apr 7, 2025
Dataset authored and provided by
Stanford Center for Population Health Sciences
Time period covered
2008 - 2022
Description
The table All ESI ACS Matched Weights is part of the dataset Weighting Techniques for Large Private Claims Data, available at https://redivis.com/datasets/6f7e-cxanam2b8. It contains 540187074 rows across 6 variables.
Age, Weight, Height, BMI Analysis
kaggle.com
Updated Sep 1, 2023
Cite
Ruken Missonnier (2023). Age, Weight, Height, BMI Analysis [Dataset]. https://www.kaggle.com/datasets/rukenmissonnier/age-weight-height-bmi-analysis
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Sep 1, 2023
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Ruken Missonnier
Description
Dataset Description

The dataset in question comprises 741 individual records, each meticulously documented with the following attributes:

Age (in years): This field quantifies the age of each individual, denominated in years. It serves as a chronological reference for the dataset.

Height (in meters): The "Height" column provides measurements of the subjects' stature in meters. This standardized unit allows for precise representation and comparison of individuals' heights.

Weight (in kilograms): In the "Weight" column, the weights of the subjects are quantified in kilograms. This unit ensures consistency and accuracy in measuring the subjects' mass.

BMI (Body Mass Index): Derived from the height and weight columns, the BMI column computes the Body Mass Index of each individual. The calculation utilizes the formula: BMI = (Weight in kg) / (Height in m)^2, that is, weight divided by the square of height. BMI is a vital numerical indicator used for categorizing individuals based on their weight relative to their height. It is expressed as a continuous variable.

BmiClass: The "BmiClass" column categorizes individuals based on their calculated BMI values. The categories include "Obese Class 1," "Overweight," "Underweight," among others. These classifications are instrumental in health and weight analysis.

Furthermore, it is noteworthy that this dataset exhibits a high degree of data integrity, with no missing values across any of the aforementioned columns. Such completeness enhances its utility for advanced data analytics and visualization, enabling rigorous exploration of relationships between age, height, weight, BMI, and associated weight classifications.
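The BMI and BmiClass columns described above can be reproduced with a short sketch. The cut-offs below are the commonly cited WHO adult thresholds and are an assumption here; the dataset description does not state which boundaries it uses. Only "Underweight", "Overweight", and "Obese Class 1" are labels named in the description; the remaining labels and both function names are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms over height in metres squared."""
    return weight_kg / height_m ** 2


def bmi_class(value):
    """Map a BMI value to a class label (assumed WHO adult cut-offs)."""
    if value < 18.5:
        return "Underweight"
    if value < 25:
        return "Normal Weight"   # hypothetical label
    if value < 30:
        return "Overweight"
    if value < 35:
        return "Obese Class 1"
    if value < 40:
        return "Obese Class 2"   # hypothetical label
    return "Obese Class 3"       # hypothetical label


# Two hypothetical individuals, not records from the dataset.
first = bmi(70.0, 1.75)
second = bmi(95.0, 1.75)
labels = [bmi_class(first), bmi_class(second)]
```

Comparing the output of such a function with the stored BmiClass column is a quick way to check which thresholds the dataset actually applies.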
Current Population Survey (CPS)
search.dataone.org
Updated Nov 21, 2023
Cite
Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/AK4FDD
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Damico, Anthony
Description
analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
download the fixed-width file containing household, family, and person records
import by separating this file into three tables, then merge 'em together at the person-level
download the fixed-width file containing the person-level replicate weights
merge the rectangular person-level file with the replicate weights, then store it in a sql database
create a new variable - one - in the data table

2012 asec - analysis examples.R
connect to the sql database created by the 'download all microdata' program
create the complex sample survey object, using the replicate weights
perform a boatload of analysis examples

replicate census estimates - 2011.R
connect to the sql database created by the 'download all microdata' program
create the complex sample survey object, using the replicate weights
match the sas output shown in the png file below

2011 asec replicate weight sas output.png
statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
the census bureau's current population survey page
the bureau of labor statistics' current population survey page
the current population survey's wikipedia article

notes:
interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
Data from: National Health and Nutrition Examination Survey (NHANES),...
icpsr.umich.edu
ascii, delimited, sas +2
Updated Feb 22, 2012
+ more versions
United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics (2012). National Health and Nutrition Examination Survey (NHANES), 1999-2000 [Dataset]. http://doi.org/10.3886/ICPSR25501.v4
Available download formats: delimited, spss, ascii, sas, stata
Unique identifier
https://doi.org/10.3886/ICPSR25501.v4
Dataset updated
Feb 22, 2012
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
Authors
United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics
License
https://www.icpsr.umich.edu/web/ICPSR/studies/25501/terms
Time period covered
1999 - 2000
Area covered
United States
Description
The National Health and Nutrition Examination Surveys (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys have been conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999 the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements which were designed to meet current and emerging concerns. The surveys examine a nationally representative sample of approximately 5,000 persons each year. These persons are located in counties across the United States, 15 of which are visited each year. The 1999-2000 NHANES contains data for 9,965 individuals (and an MEC-examined sample size of 9,282) of all ages. Many questions that were asked in NHANES II, 1976-1980, Hispanic HANES 1982-1984, and NHANES III, 1988-1994, were combined with new questions in the NHANES 1999-2000. The 1999-2000 NHANES collected data on the prevalence of selected chronic conditions and diseases in the population and estimates for previously undiagnosed conditions, as well as those known to and reported by respondents. Risk factors, those aspects of a person's lifestyle, constitution, heredity, or environment that may increase the chances of developing a certain disease or condition, were examined. Data on smoking, alcohol consumption, sexual practices, drug use, physical fitness and activity, weight, and dietary intake were collected. Information on certain aspects of reproductive health, such as use of oral contraceptives and breastfeeding practices, was also collected. The interview includes demographic, socioeconomic, dietary, and health-related questions. The examination component consists of medical, dental, and physiological measurements, as well as laboratory tests.
Demographic data file variables are grouped into three broad categories: (1) Status Variables: Provide core information on the survey participant. Examples of the core variables include interview status, examination status, and sequence number. (Sequence number is a unique ID assigned to each sample person and is required to match the information on this demographic file to the rest of the NHANES 1999-2000 data). (2) Recoded Demographic Variables: The variables include age (age in months for persons through age 19 years, 11 months; age in years for 1-84 year olds, and a top-coded age group of 85+ years), gender, a race/ethnicity variable, an education variable (high school, and more than high school education), country of birth (United States, Mexico, or other foreign born), and pregnancy status variable. Some of the groupings were made due to limited sample sizes for the two-year dataset. (3) Interview and Examination Sample Weight Variables: Sample weights are available for analyzing NHANES 1999-2000 data. For a complete listing of survey contents for all years of the NHANES see the document -- Survey Content -- NHANES 1999-2010.
Data from: Using multiple imputation to estimate missing data in...
data.niaid.nih.gov
datadryad.org
+1more
zip
Updated Nov 25, 2015
E. Hance Ellington; Guillaume Bastille-Rousseau; Cayla Austin; Kristen N. Landolt; Bruce A. Pond; Erin E. Rees; Nicholas Robar; Dennis L. Murray (2015). Using multiple imputation to estimate missing data in meta-regression [Dataset]. http://doi.org/10.5061/dryad.m2v4m
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.m2v4m
Dataset updated
Nov 25, 2015
Dataset provided by
University of Prince Edward Island
Trent University
Authors
E. Hance Ellington; Guillaume Bastille-Rousseau; Cayla Austin; Kristen N. Landolt; Bruce A. Pond; Erin E. Rees; Nicholas Robar; Dennis L. Murray
License
https://spdx.org/licenses/CC0-1.0.html
Description
1. There is a growing need for scientific synthesis in ecology and evolution. In many cases, meta-analytic techniques can be used to complement such synthesis. However, missing data is a serious problem for any synthetic efforts and can compromise the integrity of meta-analyses in these and other disciplines. Currently, the prevalence of missing data in meta-analytic datasets in ecology and the efficacy of different remedies for this problem have not been adequately quantified.
2. We generated meta-analytic datasets based on literature reviews of experimental and observational data and found that missing data were prevalent in meta-analytic ecological datasets. We then tested the performance of complete case removal (a widely used method when data are missing) and multiple imputation (an alternative method for data recovery) and assessed model bias, precision, and multi-model rankings under a variety of simulated conditions using published meta-regression datasets.
3. We found that complete case removal led to biased and imprecise coefficient estimates and yielded poorly specified models. In contrast, multiple imputation provided unbiased parameter estimates with only a small loss in precision. The performance of multiple imputation, however, was dependent on the type of data missing. It performed best when missing values were weighting variables, but performance was mixed when missing values were predictor variables. Multiple imputation performed poorly when imputing raw data which was then used to calculate effect size and the weighting variable.
4. We conclude that complete case removal should not be used in meta-regression, and that multiple imputation has the potential to be an indispensable tool for meta-regression in ecology and evolution. However, we recommend that users assess the performance of multiple imputation by simulating missing data on a subset of their data before implementing it to recover actual missing data.
WCD_CSV
springernature.figshare.com
zip
Updated May 7, 2024
Francesco Lamperti; Giorgio Fagiolo; Marco Gortan; Lorenzo Testa (2024). WCD_CSV [Dataset]. http://doi.org/10.6084/m9.figshare.25464133.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.25464133.v1
Dataset updated
May 7, 2024
Dataset provided by
figshare
Authors
Francesco Lamperti; Giorgio Fagiolo; Marco Gortan; Lorenzo Testa
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
This zip file contains a folder with two layers, where the first corresponds to a choice of geographical resolutions, and the second discriminates among the climate variables. Each dataset, saved in csv, is organized in wide format, where the first column refers to the month (or the day), and the remaining columns, which are identified by the GADM code of the geographical units, contain the values of the weighted climate variable.
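The wide layout described above can be reshaped into one row per period and geographical unit. The following is a minimal sketch using only Python's standard library; the column name `month`, the two GADM-style codes, and the values are hypothetical stand-ins, not taken from the actual files.

```python
import csv
import io

# Hypothetical excerpt: the first column is the month, the remaining
# columns are named after GADM codes of the geographical units.
WIDE_CSV = """month,ITA.1_1,ITA.2_1
2020-01,3.5,4.1
2020-02,5.0,6.2
"""

def wide_to_long(text):
    """Turn a wide table into a list of (period, unit, value) records."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    time_name, unit_codes = header[0], header[1:]
    long_rows = []
    for row in reader:
        # Pair each unit code with the value in the matching column.
        for code, value in zip(unit_codes, row[1:]):
            long_rows.append(
                {time_name: row[0], "gadm_code": code, "value": float(value)}
            )
    return long_rows

long_rows = wide_to_long(WIDE_CSV)
```

The long form is usually easier to join against other tables keyed by GADM code, at the cost of repeating the period on every row.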
Data from: Area- and Depth-Weighted Averages of Selected SSURGO Variables...
catalog.data.gov
data.usgs.gov
+1more
Updated Sep 18, 2024
U.S. Geological Survey (2024). Area- and Depth-Weighted Averages of Selected SSURGO Variables for the Conterminous United States and District of Columbia [Dataset]. https://catalog.data.gov/dataset/area-and-depth-weighted-averages-of-selected-ssurgo-variables-for-the-conterminous-united-
Dataset updated
Sep 18, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Contiguous United States, Washington, United States
Description
This digital data release consists of seven national data files of area- and depth-weighted averages of select soil attributes for every available county in the conterminous United States and the District of Columbia as of March 2014. The files are derived from Natural Resources Conservation Service’s (NRCS) Soil Survey Geographic database (SSURGO). The data files can be linked to the raster datasets of soil mapping unit identifiers (MUKEY) available through the NRCS’s Gridded Soil Survey Geographic (gSSURGO) database (http://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_053628). The associated files, named DRAINAGECLASS, HYDRATING, HYDGRP, HYDRICCONDITION, LAYER, TEXT, and WTDEP are area- and depth-weighted average values for selected soil characteristics from the SSURGO database for the conterminous United States and the District of Columbia. The SSURGO tables were acquired from the NRCS on March 5, 2014. The soil characteristics in the DRAINAGE table are drainage class (DRNCLASS), which identifies the natural drainage conditions of the soil and refers to the frequency and duration of wet periods. The soil characteristics in the HYDRATING table are hydric rating (HYDRATE), a yes/no field that indicates whether or not a map unit component is classified as a "hydric soil". The soil characteristics in the HYDGRP table are the percentages for each hydrologic group per MUKEY. The soil characteristics in the HYDRICCONDITION table are hydric condition (HYDCON), which describes the natural condition of the soil component.
The soil characteristics in the LAYER table are available water capacity (AVG_AWC), bulk density (AVG_BD), saturated hydraulic conductivity (AVG_KSAT), vertical saturated hydraulic conductivity (AVG_KV), soil erodibility factor (AVG_KFACT), porosity (AVG_POR), field capacity (AVG_FC), the soil fraction passing a number 4 sieve (AVG_NO4), the soil fraction passing a number 10 sieve (AVG_NO10), the soil fraction passing a number 200 sieve (AVG_NO200), and organic matter (AVG_OM). The soil characteristics in the TEXT table are percent sand, silt, and clay (AVG_SAND, AVG_SILT, and AVG_CLAY). The soil characteristics in the WTDEP table are the annual minimum water table depth (WTDEP_MIN), available water storage in the 0-25 cm soil horizon (AWS025), the minimum water table depth for the months April, May and June (WTDEPAMJ), the available water storage in the first 25 centimeters of the soil horizon (AWS25), the dominant drainage class (DRCLSD), the wettest drainage class (DRCLSWET), and the hydric classification (HYDCLASS), which is an indication of the proportion of the map unit, expressed as a class, that is "hydric", based on the hydric classification of a given MUKEY. (See Entity_Description for more detail). The tables were created with a set of arc macro language (aml) and awk scripts (awk was created at Bell Labs in the 1970s and its name is derived from the first letters of the last names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan). Send an email to mewieczo@usgs.gov to obtain copies of the computer code (See Process_Description.) The methods used are outlined in NRCS's "SSURGO Data Packaging and Use" (NRCS, 2011). The tables can be related or joined to the gSSURGO rasters of MUKEYs by the item 'MUKEY.' Joining or relating the tables to a MUKEY grid allows the creation of grids of area- and depth-weighted soil characteristics. A 90-meter raster of MUKEYs is provided which can be used to produce rasters of soil attributes.
More detailed resolution rasters are available through NRCS via the link above.
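To make the phrase "area- and depth-weighted average" concrete, here is a small worked sketch in Python. It is not the USGS aml/awk code. It assumes that depth weighting uses horizon thickness and that area weighting uses each component's percentage of the map unit, and the function names and numbers are invented for illustration.

```python
def depth_weighted_average(horizons):
    """horizons: list of (top_cm, bottom_cm, value); weight = thickness."""
    total_thickness = sum(bottom - top for top, bottom, _ in horizons)
    weighted_sum = sum((bottom - top) * value for top, bottom, value in horizons)
    return weighted_sum / total_thickness

def area_weighted_average(components):
    """components: list of (percent_of_map_unit, value); weight = percent."""
    total_percent = sum(percent for percent, _ in components)
    weighted_sum = sum(percent * value for percent, value in components)
    return weighted_sum / total_percent

# Step 1: collapse each component's horizons into one depth-weighted value.
component_a = depth_weighted_average([(0, 20, 10.0), (20, 100, 5.0)])
component_b = depth_weighted_average([(0, 50, 2.0), (50, 100, 4.0)])

# Step 2: combine the components of one map unit (one MUKEY) by area share.
map_unit_value = area_weighted_average([(75, component_a), (25, component_b)])
```

With these made-up inputs the two components average 6.0 and 3.0 over depth, and the map unit value is 5.25, one number per MUKEY that could then be joined to a MUKEY raster.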
Large Employer CPS Matched Weights
redivis.com
Updated Apr 7, 2025
Stanford Center for Population Health Sciences (2025). Large Employer CPS Matched Weights [Dataset]. https://redivis.com/datasets/6f7e-cxanam2b8/tables
Dataset updated
Apr 7, 2025
Dataset authored and provided by
Stanford Center for Population Health Sciences
Time period covered
2007 - 2022
Description
The table Large Employer CPS Matched Weights is part of the dataset Weighting Techniques for Large Private Claims Data, available at https://redivis.com/datasets/6f7e-cxanam2b8. It contains 572912944 rows across 8 variables.
CPS weights - full ESI
redivis.com
Updated Apr 13, 2025
Sarah W. Hirsch (2025). CPS weights - full ESI [Dataset]. https://redivis.com/datasets/7vaj-f29v5z61r/usage
Dataset updated
Apr 13, 2025
Authors
Sarah W. Hirsch
Description
The table CPS weights - full ESI is part of the dataset MarketScan weights, available at https://redivis.com/datasets/7vaj-f29v5z61r. It contains 14912 rows across 8 variables.
