Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The National Health and Nutrition Examination Survey (NHANES) provides data with considerable potential for studying the health and environmental exposure of the non-institutionalized US population. However, because NHANES data contain multiple inconsistencies, they must be processed before new insights can be derived through large-scale analyses. We therefore developed a set of curated and unified datasets by merging 614 separate files and harmonizing unrestricted data across NHANES III (1988-1994) and Continuous NHANES (1999-2018), totaling 135,310 participants and 5,078 variables. The variables convey 1. demographics (281 variables), 2. dietary consumption (324 variables), 3. physiological functions (1,040 variables), 4. occupation (61 variables), 5. questionnaires (1,444 variables, e.g., physical activity, medical conditions, diabetes, reproductive health, blood pressure and cholesterol, early childhood), 6. medications (29 variables), 7. mortality information linked from the National Death Index (15 variables), 8. survey weights (857 variables), 9. environmental exposure biomarker measurements (598 variables), and 10. chemical comments indicating which measurements are below or above the lower limit of detection (505 variables).
csv Data Record: The curated NHANES datasets and the data dictionaries include 23 .csv files and 1 Excel file.
- The curated NHANES datasets comprise 20 .csv files, two for each module: an uncleaned version and a cleaned version. The modules are labeled as follows: 1) mortality, 2) dietary, 3) demographics, 4) response, 5) medications, 6) questionnaire, 7) chemicals, 8) occupation, 9) weights, and 10) comments.
- "dictionary_nhanes.csv" is a dictionary that lists the variable name, description, module, category, units, CAS Number, comment use, chemical family, chemical family shortened, number of measurements, and cycles available for all 5,078 variables in NHANES.
- "dictionary_harmonized_categories.csv" contains the harmonized categories for the categorical variables.
- "dictionary_drug_codes.csv" contains the descriptors for the drug codes.
- "nhanes_inconsistencies_documentation.xlsx" is an Excel file that contains the cleaning documentation, which records the inconsistencies for all affected variables and was used to help curate each of the NHANES modules.
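As a minimal sketch of how a variable dictionary such as "dictionary_nhanes.csv" might be queried, the Python snippet below counts variables per module and lists the variables of one module. It works on a small in-memory stand-in, not the real file: the column names (`variable_codename`, `module`) and every row are hypothetical placeholders and should be replaced with the headers actually found in the dictionary.

```python
import csv
import io
from collections import Counter

# Toy stand-in for a variable dictionary. The column names and rows
# below are hypothetical; check the real file's header before reuse.
TOY_DICTIONARY = """variable_codename,module
VAR_AGE,demographics
VAR_SEX,demographics
VAR_LEAD,chemicals
VAR_LEAD_COMMENT,comments
"""


def variables_by_module(csv_text, module_column="module"):
    """Count how many dictionary rows belong to each module."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[module_column] for row in reader)


def variables_in_module(csv_text, module,
                        name_column="variable_codename",
                        module_column="module"):
    """Return the variable names listed under one module, in file order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[name_column] for row in reader if row[module_column] == module]


counts = variables_by_module(TOY_DICTIONARY)
print(dict(counts))
print(variables_in_module(TOY_DICTIONARY, "demographics"))
```

To use it on a downloaded file, the text passed in could come from `open(path, newline="").read()`; the functions themselves do not depend on where the text originates.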
R Data Record: For researchers who want to conduct their analysis in the R programming language, only the cleaned NHANES modules and the data dictionaries can be downloaded, as a .zip file that includes an .RData file and an .R file.
- "w - nhanes_1988_2018.RData" contains all the aforementioned datasets as R data objects. We make available all R scripts on customized functions that were written to curate the data.
- "m - nhanes_1988_2018.R" shows how we used the customized functions (i.e., our pipeline) to curate the original NHANES data.
Example starter code: The starter code to help users conduct exposome analysis consists of four R Markdown files (.Rmd). We recommend going through the tutorials in order.
- "example_0 - merge_datasets_together.Rmd" demonstrates how to merge the curated NHANES datasets together.
- "example_1 - account_for_nhanes_design.Rmd" demonstrates how to fit a linear regression model, a survey-weighted regression model, a Cox proportional hazards model, and a survey-weighted Cox proportional hazards model.
- "example_2 - calculate_summary_statistics.Rmd" demonstrates how to calculate summary statistics for one variable and for multiple variables, with and without accounting for the NHANES sampling design.
- "example_3 - run_multiple_regressions.Rmd" demonstrates how to run multiple regression models with and without adjusting for the sampling design.
https://www.icpsr.umich.edu/web/ICPSR/studies/25501/terms
The National Health and Nutrition Examination Surveys (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys were conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999, the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements designed to meet current and emerging concerns. The surveys examine a nationally representative sample of approximately 5,000 persons each year. These persons are located in counties across the United States, 15 of which are visited each year. The 1999-2000 NHANES contains data for 9,965 individuals of all ages (with a mobile examination center (MEC) examined sample size of 9,282). Many questions that were asked in NHANES II, 1976-1980, Hispanic HANES 1982-1984, and NHANES III, 1988-1994, were combined with new questions in the NHANES 1999-2000. The 1999-2000 NHANES collected data on the prevalence of selected chronic conditions and diseases in the population and estimates for previously undiagnosed conditions, as well as those known to and reported by respondents. Risk factors, those aspects of a person's lifestyle, constitution, heredity, or environment that may increase the chances of developing a certain disease or condition, were examined. Data on smoking, alcohol consumption, sexual practices, drug use, physical fitness and activity, weight, and dietary intake were collected. Information on certain aspects of reproductive health, such as use of oral contraceptives and breastfeeding practices, was also collected. The interview includes demographic, socioeconomic, dietary, and health-related questions. The examination component consists of medical, dental, and physiological measurements, as well as laboratory tests.
Demographic data file variables are grouped into three broad categories:
(1) Status Variables: Provide core information on the survey participant. Examples of the core variables include interview status, examination status, and sequence number. (Sequence number is a unique ID assigned to each sample person and is required to match the information on this demographic file to the rest of the NHANES 1999-2000 data.)
(2) Recoded Demographic Variables: The variables include age (age in months for persons through age 19 years, 11 months; age in years for 1-84 year olds, and a top-coded age group of 85+ years), gender, a race/ethnicity variable, an education variable (high school, and more than high school education), country of birth (United States, Mexico, or other foreign born), and a pregnancy status variable. Some of the groupings were made due to limited sample sizes for the two-year dataset.
(3) Interview and Examination Sample Weight Variables: Sample weights are available for analyzing NHANES 1999-2000 data.
For a complete listing of survey contents for all years of the NHANES, see the document "Survey Content -- NHANES 1999-2010".
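To illustrate why the sample weight variables matter, here is a minimal Python sketch of a weighted mean, the basic point estimate that sample weights make possible. The values and weights are invented toy numbers, not NHANES data, and the function name is illustrative. The sketch covers the point estimate only; standard errors for a complex survey also depend on the design (strata and sampling units), which is not shown here.

```python
def weighted_mean(values, weights):
    """Weighted mean of values, skipping records whose value is missing (None)."""
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no non-missing values with positive total weight")
    return sum(v * w for v, w in pairs) / total_weight


# Toy example: the third respondent represents twice as many people
# as each of the other two, so the estimate shifts toward 30.
toy_values = [10.0, 20.0, 30.0]
toy_weights = [1.0, 1.0, 2.0]

unweighted = sum(toy_values) / len(toy_values)
weighted = weighted_mean(toy_values, toy_weights)
print(unweighted, weighted)
```

The gap between the two numbers shows how ignoring the weights can bias an estimate when some groups are sampled at different rates than others.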
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NHANES III is a representative sample of the United States non-institutionalized civilian population from 1988 to 1994. It is a periodic survey conducted by the United States National Center for Health Statistics and designed to provide an estimate of the health of the nation. It was conducted in two phases: Phase 1 (1988-1991) and Phase 2 (1991-1994). Each phase is a nationally representative sample, as is the combined six-year sample.
The National Health and Nutrition Examination Survey (NHANES) is designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews with standardized physical examinations and laboratory tests.
NHANES was conducted on a periodic basis from 1971 to 1994, including NHANES I (1971-1975), NHANES II (1976-1980), NHANES III (1988-1994), and a Hispanic Health and Nutrition Examination Survey (HHANES, 1982-1984). In 1999, NHANES became continuous and has been collecting data annually ever since.
All of the NHANES programs utilized a stratified, multistage probability cluster design to provide a nationally representative sample of the U.S. civilian, noninstitutionalized population. The NHANES interview includes demographic, socioeconomic, dietary, and health-related questions. The examination component conducted in a mobile examination center consists of medical, dental, and physiological measurements, as well as the collection of biospecimens, such as blood and urine for laboratory testing.
This set of restricted data contains indirect identifying and/or sensitive information collected in NHANES prior to 1999. Please refer to the links below for additional data available from NHANES:
https://www.icpsr.umich.edu/web/ICPSR/studies/4010/terms
The third National Health and Nutrition Examination Survey (NHANES III, ICPSR 2231), conducted in 1988-1994, was designed to obtain nationally representative information on the health and nutritional status of the population of the United States through interviews and direct physical examinations. This release, Series II, No. 3A, contains data obtained from a second exam of selected survey participants who had had a primary exam. This release does not replace any previous NHANES III data releases. The second exam sample consists of seven separate data files. The Combination Foods file contains information on food weight, nutrient data, and descriptions about combination foods. The Total Nutrient Intake file records respondent intake of foods and beverages in a 24-hour time period. The Examination file consists of a comprehensive physical/dental examination. The Individual Foods file lists the food records and component food records for single and multi-component combination foods. The Laboratory file contains data collected through whole blood, serum, plasma, and urine specimens collected from respondents. The Second Laboratory file contains blood and urine assessments by specimen type and age group. The Variable Ingredient file reports data pertaining to the variable ingredients for many recipe foods in the Individual Foods file.
The National Health and Nutrition Examination Surveys (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys have been conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999, the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements designed to meet current and emerging concerns. The sample for the survey is selected to represent the U.S. population of all ages. Many of the NHANES 2001-2002 questions also were asked in NHANES II 1976-1980, Hispanic HANES 1982-1984, and NHANES III 1988-1994. New questions were added to the survey based on recommendations from survey collaborators, NCHS staff, and other interagency work groups.
In the 2001-2002 wave, the NHANES includes more than 100 datasets. Most have been combined into three datasets for convenience. Each starts with the demographic dataset and includes datasets of a specific type.
1. National Health and Nutrition Examination Survey (NHANES), Demographic & Examination Data, 2001-2002 (The base of the Demographic dataset + all data from medical examinations).
2. National Health and Nutrition Examination Survey (NHANES), Demographic & Laboratory Data, 2001-2002 (The base of the Demographic dataset + all data from medical laboratories).
3. National Health and Nutrition Examination Survey (NHANES), Demographic & Questionnaire Data, 2001-2002 (The base of the Demographic dataset + all data from questionnaires)
Not all files from the 2001-2002 wave are included. This is for two reasons, both of which relate to the merging variable (SEQN). For a subset of the files, SEQN is not a unique identifier for cases (i.e., some respondents have multiple cases) or SEQN is not in the file at all. The following datasets from this wave of the NHANES are not included in these three files and can be found individually on the NHANES website at the CDC (https://www.cdc.gov/nchs/nhanes/index.html):
Examination: Dietary Interview (Individual Foods File)
Examination: Dual Energy X-ray Absorptiometry (DXX)
Questionnaire: Analgesics Pain Relievers
Questionnaire: Dietary Supplement Use -- Ingredient Information
Questionnaire: Dietary Supplement Use -- Supplement Blend
Questionnaire: Dietary Supplement Use -- Supplement Information
Questionnaire: Drug Information
Questionnaire: Dietary Supplement Use -- Participants Use of Supplement
Questionnaire: Physical Activity Individual Activity File
Questionnaire: Prescription Medications
Variable SEQN is included for merging files within the waves. All data files should be sorted by SEQN.
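The merging rules described above (SEQN as the key, files sorted by SEQN, and files excluded when SEQN is not unique) can be sketched in Python as follows. This is an illustrative sketch on invented toy records, not the procedure used to build the combined files; the function names and the non-SEQN column names (`age`, `lab_value`, `drug`) are hypothetical.

```python
from collections import Counter


def duplicated_ids(rows, key="SEQN"):
    """Return the sorted identifiers that appear on more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def merge_on_seqn(base, other, key="SEQN"):
    """Left-join `other` onto `base` by key; the result is sorted by key.

    Refuses to merge when the key is not unique in `other`, which is the
    situation that kept some files out of the combined datasets.
    """
    duplicates = duplicated_ids(other, key)
    if duplicates:
        raise ValueError(f"{key} is not unique in the file being merged: {duplicates}")
    lookup = {row[key]: row for row in other}
    merged = []
    for row in sorted(base, key=lambda r: r[key]):
        combined = dict(row)
        # Rows with no match in `other` keep only their base columns.
        combined.update(lookup.get(row[key], {}))
        merged.append(combined)
    return merged


# Toy records (invented values, hypothetical column names).
demographics = [{"SEQN": 3, "age": 61}, {"SEQN": 1, "age": 34}, {"SEQN": 2, "age": 8}]
laboratory = [{"SEQN": 2, "lab_value": 4.1}, {"SEQN": 1, "lab_value": 5.3}]
prescriptions = [{"SEQN": 1, "drug": "A"}, {"SEQN": 1, "drug": "B"}, {"SEQN": 2, "drug": "C"}]

merged = merge_on_seqn(demographics, laboratory)
print(merged)
print(duplicated_ids(prescriptions))
```

A file like the toy `prescriptions` list, with several rows per respondent, would need to be reshaped or summarized to one row per SEQN before it could be joined this way.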
Additional details of the design and content of each survey are available at the NHANES website (https://www.cdc.gov/nchs/nhanes/index.html).
DNA samples were collected in the Third National Health and Nutrition Examination Survey (NHANES III; 1988-1994) and in subsequent NHANES cycles (1999-2002, 2007-2008, 2009-2010, and 2011-2012). The program is a nationally representative collection of stored DNA samples and genetic data and will serve to add to the extensive amount of health, nutritional, and environmental information collected from NHANES. Resulting genetic variants are deposited into the NHANES Genetic Data Repository. These datasets are categorized as restricted data since they contain identifiable information.
For more information on the NHANES Genetic Data please visit: NHANES DNA Specimens and Genetic Data Program at: https://www.cdc.gov/nchs/nhanes/biospecimens/dnaspecimens.htm. For more information on NHANES, visit the NHANES - National Health and Nutrition Examination Survey Homepage at: https://www.cdc.gov/nchs/nhanes/index.htm.
The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. An ongoing annual survey combines interviews and physical examinations. The NHANES interview includes demographic, socioeconomic, dietary, and health-related questions. The examination component consists of medical, dental, and physiological measurements, as well as laboratory tests administered by highly trained medical personnel.
Ancillary studies include the NHANES National Youth Fitness Survey (NNYFS) and NHANES Epidemiologic Followup Study (NHEFS). NNYFS was conducted in 2012 to evaluate the physical activity and fitness of children aged 3 to 15 years old through interviews and fitness tests. NHEFS is a longitudinal survey of adults aged 25 to 74 years old in the NHANES I (1971-1975) cohort who completed a medical examination. Data was collected in follow-up rounds in 1982-1984, 1986, 1987, and 1992 through subject and proxy interviews and vital record search. Available data files include vital and tracing status, demographic information, interview data on health status, health care facility inpatient data, and mortality data.
The National Health and Nutrition Examination Surveys (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys have been conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999, the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements which were designed to meet current and emerging concerns. The sample for the survey is selected to represent the U.S. population of all ages. Many of the NHANES 2007-2008 questions also were asked in NHANES II 1976-1980, Hispanic HANES 1982-1984, NHANES III 1988-1994, and NHANES 1999-2006. New questions were added to the survey based on recommendations from survey collaborators, NCHS staff, and other interagency work groups. Estimates for previously undiagnosed conditions, as well as those known to and reported by survey respondents, are produced through the survey.
In the 1999-2000 wave, the NHANES includes more than 100 datasets. Most have been combined into three datasets for convenience. Each starts with the Demographic dataset and includes datasets of a specific type.
1. National Health and Nutrition Examination Survey (NHANES), Demographic & Examination Data, 1999-2000 (The base of the Demographic dataset + all data from medical examinations).
2. National Health and Nutrition Examination Survey (NHANES), Demographic & Laboratory Data, 1999-2000 (The base of the Demographic dataset + all data from medical laboratories).
3. National Health and Nutrition Examination Survey (NHANES), Demographic & Questionnaire Data, 1999-2000 (The base of the Demographic dataset + all data from questionnaires)
Not all files from the 1999-2000 wave are included. This is for two reasons, both of which relate to the merging variable (SEQN). For a subset of the files, SEQN is not a unique identifier for cases (i.e., some respondents have multiple cases) or SEQN is not in the file at all. The following datasets from this wave of the NHANES are not included in these three files and can be found individually on the NHANES website at the CDC (https://www.cdc.gov/nchs/nhanes/index.html):
Examination: Dietary Interview (Individual Foods File)
Examination: Dual Energy X-ray Absorptiometry (DXX)
Questionnaire: Analgesics Pain Relievers
Questionnaire: Dietary Supplement Use -- Ingredient Information
Questionnaire: Dietary Supplement Use -- Supplement Blend
Questionnaire: Dietary Supplement Use -- Supplement Information
Questionnaire: Drug Information
Questionnaire: Dietary Supplement Use -- Participants Use of Supplement
Questionnaire: Physical Activity Individual Activity File
Questionnaire: Prescription Medications
Variable SEQN is included for merging files within the waves. All data files should be sorted by SEQN.
Additional details of the design and content of each survey are available at the NHANES website (https://www.cdc.gov/nchs/nhanes/index.html).
The National Health and Nutrition Examination Surveys (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys have been conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999, the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements which were designed to meet current and emerging concerns. The sample for the survey is selected to represent the U.S. population of all ages. Many of the NHANES 2007-2008 questions also were asked in NHANES II 1976-1980, Hispanic HANES 1982-1984, NHANES III 1988-1994, and NHANES 1999-2006. New questions were added to the survey based on recommendations from survey collaborators, NCHS staff, and other interagency work groups. Estimates for previously undiagnosed conditions, as well as those known to and reported by survey respondents, are produced through the survey.
In the 2005-2006 wave, the NHANES includes over 100 datasets. Most have been combined into three datasets for convenience. Each starts with the Demographic dataset and includes datasets of a specific type.
1. National Health and Nutrition Examination Survey (NHANES), Demographic & Examination Data, 2005-2006 (The base of the Demographic dataset + all data from medical examinations).
2. National Health and Nutrition Examination Survey (NHANES), Demographic & Laboratory Data, 2005-2006 (The base of the Demographic dataset + all data from medical laboratories).
3. National Health and Nutrition Examination Survey (NHANES), Demographic & Questionnaire Data, 2005-2006 (The base of the Demographic dataset + all data from questionnaires)
Not all files from the 2005-2006 wave are included. This is for two reasons, both of which relate to the merging variable (SEQN). For a subset of the files, SEQN is not a unique identifier for cases (i.e., some respondents have multiple cases) or SEQN is not in the file at all. The following datasets from this wave of the NHANES are not included in these three files and can be found individually on the NHANES website at the CDC (https://www.cdc.gov/nchs/nhanes/index.html):
Examination: Dietary Interview (Individual Foods -- First Day)
Examination: Dietary Interview (Individual Foods -- Second Day)
Examination: Food Frequency Questionnaire -- DietCalc Output
Examination: Physical Activity Monitor
Questionnaire: Dietary Supplement Use -- Ingredient Information
Questionnaire: Dietary Supplement Use -- Supplement Blend
Questionnaire: Dietary Supplement Use -- Supplement Information
Questionnaire: Dietary Supplement Use -- Drug Information
Questionnaire: Dietary Supplement Use -- Participants Use of Supplement
Questionnaire: Physical Activity Individual Activity File
Questionnaire: Prescription Medications
Variable SEQN is included for merging files within the waves. All data files should be sorted by SEQN.
Additional details of the design and content of each survey are available at the NHANES website (https://www.cdc.gov/nchs/nhanes/index.html).
Characteristics of the training dataset using participants from NHANES III.
The National Health and Nutrition Examination Surveys (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The NHANES combines personal interviews and physical examinations, which focus on different population groups or health topics. These surveys have been conducted by the National Center for Health Statistics (NCHS) on a periodic basis from 1971 to 1994. In 1999 the NHANES became a continuous program with a changing focus on a variety of health and nutrition measurements which were designed to meet current and emerging concerns. The sample for the survey is selected to represent the U.S. population of all ages. Many of the NHANES 2007-2008 questions were also asked in NHANES II 1976-1980, Hispanic HANES 1982-1984, NHANES III 1988-1994, and NHANES 1999-2006. New questions were added to the survey based on recommendations from survey collaborators, NCHS staff, and other interagency work groups. Estimates for previously undiagnosed conditions, as well as those known to and reported by survey respondents, are produced through the survey.
In the 2007-2008 wave, the NHANES includes 69 datasets. These have been combined into three datasets for convenience. Each starts with the Demographic dataset and includes datasets of a specific type.
1. National Health and Nutrition Examination Survey (NHANES), Demographic & Examination Data, 2007-2008 (The base of the Demographic dataset + all data from medical examinations).
2. National Health and Nutrition Examination Survey (NHANES), Demographic & Laboratory Data, 2007-2008 (The base of the Demographic dataset + all data from medical laboratories).
3. National Health and Nutrition Examination Survey (NHANES), Demographic & Questionnaire Data, 2007-2008 (The base of the Demographic dataset + all data from questionnaires)
Variable SEQN is included for merging files within the waves. All data files should be sorted by SEQN.
Additional details of the design and content of each survey are available at the NHANES website (https://www.cdc.gov/nchs/nhanes/index.html).
https://creativecommons.org/publicdomain/zero/1.0/
The National Health and Nutrition Examination Survey (NHANES) is a public dataset obtained from the Centers for Disease Control and Prevention website, https://wwwn.cdc.gov/nchs/nhanes/. This dataset covers the period 2017-2020.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data for my NHANES III project. For more details, please visit (and consider adding a star to) my github repo: https://github.com/marskar/nhanes
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Study characteristics are shown for participants greater than 18 years of age who had non-missing Lp(a) levels. Sample sizes shown are the DNA samples available in Genetic NHANES III for each subpopulation. Values are shown as unweighted mean (SD). BMI, body mass index; Lp(a), lipoprotein(a); HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol; TG, triglycerides. P-values are based on one-way unweighted ANOVA.
Baseline characteristics of NHANES III participants by self-reported church attendance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
a) All of the analyses were adjusted with the NHANES III sample weights. Analysis included participants aged 40 years and above who had fasting glucose greater than 100 mg/dl without any oral hypoglycemic agent or insulin.
b) Calculated as insulin (µU/mL) × glucose (mmol/L) / 22.5.
c) Approximately 50% of IGF1 and IGFBP3 measurements are missing. Binary (high vs. low) values were determined separately by gender from the weighted distribution of available data for both genders (n = 671 and 615 for men and women, respectively).
d) Defined as serum cotinine level >14 ng/mL.
Abbreviations: IGF1, insulin-like growth factor 1; IGFBP3, insulin-like growth factor binding protein-3; HOMA, homeostasis model assessment of insulin resistance; SE, standard error.
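The formula in the footnote, insulin (µU/mL) × glucose (mmol/L) / 22.5, can be written out as a short Python sketch. Because the inclusion criterion in the same footnote states glucose in mg/dl, a unit conversion is included. The conversion factor is an addition not stated in the source: it assumes roughly 18.016 mg/dL per mmol/L, based on the molar mass of glucose (about 180.16 g/mol). Function names and input values are illustrative only.

```python
# Assumption (not from the source): 1 mmol/L of glucose is about 18.016 mg/dL,
# derived from the molar mass of glucose (about 180.16 g/mol).
GLUCOSE_MG_DL_PER_MMOL_L = 18.016


def glucose_mg_dl_to_mmol_l(glucose_mg_dl):
    """Convert a glucose concentration from mg/dL to mmol/L."""
    return glucose_mg_dl / GLUCOSE_MG_DL_PER_MMOL_L


def homa_index(insulin_uU_ml, glucose_mmol_l):
    """Footnote formula: insulin (uU/mL) x glucose (mmol/L) / 22.5."""
    return insulin_uU_ml * glucose_mmol_l / 22.5


# Illustrative inputs only, not study data.
example_glucose = glucose_mg_dl_to_mmol_l(100.0)
example_index = homa_index(10.0, example_glucose)
print(round(example_glucose, 2), round(example_index, 2))
```

Keeping the conversion in its own function makes it harder to pass a mg/dL value into the formula by mistake, which would inflate the result by a factor of about 18.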
The National Health and Nutrition Examination Survey (NHANES) is an on-going survey conducted by the Centers for Disease Control and Prevention (CDC) that assesses a series of health and nutritional outcomes through questionnaires, interviews, and physical examinations. This is a stratified, multistage, probability sample of the civilian, noninstitutionalized U.S. population. Beginning in 1999, the survey became a continuous program examining roughly 5,000 different individuals each year across a variety of changing health and nutritional measurements to meet emerging public health needs. Data are released in two-year cycles. Between cycles, the measurements are modified, removed, or added over the lifecycle of NHANES. Within each 2-year cycle, there are different target population groups for specific topics within each component.
The 1999-2000 and 2001-2002 datasets in this file include variables related to:
Information regarding these variables and instruments used can be found here and here
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Study population excludes individuals who reported a cough, cold, or other acute illness in the past few days. *Asthma was defined by self-report of a doctor's diagnosis, and COPD was defined as self-reported physician-diagnosed emphysema and/or chronic bronchitis, or by GOLD spirometry criteria (FEV1/FVC