License: https://creativecommons.org/publicdomain/zero/1.0/
The Behavioral Risk Factor Surveillance System (BRFSS) is the nation's premier system of health-related telephone surveys that collect uniform, state-specific data about U.S. residents regarding their health-related risk behaviors, chronic health conditions, and use of preventive services.
The objective of the BRFSS is to gather consistent, state-level data on preventive health practices and risk behaviors associated with chronic diseases, injuries, and preventable infectious diseases among adults (aged 18 and older).
Established in 1984 with 15 states, the BRFSS now collects data in all 50 states, the District of Columbia, and three U.S. territories. The system completes more than 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world.
The 2024 BRFSS dataset continues to use the raking weighting methodology (introduced in 2011) and includes both landline and cellphone-only respondents, ensuring more accurate representation of the U.S. adult population.
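Because each record carries a survey weight, population-level estimates are normally computed with the weights rather than from raw respondent counts. The sketch below shows a weighted proportion under two assumptions that this description does not state: that the final weight is stored in a column named `_LLCPWT` (check the codebook), and that the outcome has already been recoded into a hypothetical 0/1 column called `has_condition`. It does not produce design-based standard errors.

```python
import pandas as pd


def weighted_prevalence(df: pd.DataFrame, indicator_col: str,
                        weight_col: str = "_LLCPWT") -> float:
    """Weighted share of respondents whose 0/1 indicator equals 1.

    `weight_col` defaults to `_LLCPWT`, which is an assumption about
    the column name; confirm it against the codebook.
    """
    # Drop rows where either the indicator or the weight is missing.
    valid = df.dropna(subset=[indicator_col, weight_col])
    total_weight = valid[weight_col].sum()
    weighted_cases = (valid[indicator_col] * valid[weight_col]).sum()
    return float(weighted_cases / total_weight)


# Toy data only; these are not real BRFSS values.
toy = pd.DataFrame({
    "has_condition": [1, 0, 1, None],
    "_LLCPWT": [100.0, 300.0, 100.0, 50.0],
})
estimate = weighted_prevalence(toy, "has_condition")
print(round(estimate, 3))
```

The unweighted share in the toy data would be 2 of 3 valid rows, while the weighted estimate is lower because the respondent without the condition represents more people.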
The aggregate dataset combines landline and cell phone data collected in 2024 from 49 states, the District of Columbia, Guam, Puerto Rico, and the U.S. Virgin Islands.
This original dataset contains responses from 457,670 individuals and has 301 features. These features are either questions directly asked of participants, or calculated variables based on individual participant responses.
⚠️ Note: Tennessee was unable to collect enough responses to meet inclusion requirements for 2024 and is not included in this public dataset.
Certain survey questions and responses have been modified or omitted to comply with federal data policies in effect during the 2024 collection period. As a result, some variables may contain missing values or appear inconsistent due to questions that were removed or restructured.
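One way to see which variables are affected is to measure the share of missing values per column before starting an analysis. The following is a minimal sketch; the function name and the 0.5 threshold are illustrative choices, and coded answers such as "refused" or "don't know" (documented per variable in the codebook) are not counted as missing here.

```python
import pandas as pd


def missing_share(df: pd.DataFrame, threshold: float = 0.5) -> pd.Series:
    """Share of missing values per column, highest first.

    Only columns whose missing share is at least `threshold` are returned.
    """
    share = df.isna().mean().sort_values(ascending=False)
    return share[share >= threshold]


# Toy frame standing in for the survey data; column names are hypothetical.
toy = pd.DataFrame({
    "q_removed": [1.0, None, None, None],
    "q_complete": [1, 2, 3, 4],
    "q_partial": [None, None, 1.0, 2.0],
})
report = missing_share(toy, threshold=0.5)
print(report)
```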
Data are collected from a random sample of adults (one per household) via telephone interviews.
Factors assessed include:
- Tobacco use
- Health care access and coverage
- Alcohol consumption
- Physical activity and diet
- HIV/AIDS knowledge and prevention
- Chronic health conditions
- Preventive health services and screenings
The annual dataset contains 301 variables, covering both core questions and optional modules. Please refer to the official BRFSS 2024 Codebook for detailed variable definitions and coding.
This dataset contains 3 files:
1. brfss_survey_data_2024.csv # Dataset in .csv format (converted from SAS)
2. codebook_2024.HTML # CDC codebook for variable definitions
3. main_data_brfss_2024.XPT # Main dataset
⚙️ Note: The CSV file was converted from the original SAS format using pandas. Minor conversion artifacts may exist.
A complete description of each column of the CSV file can be found in the codebook.
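For readers who want to repeat the conversion from the XPT file, the sketch below uses `pandas.read_sas` with `format="xport"`. It is not necessarily the procedure used to build this CSV. The cleanup step rests on two assumptions: that text columns may arrive as byte strings, and that zeros may appear as extremely small floats (on the order of 1e-79) after conversion. The cutoff `tiny` is an illustrative choice.

```python
import pandas as pd


def clean_conversion_artifacts(df: pd.DataFrame, tiny: float = 1e-70) -> pd.DataFrame:
    """Decode byte strings and replace near-zero floats with 0.0."""
    out = df.copy()
    for col in out.columns:
        if out[col].dtype == object:
            # Byte strings become text; other values are left untouched.
            out[col] = out[col].map(
                lambda v: v.decode("utf-8", "replace") if isinstance(v, bytes) else v
            )
        elif pd.api.types.is_float_dtype(out[col]):
            # Missing values stay missing; only tiny magnitudes become 0.0.
            out[col] = out[col].mask(out[col].abs() < tiny, 0.0)
    return out


def xpt_to_csv(xpt_path: str, csv_path: str) -> pd.DataFrame:
    """Read a SAS transport file, clean it, and write it out as CSV."""
    df = pd.read_sas(xpt_path, format="xport")
    df = clean_conversion_artifacts(df)
    df.to_csv(csv_path, index=False)
    return df
```

A call such as `xpt_to_csv("main_data_brfss_2024.XPT", "brfss_survey_data_2024.csv")` would use the file names listed above; reading the full file needs enough memory to hold all rows at once.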
Data provided by the U.S. Centers for Disease Control and Prevention (CDC).
Original source and additional years of BRFSS data: CDC BRFSS Annual Data
Citation:
Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Data. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2024.
License: Public Domain (U.S. Government Work)
If you use this dataset in your analysis or publication, please cite as:
Behavioral Risk Factor Surveillance System (BRFSS) 2024. U.S. Centers for Disease Control and Prevention (CDC). Public Domain.
Prepared for Kaggle public dataset publication. All data are in the public domain as U.S. Government works.
License: https://creativecommons.org/publicdomain/zero/1.0/
The Behavioral Risk Factor Surveillance System (BRFSS) is a comprehensive health survey in the United States, initiated in 1984 by the Centers for Disease Control and Prevention (CDC). It is designed to collect data on behavioral risk factors, chronic health conditions, and the use of preventive services among adults aged 18 and older. The BRFSS is notable for being the largest continuously conducted health survey system globally, with over 400,000 adult interviews completed annually across all 50 states, the District of Columbia, Puerto Rico, Guam, and the U.S. Virgin Islands.
The primary aim of the BRFSS is to monitor modifiable risk behaviors that contribute to the leading causes of morbidity and mortality in the U.S. These behaviors include:
- Tobacco use
- Alcohol consumption
- Physical activity
- Dietary habits
- Preventive health practices (e.g., immunizations and screenings)
The survey employs a telephone-based methodology, initially using landlines and, since 2009, cellular phones as well. Each participating state can customize the survey by adding questions relevant to local health concerns while adhering to a core set of questions established by the CDC.
Data from the BRFSS are essential for public health planning and evaluation at both state and federal levels. They help identify trends in health behaviors and inform interventions aimed at improving public health outcomes. The information gathered is utilized by various stakeholders, including state health departments and local governments, to tailor health promotion activities effectively.
The BRFSS also serves as a critical resource for researchers and policymakers, providing insights into health disparities across different populations and regions. Its findings contribute to national health objectives, such as those outlined in Healthy People 2030, which aims to improve the health of all Americans.
License: https://creativecommons.org/publicdomain/zero/1.0/
The objective of the BRFSS is to collect uniform, state-specific data on preventive health practices and risk behaviors that are linked to chronic diseases, injuries, and preventable infectious diseases in the adult population. Factors assessed by the BRFSS include tobacco use, health care coverage, HIV/AIDS knowledge or prevention, physical activity, and fruit and vegetable consumption. Data are collected from a random sample of adults (one per household) through a telephone survey.
The Behavioral Risk Factor Surveillance System (BRFSS) is the nation's premier system of health-related telephone surveys that collect state data about U.S. residents regarding their health-related risk behaviors, chronic health conditions, and use of preventive services. Established in 1984 with 15 states, BRFSS now collects data in all 50 states as well as the District of Columbia and three U.S. territories. BRFSS completes more than 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world.
Each year contains a few hundred columns. Please see one of the annual codebooks for complete details:
- 2019 codebook: https://www.cdc.gov/brfss/annual_data/2019/pdf/codebook19_llcp-v2-508.HTML
- 2018 codebook: https://www.cdc.gov/brfss/annual_data/2018/pdf/codebook18_llcp-v2-508.pdf
- 2017 codebook: https://www.cdc.gov/brfss/annual_data/2017/pdf/codebook17_llcp-v2-508.pdf
- 2016 codebook: https://www.cdc.gov/brfss/annual_data/2016/pdf/codebook16_llcp.pdf
These CSV files were converted from a SAS data format using RStudio; there may be some data artifacts as a result.
If you like this dataset, you might also like the data for 2011-2015: https://www.kaggle.com/cdc/behavioral-risk-factor-surveillance-system
This dataset was released by the CDC.
According to the CDC, BRFSS is:
BRFSS is the nation’s premier system of health-related telephone surveys that collect state data about U.S. residents regarding their health-related risk behaviors, chronic health conditions, and use of preventive services. BRFSS collects data in all 50 states as well as the District of Columbia and three U.S. territories. BRFSS completes more than 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world.
To learn more about the data see the official page.
A complete description of each column of the CSV file can be found in the codebook.
License: https://creativecommons.org/share-your-work/public-domain/pdm
In 1984, the Centers for Disease Control and Prevention (CDC) initiated the state-based Behavioral Risk Factor Surveillance System (BRFSS), a cross-sectional telephone survey that state health departments conduct monthly over landline telephones and cellular telephones with a standardized questionnaire and technical and methodologic assistance from CDC. BRFSS is used to collect prevalence data among adult U.S. residents regarding their risk behaviors and preventive health practices that can affect their health status. Respondent data are forwarded to CDC to be aggregated for each state, returned with standard tabulations, and published at year's end by each state. In 2011, more than 500,000 interviews were conducted in the states, the District of Columbia, and participating U.S. territories and other geographic areas.
The files in this deposit were downloaded from the CDC website by Julia Dennett, Yale University, and Toby Chaiken, J-PAL North America, and archived by Travis Donahoe, Harvard University. Additional information edited by Michael Darisse and Lars Vilhuber, Cornell University and American Economic Association.
License: http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
The Behavioral Risk Factor Surveillance System (BRFSS) is a collaborative project between all U.S. states, participating U.S. territories, and the Centers for Disease Control and Prevention (CDC). The BRFSS is a system of ongoing health-related telephone surveys designed to collect data on health-related risk behaviors, chronic health conditions, and the use of preventive services from the non-institutionalized adult population (≥ 18 years) residing in the United States. The BRFSS is administered and supported by CDC's Population Health Surveillance Branch under the Division of Population Health at the CDC's National Center for Chronic Disease Prevention and Health Promotion.
License: https://creativecommons.org/publicdomain/zero/1.0/
The Behavioral Risk Factor Surveillance System (BRFSS) is a collaborative project between all of the states in the United States (US), participating US territories, and the Centers for Disease Control and Prevention (CDC). The BRFSS is administered and supported by CDC’s Population Health Surveillance Branch, under the Division of Population Health at the National Center for Chronic Disease Prevention and Health Promotion.
BRFSS is an ongoing surveillance system designed to measure behavioural risk factors for the non-institutionalized adult population (18 years of age and older) residing in the US. Bias can exist in the data. Even so, the data source is considered valid and generalizable; its reliability is around 75%, which compares well with other surveys of a similar nature.
The Behavioral Risk Factor Surveillance System (BRFSS) is the nation’s premier system of health-related telephone surveys that collect state data about U.S. residents regarding their health-related risk behaviours, chronic health conditions, and use of preventive services.
The BRFSS objective is to collect uniform, state-specific data on preventive health practices and risk behaviours that are linked to chronic diseases, injuries, and preventable infectious diseases in the adult population. Factors assessed include tobacco use, HIV/AIDS knowledge and prevention, exercise, immunization, health status, healthy days (health-related quality of life), health care access, inadequate sleep, hypertension awareness, cholesterol awareness, chronic health conditions, alcohol consumption, fruit and vegetable consumption, arthritis burden, and seatbelt use.
This dataset contains the following files:
- brfss2013.RData: The BRFSS data in RData format. Contains 491775 records and 330 columns. Useful for analysis in R.
- brfss2013.csv: The BRFSS data in CSV format. Contains 491775 records and 330 columns. Useful for analysis in Python.
- brfss2013_sample.csv: A sample of the BRFSS data in CSV format. Contains 100 records and 8 columns.
Because there are 330 columns, you can select only the columns of interest and start your analysis. The codebook contains the column information, and you can search for column names in the PDF: https://www.cdc.gov/brfss/annual_data/2013/pdf/CODEBOOK13_LLCP.pdf
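Selecting a subset of columns at load time keeps memory use down with a file this wide. The following is a minimal sketch using the `usecols` parameter of `pandas.read_csv`; the helper name and the column names in the demo are hypothetical, so real names should be taken from the codebook or from the CSV header.

```python
import io

import pandas as pd


def load_columns(source, columns):
    """Read only the requested columns from a CSV path or file-like object."""
    return pd.read_csv(source, usecols=columns)


# Tiny in-memory stand-in for brfss2013.csv; the column names are hypothetical.
demo = io.StringIO("col_a,col_b,col_c\n1,2,3\n4,5,6\n")
subset = load_columns(demo, ["col_a", "col_c"])
print(subset)
```

With the real file, the call would look like `load_columns("brfss2013.csv", [...])` with the chosen column names filled in.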
https://www.cdc.gov/brfss/index.html http://www.cdc.gov/brfss/annual_data/2013/pdf/Overview_2013.pdf
License: https://creativecommons.org/publicdomain/zero/1.0/
The data source for this project work is a large collection of raw data publicly available on the CDC website. “CDC is the nation’s leading science-based, data-driven, service organisation that protects the public’s health. In 1984, the CDC established the Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS is the nation’s premier system of health-related telephone surveys that collect state data about U.S. residents regarding their health-related risk behaviours, chronic health conditions, and use of preventive services.” (CDC - BRFSS Annual Survey Data, 2020).
I have used data collected between 2005 and 2021, which contains more than 7 million records in total (7,143,987 to be exact). Each year has around 300 to 400 features, but not all of them are relevant to this project. I have shortlisted 22 features that are relevant for designing and developing my ML models, and I have explained them in detail in the table below.
The 2021 codebook explains these columns in more detail: https://www.cdc.gov/brfss/annual_data/2021/pdf/codebook21_llcp-v2-508.pdf
All datasets were obtained from the CDC website, where they are available as Zip archives containing a SAS format file with an .xpt extension. I downloaded all the Zip files, extracted them, and converted each one to .csv so that I could easily read the records in my project code. I used the command below in Anaconda Prompt to convert an .xpt file into a .csv file:
C:\users\mayur\Downloads> python -m xport LLCP2020.xpt > LLCP2020.csv
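After conversion, the yearly files can be stacked into one table that keeps only the shortlisted features. The sketch below is one possible way to do this with pandas and is not the author's code. The function name and the toy column names are hypothetical, and it assumes that `year` is not itself one of the selected features. A feature that is absent in a given year is filled with missing values.

```python
import pandas as pd


def combine_years(frames: dict, features: list) -> pd.DataFrame:
    """Stack yearly DataFrames, keeping `features` plus a `year` column."""
    parts = []
    for year, df in sorted(frames.items()):
        # reindex keeps the column order and adds NaN for absent features.
        part = df.reindex(columns=features).copy()
        part.insert(0, "year", year)
        parts.append(part)
    return pd.concat(parts, ignore_index=True)


# Toy frames standing in for two converted yearly files.
frames = {
    2020: pd.DataFrame({"FEAT_A": [1.0], "FEAT_B": [2.0], "EXTRA": [9.0]}),
    2019: pd.DataFrame({"FEAT_A": [3.0]}),
}
combined = combine_years(frames, ["FEAT_A", "FEAT_B"])
print(combined)
```

In practice each frame would come from something like `pd.read_csv("LLCP2020.csv", usecols=...)`, which avoids loading the unused columns for every year.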