34 datasets found

U.S. fitness center/health club memberships 2000-2024
statista.com
Updated Nov 26, 2025
Cite
Statista (2025). U.S. fitness center/health club memberships 2000-2024 [Dataset]. https://www.statista.com/statistics/236123/us-fitness-center-health-club-memberships/
Dataset updated
Nov 26, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
The number of members of fitness centers and health clubs within the United States has experienced a near continual increase between 2000 and 2024. In 2024, there were found to be around ** million members of fitness centers and health clubs within the U.S., the greatest number during the period of observation.
Attendance rate of gym / boutique gym members in the United States 2017
statista.com
Updated Nov 26, 2025
Cite
Statista (2025). Attendance rate of gym / boutique gym members in the United States 2017 [Dataset]. https://www.statista.com/statistics/930495/gym-boutique-gym-membership-attendance/
Dataset updated
Nov 26, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2017
Area covered
United States
Description
The statistic shows survey results on how often people attended their current gyms and boutique gyms in the United States in 2017. ** percent of boutique gym members attend their current gym * or more times per week.
Population Health (BRFSS: HRQOL)
kaggle.com
zip
Updated Dec 14, 2022
Cite
The Devastator (2022). Population Health (BRFSS: HRQOL) [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlock-population-health-needs-with-brfss-hrqol
Available download formats: zip (2247473 bytes)
Dataset updated
Dec 14, 2022
Authors
The Devastator
Description
Population Health (BRFSS: HRQOL)

Examining Trends, Disparities and Determinants of Health in the US Population

By Health [source]

About this dataset

The Behavioral Risk Factor Surveillance System (BRFSS) offers an expansive collection of data on health-related quality of life (HRQOL) from 1993 to 2010. Over this period, the Health-Related Quality of Life dataset reflects a comprehensive survey of the health and well-being of non-institutionalized US adults aged 18 years or older. The data can help track and identify unmet population health needs, recognize trends, identify disparities in healthcare, identify determinants of public health, inform decision making and policy development, and evaluate programs within public healthcare services.

The HRQOL surveillance system has developed a compact set of HRQOL measures, such as a summary measure of unhealthy days, which have been validated for population health surveillance and widely used in practice since 1993. The dataset includes the year recorded, location abbreviations and descriptions, category and topic overviews, the survey questions asked, the type and unit of each data value, sample sizes, and geographic locations.

How to use the dataset

This dataset tracks the Health-Related Quality of Life (HRQOL) from 1993 to 2010 using data from the Behavioral Risk Factor Surveillance System (BRFSS). This dataset includes information on the year, location abbreviation, location description, type and unit of data value, sample size, category and topic of survey questions.

Using this dataset on BRFSS: HRQOL data between 1993-2010 will allow for a variety of analyses related to population health needs. The compact set of HRQOL measures can be used to identify trends in population health needs as well as determine disparities among various locations. Additionally, responses to survey questions can be used to inform decision making and program and policy development in public health initiatives.

Research Ideas

Analyzing trends in HRQOL over the years by location to identify disparities in health outcomes between different populations and develop targeted policy interventions.

Developing new models for predicting HRQOL indicators at a regional level, and using this information to inform medical practice and public health implementation efforts.

Using the data to understand differences between states in terms of their HRQOL scores and establish best practices for healthcare provision based on that understanding, including areas such as access to care, preventative care services availability, etc

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

File: rows.csv

| Column name | Description |
|:-------------------------------|:----------------------------------------------------------|
| Year | Year of survey. (Integer) |
| LocationAbbr | Abbreviation of location. (String) |
| LocationDesc | Description of location. (String) |
| Category | Category of survey. (String) |
| Topic | Topic of survey. (String) |
| Question | Question asked in survey. (String) |
| DataSource | Source of data. (String) |
| Data_Value_Unit | Unit of data value. (String) |
| Data_Value_Type | Type of data value. (String) |
| Data_Value_Footnote_Symbol | Footnote symbol for data value. (String) |
| Data_Value_Std_Err | Standard error of the data value. (Float) |
| Sample_Size | Sample size used in sample. (Integer) |
| Break_Out | Break out categories used. (String) |
| Break_Out_Category | Type break out assessed. (String) |
| **GeoLocation*...
U.S. Pandemic Mental Health Care
kaggle.com
zip
Updated Jan 21, 2023
+ more versions
Cite
The Devastator (2023). U.S. Pandemic Mental Health Care [Dataset]. https://www.kaggle.com/datasets/thedevastator/u-s-pandemic-mental-health-care
Available download formats: zip (75773 bytes)
Dataset updated
Jan 21, 2023
Authors
The Devastator
Area covered
United States
Description
U.S. Pandemic Mental Health Care

Impact on Households in Previous 4 Weeks

By US Open Data Portal, data.gov [source]

About this dataset

This U.S. Household Pandemic Impacts dataset assesses the mental health care that households in America received over the previous four weeks during the Covid-19 pandemic. The survey, produced by the U.S. Census Bureau in collaboration with five other federal agencies, was designed to measure the social and economic impacts of Covid-19 on American households, including employment status, consumer spending trends, food security levels and housing disruptions. Data were collected through an internet questionnaire, with invitations sent by email and text message to randomly selected housing units linked to email addresses or cell phone numbers in the Census Bureau Master Address File. All estimates comply with NCHS Data Presentation Standards for Proportions. Be sure to read more about U.S. Government Works for further details.

How to use the dataset

This dataset can be useful to examine the impact of the Covid-19 pandemic on access to and utilization of mental health care by U.S. households in the last 4 weeks.

By studying this dataset, you can gain insight into how people’s mental health has been affected by the pandemic and identify trends based on population subgroups, states, phases of the survey and more.

Instructions for Use:
- To get started, open up ‘csv-1’ found in this dataset. This file contains information on access to and utilization of mental health care by U.S. households in the last 4 weeks, broken down into 14 different columns (e.g., Indicator, Group, State).
- Familiarize yourself with each column label (e.g., Time Period Start Date), data type (e

Research Ideas

Analyzing the impact of pandemic-induced stress on different demographic groups, such as age and race/ethnicity.

Comparing the mental health care services received in different states over time.

Investigating the correlation between socio-economic status and access to mental health care services during Covid-19 pandemic

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: Dataset copyright by authors
- You are free to:
  - Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt - remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit - Provide a link to the license, and indicate if changes were made.
  - ShareAlike - You must distribute your contributions under the same license as the original.
  - Keep intact - all notices that refer to this license, including copyright notices.

Columns

File: csv-1.csv

| Column name | Description |
|:---------------------------|:-------------------------------------------------------------------|
| Indicator | The type of indicator being measured. (String) |
| Group | The group (by age, gender or race) being measured. (String) |
| State | The state where the data was collected. (String) |
| Subgroup | A narrower level categorization within Group. (String) |
| Phase | Phase number reflective of survey iteration. (Integer) |
| Time Period | A label indicating duration captured by survey period. (String) |
| Time Period Label | A label indicating duration captured by survey period. (String) |
| Time Period Start Date | Beginning date for surveyed period. (DateFormat ‘YYYY-MM-DD’) |
| Time Period End Date | End date for surveyed period. (DateFormat ‘YYYY-MM-DD’) |
| Value | The value of the indicator being measured. (Float) |
| LowCI | The lower confidence interval of the value. (Float) |
| HighCI | The higher confidence interval of the value. (Float) |
| Quartile Range | The quartile range of the value. (String) |
| Suppression Flag | A f...
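
The column types listed above suggest how a row might be parsed before analysis. The following is a minimal sketch using only the Python standard library. The sample rows are invented, only some of the listed columns are included, and treating a blank Value as a suppressed estimate is an assumption rather than something the dataset description confirms.

```python
import csv
import io
from datetime import date

# Hypothetical rows in the csv-1.csv column layout; all values are invented.
SAMPLE = """Indicator,Group,State,Subgroup,Phase,Time Period,Time Period Label,Time Period Start Date,Time Period End Date,Value,LowCI,HighCI
Example indicator,By State,Alabama,Alabama,2,13,Example label,2020-08-19,2020-08-31,10.5,8.0,13.5
Example indicator,By State,Alaska,Alaska,2,13,Example label,2020-08-19,2020-08-31,,,
"""


def parse_rows(rows):
    """Convert typed fields and skip rows without a Value (assumed suppressed)."""
    parsed = []
    for row in rows:
        if not row["Value"]:
            continue
        # Dates are documented as YYYY-MM-DD, which fromisoformat accepts.
        start = date.fromisoformat(row["Time Period Start Date"])
        end = date.fromisoformat(row["Time Period End Date"])
        low, high = float(row["LowCI"]), float(row["HighCI"])
        parsed.append({
            "state": row["State"],
            "phase": int(row["Phase"]),
            "days": (end - start).days + 1,  # inclusive length of the survey period
            "value": float(row["Value"]),
            "ci_width": high - low,
        })
    return parsed


parsed = parse_rows(csv.DictReader(io.StringIO(SAMPLE)))
for record in parsed:
    print(record)
```

With the real file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle; the column names would need checking against the actual header first.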
NYC Open Data
kaggle.com
zip
Updated Mar 20, 2019
Cite
NYC Open Data (2019). NYC Open Data [Dataset]. https://www.kaggle.com/datasets/nycopendata/new-york
Available download formats: zip (0 bytes)
Dataset updated
Mar 20, 2019
Dataset authored and provided by
NYC Open Data
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

NYC Open Data is an opportunity to engage New Yorkers in the information that is produced and used by City government. We believe that every New Yorker can benefit from Open Data, and Open Data can benefit from every New Yorker. Source: https://opendata.cityofnewyork.us/overview/

Content

Thanks to NYC Open Data, which makes public data generated by city agencies available for public use, and Citi Bike, we've incorporated over 150 GB of data in 5 open datasets into Google BigQuery Public Datasets, including:

Over 8 million 311 service requests from 2012-2016

More than 1 million motor vehicle collisions 2012-present

Citi Bike stations and 30 million Citi Bike trips 2013-present

Over 1 billion Yellow and Green Taxi rides from 2009-present

Over 500,000 sidewalk trees surveyed decennially in 1995, 2005, and 2015

This dataset is deprecated and not being updated.

Fork this kernel to get started with this dataset.

Acknowledgements

https://opendata.cityofnewyork.us/

https://cloud.google.com/blog/big-data/2017/01/new-york-city-public-datasets-now-available-on-google-bigquery

This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - https://data.cityofnewyork.us/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

By accessing datasets and feeds available through NYC Open Data, the user agrees to all of the Terms of Use of NYC.gov as well as the Privacy Policy for NYC.gov. The user also agrees to any additional terms of use defined by the agencies, bureaus, and offices providing data. Public data sets made available on NYC Open Data are provided for informational purposes. The City does not warranty the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set made available on NYC Open Data, nor are any such warranties to be implied or inferred with respect to the public data sets furnished therein.

The City is not liable for any deficiencies in the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set, or application utilizing such data set, provided by any third party.

Banner Photo by @bicadmedia from Unsplash.

Inspiration

On which New York City streets are you most likely to find a loud party?

Can you find the Virginia Pines in New York City?

Where was the only collision caused by an animal that injured a cyclist?

What’s the Citi Bike record for the Longest Distance in the Shortest Time (on a route with at least 100 rides)?

https://cloud.google.com/blog/big-data/2017/01/images/148467900588042/nyc-dataset-6.png
US Births by County and State
kaggle.com
zip
Updated Jan 22, 2023
Cite
The Devastator (2023). US Births by County and State [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-births-by-county-and-state
Available download formats: zip (3159011 bytes)
Dataset updated
Jan 22, 2023
Authors
The Devastator
Area covered
United States
Description
US Births by County and State

1985-2015 Aggregated Data

By data.world's Admin [source]

About this dataset

This dataset contains an aggregation of birth data from the United States between 1985 and 2015. It consists of information on mothers' locations by state (including District of Columbia) and county, the month in which they gave birth, and aggregates giving the sum of births during that month. The data has been provided by both the National Bureau of Economic Research and the National Center for Health Statistics, whose shared mission is to understand how life works in order to aid individuals in making decisions about their health and wellbeing. This dataset provides valuable insight into population trends across time and location: for example, which states have higher or lower birth rates than others, and which counties experience dramatic fluctuations over time? Given its scope, the dataset could be used in a number of contexts, from epidemiology research to population forecasting. Be sure to check out our other datasets related to births while you're here!

How to use the dataset

This dataset could be used to examine local trends in birth rates over time or analyze births at different geographical locations. In order to maximize your use of this dataset, it is important that you understand what information the various columns contain.

The main columns are: State (including District of Columbia), County (coded using the FIPS county code number), Month (numbered from 1 for January through 12 for December), Year (4-digit year), countyBirths (calculated sum of births that occurred to mothers living in a county for a given month) and stateBirths (calculated sum of births that occurred to mothers living in a state for a given month). These fields should provide enough information for you to analyze trends across geographic locations at both monthly and yearly levels. You could also consider combining variables such as Year with State, Year with Month, or any other grouping, depending on your analysis goal.
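
As an illustration of the Year-with-State grouping mentioned above, here is a minimal sketch using only the Python standard library. The sample rows are invented, and the code assumes that stateBirths is repeated on every county row of the same state and month, which the column description suggests but does not state outright. Under that assumption each state-month must be counted once before summing.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows in the allBirthData.csv column layout; all values are invented.
SAMPLE = """State,County,Month,Year,countyBirths,stateBirths
1,1,1,1990,100,500
1,3,1,1990,400,500
1,1,2,1990,120,520
1,3,2,1990,400,520
2,13,1,1990,50,50
"""


def yearly_state_births(rows):
    """Sum births per (State, Year), counting each state-month only once."""
    monthly = {}
    for row in rows:
        key = (int(row["State"]), int(row["Year"]), int(row["Month"]))
        # Assumed: stateBirths repeats on each county row, so keep a single copy.
        monthly[key] = int(row["stateBirths"])
    totals = defaultdict(int)
    for (state, year, _month), births in monthly.items():
        totals[(state, year)] += births
    return dict(totals)


totals = yearly_state_births(csv.DictReader(io.StringIO(SAMPLE)))
print(totals)
```

Summing stateBirths directly over all rows would count each month once per county, so the de-duplication step matters if the assumption holds. Summing countyBirths instead is an alternative when every county is present in the file.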

In addition, while all data were downloaded on April 5th 2017, it is worth noting that all sources used followed privacy guidelines as laid out by NCHS, so individual births occurring after 2005 are not included due to geolocation concerns.
We hope you find this dataset useful and can benefit from its content! With proper understanding of what each field contains, we are confident you will gain valuable insights on birth rates across counties within the United States during this period.

Research Ideas

Establishing county-level trends in birth rates for the US over time.

Analyzing the relationship between month of birth and health outcomes for US babies after they are born (e.g., infant mortality, neurological development, etc.).

Comparing state/county-level differences in average numbers of twins born each year

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

File: allBirthData.csv

| Column name | Description |
|:-----------------|:-----------------------------------------------------------------------------------------------------------------|
| State | The numerical order of the state where the mother lives. (Integer) |
| Month | The month in which the birth took place. (Integer) |
| Year | The year of the birth. (Integer) |
| countyBirths | The calculated sum of births that occurred to mothers living in that county for that particular month. (Integer) |
| stateBirths | The aggregate number at the level of entire states for any given month-year combination. (Integer) |
| County | The county where the mother lives, coded using FIPS County Code. (Integer) |

Acknowledgements

If you use this dataset in your research, please credit data.world's Admin.
Data from: Sex-specific additive genetic variances and correlations for...
borealisdata.ca
search.dataone.org
Updated May 19, 2021
+ more versions
Cite
Matthew Ernest Wolak; Peter Arcese; Lukas F. Keller; Pirmin Nietlisbach; Jane M. Reid (2021). Data from: Sex-specific additive genetic variances and correlations for fitness in a song sparrow (Melospiza melodia) population subject to natural immigration and inbreeding [Dataset]. http://doi.org/10.5683/SP2/0PMFIV
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.5683/SP2/0PMFIV
Dataset updated
May 19, 2021
Dataset provided by
Borealis
Authors
Matthew Ernest Wolak; Peter Arcese; Lukas F. Keller; Pirmin Nietlisbach; Jane M. Reid
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Canada, British Columbia
Description
Abstract

Quantifying sex-specific additive genetic variance (VA) in fitness, and the cross-sex genetic correlation (rA), is prerequisite to predicting evolutionary dynamics and the magnitude of sexual conflict. Further, quantifying VA and rA in underlying fitness components, and genetic consequences of immigration and resulting gene flow, is required to identify mechanisms that maintain VA in fitness. However, these key parameters have rarely been estimated in wild populations experiencing natural environmental variation and immigration. We used comprehensive pedigree and life history data from song sparrows (Melospiza melodia) to estimate VA and rA in sex-specific fitness and underlying fitness components, and to estimate additive genetic effects of immigrants alongside inbreeding depression. We found evidence of substantial VA in female and male fitness, with a moderate positive cross-sex rA. There was also substantial VA in male but not female adult reproductive success, and moderate VA in juvenile survival but not adult annual survival. Immigrants introduced alleles with negative additive genetic effects on local fitness, potentially reducing population mean fitness through migration load, but alleviating expression of inbreeding depression. Our results show that VA for fitness can be maintained in the wild, and be broadly concordant between the sexes despite marked sex-specific VA in reproductive success.

Usage notes

Wolak_et_al_SOSP_fitness_QG_Data

Data for SEX-SPECIFIC ADDITIVE GENETIC VARIANCES AND CORRELATIONS FOR FITNESS IN A SONG SPARROW (MELOSPIZA MELODIA) POPULATION SUBJECT TO NATURAL IMMIGRATION AND INBREEDING by Wolak, Arcese, Keller, Nietlisbach, & Reid published in Evolution. These data come from the long-term song sparrow field study on Mandarte Island, BC, Canada. The data provided here are sufficient to replicate the analyses presented in the above paper, and are therefore a restricted subset of the full Mandarte dataset.

If you are interested in running additional analyses that require further data then please get in touch with at least one (preferably all) of the following project leaders:
- Prof Peter Arcese (University of British Columbia): peter.arceseubc.ca
- Prof Lukas Keller (University of Zurich): lukas.kellerieu.uzh.ch
- Prof Jane Reid (University of Aberdeen): jane.reidabdn.ac.uk

We are always happy to develop collaborations with researchers who have good ideas for new analyses. We would also appreciate it if you could let us know if you are intending to make use of the dataset below in order to facilitate coordination of different ongoing research efforts and allow us to keep track of all outputs from the long-term field study.

Wolak_et_al_SOSP_fitness_QG.zip

Wolak_et_al_SOSP_fitness_QG_AnalysisCode

Code for SEX-SPECIFIC ADDITIVE GENETIC VARIANCES AND CORRELATIONS FOR FITNESS IN A SONG SPARROW (MELOSPIZA MELODIA) POPULATION SUBJECT TO NATURAL IMMIGRATION AND INBREEDING by Wolak, Arcese, Keller, Nietlisbach, & Reid published in Evolution. These data come from the long-term song sparrow field study on Mandarte Island, BC, Canada. The data provided here are sufficient to replicate the analyses presented in the above paper, and are therefore a restricted subset of the full Mandarte dataset.

If you are interested in running additional analyses that require further data then please get in touch with at least one (preferably all) of the following project leaders:
- Prof Peter Arcese (University of British Columbia): peter.arceseubc.ca
- Prof Lukas Keller (University of Zurich): lukas.kellerieu.uzh.ch
- Prof Jane Reid (University of Aberdeen): jane.reidabdn.ac.uk

We are always happy to develop collaborations with researchers who have good ideas for new analyses. We would also appreciate it if you could let us know if you are intending to make use of the dataset below in order to facilitate coordination of different ongoing research efforts and allow us to keep track of all outputs from the long-term field study.

Wolak_et_al_fitness_AnalysisCode.R
Major US Sports Venues Usage and Affiliations
kaggle.com
zip
Updated Jan 15, 2023
Cite
The Devastator (2023). Major US Sports Venues Usage and Affiliations [Dataset]. https://www.kaggle.com/datasets/thedevastator/major-us-sports-venues-usage-and-affiliations
Available download formats: zip (36399 bytes)
Dataset updated
Jan 15, 2023
Authors
The Devastator
Area covered
United States
Description
Major US Sports Venues Usage and Affiliations

Team, League, Conference and Population Usage Records

By Homeland Infrastructure Foundation [source]

About this dataset

This dataset provides detailed information on major sports venues, along with their usage and affiliations. It includes data related to the National Association for Stock Car Auto Racing, Indy Racing League, Major League Soccer, Major League Baseball, National Basketball Association, Women's National Basketball Association, National Hockey League, National Football League, PGA Tour, NCAA Division 1 FBS Football, NCAA Division 1 Basketball and thoroughbred horse racing. The dataset contains columns such as USE (the type of use for the venue), TEAM (the team associated with the venue), LEAGUE (the league associated with the venue), CONFERENCE (the conference associated with the venue), DIVISION (the division associated with the venue), INST_AFFIL (the institution affiliation associated with the venue), TRACK_TYPE (the type of track at a specific point in time or over its complete life cycle), LENGTH_MILEGE (the length of the track as mileage), ROOF_TYPE (the type of roof covering used at a specific point in time or over its complete life cycle) and plenty of other variables. With this range and quantity of data points, spanning countries across different continents and leagues, you can explore patterns in sports games you never even thought were possible!

How to use the dataset

The Major US Sports Venues Usage and Affiliations dataset includes data on major sports venues from leagues including National Association for Stock Car Auto Racing (NASCAR), Indy Racing League (IRL), Major League Soccer (MLS), Major League Baseball (MLB), National Basketball Association (NBA), Women's National Basketball Association (WNBA), National Hockey League (NHL), National Football League (NFL), PGA Tour, NCAA Division 1 FBS Football, NCAA Division 1 Basketball, and thoroughbred horse racing. The columns provided include USE_, USE_POP, TEAM, LEAGUE, CONFERENCE, DIVISION, INST_AFFIL, TRACK_TYPE, LENGTH_MI, ROOF_TYPE, STADIUM_SH, ADDDATAE, USEWEBSITE, and COMMENTS.

The USE_ column specifies the type of usage of each venue, which can be college athletics or professional athletics. The corresponding USE_POP column indicates how many people use each venue for a particular sport at a given time. For example, if 6 NHL games were played on a given day, USE_ would read “professional Athletics” while USE_POP would read “NNN”, meaning that NNN people attended those events collectively. The next column, TEAM, identifies the team that sponsors or manages each venue, or the teams that play there.

Following TEAM is LEAGUE, which gives the league each team belongs to, such as MLB or NBA. The next three columns, CONFERENCE, DIVISION and INST_AFFIL, provide more specific details and extend to the collegiate level: CONFERENCE indicates the conference a team belongs to within its division, while INST_AFFIL gives the affiliated school, e.g. Southeastern Conference > University of Arkansas Razorbacks. Rounding up our overview these last three columns TRACK_TYPE/LENGTH

Research Ideas

Analyzing the affiliations and usage of different sports venues to determine which teams or leagues have the most presence across a certain geographic area.

Comparing different stadiums within a given conference in terms of their roof type, track length, and stadium shape for optimal design features for new construction projects.

Placing sponsorships or advertisements within each sporting arena based on audience size, league popularity, and team affiliation within a given conference or division

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: Dataset copyright by authors
- You are free to:
  - Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt - remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit - Provide a link to the license, and indicate if changes were made.
  - ShareAlike - You must distribute your contribut...
US Health Insurance
kaggle.com
zip
Updated Jan 7, 2023
Cite
The Devastator (2023). US Health Insurance [Dataset]. https://www.kaggle.com/datasets/thedevastator/comprehensive-analysis-of-us-health-insurance-ma
Available download formats: zip (15726377 bytes)
Dataset updated
Jan 7, 2023
Authors
The Devastator
Area covered
United States
Description
US Health Insurance

Exploring Rates, Benefits, and Providers

By Data Society [source]

About this dataset

This fascinating dataset from the Centers for Medicare & Medicaid Services provides an in-depth analysis of health insurance plans offered throughout the United States. Exploring this data, you can gain insights into how plan rates and benefits vary across states, explore how plan benefits relate to plan rates, and investigate how plans vary across insurance network providers.

The top-level directory includes six CSV files which contain information about: BenefitsCostSharing.csv; BusinessRules.csv; Network.csv; PlanAttributes.csv; Rate.csv; and ServiceArea.csv - as well as two additional CSV files which facilitate joining data across years: Crosswalk2015.csv (joining 2014 and 2015 data) and Crosswalk2016

How to use the dataset

This Kaggle dataset contains comprehensive data on US health insurance Marketplace plans. The data was obtained from the Centers for Medicare & Medicaid Services and contains information such as plan rates and benefits, metal levels, dental coverage, and child/adult-only coverages.

In order to use this dataset effectively, it is important to understand the different columns/variables that make up the dataset. The columns are state, dental plan, multistate plan (2015 and 2016), metal level (2014-2016), child/adult-only coverage (2014-2016), FIPS code (Federal Information Processing Standard code for the particular state), zipcode, crosswalk level (level of crosswalk between 2014-2016 data sets), reason for crosswalk parameter.

Using this dataset can help you answer interesting questions about US health insurance Marketplace plans across different variables such as state or rate information. It may also be interesting to compare certain variables over time with respect to how they affect certain types of people or how they differ across states or regions. Additionally, an analysis of the different price points associated with various kinds of coverage could provide insights into which kinds of plans are most attractive in various marketplaces based on cost savings alone.

Once you have a good understanding of your data from studying individual parameters in depth across multiple states or regions, you can begin looking at correlations between different parameters. With sufficient historical data, you can identify patterns that emerge around common characteristics or trends within areas or across markets over time:

Do higher out-of-pocket limits tend to come with higher premiums?

Are there more multi-state markets in some states than others?

What type of metal levels does each region prefer?

Research Ideas

Examining the impacts of age, metal levels and plan benefits on insurance rates in different states.

Analyzing how dental plans vary across different states/regions and examining whether there are correlations between affordability and quality of care among plans with dental coverage options.

Investigating how the Crosswalk level affects insurance rates by comparing insurance premiums for different metal levels across states with varying Crosswalk levels (e.g., how does a Bronze plan differ in cost between two states with Crosswalk Level 1 vs 2).

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: Dataset copyright by authors

You are free to:
- Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt: remix, transform, and build upon the material for any purpose, even commercially.

You must:
- Give appropriate credit, provide a link to the license, and indicate if changes were made.
- ShareAlike: You must distribute your contributions under the same license as the original.
- Keep intact: all notices that refer to this license, including copyright notices.

Columns

File: Crosswalk2016.csv

| Column name | Description |
|:------------|:------------|
| State | The state in which... |
Cancer Rates by U.S. State
kaggle.com
zip
Updated Dec 26, 2022
Cite
Heemali Chaudhari (2022). Cancer Rates by U.S. State [Dataset]. https://www.kaggle.com/datasets/heemalichaudhari/cancer-rates-by-us-state
Explore at:
Available download formats: zip (219237 bytes)
Dataset updated
Dec 26, 2022
Authors
Heemali Chaudhari
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
In the following maps, the U.S. states are divided into groups based on the rates at which people developed or died from cancer in 2013, the most recent year for which incidence data are available.

The rates are the numbers out of 100,000 people who developed or died from cancer each year.

Incidence Rates by State

The number of people who get cancer is called cancer incidence. In the United States, the rate of getting cancer varies from state to state.

*Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

‡Rates are not shown if the state did not meet USCS publication criteria or if the state did not submit data to CDC.

†Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.
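The footnotes above mention age adjustment. As a rough illustration of how a directly standardized rate differs from a crude rate, the sketch below computes both per 100,000 people. The age bands, case counts, populations, and weights are invented for illustration; they are not taken from this dataset or from the actual 2000 U.S. standard population.

```python
# Hypothetical age bands: (cases, population, standard-population weight).
# All numbers are invented; the weights must sum to 1.
age_bands = [
    (10, 50_000, 0.40),   # younger band
    (60, 30_000, 0.35),   # middle band
    (200, 20_000, 0.25),  # older band
]

def crude_rate(bands):
    """Total cases divided by total population, per 100,000."""
    cases = sum(c for c, _, _ in bands)
    population = sum(p for _, p, _ in bands)
    return cases / population * 100_000

def age_adjusted_rate(bands):
    """Weighted sum of age-specific rates (direct standardization), per 100,000."""
    return sum(c / p * 100_000 * w for c, p, w in bands)

crude = crude_rate(age_bands)
adjusted = age_adjusted_rate(age_bands)
print(f"crude: {crude:.1f}, age-adjusted: {adjusted:.1f}")
```

Because the weights come from a fixed standard population, two states with different age structures can be compared on the adjusted figure without the older state appearing worse simply for being older.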

Death Rates by State

Rates of dying from cancer also vary from state to state.

Source: https://www.cdc.gov/cancer/dcpc/data/state.htm
Americans Vision and Eye Health
kaggle.com
Updated Nov 18, 2022
Cite
The Devastator (2022). Americans Vision and Eye Health [Dataset]. https://www.kaggle.com/datasets/thedevastator/americans-vision-and-eye-health-2005-2014
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Nov 18, 2022
Dataset provided by
Kaggle
Authors
The Devastator
License
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Description
Americans' Vision and Eye Health: 2005-2014

The Impact of Different Factors

About this dataset

The Vision and Eye Health Surveillance System is a population-based study that collects data on visual impairments, eye diseases, and access to eye care. The system is intended to provide accurate estimates of the prevalence of vision loss and eye diseases, as well as to identify barriers to access to vision and eye care. This information can be used for designing, implementing, and evaluating vision and eye health prevention programs.

The Vision and Eye Health Surveillance System is a critical tool for public health officials working to prevent vision loss and promote eye health. In addition to supporting prevention programs, the system can help identify barriers to vision and eye care, which can then be addressed to improve access for all Americans.

How to use the dataset

When using this dataset, it is important to keep in mind that it provides population estimates of visual impairments, eye diseases, and access to eye care. This information can be used for designing, implementing, and evaluating vision and eye health prevention programs.

It is also important to note that the data is self-reported and may not be accurate, so use caution when interpreting it.

Research Ideas

To generate estimates of prevalence of visual impairments, eye diseases and access to eye care at the state level

To identify barriers to access to vision and eye care

To design, implement and evaluate programs for preventing vision loss and promoting eye health

Acknowledgements

The health dataset was used in this study to provide population estimates of vision loss function, eye diseases, health disparities, as well as barriers and facilitators to access to vision and eye care. This information can be used for designing, implementing, and evaluating vision and eye health prevention programs

License

License: Open Database License (ODbL) v1.0

You are free to:
- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material for any purpose, even commercially.

You must:
- Give appropriate credit, provide a link to the license, and indicate if changes were made.
- ShareAlike: You must distribute your contributions under the same license as the original.
- Keep intact: all notices that refer to this license, including copyright notices.
- No Derivatives: If you remix, transform, or build upon the material, you may not distribute the modified material.
- No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Columns

File: Behavioral_Risk_Factors_-_Vision_Eye_Health.csv

| Column name | Description |
|:---------------------------|:---------------------------------------------------------------------------------|
| Year | The year the data was collected. (Integer) |
| LocationAbbr | The two-letter abbreviation for the state where the data was collected. (String) |
| LocationDesc | The name of the state where the data was collected. (String) |
| Topic | The topic of the data. (String) |
| Question | The question asked in the survey. (String) |
| DataSource | The source of the data. (String) |
| Response | The response to the question. (String) |
| Data_Value_Unit | The unit of measurement for the data value. (String) |
| Data_Value_Type | The type of data value. (String) |
| Data_Value_Footnote_Symbol | A footnote symbol for the data value. (String) |
| Sample_Size | The sample size for the data value. (Integer) |
| Break_Out ...
Hospital Care Quality Measures
kaggle.com
zip
Updated Jan 22, 2023
Cite
The Devastator (2023). Hospital Care Quality Measures [Dataset]. https://www.kaggle.com/datasets/thedevastator/hospital-care-quality-measures/code
Explore at:
Available download formats: zip (13361768 bytes)
Dataset updated
Jan 22, 2023
Authors
The Devastator
License
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Description
Hospital Care Quality Measures

Timely & Effective Care Across the U.S

By Health [source]

About this dataset

This dataset includes provider-level data on the quality of timely and effective care from hospitals across the United States. It covers heart attack, heart failure, pneumonia, surgical, emergency department, preventive care, children's asthma, stroke prevention and treatment, and pregnancy and delivery care data, courtesy of the Centers for Medicare & Medicaid Services. With this dataset you can analyze a hospital's performance in all these areas using Hospital Name, Address, City, State, ZIP Code, County Name, and Phone Number, along with the Score for each Measure Name, the Sample size from which it was derived, and a Footnote explanation. Dig into each provider's level of care to understand its performance on providing timely and effective care.

How to use the dataset

To get the most out of this dataset, it is important to understand each column:

Hospital Name: identifies the health care facility.
Address: the address of the hospital.
City: the city where the hospital is located.
State: the state where the hospital is located.
ZIP Code: the hospital's ZIP code.
County Name: the county where the hospital is located.
Phone Number: a contact number for the facility.
Condition: categorizes the types of tests or treatments being monitored.
Measure Name: names the specific measure studied under that condition (e.g., infection prevention).
Score: grades how well the hospital performed on the measure against expectations or goals for quality and patient safety (higher scores indicate better performance).
Sample: the number of patients included in the measure (depending on the scope defined by investigators, a matched control population may sometimes be used as a reference).
Footnote: adds caveats about highlighted findings (such as improvement that still does not meet standards).
Measure Start Date: the date on which data collection for the measure began.
Measure End Date: the cut-off date after which no new data entries are accepted for the measure.

Understanding each column will help you select the variables relevant to your research needs. Additionally, using Location can help narrow down results geographically. With this information, researchers can gain insight into overall trends in timely and effective care across hospitals in different states.

Research Ideas

Create an interactive heatmap to visualize provider-level data across different states. This can allow researchers, consumers and policy makers to identify areas of excellence as well as opportunities for improvement in timely and effective care measures.

Develop a web app that allows users to locate hospitals in their area based on any given health condition, measure name, score or timeframe data provided by this dataset. This could give patients access to quality care options and help them make informed decisions while seeking medical attention.

Utilizing the geographic coordinates data included in the Location column, create a virtual tour function that lets people virtually explore the interior of hospital facilities associated with this dataset...
🏟️ Negro League Database
kaggle.com
zip
Updated Oct 8, 2024
Cite
mexwell (2024). 🏟️ Negro League Database [Dataset]. https://www.kaggle.com/datasets/mexwell/negro-league-database
Explore at:
Available download formats: zip (16198067 bytes)
Dataset updated
Oct 8, 2024
Authors
mexwell
Description
About

The Negro leagues were United States professional baseball leagues comprising teams of African Americans. The term may be used broadly to include professional black teams outside the leagues and it may be used narrowly for the seven relatively successful leagues beginning in 1920 that are sometimes termed "Negro Major Leagues".

To date, Retrosheet has compiled data on 6,116 Negro League games which were played in 337 different ballparks in 259 cities across 33 states, the District of Columbia, and two foreign countries (Mexico and Canada). We have compiled at least some statistics for 2,759 players who participated in one or more of these games. These games include not only regular-season Negro League games, but also all-star games, playoff games, and exhibition games between major-league caliber teams. This latter set includes several hundred games played between White and Black major-league baseball players (so those 2,759 players include players such as Dizzy Dean, Bob Feller, Lefty Grove, Babe Ruth, and Ted Williams, among others).

The centerpiece of Negro League data is a set of .csv files which summarize game-level data for all (5,255) Negro League games for which Retrosheet has compiled data. There are five such .csv files.

gameinfo.csv - contains game-level information such as teams, attendance, umpires, etc.

teamstats.csv - contains team-level statistics - line scores, lineups, and team statistics (batting, pitching, fielding)

batting.csv - batting statistics by player by game

pitching.csv - pitching statistics by player by game

fielding.csv - fielding statistics by player by position by game

The columns are labeled and should be mostly self-explanatory. But, in case not, the columns are defined in the document context.txt which is included in the zip file.

The level of detail at which Negro League data can be determined is highly variable across games and the data "known" is highly uncertain in many cases. For example, for many games, we have no box score but may have a reference to the fact that a particular player had at least one hit in the game. To attempt to convey this uncertainty in our data, teams and players may be given up to three sets of statistical lines for each game within the data files which are available for download. These are identified within the .csv files by the variable 'stattype'.

stattype 'value' is Retrosheet's best estimate of the relevant statistical total

stattype 'lower' is the lower bound on a player's total

stattype 'upper' is the upper bound on a player's total

All teams and players will have lines with stattype 'value' regardless of how little information may be known. Data for which Retrosheet has no information will be blank. In most cases where we have some information, Retrosheet has attempted to make its best estimate of player statistics and has assigned these totals to the stattype 'value'. In cases where there is some uncertainty, additional lines with stattype 'lower' or 'upper' may be added. As an example of 'upper' and 'lower' stattypes, we may know that a pitcher was knocked out of the game in the 5th inning and that the opposing team scored 4 runs in the 5th inning. In this case, the lower and upper bound for the pitcher's innings pitched would be 4 and 4.2, respectively, and the lower and upper bound for the pitcher's runs allowed would be 0 and 4 (plus whatever we know the pitcher allowed in his first four innings pitched).
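The way the three stattype lines combine can be sketched in a few lines of Python. This is only an illustration: 'stattype' is the one column name taken from the description, while the field names 'ip' and 'r' and the 'value' figures are assumptions made up for the example. It also relies on the usual baseball notation in which 4.2 innings means four innings plus two outs.

```python
def innings_to_outs(ip: str) -> int:
    """Convert innings notation ('4.2' = 4 innings plus 2 outs) to a count of outs."""
    whole, _, frac = ip.partition(".")
    return int(whole) * 3 + (int(frac) if frac else 0)

# Hypothetical lines for one pitcher in one game. The bounds mirror the
# example in the text; the 'value' line is invented.
rows = [
    {"stattype": "value", "ip": "4.1", "r": "2"},
    {"stattype": "lower", "ip": "4", "r": "0"},
    {"stattype": "upper", "ip": "4.2", "r": "4"},
]

def bounds(rows, field, convert=int):
    """Return (lower, value, upper), falling back to 'value' when a bound line is absent."""
    by_type = {r["stattype"]: convert(r[field]) for r in rows}
    value = by_type["value"]
    return by_type.get("lower", value), value, by_type.get("upper", value)

ip_bounds = bounds(rows, "ip", innings_to_outs)
run_bounds = bounds(rows, "r")
print(ip_bounds, run_bounds)
```

Working in outs avoids treating 4.2 as a decimal number. Real files may contain blank cells, which this sketch does not handle.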

In addition to these five files which aggregate all Negro League games, we also have compiled separate logs by team (subsets of teamstats.csv divided by team-season), by ballpark (subsets of gameinfo.csv) and by player (subsets of batting.csv, pitching.csv, and fielding.csv). For ballparks and players, these aggregate across all seasons.

In addition to these .csvs, Retrosheet has also compiled event files (.evx files) and box-score files (.ebx files) for games for which sufficient data is available. Games are compiled into a single file for each season for which we have compiled games of the relevant type. In the former case, event files are included both for games for which we have found play-by-play accounts as well as games which have been deduced. The latter are identified within the files via a comment at the start of the play-by-play portion of the file.

Finally, the zip file here includes roster files for all teams for whom Retrosheet has compiled rosters as well as our master files for people (biofile.csv), ballparks (ballparks.csv), and teams (teams.csv). These files include data for all people, teams, and sites across all Retrosheet games, not just Negro League games.

Read more about the dataset here.

Notice

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties ma...
🌊 US Water Quality: 20+ Years of PFAS Monitoring
kaggle.com
zip
Updated Nov 13, 2024
Cite
Anudeep Adiraju (2024). 🌊 US Water Quality: 20+ Years of PFAS Monitoring [Dataset]. https://www.kaggle.com/datasets/anudeepadiraju/ucmr-1-5-combined-csv-data
Explore at:
Available download formats: zip (42971180 bytes)
Dataset updated
Nov 13, 2024
Authors
Anudeep Adiraju
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
UCMR Historical PFAS Contamination Dataset (2001-2024)

Context

This dataset contains comprehensive monitoring data of Per- and Polyfluoroalkyl Substances (PFAS) and other contaminants in U.S. public water systems, collected under the EPA's Unregulated Contaminant Monitoring Rule (UCMR) program from 2001 to 2024. The data represents a critical resource for understanding the prevalence and patterns of PFAS contamination in drinking water across different regions and time periods.

Content

The dataset combines results from multiple UCMR monitoring cycles (UCMR 1-5) and includes over 4 million observations of various contaminants, with a particular focus on PFAS compounds. Each record represents a single analytical measurement at a public water system.

File Descriptions

combined_ucmr_data.csv (4,082,839 rows × 24 columns)

Features Description

Key Fields:
* PWSID: Public Water System Identification number (string)
* PWSName: Name of the Public Water System
* Size: Size category of the water system (L: >10,000, S: ≤10,000 people served)
* FacilityID: Unique identifier for the facility
* FacilityName: Name of the facility
* FacilityWaterType: Source water type
  - GW: Ground Water
  - SW: Surface Water
  - GU: Ground Water Under Direct Influence of Surface Water
  - MX: Mixed Water Types
* SamplePointID: Unique identifier for the sampling location
* SamplePointName: Description of the sampling location
* SamplePointType: Type of sampling point (e.g., EP: Entry Point to distribution system)
* CollectionDate: Date of sample collection
* Contaminant: Name of the contaminant analyzed
* MRL: Minimum Reporting Level in μg/L
* Units: Measurement units (typically μg/L)
* MethodID: EPA analytical method used
* AnalyticalResultsSign: < for less than MRL, = for detected values
* AnalyticalResultValue: Numerical result of the analysis
* SampleEventCode: Sampling event identifier (SE1, SE2, SE3, SE4)
* MonitoringRequirement: Type of monitoring (AM: Assessment Monitoring)
* Region: EPA Region number (1-10)
* State: Two-letter state code

Key PFAS Compounds Monitored:

PFOA (Perfluorooctanoic acid)

PFOS (Perfluorooctanesulfonic acid)

PFHxS (Perfluorohexanesulfonic acid)

PFNA (Perfluorononanoic acid)

PFBS (Perfluorobutanesulfonic acid)

HFPO-DA (GenX chemicals)

And many others (29 PFAS compounds in total)

Use Cases

This dataset is valuable for:
1. Environmental Science: Analyzing trends in PFAS contamination over time
2. Public Health Research: Identifying areas with elevated PFAS levels
3. Machine Learning:
   - Predicting future PFAS levels
   - Identifying patterns in contamination spread
   - Analyzing geographical and temporal trends
4. Policy Analysis: Informing water quality regulations and standards

Challenges in the Dataset

Missing Values: Results below MRL are indicated with '<' sign

Mixed Data Types: Combination of numeric and categorical variables

Temporal Gaps: Different monitoring cycles with varying sampling frequencies

Regional Variations: Inconsistent coverage across different regions

Multiple Contaminants: Need to handle multiple PFAS compounds simultaneously
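The first challenge above, results reported only as below the MRL, is a form of censored data. The sketch below shows one possible way to flag non-detects and fill them with a fraction of the MRL. The field names come from the Features Description, but the sample rows are invented. Both the assumption that AnalyticalResultValue is blank for '<' rows and the half-MRL substitution are illustrative choices, not something the dataset prescribes.

```python
import csv
import io

# Invented sample rows using field names from the Features Description.
SAMPLE = """Contaminant,MRL,AnalyticalResultsSign,AnalyticalResultValue
PFOA,0.004,<,
PFOA,0.004,=,0.012
PFOS,0.004,=,0.008
PFOS,0.004,<,
"""

def parse_results(text, substitute_fraction=0.5):
    """Flag non-detects and fill them with a fraction of the MRL (an assumed convention)."""
    parsed = []
    for row in csv.DictReader(io.StringIO(text)):
        mrl = float(row["MRL"])
        detected = row["AnalyticalResultsSign"] == "="
        # Use the reported value for detections, else substitute a fraction of the MRL.
        value = float(row["AnalyticalResultValue"]) if detected else mrl * substitute_fraction
        parsed.append({"contaminant": row["Contaminant"], "detected": detected, "value": value})
    return parsed

results = parse_results(SAMPLE)
detection_rate = sum(r["detected"] for r in results) / len(results)
print(detection_rate, [r["value"] for r in results])
```

Keeping the `detected` flag alongside the filled value lets later analysis distinguish measured concentrations from substituted ones, or swap in a different treatment of non-detects.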

Citation

Data sourced from EPA's UCMR program. When using this dataset, please cite:
- EPA UCMR Program (https://www.epa.gov/dwucmr)
- UCMR Data Files (2001-2024)

Acknowledgments

Special thanks to:
- EPA for making this data publicly available
- Public Water Systems for collecting and reporting the data
- Environmental laboratories for analyzing the samples

Inspiration

Can we predict PFAS levels for 2024 based on historical trends?

How do PFAS contamination patterns vary by region and water source type?

What correlations exist between different PFAS compounds?

How effective are current detection methods and reporting limits?

Can we identify high-risk areas for future contamination?

Additional Resources

EPA's PFAS Strategic Roadmap

PFAS Health Effects

Safe Drinking Water Act Information
Hipsters in USA
kaggle.com
zip
Updated Jan 15, 2023
Cite
The Devastator (2023). Hipsters in USA [Dataset]. https://www.kaggle.com/datasets/thedevastator/quantifying-hipster-behaviors-and-preferences-in/code
Explore at:
Available download formats: zip (2317788 bytes)
Dataset updated
Jan 15, 2023
Authors
The Devastator
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Area covered
United States
Description
Hipsters in USA

This dataset ranks the level of hipster activity at the block group level in the US

By Chad Gardner [source]

About this dataset

Explore the hidden depths of human behavior and personality using ML for the first time with this cutting-edge data set from Spatial.ai!

We curate an unrivaled collection of over 4 billion social media data points from monthly contributions of 140 million new entries. Our proprietary algorithms then filter, clean, and analyze these immense datasets to distill predictive values that quantify the qualitative essence of any given community or demographic. We sort these insights into 100+ socially relevant segments, ranked by index score, across North America or international locations for easy use in measuring human behavior.

Leverage this sample dataset to get a feel for how detailed and expansive our offerings are; you won't be disappointed! For all your questions, don't hesitate to reach out to our helpful support team at supportspatial.ai. Ready to dive in? Check hundreds more social topics and segmentation taxonomy right now at taxonomy.spatial.ai!

How to use the dataset

This dataset is a great resource to help you quantify the behaviors and preferences of hipsters in the United States. Using this information, you can gain insights into popular trends among hipsters, identify emerging trends early on, and even uncover hidden gems in the US hipster scene.

Research Ideas

Measuring trend changes in the hipster population across different city blocks.

Guiding marketers to target areas with a higher concentration of hipsters for promotions and product campaigns.

Pinpointing key locations for businesses that cater to hipster customers, such as specialty coffee shops, vintage stores, or art galleries

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)

You are free to:
- Share: copy and redistribute the material in any medium or format for non-commercial purposes only.
- Adapt: remix, transform, and build upon the material for non-commercial purposes only.

You must:
- Give appropriate credit, provide a link to the license, and indicate if changes were made.
- ShareAlike: You must distribute your contributions under the same license as the original.

You may not:
- Use the material for commercial purposes.

Columns

File: spatial_self_identifying_hipster_dataset.csv

| Column name | Description |
|:-------------------------------|:-------------------------------------------------------------------------------------------------------|
| social_media_volume_percentile | The percentile of social media usage for the self-identifying hipster population in the USA. (Numeric) |

Acknowledgements

If you use this dataset in your research, please credit Chad Gardner.
US States Tobacco Use Prevalence
kaggle.com
zip
Updated Jan 23, 2023
Cite
The Devastator (2023). US States Tobacco Use Prevalence [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-states-tobacco-use-prevalence
Explore at:
Available download formats: zip (815015 bytes)
Dataset updated
Jan 23, 2023
Authors
The Devastator
Area covered
United States
Description
US States Tobacco Use Prevalence

1996-2010 Survey Data

By Health [source]

About this dataset

This dataset from the Centers for Disease Control and Prevention (CDC) provides state-based surveillance information related to tobacco use among American adults from 1996 to 2010. It contains data on modifiable risk factors for chronic diseases and other leading causes of death obtained from annual BRFSS surveys conducted in participating states.

The dataset focuses on key topics such as cigarette smoking status, prevalence by demographics, frequency, and quit attempts. The metrics collected are important indicators of public health efforts in tobacco prevention, control and cessation programs at the state level.

With this dataset you can explore how people perceive smoking differently across geographical areas as well as their socio-economic backgrounds like gender identity, race or ethnicity, educational level or life stage. Analyzing this data will give us valuable insights into the impact of tobacco consumption in our society today and help create more effective public health interventions tailored to local needs

How to use the dataset

This dataset can be used to study the prevalence of tobacco use in different US states in the period 1996-2010. The dataset contains information on cigarette smoking status, prevalence by demographics, frequency, and quit attempts.

In order to begin exploring this dataset it is recommended that one first understand the column headers and their corresponding values. This can be done by familiarizing oneself with the included data dictionary that defines each column's name and description.

Next, familiarize yourself with the data types contained in the columns. Depending on the query you want to make, some columns may need conversion from one type to another. Common types in this dataset include integers (whole numbers), strings (text), and floats (decimals).

Once you are familiar with both the columns and the data types, start considering which questions you want to answer about tobacco use in US states during this period. Consider which variables might provide valuable insights, such as age, gender, or race, as well as variables such as location or year that could add complexity or context to your analysis. Once your questions are determined, you can begin querying the data using whichever language or platform you choose to work with, such as SQL or Python pandas DataFrames. This lets you manipulate the relevant variables to get useful insights about tobacco use in US states during this period.

Finally, when analyzing any topic, it is helpful to compare your findings across multiple datasets if possible, so consider obtaining other datasets on tobacco use over a similar timespan to compare against these findings.
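The type-conversion and querying steps described above might look like the following minimal sketch, which uses only the Python standard library. YEAR, LocationAbbr, and MeasureDesc appear in the column table for rows.csv; Data_Value is an assumed name for the numeric column, and the sample rows are invented.

```python
import csv
import io

# Invented rows; Data_Value is a hypothetical column name.
SAMPLE = """YEAR,LocationAbbr,MeasureDesc,Data_Value
1996,CA,Measure A,18.6
2010,CA,Measure A,12.1
2010,TX,Measure A,
2010,TX,Measure B,15.8
"""

def to_float(text):
    """Convert a string to float, returning None for blanks."""
    return float(text) if text.strip() else None

# csv reads every cell as a string, so convert the numeric columns explicitly.
records = [
    {**row, "YEAR": int(row["YEAR"]), "Data_Value": to_float(row["Data_Value"])}
    for row in csv.DictReader(io.StringIO(SAMPLE))
]

# Roughly equivalent to: SELECT ... WHERE YEAR = 2010 AND Data_Value IS NOT NULL
in_2010 = [r for r in records if r["YEAR"] == 2010 and r["Data_Value"] is not None]
print([(r["LocationAbbr"], r["Data_Value"]) for r in in_2010])
```

Converting blanks to `None` rather than zero keeps missing survey values from being mistaken for a measured prevalence of zero.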

Research Ideas

Identifying and targeting high-risk locations for tobacco use prevention efforts by analyzing the prevalence of different forms of tobacco use in different states.

Examining patterns of tobacco use among different demographic groups (gender, age, race, etc.) to design better tailored interventions for tobacco cessation.

Comparing quit attempt rates with smoking frequency and prevalence across states to understand the effectiveness of smoke-free laws and policies that have been enacted in recent years

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

File: rows.csv

| Column name | Description |
|:-------------|:----------------------------------|
| YEAR | Year of survey (Integer) |
| LocationAbbr | Abbreviation of the state (String) |
| LocationDesc | Full name of the state (String) |
| TopicType | Type of topic (String) |
| TopicDesc | Description of the topic (String) |
| MeasureDesc | Description of ...

U.S. Metro Healthcare & Demographics

kaggle.com

zip

Updated May 10, 2023

Cite

Utkarsh Singh (2023). U.S. Metro Healthcare & Demographics [Dataset]. https://www.kaggle.com/datasets/utkarshx27/health-services-in-metropolitan-areas

Explore at:

Available download formats: zip (4602 bytes)

Dataset updated

May 10, 2023

Authors

Utkarsh Singh

License

https://www.usa.gov/government-works/

Area covered

United States

Description

The U.S. Census Bureau regularly collects information for many metropolitan areas in the United States, including data on number of physicians and number (and size) of hospitals. This dataset has such information for 83 different metropolitan areas.

Column Name	Description
City	Name of the metropolitan area
NumMDs	Number of physicians
RateMDs	Number of physicians per 100,000 people
NumHospitals	Number of community hospitals
NumBeds	Number of hospital beds
RateBeds	Number of hospital beds per 100,000 people
NumMedicare	Number of Medicare recipients in 2003
PctChangeMedicare	Percent change in Medicare recipients (2000 to 2003)
MedicareRate	Number of Medicare recipients per 100,000 people
SSBNum	Number of Social Security recipients in 2004
SSBRate	Number of Social Security recipients per 100,000 people
SSBChange	Percent change in Social Security recipients (2000 to 2004)
NumRetired	Number of retired workers
SSINum	Number of Supplemental Security Income recipients in 2004
SSIRate	Number of Supplemental Security Income recipients per 100,000 people
SqrtMDs	Square root of number of physicians
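
The rate and square-root columns above are derived from the raw counts. As a rough sketch of that arithmetic, assuming made-up numbers and a hypothetical population figure (population is not among the listed columns), the calculation might look like this:

```python
import math


def rate_per_100k(count: float, population: float) -> float:
    """Return a count expressed per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Illustrative values only; none of these come from the dataset.
num_mds = 2_500
population = 1_000_000  # hypothetical: the dataset lists no population column

rate_mds = rate_per_100k(num_mds, population)  # analogue of RateMDs
sqrt_mds = math.sqrt(num_mds)                  # analogue of SqrtMDs

print(f"physicians per 100,000 people: {rate_mds}")
print(f"square root of physician count: {sqrt_mds}")
```

The same helper would apply to the other "per 100,000 people" columns such as RateBeds, MedicareRate, SSBRate and SSIRate.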

Heart Disease Dataset
kaggle.com
zip
Updated Mar 11, 2023
Cite
Mirza_Hasnine (2023). Heart Disease Dataset [Dataset]. https://www.kaggle.com/datasets/mirzahasnine/heart-disease-dataset/discussion
Explore at:
Available download formats: zip (62688 bytes)
Dataset updated
Mar 11, 2023
Authors
Mirza_Hasnine
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
What is heart disease?

The term “heart disease” refers to several types of heart conditions. The most common type of heart disease in the United States is coronary artery disease (CAD), which affects the blood flow to the heart. Decreased blood flow can cause a heart attack.

What are the symptoms of heart disease?

Sometimes heart disease may be “silent” and not diagnosed until a person experiences signs or symptoms of a heart attack, heart failure, or an arrhythmia. When these events happen, symptoms may include [1]:

Heart attack: Chest pain or discomfort, upper back or neck pain, indigestion, heartburn, nausea or vomiting, extreme fatigue, upper body discomfort, dizziness, and shortness of breath.
Arrhythmia: Fluttering feelings in the chest (palpitations).
Heart failure: Shortness of breath, fatigue, or swelling of the feet, ankles, legs, abdomen, or neck veins.

Learn the Facts About Heart Disease

About 697,000 people in the United States died from heart disease in 2020, which is 1 in every 5 deaths [1,2].

What are the risk factors for heart disease?

High blood pressure, high blood cholesterol, and smoking are key risk factors for heart disease. About half of people in the United States (47%) have at least one of these three risk factors [2]. Several other medical conditions and lifestyle choices can also put people at a higher risk for heart disease, including:

Diabetes
Overweight and obesity
Unhealthy diet
Physical inactivity
Excessive alcohol use

Learn about how heart disease and mental health disorders are related.

Learn more about heart disease, heart attack, and related conditions:

Coronary Artery Disease
Heart Attack
Men and Heart Disease
Women and Heart Disease
Other Related Conditions

What is cardiac rehabilitation?

Cardiac rehabilitation (rehab) is an important program for anyone recovering from a heart attack, heart failure, or some types of heart surgery. Cardiac rehab is a supervised program that includes:

Physical activity
Education about healthy living, including healthy eating, taking medicine as prescribed, and ways to help you quit smoking
Counseling to find ways to relieve stress and improve mental health

A team of people may help you through cardiac rehab, including your health care team, exercise and nutrition specialists, physical therapists, and counselors or mental health professionals.

CDC’s Public Health Efforts Related to Heart Disease

State Public Health Actions to Prevent and Control Chronic Diseases
Million Hearts®
WISEWOMAN

More Information

American Heart Association
National Heart, Lung, and Blood Institute

References

1. Centers for Disease Control and Prevention, National Center for Health Statistics. About Multiple Cause of Death, 1999–2020. CDC WONDER Online Database website. Atlanta, GA: Centers for Disease Control and Prevention; 2022. Accessed February 21, 2022.
2. Tsao CW, Aday AW, Almarzooq ZI, Beaton AZ, Bittencourt MS, Boehme AK, et al. Heart Disease and Stroke Statistics—2022 Update: A Report From the American Heart Association. Circulation. 2022;145(8):e153–e639.
3. Virani SS, Alonso A, Aparicio HJ, Benjamin EJ, Bittencourt MS, Callaway CW, et al. Heart disease and stroke statistics—2021 update: a report from the American Heart Association. Circulation. 2021;143:e254–e743.

Home Health Care Agency Ratings
kaggle.com
zip
Updated Jan 29, 2023
Cite
The Devastator (2023). Home Health Care Agency Ratings [Dataset]. https://www.kaggle.com/datasets/thedevastator/home-health-care-agency-ratings
Explore at:
Available download formats: zip (1078307 bytes)
Dataset updated
Jan 29, 2023
Authors
The Devastator
Description
Home Health Care Agency Ratings

Quality Measurements, Types of Services and More

By US Open Data Portal, data.gov [source]

About this dataset

This dataset provides a list of all Home Health Agencies registered with Medicare. Contained within this dataset is information on each agency's address, phone number, type of ownership, quality measure ratings and other associated data points. With this valuable insight into the operations of each Home Health Care Agency, you can make informed decisions about your care needs. Learn more about the services offered at each agency and how they are rated according to their quality measure ratings. From dedicated nursing care services to speech pathology to medical social services - get all the information you need with this comprehensive look at U.S.-based Home Health Care Agencies!

How to use the dataset

Are you looking to learn more about Home Health Care Agencies registered with Medicare? This dataset can provide quality measure ratings, addresses, phone numbers, types of services offered and other information that may be helpful when researching Home Health Care Agencies.

This guide will explain how to use the data in this dataset to gain a better understanding of Home Health Care Agencies registered with Medicare.

First, you will need to become familiar with the columns in the dataset. A list of all columns and their associated descriptions is provided above for your reference. Once you understand each column’s purpose, it will be easier for you to decide what metrics or variables are most important for your own research.

Next, use this data to compare different Home Health Care Agencies on facets such as type of ownership, services offered and quality measure ratings such as the star rating (from 0-5 stars). The CMS certification number identifies each agency rather than rating it. Information from other sources, such as public reviews or customer feedback, can supplement these numerical metrics and give a more accurate picture of each agency's performance and customer satisfaction level.

Finally, once you have collected enough data points on one agency, or on a comparison between multiple agencies, conduct further analysis using statistical methods such as correlation matrices to find patterns in the data that may reveal insights into your research topic.
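
As an illustration of the correlation-matrix step, here is a minimal sketch using only the Python standard library. The column names (`star_rating`, `timely_care_pct`, `readmission_pct`) and all values are hypothetical placeholders, not fields or figures taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict with the pairwise correlation of every column."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns with made-up values, one entry per agency.
agencies = {
    "star_rating": [2.0, 3.0, 4.0, 5.0],
    "timely_care_pct": [70.0, 80.0, 90.0, 100.0],
    "readmission_pct": [20.0, 18.0, 16.0, 14.0],
}

matrix = correlation_matrix(agencies)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

With real data, values near 1 or -1 would point to measures that move together or in opposite directions, while values near 0 suggest little linear relationship.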

Research Ideas

Using the data to compare quality of care ratings between agencies, so people can make better informed decisions about which agency to hire for home health services.

Analyzing the costs associated with different types of home health care services, such as nursing care and physical therapy, in order to determine where money could be saved in health care budgets.

Evaluating the performance of certain agencies by analyzing the number of episodes billed to Medicare compared to the national averages, so that agencies with lower numbers of billing episodes can be identified and monitored more closely if necessary.

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

Unknown License - Please check the dataset description for more information.

Columns

File: csv-1.csv | Column name | Description | |:----------------------------------------...
Winningest Cities in Sports
kaggle.com
zip
Updated Jan 22, 2023
Cite
The Devastator (2023). Winningest Cities in Sports [Dataset]. https://www.kaggle.com/datasets/thedevastator/winningest-cities-in-sports
Explore at:
Available download formats: zip (466430 bytes)
Dataset updated
Jan 22, 2023
Authors
The Devastator
Description
Winningest Cities in Sports

Professional and NCAA Titles, 1870-2018

By data.world's Admin [source]

About this dataset

This dataset contains the key information about some of the most prominent titletowns in North American professional and NCAA sports. It is a comprehensive collection of team and city data, including information about champions, runners-up, final four appearances, seasons competed in and other valuable insights. Analyzing this data can provide an inside look into why certain cities become titletowns through athletic success. This dataset sheds light on different eras, trends and milestones that demonstrate how sports have shaped our culture across time. With this data you can identify patterns over time to see which cities have been successful or which franchises have had a consistent presence throughout various leagues and eras. The data also offers perspective on how different players, teams or locals have contributed to achieving these titles and creating a standout sporting culture within their city or region

How to use the dataset

This dataset provides comprehensive information about the winningest cities in North American professional and NCAA sports. It includes details on the teams, sport, titles, finals, and final fours of each city based on data collected from 1920 to 2018. With this dataset, you can analyze the success of different cities in North America and identify how many titles they have won over the years.

Step 1: Look at the data fields included in this dataset. The columns include:

key: unique identifier for each city
values: array of objects containing info about teams from that city
population: population of that city
seasons: array with information about seasons in which teams from the city competed
level_0: index for each row
year: year of championship
sport: sport of championship
winner: winning team for each championship season
winner_metro: metro area for the winning team
runner_up: runner-up team for each championship season
runner_up_metro: metro area for the runner-up team
finalfour3: third team in the final four of each championship season

The remaining final-four fields (the metro area for finalfour3, and a fourth final-four team with its metro area) follow the same pattern.

Step 2: For a deeper analysis, look into a specific title count by year using filters, or search through single rows like those in the ‘case2’ json file to see detailed information about the objects inside ‘teamArray’, such as team name, sport or league, and the array of titles. You can also use the ‘seasons’ field in the case2 json files to look into years and list all teams competing in a particular season, which gives a better understanding of which countries/teams were competing against one another over that period.

Step 3: To make comparison easier between cities don't forget to include metro area filters when searching otherwise you might not get
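
The filtering described in the steps above can be sketched as follows. This is only an illustration: the rows are made up, and the field names (`year`, `sport`, `winner`, `winner_metro`) are taken from the column description on the assumption that each row holds one championship season.

```python
from collections import Counter

# Made-up rows; the real file may be structured differently.
rows = [
    {"year": 2001, "sport": "Sport A", "winner": "Team X", "winner_metro": "Metro 1"},
    {"year": 2002, "sport": "Sport A", "winner": "Team Y", "winner_metro": "Metro 2"},
    {"year": 2002, "sport": "Sport B", "winner": "Team Z", "winner_metro": "Metro 1"},
    {"year": 2003, "sport": "Sport A", "winner": "Team X", "winner_metro": "Metro 1"},
]


def titles_by_metro(rows, sport=None, start_year=None, end_year=None):
    """Count titles per metro area, optionally filtered by sport and year range."""
    counts = Counter()
    for row in rows:
        if sport is not None and row["sport"] != sport:
            continue
        if start_year is not None and row["year"] < start_year:
            continue
        if end_year is not None and row["year"] > end_year:
            continue
        counts[row["winner_metro"]] += 1
    return counts


all_titles = titles_by_metro(rows)
sport_a_titles = titles_by_metro(rows, sport="Sport A")
titles_2002 = titles_by_metro(rows, start_year=2002, end_year=2002)

# most_common() lists metro areas from most to fewest titles.
print(all_titles.most_common())
print(sport_a_titles.most_common())
print(titles_2002.most_common())
```

Grouping on the metro-area field rather than the team name keeps franchises from the same metro area together when cities are compared.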

Research Ideas

The dataset can be used to create an interactive web-based map showing the winningest cities in North American professional and NCAA sports by year. For example, users could hover over a city to see its total number of titles won, which sports it was won in, and details about the teams that won them.

Data from this dataset can be used to publish a list of all-time greatest teams in each sport from North American professional and NCAA sports. This can then be filtered by metro area or city for an even deeper perspective into the greatest sporting franchises across North America over time.

The dataset could also be used to compare titles won between different cities within one year, allowing people to look at two (or more) cities side by side and see how they fared against each other in that season across all their respective sports championships and titles.

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

See the dataset description for more information.

Columns

File: case1.csv | Column name | Description | |:---------------|:-------------------------------------------------------------------------------------------| | key | Uni...

Cite

Statista (2025). U.S. fitness center/health club memberships 2000-2024 [Dataset]. https://www.statista.com/statistics/236123/us-fitness-center-health-club-memberships/

U.S. fitness center/health club memberships 2000-2024

Explore at:

16 scholarly articles cite this dataset (View in Google Scholar)

Dataset updated

Nov 26, 2025

Dataset authored and provided by

Statista (http://statista.com/)

Area covered

United States

Description

The number of members of fitness centers and health clubs within the United States has experienced a near continual increase between 2000 and 2024. In 2024, there were found to be around ** million members of fitness centers and health clubs within the U.S., the greatest number during the period of observation.
