https://creativecommons.org/publicdomain/zero/1.0/
The Health Insurance Marketplace Public Use Files contain data on health and dental plans offered to individuals and small businesses through the US Health Insurance Marketplace.
To help get you started, here are some data exploration ideas:
See this forum thread for more ideas, and post there if you want to add your own ideas or answer some of the open questions!
This data was originally prepared and released by the Centers for Medicare & Medicaid Services (CMS). Please read the CMS Disclaimer-User Agreement before using this data.
Here, we've processed the data to facilitate analytics. This processed version has three components:
The original versions of the 2014, 2015, and 2016 data are available in the "raw" directory of the download and in "../input/raw" on Kaggle Scripts. Search for "dictionaries" on this page to find the data dictionaries describing the individual raw files.
In the top-level directory of the download ("../input" on Kaggle Scripts), there are six CSV files that contain the combined data across all years: BenefitsCostSharing.csv, BusinessRules.csv, Network.csv, PlanAttributes.csv, Rate.csv, and ServiceArea.csv.
Additionally, there are two CSV files that facilitate joining data across years: Crosswalk2015.csv and Crosswalk2016.csv.
The "database.sqlite" file contains tables corresponding to each of the processed CSV files.
The code to create the processed version of this data is available on GitHub.
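As a starting point for the SQLite file, the following minimal sketch lists the tables in a database and counts their rows using Python's standard `sqlite3` module. The table name `Rate` and its columns in the demo are assumptions for illustration (the description only says that tables correspond to the processed CSV files), and the demo runs against an in-memory database so that it works without the download.

```python
import sqlite3


def table_row_counts(conn: sqlite3.Connection) -> dict:
    """Return {table_name: row_count} for every table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    counts = {}
    for name in names:
        # Names come from sqlite_master itself; double any embedded quotes.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts


# Demo on an in-memory database; the table and columns are hypothetical.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE Rate (StateCode TEXT, IndividualRate REAL)")
demo.executemany("INSERT INTO Rate VALUES (?, ?)", [("AK", 29.0), ("AL", 31.5)])
demo_counts = table_row_counts(demo)
print(demo_counts)
```

For the real file, the in-memory connection would be replaced with something like `sqlite3.connect("../input/database.sqlite")`, following the Kaggle Scripts path given above.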
http://opendatacommons.org/licenses/dbcl/1.0/
This public dataset contains data on public and private insurance companies, provided by the IRDAI (Insurance Regulatory and Development Authority of India), covering 2013-2022. The data is multi-indexed, which makes it good practice for manipulating pandas multi-index DataFrames. The dataset focuses on the business of the companies (total premiums and number of policies), subscription information (number of people subscribed), claims incurred, and the network hospitals enrolled by Third Party Administrators (TPAs).
The Excel file contains the following data:

| Table No. | Contents |
| --- | --- |
| **A** | **III.A: HEALTH INSURANCE BUSINESS OF GENERAL AND HEALTH INSURERS** |
| 62 | Health Insurance - Number of Policies, Number of Persons Covered and Gross Premium |
| 63 | Personal Accident Insurance - Number of Policies, Number of Persons Covered and Gross Premium |
| 64 | Overseas Travel Insurance - Number of Policies, Number of Persons Covered and Gross Premium |
| 65 | Domestic Travel Insurance - Number of Policies, Number of Persons Covered and Gross Premium |
| 66 | Health Insurance - Net Premium Earned, Incurred Claims and Incurred Claims Ratio |
| 67 | Personal Accident Insurance - Net Premium Earned, Incurred Claims and Incurred Claims Ratio |
| 68 | Overseas Travel Insurance - Net Earned Premium, Incurred Claims and Incurred Claims Ratio |
| 69 | Domestic Travel Insurance - Net Earned Premium, Incurred Claims and Incurred Claims Ratio |
| 70 | Details of Claims Development and Aging - Health Insurance Business |
| 71 | State-wise Health Insurance Business |
| 72 | State-wise Individual Health Insurance Business |
| 73 | State-wise Personal Accident Insurance Business |
| 74 | State-wise Overseas Insurance Business |
| 75 | State-wise Domestic Insurance Business |
| 76 | State-wise Claims Settlement under Health Insurance Business |
| **B** | **III.B: HEALTH INSURANCE BUSINESS OF LIFE INSURERS** |
| 77 | Health Insurance Business in respect of Products offered by Life Insurers - New Business |
| 78 | Health Insurance Business in respect of Products offered by Life Insurers - Renewal Business |
| 79 | Health Insurance Business in respect of Riders attached to Life Insurance Products - New Business |
| 80 | Health Insurance Business in respect of Riders attached to Life Insurance Products - Renewal Business |
| **C** | **III.C: OTHERS** |
| 81 | Network Hospital Enrolled by TPAs |
| 82 | State-wise Details on Number of Network Providers |
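Tables 66 to 69 report net premium earned, incurred claims and the incurred claims ratio. The sketch below assumes the conventional definition of that ratio (incurred claims divided by net premium earned, as a percentage); the insurer names and figures are made up for illustration, and the `(insurer, year)` tuple keys only mimic a two-level index rather than reproduce the layout of the Excel file.

```python
def incurred_claims_ratio(incurred_claims: float, net_premium_earned: float) -> float:
    """Incurred claims as a percentage of net premium earned (assumed definition)."""
    if net_premium_earned == 0:
        raise ValueError("net premium earned must be non-zero")
    return 100.0 * incurred_claims / net_premium_earned


# Hypothetical figures keyed by (insurer, financial year).
figures = {
    ("Insurer A", "2020-21"): {"net_premium_earned": 200.0, "incurred_claims": 150.0},
    ("Insurer A", "2021-22"): {"net_premium_earned": 250.0, "incurred_claims": 200.0},
    ("Insurer B", "2021-22"): {"net_premium_earned": 80.0, "incurred_claims": 100.0},
}

ratios = {
    key: incurred_claims_ratio(row["incurred_claims"], row["net_premium_earned"])
    for key, row in figures.items()
}

# Select one level of the key, similar to taking a cross-section of a multi-index.
ratios_2021_22 = {
    insurer: ratio for (insurer, year), ratio in ratios.items() if year == "2021-22"
}
print(ratios_2021_22)
```

A ratio above 100 in this sketch means claims exceeded the premium earned for that insurer and year.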
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
This dataset, released April 2017, contains the estimated number of people, aged 18 years and over, with private health insurance hospital cover, 2014-15. The data is by Local Government Area (LGA) 2016 geographic boundaries. For more information please see the data source notes on the data. Source: Estimates for Population Health Areas (PHAs) are modelled estimates and were produced by the ABS; estimates at the LGA and PHN level were derived from the PHA estimates. AURIN has spatially enabled the original data. Data that was not shown/not applicable/not published/not available for the specific area ('#', '..', '^', 'np', 'n.a.', 'n.y.a.' in original PHIDU data) was removed and replaced with blank cells. For other keys and abbreviations refer to PHIDU Keys.
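When reading these estimates, blank cells (or the original PHIDU markers, if working from the source files) need to be treated as missing rather than as numbers. The following is a minimal sketch of such a parser; the handling of thousands separators is an assumption about how values might be formatted, and the sample cells are invented.

```python
# Markers listed in the description above for suppressed or unavailable values.
SUPPRESSION_MARKERS = {"#", "..", "^", "np", "n.a.", "n.y.a."}


def parse_estimate(cell: str):
    """Return the cell as a float, or None for blank or suppressed values."""
    text = cell.strip()
    if text == "" or text in SUPPRESSION_MARKERS:
        return None
    # Assumption: numbers may carry thousands separators such as "12,345".
    return float(text.replace(",", ""))


raw_cells = ["12,345", "np", "", "..", "87.5"]
parsed = [parse_estimate(cell) for cell in raw_cells]
print(parsed)
```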
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides a realistic look at how important lifestyle and human characteristics are used to calculate health insurance rates. It presents a wide range of people, each characterized by their age, gender, BMI, number of dependents, smoking status, and geographic location, as well as the associated insurance bills they received.
We can find important patterns by examining this dataset, such as:
- Why smokers pay noticeably higher premiums
- How age and BMI affect medical expenses
- Whether insurance costs are higher in some areas
- The connection between charges and family size
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
The number of people with private health insurance, 2014-15. Private health insurance is health cover additional to that provided under Medicare, which reimburses all or part of the cost of hospital or other medical services incurred by an individual. All entries that were classified as not shown, not published or not applicable were assigned a null value; no data was provided for Maralinga Tjarutja LGA in South Australia. The data is by LGA 2015 profile (based on the LGA 2011 geographic boundaries). For more information on the statistics used, please refer to the PHIDU website, available from: http://phidu.torrens.edu.au/ Source: Estimates for Population Health Areas (PHAs) are modelled estimates and were produced by the ABS; estimates at the LGA and PHN level were derived from the PHA estimates.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Global Private Health Insurance Coverage by Country, 2023
By Data Society
This fascinating dataset from the Centers for Medicare & Medicaid Services provides an in-depth analysis of health insurance plans offered throughout the United States. Exploring this data, you can gain insights into how plan rates and benefits vary across states, explore how plan benefits relate to plan rates, and investigate how plans vary across insurance network providers.
The top-level directory includes six CSV files (BenefitsCostSharing.csv, BusinessRules.csv, Network.csv, PlanAttributes.csv, Rate.csv, and ServiceArea.csv) as well as two additional CSV files that facilitate joining data across years: Crosswalk2015.csv (joining 2014 and 2015 data) and Crosswalk2016.csv.
This Kaggle dataset contains comprehensive data on US health insurance Marketplace plans. The data was obtained from the Centers for Medicare & Medicaid Services and contains information such as plan rates and benefits, metal levels, dental coverage, and child/adult-only coverages.
To use this dataset effectively, it is important to understand the columns that make it up: state, dental plan, multistate plan (2015 and 2016), metal level (2014-2016), child/adult-only coverage (2014-2016), FIPS code (Federal Information Processing Standard code for the particular state), ZIP code, crosswalk level (level of crosswalk between the 2014-2016 data sets), and reason for crosswalk parameter.
This dataset can help you answer questions about US health insurance Marketplace plans across variables such as state or rate information. You can also compare variables over time to see how they affect particular groups of people or how they differ across states or regions. In addition, analyzing the price points associated with different kinds of coverage could show which kinds of plans are most attractive in various marketplaces on cost alone.
Once you understand the individual parameters across multiple states or regions, you can begin looking at correlations between them. With sufficient historical data, you can identify patterns that emerge around common characteristics or trends within areas or across markets over time:
- Do higher out-of-pocket limits tend to come with higher premiums?
- Are there more multi-state markets in some states than others?
- Which metal levels does each region prefer?
- Examining the impacts of age, metal levels and plan benefits on insurance rates in different states.
- Analyzing how dental plans vary across different states/regions and examining whether there are correlations between affordability and quality of care among plans with dental coverage options.
- Investigating how the Crosswalk level affects insurance rates by comparing insurance premiums for different metal levels across states with varying Crosswalk Levels (e.g., how does the cost of a Bronze plan differ between two states with Crosswalk Level 1 vs. 2).
If you use this dataset in your research, please credit the original authors.
License: Dataset copyright by authors
- You are free to:
  - Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt: remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit: provide a link to the license, and indicate if changes were made.
  - ShareAlike: you must distribute your contributions under the same license as the original.
  - Keep intact: all notices that refer to this license, including copyright notices.

File: Crosswalk2016.csv

| Column name | Description |
|:------------|:------------|
| State | The state in which... |
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
This dataset, released April 2017, contains the estimated number of people, aged 18 years and over, with private health insurance hospital cover, 2014-15. The data is by Population Health Area (PHA) 2016 geographic boundaries based on the 2016 Australian Statistical Geography Standard (ASGS). Population Health Areas, developed by PHIDU, are comprised of a combination of whole SA2s and multiple (aggregates of) SA2s, where the SA2 is an area in the ABS structure. For more information please see the data source notes on the data. Source: Estimates for Population Health Areas (PHAs) are modelled estimates and were produced by the ABS; estimates at the LGA and PHN level were derived from the PHA estimates. AURIN has spatially enabled the original data. Data that was not shown/not applicable/not published/not available for the specific area ('#', '..', '^', 'np', 'n.a.', 'n.y.a.' in original PHIDU data) was removed and replaced with blank cells. For other keys and abbreviations refer to PHIDU Keys.
https://www.icpsr.umich.edu/web/ICPSR/studies/6168/terms
The National Medical Expenditure Survey (NMES) series provides information on health expenditures by or on behalf of families and individuals, the financing of these expenditures, and each person's use of services. Public Use Tape 16 is the second public use data release from the NMES Health Insurance Plans Survey (HIPS). The purpose of the HIPS was to verify information reported by respondents to two components of the NMES, the Household Survey and the Survey of American Indians and Alaska Natives (SAIAN), about their health insurance coverage. Additional details were also obtained from the employers, unions, and insurance companies through which coverage was provided. Parts 1 and 2 of Public Use Tape 16 are files that can be used to link data to Household Survey policyholders in NATIONAL MEDICAL EXPENDITURE SURVEY, 1987: POLICYHOLDERS OF PRIVATE INSURANCE: PREMIUMS, PAYMENT SOURCES, AND TYPES AND SOURCE OF COVERAGE PUBLIC USE TAPE 15. These link files permit identification of the records in the Private Health Insurance Benefit Database (Parts 3-17 of this collection) that describe the specific benefits held by the policyholders. These files also permit linkage to the personal and socioeconomic characteristics for these policyholders found in NATIONAL MEDICAL EXPENDITURE SURVEY, 1987: HOUSEHOLD SURVEY, POPULATION CHARACTERISTICS AND PERSON-LEVEL UTILIZATION, ROUNDS 1-4 PUBLIC USE TAPE 13. Future link files will permit linkage of the Benefit Database to persons in the SAIAN and to dependents of policyholders in the Household Survey. The section files of the Benefit Database, Parts 4-13, contain information on Health Maintenance Organizations (HMOs), copayments, basic coverage, hospital and medical services, cost-containment provisions, major medical coverage, dental care, prescription drugs, vision and hearing care, and Medicare benefits. 
The schedule files, Parts 14-17, contain specific deductible amounts, dollar benefits, coinsurance provisions, maximum benefits, and benefit periods. Wherever possible, copies of policies or booklets describing the coverage and benefits were obtained in order to abstract this information.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Private Health Insurance Coverage in Denmark 2024 - 2028
This dataset, released April 2017, contains the estimated number of people, aged 18 years and over, with private health insurance hospital cover, 2014-15. The data is by Population Health Area (PHA) 2016 geographic boundaries based on the 2016 Australian Statistical Geography Standard (ASGS). Population Health Areas, developed by PHIDU, are comprised of a combination of whole SA2s and multiple (aggregates of) SA2s, where the SA2 is an area in the ABS structure. For more information please see the data source notes on the data. Source: Estimates for Population Health Areas (PHAs) are modelled estimates and were produced by the ABS; estimates at the LGA and PHN level were derived from the PHA estimates.
Please note: AURIN has spatially enabled the original data.
- "*": Indicates statistically significant, at the 95% confidence level.
- "**": Indicates statistically significant, at the 99% confidence level.
- "~": Indicates modelled estimates have Relative Root Mean Square Errors (RRMSEs) from 0.25 to 0.50 and should be used with caution.
- "~~": Indicates modelled estimates have RRMSEs greater than 0.50 but less than 1 and are considered too unreliable for general use.
- "?": Indicates modelled estimates are considered too unreliable.
- Blank cell: Indicates data was not shown/not applicable/not published/not available for the specific area ('#', '..', '^', 'np', 'n.a.', 'n.y.a.' in original PHIDU data).
Abbreviation Information:
- "ASR per #": Indirectly age-standardised rate per specified population.
- "SR": Indirectly age-standardised ratio.
- "95% C.I": Upper and lower 95% confidence intervals.
Copyright attribution: Torrens University Australia - Public Health Information Development Unit, (2018): ; accessed from AURIN on 12/3/2020.
Licence type: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Australia (CC BY-NC-SA 3.0 AU)
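The reliability markers described for this dataset map RRMSE ranges to flags, which can be sketched as a small function. Two boundaries are assumptions not stated in the description: values below 0.25 are given no flag, and "?" is used for an RRMSE of 1 or more (the description only says such estimates are "considered too unreliable" without giving a threshold).

```python
def reliability_flag(rrmse: float) -> str:
    """Map a Relative Root Mean Square Error to the flag used in the data."""
    if rrmse < 0:
        raise ValueError("RRMSE cannot be negative")
    if rrmse < 0.25:
        return ""    # assumption: no caution flag below 0.25
    if rrmse <= 0.50:
        return "~"   # use with caution
    if rrmse < 1.0:
        return "~~"  # too unreliable for general use
    return "?"       # assumption: 1 or more is flagged as too unreliable


flags = [reliability_flag(value) for value in (0.10, 0.25, 0.50, 0.75, 1.20)]
print(flags)
```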
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: This paper explores the existence of ex-ante moral hazard in private health insurance in Brazil. Before the advent of illness, insured individuals have no incentives to seek preventive care if it is not previously contractible. The data set comprises longitudinal administrative records of health care utilization from a Brazilian employer-sponsored health insurance plan. The empirical strategy is based on an exogenous and anticipated shock in health insurance coverage not associated with health conditions. The results show an increase of up to 17% in medical visits and 22% in diagnostic tests due to the loss of health insurance. Medical visits start to increase five months before the individual leaves the health insurance pool, reaching their peak two months prior to exit. For diagnostic tests, the increase was observed only in the last two months before the loss of health insurance coverage.
This dataset contains electronic health records used to study associations between PFAS occurrence and multimorbidity in a random sample of UNC Healthcare system patients. The dataset contains the medical record number to uniquely identify each individual as well as information on PFAS occurrence at the zip code level, the zip code of residence for each individual, chronic disease diagnoses, patient demographics, and neighborhood socioeconomic information from the 2010 US Census. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Because this data has PII from electronic health records, the data can only be accessed with an approved IRB application. Project analytic code is available at L:/PRIV/EPHD_CRB/Cavin/CARES/Project Analytic Code/Cavin Ward/PFAS Chronic Disease and Multimorbidity. Format: This data is formatted as an R data frame and associated comma-delimited flat text file. The data has the medical record number to uniquely identify each individual (which also serves as the primary key for the dataset), as well as information on the occurrence of PFAS contamination at the zip code level, socioeconomic data at the census tract level from the 2010 US Census, demographics, and the presence of chronic disease as well as multimorbidity (the presence of two or more chronic diseases). This dataset is associated with the following publication: Ward-Caviness, C., J. Moyer, A. Weaver, R. Devlin, and D. Diazsanchez. Associations between PFAS occurrence and multimorbidity as observed in an electronic health record cohort. Environmental Epidemiology. Wolters Kluwer, Alphen aan den Rijn, NETHERLANDS, 6(4): p e217, (2022).
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains medical insurance cost information for 1338 individuals. It includes demographic and health-related variables such as age, sex, BMI, number of children, smoking status, and residential region in the US. The target variable is charges, which represents the medical insurance cost billed to the individual.
The dataset is commonly used for:
Regression modeling
Health economics research
Insurance pricing analysis
Machine learning education and tutorials
Columns
age: Age of primary beneficiary (int)
sex: Gender of beneficiary (male, female)
bmi: Body Mass Index, a measure of body fat based on height and weight (float)
children: Number of children covered by health insurance (int)
smoker: Smoking status of the beneficiary (yes, no)
region: Residential region in the US (northeast, northwest, southeast, southwest)
charges: Medical insurance cost billed to the beneficiary (float)
Potential Uses
Build predictive models for medical costs
Explore how smoking and BMI impact charges
Teach students about regression and feature engineering
Analyze healthcare affordability trends
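A first exploratory step for questions such as how smoking relates to charges is a grouped mean. The following is a minimal sketch using only the standard library; the column names match those listed above, but the five rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# A few made-up rows using the column names listed above.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
19,female,27.9,0,yes,southwest,16000.0
18,male,33.8,1,no,southeast,1500.0
28,male,33.0,3,no,southeast,4500.0
33,male,22.7,0,no,northwest,21000.0
45,female,30.1,2,yes,northeast,30000.0
"""


def mean_charges_by(rows, column):
    """Average the charges column for each distinct value of `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(float(row["charges"]))
    return {key: mean(values) for key, values in groups.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_smoker = mean_charges_by(rows, "smoker")
print(by_smoker)
```

The same function can group by `region` or `sex`; a difference in group means is descriptive only and does not by itself establish why one group is charged more.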
This dataset includes individual catastrophic health plans available outside the Marketplace. They are available to people whose individual health plans have been cancelled and who believe that bronze-level plans in the Marketplace are too expensive. These people may apply for a hardship exemption that allows them to buy one of these plans. Not all states offer catastrophic plans outside the Marketplace. People who live in states that run their own Marketplaces may be able to participate in this program. In states with state-based Marketplaces that do offer catastrophic plans, the dataset includes listings for state departments of insurance, which can provide more information.
This is a study on healthcare workers at the University of North Carolina Hospital system conducted during the COVID-19 pandemic in 2020-2021. This includes responses to survey questions on occupation, living situation, mental health, physical health, prior COVID-19 infection, and vaccination status. As the data are identifiable, we cannot release them publicly. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: These data are owned by the University of North Carolina at Chapel Hill. Contact Dr. Emily Ciccone (ciccone@med.unc.edu) with inquiries. Format: This dataset includes data on healthcare workers, including questionnaire responses and data from wearable tracking devices. These data are sensitive and participants are potentially identifiable.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Insurance Classification Dataset is a structured dataset designed to predict an individual's smoking status (Smoker) based on features such as age, gender, BMI, number of children, region, and medical expenses. Smoking status is closely associated with health risks and medical costs, making it a critical factor for insurance risk assessment and personalized policy design.
2) Data Utilization
(1) Characteristics of the Insurance Classification Dataset:
• The dataset includes various health and lifestyle-related attributes such as age, gender, BMI, number of children, region, and medical expenses. The Smoker field serves as a binary classification label indicating whether the individual is a smoker.
(2) Applications of the Insurance Classification Dataset:
• Smoking status classification model training: The dataset can be used to train machine learning models that predict whether an individual is a smoker based on health indicators and lifestyle factors.
• High-risk group identification and insurance strategy development: By detecting high-risk individuals (i.e., smokers) early, the dataset can support policy pricing, preventive healthcare programs, and the development of personalized insurance strategies.
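Before training a machine learning model on such data, it can help to establish a simple baseline. The sketch below is not a trained model from the dataset's authors; it is a hypothetical one-feature rule that predicts "smoker" when medical expenses reach a threshold, choosing the threshold that maximizes accuracy on a handful of invented records. The key names `expenses` and `smoker` are assumptions and may not match the actual column names.

```python
def predict_smoker(expenses: float, threshold: float) -> bool:
    """Baseline rule: predict smoker when expenses reach the threshold."""
    return expenses >= threshold


def accuracy(records, threshold: float) -> float:
    """Fraction of records where the rule agrees with the smoker label."""
    correct = sum(
        predict_smoker(r["expenses"], threshold) == r["smoker"] for r in records
    )
    return correct / len(records)


def best_threshold(records) -> float:
    """Try each observed expense value as a threshold and keep the most accurate."""
    candidates = sorted({r["expenses"] for r in records})
    return max(candidates, key=lambda t: accuracy(records, t))


# Invented records; key names are hypothetical.
records = [
    {"expenses": 1700.0, "smoker": False},
    {"expenses": 4500.0, "smoker": False},
    {"expenses": 21000.0, "smoker": False},
    {"expenses": 16000.0, "smoker": True},
    {"expenses": 30000.0, "smoker": True},
    {"expenses": 42000.0, "smoker": True},
]

threshold = best_threshold(records)
baseline_accuracy = accuracy(records, threshold)
print(threshold, baseline_accuracy)
```

Any real model would need to beat this kind of baseline on held-out data; here the threshold is fitted and scored on the same records, so the accuracy is optimistic.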
The Medical Expenditure Panel Survey (MEPS) Household Component (HC) collects data from a sample of families and individuals in selected communities across the United States, drawn from a nationally representative subsample of households that participated in the prior year's National Health Interview Survey (conducted by the National Center for Health Statistics). During the household interviews, MEPS collects detailed information for each person in the household on the following: demographic characteristics, health conditions, health status, use of medical services, charges and source of payments, access to care, satisfaction with care, health insurance coverage, income, and employment. The panel design of the survey, which features several rounds of interviewing, makes it possible to determine how changes in respondents' health status, income, employment, eligibility for public and private insurance coverage, use of services, and payment for care are related. Public Use Files for Household data are available on the MEPS website.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides sample premium information for individual ACA-compliant health insurance plans available to Iowans for 2026 based on age, rating area and metal level. These are premiums for individuals, not families.
Explore and drill into the data using the 2026 Sample Premium Explorer.
Please note that not every plan ID is available in every county. On or after November 1, 2025, please go to www.healthcare.gov to determine if your plan is available in the county you reside in.
https://creativecommons.org/publicdomain/zero/1.0/
The Affordable Care Act created the new Pre-Existing Condition Insurance Plan (PCIP) program to make health insurance available to Americans denied coverage by private insurance companies because of a pre-existing condition. Coverage for people living with such conditions as diabetes, asthma, cancer, and HIV/AIDS has often been priced out of the reach of most Americans who buy their own insurance, and this has resulted in a lack of coverage for millions. The temporary program covers a broad range of health benefits and is designed as a bridge for people with pre-existing conditions who cannot obtain health insurance coverage in today's private insurance market. To learn more, visit PCIP.gov or HealthCare.gov.
Note: * Massachusetts and Vermont are guarantee issue states that have already implemented many of the broader market reforms included in the Affordable Care Act that take effect in 2014. Existing commercial plans offering guaranteed coverage at premiums comparable to PCIP are already available in both states.
This is a dataset hosted by the Centers for Medicare & Medicaid Services (CMS). The organization has an open data platform and updates its information according to the amount of data that is brought in. Explore CMS's data using Kaggle and all of the data sources available through the CMS organization page!
This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.