12 datasets found

Health Insurance Marketplace
kaggle.com
zip
Updated May 1, 2017
Cite
US Department of Health and Human Services (2017). Health Insurance Marketplace [Dataset]. https://www.kaggle.com/datasets/hhs/health-insurance-marketplace
Explore at:
Available download formats: zip (868821924 bytes)
Dataset updated
May 1, 2017
Dataset provided by
United States Department of Health and Human Services (http://www.hhs.gov/)
Authors
US Department of Health and Human Services
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The Health Insurance Marketplace Public Use Files contain data on health and dental plans offered to individuals and small businesses through the US Health Insurance Marketplace.

Exploration Ideas

To help get you started, here are some data exploration ideas:

How do plan rates and benefits vary across states?

How do plan benefits relate to plan rates?

How do plan rates vary by age?

How do plans vary across insurance network providers?
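
One way to start on the rates-by-age question is to group rate rows by age and average them. The sketch below uses a tiny inline stand-in for Rate.csv; the column names "Age" and "IndividualRate" and all values are assumptions for illustration and should be checked against the real file.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Inline stand-in for Rate.csv. Column names and values are assumed,
# not taken from the real download.
SAMPLE = """Age,IndividualRate
21,200.0
21,240.0
40,310.0
40,290.0
64,620.0
"""

def mean_rate_by_age(rows):
    """Group rate rows by age label and return {age: mean rate}."""
    rates = defaultdict(list)
    for row in rows:
        rates[row["Age"]].append(float(row["IndividualRate"]))
    return {age: mean(values) for age, values in rates.items()}

result = mean_rate_by_age(csv.DictReader(io.StringIO(SAMPLE)))
for age in sorted(result):
    print(age, result[age])
```

Ages are kept as strings here because a real age column may hold non-numeric labels; that is a design choice of this sketch, not something the description states.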

See this forum thread for more ideas, and post there if you want to add your own ideas or answer some of the open questions!

Data Description

This data was originally prepared and released by the Centers for Medicare & Medicaid Services (CMS). Please read the CMS Disclaimer-User Agreement before using this data.

Here, we've processed the data to facilitate analytics. This processed version has three components:

1. Original versions of the data

The original versions of the 2014, 2015, and 2016 data are available in the "raw" directory of the download and in "../input/raw" on Kaggle Scripts. Search for "dictionaries" on this page to find the data dictionaries describing the individual raw files.

2. Combined CSV files

In the top-level directory of the download ("../input" on Kaggle Scripts), there are six CSV files that contain the combined data across all years:

BenefitsCostSharing.csv

BusinessRules.csv

Network.csv

PlanAttributes.csv

Rate.csv

ServiceArea.csv

Additionally, there are two CSV files that facilitate joining data across years:

Crosswalk2015.csv - joining 2014 and 2015 data

Crosswalk2016.csv - joining 2015 and 2016 data
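
A crosswalk file can be read as a mapping from a plan's identifier in one year to its identifier in the next. The sketch below pairs 2014 and 2015 rates through such a mapping. The key names "PlanID_2014" and "PlanID_2015", the plan IDs, and the rates are all hypothetical.

```python
# Hypothetical crosswalk rows and rates; the real files have many more
# columns, and these key names are assumptions.
crosswalk_2015 = [
    {"PlanID_2014": "A1", "PlanID_2015": "A1"},
    {"PlanID_2014": "B2", "PlanID_2015": "B9"},  # plan renumbered in 2015
]
rates_2014 = {"A1": 210.0, "B2": 305.0}
rates_2015 = {"A1": 225.0, "B9": 330.0, "C3": 280.0}  # C3 is new in 2015

def join_across_years(crosswalk, old_rates, new_rates):
    """Pair each plan's old and new rate via the crosswalk mapping."""
    joined = []
    for row in crosswalk:
        old_id, new_id = row["PlanID_2014"], row["PlanID_2015"]
        # Skip plans that are missing a rate on either side.
        if old_id in old_rates and new_id in new_rates:
            joined.append({
                "old_id": old_id,
                "new_id": new_id,
                "change": new_rates[new_id] - old_rates[old_id],
            })
    return joined

pairs = join_across_years(crosswalk_2015, rates_2014, rates_2015)
for pair in pairs:
    print(pair)
```

Plans that appear only in one year (such as the made-up C3) drop out of the result, which is the behaviour of an inner join.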

3. SQLite database

The "database.sqlite" file contains tables corresponding to each of the processed CSV files.

The code to create the processed version of this data is available on GitHub.
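
Since the SQLite file is described as holding one table per processed CSV, a quick first step is to list the tables it contains. The sketch below does this against an in-memory stand-in database; the table and column definitions are invented for the example, and with the real download the connection would point at "database.sqlite" instead.

```python
import sqlite3

def list_tables(conn):
    """Return the names of all tables in a SQLite database, sorted by name."""
    cursor = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in cursor]

# In-memory stand-in with made-up schemas; for the real data use
# sqlite3.connect("database.sqlite").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Rate (PlanId TEXT, Age TEXT, IndividualRate REAL)")
conn.execute("CREATE TABLE Network (NetworkId TEXT, NetworkName TEXT)")

tables = list_tables(conn)
print(tables)
```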
medical-insurance-charges-dataset
huggingface.co
Updated Jul 15, 2025
Cite
F (2025). medical-insurance-charges-dataset [Dataset]. https://huggingface.co/datasets/affnanation/medical-insurance-charges-dataset
Explore at:
Dataset updated
Jul 15, 2025
Authors
F
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset: Medical Insurance Cost

This is the dataset used to train and evaluate the health insurance cost prediction model for the RiskGuard project. The main code repository can be found on GitHub.

Dataset Description

This dataset originates from Kaggle (Medical Cost Personal Datasets) and contains demographic and personal attributes of insurance customers. It is used to predict individual medical costs.

Data Columns

age: Age of the primary beneficiary… See the full description on the dataset page: https://huggingface.co/datasets/affnanation/medical-insurance-charges-dataset.
‘Medical Insurance Premium Prediction’ analyzed by Analyst-2
analyst-2.ai
Updated Aug 4, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Medical Insurance Premium Prediction’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-medical-insurance-premium-prediction-5cfe/827b15fc/?iid=021-110&v=presentation
Explore at:
Dataset updated
Aug 4, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Medical Insurance Premium Prediction’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/tejashvi14/medical-insurance-premium-prediction on 12 November 2021.

--- Dataset description provided by original source is as follows ---

Context

A medical insurance company has released data for almost 1,000 customers. Create a model that predicts the yearly medical cover cost. The data was given voluntarily by customers.

Content

The dataset contains health-related parameters of the customers. Use them to build a model and to perform EDA. The premium price is in INR (₹) and covers a whole year.

Inspiration

Help solve a crucial finance problem that could affect many people and help them make better decisions. Don't forget to submit your EDAs and models in the Task section; these will be keenly reviewed. Hope you enjoy working on the data. Note: this is a dummy dataset used for teaching and training purposes. It is free to use. Image credits: Unsplash.

--- Original source retains full ownership of the source dataset ---
Insurance
kaggle.com
Updated Jun 5, 2022
Cite
G DEEPAK REDDY (2022). Insurance [Dataset]. https://www.kaggle.com/datasets/gdeepakreddy/insurance
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Jun 5, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
G DEEPAK REDDY
Description
Business Problem: Health care is a very important domain in the market. It is directly linked to the life of the individual, so we always have to be proactive in this domain. Money plays a major role because treatment can be very costly, and an individual who is not covered by insurance can end up in a tough financial situation. Medical insurance companies also want to reduce their risk by optimizing insurance cost, because a healthy body is largely in the hands of the individual: eating healthily and exercising properly drastically reduce the chance of getting ill.

Goal & Objective: The objective of this exercise is to build a model, using data, that provides the optimum insurance cost for an individual. You have to use the health- and habit-related parameters to estimate the cost of insurance.

Review Parameters

Review points

1) Introduction of the business problem
a) Defining problem statement
b) Need of the study/project
c) Understanding business/social opportunity

2) Data Report
a) Understanding how data was collected in terms of time, frequency and methodology
b) Visual inspection of data (rows, columns, descriptive details)
c) Understanding of attributes (variable info, renaming if required)

3) Exploratory data analysis
a) Univariate analysis (distribution and spread for every continuous attribute, distribution of data in categories for categorical ones)
b) Bivariate analysis (relationship between different variables, correlations)
c) Removal of unwanted variables (if applicable)
d) Missing value treatment (if applicable)
e) Outlier treatment (if required)
f) Variable transformation (if applicable)
g) Addition of new variables (if required)

4) Business insights from EDA
a) Is the data unbalanced? If so, what can be done? Please explain in the context of the business
b) Any business insights using clustering (if applicable)
c) Any other business insights
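
The outlier-treatment step in the review points can be approached in several ways; the description does not say which one is expected. A minimal sketch of one common choice, the 1.5 x IQR rule, is shown below on made-up charge values.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default 'exclusive' method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Made-up insurance charges with one obviously extreme value.
charges = [1200, 1350, 1280, 1410, 1300, 1250, 9800]
flagged = iqr_outliers(charges)
print(flagged)
```

Whether flagged values should be dropped, capped, or kept is a business decision; in insurance data, very high charges may be exactly the cases of interest.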
Medical_Insurance_Cost_Dataset
kaggle.com
Updated Jul 3, 2024
Cite
Sakshi Singh (2024). Medical_Insurance_Cost_Dataset [Dataset]. https://www.kaggle.com/datasets/sakshisinghssg/medical-insurance-cost-dataset/code
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Jul 3, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sakshi Singh
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by Sakshi Singh

Released under Apache 2.0

Contents
Health Insurance Marketplaces
kaggle.com
Updated Jan 23, 2023
Cite
The Devastator (2023). Health Insurance Marketplaces [Dataset]. https://www.kaggle.com/datasets/thedevastator/health-insurance-marketplaces/code
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Jan 23, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
Description
Health Insurance Marketplaces

Rates, Benefits, Coverage and Networks

By Data Society [source]

About this dataset

This dataset from the Centers for Medicare & Medicaid Services (CMS) covers plan rates, benefits, and networks in the Health Insurance Marketplace. With it you can investigate trends in plan rates, assess coverage across states and zip codes, compare metal-level plans across years, and analyze benefit information, all in one place.

There are six CSV files containing combined data from across all years:

BenefitsCostSharing.csv - details on benefits

BusinessRules.csv - details about premium payment requirements for a plan or set of plans

Network.csv - details about health plans' networks of providers, who offer services at different cost levels to members enrolled in a given plan or set of plans

PlanAttributes.csv - plan attributes such as age-off dates

Rate.csv - information on rate changes

ServiceArea.csv - demographic characteristics related to each service area associated with a specific issuer

Two further CSV files join data across years: Crosswalk2015 and Crosswalk2016.

Use the data to study how changes in benefits relate to costs and to explore network providers within different regions.


How to use the dataset

This dataset contains information about the health insurance plans offered in the US Health Insurance Marketplace. It includes data on plan benefits, cost-sharing, networks, rates and service areas for different states. The data can be used to compare and analyze plan characteristics across different states and ages, which can help guide users' decision-making when purchasing a health insurance plan.

To begin using the dataset, you should start by looking at the columns available. These include State, Dental Plan, Multistate Plan (2015 & 2016), Metal Level (2015 & 2016), Child/Adult Only (2015 & 2016), FIPS Code, Zip Code Crosswalk Level, Reason for Crosswalk, Multistate Plan Ageoff (2016 & 2015) and MetalLevel Ageoff (2016 & 2015). These columns provide important information on each plan that can be used to compare them across states or between years.

Using this data you can explore several interesting questions such as: How do benefit levels vary among states? Are there any differences in network providers between states? What factors influence plan rates?

To answer these questions, join the relevant tables across years using the Crosswalk2015 and Crosswalk2016 CSV files, then organize the data so that differences between plans sold in different states or years are easy to visualize. Once the data is organized, line graphs or bar charts can make it easier to compare feature values between plans and to see how plans differ for consumers.

Doing this gives a better understanding of how certain factors may affect rate changes over time, or how certain benefit levels differ by state, which helps consumers make an informed choice when selecting their next health insurance plan.

Research Ideas

Analyzing the effectiveness of different plan benefits and how they affect premiums to determine a fair price point for different types of healthcare plans.

Examining the variation in rates, benefits and coverage by state or zip code to identify potential trends or disparities in access to quality health care services across regions.

Developing an algorithm that can predict premium prices based on factors such as age group, type of plan (metal level), and multistate coverage, to help consumers understand the true cost of their health insurance plans before committing to purchase them.

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

License: Dataset copyright by authors - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, transform, and build upon the material for any purpose, even commercially. - You must: - Give appropriate credit -...
Health insurance price
kaggle.com
Updated Sep 9, 2024
Cite
Amar Ahmed Hamed (2024). Health insurance price [Dataset]. https://www.kaggle.com/datasets/amarahmedhamed/health-insurance-price/code
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Sep 9, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Amar Ahmed Hamed
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by Amar Ahmed Hamed

Released under Apache 2.0

Contents
Health Insurance Dataset
figshare.com
csv
Updated Mar 11, 2025
Cite
Prakash M C (2025). Health Insurance Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.28571408.v1
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.6084/m9.figshare.28571408.v1
Dataset updated
Mar 11, 2025
Dataset provided by
figshare
Authors
Prakash M C
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset contains expense and premium details related to health insurance.
Medical Policy Premium Dataset
kaggle.com
Updated Sep 22, 2021
Cite
Sunil Sanjay Hule (2021). Medical Policy Premium Dataset [Dataset]. https://www.kaggle.com/sunilhule/medical-policy-premium-dataset/metadata
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Sep 22, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sunil Sanjay Hule
Description
Context

The data set contains each user's medical history (whether or not they have specific conditions), their age, height, weight, etc., along with the premium they have to pay in INR for insurance.

Content

This tabular data contains 11 columns covering patients' medical records and current health conditions.
Health Insurance Lead Prediction
kaggle.com
zip
Updated Feb 26, 2021
Cite
Bhavyajot Malhotra (2021). Health Insurance Lead Prediction [Dataset]. https://www.kaggle.com/bhavyajotmalhotra/jobathon-health-insurance-prediction
Explore at:
Available download formats: zip (1129580 bytes)
Dataset updated
Feb 26, 2021
Authors
Bhavyajot Malhotra
Description
Your client FinMan is a financial services company that provides various financial services, such as loans, investment funds, and insurance, to its customers. FinMan wishes to cross-sell health insurance to existing customers who may or may not hold insurance policies with the company. The company recommends health insurance to its customers based on their profile once these customers land on the website. Customers might browse the recommended health insurance policy and consequently fill out a form to apply. When these customers fill out the form, their Response towards the policy is considered positive and they are classified as a lead.

Once these leads are acquired, the sales advisors approach them to convert and thus the company can sell proposed health insurance to these leads in a more efficient manner.

Now the company needs your help in building a model to predict whether the person will be interested in their proposed Health plan/policy given the information about:

Demographics (city, age, region etc.)

Information regarding holding policies of the customer

Recommended Policy Information

Data Dictionary

Train Data

Variable - Definition
ID - Unique identifier for a row
City_Code - Code for the city of the customers
Region_Code - Code for the region of the customers
Accomodation_Type - Customer owns or rents the house
Reco_Insurance_Type - Joint or individual type for the recommended insurance
Upper_Age - Maximum age of the customer
Lower_Age - Minimum age of the customer
Is_Spouse - If the customers are married to each other (in case of joint insurance)
Health_Indicator - Encoded values for health of the customer
Holding_Policy_Duration - Duration (in years) of holding policy (a policy that the customer has already subscribed to with the company)
Holding_Policy_Type - Type of holding policy
Reco_Policy_Cat - Encoded value for recommended health insurance
Reco_Policy_Premium - Annual premium (INR) for the recommended health insurance
Response (Target) - 0: Customer did not show interest in the recommended policy; 1: Customer showed interest in the recommended policy

Test Data

Variable - Definition
ID - Unique identifier for a row
City_Code - Code for the city of the customers
Region_Code - Code for the region of the customers
Accomodation_Type - Customer owns or rents the house
Reco_Insurance_Type - Joint or individual type for the recommended insurance
Upper_Age - Maximum age of the customer
Lower_Age - Minimum age of the customer
Is_Spouse - If the customers are married to each other (in case of joint insurance)
Health_Indicator - Encoded values for health of the customer
Holding_Policy_Duration - Duration (in years) of holding policy (a policy that the customer has already subscribed to with the company)
Holding_Policy_Type - Type of holding policy
Reco_Policy_Cat - Encoded value for recommended health insurance
Reco_Policy_Premium - Annual premium (INR) for the recommended health insurance

Variable - Definition
ID - Unique identifier for a row
Response (Target) - Probability of customer showing interest (class 1)
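
Because the expected output is a probability of class 1 per test ID, a useful sanity check before building a real model is a constant baseline that predicts the training share of positive responses for every row. The sketch below illustrates that idea only; the response values and IDs are made up, and this is not the method the dataset authors prescribe.

```python
# Made-up stand-ins for the train Response column and the test ID column.
train_responses = [0, 1, 0, 0, 1, 0, 0, 0]
test_ids = [101, 102, 103]

def base_rate_submission(responses, ids):
    """Predict the training share of class 1 for every test row."""
    rate = sum(responses) / len(responses)
    return [{"ID": row_id, "Response": rate} for row_id in ids]

submission = base_rate_submission(train_responses, test_ids)
for row in submission:
    print(row)
```

Any trained model should beat this baseline on a held-out split; if it does not, the features or the evaluation setup deserve a closer look.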
AV: JantaHackathon
kaggle.com
Updated Sep 12, 2020
Cite
Kunal Bambardekar (2020). AV: JantaHackathon [Dataset]. https://www.kaggle.com/kbambardekar/av-jantahackathon/activity
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Sep 12, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Kunal Bambardekar
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
License information was derived automatically
Description
Will you take vehicle insurance?

An insurance policy is an arrangement by which a company undertakes to provide a guarantee of compensation for specified loss, damage, illness, or death in return for the payment of a specified premium. A premium is a sum of money that the customer needs to pay regularly to an insurance company for this guarantee.

For example, you may pay a premium of Rs. 5,000 each year for health insurance cover of Rs. 200,000 so that if, God forbid, you fall ill and need to be hospitalised in that year, the insurance company will bear the cost of hospitalisation etc. up to Rs. 200,000. If you are wondering how a company can bear such a high hospitalisation cost when it charges a premium of only Rs. 5,000, that is where the concept of probabilities comes into the picture. For example, there may be 100 customers like you paying a premium of Rs. 5,000 every year, but only a few of them (say 2-3) will be hospitalised that year. This way everyone shares the risk of everyone else.
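
The risk-sharing arithmetic in this example can be checked directly. The sketch below uses only the figures quoted above (100 customers, a premium of Rs. 5,000, cover of up to Rs. 200,000) and compares the premium pool with the maximum possible payout for 2 and 3 claims. Treating every claim as a full-cover claim is an assumption that gives an upper bound, not a typical cost.

```python
customers = 100
premium = 5_000    # Rs. per customer per year
cover = 200_000    # Rs. maximum payout per hospitalised customer

pool = customers * premium  # total premiums collected in a year

def max_payout(claims, cover_per_claim=cover):
    """Upper bound on payouts if every claim uses the full cover."""
    return claims * cover_per_claim

for claims in (2, 3):
    payout = max_payout(claims)
    print(claims, payout, payout <= pool)
```

With these numbers the pool of Rs. 500,000 covers two full claims but not three, so the scheme relies on most claims costing less than the maximum cover.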

Just like medical insurance, there is vehicle insurance, where every year the customer pays a premium of a certain amount to the insurance provider so that, in case of an unfortunate accident involving the vehicle, the insurance provider will pay compensation (called ‘sum assured’) to the customer.

Building a model to predict whether a customer would be interested in Vehicle Insurance is extremely helpful for the company because it can then accordingly plan its communication strategy to reach out to those customers and optimise its business model and revenue.

Now, in order to predict, whether the customer would be interested in Vehicle insurance, you have information about demographics (gender, age, region code type), Vehicles (Vehicle Age, Damage), Policy (Premium, sourcing channel) etc.

Content

id: Unique ID for the customer
Gender: Gender of the customer
Age: Age of the customer
driving license: 0: Customer does not have DL, 1: Customer already has DL
RegionCode: Unique code for the region of the customer
PreviouslyInsured: 1: Customer already has Vehicle Insurance, 0: Customer doesn't have Vehicle Insurance
VehicleAge: Age of the vehicle
VehicleDamage: 1: Customer got his/her vehicle damaged in the past, 0: Customer didn't get his/her vehicle damaged in the past
AnnualPremium: The amount the customer needs to pay as premium in the year
PolicySalesChannel: Anonymised code for the channel of outreach to the customer, i.e. different agents, over mail, over phone, in person, etc.
Vintage: Number of days the customer has been associated with the company
Response: 1: Customer is interested, 0: Customer is not interested

Acknowledgements


Original data source: Analytics Vidhya, https://datahack.analyticsvidhya.com/contest/janatahack-cross-sell-prediction/#About

Cervical Cancer Risk Classification
kaggle.com
zip
Updated Aug 31, 2017
Cite
Gokagglers (2017). Cervical Cancer Risk Classification [Dataset]. https://www.kaggle.com/forums/f/5746/cervical-cancer-risk-classification/t/91763/data-is-from-caracas-venezuela?forumMessageId=528857
Explore at:
Available download formats: zip (9052 bytes)
Dataset updated
Aug 31, 2017
Authors
Gokagglers
Description
Cervical Cancer Risk Factors for Biopsy: This Dataset is Obtained from UCI Repository and kindly acknowledged!

This file contains a List of Risk Factors for Cervical Cancer leading to a Biopsy Examination!

About 11,000 new cases of invasive cervical cancer are diagnosed each year in the U.S. However, the number of new cervical cancer cases has been declining steadily over the past decades. Although it is the most preventable type of cancer, each year cervical cancer kills about 4,000 women in the U.S. and about 300,000 women worldwide. In the United States, cervical cancer mortality rates plunged by 74% from 1955 - 1992 thanks to increased screening and early detection with the Pap test.

AGE

Fifty percent of cervical cancer diagnoses occur in women ages 35 - 54, and about 20% occur in women over 65 years of age. The median age of diagnosis is 48 years. About 15% of women develop cervical cancer between the ages of 20 - 30. Cervical cancer is extremely rare in women younger than age 20. However, many young women become infected with multiple types of human papilloma virus, which then can increase their risk of getting cervical cancer in the future. Young women with early abnormal changes who do not have regular examinations are at high risk for localized cancer by the time they are age 40, and for invasive cancer by age 50.

SOCIOECONOMIC AND ETHNIC FACTORS

Although the rate of cervical cancer has declined among both Caucasian and African-American women over the past decades, it remains much more prevalent in African-Americans, whose death rates are twice as high as those of Caucasian women. Hispanic American women have more than twice the risk of invasive cervical cancer as Caucasian women, also due to a lower rate of screening. These differences, however, are almost certainly due to social and economic differences. Numerous studies report that high poverty levels are linked with low screening rates. In addition, lack of health insurance, limited transportation, and language difficulties hinder a poor woman’s access to screening services.

HIGH SEXUAL ACTIVITY

Human papilloma virus (HPV) is the main risk factor for cervical cancer. In adults, the most important risk factor for HPV is sexual activity with an infected person. Women most at risk for cervical cancer are those with a history of multiple sexual partners, sexual intercourse at age 17 years or younger, or both. A woman who has never been sexually active has a very low risk for developing cervical cancer. Sexual activity with multiple partners increases the likelihood of many other sexually transmitted infections (chlamydia, gonorrhea, syphilis). Studies have found an association between chlamydia and cervical cancer risk, including the possibility that chlamydia may prolong HPV infection.

FAMILY HISTORY

Women have a higher risk of cervical cancer if they have a first-degree relative (mother, sister) who has had cervical cancer.

USE OF ORAL CONTRACEPTIVES

Studies have reported a strong association between cervical cancer and long-term use of oral contraception (OC). Women who take birth control pills for more than 5 - 10 years appear to have a much higher risk of HPV infection (up to four times higher) than those who do not use OCs. (Women taking OCs for fewer than 5 years do not have a significantly higher risk.) The reasons for this risk from OC use are not entirely clear. Women who use OCs may be less likely to use a diaphragm, condoms, or other methods that offer some protection against sexually transmitted diseases, including HPV. Some research also suggests that the hormones in OCs might help the virus enter the genetic material of cervical cells.

HAVING MANY CHILDREN

Studies indicate that having many children increases the risk for developing cervical cancer, particularly in women infected with HPV.

SMOKING

Smoking is associated with a higher risk for precancerous changes (dysplasia) in the cervix and for progression to invasive cervical cancer, especially for women infected with HPV.

IMMUNOSUPPRESSION

Women with weak immune systems (such as those with HIV / AIDS) are more susceptible to acquiring HPV. Immunocompromised patients are also at higher risk for having cervical precancer develop rapidly into invasive cancer.

DIETHYLSTILBESTROL (DES)

From 1938 - 1971, diethylstilbestrol (DES), an estrogen-related drug, was widely prescribed to pregnant women to help prevent miscarriages. The daughters of these women face a higher risk for cervical cancer. DES is no longer prescribed.