47 datasets found
  1. Health Insurance Marketplace

    • kaggle.com
    zip
    Updated May 1, 2017
    Cite
    US Department of Health and Human Services (2017). Health Insurance Marketplace [Dataset]. https://www.kaggle.com/datasets/hhs/health-insurance-marketplace
    Explore at:
    Available download formats: zip (868821924 bytes)
    Dataset updated
    May 1, 2017
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Authors
    US Department of Health and Human Services
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Health Insurance Marketplace Public Use Files contain data on health and dental plans offered to individuals and small businesses through the US Health Insurance Marketplace.

    Figure: median plan premiums

    Exploration Ideas

    To help get you started, here are some data exploration ideas:

    • How do plan rates and benefits vary across states?
    • How do plan benefits relate to plan rates?
    • How do plan rates vary by age?
    • How do plans vary across insurance network providers?

    See this forum thread for more ideas, and post there if you want to add your own ideas or answer some of the open questions!

    Data Description

    This data was originally prepared and released by the Centers for Medicare & Medicaid Services (CMS). Please read the CMS Disclaimer-User Agreement before using this data.

    Here, we've processed the data to facilitate analytics. This processed version has three components:

    1. Original versions of the data

    The original versions of the 2014, 2015, 2016 data are available in the "raw" directory of the download and "../input/raw" on Kaggle Scripts. Search for "dictionaries" on this page to find the data dictionaries describing the individual raw files.

    2. Combined CSV files

    In the top-level directory of the download ("../input" on Kaggle Scripts), there are six CSV files that contain the combined data across all years:

    • BenefitsCostSharing.csv
    • BusinessRules.csv
    • Network.csv
    • PlanAttributes.csv
    • Rate.csv
    • ServiceArea.csv

    Additionally, there are two CSV files that facilitate joining data across years:

    • Crosswalk2015.csv - joining 2014 and 2015 data
    • Crosswalk2016.csv - joining 2015 and 2016 data
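
    As a rough illustration of how a crosswalk file can link plans across years, the sketch below joins two tiny in-memory tables. The column names (PlanID_2014, PlanID_2015, PlanId, Rate) and all values are assumptions made up for this example; check the data dictionaries for the real schema before adapting it.

```python
import csv
import io

# Hypothetical miniature stand-ins for Crosswalk2015.csv and Rate.csv.
# Column names and values are assumptions for illustration only.
CROSSWALK_CSV = """PlanID_2014,PlanID_2015
A001,B001
A002,B002
"""

RATES_CSV = """Year,PlanId,Rate
2014,A001,200.0
2014,A002,250.0
2015,B001,210.0
2015,B002,240.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def rate_changes(crosswalk_rows, rate_rows):
    """Map each 2014 plan ID to the rate change of its 2015 successor."""
    rates = {(r["Year"], r["PlanId"]): float(r["Rate"]) for r in rate_rows}
    changes = {}
    for row in crosswalk_rows:
        old = rates.get(("2014", row["PlanID_2014"]))
        new = rates.get(("2015", row["PlanID_2015"]))
        # Skip plans that are missing a rate in either year.
        if old is not None and new is not None:
            changes[row["PlanID_2014"]] = new - old
    return changes


changes = rate_changes(load_rows(CROSSWALK_CSV), load_rows(RATES_CSV))
print(changes)
```

    With the real files, the same pattern would read from the downloaded CSVs instead of the inline strings.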

    3. SQLite database

    The "database.sqlite" file contains tables corresponding to each of the processed CSV files.

    The code to create the processed version of this data is available on GitHub.
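
    A minimal sketch of inspecting such a SQLite file with Python's standard sqlite3 module follows. It builds an in-memory database with two made-up tables so that it runs on its own; the table and column definitions are assumptions, not the actual schema of "database.sqlite".

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# In-memory stand-in so the sketch is self-contained. With the real
# download, one would call sqlite3.connect("database.sqlite") instead.
conn = sqlite3.connect(":memory:")
# Hypothetical table definitions, for illustration only.
conn.execute("CREATE TABLE Rate (PlanId TEXT, Age TEXT, IndividualRate REAL)")
conn.execute("CREATE TABLE Network (NetworkId TEXT, NetworkName TEXT)")

tables = list_tables(conn)
print(tables)
```

    Listing the tables first is a cheap way to confirm that each processed CSV file has a matching table before writing queries.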

  2. medical_insurance_data

    • huggingface.co
    Updated Mar 14, 2024
    + more versions
    Cite
    Rahul Vyas M (2024). medical_insurance_data [Dataset]. https://huggingface.co/datasets/rahulvyasm/medical_insurance_data
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 14, 2024
    Authors
    Rahul Vyas M
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for Medical Insurance Cost Prediction

    The medical insurance dataset encompasses various factors influencing medical expenses, such as age, sex, BMI, smoking status, number of children, and region. This dataset serves as a foundation for training machine learning models capable of forecasting medical expenses for new policyholders. Its purpose is to shed light on the pivotal elements contributing to increased insurance costs, aiding the company in making more informed… See the full description on the dataset page: https://huggingface.co/datasets/rahulvyasm/medical_insurance_data.

  3. US Health Insurance

    • kaggle.com
    zip
    Updated Jan 7, 2023
    Cite
    The Devastator (2023). US Health Insurance [Dataset]. https://www.kaggle.com/datasets/thedevastator/comprehensive-analysis-of-us-health-insurance-ma
    Explore at:
    Available download formats: zip (15726377 bytes)
    Dataset updated
    Jan 7, 2023
    Authors
    The Devastator
    Area covered
    United States
    Description

    US Health Insurance

    Exploring Rates, Benefits, and Providers

    By Data Society [source]

    About this dataset

    This fascinating dataset from the Centers for Medicare & Medicaid Services provides an in-depth analysis of health insurance plans offered throughout the United States. Exploring this data, you can gain insights into how plan rates and benefits vary across states, explore how plan benefits relate to plan rates, and investigate how plans vary across insurance network providers.

    The top-level directory includes six CSV files which contain information about: BenefitsCostSharing.csv; BusinessRules.csv; Network.csv; PlanAttributes.csv; Rate.csv; and ServiceArea.csv - as well as two additional CSV files which facilitate joining data across years: Crosswalk2015.csv (joining 2014 and 2015 data) and Crosswalk2016

    How to use the dataset

    This Kaggle dataset contains comprehensive data on US health insurance Marketplace plans. The data was obtained from the Centers for Medicare & Medicaid Services and contains information such as plan rates and benefits, metal levels, dental coverage, and child/adult-only coverages.

    In order to use this dataset effectively, it is important to understand the different columns/variables that make up the dataset. The columns are state, dental plan, multistate plan (2015 and 2016), metal level (2014-2016), child/adult-only coverage (2014-2016), FIPS code (Federal Information Processing Standard code for the particular state), zipcode, crosswalk level (level of crosswalk between 2014-2016 data sets), reason for crosswalk parameter.

    Using this dataset can help you answer interesting questions about US health insurance Marketplace plans across different variables such as state or rate information. It may also be interesting to compare certain variables over time with respect to how they affect certain types of people or how they differ across states or regions. Additionally, an analysis of the different price points associated with various kinds of coverage could provide insights into which kinds of plans are most attractive in various marketplaces based on cost savings alone.

    Once you have a good understanding of your data from studying individual parameters in depth across multiple states or regions, you can begin looking at correlations between different parameters. With sufficient historical data, you can identify patterns that emerge around common characteristics or trends within areas or across markets over time:

    • Do higher out-of-pocket limits tend to come with higher premiums?
    • Are there more multi-state markets in some states than others?
    • Which metal levels does each region prefer?

    Research Ideas

    • Examining the impacts of age, metal levels and plan benefits on insurance rates in different states.
    • Analyzing how dental plans vary across different states/regions and examining whether there are correlations between affordability and quality of care among plans with dental coverage options.
    • Investigating how the Crosswalk level affects insurance rates by comparing insurance premiums for different metal levels across states with varying Crosswalk levels (e.g., how does a Bronze plan differ in cost between two states with Crosswalk Level 1 vs 2?)

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: Crosswalk2016.csv

    | Column name | Description |
    |:------------|:------------|
    | State | The state in which... |

  4. Health insurance dataset | India-2022

    • kaggle.com
    Updated May 28, 2023
    Cite
    balaji adithya (2023). Health insurance dataset | India-2022 [Dataset]. https://www.kaggle.com/datasets/balajiadithya/health-insurance-dataset-india-2022
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 28, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    balaji adithya
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    India
    Description

    Context

    This public dataset contains data on public and private insurance companies, provided by IRDAI (Insurance Regulatory and Development Authority of India), from 2013-2022. The data is multi-indexed and is good practice for manipulating pandas multi-index dataframes. The dataset focuses mainly on the business of the companies (total premiums and number of policies), subscription information (number of people subscribed), claims incurred, and the network hospitals enrolled by Third Party Administrators.

    Content

    The Excel file contains the following data:

    | Table No. | Contents |
    | --- | --- |
    | **A** | **III.A: HEALTH INSURANCE BUSINESS OF GENERAL AND HEALTH INSURERS** |
    | 62 | Health Insurance - Number of Policies, Number of Persons Covered and Gross Premium |
    | 63 | Personal Accident Insurance - Number of Policies, Number of Persons Covered and Gross Premium |
    | 64 | Overseas Travel Insurance - Number of Policies, Number of Persons Covered and Gross Premium |
    | 65 | Domestic Travel Insurance - Number of Policies, Number of Persons Covered and Gross Premium |
    | 66 | Health Insurance - Net Premium Earned, Incurred Claims and Incurred Claims Ratio |
    | 67 | Personal Accident Insurance - Net Premium Earned, Incurred Claims and Incurred Claims Ratio |
    | 68 | Overseas Travel Insurance - Net Earned Premium, Incurred Claims and Incurred Claims Ratio |
    | 69 | Domestic Travel Insurance - Net Earned Premium, Incurred Claims and Incurred Claims Ratio |
    | 70 | Details of Claims Development and Aging - Health Insurance Business |
    | 71 | State-wise Health Insurance Business |
    | 72 | State-wise Individual Health Insurance Business |
    | 73 | State-wise Personal Accident Insurance Business |
    | 74 | State-wise Overseas Insurance Business |
    | 75 | State-wise Domestic Insurance Business |
    | 76 | State-wise Claims Settlement under Health Insurance Business |
    | **B** | **III.B: HEALTH INSURANCE BUSINESS OF LIFE INSURERS** |
    | 77 | Health Insurance Business in respect of Products offered by Life Insurers - New Business |
    | 78 | Health Insurance Business in respect of Products offered by Life Insurers - Renewal Business |
    | 79 | Health Insurance Business in respect of Riders attached to Life Insurance Products - New Business |
    | 80 | Health Insurance Business in respect of Riders attached to Life Insurance Products - Renewal Business |
    | **C** | **III.C: OTHERS** |
    | 81 | Network Hospital Enrolled by TPAs |
    | 82 | State-wise Details on Number of Network Providers |

  5. United States Health Insurance: Premium Per Member Per Month

    • ceicdata.com
    Cite
    CEICdata.com, United States Health Insurance: Premium Per Member Per Month [Dataset]. https://www.ceicdata.com/en/united-states/health-insurance-industry-financial-snapshots/health-insurance-premium-per-member-per-month
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2021 - Sep 1, 2024
    Area covered
    United States
    Variables measured
    Insurance Market
    Description

    United States Health Insurance: Premium Per Member Per Month was reported at 364.000 USD in Sep 2024, unchanged from 364.000 USD in Jun 2024. The series is updated quarterly, with a median of 262.000 USD from Mar 2012 to Sep 2024 across 51 observations. It reached an all-time high of 364.000 USD in Sep 2024 and a record low of 178.000 USD in Sep 2013. The series remains active in CEIC and is reported by the National Association of Insurance Commissioners. The data is categorized under Global Database’s United States – Table US.RG017: Health Insurance: Industry Financial Snapshots.

  6. Health Insurance Claim Count data with many zeros

    • data.mendeley.com
    Updated Mar 18, 2021
    Cite
    Olumide Adesina (2021). Health Insurance Claim Count data with many zeros [Dataset]. http://doi.org/10.17632/6hcf5mf7fy.1
    Explore at:
    Dataset updated
    Mar 18, 2021
    Authors
    Olumide Adesina
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset is National Health Insurance Scheme (NHIS) data with excess zero counts. The data were obtained from an Ogun State health facility in Ota, Ogun State, and cover claims made by 116 NHIS users from September 2016 to July 2017. The response variable is the number of encounters, while the predictors are sex, age of patient, number of drugs prescribed (DrugAdm), and number of drugs out of stock (DrugOS). The data are over-dispersed: the dispersion parameter is 1.4980, which is greater than 1. Bayesian and frequentist techniques can be used to model the data. Such work can be found in the studies by Adesina et al. (2017), Adesina et al. (2018), Adesina et al. (2019), Dare et al. (2019), and Adesina et al. (2021).
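
    One common quick check for over-dispersion and excess zeros in count data is the variance-to-mean ratio together with the share of zero counts. The sketch below computes both on made-up counts; it is not necessarily how the 1.4980 figure above was obtained (that may come from a fitted model), and the sample values are assumptions for illustration only.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Variance-to-mean ratio; values above 1 suggest over-dispersion
    relative to a Poisson distribution, where the ratio is 1."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m  # uses the sample variance (n - 1 denominator)


def zero_fraction(counts):
    """Share of observations that are exactly zero."""
    return sum(1 for c in counts if c == 0) / len(counts)


# Hypothetical encounter counts, not taken from the dataset.
counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 6]

index = dispersion_index(counts)
zeros = zero_fraction(counts)
print(f"dispersion index: {index:.4f}, zero fraction: {zeros:.2f}")
```

    When both numbers are high, zero-inflated or negative binomial count models are usually considered alongside plain Poisson regression.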

  7. Health Insurance Coverage by ZIP Code Tabulation Area

    • data.ferndalemi.gov
    • detroitdata.org
    • +2 more
    Updated May 31, 2019
    + more versions
    Cite
    City of Detroit (2019). Health Insurance Coverage by ZIP Code Tabulation Area [Dataset]. https://data.ferndalemi.gov/datasets/detroitmi::health-insurance-coverage-by-zip-code-tabulation-area
    Explore at:
    Dataset updated
    May 31, 2019
    Dataset authored and provided by
    City of Detroit
    Description

    This dataset provides an estimate of the percent of Detroit residents who reported having health insurance at the time they completed the American Community Survey (ACS). The data is averaged over 5 years. This data can also be accessed in Table S2701 on the American FactFinder website. Note that the data is provided by ZIP Code Tabulation Area (ZCTA), which may not exactly match USPS ZIP Code service areas. For more information: https://web.archive.org/web/20130617034846/http://www.census.gov/geo/reference/zctas.html

  8. Robert Wood Johnson Foundation Employer Health Insurance Survey, 1993 -...

    • search.gesis.org
    Updated May 6, 2021
    + more versions
    Cite
    ICPSR - Interuniversity Consortium for Political and Social Research (2021). Robert Wood Johnson Foundation Employer Health Insurance Survey, 1993 - Version 2 [Dataset]. http://doi.org/10.3886/ICPSR06908.v2
    Explore at:
    Dataset updated
    May 6, 2021
    Dataset provided by
    GESIS search
    ICPSR - Interuniversity Consortium for Political and Social Research
    License

    https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de456412

    Description

    Abstract (en): The purpose of this survey was to investigate the barriers to the provision of employer-sponsored health insurance coverage and to describe the premiums and other characteristics of health plans offered by employers. With the goal of remedying the previous lack of state-level data, the survey was conducted to aid in defining problems in the employment-based insurance market and in analyzing the impacts of states' policy options. The survey collected data on characteristics of employers and workers in establishments offering and not offering health insurance, including the number of employees (statewide and nationwide), the distribution of workers by hours worked, age, sex, and earnings, the peak month for seasonal workers, the type of industry or business, whether health insurance was offered, and eligibility rules for health insurance. It also collected information about the characteristics of plans offered, including premiums, cost-sharing, medical underwriting, self-insurance, type of plan, number of days a person must wait for coverage of a preexisting condition, and whether each plan covered prenatal care, maternity care, outpatient prescription drugs, mental health services, dental care, and treatment for alcoholism or drug abuse. The survey also elicited information from employers not offering health insurance as to other forms of compensation for medical expenses they provided to employees. There are three data files in the collection. Part 1, Firms Data, contains information on the surveyed firms. Part 2, Plans Data, has data on each insurance plan offered by these firms. Part 3, FIPS State and County Codes for Firms Data, identifies the state and county of each firm. Parts 1 and 3 comprise one case per firm, Part 2 one case per insurance plan. 
All private business establishments except self-employed persons who had no employees, and all public employers in Colorado, Florida, Minnesota, New Mexico, New York, North Dakota, Oklahoma, Oregon, Vermont, and Washington. In each state, a probability sample of public and private employers was selected. Approximately 2,000 private employment establishments were sampled in each state, allocated equally to four strata as defined by the number of workers: 1-4, 5-9, 10-24, and 25 and over. Forty-six to 262 public employers were sampled in each state.
2006-03-30 File CB06908-0001-2.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
2005-07-06 SPSS setup files for Parts 1 and 2 have been added to the collection and the SAS setup files have been enhanced.
1999-12-29 A file with FIPS state and county codes, which can be merged with Part 1, Firms Data, has been added as Part 3. To obtain this file, researchers must agree to the terms and conditions of a restricted data use agreement in accordance with existing servicing policies.
1997-11-14 A report entitled "Data Cleaning Procedures for the 1993 Robert Wood Johnson Foundation Employer Health Insurance Survey" has been added to the documentation for this collection, and all documentation for this study has been converted to PDF files.
Funding institution(s): Robert Wood Johnson Foundation. Computer-assisted telephone interview (CATI), mail questionnaire. The data files for this collection are blank-delimited and can be linked using common ID variables.

  9. Health Insurance Charges Dataset

    • kaggle.com
    zip
    Updated Oct 22, 2025
    Cite
    Aleesha Nadeem (2025). Health Insurance Charges Dataset [Dataset]. https://www.kaggle.com/datasets/nalisha/health-insurance-charges-dataset
    Explore at:
    Available download formats: zip (16427 bytes)
    Dataset updated
    Oct 22, 2025
    Authors
    Aleesha Nadeem
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides a realistic look at how important lifestyle and human characteristics are used to calculate health insurance rates. It presents a wide range of people, each characterized by their age, gender, BMI, number of dependents, smoking status, and geographic location, as well as the associated insurance bills they received.

    We can find important patterns by examining this dataset, like:

    • Why smokers pay noticeably higher premiums
    • How age and BMI affect medical expenses
    • Whether insurance costs are higher in some areas
    • The connection between charges and family size
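
    A first pass at questions like these is often a simple group comparison. The sketch below averages charges by smoking status using a handful of made-up records; the field names ("smoker", "charges") and the values are assumptions for illustration, not rows from the dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records; field names and values are illustrative only.
records = [
    {"smoker": "yes", "charges": 30000.0},
    {"smoker": "yes", "charges": 34000.0},
    {"smoker": "no", "charges": 8000.0},
    {"smoker": "no", "charges": 10000.0},
]


def mean_by_group(rows, key, value):
    """Average the `value` field within each distinct `key` value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {group: mean(values) for group, values in groups.items()}


result = mean_by_group(records, "smoker", "charges")
print(result)
```

    The same helper can group by region or number of dependents by changing the `key` argument.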

  10. Health Insurance Claims FY 2024-25

    • algatesinsurance.in
    csv
    Updated Sep 24, 2025
    Cite
    Algates Insurance (2025). Health Insurance Claims FY 2024-25 [Dataset]. https://algatesinsurance.in/insurance-infographic/health-insurance-claims-complaints-volume/
    Explore at:
    Available download formats: csv
    Dataset updated
    Sep 24, 2025
    Dataset authored and provided by
    Algates Insurance
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset lists the number of health insurance claims for various insurers in FY 2024-25.

  11. Data from: Oregon Health Insurance Experiment, 2007-2010

    • icpsr.umich.edu
    • search.datacite.org
    ascii, sas, spss +1
    Updated May 2, 2014
    + more versions
    Cite
    Finkelstein, Amy; Baicker, Katherine (2014). Oregon Health Insurance Experiment, 2007-2010 [Dataset]. http://doi.org/10.3886/ICPSR34314.v3
    Explore at:
    Available download formats: ascii, spss, stata, sas
    Dataset updated
    May 2, 2014
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Finkelstein, Amy; Baicker, Katherine
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/34314/terms

    Time period covered
    2007 - 2010
    Area covered
    Oregon
    Description

    In 2008, a group of uninsured low-income adults in Oregon was selected by lottery to be given the chance to apply for Medicaid. This lottery provides an opportunity to gauge the effects of expanding access to public health insurance on the health care use, financial strain, and health of low-income adults using a randomized controlled design. The Oregon Health Insurance Experiment follows and compares those selected in the lottery (treatment group) with those not selected (control group). The data collected and provided here include data from in-person interviews, three mail surveys, emergency department records, and administrative records on Medicaid enrollment, the initial lottery sign-up list, welfare benefits, and mortality. This data collection has seven data files: Dataset 1 contains administrative data on the lottery from the state of Oregon. These data include demographic characteristics that were recorded when individuals signed up for the lottery, date of lottery draw, and information on who was selected for the lottery, applied for the lotteried Medicaid plan if selected, and whose application for the lotteried plan was approved. Also included are Oregon mortality data for 2008 and 2009. Dataset 2 contains information from the state of Oregon on the individuals' participation in Medicaid, Supplemental Nutrition Assistance Program (SNAP), and Temporary Assistance to Needy Families (TANF). Datasets 3-5 contain the data from the initial, six month, and 12 month mail surveys, respectively. Topics covered by the surveys include demographic characteristics; health insurance, access to health care and health care utilization; health care needs, experiences, and costs; overall health status and changes in health; and depression and medical conditions and use of medications to treat them. Dataset 6 contains an analysis subset of the variables from the in-person interviews. 
Topics covered by the survey questionnaire include overall health, health insurance coverage, health care access, health care utilization, conditions and treatments, health behaviors, medical and dental costs, and demographic characteristics. The interviewers also obtained blood pressure and anthropometric measurements and collected dried blood spots to measure levels of cholesterol, glycated hemoglobin and C-reactive protein. Dataset 7 contains an analysis subset of the variables the study obtained for all emergency department (ED) visits to twelve hospitals in the Portland area during 2007-2009. These variables capture total hospital costs, ED costs, and the number of ED visits categorized by time of the visit (daytime weekday or nighttime and weekends), necessity of the visit (emergent, ED care needed, non-preventable; emergent, ED care needed, preventable; emergent, primary care treatable), ambulatory case sensitive status, whether or not the patient was hospitalized, and the reason for the visit (e.g., injury, abdominal pain, chest pain, headache, and mental disorders). The collection also includes a ZIP archive (Dataset 8) with Stata programs that replicate analyses reported in three articles by the principal investigators and others: Finkelstein, Amy et al "The Oregon Health Insurance Experiment: Evidence from the First Year". The Quarterly Journal of Economics. August 2012. Vol 127(3). Baicker, Katherine et al "The Oregon Experiment - Effects of Medicaid on Clinical Outcomes". New England Journal of Medicine. 2 May 2013. Vol 368(18). Taubman, Sarah et al "Medicaid Increases Emergency Department Use: Evidence from Oregon's Health Insurance Experiment". Science. 2 Jan 2014.

  12. Medical Insurance dataset

    • kaggle.com
    zip
    Updated Dec 19, 2021
    Cite
    Raj Gupta (2021). Medical Insurance dataset [Dataset]. https://www.kaggle.com/datasets/rajgupta2019/medical-insurance-dataset
    Explore at:
    Available download formats: zip (94324 bytes)
    Dataset updated
    Dec 19, 2021
    Authors
    Raj Gupta
    Description

    Context

    People are always confused about their medical insurance and don't know the cost of insurance at different ages and conditions. This data is useful for these people and is useful to make predictions of the insurance cost they will have to pay.

    Content

    The data provider is unknown and all credit goes to that person. The data may not be sufficient for practical purposes and is solely for education and practice.

    Acknowledgements

    Data collection is one thing, and data cleaning and preprocessing is another. The resources on YouTube are enough to learn these basics.

    Inspiration

    The KAGGLE community is very inspiring and is the best way to learn everything we need to know in Data Science and I love it.

  13. Network Adequacy Waiver Request - Major Medical

    • catalog.data.gov
    • data.texas.gov
    • +1 more
    Updated Oct 25, 2025
    + more versions
    Cite
    data.austintexas.gov (2025). Network Adequacy Waiver Request - Major Medical [Dataset]. https://catalog.data.gov/dataset/network-adequacy-waiver-request-major-medical
    Explore at:
    Dataset updated
    Oct 25, 2025
    Dataset provided by
    data.austintexas.gov
    Description

    This dataset includes information related to network adequacy waiver requests filed by major medical health benefit plans. It includes data on physicians, providers, and facilities, other than facility-based physicians and providers. Related datasets are available for major medical (facility-based providers) and vision plans:

    • Facility-based Physicians & Providers: Network Adequacy Waiver Request - Facility based Physicians & Providers. This dataset relates to waiver requests for networks used for major medical PPO and EPO plans and includes data on facility-based physicians and providers.
    • Vision: Network Adequacy Waiver Request - Vision. This dataset relates to waiver requests for networks used for vision PPO and EPO plans.

    Insurers offering health benefits through a preferred or exclusive provider benefit plan (also called PPO and EPO plans) are required to demonstrate that the health insurance network meets Texas network adequacy standards. When a network does not meet these requirements and has a deficiency in a county for a specific physician or provider specialty type, an insurer may apply for a waiver to continue operating within its service area. The commissioner of the Texas Department of Insurance (TDI) may grant the waiver following a public hearing and consideration of relevant testimony and information. Anyone may attend the public hearing and offer testimony. Learn more about how to submit information related to a waiver request or participate in a hearing here: Network Adequacy Standards Waivers.

  14. Health care expenditure; proceeds from forms of financing

    • data.europa.eu
    atom feed, json
    Cite
    Health care expenditure; proceeds from forms of financing [Dataset]. https://data.europa.eu/data/datasets/14838-zorguitgaven-opbrengsten-van-financieringsregelingen
    Explore at:
    Available download formats: json, atom feed
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This table describes the different types of income received by the forms of financing for healthcare expenditure. In this table, healthcare expenditure in current prices per type of financing (such as the Healthcare Insurance Act) is broken down into the different types of revenue, such as the income-related health insurance contribution. The classifications used are from the System of Health Accounts of Eurostat, the Organisation for Economic Co-operation and Development (OECD), and the World Health Organization (WHO).

    Data available from: 1998

    Status of figures: The figures for 2021 and 2022 presented in the table are further provisional. The figures for earlier years are final. The total expenditure according to the forms of financing is equal to the total healthcare expenditure shown in the table 'Healthcare expenditure; healthcare providers and funding' (see paragraph 3 for a link to this table).

    Changes as of 23 May 2024: - Further provisional figures for 2021 and 2022 and final figures for 2020 have been published. - The term 'funding schemes' has been replaced by the term 'forms of funding' to better reflect the content of this selection.

    When will there be new figures? Statistics on health care expenditure are currently being revised. New data sources are processed and the most up-to-date statistical insights are applied. Revised figures for 2021 and 2022 and provisional figures for 2023 will be published in Q4 2024. Revised figures for 1998-2020 will be published in 2025.
    The figures on total expenditure according to the forms of financing are also adjusted at those times, so that they remain in line with the most recent figures on expenditure on care (after revision).

  15. Network Adequacy Waiver Request - Vision

    • catalog.data.gov
    • data.texas.gov
    • +1 more
    Updated Nov 25, 2025
    Cite
    data.austintexas.gov (2025). Network Adequacy Waiver Request - Vision [Dataset]. https://catalog.data.gov/dataset/network-adequacy-waiver-request-vision
    Dataset updated
    Nov 25, 2025
    Dataset provided by
    data.austintexas.gov
    Description

    This dataset includes information related to network adequacy waiver requests filed by vision plans. Related datasets are available for major medical PPO and EPO plans: • Network Adequacy Waiver Request - Facility based Physicians & Providers. This dataset relates to waiver requests for networks used for major medical PPO and EPO plans and includes data on facility-based physicians and providers. • Network Adequacy Waiver Request - Major Medical. This dataset relates to waiver requests for networks used for major medical PPO and EPO plans and includes data on physicians, providers, and facilities, other than facility-based physicians and providers. Insurers offering vision benefits through a preferred or exclusive provider benefit plan (also called PPO and EPO plans) are required to demonstrate that the health insurance network meets Texas network adequacy standards. When a network does not meet these requirements and has a deficiency in a county for a specific physician or provider specialty type, an insurer may apply for a waiver to continue operating within its service area. The commissioner of the Texas Department of Insurance (TDI) may grant the waiver following a public hearing and consideration of relevant testimony and information. Anyone may attend the public hearing and offer testimony. Learn more about how to submit information related to a waiver request or participate in a hearing here: Network Adequacy Standards Waivers.

  16. Survey of Adult Transition and Health (SATH)

    • dataverse.harvard.edu
    • data.niaid.nih.gov
    Updated Apr 27, 2011
    Cite
    Harvard Dataverse (2011). Survey of Adult Transition and Health (SATH) [Dataset]. http://doi.org/10.7910/DVN/33FJWF
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 27, 2011
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Users can download data on the health care needs of children with special health care needs during adolescence and early adulthood. Topics include transition services, care coordination, and health insurance.

    Background: The Survey of Adult Transition and Health (SATH) is operated by the Centers for Disease Control and Prevention (CDC) and National Center for Health Statistics (NCHS) and is sponsored by the Department of Health and Human Services (DHHS) Maternal and Child Health Bureau and the Health Resources and Services Administration (HRSA). This survey followed up on cases included in the 2001 National Survey of Children with Special Health Care Needs (NSCSHCN). The SATH aims to examine the current health care needs of the original survey subjects and to understand their transition from pediatric health care providers to adult health care providers. Topics include, but are not limited to, transition services, accommodations, care coordination, and health insurance.

    User Functionality: Users can download the survey instrument, public dataset, and codebook. The questionnaire can be downloaded as a PDF; the dataset can be downloaded into SAS statistical software.

    Data Notes: The SATH is a follow-up survey administered to children with special health care needs who were 14-17 years of age during the initial interview in the 2001 NSCSHCN. In 2007, these cases were 19-23 years old. The 2001 survey was conducted with the parent or guardian of the child with special health care needs; the child (n = 1,916) responded to the 2007 follow-up survey. Data were collected between June 2007 and August 2007. Information is available at the national level.

  17. COVID-19 Test Sites

    • s.cnmilf.com
    • catalog.data.gov
    Updated Mar 31, 2025
    + more versions
    Cite
    City of Philadelphia (2025). COVID-19 Test Sites [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/covid-19-test-sites
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    City of Philadelphia
    Description

    A dataset of COVID-19 testing sites. If looking for a test, please use the Testing Sites locator app. You will be asked for identification and for health insurance information. Identification is required to receive a test. If you don’t have health insurance, you may still be able to receive a test by paying out-of-pocket. Some sites may also: - Limit testing to people who meet certain criteria. - Require an appointment. - Require a referral from your doctor. Check a location’s specific details on the map. Then, call or visit the provider’s website before going for a test.

  18. Insurance Claims Dataset

    • kaggle.com
    zip
    Updated May 9, 2024
    Cite
    Sergey Litvinenko (2024). Insurance Claims Dataset [Dataset]. https://www.kaggle.com/datasets/litvinenko630/insurance-claims
    Explore at:
    Available download formats: zip (688768 bytes)
    Dataset updated
    May 9, 2024
    Authors
    Sergey Litvinenko
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Description: Insurance Claims Prediction

    Introduction: In the insurance industry, accurately predicting the likelihood of claims is essential for risk assessment and policy pricing. However, insurance claims datasets frequently suffer from class imbalance, where the number of non-claims instances far exceeds that of actual claims. This class imbalance poses challenges for predictive modeling, often leading to biased models favoring the majority class, resulting in subpar performance for the minority class, which is typically of greater interest.

    Dataset Overview: The dataset utilized in this project comprises historical data on insurance claims, encompassing a variety of information about the policyholders, their demographics, past claim history, and other pertinent features. The dataset is structured to facilitate predictive modeling tasks aimed at accurately identifying the likelihood of future insurance claims.

    Key Features:

    1. Policyholder Information: This includes demographic details such as age, gender, occupation, marital status, and geographical location.
    2. Claim History: Information regarding past insurance claims, including claim amounts, types of claims (e.g., medical, automobile), frequency of claims, and claim durations.
    3. Policy Details: Details about the insurance policies held by the policyholders, such as coverage type, policy duration, premium amount, and deductibles.
    4. Risk Factors: Variables indicating potential risk factors associated with policyholders, such as credit score, driving record (for automobile insurance), health status (for medical insurance), and property characteristics (for home insurance).
    5. External Factors: Factors external to the policyholders that may influence claim likelihood, such as economic indicators, weather conditions, and regulatory changes.

    Objective: The primary objective of utilizing this dataset is to develop robust predictive models capable of accurately assessing the likelihood of insurance claims. By leveraging advanced machine learning techniques, such as classification algorithms and ensemble methods, the aim is to mitigate the effects of class imbalance and produce models that demonstrate high predictive performance across both majority and minority classes.
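    One common way to counter the class imbalance described above is to weight each class inversely to its frequency, so that errors on the rare class count for more during training. The sketch below is illustrative only and is not taken from the dataset author; the label encoding (0 = no claim, 1 = claim) and the function name `inverse_frequency_weights` are assumptions.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {class: weight}, where weight = n_samples / (n_classes * class_count).

    Rare classes receive larger weights; a perfectly balanced dataset
    gives every class a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: nine policies without a claim (0), one with a claim (1).
labels = [0] * 9 + [1]
weights = inverse_frequency_weights(labels)
```

    Weights of this form can typically be passed to a classifier as per-class or per-sample weights; resampling the minority class is an alternative with the same goal.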

    Application Areas:

    1. Risk Assessment: Assessing the risk associated with insuring a particular policyholder based on their characteristics and historical claim behavior.
    2. Policy Pricing: Determining appropriate premium amounts for insurance policies by estimating the expected claim frequency and severity.
    3. Fraud Detection: Identifying fraudulent insurance claims by detecting anomalous patterns in claim submissions and policyholder behavior.
    4. Customer Segmentation: Segmenting policyholders into distinct groups based on their risk profiles and insurance needs to tailor marketing strategies and policy offerings.

    Conclusion: The insurance claims dataset serves as a valuable resource for developing predictive models aimed at enhancing risk management, policy pricing, and overall operational efficiency within the insurance industry. By addressing the challenges posed by class imbalance and leveraging the rich array of features available, organizations can gain valuable insights into insurance claim likelihood and make informed decisions to mitigate risk and optimize business outcomes.

    Features and descriptions:

    • policy_id: Unique identifier for the insurance policy.
    • subscription_length: The duration for which the insurance policy is active.
    • customer_age: Age of the insurance policyholder, which can influence the likelihood of claims.
    • vehicle_age: Age of the vehicle insured, which may affect the probability of claims due to factors like wear and tear.
    • model: The model of the vehicle, which could impact the claim frequency due to model-specific characteristics.
    • fuel_type: Type of fuel the vehicle uses (e.g., Petrol, Diesel, CNG), which might influence the risk profile and claim likelihood.
    • max_torque, max_power: Engine performance characteristics that could relate to the vehicle’s mechanical condition and claim risks.
    • engine_type: The type of engine, which might have implications for maintenance and claim rates.
    • displacement, cylinder: Specifications related to the engine size and construction, affec...

  19. Social Insurance Programs in Richest Quintile

    • kaggle.com
    Updated Jan 7, 2023
    Cite
    The Devastator (2023). Social Insurance Programs in Richest Quintile [Dataset]. https://www.kaggle.com/datasets/thedevastator/coverage-of-social-insurance-programs-in-richest
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 7, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Coverage of Social Insurance Programs in Richest Quintile

    Percent of Population Eligible

    By data.world's Admin [source]

    About this dataset

    This dataset describes the coverage of social insurance programs for the wealthiest quintile of populations around the world. It shows what share of that group in each country receives support from old age contributory pensions, disability benefits, and social security and health insurance benefits such as occupational injury benefits, paid sick leave, and maternity leave. The data is a resource for understanding the health and well-being of the most financially privileged in society, who often have greater impact on decision making than other groups. With figures dated 2019-05-11, the dataset can help show where healthcare provision could be improved in each country.


    How to use the dataset

    • Understand the context: Before you begin analyzing this dataset, it is important to understand the information that it provides. Take some time to read the description of what is included in the dataset, including a clear understanding of the definitions and scope of coverage provided with each data point.

    • Examine the data: Once you have a general understanding of this dataset's contents, take some time to explore its contents in more depth. What specific questions does this dataset help answer? What kind of insights does it provide? Are there any missing pieces?

    • Clean & Prepare Data: After you've preliminarily examined its content, start preparing your data for further analysis and visualization. Clean up any formatting issues or irregularities present in your data set by correcting typos and eliminating unnecessary rows or columns before working with your chosen programming language (I prefer R for data manipulation tasks). Additionally, consider performing necessary transformations such as sorting or averaging values if appropriate for the findings you wish to draw from your analysis.

    • Visualize Results: Once you've cleaned and prepared your data, use visualizations such as charts, graphs, or tables to reveal patterns that support specific conclusions about how insurance coverage under social programs varies among different groups within society's quintiles (by age group, etc.). Visualization lets readers who aren't familiar with programming process complex information more quickly and accurately than when it is displayed only numerically in tabular form.

    • Final Analysis & Export Results: Finally, export your visuals into presentation-ready formats (e.g., PDFs) that can be shared with colleagues. Additionally, use these results as part of a narrative report providing an accurate assessment and meaningful interpretation of how social insurance programs vary between different members of society's quintiles (i.e., richest vs poorest), along with potential policy implications for strategies that improve access.

    Research Ideas

    • Analyzing the effectiveness of social insurance programs by comparing the coverage levels across different geographic areas or socio-economic groups;
    • Estimating the economic impact of social insurance programs on local and national economies by tracking spending levels and revenues generated;
    • Identifying potential problems with access to social insurance benefits, such as racial or gender disparities in benefit coverage

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: coverage-of-social-insurance-programs-in-richest-quintile-of-population-1.csv

    Acknowledgements

    If you use this dataset in your research, please credit data.world's Admin.

  20. Data from: Insurance Claim Dataset

    • kaggle.com
    zip
    Updated Jan 26, 2022
    Cite
    M Yasser H (2022). Insurance Claim Dataset [Dataset]. https://www.kaggle.com/datasets/yasserh/insurance-claim-dataset
    Explore at:
    Available download formats: zip (15957 bytes)
    Dataset updated
    Jan 26, 2022
    Authors
    M Yasser H
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description:

    A simple yet challenging project: anticipate whether the insurance will be claimed or not. The difficulty arises because the dataset has few samples and is slightly imbalanced. Can you overcome these obstacles and build a good predictive model to classify them?

    This data frame contains the following columns:

    • age: age of policyholder
    • sex: gender of policyholder (female=0, male=1)
    • bmi: body mass index (kg / m^2), the ratio of weight to height squared; an objective index of whether body weight is high or low relative to height, ideally 18.5 to 25
    • steps: average walking steps per day of policyholder
    • children: number of children / dependents of policyholder
    • smoker: smoking status of policyholder (non-smoker=0, smoker=1)
    • region: the residential area of policyholder in the US (northeast=0, northwest=1, southeast=2, southwest=3)
    • charges: individual medical costs billed by health insurance
    • insuranceclaim: yes=1, no=0
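
    The integer codes listed above can be mapped back to readable labels before exploring the data. The following is a minimal sketch, assuming each record is available as a plain dictionary keyed by the column names above; the helper names `decode_row` and `bmi` are hypothetical and not part of the dataset.

```python
# Code-to-label mappings taken from the column descriptions above.
SEX = {0: "female", 1: "male"}
SMOKER = {0: "non-smoker", 1: "smoker"}
REGION = {0: "northeast", 1: "northwest", 2: "southeast", 3: "southwest"}


def bmi(weight_kg, height_m):
    """Body mass index in kg / m^2: weight divided by height squared."""
    return weight_kg / height_m ** 2


def decode_row(row):
    """Return a copy of `row` with sex, smoker, and region codes replaced by labels."""
    decoded = dict(row)
    decoded["sex"] = SEX[row["sex"]]
    decoded["smoker"] = SMOKER[row["smoker"]]
    decoded["region"] = REGION[row["region"]]
    return decoded


# Hypothetical record for illustration; values are not taken from the dataset.
example = decode_row({"age": 30, "sex": 0, "smoker": 1, "region": 2})
```

    Keeping the mappings in module-level dictionaries makes an unexpected code fail loudly with a `KeyError` rather than pass through silently.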

    This is the "Sample Insurance Claim Prediction Dataset", which is based on "Medical Cost Personal Datasets" with updated sample values.

    Acknowledgements:

    This dataset was sourced from Kaggle.

    Objective:

    • Understand the dataset and clean it up (if required).
    • Build a classification model to predict whether the insurance will be claimed or not.
    • Fine-tune the hyperparameters and compare the evaluation metrics of various classification algorithms.
Cite
US Department of Health and Human Services (2017). Health Insurance Marketplace [Dataset]. https://www.kaggle.com/datasets/hhs/health-insurance-marketplace
Health Insurance Marketplace

Explore health and dental plans data in the US Health Insurance Marketplace

Explore at:
zip(868821924 bytes)Available download formats
Dataset updated
May 1, 2017
Dataset provided by
United States Department of Health and Human Serviceshttp://www.hhs.gov/
Authors
US Department of Health and Human Services
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

The Health Insurance Marketplace Public Use Files contain data on health and dental plans offered to individuals and small businesses through the US Health Insurance Marketplace.

[Figure: median plan premiums]

Exploration Ideas

To help get you started, here are some data exploration ideas:

  • How do plan rates and benefits vary across states?
  • How do plan benefits relate to plan rates?
  • How do plan rates vary by age?
  • How do plans vary across insurance network providers?

See this forum thread for more ideas, and post there if you want to add your own ideas or answer some of the open questions!

Data Description

This data was originally prepared and released by the Centers for Medicare & Medicaid Services (CMS). Please read the CMS Disclaimer-User Agreement before using this data.

Here, we've processed the data to facilitate analytics. This processed version has three components:

1. Original versions of the data

The original versions of the 2014, 2015, 2016 data are available in the "raw" directory of the download and "../input/raw" on Kaggle Scripts. Search for "dictionaries" on this page to find the data dictionaries describing the individual raw files.

2. Combined CSV files

In the top level directory of the download ("../input" on Kaggle Scripts), there are six CSV files that contain the combined data across all years:

  • BenefitsCostSharing.csv
  • BusinessRules.csv
  • Network.csv
  • PlanAttributes.csv
  • Rate.csv
  • ServiceArea.csv

Additionally, there are two CSV files that facilitate joining data across years:

  • Crosswalk2015.csv - joining 2014 and 2015 data
  • Crosswalk2016.csv - joining 2015 and 2016 data

3. SQLite database

The "database.sqlite" file contains tables corresponding to each of the processed CSV files.

The code to create the processed version of this data is available on GitHub.
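
Since the table names inside "database.sqlite" are not listed here, a reasonable first step is to ask the database itself. The snippet below is a minimal sketch using Python's standard `sqlite3` module; the function name `list_tables` is an assumption for illustration, and the file path must point at wherever the download was extracted.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path, sorted by name."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        # Always release the file handle, even if the query fails.
        conn.close()
    return [name for (name,) in rows]
```

Calling `list_tables("database.sqlite")` from the download directory should return one name per processed CSV file, according to the description above. Note that `sqlite3.connect` creates an empty database if the path does not exist, so an empty result usually means the path is wrong.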
