100+ datasets found

Health Insurance Marketplace
kaggle.com
zip
Updated May 1, 2017
Cite
US Department of Health and Human Services (2017). Health Insurance Marketplace [Dataset]. https://www.kaggle.com/datasets/hhs/health-insurance-marketplace
Explore at:
Available download formats: zip (868821924 bytes)
Dataset updated
May 1, 2017
Dataset provided by
United States Department of Health and Human Services (http://www.hhs.gov/)
Authors
US Department of Health and Human Services
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The Health Insurance Marketplace Public Use Files contain data on health and dental plans offered to individuals and small businesses through the US Health Insurance Marketplace.

Exploration Ideas

To help get you started, here are some data exploration ideas:

How do plan rates and benefits vary across states?

How do plan benefits relate to plan rates?

How do plan rates vary by age?

How do plans vary across insurance network providers?

See this forum thread for more ideas, and post there if you want to add your own ideas or answer some of the open questions!

Data Description

This data was originally prepared and released by the Centers for Medicare & Medicaid Services (CMS). Please read the CMS Disclaimer-User Agreement before using this data.

Here, we've processed the data to facilitate analytics. This processed version has three components:

1. Original versions of the data

The original versions of the 2014, 2015, and 2016 data are available in the "raw" directory of the download and in "../input/raw" on Kaggle Scripts. Search for "dictionaries" on this page to find the data dictionaries describing the individual raw files.

2. Combined CSV files

In the top-level directory of the download ("../input" on Kaggle Scripts), there are six CSV files that contain the combined data across all years:

BenefitsCostSharing.csv

BusinessRules.csv

Network.csv

PlanAttributes.csv

Rate.csv

ServiceArea.csv

Additionally, there are two CSV files that facilitate joining data across years:

Crosswalk2015.csv - joining 2014 and 2015 data

Crosswalk2016.csv - joining 2015 and 2016 data

3. SQLite database

The "database.sqlite" file contains tables corresponding to each of the processed CSV files.

The code to create the processed version of this data is available on GitHub.
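
A quick way to confirm which tables the SQLite file exposes is to query its schema table. The sketch below is illustrative only: it uses an in-memory database as a stand-in for "database.sqlite", and the table columns are hypothetical placeholders, since the real schemas are not listed here.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in a SQLite database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in for sqlite3.connect("database.sqlite"): an in-memory database with
# two empty tables named after processed CSV files. The single placeholder
# column is an assumption, not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Rate (placeholder TEXT)")
conn.execute("CREATE TABLE Network (placeholder TEXT)")

tables = list_tables(conn)
print(tables)
```

Against the real download, replacing ":memory:" with the path to "database.sqlite" and dropping the CREATE TABLE statements should list the actual tables.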
Medical records of 30K Synthea synthetic patients
search.dataone.org
Updated Nov 8, 2023
Cite
Chen, AJ (2023). Medical records of 30K Synthea synthetic patients [Dataset]. http://doi.org/10.7910/DVN/BWDKXS
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/BWDKXS
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Chen, AJ
Description
The dataset contains two populations of synthetic patients generated by the Synthea tool. Each population has 15K patients with original medical records in CSV files. Because the total file size exceeds 3 GB for each population, the files are compressed into a zip file. Synthea records cover domains similar to those in real EMRs, including patients, encounters, conditions (diagnoses), observations, medications, and procedures. The data was first used in building ML models for lung cancer risk prediction. For more information, see the published paper in Nature Scientific Reports (https://www.nature.com/articles/s41598-022-23011-4).
Area Health Resources Files
datacatalog.med.nyu.edu
Updated Mar 21, 2024
+ more versions
Cite
United States - Health Resources and Services Administration (HRSA) (2024). Area Health Resources Files [Dataset]. https://datacatalog.med.nyu.edu/dataset/10001
Explore at:
Dataset updated
Mar 21, 2024
Dataset provided by
Health Resources and Services Administration (https://www.hrsa.gov/)
Authors
United States - Health Resources and Services Administration (HRSA)
Time period covered
Jan 1, 2000 - Present
Area covered
Illinois, Washington (State), New Mexico, Massachusetts, Vermont, South Dakota, Georgia, Hawaii, United States, Idaho
Description
The Area Health Resources Files (AHRF) provide current as well as historic data for more than 6,000 variables for each of the nation's counties, as well as state and national data. They contain information on health facilities, health professions, measures of resource scarcity, health status, economic activity, health training programs, and socioeconomic and environmental characteristics. In addition, the basic file contains geographic codes and other metadata which enable it to be linked to other files.
Data from: Geographic Classification for Health - Concordance Files
figshare.com
datasetcatalog.nlm.nih.gov
+1 more
txt
Updated May 30, 2023
Cite
Jesse Whitehead; Gabrielle Davie; Brandon de Graaf; Sue Crengle; David Fearnley; Michelle Smith; Ross Lawrenson; Garry Nixon (2023). Geographic Classification for Health - Concordance Files [Dataset]. http://doi.org/10.6084/m9.figshare.22728851.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.22728851.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jesse Whitehead; Gabrielle Davie; Brandon de Graaf; Sue Crengle; David Fearnley; Michelle Smith; Ross Lawrenson; Garry Nixon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These datasets are concordance files that link the Geographic Classification for Health (GCH) to statistical geographies and geographic units commonly used in health research and analysis in Aotearoa New Zealand (NZ). More information about the development of the GCH is available in our Open Access publication. Our long-term aim is the comprehensive and accurate understanding of urban-rural variation in health outcomes and healthcare utilization at both national and regional levels. This is best achieved by the widespread uptake of the GCH by health researchers and health policy makers. The GCH is straightforward to use and most users will only need the relevant concordance file.
Statistical Area 1s (SA1s, small statistical areas which are the output geography for population data) were used as the building blocks for the Geographic Classification for Health (GCH) and are the preferred small areas when undertaking the analysis of health data using the GCH. It is however appreciated that a lot of health data is not available at the SA1 level and GCH concordance files are also available for Domicile (Census Area Units, CAU) and Statistical Area 2s (SA2) and Meshblock. The following concordance files are available in excel format:

SA12018_to_GCH2018.csv This concordance file applies a GCH category to each SA1 in NZ.

SA22018_to_GCH2018.csv This concordance file applies a GCH category to each SA2 in NZ.

MoH_HDOM_to_GCH2018.csv This concordance file applies a GCH category to each Domicile in NZ. Please read the additional information below if you plan to use this concordance file.

MoH_MB_to_GCH2018.csv This concordance file applies a GCH category to each Meshblock in NZ. Please read the additional information below if you plan to use this concordance file.

Additional information relating to geographic units used by the Ministry of Health:

MoH_HDOM_to_GCH2018.csv This file has been designed specifically to add GCH to the Ministry of Health (MoH) datasets containing Domicile codes. Use this file if your dataset contains only Domicile codes. If your dataset also contains Meshblock codes, then use the MoH Meshblock to GCH concordance file. This file includes 2006 and 2013 domicile codes. The 2013 domiciles are still current as of 2022, and this file will still work well with data outside those years. Domicile boundaries do not align well with SA1 boundaries, and longitudinal health data usually contains some older Domiciles which have been phased out and replaced with multiple smaller Domiciles. These deprecated Domiciles may overlap multiple SA1s. Usually, all such SA1s belong to the same GCH category. Occasionally, a Domicile will overlap more than one GCH category. When this happens, we have assigned the GCH category to which the majority of people living in that Domicile belong. By necessity, this will allocate a minority of people in those Domiciles to a GCH category to which they do not belong.
MoH_MB_to_GCH2018.csv This file has been designed specifically to add GCH to Ministry of Health (MoH) datasets containing Meshblock codes. This file includes 2018, 2013, 2006, and 2001 Meshblock codes, but will still work well with data outside those years. Meshblock boundaries from census 2018 fit perfectly and completely within the Statistics New Zealand Statistical Area 1s (SA1) boundaries on which GCH is based. However, longitudinal health data usually contains some older Meshblocks which have been phased out and replaced by multiple smaller Meshblocks. These deprecated Meshblocks may overlap multiple SA1s. Usually, all such SA1s belong to the same GCH category. Occasionally, a Meshblock will overlap more than one GCH category. When this happens, we have assigned the GCH category to which the majority of people living in that Meshblock belong. By necessity, this will allocate a minority of people in those Meshblocks to a GCH category to which they do not belong.
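The majority rule described above for deprecated Domiciles and Meshblocks can be sketched as follows. This is only an illustration of the stated rule, not the authors' code: the unit codes, category labels, and population counts are made up, and the input layout (one row per overlap between an old unit and a GCH category) is an assumption.

```python
from collections import defaultdict


def majority_category(overlaps):
    """Assign each old unit the category holding most of its population.

    overlaps: iterable of (unit_code, category, population) tuples, one per
    piece of an old unit that falls into a given category.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for unit, category, population in overlaps:
        totals[unit][category] += population
    # max() keeps the first category seen when populations tie.
    return {unit: max(cats, key=cats.get) for unit, cats in totals.items()}


# Hypothetical example: unit "D001" straddles two categories, "D002" does not.
overlaps = [
    ("D001", "U1", 900),
    ("D001", "R1", 150),
    ("D002", "R2", 40),
    ("D002", "R2", 60),
]
assigned = majority_category(overlaps)
print(assigned)
```

In this made-up example the 150 people in the "R1" part of "D001" end up labelled "U1", which is the misallocation the description warns about.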
Mental Health Care in the Last 4 Weeks
catalog.data.gov
healthdata.gov
+3 more
Updated Apr 23, 2025
+ more versions
Cite
Centers for Disease Control and Prevention (2025). Mental Health Care in the Last 4 Weeks [Dataset]. https://catalog.data.gov/dataset/mental-health-care-in-the-last-4-weeks
Explore at:
Dataset updated
Apr 23, 2025
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Description
The U.S. Census Bureau, in collaboration with five federal agencies, launched the Household Pulse Survey to produce data on the social and economic impacts of Covid-19 on American households. The Household Pulse Survey was designed to gauge the impact of the pandemic on employment status, consumer spending, food security, housing, education disruptions, and dimensions of physical and mental wellness. The survey was designed to meet the goal of accurate and timely weekly estimates. It was conducted by an internet questionnaire, with invitations to participate sent by email and text message. The sample frame is the Census Bureau Master Address File Data. Housing units linked to one or more email addresses or cell phone numbers were randomly selected to participate, and one respondent from each housing unit was selected to respond for him or herself. Estimates are weighted to adjust for nonresponse and to match Census Bureau estimates of the population by age, gender, race and ethnicity, and educational attainment. All estimates shown meet the NCHS Data Presentation Standards for Proportions.
Population Assessment of Tobacco and Health (PATH) Study [United States]...
icpsr.umich.edu
ascii, delimited, r +3
Updated Jun 27, 2025
Cite
Inter-university Consortium for Political and Social Research [distributor] (2025). Population Assessment of Tobacco and Health (PATH) Study [United States] Master Linkage Files [Dataset]. http://doi.org/10.3886/ICPSR38008.v18
Explore at:
Available download formats: sas, r, ascii, delimited, spss, stata
Unique identifier
https://doi.org/10.3886/ICPSR38008.v18
Dataset updated
Jun 27, 2025
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
License
https://www.icpsr.umich.edu/web/ICPSR/studies/38008/terms
Area covered
United States
Description
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who do and do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete the Youth Interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. 
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0001 (DS0001) contains the data from the Public-Use File Master Linkage File (PUF-MLF). This file contains 93 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Public-Use Files and Special Collection Public-Use Files. Dataset 0002 (DS0002) contains the data from the Restricted-Use File Master Linkage File (RUF-MLF). This file contains 198 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Restricted-Use Files, Special Collection Restricted-Use Files, and Biomarker Restricted-Use Files.
Evaluating Health Home Care Quality
kaggle.com
Updated Jan 23, 2023
Cite
The Devastator (2023). Evaluating Health Home Care Quality [Dataset]. https://www.kaggle.com/datasets/thedevastator/evaluating-health-home-care-quality
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Jan 23, 2023
Dataset provided by
Kaggle
Authors
The Devastator
Description
Evaluating Health Home Care Quality

CMS Core Set and Health Home SPA Measures

By Health Data New York [source]

About this dataset

This dataset provides measures for evaluating the quality of medical services provided to Medicaid beneficiaries by Health Homes, including the Centers for Medicare & Medicaid Services (CMS) Core Set and Health Home State Plan Amendment (SPA) measures. It gives insight into how well these health homes are performing in terms of delivering high-quality care. The data sources include the Medicaid Data Mart, QARR Member Level Files, and the New York State Delivery System Reform Incentive Payment (DSRIP) Program Data Warehouse. With this dataset you can explore indicators such as rates for indicators within the scope of the Core Set Measures; sub domains, domains, and measure descriptions; the age categories used; the denominator of each measure; and the level of significance for each indicator. Understanding the Health Home Quality Measures can help you make informed decisions about evidence-based health practices and promote better patient outcomes.

How to use the dataset

This dataset contains measures that evaluate the quality of care delivered by Health Homes for the Centers for Medicare & Medicaid Services (CMS). With this dataset, you can get an overview of how a health home is performing in terms of quality. You can use this data to compare different health homes and their respective service offerings.

The data used to create this dataset was collected from the Medicaid Data Mart, QARR Member Level Files, and the New York State Delivery System Reform Incentive Payment (DSRIP) Program Data Warehouse.

In order to use this dataset effectively, you should start by looking at the columns provided. These include: Measurement Year; Health Home Name; Domain; Sub Domain; Measure Description; Age Category; Denominator; Rate; Level of Significance; Indicator. Each column provides valuable insight into how a particular health home is performing in various measurements of healthcare quality.

When examining this data, remember that each measure reflects many variables and that results may have changed over time because of factors such as population or the financial resources available for healthcare delivery. Changes in policy may also affect performance, so take these factors into account when evaluating a health home from one year to the next or when comparing health homes on a specific measure or set of indicators.

Research Ideas

Using this dataset, state governments can evaluate the effectiveness of their health home programs by comparing the performance across different domains and subdomains.

Healthcare providers and organizations can use this data to identify areas for improvement in quality of care provided by health homes and strategies to reduce disparities between individuals receiving care from health homes.

Researchers can use this dataset to analyze how variations in cultural context, geography, demographics or other factors impact delivery of quality health home services across different locations

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

File: health-home-quality-measures-beginning-2013-1.csv

| Column name | Description |
|:--------------------------|:----------------------------------------------------|
| Measurement Year | The year in which the data was collected. (Integer) |
| Health Home Name | The name of the health home. (String) |
| Domain | The domain of the measure. (String) |
| Sub Domain | The sub domain of the measure. (String) |
| Measure Description | A description of the measure. (String) |
| Age Category | The age category of the patient. (String) |
| Denominator | The denominator of the measure. (Integer) |
| Rate | The rate of the measure. (Float) |
| Level of Significance | The level of significance of the measure. (String) |
| Indicator | The indicator of the measure. (String) |
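
As a starting point for comparing health homes, the sketch below averages the Rate column per Health Home Name using only the column names listed above. The sample rows are invented for illustration, and a plain average across different measures is a crude summary: it assumes the rates are comparable, which may not hold for the real data.

```python
import csv
import io
from collections import defaultdict

# Made-up rows that follow the documented column names; not real data.
SAMPLE = """Measurement Year,Health Home Name,Domain,Sub Domain,Measure Description,Age Category,Denominator,Rate,Level of Significance,Indicator
2013,Home A,Domain X,Sub X,Measure 1,All,100,50.0,,
2013,Home A,Domain X,Sub X,Measure 2,All,200,70.0,,
2013,Home B,Domain X,Sub X,Measure 1,All,50,40.0,,
"""


def mean_rate_by_home(csv_file):
    """Average the Rate column for each Health Home Name, skipping blanks."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        if row["Rate"] == "":
            continue  # no rate reported for this row
        sums[row["Health Home Name"]] += float(row["Rate"])
        counts[row["Health Home Name"]] += 1
    return {home: sums[home] / counts[home] for home in sums}


means = mean_rate_by_home(io.StringIO(SAMPLE))
print(means)
```

To run this against the real file, pass an open file handle for the CSV instead of the in-memory sample.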

Acknowledgements

...
Health Trends, Comprehensive download file for all geographies
open.canada.ca
csv
Updated Mar 9, 2022
+ more versions
Cite
Statistics Canada (2022). Health Trends, Comprehensive download file for all geographies [Dataset]. https://open.canada.ca/data/en/dataset/3ef254aa-519b-47d6-96ec-f0ba2e72e1dd
Explore at:
Available download formats: csv
Dataset updated
Mar 9, 2022
Dataset provided by
Statistics Canada
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Description
This product presents comparable time-series data for a range of health indicators from a number of sources including the Canadian Community Health Survey, Vital Statistics, and Canadian Cancer Registry.
Health Interview Survey, 1972 - Version 3
search.gesis.org
Updated May 7, 2021
+ more versions
Cite
United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics (2021). Health Interview Survey, 1972 - Version 3 [Dataset]. http://doi.org/10.3886/ICPSR08337.v3
Explore at:
Unique identifier
https://doi.org/10.3886/ICPSR08337.v3
Dataset updated
May 7, 2021
Dataset provided by
GESIS search
ICPSR - Interuniversity Consortium for Political and Social Research
Authors
United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics
License
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de456828
Description
Abstract (en): The purpose of the Health Interview Survey is to obtain information about the amount and distribution of illness, its effects in terms of disability and chronic impairments, and the kinds of health services people receive. There are five types of records in this core survey, each in a separate data file. The variables in the Household File (Part 1) include type of living quarters, size of family, number of families in the household, presence of a telephone, number of unrelated individuals, and region. The Person File (Part 2) includes information on sex, age, race, marital status, Hispanic origin, education, veteran status, family income, family size, major activities, health status, activity limits, employment status, and industry and occupation. These variables are found in the Condition, Doctor Visit, and Hospital Episode Files as well. The Person File also supplies data on height, weight, bed days, doctor visits, hospital stays, years at residence, and region variables. The Condition File (Part 3) contains information for each reported health condition, with specifics on injury and accident reports. The Hospital Episode File (Part 4) provides information on medical conditions, hospital episodes, type of service, type of hospital ownership, date of admission and discharge, number of nights in hospital, and operations performed. The Doctor Visit File (Part 5) documents doctor visits within the time period and identifies acute or chronic conditions. A sixth file has been added, along with the five core files. The Health Insurance File (Part 6) documents basic demographic information along with medical coverage and health insurance plans, as well as differentiates between hospital, doctor visit, and surgical insurance coverage. Civilian, noninstitutionalized population of the United States. A multistage probability sample was used in selecting housing units. 
2010-09-30 Frequencies and variable labels that were previously incorrect have been corrected.
2010-09-09 A technical error has been found and resolved in the processing procedure, in which defined file sets did not match subsequent data sets.
2010-09-02 SAS, SPSS, and Stata setup files have been added. Some corresponding documentation has been updated and pre-existing data files have been replaced. A sixth dataset has been added in place of the National Health Survey Procedure Documentation, which can now be found with all other corresponding and added documentation.
2006-01-18 File CB8337.PDF was removed from any previous datasets and flagged as a study-level file, so that it will accompany all downloads.
face-to-face interview
These data files contain weights that must be used in any analysis.
Per agreement with NCHS, ICPSR distributes the data files and text of the technical documentation for this collection as prepared by NCHS.
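To illustrate why the weights matter, the sketch below compares an unweighted and a weighted proportion. The records, the "saw_doctor" flag, and the "weight" field are hypothetical stand-ins; the actual variable names and weight definitions are in the NCHS documentation, and proper variance estimation for a multistage sample needs more than this.

```python
def weighted_proportion(records, flag_key, weight_key):
    """Share of the total weight carried by records where flag_key is true."""
    total = sum(r[weight_key] for r in records)
    if total == 0:
        raise ValueError("weights sum to zero")
    hit = sum(r[weight_key] for r in records if r[flag_key])
    return hit / total


# Made-up respondents: each weight is the number of people represented.
records = [
    {"saw_doctor": True, "weight": 1000.0},
    {"saw_doctor": False, "weight": 3000.0},
    {"saw_doctor": True, "weight": 1000.0},
]

unweighted = sum(r["saw_doctor"] for r in records) / len(records)
weighted = weighted_proportion(records, "saw_doctor", "weight")
print(unweighted, weighted)
```

Here two of three respondents report a visit, but they represent only 2,000 of the 5,000 people covered, so the weighted estimate is lower than the raw share.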
Medical Expenditure Panel Survey (MEPS) Household Component Public Use Files...
catalog.data.gov
healthdata.gov
+1 more
Updated Jul 26, 2023
+ more versions
Cite
Agency for Healthcare Research and Quality, Department of Health & Human Services (2023). Medical Expenditure Panel Survey (MEPS) Household Component Public Use Files [Dataset]. https://catalog.data.gov/dataset/medical-expenditure-panel-survey-household-component
Explore at:
Dataset updated
Jul 26, 2023
Dataset provided by
Agency for Healthcare Research and Quality (http://www.ahrq.gov/)
United States Department of Health and Human Services (http://www.hhs.gov/)
Description
The Medical Expenditure Panel Survey (MEPS) Household Component (HC) collects data from a sample of families and individuals in selected communities across the United States, drawn from a nationally representative subsample of households that participated in the prior year's National Health Interview Survey (conducted by the National Center for Health Statistics). During the household interviews, MEPS collects detailed information for each person in the household on the following: demographic characteristics, health conditions, health status, use of medical services, charges and source of payments, access to care, satisfaction with care, health insurance coverage, income, and employment. The panel design of the survey, which features several rounds of interviewing, makes it possible to determine how changes in respondents' health status, income, employment, eligibility for public and private insurance coverage, use of services, and payment for care are related. Public Use Files for Household data are available on the MEPS website.
Provider of Services File - Internet Quality Improvement and Evaluation...
catalog.data.gov
data.virginia.gov
+1 more
Updated Jul 17, 2025
+ more versions
Cite
Centers for Medicare & Medicaid Services (2025). Provider of Services File - Internet Quality Improvement and Evaluation System - Home Health Agency, Ambulatory Surgical Center, and Hospice Providers [Dataset]. https://catalog.data.gov/dataset/provider-of-services-file-internet-quality-improvement-and-evaluation-system-home-health-a
Explore at:
Dataset updated
Jul 17, 2025
Dataset provided by
Centers for Medicare & Medicaid Services
Description
The Provider of Services File (POS) - Internet Quality Improvement and Evaluation System (iQIES) - Home Health Agency (HHA), Ambulatory Surgical Center (ASC), and Hospice Providers data provides information on provider demographic and associated certification information. In this file you will find provider number (CMS Certification Number), name, address, and other characteristics of the participating institution providers.
AHRQ Social Determinants of Health Updated Database
datalumos.org
openicpsr.org
Updated Feb 25, 2025
Cite
AHRQ (2025). AHRQ Social Determinants of Health Updated Database [Dataset]. http://doi.org/10.3886/E220762V1
Explore at:
Unique identifier
https://doi.org/10.3886/E220762V1
Dataset updated
Feb 25, 2025
Dataset provided by
Agency for Healthcare Research and Quality (http://www.ahrq.gov/)
Authors
AHRQ
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
AHRQ's database on Social Determinants of Health (SDOH) was created under a project funded by the Patient Centered Outcomes Research (PCOR) Trust Fund. The purpose of this project is to create easy to use, easily linkable SDOH-focused data to use in PCOR research, inform approaches to address emerging health issues, and ultimately contribute to improved health outcomes. The database was developed to make it easier to find a range of well documented, readily linkable SDOH variables across domains without having to access multiple source files, facilitating SDOH research and analysis. Variables in the files correspond to five key SDOH domains: social context (e.g., age, race/ethnicity, veteran status), economic context (e.g., income, unemployment rate), education, physical infrastructure (e.g., housing, crime, transportation), and healthcare context (e.g., health insurance). The files can be linked to other data by geography (county, ZIP Code, and census tract). The database includes data files and codebooks by year at three levels of geography, as well as a documentation file. The data contained in the SDOH database are drawn from multiple sources and variables may have differing availability, patterns of missing, and methodological considerations across sources, geographies, and years. Users should refer to the data source documentation and codebooks, as well as the original data sources, to help identify these patterns.
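Linking by geography, as described above, amounts to a join on a shared geographic code. The sketch below joins two small in-memory tables on a county code; the field names ("county_fips", "unemployment_rate", "admissions") and all values are hypothetical, so check the codebooks for the real variable names.

```python
def link_by_county(sdoh_rows, other_rows, key="county_fips"):
    """Inner-join other_rows to sdoh_rows on a shared county code."""
    index = {row[key]: row for row in sdoh_rows}
    linked = []
    for row in other_rows:
        match = index.get(row[key])
        if match is not None:
            linked.append({**match, **row})
    return linked


# Codes are kept as strings so leading zeros survive (e.g. "01001").
sdoh_rows = [
    {"county_fips": "01001", "unemployment_rate": 3.1},
    {"county_fips": "06037", "unemployment_rate": 5.0},
]
outcome_rows = [
    {"county_fips": "01001", "admissions": 120},
    {"county_fips": "99999", "admissions": 7},
]

linked = link_by_county(sdoh_rows, outcome_rows)
print(linked)
```

Rows without a matching code are dropped here; counting them before discarding is a simple way to spot the availability gaps the description mentions.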
Medical Records Filing System Market Report | Global Forecast From 2025 To...
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). Medical Records Filing System Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/medical-records-filing-system-market
Explore at:
Available download formats: csv, pdf, pptx
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Medical Records Filing System Market Outlook

The Medical Records Filing System market size was estimated at USD 14.8 billion in 2023, and it is projected to reach USD 28.6 billion by 2032, growing at a compound annual growth rate (CAGR) of 7.8% from 2024 to 2032. The rising demand for efficient healthcare management systems and the increasing adoption of digital solutions are key drivers for this market growth.
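
For readers who want to check growth figures like these, the standard compound annual growth rate formula is (end / start) ** (1 / periods) - 1. The sketch below applies it to the two market sizes quoted above; the choice of period count (9 years from 2023, or 8 years from 2024) is an assumption, since the report does not state how its rate was derived.

```python
def cagr(start_value, end_value, periods):
    """Compound annual growth rate between two values over `periods` years."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1 / periods) - 1


# Figures quoted in the description: USD 14.8 billion (2023), 28.6 billion (2032).
rate_9 = cagr(14.8, 28.6, 9)  # assumes 9 compounding years, 2023 to 2032
rate_8 = cagr(14.8, 28.6, 8)  # assumes 8 compounding years, 2024 to 2032

print(f"{rate_9:.2%} {rate_8:.2%}")
```

Under these assumptions the implied rates come out at roughly 7.6% and 8.6%, so neither reproduces the quoted 7.8% exactly; the report may use a base value or period definition that is not given here.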

The medical records filing system market is experiencing significant growth primarily due to the increased emphasis on improving patient care through better data management. As healthcare systems globally strive for higher efficiency, the importance of maintaining accurate and accessible patient records has become paramount. The adoption of digital solutions, such as Electronic Health Records (EHRs), is accelerating due to their ability to streamline operations and reduce errors, leading to enhanced patient outcomes. Furthermore, legislations and regulations promoting data interoperability and secure patient information exchange are encouraging healthcare providers to upgrade their filing systems. This trend is expected to continue as healthcare institutions increasingly recognize the long-term cost benefits of digital recordkeeping systems.

Technological advancements are another significant growth driver for the medical records filing system market. Innovations in cloud computing, artificial intelligence (AI), and machine learning are transforming how patient data is stored, accessed, and analyzed. Cloud-based medical records systems offer scalable solutions that can be customized to meet the diverse needs of healthcare providers. AI and machine learning technologies, on the other hand, enable predictive analytics, helping healthcare providers make informed decisions. These technological advancements are not only enhancing the functionality of medical records filing systems but also providing a competitive edge to early adopters in the healthcare sector.

Another critical factor contributing to the market growth is the increasing prevalence of chronic diseases and the aging global population. As the number of patients with chronic conditions rises, so does the volume of medical data that needs to be managed. Efficient medical records filing systems are crucial for the ongoing management of these patients, ensuring that healthcare providers have timely access to comprehensive medical histories. This need is particularly acute in regions with older populations, where the demand for long-term care facilities and ongoing medical management is higher.

The concept of Medical Data Middle is increasingly becoming a focal point in the healthcare industry. As healthcare providers strive to enhance data management and patient care, the integration of a centralized data repository, or Medical Data Middle, facilitates seamless data sharing and interoperability. This approach not only improves the accessibility of patient information across various healthcare settings but also enhances the accuracy of diagnoses and treatment plans. By centralizing medical data, healthcare providers can ensure that patient records are up-to-date and comprehensive, leading to better-informed clinical decisions. The implementation of Medical Data Middle can also streamline administrative processes, reduce redundancies, and ultimately contribute to more efficient healthcare delivery systems.

Regionally, North America is expected to dominate the medical records filing system market, followed by Europe and the Asia Pacific. The high adoption rate of advanced healthcare technologies, well-established healthcare infrastructure, and favorable regulatory environment in North America are key factors driving the market in this region. Conversely, the Asia Pacific region is projected to witness the highest growth rate during the forecast period due to increasing healthcare expenditures, rising patient awareness, and government initiatives to digitize healthcare records. Markets in Latin America and the Middle East & Africa are also expected to grow, albeit at a slower pace, driven by improvements in healthcare infrastructure and increased investments in healthcare technology.

Product Type Analysis

The medical records filing system market can be segmented by product type into paper-based filing systems, electronic filing systems, and hybrid filing systems. Paper-based filing systems, while traditional, are becoming less popular due to their limitations in storage capacity and risk
Healthcare Data
caliper.com
cdf, dwg, dxf, gdb +9
Updated Jul 25, 2024
Cite
Caliper Corporation (2024). Healthcare Data [Dataset]. https://www.caliper.com/mapping-software-data/maptitude-healthcare-data.htm
Explore at:
Available download formats: sql server mssql, ntf, postgis, cdf, kmz, shp, kml, geojson, dwg, sdo, dxf, gdb, postgresql
Dataset updated
Jul 25, 2024
Dataset authored and provided by
Caliper Corporation (http://www.caliper.com/)
License
https://www.caliper.com/license/maptitude-license-agreement.htm
Time period covered
2024
Area covered
United States
Description
Healthcare Data for use with GIS mapping software, databases, and web applications are from Caliper Corporation and contain point geographic files of healthcare organizations, providers, and hospitals and a boundary file of Primary Care Service Areas.
Population Assessment of Tobacco and Health (PATH) Study [United States]...
icpsr.umich.edu
Updated Jun 27, 2025
+ more versions
Cite
Inter-university Consortium for Political and Social Research [distributor] (2025). Population Assessment of Tobacco and Health (PATH) Study [United States] Special Collection Restricted-Use Files [Dataset]. http://doi.org/10.3886/ICPSR37519.v13
Explore at:
Unique identifier
https://doi.org/10.3886/ICPSR37519.v13
Dataset updated
Jun 27, 2025
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
License
https://www.icpsr.umich.edu/web/ICPSR/studies/37519/terms
Area covered
United States
Description
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth, together with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1), make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled primary sampling units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7.
This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohorts who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide, which provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Wave 4.5 was a special data collection for youth only who were aged 12 to 17 at the time of the Wave 4.5 interview. Wave 4.5 was the fourth annual follow-up wave for those who were members of the Wave 1 Cohort. For those who were sampled at Wave 4, Wave 4.5 was the first annual follow-up wave. Wave 5.5, conducted in 2020, was a special data collection for Wave 4 Cohort youth and young adults ages 13 to 19 at the time of the Wave 5.5 interview. Also in 2020, a subsample of Wave 4 Cohort adults ages 20 and older were interviewed via the PATH Study Adult Telephone Survey (PATH-ATS). Wave 7.5 was a special collection for Wave 4 and Wave 7 Cohort youth and young adults ages 12 to 22 at the time of the Wave 7.5 interview. For those who were sampled at Wave 7, Wave 7.5 was the first annual follow-up wave. Dataset 1002 (DS1002) contains the data from the Wave 4.5 Youth and Parent Questionnaire. This file contains 1,617 variables and 13,131 cases. Of these cases, 11,378 are continuing youth having completed a prior Youth Interview. The other 1,753 cases are "aged-up youth" having previously been sampled as "shadow youth." Datasets 1112, 1212, and 1222 (DS1112, DS1212, and DS1222) are data files comprising the weight variables for Wave 4.5.
The "all-waves" weight file contains weights for participants in the Wave 1 Cohort who completed a Wave 4.5 Youth Interview and completed interviews (if old enough to do so) or verified their information with the study (if not old enough to be interviewed) in Waves 1, 2, 3, and 4. There are two separate files with "single wave" weights: one for the Wave 1 Cohort and one for the Wave 4 Cohort. The "single-wave" weight file for the Wave 1 Cohort contains weights for youth who c
PSYCHE-D: predicting change in depression severity using person-generated...
zenodo.org
data.niaid.nih.gov
bin, pdf
Updated Jul 18, 2024
Cite
Mariko Makhmutova; Raghu Kainkaryam; Marta Ferreira; Jae Min; Martin Jaggi; Ieuan Clay (2024). PSYCHE-D: predicting change in depression severity using person-generated health data (DATASET) [Dataset]. http://doi.org/10.5281/zenodo.5085146
Explore at:
Available download formats: pdf, bin
Unique identifier
https://doi.org/10.5281/zenodo.5085146
Dataset updated
Jul 18, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mariko Makhmutova; Raghu Kainkaryam; Marta Ferreira; Jae Min; Martin Jaggi; Ieuan Clay
Description
This dataset is made available under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). See LICENSE.pdf for details.

Dataset description

Parquet file, with:

35694 rows

154 columns

The file is indexed on [participant]_[month], such that 34_12 means month 12 from participant 34. All participant IDs have been replaced with randomly generated integers and the conversion table deleted.
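The index convention described above can be unpacked with a small helper. This is a minimal sketch, not code from the dataset authors: the function name `split_index` is hypothetical, and it assumes every key has exactly the `[participant]_[month]` form with integer parts.

```python
def split_index(key: str) -> tuple[int, int]:
    """Split a '[participant]_[month]' key into (participant, month) integers."""
    participant, month = key.split("_")
    return int(participant), int(month)


# The example key from the description: month 12 from participant 34.
participant_id, month = split_index("34_12")
print(participant_id, month)
```

Keeping participant and month as separate integers makes it straightforward to group rows by participant or sort them by month.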

Column names and explanations are included as a separate tab-delimited file. Detailed descriptions of feature engineering are available from the linked publications.

The file contains an aggregated, derived feature matrix describing person-generated health data (PGHD) captured as part of the DiSCover Project (https://clinicaltrials.gov/ct2/show/NCT03421223). This matrix focuses on individual changes in depression status over time, as measured by PHQ-9.

The DiSCover Project is a 1-year long longitudinal study consisting of 10,036 individuals in the United States, who wore consumer-grade wearable devices throughout the study and completed monthly surveys about their mental health and/or lifestyle changes, between January 2018 and January 2020.

The data subset used in this work comprises the following:

Wearable PGHD: step and sleep data from the participants’ consumer-grade wearable devices (Fitbit) worn throughout the study

Screener survey: prior to the study, participants self-reported socio-demographic information, as well as comorbidities

Lifestyle and medication changes (LMC) survey: every month, participants were requested to complete a brief survey reporting changes in their lifestyle and medication over the past month

Patient Health Questionnaire (PHQ-9) score: every 3 months, participants were requested to complete the PHQ-9, a 9-item questionnaire that has proven to be reliable and valid to measure depression severity

From these input sources we define a range of input features, both static (defined once, remain constant for all samples from a given participant throughout the study, e.g. demographic features) and dynamic (varying with time for a given participant, e.g. behavioral features derived from consumer-grade wearables).

The dataset contains a total of 35,694 rows, one for each month of data collection from each participant. From these rows we can generate 3-month long, non-overlapping, independent samples to capture changes in depression status over time with PGHD. We use the notation ‘SM0’ (sample month 0), ‘SM1’, ‘SM2’ and ‘SM3’ to refer to relative time points within each sample. Each 3-month sample consists of: PHQ-9 survey responses at SM0 and SM3, one set of screener survey responses, LMC survey responses at SM3 (as well as SM1, SM2, if available), and wearable PGHD for SM3 (and SM1, SM2, if available). The wearable PGHD includes data collected from 8 to 14 days prior to the PHQ-9 label generation date at SM3. Doing this generates a total of 10,866 samples from 4,036 unique participants.
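The windowing idea can be sketched for a single participant as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name `three_month_samples` and the `share_endpoint` switch are hypothetical, and whether the SM3 month of one sample may also serve as SM0 of the next is not specified in the description, so both readings are supported.

```python
def three_month_samples(phq9_months, share_endpoint=True):
    """Return (SM0, SM3) month pairs for one participant.

    phq9_months: months in which a PHQ-9 response exists.
    share_endpoint: assumption switch. If True, the SM3 month of one
    sample may be reused as SM0 of the next; if False, it may not.
    """
    available = set(phq9_months)
    samples = []
    next_allowed = None
    for sm0 in sorted(available):
        # Skip months that fall inside an already accepted sample.
        if next_allowed is not None and sm0 < next_allowed:
            continue
        sm3 = sm0 + 3
        if sm3 in available:
            samples.append((sm0, sm3))
            next_allowed = sm3 if share_endpoint else sm3 + 1
    return samples


# Hypothetical participant with a PHQ-9 response every 3 months.
print(three_month_samples([0, 3, 6, 9, 12]))
print(three_month_samples([0, 3, 6, 9, 12], share_endpoint=False))
```

Only the PHQ-9 endpoints are modeled here. Attaching the screener, LMC, and wearable features for SM1 to SM3 would be a separate join on the participant and month.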
County Health Ranking Dataset
kaggle.com
Updated Jul 10, 2023
Cite
Nikhil Narayan (2023). County Health Ranking Dataset [Dataset]. https://www.kaggle.com/datasets/nikhil7280/county-health-ranking-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 10, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nikhil Narayan
License
https://www.usa.gov/government-works/
Description
Basic Info:

The dataset represents the County Health Rankings of all states, taking into account various health factors. The County Health Rankings can be used to highlight regional variations in health, increase public understanding of the various factors that affect health, and inspire actions to improve community health. The Rankings capitalizes on our innate desire to compete by enabling comparisons across adjacent or comparable counties within states.

Dataset Information:

The CSV file contains the rankings and data details for the measures used in the 2022/23 County Health Rankings.
1) Outcomes and Factors Rankings --Ranks are all calculated and reported WITHIN states
2) Outcomes and Factors SubRankings --Ranks are all calculated and reported WITHIN states
3) Ranked Measure Data --The measures themselves are listed in bold.
4) Ranked Measure Sources & Years
5) Additional Measure Data --These are supplemental measures reported on the Rankings web site but not used in calculating the rankings.
6) Additional Measure Sources & Years

The data types of all columns are automatically set to "object". To convert them, use data.apply(pd.to_numeric).
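A slightly more defensive variant of that conversion is sketched below. By default `pd.to_numeric` raises on values it cannot parse, so columns holding text (county names, for example) may need to be left alone. The helper name and the example frame are hypothetical; the column names and values are made up for illustration and are not taken from the CSV.

```python
import pandas as pd


def to_numeric_where_possible(data: pd.DataFrame) -> pd.DataFrame:
    """Convert a column to numeric only when no existing value would be lost."""
    converted = data.copy()
    for column in converted.columns:
        as_number = pd.to_numeric(converted[column], errors="coerce")
        # errors="coerce" turns unparseable values into NaN; keep the
        # conversion only if it did not create any new NaN values.
        if as_number.notna().sum() == converted[column].notna().sum():
            converted[column] = as_number
    return converted


# Made-up example: one text column and one numeric column stored as strings.
example = pd.DataFrame({"County": ["Autauga", "Baldwin"], "Rank": ["12", "3"]})
result = to_numeric_where_possible(example)
print(result.dtypes)
```

Columns that mix numbers with text markers would stay as "object" under this rule; whether to coerce those to NaN instead is a judgment call that depends on the analysis.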
Hospital Annual Financial Data - Selected Data & Pivot Tables
data.chhs.ca.gov
data, doc, pdf, xls +2
Updated Apr 23, 2025
Cite
The citation is currently not available for this dataset.
Explore at:
Available download formats: xlsx(752914), xls(19577856), xlsx(754073), pdf(258239), xls(14657536), xlsx(782546), xlsx, pdf(383996), xls(51424256), xlsx(769128), xlsx(790979), xls(51554816), xlsx(771275), data, xlsx(768036), xls, xls(18445312), xlsx(758089), xls(19599360), doc, xlsx(763636), xls(16002048), zip, pdf(333268), pdf(303198), xlsx(777616), pdf(310420), xls(44967936), xls(18301440), xlsx(758376), xls(19625472), xlsx(750199), xlsx(779866), xls(920576)
Dataset updated
Apr 23, 2025
Dataset authored and provided by
Department of Health Care Access and Information
Description
On an annual basis (individual hospital fiscal year), individual hospitals and hospital systems report detailed facility-level data on services capacity, inpatient/outpatient utilization, patients, revenues and expenses by type and payer, balance sheet and income statement.

Due to the large size of the complete dataset, a selected set of data representing a wide range of commonly used data items, has been created that can be easily managed and downloaded. The selected data file includes general hospital information, utilization data by payer, revenue data by payer, expense data by natural expense category, financial ratios, and labor information.

There are two groups of data contained in this dataset: 1) Selected Data - Calendar Year: To make it easier to compare hospitals by year, hospital reports with report periods ending within a given calendar year are grouped together. The Pivot Tables for a specific calendar year are also found here. 2) Selected Data - Fiscal Year: Hospital reports with report periods ending within a given fiscal year (July-June) are grouped together.
Synthetic Dataset of Emergency Healthcare Services
figshare.com
csv
Updated Dec 12, 2024
Cite
Marco Ferreira (2024). Synthetic Dataset of Emergency Healthcare Services [Dataset]. http://doi.org/10.6084/m9.figshare.28012784.v1
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.6084/m9.figshare.28012784.v1
Dataset updated
Dec 12, 2024
Dataset provided by
figshare
Authors
Marco Ferreira
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset was generated using Simio simulation software. The simulations model patient flow in healthcare settings, capturing key metrics such as queue times, length of stay (LOS) for patients, and nurse utilization rates. Each CSV file contains time-series data, with measured variables including patient waiting times, resource utilization percentages, and service durations.

File Overview

CheckBloodPressure.csv (9 KB): Contains blood pressure Server records of patients.
CheckPatientType.csv (19 KB): Identifies the type of each patient (e.g., 1 or 3).
Fill_Information.csv (2 KB): Fill information records for new patients.
MedicalRecord1.csv (10 KB): Medical record dataset for patient type 1.
MedicalRecord2.csv (4 KB): Medical record dataset for patient type 2.
MedicalRecord3.csv (2 KB): Medical record dataset for patient type 3.
MedicalRecord4.csv (13 KB): Medical record dataset for patient type 4.
OutPatientDepartment.csv (18 KB): Data related to the satisfaction and length of stay of a given patient.
Triage.csv (13 KB): Data related to the triage process.
README.txt (4 KB): Documentation of the dataset, including structure, metadata, and usage.

Common Fields Across Files

Patient ID (Integer): Unique identifier for each patient.
Patient Type (Integer): Classification of patient (e.g., 1, 4).
Medical Records Arrival Time (DateTime): Timestamp of the patient's first arrival in the medical record department.
Exiting Time (DateTime): Timestamp when the patient exits a Server.
Waiting Time (min) (Real): Total waiting time before being attended to.
Resource Used (String): Resource (e.g., Operator) allocated to the patient.
Utilization % (Real): Utilization rate of the resource as a percentage.
Queue Count Before Processing (Integer): Number of patients in the queue before processing begins.
Queue Count After Processing (Integer): Number of patients in the queue after processing ends.
Queue Difference (Integer): Difference between the before and after queue counts.
Length of Stay (min) (Real): Total time spent in the simulation by the patient.
LOS without Queues (min) (Real): Length of stay excluding any queuing time.
Satisfaction % (Real): Patient satisfaction rating based on their experience.
New Patient? (String): Indicates if this is a new patient or a returning one.
HRSA Area Health Resources Files
knowledge.uchicago.edu
Updated 2025
Cite
Miranda-Izguerra; Kolak, Marynia (2025). HRSA Area Health Resources Files [Dataset]. https://knowledge.uchicago.edu/record/14541
Explore at:
Dataset updated
2025
Dataset provided by
University of Chicago
Authors
Miranda-Izguerra; Kolak, Marynia
Description
The Area Health Resources Files (AHRF), compiled by the Health Resources and Services Administration (HRSA), offer a comprehensive collection of data on health resources in the United States. These files integrate information from over 50 sources, providing extensive county-level, state-level, and national-level data. The dataset includes annual data releases, with files corresponding to the years 2020-2021, 2021-2022, and 2022-2023. Each release is accompanied by technical documentation and is available in various formats, including CSV and SAS.

Cite

US Department of Health and Human Services (2017). Health Insurance Marketplace [Dataset]. https://www.kaggle.com/datasets/hhs/health-insurance-marketplace

Health Insurance Marketplace

Explore health and dental plans data in the US Health Insurance Marketplace

Explore at:

Available download formats: zip(868821924 bytes)

Dataset updated

May 1, 2017

Dataset provided by

United States Department of Health and Human Services (http://www.hhs.gov/)

Authors

US Department of Health and Human Services

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

The Health Insurance Marketplace Public Use Files contain data on health and dental plans offered to individuals and small businesses through the US Health Insurance Marketplace.

Exploration Ideas

To help get you started, here are some data exploration ideas:

How do plan rates and benefits vary across states?
How do plan benefits relate to plan rates?
How do plan rates vary by age?
How do plans vary across insurance network providers?

See this forum thread for more ideas, and post there if you want to add your own ideas or answer some of the open questions!

Data Description

This data was originally prepared and released by the Centers for Medicare & Medicaid Services (CMS). Please read the CMS Disclaimer-User Agreement before using this data.

Here, we've processed the data to facilitate analytics. This processed version has three components:

1. Original versions of the data

The original versions of the 2014, 2015, and 2016 data are available in the "raw" directory of the download and in "../input/raw" on Kaggle Scripts. Search for "dictionaries" on this page to find the data dictionaries describing the individual raw files.

2. Combined CSV files that contain

In the top-level directory of the download ("../input" on Kaggle Scripts), there are six CSV files that contain the combined data across all years:

BenefitsCostSharing.csv
BusinessRules.csv
Network.csv
PlanAttributes.csv
Rate.csv
ServiceArea.csv

Additionally, there are two CSV files that facilitate joining data across years:

Crosswalk2015.csv - joining 2014 and 2015 data
Crosswalk2016.csv - joining 2015 and 2016 data

3. SQLite database

The "database.sqlite" file contains tables corresponding to each of the processed CSV files.

The code to create the processed version of this data is available on GitHub.
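One way to see which tables a SQLite file such as "database.sqlite" exposes is to query SQLite's built-in `sqlite_master` catalog. The sketch below is illustrative only: the helper name `list_tables` is hypothetical, and it runs against an in-memory stand-in with placeholder tables, because the actual table names and columns in the download are not stated here.

```python
import sqlite3


def list_tables(connection: sqlite3.Connection) -> list[str]:
    """Return the names of all tables in the connected SQLite database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


# In-memory stand-in so the example runs without the download.
# The table names mirror two of the CSV file names as an assumption.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Rate (placeholder TEXT)")
connection.execute("CREATE TABLE Network (placeholder TEXT)")
tables = list_tables(connection)
print(tables)
connection.close()
```

To inspect the real file, the same helper could be called on `sqlite3.connect("database.sqlite")`, with the path adjusted to wherever the download was extracted.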
