78 datasets found

De-identified Health Data Market Report
archivemarketresearch.com
doc, pdf, ppt
Updated Jan 22, 2025
Cite
Archive Market Research (2025). De-identified Health Data Market Report [Dataset]. https://www.archivemarketresearch.com/reports/de-identified-health-data-market-9104
Available download formats: ppt, pdf, doc
Dataset updated
Jan 22, 2025
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2025 - 2033
Area covered
global
Variables measured
Market Size
Description
Recent developments include: In February 2024, Veradigm published its first Veradigm Insights Report: Cardiovascular Conditions in 2024, analyzing de-identified real-world data from 53 million cardiovascular patients. The report assesses the prevalence of cardiovascular disease (CVD) and related conditions across all U.S. states, with demographic breakdowns based on age, ethnicity, and sex. In July 2021, Verana Health and Komodo Health partnered to integrate Komodo’s Healthcare Map into Verana’s de-identified EHR datasets, spanning over 325 million patient journeys. This collaboration aims to provide life sciences researchers with detailed insights into patient pathways, encompassing treatment histories, hospitalizations, and socioeconomic factors. The partnership is expected to enhance research efforts in ophthalmology, neurology, and urology by combining clinical outcomes with real-world patient data, supporting more informed treatment development. In September 2024, ICON announced a collaboration with Intel to utilize de-identified data from its clinical research platform alongside Intel's AI technology. This partnership enhances patient recruitment and streamlines clinical trial processes by deriving insights from de-identified patient data. The initiative aims to advance precision medicine and improve efficiencies in drug development and outcomes by integrating ICON's clinical trial expertise with Intel's AI capabilities.
Hospital Inpatient Discharges (SPARCS De-Identified): 2013
healthdata.gov
health.data.ny.gov
application/rdfxml +5
Updated Apr 8, 2025
Cite
health.data.ny.gov (2025). Hospital Inpatient Discharges (SPARCS De-Identified): 2013 [Dataset]. https://healthdata.gov/State/Hospital-Inpatient-Discharges-SPARCS-De-Identified/gbzd-5nff/data
Available download formats: application/rssxml, tsv, csv, json, xml, application/rdfxml
Dataset updated
Apr 8, 2025
Dataset provided by
health.data.ny.gov
Description
The Statewide Planning and Research Cooperative System (SPARCS) Inpatient De-identified File contains discharge level detail on patient characteristics, diagnoses, treatments, services, and charges. This data file contains basic record level detail for the discharge. The de-identified data file does not contain data that is protected health information (PHI) under HIPAA. The health information is not individually identifiable; all data elements considered identifiable have been redacted. For example, the direct identifiers regarding a date have the day and month portion of the date removed.
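
The date redaction described above (dropping the day and month and keeping only the year) can be illustrated with a minimal sketch. The function name and the sample date are hypothetical and are not taken from the SPARCS file or its documentation.

```python
from datetime import date


def redact_to_year(d: date) -> int:
    """Drop the day and month of a date, keeping only the year."""
    return d.year


# Hypothetical discharge date, used only for illustration.
sample_discharge = date(2013, 7, 19)
print(redact_to_year(sample_discharge))
```

Only the year survives, so two discharges in the same calendar year become indistinguishable by date.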
Open View - EMS Substance Use Response Incident Patients - Deidentified
data.virginia.gov
opendata.winchesterva.gov
csv
Updated Jun 3, 2025
Cite
Virginia Department of Health (2025). Open View - EMS Substance Use Response Incident Patients - Deidentified [Dataset]. https://data.virginia.gov/dataset/open-view-ems-substance-use-response-incident-patients-deidentified
Available download formats: csv (21727671)
Dataset updated
Jun 3, 2025
Dataset authored and provided by
Virginia Department of Health
Description
IMPORTANT NOTE: This provisional data is being provided as VDH OEMS continues to improve its data systems. The data on this page will continue to change throughout the data system improvement process and will stabilize over time. Thank you for your patience.

This dataset contains Emergency Medical Services (EMS) information for reported emergency response incidents that involve a substance or have suspected substance involvement. Data in this dataset has been provided by ESO on behalf of the Office of EMS.

Please be advised that the accuracy of the data within the EMS patient care reporting system is limited by system performance and the accuracy of data submissions received from EMS agencies. While each record in this dataset is for a single patient involved in an incident reported by an EMS agency, unique patients may be counted more than once in the dataset (e.g., if a patient was treated by two EMS agencies, that patient may be counted in the dataset twice). This data should not be interpreted as the number of unique substance use incidents reported by Virginia EMS agencies.

For instances where medication was administered to the patient, the response to the medication is provided, if reported by the EMS agency (e.g., if a patient received "naloxone" and the response of the patient for this administration of naloxone was reported as "Improved", then the record will show "naloxone with Improved response"). In instances where multiple medications were administered to the patient, the administrations and their associated responses are provided as a pipe-delimited list in the order that the patient received the medications.
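
As a rough illustration of the pipe-delimited medication field described above, the following sketch splits such a value into ordered (medication, response) pairs. The exact layout is an assumption inferred from the single example "naloxone with Improved response"; the function name, the regular expression, and the second sample entry ("oxygen with Unchanged response") are hypothetical and not taken from the dataset.

```python
import re
from typing import List, Optional, Tuple

# Assumed entry layout, inferred from the example in the description:
# "<medication> with <response> response"
_ENTRY = re.compile(r"^(?P<med>.+?) with (?P<resp>.+) response$")


def parse_medication_responses(field: str) -> List[Tuple[str, Optional[str]]]:
    """Split a pipe-delimited field into (medication, response) pairs, in order."""
    pairs: List[Tuple[str, Optional[str]]] = []
    for raw in field.split("|"):
        entry = raw.strip()
        if not entry:
            continue  # skip empty segments such as a trailing pipe
        match = _ENTRY.match(entry)
        if match:
            pairs.append((match.group("med"), match.group("resp")))
        else:
            # No reported response, or an unexpected layout: keep the raw text.
            pairs.append((entry, None))
    return pairs


# Hypothetical field value; only the first entry mirrors the documented example.
example = "naloxone with Improved response|oxygen with Unchanged response"
print(parse_medication_responses(example))
```

Keeping unmatched entries as raw text with a `None` response avoids silently dropping administrations whose response was not reported.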

This dataset has been classified as a Tier 0 asset by the Commonwealth Data Trust. Tier 0 classifies a data resource as information that is neither sensitive nor proprietary, and intended for public access.
De-identified Health Data Market Size & Forecast, 2025-2032
coherentmarketinsights.com
Updated Nov 18, 2024
Cite
Coherent Market Insights (2024). De-identified Health Data Market Size & Forecast, 2025-2032 [Dataset]. https://www.coherentmarketinsights.com/industry-reports/de-identified-health-data-market
Dataset updated
Nov 18, 2024
Dataset authored and provided by
Coherent Market Insights
License
https://www.coherentmarketinsights.com/privacy-policy
Time period covered
2025 - 2031
Area covered
Global
Description
De-identified Health Data Market holds a forecasted revenue of US$ 8.21 Bn in 2025 and is likely to cross US$ 15.31 Bn by 2032.
De-identified Data from the PArTNER Study: A Pragmatic Clinical Trial to...
indigo.uic.edu
csv
Updated May 5, 2025
Cite
Jerry Krishnan; Sai Dheeraj Illendula; Lynn Gerald; Jun Lu (2025). De-identified Data from the PArTNER Study: A Pragmatic Clinical Trial to Improve Patient Experience During Transitions from Hospital to Home [Dataset]. http://doi.org/10.25417/uic.28889918.v1
Available download formats: csv
Unique identifier
https://doi.org/10.25417/uic.28889918.v1
Dataset updated
May 5, 2025
Dataset provided by
University of Illinois Chicago
Authors
Jerry Krishnan; Sai Dheeraj Illendula; Lynn Gerald; Jun Lu
License
http://rightsstatements.org/vocab/InC/1.0/
Description
The PArTNER study was a single-center pragmatic randomized clinical trial conducted at a minority-serving hospital in Chicago. It evaluated whether a Navigator intervention, delivered by community health workers and peer coaches, could improve patient experience, health outcomes, and healthcare utilization during the transition from hospital to home among adults hospitalized with heart failure, pneumonia, myocardial infarction (MI), chronic obstructive pulmonary disease (COPD), or sickle cell disease. A total of 1,029 adults, predominantly non-Hispanic Black, participated. The intervention included in-hospital visits, a home visit, and follow-up telephone coaching. The primary outcomes were changes in anxiety and informational support at 30 days post-discharge. The study found no significant overall improvements compared to usual care, although exploratory analyses suggested potential benefits for certain subgroups.

Data Description:
The dataset includes de-identified information on participant demographics, clinical characteristics, social determinants of health, Patient-Reported Outcomes Measurement Information System (PROMIS) scores (e.g., anxiety, informational support), healthcare utilization outcomes (e.g., hospital readmissions, emergency department visits), and intervention engagement. Data were collected through baseline hospital assessments, telephone follow-up surveys at 30 and 60 days post-discharge, and electronic health record reviews.

Publications related to data:
LaBedz, Stephanie L., et al. "Pragmatic clinical trial to improve patient experience among adults during transitions from hospital to home: the PArTNER study." Journal of General Internal Medicine 37.16 (2022): 4103-4111.
Prieto-Centurion, Valentin, et al. "Design of the patient navigator to Reduce Readmissions (PArTNER) study: a pragmatic clinical effectiveness trial." Contemporary Clinical Trials Communications 15 (2019): 100420.
Hospital Inpatient Discharges (SPARCS De-Identified): 2022
healthdata.gov
health.data.ny.gov
application/rdfxml +5
Updated Apr 8, 2025
Cite
health.data.ny.gov (2025). Hospital Inpatient Discharges (SPARCS De-Identified): 2022 [Dataset]. https://healthdata.gov/State/Hospital-Inpatient-Discharges-SPARCS-De-Identified/2b9p-3w94
Available download formats: xml, csv, application/rdfxml, tsv, json, application/rssxml
Dataset updated
Apr 8, 2025
Dataset provided by
health.data.ny.gov
Description
The Statewide Planning and Research Cooperative System (SPARCS) Inpatient De-identified File contains discharge level detail on patient characteristics, diagnoses, treatments, services, and charges.

This data file contains basic record level detail for the discharge. The de-identified data file does not contain data that is protected health information (PHI) under HIPAA. The health information is not individually identifiable; all data elements considered identifiable have been redacted. For example, the direct identifiers regarding a date have the day and month portion of the date removed.

For more information visit: https://www.health.ny.gov/statistics/sparcs/
New York State Hospital De-Identified Data Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). New York State Hospital De-Identified Data Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/new-york-state-hospital-de-identified-data-data-package/
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
New York
Description
This data package provides patient-level hospital discharge data with basic record details. It does not include protected health information (PHI) and has been de-identified. The data is classified by Health Service Area and county.
Hospital Compare Inpatient Discharges Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Hospital Compare Inpatient Discharges Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/hospital-compare-inpatient-discharges-data-package/
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Description
This data package contains New York State-level discharge detail on patient characteristics, diagnoses, treatments, services, charges, and costs from 2009 to 2016.
supplementary data- deidentified colorectal data.xlsx
figshare.com
xlsx
Updated May 23, 2023
Cite
Whitnee Broyles (2023). supplementary data- deidentified colorectal data.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.23118239.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.23118239.v1
Dataset updated
May 23, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Whitnee Broyles
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a data set from Baylor University Medical Center's colon and rectal cancer patients with their pathologic and genetic data included.
2017Female
health.data.ny.gov
application/rdfxml +5
Updated Nov 24, 2020
Cite
New York State Department of Health (2020). 2017Female [Dataset]. https://health.data.ny.gov/dataset/2017Female/4ij4-sjp2
Available download formats: csv, tsv, json, application/rdfxml, xml, application/rssxml
Dataset updated
Nov 24, 2020
Authors
New York State Department of Health
Description
The Statewide Planning and Research Cooperative System (SPARCS) Inpatient De-identified File contains discharge level detail on patient characteristics, diagnoses, treatments, services, and charges. This data file contains basic record level detail for the discharge. The de-identified data file does not contain data that is protected health information (PHI) under HIPAA. The health information is not individually identifiable; all data elements considered identifiable have been redacted. For example, the direct identifiers regarding a date have the day and month portion of the date removed.
Hospital Inpatient Discharges (SPARCS De-Identified): 2010
healthdata.gov
health.data.ny.gov
application/rdfxml +5
Updated Apr 8, 2025
Cite
health.data.ny.gov (2025). Hospital Inpatient Discharges (SPARCS De-Identified): 2010 [Dataset]. https://healthdata.gov/State/Hospital-Inpatient-Discharges-SPARCS-De-Identified/2adj-zbc9/data
Available download formats: csv, tsv, application/rssxml, application/rdfxml, json, xml
Dataset updated
Apr 8, 2025
Dataset provided by
health.data.ny.gov
Description
The Statewide Planning and Research Cooperative System (SPARCS) Inpatient De-identified dataset contains discharge level detail on patient characteristics, diagnoses, treatments, services, charges, and costs. This data contains basic record level detail regarding the discharge; however, the data does not contain protected health information (PHI) under the Health Insurance Portability and Accountability Act (HIPAA). The health information is not individually identifiable; all data elements considered identifiable have been redacted. For example, the direct identifiers regarding a date have the day and month portion of the date removed.
MIMIC-IV-Note Dataset
paperswithcode.com
Updated Feb 24, 2025
Cite
(2025). MIMIC-IV-Note Dataset [Dataset]. https://paperswithcode.com/dataset/mimic-iv-note
Dataset updated
Feb 24, 2025
Description
The advent of large, open access text databases has driven advances in state-of-the-art model performance in natural language processing (NLP). The relatively limited amount of clinical data available for NLP has been cited as a significant barrier to the field's progress. Here we describe MIMIC-IV-Note: a collection of deidentified free-text clinical notes for patients included in the MIMIC-IV clinical database. MIMIC-IV-Note contains 331,794 deidentified discharge summaries from 145,915 patients admitted to the hospital and emergency department at the Beth Israel Deaconess Medical Center in Boston, MA, USA. The database also contains 2,321,355 deidentified radiology reports for 237,427 patients. All notes have had protected health information removed in accordance with the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. All notes are linkable to MIMIC-IV providing important context to the clinical data therein. The database is intended to stimulate research in clinical natural language processing and associated areas.
Data De-identification Software Market Report | Global Forecast From 2025 To...
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). Data De-identification Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-de-identification-software-market
Available download formats: csv, pdf, pptx
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Data De-identification Software Market Outlook

The global data de-identification software market size was valued at approximately USD 500 million in 2023 and is projected to reach around USD 1.5 billion by 2032, growing at a CAGR of 13.5% during the forecast period. The growth in this market is driven by the increasing need for data privacy and compliance with stringent regulatory requirements across various industries.
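
For readers who want to sanity-check growth figures like these, the standard compound annual growth rate relation, CAGR = (end / start)^(1 / years) - 1, can be evaluated directly. The sketch below is only an illustration: treating 2023 and 2032 as the endpoints (nine compounding periods) is an assumption, since the report states its values as approximate and does not say how the forecast period is counted.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


# Approximate figures from the description: USD 500 million (2023)
# growing to USD 1.5 billion (2032). The endpoint years are an assumption.
implied = cagr(500e6, 1.5e9, 2032 - 2023)
print(f"Implied CAGR: {implied:.1%}")
```

Under these assumptions the implied rate comes out near 13%, in the same range as the reported 13.5%; the gap is consistent with rounded market-size figures or a different choice of base year.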

The primary growth factor for the data de-identification software market is the rising awareness and concern regarding data privacy and security. With the advent of big data and the proliferation of digital services, organizations are increasingly recognizing the importance of protecting personal and sensitive information. Data breaches and cyber-attacks have led to significant financial and reputational damages, prompting businesses to invest in advanced data de-identification solutions to mitigate risks. Moreover, regulatory frameworks such as GDPR in Europe, CCPA in California, and HIPAA in the United States mandate strict compliance measures for data privacy, further propelling the demand for these software solutions.

Another significant driver is the growing adoption of cloud-based services and data analytics. As organizations migrate their data to cloud platforms, the need for robust data protection mechanisms becomes paramount. De-identification software enables companies to anonymize sensitive information before storing it in the cloud, ensuring compliance with data protection regulations and reducing the risk of exposure. Additionally, the rise of data analytics for business intelligence and decision-making necessitates the use of de-identified data to maintain privacy while extracting valuable insights.

The healthcare sector is particularly noteworthy for its substantial contribution to the market growth. The industry deals with large volumes of sensitive patient information that must be protected from unauthorized access. Data de-identification software plays a crucial role in enabling healthcare providers to share and analyze patient data for research and treatment purposes without compromising privacy. The COVID-19 pandemic has further accelerated the adoption of digital health solutions, increasing the demand for data de-identification tools to ensure compliance with privacy regulations and maintain patient trust.

Data Masking Technology is becoming increasingly vital as organizations strive to protect sensitive information while maintaining data utility. This technology allows businesses to create a realistic but fictional version of their data, ensuring that sensitive information is not exposed during processes such as software testing, development, and analytics. By substituting sensitive data with anonymized values, data masking technology helps organizations comply with data protection regulations without hindering their operational efficiency. As data privacy concerns continue to rise, the adoption of data masking technology is expected to grow, offering a robust solution for safeguarding sensitive information across various sectors.

Regionally, North America holds a significant share of the data de-identification software market, driven by the presence of key market players, stringent regulatory requirements, and a high level of digitalization across industries. The Asia Pacific region is expected to witness the fastest growth during the forecast period, attributed to the rapid adoption of digital technologies, increasing awareness of data privacy, and evolving regulatory landscape in countries like China, Japan, and India. Europe also plays a vital role due to the stringent data protection regulations enforced by the GDPR, which mandates rigorous data de-identification practices.

Component Analysis

By component, the data de-identification software market is segmented into software and services. The software segment is anticipated to dominate the market, driven by the increasing demand for advanced de-identification tools that can handle large volumes of data efficiently. Organizations are investing in sophisticated software solutions that offer automated and customizable de-identification processes to meet specific compliance requirements. These software solutions often come with features like encryption, tokenization, and data masking, enhancing their appeal to businesses across different sectors.

mimic-iii-clinical-database-demo-1.4
kaggle.com
Updated Apr 1, 2025
Cite
Montassar bellah (2025). mimic-iii-clinical-database-demo-1.4 [Dataset]. https://www.kaggle.com/datasets/montassarba/mimic-iii-clinical-database-demo-1-4
Dataset updated
Apr 1, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Montassar bellah
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Abstract

MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012 [1]. The MIMIC-III Clinical Database is available on PhysioNet (doi: 10.13026/C2XW26). Though deidentified, MIMIC-III contains detailed information regarding the care of real patients, and as such requires credentialing before access. To allow researchers to ascertain whether the database is suitable for their work, we have manually curated a demo subset, which contains information for 100 patients also present in the MIMIC-III Clinical Database. Notably, the demo dataset does not include free-text notes.

Background

In recent years there has been a concerted move towards the adoption of digital health record systems in hospitals. Despite this advance, interoperability of digital systems remains an open issue, leading to challenges in data integration. As a result, the potential that hospital data offers in terms of understanding and improving care is yet to be fully realized.

MIMIC-III integrates deidentified, comprehensive clinical data of patients admitted to the Beth Israel Deaconess Medical Center in Boston, Massachusetts, and makes it widely accessible to researchers internationally under a data use agreement. The open nature of the data allows clinical studies to be reproduced and improved in ways that would not otherwise be possible.

The MIMIC-III database was populated with data that had been acquired during routine hospital care, so there was no associated burden on caregivers and no interference with their workflow. For more information on the collection of the data, see the MIMIC-III Clinical Database page.

Methods

The demo dataset contains all intensive care unit (ICU) stays for 100 patients. These patients were selected randomly from the subset of patients in the dataset who eventually die. Consequently, all patients will have a date of death (DOD). However, patients do not necessarily die during an individual hospital admission or ICU stay.

This project was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because the project did not impact clinical care and all protected health information was deidentified.

Data Description

MIMIC-III is a relational database consisting of 26 tables. For a detailed description of the database structure, see the MIMIC-III Clinical Database page. The demo shares an identical schema, except all rows in the NOTEEVENTS table have been removed.

The data files are distributed in comma separated value (CSV) format following the RFC 4180 standard. Notably, string fields which contain commas, newlines, and/or double quotes are encapsulated by double quotes ("). Actual double quotes in the data are escaped using an additional double quote. For example, the string she said "the patient was notified at 6pm" would be stored in the CSV as "she said ""the patient was notified at 6pm""". More detail is provided on the RFC 4180 description page: https://tools.ietf.org/html/rfc4180
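
The quoting rule above can be checked with Python's standard `csv` module, whose default dialect doubles embedded quotes in the same way. This is a minimal sketch using the example sentence from the description; the variable names are illustrative only.

```python
import csv
import io

# The stored form from the example above, terminated by CRLF as in RFC 4180.
stored = '"she said ""the patient was notified at 6pm"""\r\n'

# Reading: the doubled quotes collapse back into single quote characters.
row = next(csv.reader(io.StringIO(stored, newline="")))
print(row[0])

# Writing: the default writer re-applies the same escaping.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow([row[0]])
print(buffer.getvalue() == stored)
```

Using a CSV parser rather than splitting lines on commas matters here, because fields may legitimately contain commas, newlines, and quotes.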

Usage Notes

The MIMIC-III demo provides researchers with an opportunity to review the structure and content of MIMIC-III before deciding whether or not to carry out an analysis on the full dataset.

CSV files can be opened natively using any text editor or spreadsheet program. However, some tables are large, and it may be preferable to navigate the data stored in a relational database. One alternative is to create an SQLite database using the CSV files. SQLite is a lightweight database format which stores all constituent tables in a single file, and SQLite databases interoperate well with a number of software tools.
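
One possible way to build such an SQLite database is sketched below with Python's standard `csv` and `sqlite3` modules. This is not the procedure used by the MIMIC authors: the helper name `load_csv_into_sqlite`, the stand-in file, and its two made-up rows are hypothetical, and every column is created without a declared type, so values are stored as text.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_csv_into_sqlite(csv_path: Path, conn: sqlite3.Connection, table: str) -> int:
    """Create `table` from the CSV header row and insert every data row."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
        cursor = conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', reader
        )
    conn.commit()
    return cursor.rowcount


# Tiny stand-in file with made-up rows; real use would point at the demo CSVs.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "PATIENTS.csv"
    sample.write_text("subject_id,gender\n1,F\n2,M\n", encoding="utf-8")
    conn = sqlite3.connect(":memory:")  # pass a file path instead to persist
    inserted = load_csv_into_sqlite(sample, conn, "PATIENTS")

print(inserted, conn.execute('SELECT * FROM "PATIENTS"').fetchall())
```

Calling the helper once per CSV file against a single on-disk connection would yield one SQLite file containing all tables, which can then be opened in a tool such as DB Browser for SQLite.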

DB Browser for SQLite is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite. We have found this tool to be useful for navigating SQLite files. Information regarding installation of the software and creation of the database can be found online: https://sqlitebrowser.org/

Release Notes

Release notes for the demo follow the release notes for the MIMIC-III database.

Acknowledgements

This research and development was supported by grants NIH-R01-EB017205, NIH-R01-EB001659, and NIH-R01-GM104987 from the National Institutes of Health. The authors would also like to thank Philips Healthcare and staff at the Beth Israel Deaconess Medical Center, Boston, for supporting database development, and Ken Pierce for providing ongoing support for the MIMIC research community.

Conflicts of Interest

The authors declare no competing financial interests.

References

Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Mo...
Hospital Inpatient Discharges (SPARCS De-Identified): Patient Safety...
healthdata.gov
application/rdfxml +5
Updated Apr 8, 2025
Cite
(2025). Hospital Inpatient Discharges (SPARCS De-Identified): Patient Safety Indicators (PSI) Area Measures by Patient County: Calendar Year 2015 - haga-w2c5 - Archive Repository [Dataset]. https://healthdata.gov/dataset/Hospital-Inpatient-Discharges-SPARCS-De-Identified/5kmx-k368
Available download formats: tsv, csv, xml, application/rdfxml, json, application/rssxml
Dataset updated
Apr 8, 2025
Description
This dataset tracks the updates made on the dataset "Hospital Inpatient Discharges (SPARCS De-Identified): Patient Safety Indicators (PSI) Area Measures by Patient County: Calendar Year 2015" as a repository for previous versions of the data and metadata.
MIMIC-IV v2.2 Dataset
paperswithcode.com
Updated Feb 24, 2025
Cite
(2025). MIMIC-IV v2.2 Dataset [Dataset]. https://paperswithcode.com/dataset/mimic-iv-v2-2
Dataset updated
Feb 24, 2025
Description
Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy. The Medical Information Mart for Intensive Care (MIMIC)-III database provided critical care data for over 40,000 patients admitted to intensive care units at the Beth Israel Deaconess Medical Center (BIDMC). Importantly, MIMIC-III was deidentified, and patient identifiers were removed according to the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. MIMIC-III has been integral in driving large amounts of research in clinical informatics, epidemiology, and machine learning. Here we present MIMIC-IV, an update to MIMIC-III, which incorporates contemporary data and improves on numerous aspects of MIMIC-III. MIMIC-IV adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
Hospital Inpatient Discharges (SPARCS De-Identified): Patient Safety...
healthdata.gov
application/rdfxml +5
Updated Apr 8, 2025
Cite
(2025). Hospital Inpatient Discharges (SPARCS De-Identified): Patient Safety Indicators (PSI) Composite Measures by Hospital: Beginning 2009 - dm65-yjif - Archive Repository [Dataset]. https://healthdata.gov/dataset/Hospital-Inpatient-Discharges-SPARCS-De-Identified/u9pk-e2ff
Available download formats: application/rssxml, application/rdfxml, xml, csv, tsv, json
Dataset updated
Apr 8, 2025
Description
This dataset tracks the updates made on the dataset "Hospital Inpatient Discharges (SPARCS De-Identified): Patient Safety Indicators (PSI) Composite Measures by Hospital: Beginning 2009" as a repository for previous versions of the data and metadata.
patient de-identified dataset
figshare.com
mdb
Updated May 30, 2023
Cite
Addmore Chadambuka (2023). patient de-identified dataset [Dataset]. http://doi.org/10.6084/m9.figshare.22769276.v1
Available download formats: mdb
Unique identifier
https://doi.org/10.6084/m9.figshare.22769276.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Addmore Chadambuka
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains the following variables: age, sex, employment status, marital status, baseline CD4 count, baseline viral load, tuberculosis infection, tuberculosis preventative therapy (TPT), and cotrimoxazole preventative therapy (CPT)
2015 de-identified NY inpatient discharge (SPARCS)
kaggle.com
Updated Jan 24, 2018
Cite
Jonas Almeida (2018). 2015 de-identified NY inpatient discharge (SPARCS) [Dataset]. https://www.kaggle.com/datasets/jonasalmeida/2015-deidentified-ny-inpatient-discharge-sparcs/data
Dataset updated
Jan 24, 2018
Dataset provided by
Kaggle
Authors
Jonas Almeida
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
New York
Description
Public Health Data

This is the public dataset made available at https://health.data.ny.gov/Health/Hospital-Inpatient-Discharges-SPARCS-De-Identified/82xm-y6g8 by the Dept of Health of New York state. The following description can be found at that page:

The Statewide Planning and Research Cooperative System (SPARCS) Inpatient De-identified File contains discharge level detail on patient characteristics, diagnoses, treatments, services, and charges. This data file contains basic record level detail for the discharge. The de-identified data file does not contain data that is protected health information (PHI) under HIPAA. The health information is not individually identifiable; all data elements considered identifiable have been redacted. For example, the direct identifiers regarding a date have the day and month portion of the date removed.
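The date redaction mentioned in the description can be pictured with a small sketch. This is only an illustration under assumptions: the field names `admission_date` and `discharge_date` and the sample record are hypothetical, not taken from the SPARCS file layout, and the real redaction process is not documented here.

```python
from datetime import date

# Hypothetical field names; the real SPARCS file uses its own column layout.
DATE_FIELDS = ("admission_date", "discharge_date")


def redact_dates(record):
    """Return a copy of the record with each date field reduced to its year."""
    redacted = dict(record)
    for field in DATE_FIELDS:
        value = redacted.get(field)
        if isinstance(value, date):
            # Dropping day and month leaves only the year.
            redacted[field] = value.year
    return redacted


# Made-up record for illustration only.
record = {
    "admission_date": date(2015, 3, 14),
    "discharge_date": date(2015, 3, 20),
    "age_group": "50 to 69",
}
redacted_record = redact_dates(record)
print(redacted_record)
```

The original record is left untouched, so the redacted copy can be compared against it during review.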

It would be nice to ...

... for example, be able to predict length of stay in the hospital using the parameters likely to be available when the patient is admitted.
mimic-iv-clinical-database-demo-2.2
kaggle.com
Updated Apr 1, 2025
+ more versions
Cite
Montassar bellah (2025). mimic-iv-clinical-database-demo-2.2 [Dataset]. https://www.kaggle.com/datasets/montassarba/mimic-iv-clinical-database-demo-2-2/data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 1, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Montassar bellah
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Abstract

The Medical Information Mart for Intensive Care (MIMIC)-IV database comprises deidentified electronic health records for patients admitted to the Beth Israel Deaconess Medical Center. Access to MIMIC-IV is limited to credentialed users. Here, we have provided an openly available demo of MIMIC-IV containing a subset of 100 patients. The dataset includes similar content to MIMIC-IV, but excludes free-text clinical notes. The demo may be useful for running workshops and for assessing whether MIMIC-IV is appropriate for a study before making an access request.

Background

The increasing adoption of digital electronic health records has led to the existence of large datasets that could be used to carry out important research across many areas of medicine. Research progress has been held back, however, by limitations in how the datasets are curated and made available for research. The MIMIC datasets allow credentialed researchers around the world unprecedented access to real-world clinical data, helping to reduce the barriers to conducting important medical research. The public availability of the data allows studies to be reproduced and collaboratively improved in ways that would not otherwise be possible.

Methods

First, the set of individuals to include in the demo was chosen. Each person in MIMIC-IV is assigned a unique subject_id. As the subject_id is randomly generated, ordering by subject_id results in a random subset of individuals. We only considered individuals with an anchor_year_group value of 2011 - 2013 or 2014 - 2016 to ensure overlap with MIMIC-CXR v2.0.0. The first 100 subject_id values that satisfied the anchor_year_group criteria were selected for the demo dataset.
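The selection rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `select_demo_subjects` and the sample rows are hypothetical, and only the `subject_id` and `anchor_year_group` columns come from the description.

```python
# Year groups named in the description as overlapping with MIMIC-CXR v2.0.0.
ALLOWED_GROUPS = {"2011 - 2013", "2014 - 2016"}


def select_demo_subjects(patients, n=100):
    """Return the first n subject_id values, in ascending order, whose
    anchor_year_group is one of the allowed groups."""
    eligible = [
        row["subject_id"]
        for row in patients
        if row["anchor_year_group"] in ALLOWED_GROUPS
    ]
    return sorted(eligible)[:n]


# Made-up rows standing in for the patients table.
patients = [
    {"subject_id": 1005, "anchor_year_group": "2011 - 2013"},
    {"subject_id": 1001, "anchor_year_group": "2014 - 2016"},
    {"subject_id": 1002, "anchor_year_group": "2008 - 2010"},
    {"subject_id": 1003, "anchor_year_group": "2011 - 2013"},
    {"subject_id": 1004, "anchor_year_group": "2017 - 2019"},
]
demo_ids = select_demo_subjects(patients, n=2)
print(demo_ids)
```

Because the identifiers are described as randomly generated, taking the lowest `n` eligible values behaves like a random draw while staying reproducible.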

All tables from MIMIC-IV were included in the demo dataset. Tables containing patient information, such as emar or labevents, were filtered using the list of selected subject_id values. Tables that do not contain patient-level information were included in their entirety (e.g. d_items or d_labitems). Note that all tables that do not contain patient-level information are prefixed with the characters 'd_'.
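A rough sketch of that filtering step follows, assuming each table is held in memory as a list of row dictionaries. The helper name `build_demo_tables` and the sample rows are hypothetical; the `d_` prefix rule and the table names `labevents` and `d_labitems` are taken from the description.

```python
def build_demo_tables(tables, demo_ids):
    """Filter patient-level tables to the selected subjects and copy
    dictionary tables (names starting with 'd_') unchanged."""
    keep = set(demo_ids)
    demo = {}
    for name, rows in tables.items():
        if name.startswith("d_"):
            # No patient-level information, so include every row.
            demo[name] = list(rows)
        else:
            demo[name] = [row for row in rows if row["subject_id"] in keep]
    return demo


# Made-up rows for illustration only.
tables = {
    "labevents": [
        {"subject_id": 1001, "itemid": 50912},
        {"subject_id": 1002, "itemid": 50912},
        {"subject_id": 1003, "itemid": 50931},
    ],
    "d_labitems": [
        {"itemid": 50912, "label": "example item A"},
        {"itemid": 50931, "label": "example item B"},
    ],
}
demo_tables = build_demo_tables(tables, demo_ids=[1001, 1003])
print({name: len(rows) for name, rows in demo_tables.items()})
```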

Deidentification was performed following the same approach as the MIMIC-IV database. Protected health information (PHI) as listed in the HIPAA Safe Harbor provision was removed. Patient identifiers were replaced using a random cipher, resulting in deidentified integer identifiers for patients, hospitalizations, and ICU stays. Stringent rules were applied to structured columns based on the data type. Dates were shifted consistently using a random integer removing seasonality, day of the week, and year information. Text fields were filtered by manually curated allow and block lists, as well as context-specific regular expressions. For example, columns containing dose values were filtered to only contain numeric values. If necessary, a free-text deidentification algorithm was applied to remove PHI from free-text. Results of this algorithm were manually reviewed and verified to remove identified PHI.
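To make the structured-column steps concrete, here is a simplified sketch of identifier replacement, consistent per-patient date shifting, and numeric-only filtering. It is an assumption-laden illustration rather than the MIMIC-IV procedure itself: the function names, the identifier range, the offset range, and the sample rows are all hypothetical.

```python
import random
from datetime import date, timedelta


def make_id_cipher(subject_ids, rng):
    """Map each original identifier to a distinct random integer."""
    new_ids = rng.sample(range(10_000_000, 20_000_000), k=len(subject_ids))
    return dict(zip(subject_ids, new_ids))


def make_date_offsets(subject_ids, rng, max_days=36_500):
    """Draw one day offset per patient; the range is an arbitrary assumption."""
    return {sid: rng.randint(1, max_days) for sid in subject_ids}


def deidentify(rows, cipher, offsets):
    """Replace identifiers and shift every date by the patient's own offset."""
    result = []
    for row in rows:
        shift = timedelta(days=offsets[row["subject_id"]])
        result.append({
            "subject_id": cipher[row["subject_id"]],
            "admittime": row["admittime"] + shift,
            "dischtime": row["dischtime"] + shift,
        })
    return result


def numeric_only(value):
    """Keep a dose value only if it parses as a number; otherwise blank it."""
    try:
        float(value)
    except (TypeError, ValueError):
        return None
    return value


# Made-up admissions: two stays for patient 1, one stay for patient 2.
rows = [
    {"subject_id": 1, "admittime": date(2015, 1, 10), "dischtime": date(2015, 1, 15)},
    {"subject_id": 1, "admittime": date(2015, 6, 1), "dischtime": date(2015, 6, 3)},
    {"subject_id": 2, "admittime": date(2016, 2, 2), "dischtime": date(2016, 2, 9)},
]
rng = random.Random(0)
ids = sorted({row["subject_id"] for row in rows})
cipher = make_id_cipher(ids, rng)
offsets = make_date_offsets(ids, rng)
shifted = deidentify(rows, cipher, offsets)
```

Using a single offset per patient keeps lengths of stay and the gaps between a patient's admissions intact, while the absolute dates no longer reveal the true year, season, or weekday.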

Data Description

MIMIC-IV is a relational database consisting of 26 tables. For a detailed description of the database structure, see the MIMIC-IV Clinical Database page [1] or the MIMIC-IV online documentation [2]. The demo shares an identical schema and structure to the equivalent version of MIMIC-IV.

Data files are distributed in comma-separated value (CSV) format following the RFC 4180 standard [3]. The dataset is also made available on Google BigQuery. Instructions for accessing the dataset on BigQuery are provided in the online MIMIC-IV documentation, under the cloud page [2].

An additional file is included: demo_subject_id.csv. This is a list of the subject_id used to filter MIMIC-IV to the demo subset.

Usage Notes

The MIMIC-IV demo provides researchers with the opportunity to better understand MIMIC-IV data.

CSV files can be opened natively using any text editor or spreadsheet program. However, as some tables are large, it may be preferable to navigate the data via a relational database. We suggest either working with the data in Google BigQuery (see the "Files" section for access details) or creating an SQLite database using the CSV files. SQLite is a lightweight database format which stores all constituent tables in a single file, and SQLite databases interoperate well with a number of software tools.
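One way to build such an SQLite database with only the Python standard library is sketched below. The helper name `load_csv_into_sqlite` is hypothetical, the tiny CSV is made up, and the column names other than `subject_id` are illustrative; an in-memory database is used here, whereas a file path passed to `sqlite3.connect` would produce the single-file database described above.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_csv_into_sqlite(conn, csv_path, table_name):
    """Create a table named after the CSV header and insert every row.
    Columns are declared without types, so values are stored as text."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        columns = ", ".join(f'"{name}"' for name in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({columns})')
        conn.executemany(f'INSERT INTO "{table_name}" VALUES ({marks})', reader)
    conn.commit()


with tempfile.TemporaryDirectory() as tmp:
    # Made-up stand-in for one of the demo CSV files.
    csv_path = Path(tmp) / "patients.csv"
    csv_path.write_text(
        "subject_id,gender,anchor_age\n1,F,52\n2,M,67\n", encoding="utf-8"
    )
    conn = sqlite3.connect(":memory:")
    load_csv_into_sqlite(conn, csv_path, "patients")
    row_count = conn.execute("SELECT COUNT(*) FROM patients").fetchone()[0]
    # Cast explicitly because the loader stores every value as text.
    oldest = conn.execute(
        "SELECT MAX(CAST(anchor_age AS INTEGER)) FROM patients"
    ).fetchone()[0]

print(row_count, oldest)
```

Calling the loader once per CSV file, with the table name taken from the file name, would give one table per file in the same database.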

Code is made available for use with MIMIC-IV on the MIMIC-IV code repository [4]. Code provided includes derivation of clinical concepts, tutorials, and reproducible analyses.

Release Notes

Release notes for the demo follow the release notes for the MIMIC-IV database.

Ethics

This project was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because the pr...
