39 datasets found

Hospital Admissions Data
kaggle.com
zip
Updated Jan 21, 2022
Cite
Ashish Sahani (2022). Hospital Admissions Data [Dataset]. https://www.kaggle.com/datasets/ashishsahani/hospital-admissions-data
Explore at:
Available download formats: zip (522833 bytes)
Dataset updated
Jan 21, 2022
Authors
Ashish Sahani
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This dataset is provided under a Creative Commons license, Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/

Context

This data was collected from patients admitted over a period of two years (1 April 2017 to 31 March 2019) at Hero DMC Heart Institute, Unit of Dayanand Medical College and Hospital, Ludhiana, Punjab, India. This is a tertiary care medical college and hospital. During the study period, the cardiology unit had 14,845 admissions corresponding to 12,238 patients, of whom 1,921 had multiple admissions.

Specifically, the data relate to patients; date of admission; date of discharge; demographics, such as age, sex, locality (rural or urban); type of admission (emergency or outpatient); patient history, including smoking, alcohol, diabetes mellitus (DM), hypertension (HTN), prior coronary artery disease (CAD), prior cardiomyopathy (CMP), and chronic kidney disease (CKD); and lab parameters corresponding to hemoglobin (HB), total lymphocyte count (TLC), platelets, glucose, urea, creatinine, brain natriuretic peptide (BNP), raised cardiac enzymes (RCE) and ejection fraction (EF). Other comorbidities and features (28 features), including heart failure, STEMI, and pulmonary embolism, were recorded and analyzed.

Shock was defined as systolic blood pressure < 90 mmHg where the cause of shock was anything other than cardiac. Patients in shock due to cardiac reasons were classified as cardiogenic shock. Patients in shock due to multifactorial pathophysiology (cardiac and non-cardiac) were counted in both categories. The outcome, indicating whether the patient was discharged or expired in the hospital, was also recorded.
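
The shock rule above can be read as a small decision procedure. The following is only a sketch of that reading, not the authors' code: the function name, the two boolean cause flags, and the category labels are hypothetical, and it assumes the same < 90 mmHg threshold also applies to cardiogenic shock, which the description does not state explicitly.

```python
from typing import Set

# Threshold quoted in the dataset description.
SBP_SHOCK_THRESHOLD_MMHG = 90


def shock_categories(systolic_bp: float, cardiac_cause: bool,
                     non_cardiac_cause: bool) -> Set[str]:
    """Return the shock categories for one admission (possibly empty).

    Hypothetical helper: the labels "shock" and "cardiogenic_shock" are
    illustrative and are not column names taken from the dataset.
    """
    categories: Set[str] = set()
    if systolic_bp >= SBP_SHOCK_THRESHOLD_MMHG:
        return categories  # not hypotensive, so no shock category applies
    if non_cardiac_cause:
        categories.add("shock")
    if cardiac_cause:
        categories.add("cardiogenic_shock")
    return categories
```

A multifactorial case (both flags set) ends up in both categories, which matches the description's statement that such patients were counted twice.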

Further details about this dataset can be found here: https://doi.org/10.3390/diagnostics12020241

If you use this dataset in academic research, all publications arising out of it must cite the following paper: Bollepalli, S.C.; Sahani, A.K.; Aslam, N.; Mohan, B.; Kulkarni, K.; Goyal, A.; Singh, B.; Singh, G.; Mittal, A.; Tandon, R.; Chhabra, S.T.; Wander, G.S.; Armoundas, A.A. An Optimized Machine Learning Model Accurately Predicts In-Hospital Outcomes at Admission to a Cardiac Unit. Diagnostics 2022, 12, 241. https://doi.org/10.3390/diagnostics12020241

If you intend to use this data for commercial purposes, explicit written permission is required from the data providers.

Content

table_headings.csv has explanatory names of all columns.

Acknowledgements

Data was collected from Hero Dayanand Medical College Heart Institute, Unit of Dayanand Medical College and Hospital, Ludhiana, Punjab, India.

Inspiration

For any questions about the data or collaborations please contact ashish.sahani@iitrpr.ac.in
Hospital Emergency Dataset
kaggle.com
zip
Updated Jan 30, 2025
Cite
Xavier Berge (2025). Hospital Emergency Dataset [Dataset]. https://www.kaggle.com/datasets/xavierberge/hospital-emergency-dataset
Explore at:
Available download formats: zip (228798 bytes)
Dataset updated
Jan 30, 2025
Authors
Xavier Berge
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset encapsulates comprehensive patient information collected from a hospital emergency room (ER) dashboard. It serves as a valuable resource for healthcare analytics, focused on understanding patient demographics, treatment outcomes, and operational efficiency within emergency departments.
Hospital Annual Financial Data - Selected Data & Pivot Tables
data.chhs.ca.gov
data.ca.gov
+4more
csv, data, doc, html +5
Updated Oct 8, 2025
Cite
Department of Health Care Access and Information (2025). Hospital Annual Financial Data - Selected Data & Pivot Tables [Dataset]. https://data.chhs.ca.gov/dataset/hospital-annual-financial-data-selected-data-pivot-tables
Explore at:
Available download formats: xlsx, xlsx(754073), pdf(333268), xlsx(758376), xlsx(769128), xls(19599360), xlsx(770931), pdf(303198), xlsx(779866), xls(51424256), pdf(121968), xlsx(765216), csv(205488092), xls(18301440), html, xlsx(756356), xls(14657536), xlsx(768036), zip, xlsx(752914), xlsx(763636), xls(19650048), xlsx(791201), xlsm(1360350), xlsx(783155), xls, xls(18445312), pdf(310420), pdf(383996), xls(44967936), data, xlsx(750199), doc, xlsx(14714368), xlsx(777616), xls(51554816), xls(44933632), xlsx(758089), xls(920576), pdf(258239), xlsx(770375), xls(16002048), xls(19577856), xlsm(1369828), xlsx(780332)
Dataset updated
Oct 8, 2025
Dataset authored and provided by
Department of Health Care Access and Information
Description
On an annual basis (individual hospital fiscal year), individual hospitals and hospital systems report detailed facility-level data on services capacity, inpatient/outpatient utilization, patients, revenues and expenses by type and payer, balance sheet and income statement.

Due to the large size of the complete dataset, a selected set of data representing a wide range of commonly used data items has been created that can be easily managed and downloaded. The selected data file includes general hospital information, utilization data by payer, revenue data by payer, expense data by natural expense category, financial ratios, and labor information.

There are two groups of data contained in this dataset: 1) Selected Data - Calendar Year: To make it easier to compare hospitals by year, hospital reports with report periods ending within a given calendar year are grouped together. The Pivot Tables for a specific calendar year are also found here. 2) Selected Data - Fiscal Year: Hospital reports with report periods ending within a given fiscal year (July-June) are grouped together.
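
The two groupings can be illustrated with a small sketch that assigns a report to a calendar-year group and to a July-June fiscal-year group from its report period end date. The function names and the "YYYY-YYYY" fiscal label are assumptions made for illustration; the source does not say how the published files label fiscal years.

```python
from datetime import date


def calendar_year_group(report_period_end: date) -> int:
    """Calendar year in which the report period ends."""
    return report_period_end.year


def fiscal_year_group(report_period_end: date) -> str:
    """July-June fiscal year containing the end date.

    The 'start-end' label format is an assumption, not the dataset's own.
    """
    if report_period_end.month >= 7:
        start_year = report_period_end.year
    else:
        start_year = report_period_end.year - 1
    return f"{start_year}-{start_year + 1}"


# A report ending 30 June 2023 and one ending 31 December 2023 share a
# calendar-year group but fall into different fiscal-year groups.
june_report = date(2023, 6, 30)
december_report = date(2023, 12, 31)
print(calendar_year_group(june_report), fiscal_year_group(june_report))
print(calendar_year_group(december_report), fiscal_year_group(december_report))
```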
Hospital Database Management System SQL Project
kaggle.com
zip
Updated May 9, 2024
Cite
Andrew Dolcimascolo-Garrett (2024). Hospital Database Management System SQL Project [Dataset]. https://www.kaggle.com/datasets/andrewdolcigarrett/hospital-database-management-system-sql-project
Explore at:
Available download formats: zip (1487278 bytes)
Dataset updated
May 9, 2024
Authors
Andrew Dolcimascolo-Garrett
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset

This dataset was created by Andrew Dolcimascolo-Garrett

Released under MIT

Contents
Open Database of Healthcare Facilities
open.canada.ca
catalogue.arctic-sdi.org
csv, esri rest +4
Updated Mar 2, 2022
+ more versions
Cite
Statistics Canada (2022). Open Database of Healthcare Facilities [Dataset]. https://open.canada.ca/data/en/dataset/a1bcd4ee-8e57-499b-9c6f-94f6902fdf32
Explore at:
Available download formats: fgdb/gdb, esri rest, csv, html, pdf, wms
Dataset updated
Mar 2, 2022
Dataset provided by
Statistics Canada
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Description
The Open Database of Healthcare Facilities (ODHF) is a collection of open data containing the names, types, and locations of health facilities across Canada. It is released under the Open Government License - Canada. The ODHF compiles open, publicly available, and directly-provided data on health facilities across Canada. Data sources include regional health authorities, provincial, territorial and municipal governments, and public health and professional healthcare bodies. This database aims to provide enhanced access to a harmonized listing of health facilities across Canada by making them available as open data. This database is a component of the Linkable Open Data Environment (LODE).
Healthcare Management System
kaggle.com
zip
Updated Dec 23, 2023
Cite
Anouska Abhisikta (2023). Healthcare Management System [Dataset]. https://www.kaggle.com/datasets/anouskaabhisikta/healthcare-management-system
Explore at:
Available download formats: zip (74279 bytes)
Dataset updated
Dec 23, 2023
Authors
Anouska Abhisikta
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Patients Table:

PatientID: Unique identifier for each patient.

firstname: First name of the patient.

lastname: Last name of the patient.

email: Email address of the patient.

This table stores information about individual patients, including their names and contact details.

Doctors Table:

DoctorID: Unique identifier for each doctor.

DoctorName: Full name of the doctor.

Specialization: Area of medical specialization.

DoctorContact: Contact details of the doctor.

This table contains details about healthcare providers, including their names, specializations, and contact information.

Appointments Table:

AppointmentID: Unique identifier for each appointment.

Date: Date of the appointment.

Time: Time of the appointment.

PatientID: Foreign key referencing the Patients table, indicating the patient for the appointment.

DoctorID: Foreign key referencing the Doctors table, indicating the doctor for the appointment.

This table records scheduled appointments, linking patients to doctors.

MedicalProcedure Table:

ProcedureID: Unique identifier for each medical procedure.

ProcedureName: Name or description of the medical procedure.

AppointmentID: Foreign key referencing the Appointments table, indicating the appointment associated with the procedure.

This table stores details about medical procedures associated with specific appointments.

Billing Table:

InvoiceID: Unique identifier for each billing transaction.

PatientID: Foreign key referencing the Patients table, indicating the patient for the billing transaction.

Items: Description of items or services billed.

Amount: Amount charged for the billing transaction.

This table maintains records of billing transactions, associating them with specific patients.

demo Table:

ID: Primary key, serves as a unique identifier for each record.

Name: Name of the entity.

Hint: Additional information or hint about the entity.

This table appears to be a demonstration or testing table, possibly unrelated to the healthcare management system.

This dataset schema is designed to capture comprehensive information about patients, doctors, appointments, medical procedures, and billing transactions in a healthcare management system. Adjustments can be made based on specific requirements, and additional attributes can be included as needed.
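
The table descriptions above can be sketched as a relational schema. The example below uses Python's built-in sqlite3 module to create the five healthcare tables and join an appointment to its patient, doctor, and procedure. The column types, the sample rows, and the helper names are assumptions for illustration only; the listing gives column names and key relationships but no types, and the demo table is left out.

```python
import sqlite3

# Column names follow the listing; types are assumed.
SCHEMA = """
CREATE TABLE Patients (
    PatientID INTEGER PRIMARY KEY,
    firstname TEXT,
    lastname  TEXT,
    email     TEXT
);
CREATE TABLE Doctors (
    DoctorID       INTEGER PRIMARY KEY,
    DoctorName     TEXT,
    Specialization TEXT,
    DoctorContact  TEXT
);
CREATE TABLE Appointments (
    AppointmentID INTEGER PRIMARY KEY,
    "Date"        TEXT,
    "Time"        TEXT,
    PatientID     INTEGER REFERENCES Patients(PatientID),
    DoctorID      INTEGER REFERENCES Doctors(DoctorID)
);
CREATE TABLE MedicalProcedure (
    ProcedureID   INTEGER PRIMARY KEY,
    ProcedureName TEXT,
    AppointmentID INTEGER REFERENCES Appointments(AppointmentID)
);
CREATE TABLE Billing (
    InvoiceID INTEGER PRIMARY KEY,
    PatientID INTEGER REFERENCES Patients(PatientID),
    Items     TEXT,
    Amount    REAL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one made-up row per table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Patients VALUES (1, 'Ada', 'Example', 'ada@example.org')")
    conn.execute("INSERT INTO Doctors VALUES (1, 'Dr. Sample', 'Cardiology', '555-0100')")
    conn.execute("INSERT INTO Appointments VALUES (1, '2024-01-15', '09:30', 1, 1)")
    conn.execute("INSERT INTO MedicalProcedure VALUES (1, 'Blood test', 1)")
    conn.execute("INSERT INTO Billing VALUES (1, 1, 'Blood test', 40.0)")
    return conn


def appointments_with_procedures(conn: sqlite3.Connection) -> list:
    """Join each appointment to its patient, doctor, and any procedure."""
    query = """
        SELECT p.firstname || ' ' || p.lastname, d.DoctorName, a."Date", m.ProcedureName
        FROM Appointments AS a
        JOIN Patients AS p ON p.PatientID = a.PatientID
        JOIN Doctors AS d ON d.DoctorID = a.DoctorID
        LEFT JOIN MedicalProcedure AS m ON m.AppointmentID = a.AppointmentID
        ORDER BY a.AppointmentID
    """
    return conn.execute(query).fetchall()


conn = build_demo_db()
rows = appointments_with_procedures(conn)
```

The LEFT JOIN keeps appointments that have no recorded procedure, which seems the safer default when exploring the real files.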
EHR Dataset for Patient Treatment Classification
data.mendeley.com
Updated May 10, 2020
+ more versions
Cite
Mujiono Sadikin (2020). EHR Dataset for Patient Treatment Classification [Dataset]. http://doi.org/10.17632/7kv3rctx7m.1
Explore at:
Unique identifier
https://doi.org/10.17632/7kv3rctx7m.1
Dataset updated
May 10, 2020
Authors
Mujiono Sadikin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset consists of electronic health records collected from a private hospital in Indonesia. It contains patients' laboratory test results, used to determine whether the patient's next treatment should be as an inpatient (in care) or an outpatient (out care). The task associated with the dataset is classification.
Hospital Inpatient - Characteristics by Facility (Pivot Profile)
data.chhs.ca.gov
data.ca.gov
+2more
.xlsx, xls, xlsx, zip
Updated Nov 7, 2025
Cite
Department of Health Care Access and Information (2025). Hospital Inpatient - Characteristics by Facility (Pivot Profile) [Dataset]. https://data.chhs.ca.gov/dataset/hospital-inpatient-characteristics-by-facility-pivot-profile
Explore at:
Available download formats: xls, xlsx, xlsx(1778842), xlsx(1736211), xlsx(1736990), xlsx(1762190), xlsx(1740830), xlsx(1730937), .xlsx(1724148), zip
Dataset updated
Nov 7, 2025
Dataset authored and provided by
Department of Health Care Access and Information
Description
This dataset contains annual Excel pivot tables that display summaries of the inpatients treated in each hospital. The summary data include discharges, discharge days, average length of stay, age groups, race groups, sex, expected payer, type of care, do not resuscitate orders, admission source, admission type, discharge disposition, principal diagnosis groups, principal procedure groups, and principal external cause of injury/morbidity groups. The data can also be summarized statewide or for a specific hospital county, bed size grouping, and/or type of control.
Data from: Clinical Dataset
kaggle.com
zip
Updated Oct 5, 2023
+ more versions
Cite
Mohamadreza Momeni (2023). Clinical Dataset [Dataset]. https://www.kaggle.com/datasets/imtkaggleteam/clinical-dataset
Explore at:
Available download formats: zip (16220 bytes)
Dataset updated
Oct 5, 2023
Authors
Mohamadreza Momeni
Description
This is the purest type of electronic clinical data, obtained at the point of care at a medical facility, hospital, clinic, or practice. Often referred to as the electronic medical record (EMR), it is generally not available to outside researchers. The data collected include administrative and demographic information, diagnosis, treatment, prescription drugs, laboratory tests, physiologic monitoring data, hospitalization, patient insurance, etc.

Individual organizations such as hospitals or health systems may provide access to internal staff. Larger collaborations, such as the NIH Collaboratory Distributed Research Network, provide mediated or collaborative access to clinical data repositories by eligible researchers. Additionally, the UW De-identified Clinical Data Repository (DCDR) and the Stanford Center for Clinical Informatics allow for initial cohort identification.

About Dataset:

333 scholarly articles cite this dataset.

Unique identifier: DOI

Dataset updated: 2023

Authors: Haoyang Mi

This dataset contains two datasets:

1- Clinical Data_Discovery_Cohort: Name of columns: Patient ID Specimen date Dead or Alive Date of Death Date of last Follow Sex Race Stage Event Time

2- Clinical_Data_Validation_Cohort Name of columns: Patient ID Survival time (days) Event Tumor size Grade Stage Age Sex Cigarette Pack per year Type Adjuvant Batch EGFR KRAS

Feel free to share your thoughts and analysis in a notebook for this dataset. You can also create some interesting and valuable ML projects with it. Thanks for your attention.
MIMIC-III Clinical Database
physionet.org
oppositeofnorth.com
Updated Sep 4, 2016
Cite
Alistair Johnson; Tom Pollard; Roger Mark (2016). MIMIC-III Clinical Database [Dataset]. http://doi.org/10.13026/C2XW26
Explore at:
Unique identifier
https://doi.org/10.13026/C2XW26
Dataset updated
Sep 4, 2016
Authors
Alistair Johnson; Tom Pollard; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (including post-hospital discharge). MIMIC supports a diverse range of analytic studies spanning epidemiology, clinical decision-rule improvement, and electronic tool development. It is notable for three factors: it is freely available to researchers worldwide; it encompasses a diverse and very large population of ICU patients; and it contains highly granular data, including vital signs, laboratory results, and medications.
HCUP Nationwide Readmissions Database (NRD)- Restricted Access Files
catalog.data.gov
data.virginia.gov
+2more
Updated Jul 26, 2023
Cite
Agency for Healthcare Research and Quality, Department of Health & Human Services (2023). HCUP Nationwide Readmissions Database (NRD)- Restricted Access Files [Dataset]. https://catalog.data.gov/dataset/healthcare-cost-and-utilization-project-nationwide-readmissions-database-nrd
Explore at:
Dataset updated
Jul 26, 2023
Dataset provided by
Agency for Healthcare Research and Quality: http://www.ahrq.gov/
United States Department of Health and Human Services: http://www.hhs.gov/
Description
The Healthcare Cost and Utilization Project (HCUP) Nationwide Readmissions Database (NRD) is a unique and powerful database designed to support various types of analyses of national readmission rates for all payers and the uninsured. The NRD includes discharges for patients with and without repeat hospital visits in a year and those who have died in the hospital. Repeat stays may or may not be related. The criteria to determine the relationship between hospital admissions are left to the analyst using the NRD. This database addresses a large gap in health care data - the lack of nationally representative information on hospital readmissions for all ages. Outcomes of interest include national readmission rates, reasons for returning to the hospital for care, and the hospital costs for discharges with and without readmissions. Unweighted, the NRD contains data from approximately 18 million discharges each year. Weighted, it estimates roughly 35 million discharges. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality, HCUP data inform decision making at the national, State, and community levels. The NRD is drawn from HCUP State Inpatient Databases (SID) containing verified patient linkage numbers that can be used to track a person across hospitals within a State, while adhering to strict privacy guidelines. The NRD is not designed to support regional, State-, or hospital-specific readmission analyses. The NRD contains more than 100 clinical and non-clinical data elements provided in a hospital discharge abstract. Data elements include but are not limited to: diagnoses; procedures; patient demographics (e.g., sex, age); expected source of payment, regardless of expected payer, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’; discharge month, quarter, and year; total charges; length of stay; and data elements essential to readmission analyses.
The NIS excludes data elements that could directly or indirectly identify individuals. Restricted access data files are available with a data use agreement and brief online security training.
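
Because the description leaves the readmission criteria to the analyst, any concrete rule is a choice. The sketch below shows one possible rule, all-cause readmission within 30 days of the previous discharge for the same linked patient. The record layout (patient_link, admit_day, length_of_stay) and the function name are hypothetical and do not correspond to the NRD's actual data element names.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional


class Stay(NamedTuple):
    """Hypothetical record layout, not the NRD's actual variable names."""
    patient_link: str    # linkage number tying stays to one person
    admit_day: int       # day of admission on a per-patient timeline
    length_of_stay: int  # days in hospital


def flag_readmissions(stays: List[Stay], window_days: int = 30) -> Dict[Stay, bool]:
    """Flag a stay as a readmission if it starts within window_days of the
    previous discharge for the same patient (all causes, any hospital)."""
    by_patient: Dict[str, List[Stay]] = defaultdict(list)
    for stay in stays:
        by_patient[stay.patient_link].append(stay)

    flags: Dict[Stay, bool] = {}
    for patient_stays in by_patient.values():
        patient_stays.sort(key=lambda s: s.admit_day)
        previous_discharge: Optional[int] = None
        for stay in patient_stays:
            if previous_discharge is None:
                flags[stay] = False  # first stay has nothing to follow
            else:
                gap = stay.admit_day - previous_discharge
                flags[stay] = 0 <= gap <= window_days
            previous_discharge = stay.admit_day + stay.length_of_stay
    return flags


example_stays = [
    Stay("A", 0, 5),    # index stay, discharged on day 5
    Stay("A", 20, 3),   # 15 days after discharge: flagged
    Stay("A", 100, 2),  # 77 days after the prior discharge: not flagged
    Stay("B", 10, 1),   # only stay for this patient
]
readmission_flags = flag_readmissions(example_stays)
```

Other rules, such as restricting to related diagnoses or excluding planned returns, would change only the condition inside the loop.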
Cardiovascular_Disease_Dataset
data.mendeley.com
kaggle.com
Updated Apr 16, 2021
Cite
Bhanu Prakash Doppala (2021). Cardiovascular_Disease_Dataset [Dataset]. http://doi.org/10.17632/dzz48mvjht.1
Explore at:
Unique identifier
https://doi.org/10.17632/dzz48mvjht.1
Dataset updated
Apr 16, 2021
Authors
Bhanu Prakash Doppala
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This heart disease dataset was acquired from one of the multispecialty hospitals in India. It has over 14 common features, which makes it one of the heart disease datasets available so far for research purposes. This dataset consists of 1000 subjects with 12 features. It will be useful for building early-stage heart disease detection as well as for generating predictive machine learning models.
MIMIC-III - Deep Reinforcement Learning
kaggle.com
zip
Updated Apr 7, 2022
Cite
Asjad K (2022). MIMIC-III - Deep Reinforcement Learning [Dataset]. https://www.kaggle.com/datasets/asjad99/mimiciii
Explore at:
Available download formats: zip (11100065 bytes)
Dataset updated
Apr 7, 2022
Authors
Asjad K
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Digitization of healthcare data, along with algorithmic breakthroughs in AI, will have a major impact on healthcare delivery in coming years. It's interesting to see applications of AI that assist clinicians during patient treatment in a privacy-preserving way. While scientific knowledge can help guide interventions, there remains a key need to quickly cut through the space of decision policies to find effective strategies to support patients during the care process.

Offline reinforcement learning (also referred to as safe or batch reinforcement learning) is a promising sub-field of RL which provides a mechanism for solving real-world sequential decision-making problems where access to a simulator is not available. Here we assume that we learn a policy from a fixed dataset of trajectories without further interaction with the environment (the agent doesn't receive a reward or punishment signal from the environment). It has been shown that such an approach can leverage vast amounts of existing logged data (in the form of previous interactions with the environment) and can outperform supervised learning approaches or heuristic-based policies for solving real-world decision-making problems. Offline RL algorithms, when trained on sufficiently large and diverse offline datasets, can produce close-to-optimal policies (the ability to generalize beyond training data).

As part of my PhD research, I investigated the problem of developing a Clinical Decision Support System for Sepsis Management using Offline Deep Reinforcement Learning.

MIMIC-III ('Medical Information Mart for Intensive Care') is a large open-access anonymized single-center database which consists of comprehensive clinical data of 61,532 critical care admissions from 2001–2012 collected at a Boston teaching hospital. Dataset consists of 47 features (including demographics, vitals, and lab test results) on a cohort of sepsis patients who meet the sepsis-3 definition criteria.

We try to answer the following question:

Given a particular patient’s characteristics and physiological information at each time step as input, can our DeepRL approach learn an optimal treatment policy that can prescribe the right intervention (e.g., use of a ventilator) to the patient at each stage of the treatment process, in order to improve the final outcome (e.g., patient mortality)?

We can use popular state-of-the-art algorithms such as Deep Q Learning (DQN), Double Deep Q Learning (DDQN), DDQN combined with BNC, Mixed Monte Carlo (MMC), and Persistent Advantage Learning (PAL). Using these methods, we can train an RL policy to recommend an optimal treatment path for a given patient.
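
To make the offline setting concrete, here is a deliberately simplified sketch: tabular Q-learning swept repeatedly over a fixed batch of logged transitions, with no new interaction with an environment. It is not DQN or any of the deep methods named above, and it is not taken from the linked repository. The function names, the abstract states "s0" and "s1", the actions "a" and "b", and the reward values are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

# (state, action, reward, next_state, done)
Transition = Tuple[Hashable, Hashable, float, Hashable, bool]
QTable = Dict[Tuple[Hashable, Hashable], float]


def batch_q_learning(transitions: List[Transition], actions: Sequence[Hashable],
                     gamma: float = 0.9, alpha: float = 0.5,
                     sweeps: int = 200) -> QTable:
    """Fit a Q-table by sweeping over a fixed batch of logged transitions."""
    q: QTable = defaultdict(float)
    for _ in range(sweeps):
        for state, action, reward, next_state, done in transitions:
            # Bootstrapped target: reward plus discounted best next value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in actions)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
    return q


def greedy_action(q: QTable, state: Hashable, actions: Sequence[Hashable]) -> Hashable:
    """Pick the action with the highest estimated value in a state."""
    return max(actions, key=lambda a: q[(state, a)])


# Toy logged data with abstract states and actions (purely illustrative).
ACTIONS = ["a", "b"]
logged = [
    ("s0", "a", 0.0, "s1", False),
    ("s0", "b", -1.0, "end", True),
    ("s1", "a", 0.0, "end", True),
    ("s1", "b", 1.0, "end", True),
]
q_table = batch_q_learning(logged, ACTIONS)
policy = {s: greedy_action(q_table, s, ACTIONS) for s in ("s0", "s1")}
```

Deep variants replace the table with a neural network, and practical offline methods usually add safeguards against overestimating actions that are rare in the logged data; none of that is shown here.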

Data acquisition, standard pre-processing, and modelling details can be found in the GitHub repo: https://github.com/asjad99/MIMIC_RL_COACH
USGS National Structures Dataset - USGS National Map Downloadable Data...
catalog.data.gov
data.usgs.gov
+1more
Updated Nov 26, 2025
+ more versions
Cite
U.S. Geological Survey (2025). USGS National Structures Dataset - USGS National Map Downloadable Data Collection [Dataset]. https://catalog.data.gov/dataset/usgs-national-structures-dataset-usgs-national-map-downloadable-data-collection
Explore at:
Dataset updated
Nov 26, 2025
Dataset provided by
United States Geological Survey: http://www.usgs.gov/
Description
USGS Structures from The National Map (TNM) consists of data to include the name, function, location, and other core information and characteristics of selected manmade facilities across all US states and territories. The types of structures collected are largely determined by the needs of disaster planning and emergency response, and homeland security organizations. Structures currently included are: School, School:Elementary, School:Middle, School:High, College/University, Technical/Trade School, Ambulance Service, Fire Station/EMS Station, Law Enforcement, Prison/Correctional Facility, Post Office, Hospital/Medical Center, Cabin, Campground, Cemetery, Historic Site/Point of Interest, Picnic Area, Trailhead, Visitor/Information Center, US Capitol, State Capitol, US Supreme Court, State Supreme Court, Court House, Headquarters, Ranger Station, White House, and City/Town Hall. Structures data are designed to be used in general mapping and in the analysis of structure related activities using geographic information system technology. Included is a feature class of preliminary building polygons provided by FEMA, USA Structures. The National Map structures data is commonly combined with other data themes, such as boundaries, elevation, hydrography, and transportation, to produce general reference base maps. The National Map viewer allows free downloads of public domain structures data in either Esri File Geodatabase or Shapefile formats. For additional information on the structures data model, go to https://www.usgs.gov/ngp-standards-and-specifications/national-map-structures-content.
American Hospital Association (AHA) Annual Survey Database - 2021
archive.ciser.cornell.edu
Updated Feb 10, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
American Hospital Association (2024). American Hospital Association (AHA) Annual Survey Database - 2021 [Dataset]. https://archive.ciser.cornell.edu/studies/2893/data-and-documentation
Explore at:
Dataset updated
Feb 10, 2024
Dataset authored and provided by
American Hospital Association: http://www.aha.org/
Variables measured
Organization
Description
AHA Annual Survey Database™ for Fiscal Year 2021 is a comprehensive hospital database for peer comparisons, market analysis, and health services research. It is produced primarily from the AHA Annual Survey of Hospitals, which has been administered by the American Hospital Association (AHA) since 1946. The survey responses are supplemented by data drawn from the U.S. Census Bureau, hospital accrediting bodies, and other organizations.
Cancer patient´s care transition database.xlsx
figshare.com
xlsx
Updated Mar 6, 2020
Cite
Elisiane Lorenzini; Julia Estela Willrich Boell; Nelly D. Oelke; Caroline Donini Rodrigues; Letícia Flores Trindade; Vanessa Dalsasso Batista Winter; Michelle Mariah Malkiewiez; Gabriela Ceretta Flôres; Pâmella Pluta; Adriane Cristina Bernat Kolankiewicz (2020). Cancer patient´s care transition database.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.11831343.v3
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.11831343.v3
Dataset updated
Mar 6, 2020
Dataset provided by
Figshare: http://figshare.com/
Authors
Elisiane Lorenzini; Julia Estela Willrich Boell; Nelly D. Oelke; Caroline Donini Rodrigues; Letícia Flores Trindade; Vanessa Dalsasso Batista Winter; Michelle Mariah Malkiewiez; Gabriela Ceretta Flôres; Pâmella Pluta; Adriane Cristina Bernat Kolankiewicz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains information of 213 cancer patients undergoing clinical or surgical treatment characterized on sociodemographic and clinical data as well as data from the Care Transition Measure (CTM 15-Brazil). Data collection was carried out 7 to 30 days after their discharge from hospital from June to August 2019. Understanding these data can contribute to improving quality of care transitions and avoiding hospital readmissions. To this end, this dataset contains a broad array of variables:

*gender

*age group

*place of residence

*race

*marital status

*schooling

*paid work activity

*type of treatment

*cancer staging

*metastasis

*comorbidities

*main complaint

*continue use medication

*diagnosis

*cancer type

*diagnostic year

*oncology treatment

*first hospitalization

*readmission in the last 30 days

*number of hospitalizations in the last 30 days

*readmission in the last 6 months

*number of hospitalizations in the last 6 months

*readmission in the last year

*number of hospitalizations in the last year

*questions 1-15 from CTM 15-Brazil

The data are presented as a single Excel XLSX file: cancer patient´s care transitions dataset.xlsx.

The analyses of the present dataset have the potential to generate hospital readmission prevention strategies to be implemented by the hospital team. Researchers who are interested in CTs of cancer patients can extensively explore the variables described here.

The project from which these data were extracted was approved by the institution’s research ethics committee (approval n. 3.266.259/2019) at Associação Hospital de Caridade Ijuí, Rio Grande do Sul, Brazil.
CHB-MIT Scalp EEG Database
physionet.org
Updated Jun 9, 2010
+ more versions
Cite
John Guttag (2010). CHB-MIT Scalp EEG Database [Dataset]. http://doi.org/10.13026/C2K01R
Explore at:
Unique identifier
https://doi.org/10.13026/C2K01R
Dataset updated
Jun 9, 2010
Authors
John Guttag
License
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Description
This database, collected at the Children’s Hospital Boston, consists of EEG recordings from pediatric subjects with intractable seizures. Subjects were monitored for up to several days following withdrawal of anti-seizure medication in order to characterize their seizures and assess their candidacy for surgical intervention. The recordings are grouped into 23 cases and were collected from 22 subjects (5 males, ages 3–22; and 17 females, ages 1.5–19).
Data from: MIT-BIH Arrhythmia Database
physionet.org
opendatalab.com
+2more
Updated Feb 24, 2005
+ more versions
Cite
George Moody; Roger Mark (2005). MIT-BIH Arrhythmia Database [Dataset]. http://doi.org/10.13026/C2F305
Explore at:
Unique identifier
https://doi.org/10.13026/C2F305
Dataset updated
Feb 24, 2005
Authors
George Moody; Roger Mark
License
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Description
The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. Twenty-three recordings were chosen at random from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well-represented in a small random sample.
Padchest complete dataset
bimcv.cipf.es
Updated Jul 21, 2023
+ more versions
Cite
(2023). Padchest complete dataset [Dataset]. https://bimcv.cipf.es/bimcv-projects/padchest/
Explore at:
Dataset updated
Jul 21, 2023
License
http://bimcv.cipf.es/bimcv-projects/padchest/padchest-dataset-research-use-agreement/
Description
A labeled, large-scale, high-resolution chest X-ray dataset for automated exploration of medical images along with their associated reports. This dataset includes more than 160,000 images from 67,000 patients that were interpreted and reported by radiologists at Hospital San Juan (Spain) from 2009 to 2017, covering six different position views and additional information on image acquisition and patient demography. Total size: 1.2 TB.
Tamil Call Center Data for Healthcare AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Tamil Call Center Data for Healthcare AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/healthcare-call-center-conversation-tamil-india
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
This Tamil Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of Tamil speech recognition, spoken language understanding, and conversational AI systems. With 30 Hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.
Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.
Speech Data
The dataset features 30 Hours of dual-channel call center conversations between native Tamil speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.
•Participant Diversity:
•Speakers: 60 verified native Tamil speakers from our contributor community.
•Regions: Diverse regions across Tamil Nadu to ensure broad dialectal representation.
•Participant Profile: Age range of 18–70 with a gender mix of 60% male and 40% female.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted conversations.
•Call Duration: Each session lasts between 5 and 15 minutes.
•Audio Format: WAV format, stereo, 16-bit depth at 8 kHz and 16 kHz sample rates.
•Recording Environment: Captured in clear conditions without background noise or echo.
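The stated audio format (stereo, 16-bit, 8 kHz or 16 kHz) can be checked on any downloaded file before training. The following is a minimal sketch using only Python's standard `wave` module; the function names and the file name `sample_call.wav` are illustrative assumptions, and the silent file written here is a stand-in, not part of the dataset.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return channel count, bit depth, sample rate, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def matches_stated_format(info):
    """Compare the properties with the format given in the dataset description."""
    return (
        info["channels"] == 2
        and info["bit_depth"] == 16
        and info["sample_rate_hz"] in (8000, 16000)
    )


def write_silent_stereo_wav(path, rate=8000, seconds=1):
    """Write a silent 16-bit stereo file (a stand-in for a real recording)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (2 * 2 * rate * seconds))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample_call.wav")  # hypothetical file name
        write_silent_stereo_wav(sample)
        info = describe_wav(sample)
        print(info, matches_stated_format(info))
```

Because the description lists two sample rates, a check like this is a cheap way to find out which rate a given file uses before resampling.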

Topic Diversity
The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).
•Inbound Calls:
•Appointment Scheduling
•New Patient Registration
•Surgical Consultation
•Dietary Advice and Consultations
•Insurance Coverage Inquiries
•Follow-up Treatment Requests, and more
•Outbound Calls:
•Appointment Reminders
•Preventive Care Campaigns
•Test Results & Lab Reports
•Health Risk Assessment Calls
•Vaccination Updates
•Wellness Subscription Outreach, and more
These real-world interactions help build speech models that understand healthcare domain nuances and user intent.
Transcription
Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.
•Transcription Includes:
•Speaker-identified Dialogues
•Time-coded Segments
•Non-speech Annotations (e.g., silence, cough)
•High transcription accuracy, with a word error rate below 5%, backed by dual-layer QA checks.
Metadata
Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.
•Participant Metadata: ID, gender, age, region, accent, and dialect.
•Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.
Usage and Applications
This dataset can be used across a range of healthcare and voice AI use cases:

Cite

Ashish Sahani (2022). Hospital Admissions Data [Dataset]. https://www.kaggle.com/datasets/ashishsahani/hospital-admissions-data

Hospital Admissions Data

Two Year Hospital Admissions and Discharge Data from Hero DMC Heart Institute

Explore at:

zip(522833 bytes)Available download formats

Dataset updated

Jan 21, 2022

Authors

Ashish Sahani

License

Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically

Description

This dataset is provided under a Creative Commons license (Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)): https://creativecommons.org/licenses/by-nc-sa/4.0/

Context

This data was collected from patients admitted over a period of two years (1 April 2017 to 31 March 2019) at Hero DMC Heart Institute, Unit of Dayanand Medical College and Hospital, Ludhiana, Punjab, India. This is a tertiary care medical college and hospital. During the study period, the cardiology unit had 14,845 admissions corresponding to 12,238 patients. Of these, 1,921 patients had multiple admissions.

Specifically, the data relate to patients; date of admission; date of discharge; demographics, such as age, sex, and locality (rural or urban); type of admission (emergency or outpatient); patient history, including smoking, alcohol, diabetes mellitus (DM), hypertension (HTN), prior coronary artery disease (CAD), prior cardiomyopathy (CMP), and chronic kidney disease (CKD); and lab parameters corresponding to hemoglobin (HB), total lymphocyte count (TLC), platelets, glucose, urea, creatinine, brain natriuretic peptide (BNP), raised cardiac enzymes (RCE), and ejection fraction (EF). Other comorbidities and features (28 features), including heart failure, STEMI, and pulmonary embolism, were recorded and analyzed.

Shock was defined as systolic blood pressure < 90 mmHg when the cause of shock was anything other than cardiac. Patients in shock due to cardiac reasons were classified as having cardiogenic shock. Patients in shock due to multifactorial pathophysiology (cardiac and non-cardiac) were counted in both categories. The outcome, indicating whether the patient was discharged or expired in the hospital, was also recorded.
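
The labeling rule described above can be expressed as a small function. This is only a sketch of one reading of the description, not the authors' code: the function name, argument names, and label strings are assumptions, and it assumes the 90 mmHg threshold applies to both categories.

```python
def shock_categories(systolic_bp_mmhg, cardiac_cause, non_cardiac_cause):
    """Return the set of shock labels implied by the description's definition.

    Assumption: the systolic threshold (< 90 mmHg) gates both labels.
    """
    labels = set()
    if systolic_bp_mmhg >= 90:
        return labels  # threshold not met, so no shock label applies
    if non_cardiac_cause:
        labels.add("shock")
    if cardiac_cause:
        labels.add("cardiogenic shock")
    # With low pressure but no recorded cause, the set stays empty:
    # the description does not say how such cases were handled.
    return labels


if __name__ == "__main__":
    # Multifactorial case: counted in both categories.
    print(sorted(shock_categories(82, cardiac_cause=True, non_cardiac_cause=True)))
```

Returning a set rather than a single label reflects the statement that multifactorial cases count toward both categories.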

Further details about this dataset can be found here: https://doi.org/10.3390/diagnostics12020241

If you use this dataset in academic research, all publications arising from it must cite the following paper: Bollepalli, S.C.; Sahani, A.K.; Aslam, N.; Mohan, B.; Kulkarni, K.; Goyal, A.; Singh, B.; Singh, G.; Mittal, A.; Tandon, R.; Chhabra, S.T.; Wander, G.S.; Armoundas, A.A. An Optimized Machine Learning Model Accurately Predicts In-Hospital Outcomes at Admission to a Cardiac Unit. Diagnostics 2022, 12, 241. https://doi.org/10.3390/diagnostics12020241

If you intend to use this data for commercial purposes, explicit written permission is required from the data providers.

Content

table_headings.csv has explanatory names of all columns.
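
A heading file like this is typically used to swap terse column codes for readable names. The sketch below assumes, without confirmation from the source, that table_headings.csv holds two columns (short name, explanatory name) and no header row; the sample rows and function names are hypothetical and reuse abbreviations from the description above.

```python
import csv
import io


def load_heading_map(handle):
    """Read (short name, explanatory name) pairs into a dict.

    Assumption: two columns per row and no header row.
    """
    mapping = {}
    for row in csv.reader(handle):
        if len(row) >= 2 and row[0].strip():
            mapping[row[0].strip()] = row[1].strip()
    return mapping


def rename_record(record, mapping):
    """Replace short column names with explanatory ones; unknown keys are kept."""
    return {mapping.get(key, key): value for key, value in record.items()}


if __name__ == "__main__":
    # Hypothetical sample content; the real file layout may differ.
    sample = io.StringIO("HB,hemoglobin\nEF,ejection fraction\n")
    headings = load_heading_map(sample)
    print(rename_record({"HB": "13.2", "EF": "45", "AGE": "61"}, headings))
```

For the real file, open it with `open("table_headings.csv", newline="")` and inspect the first few rows before relying on this layout.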

Acknowledgements

Data was collected from Hero Dayanand Medical College Heart Institute, Unit of Dayanand Medical College and Hospital, Ludhiana, Punjab, India.

Inspiration

For any questions about the data or collaborations please contact ashish.sahani@iitrpr.ac.in
