39 datasets found
  1. Hospital Admissions Data

    • kaggle.com
    zip
    Updated Jan 21, 2022
    Cite
    Ashish Sahani (2022). Hospital Admissions Data [Dataset]. https://www.kaggle.com/datasets/ashishsahani/hospital-admissions-data
    Explore at:
    Available download formats: zip (522833 bytes)
    Dataset updated
    Jan 21, 2022
    Authors
    Ashish Sahani
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset is provided under a Creative Commons license, Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/

    Context

    This data was collected from patients admitted over a period of two years (1 April 2017 to 31 March 2019) at Hero DMC Heart Institute, Unit of Dayanand Medical College and Hospital, Ludhiana, Punjab, India. This is a tertiary care medical college and hospital. During the study period, the cardiology unit had 14,845 admissions corresponding to 12,238 patients. Of these, 1,921 patients had multiple admissions.

    Specifically, data were related to patients; date of admission; date of discharge; demographics, such as age, sex, locality (rural or urban); type of admission (emergency or outpatient); patient history, including smoking, alcohol, diabetes mellitus (DM), hypertension (HTN), prior coronary artery disease (CAD), prior cardiomyopathy (CMP), and chronic kidney disease (CKD); and lab parameters corresponding to hemoglobin (HB), total lymphocyte count (TLC), platelets, glucose, urea, creatinine, brain natriuretic peptide (BNP), raised cardiac enzymes (RCE) and ejection fraction (EF). Other comorbidities and features (28 features), including heart failure, STEMI, and pulmonary embolism, were recorded and analyzed.

    Shock was defined as systolic blood pressure < 90 mmHg where the cause of shock was any reason other than cardiac. Patients in shock due to cardiac reasons were classified as having cardiogenic shock. Patients in shock due to multifactorial pathophysiology (cardiac and non-cardiac) were counted in both categories. Outcomes indicating whether the patient was discharged or expired in the hospital were also recorded.
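
    The labelling rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, its parameters, and the label strings are hypothetical, and it assumes the 90 mmHg threshold applies to both categories, which the description does not state explicitly.

```python
def classify_shock(sbp_mmhg, cardiac_cause, non_cardiac_cause):
    """Return the set of shock labels for one admission.

    Assumption: systolic blood pressure < 90 mmHg is required for
    either label; the cause flags then select the category.
    """
    labels = set()
    if sbp_mmhg >= 90:
        return labels  # not in shock under the stated threshold
    if non_cardiac_cause:
        labels.add("shock")
    if cardiac_cause:
        labels.add("cardiogenic_shock")
    return labels


# Multifactorial shock (cardiac and non-cardiac) falls in both categories.
print(classify_shock(85, cardiac_cause=True, non_cardiac_cause=True))
```

    The actual column names and coding used in the dataset are listed in table_headings.csv and may differ from this sketch.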

    Further details about this dataset can be found here: https://doi.org/10.3390/diagnostics12020241

    If you use this dataset in academic research, all publications arising out of it must cite the following paper: Bollepalli, S.C.; Sahani, A.K.; Aslam, N.; Mohan, B.; Kulkarni, K.; Goyal, A.; Singh, B.; Singh, G.; Mittal, A.; Tandon, R.; Chhabra, S.T.; Wander, G.S.; Armoundas, A.A. An Optimized Machine Learning Model Accurately Predicts In-Hospital Outcomes at Admission to a Cardiac Unit. Diagnostics 2022, 12, 241. https://doi.org/10.3390/diagnostics12020241

    If you intend to use this data for commercial purposes, explicit written permission is required from the data providers.

    Content

    table_headings.csv has explanatory names of all columns.

    Acknowledgements

    Data was collected from Hero Dayanand Medical College Heart Institute Unit of Dayanand Medical College and Hospital, Ludhiana, Punjab, India.

    Inspiration

    For any questions about the data or collaborations please contact ashish.sahani@iitrpr.ac.in

  2. Hospital Emergency Dataset

    • kaggle.com
    zip
    Updated Jan 30, 2025
    Cite
    Xavier Berge (2025). Hospital Emergency Dataset [Dataset]. https://www.kaggle.com/datasets/xavierberge/hospital-emergency-dataset
    Explore at:
    Available download formats: zip (228798 bytes)
    Dataset updated
    Jan 30, 2025
    Authors
    Xavier Berge
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset encapsulates comprehensive patient information collected from a hospital emergency room (ER) dashboard. It serves as a valuable resource for healthcare analytics, focused on understanding patient demographics, treatment outcomes, and operational efficiency within emergency departments.

  3. Hospital Annual Financial Data - Selected Data & Pivot Tables

    • data.chhs.ca.gov
    • data.ca.gov
    • +4 more
    csv, data, doc, html +5
    Updated Oct 8, 2025
    Cite
    Department of Health Care Access and Information (2025). Hospital Annual Financial Data - Selected Data & Pivot Tables [Dataset]. https://data.chhs.ca.gov/dataset/hospital-annual-financial-data-selected-data-pivot-tables
    Explore at:
    Available download formats: xlsx, xlsx(754073), pdf(333268), xlsx(758376), xlsx(769128), xls(19599360), xlsx(770931), pdf(303198), xlsx(779866), xls(51424256), pdf(121968), xlsx(765216), csv(205488092), xls(18301440), html, xlsx(756356), xls(14657536), xlsx(768036), zip, xlsx(752914), xlsx(763636), xls(19650048), xlsx(791201), xlsm(1360350), xlsx(783155), xls, xls(18445312), pdf(310420), pdf(383996), xls(44967936), data, xlsx(750199), doc, xlsx(14714368), xlsx(777616), xls(51554816), xls(44933632), xlsx(758089), xls(920576), pdf(258239), xlsx(770375), xls(16002048), xls(19577856), xlsm(1369828), xlsx(780332)
    Dataset updated
    Oct 8, 2025
    Dataset authored and provided by
    Department of Health Care Access and Information
    Description

    On an annual basis (individual hospital fiscal year), individual hospitals and hospital systems report detailed facility-level data on services capacity, inpatient/outpatient utilization, patients, revenues and expenses by type and payer, balance sheet and income statement.

    Due to the large size of the complete dataset, a selected set of data representing a wide range of commonly used data items has been created that can be easily managed and downloaded. The selected data file includes general hospital information, utilization data by payer, revenue data by payer, expense data by natural expense category, financial ratios, and labor information.

    There are two groups of data contained in this dataset: 1) Selected Data - Calendar Year: To make it easier to compare hospitals by year, hospital reports with report periods ending within a given calendar year are grouped together. The Pivot Tables for a specific calendar year are also found here. 2) Selected Data - Fiscal Year: Hospital reports with report periods ending within a given fiscal year (July-June) are grouped together.
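
    The two groupings described above can be illustrated with a short Python sketch. The function name and the "2023-24" style fiscal-year label are assumptions made for this example, not the publisher's own naming; only the July-June fiscal year boundary comes from the description.

```python
from datetime import date


def report_groups(period_end: date) -> dict:
    """Map a report period end date to its calendar-year and fiscal-year group.

    Assumption: the fiscal year runs 1 July to 30 June and is labelled
    by its start and end years (e.g. "2023-24").
    """
    # A July-December end date falls in the fiscal year that starts that year;
    # a January-June end date falls in the one that started the year before.
    fy_start = period_end.year if period_end.month >= 7 else period_end.year - 1
    fy_label = f"{fy_start}-{str(fy_start + 1)[-2:]}"
    return {"calendar_year": period_end.year, "fiscal_year": fy_label}


# A report period ending 31 December 2023 and one ending 30 June 2024
# share a fiscal year but fall in different calendar years.
print(report_groups(date(2023, 12, 31)))
print(report_groups(date(2024, 6, 30)))
```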

  4. Hospital Database Management System SQL Project

    • kaggle.com
    zip
    Updated May 9, 2024
    Cite
    Andrew Dolcimascolo-Garrett (2024). Hospital Database Management System SQL Project [Dataset]. https://www.kaggle.com/datasets/andrewdolcigarrett/hospital-database-management-system-sql-project
    Explore at:
    Available download formats: zip (1487278 bytes)
    Dataset updated
    May 9, 2024
    Authors
    Andrew Dolcimascolo-Garrett
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Andrew Dolcimascolo-Garrett

    Released under MIT

    Contents

  5. Open Database of Healthcare Facilities

    • open.canada.ca
    • catalogue.arctic-sdi.org
    csv, esri rest +4
    Updated Mar 2, 2022
    + more versions
    Cite
    Statistics Canada (2022). Open Database of Healthcare Facilities [Dataset]. https://open.canada.ca/data/en/dataset/a1bcd4ee-8e57-499b-9c6f-94f6902fdf32
    Explore at:
    Available download formats: fgdb/gdb, esri rest, csv, html, pdf, wms
    Dataset updated
    Mar 2, 2022
    Dataset provided by
    Statistics Canada
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Description

    The Open Database of Healthcare Facilities (ODHF) is a collection of open data containing the names, types, and locations of health facilities across Canada. It is released under the Open Government License - Canada. The ODHF compiles open, publicly available, and directly-provided data on health facilities across Canada. Data sources include regional health authorities, provincial, territorial and municipal governments, and public health and professional healthcare bodies. This database aims to provide enhanced access to a harmonized listing of health facilities across Canada by making them available as open data. This database is a component of the Linkable Open Data Environment (LODE).

  6. Healthcare Management System

    • kaggle.com
    zip
    Updated Dec 23, 2023
    Cite
    Anouska Abhisikta (2023). Healthcare Management System [Dataset]. https://www.kaggle.com/datasets/anouskaabhisikta/healthcare-management-system
    Explore at:
    Available download formats: zip (74279 bytes)
    Dataset updated
    Dec 23, 2023
    Authors
    Anouska Abhisikta
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Patients Table:

    • PatientID: Unique identifier for each patient.
    • firstname: First name of the patient.
    • lastname: Last name of the patient.
    • email: Email address of the patient.

    This table stores information about individual patients, including their names and contact details.

    Doctors Table:

    • DoctorID: Unique identifier for each doctor.
    • DoctorName: Full name of the doctor.
    • Specialization: Area of medical specialization.
    • DoctorContact: Contact details of the doctor.

    This table contains details about healthcare providers, including their names, specializations, and contact information.

    Appointments Table:

    • AppointmentID: Unique identifier for each appointment.
    • Date: Date of the appointment.
    • Time: Time of the appointment.
    • PatientID: Foreign key referencing the Patients table, indicating the patient for the appointment.
    • DoctorID: Foreign key referencing the Doctors table, indicating the doctor for the appointment.

    This table records scheduled appointments, linking patients to doctors.

    MedicalProcedure Table:

    • ProcedureID: Unique identifier for each medical procedure.
    • ProcedureName: Name or description of the medical procedure.
    • AppointmentID: Foreign key referencing the Appointments table, indicating the appointment associated with the procedure.

    This table stores details about medical procedures associated with specific appointments.

    Billing Table:

    • InvoiceID: Unique identifier for each billing transaction.
    • PatientID: Foreign key referencing the Patients table, indicating the patient for the billing transaction.
    • Items: Description of items or services billed.
    • Amount: Amount charged for the billing transaction.

    This table maintains records of billing transactions, associating them with specific patients.

    demo Table:

    • ID: Primary key, serves as a unique identifier for each record.
    • Name: Name of the entity.
    • Hint: Additional information or hint about the entity.

    This table appears to be a demonstration or testing table, possibly unrelated to the healthcare management system.

    This dataset schema is designed to capture comprehensive information about patients, doctors, appointments, medical procedures, and billing transactions in a healthcare management system. Adjustments can be made based on specific requirements, and additional attributes can be included as needed.
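
    As a rough illustration of how the tables described above relate, the following Python sketch builds them in an in-memory SQLite database and joins an appointment to its patient, doctor, and procedure. Table and column names follow the description; the column types, the sample rows, and the function name are assumptions invented for this example, and the demo table is omitted.

```python
import sqlite3

# Column types are assumed; the description only gives names and key roles.
SCHEMA = """
CREATE TABLE Patients (
    PatientID INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, email TEXT);
CREATE TABLE Doctors (
    DoctorID INTEGER PRIMARY KEY, DoctorName TEXT,
    Specialization TEXT, DoctorContact TEXT);
CREATE TABLE Appointments (
    AppointmentID INTEGER PRIMARY KEY, "Date" TEXT, "Time" TEXT,
    PatientID INTEGER REFERENCES Patients(PatientID),
    DoctorID INTEGER REFERENCES Doctors(DoctorID));
CREATE TABLE MedicalProcedure (
    ProcedureID INTEGER PRIMARY KEY, ProcedureName TEXT,
    AppointmentID INTEGER REFERENCES Appointments(AppointmentID));
CREATE TABLE Billing (
    InvoiceID TEXT PRIMARY KEY,
    PatientID INTEGER REFERENCES Patients(PatientID),
    Items TEXT, Amount REAL);
"""


def demo_rows():
    """Create the schema, insert made-up rows, and run one join query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    # Sample rows are fabricated purely for illustration.
    conn.execute("INSERT INTO Patients VALUES (1, 'Ada', 'Example', 'ada@example.org')")
    conn.execute("INSERT INTO Doctors VALUES (10, 'Dr. Example', 'Cardiology', '555-0100')")
    conn.execute("INSERT INTO Appointments VALUES (100, '2023-01-05', '09:30', 1, 10)")
    conn.execute("INSERT INTO MedicalProcedure VALUES (1000, 'ECG', 100)")
    conn.execute("INSERT INTO Billing VALUES ('INV-1', 1, 'ECG', 120.0)")
    rows = conn.execute(
        """
        SELECT p.firstname, d.DoctorName, m.ProcedureName
        FROM Appointments AS a
        JOIN Patients AS p ON p.PatientID = a.PatientID
        JOIN Doctors AS d ON d.DoctorID = a.DoctorID
        JOIN MedicalProcedure AS m ON m.AppointmentID = a.AppointmentID
        """
    ).fetchall()
    conn.close()
    return rows


print(demo_rows())
```

    Appointments acts as the hub here: procedures reach patients and doctors only through it, while Billing links to patients directly.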

  7. EHR Dataset for Patient Treatment Classification

    • data.mendeley.com
    Updated May 10, 2020
    + more versions
    Cite
    Mujiono Sadikin (2020). EHR Dataset for Patient Treatment Classification [Dataset]. http://doi.org/10.17632/7kv3rctx7m.1
    Explore at:
    Dataset updated
    May 10, 2020
    Authors
    Mujiono Sadikin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset consists of electronic health records collected from a private hospital in Indonesia. It contains patients' laboratory test results, which are used to determine whether the patient's next treatment should be inpatient ("in care") or outpatient ("out care"). The task associated with the dataset is classification.

  8. Hospital Inpatient - Characteristics by Facility (Pivot Profile)

    • data.chhs.ca.gov
    • data.ca.gov
    • +2 more
    .xlsx, xls, xlsx, zip
    Updated Nov 7, 2025
    Cite
    Department of Health Care Access and Information (2025). Hospital Inpatient - Characteristics by Facility (Pivot Profile) [Dataset]. https://data.chhs.ca.gov/dataset/hospital-inpatient-characteristics-by-facility-pivot-profile
    Explore at:
    Available download formats: xls, xlsx, xlsx(1778842), xlsx(1736211), xlsx(1736990), xlsx(1762190), xlsx(1740830), xlsx(1730937), .xlsx(1724148), zip
    Dataset updated
    Nov 7, 2025
    Dataset authored and provided by
    Department of Health Care Access and Information
    Description

    This dataset contains annual Excel pivot tables that display summaries of the inpatients treated in each hospital. The summary data include discharges, discharge days, average length of stay, age groups, race groups, sex, expected payer, type of care, do not resuscitate orders, admission source, admission type, discharge disposition, principal diagnosis groups, principal procedure groups, and principal external cause of injury/morbidity groups. The data can also be summarized statewide or for a specific hospital county, bed size grouping, and/or type of control.

  9. Data from: Clinical Dataset

    • kaggle.com
    zip
    Updated Oct 5, 2023
    + more versions
    Cite
    Mohamadreza Momeni (2023). Clinical Dataset [Dataset]. https://www.kaggle.com/datasets/imtkaggleteam/clinical-dataset
    Explore at:
    Available download formats: zip (16220 bytes)
    Dataset updated
    Oct 5, 2023
    Authors
    Mohamadreza Momeni
    Description

    This is the purest type of electronic clinical data, obtained at the point of care at a medical facility, hospital, clinic, or practice. Often referred to as the electronic medical record (EMR), it is generally not available to outside researchers. The data collected includes administrative and demographic information, diagnosis, treatment, prescription drugs, laboratory tests, physiologic monitoring data, hospitalization, patient insurance, etc.

    Individual organizations such as hospitals or health systems may provide access to internal staff. Larger collaborations, such as the NIH Collaboratory Distributed Research Network, provide mediated or collaborative access to clinical data repositories for eligible researchers. Additionally, the UW De-identified Clinical Data Repository (DCDR) and the Stanford Center for Clinical Informatics allow for initial cohort identification.

    About Dataset:

    333 scholarly articles cite this dataset.

    Unique identifier: DOI

    Dataset updated: 2023

    Authors: Haoyang Mi

    This dataset contains two tables:

    1- Clinical Data_Discovery_Cohort: Name of columns: Patient ID Specimen date Dead or Alive Date of Death Date of last Follow Sex Race Stage Event Time

    2- Clinical_Data_Validation_Cohort Name of columns: Patient ID Survival time (days) Event Tumor size Grade Stage Age Sex Cigarette Pack per year Type Adjuvant Batch EGFR KRAS

    Feel free to share your thoughts and analysis in a notebook for these datasets. You can also build interesting and valuable ML projects on them. Thanks for your attention.

  10. MIMIC-III Clinical Database

    • physionet.org
    • oppositeofnorth.com
    Updated Sep 4, 2016
    Cite
    Alistair Johnson; Tom Pollard; Roger Mark (2016). MIMIC-III Clinical Database [Dataset]. http://doi.org/10.13026/C2XW26
    Explore at:
    Dataset updated
    Sep 4, 2016
    Authors
    Alistair Johnson; Tom Pollard; Roger Mark
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (including post-hospital discharge). MIMIC supports a diverse range of analytic studies spanning epidemiology, clinical decision-rule improvement, and electronic tool development. It is notable for three factors: it is freely available to researchers worldwide; it encompasses a diverse and very large population of ICU patients; and it contains highly granular data, including vital signs, laboratory results, and medications.

  11. HCUP Nationwide Readmissions Database (NRD)- Restricted Access Files

    • catalog.data.gov
    • data.virginia.gov
    • +2 more
    Updated Jul 26, 2023
    Cite
    Agency for Healthcare Research and Quality, Department of Health & Human Services (2023). HCUP Nationwide Readmissions Database (NRD)- Restricted Access Files [Dataset]. https://catalog.data.gov/dataset/healthcare-cost-and-utilization-project-nationwide-readmissions-database-nrd
    Explore at:
    Dataset updated
    Jul 26, 2023
    Description

    The Healthcare Cost and Utilization Project (HCUP) Nationwide Readmissions Database (NRD) is a unique and powerful database designed to support various types of analyses of national readmission rates for all payers and the uninsured. The NRD includes discharges for patients with and without repeat hospital visits in a year and those who have died in the hospital. Repeat stays may or may not be related. The criteria to determine the relationship between hospital admissions are left to the analyst using the NRD. This database addresses a large gap in health care data: the lack of nationally representative information on hospital readmissions for all ages. Outcomes of interest include national readmission rates, reasons for returning to the hospital for care, and the hospital costs for discharges with and without readmissions. Unweighted, the NRD contains data from approximately 18 million discharges each year. Weighted, it estimates roughly 35 million discharges. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality, HCUP data inform decision making at the national, State, and community levels. The NRD is drawn from HCUP State Inpatient Databases (SID) containing verified patient linkage numbers that can be used to track a person across hospitals within a State, while adhering to strict privacy guidelines. The NRD is not designed to support regional, State-, or hospital-specific readmission analyses. The NRD contains more than 100 clinical and non-clinical data elements provided in a hospital discharge abstract. Data elements include but are not limited to: diagnoses, procedures, patient demographics (e.g., sex, age), expected source of payment (regardless of expected payer, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’), discharge month, quarter, and year, total charges, length of stay, and data elements essential to readmission analyses. The NRD excludes data elements that could directly or indirectly identify individuals. Restricted access data files are available with a data use agreement and brief online security training.

  12. Cardiovascular_Disease_Dataset

    • data.mendeley.com
    • kaggle.com
    Updated Apr 16, 2021
    Cite
    Bhanu Prakash Doppala (2021). Cardiovascular_Disease_Dataset [Dataset]. http://doi.org/10.17632/dzz48mvjht.1
    Explore at:
    Dataset updated
    Apr 16, 2021
    Authors
    Bhanu Prakash Doppala
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This heart disease dataset was acquired from one of the multispecialty hospitals in India. It has over 14 common features, which makes it one of the heart disease datasets available so far for research purposes. This dataset consists of 1000 subjects with 12 features. It will be useful for building early-stage heart disease detection systems as well as for generating predictive machine learning models.

  13. MIMIC-III - Deep Reinforcement Learning

    • kaggle.com
    zip
    Updated Apr 7, 2022
    Cite
    Asjad K (2022). MIMIC-III - Deep Reinforcement Learning [Dataset]. https://www.kaggle.com/datasets/asjad99/mimiciii
    Explore at:
    Available download formats: zip (11100065 bytes)
    Dataset updated
    Apr 7, 2022
    Authors
    Asjad K
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Digitization of healthcare data along with algorithmic breakthroughs in AI will have a major impact on healthcare delivery in the coming years. It's interesting to see applications of AI that assist clinicians during patient treatment in a privacy-preserving way. While scientific knowledge can help guide interventions, there remains a key need to quickly cut through the space of decision policies to find effective strategies to support patients during the care process.

    Offline reinforcement learning (also referred to as safe or batch reinforcement learning) is a promising sub-field of RL that provides a mechanism for solving real-world sequential decision-making problems where access to a simulator is not available. Here we assume that we learn a policy from a fixed dataset of trajectories without further interaction with the environment (the agent doesn't receive a reward or punishment signal from the environment). It has been shown that such an approach can leverage vast amounts of existing logged data (in the form of previous interactions with the environment) and can outperform supervised learning approaches or heuristic-based policies for solving real-world decision-making problems. Offline RL algorithms, when trained on sufficiently large and diverse offline datasets, can produce close-to-optimal policies (the ability to generalize beyond the training data).
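
    To make the offline setting concrete, here is a minimal sketch in Python that learns action values purely from a fixed list of logged transitions, with no environment to query. It uses plain tabular Q-learning rather than the deep RL methods applied to this dataset, and the toy states, actions, rewards, and function names are all hypothetical.

```python
def offline_q_learning(transitions, n_states, n_actions,
                       gamma=0.9, alpha=0.5, sweeps=200):
    """Tabular Q-learning over a fixed batch of (s, a, r, s_next, done) tuples."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        # Only replay logged data; no new interaction with an environment.
        for s, a, r, s_next, done in transitions:
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Toy logged dataset (invented): two states, two actions.
logged = [
    (0, 0, 0.0, 1, False),  # state 0, action 0: no reward, move to state 1
    (0, 1, 0.2, 0, True),   # state 0, action 1: small reward, episode ends
    (1, 0, 0.0, 0, True),   # state 1, action 0: no reward, episode ends
    (1, 1, 1.0, 0, True),   # state 1, action 1: large reward, episode ends
]
q_values = offline_q_learning(logged, n_states=2, n_actions=2)
print(greedy_policy(q_values))
```

    This toy batch covers every state-action pair. Real logged clinical data does not, which is one reason dedicated offline RL algorithms exist.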

    As part of my PhD research, I investigated the problem of developing a Clinical Decision Support System for Sepsis Management using Offline Deep Reinforcement Learning.

    MIMIC-III ('Medical Information Mart for Intensive Care') is a large open-access anonymized single-center database which consists of comprehensive clinical data of 61,532 critical care admissions from 2001–2012 collected at a Boston teaching hospital. The dataset consists of 47 features (including demographics, vitals, and lab test results) on a cohort of sepsis patients who meet the Sepsis-3 definition criteria.

    We try to answer the following question:

    Given a particular patient’s characteristics and physiological information at each time step as input, can our DeepRL approach learn an optimal treatment policy that prescribes the right intervention (e.g., use of a ventilator) to the patient at each stage of the treatment process, in order to improve the final outcome (e.g., patient mortality)?

    We can use popular state-of-the-art algorithms such as Deep Q Learning (DQN), Double Deep Q Learning (DDQN), DDQN combined with BNC, Mixed Monte Carlo (MMC), and Persistent Advantage Learning (PAL). Using these methods, we can train an RL policy to recommend the optimum treatment path for a given patient.

    Data acquisition, standard pre-processing, and modelling details can be found in the GitHub repo: https://github.com/asjad99/MIMIC_RL_COACH

  14. USGS National Structures Dataset - USGS National Map Downloadable Data...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Nov 26, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). USGS National Structures Dataset - USGS National Map Downloadable Data Collection [Dataset]. https://catalog.data.gov/dataset/usgs-national-structures-dataset-usgs-national-map-downloadable-data-collection
    Explore at:
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    USGS Structures from The National Map (TNM) consists of data to include the name, function, location, and other core information and characteristics of selected manmade facilities across all US states and territories. The types of structures collected are largely determined by the needs of disaster planning and emergency response, and homeland security organizations. Structures currently included are: School, School:Elementary, School:Middle, School:High, College/University, Technical/Trade School, Ambulance Service, Fire Station/EMS Station, Law Enforcement, Prison/Correctional Facility, Post Office, Hospital/Medical Center, Cabin, Campground, Cemetery, Historic Site/Point of Interest, Picnic Area, Trailhead, Vistor/Information Center, US Capitol, State Capitol, US Supreme Court, State Supreme Court, Court House, Headquarters, Ranger Station, White House, and City/Town Hall. Structures data are designed to be used in general mapping and in the analysis of structure related activities using geographic information system technology. Included is a feature class of preliminary building polygons provided by FEMA, USA Structures. The National Map structures data is commonly combined with other data themes, such as boundaries, elevation, hydrography, and transportation, to produce general reference base maps. The National Map viewer allows free downloads of public domain structures data in either Esri File Geodatabase or Shapefile formats. For additional information on the structures data model, go to https://www.usgs.gov/ngp-standards-and-specifications/national-map-structures-content.

  15. American Hospital Association (AHA) Annual Survey Database - 2021

    • archive.ciser.cornell.edu
    Updated Feb 10, 2024
    + more versions
    Cite
    American Hospital Association (2024). American Hospital Association (AHA) Annual Survey Database - 2021 [Dataset]. https://archive.ciser.cornell.edu/studies/2893/data-and-documentation
    Explore at:
    Dataset updated
    Feb 10, 2024
    Dataset authored and provided by
    American Hospital Association (http://www.aha.org/)
    Variables measured
    Organization
    Description

    AHA Annual Survey Database™ for Fiscal Year 2021 is a comprehensive hospital database for peer comparisons, market analysis, and health services research. It is produced primarily from the AHA Annual Survey of Hospitals, which has been administered by the American Hospital Association (AHA) since 1946. The survey responses are supplemented by data drawn from the U.S. Census Bureau, hospital accrediting bodies, and other organizations.

  16. Cancer patient´s care transition database.xlsx

    • figshare.com
    xlsx
    Updated Mar 6, 2020
    Cite
    Elisiane Lorenzini; Julia Estela Willrich Boell; Nelly D. Oelke; Caroline Donini Rodrigues; Letícia Flores Trindade; Vanessa Dalsasso Batista Winter; Michelle Mariah Malkiewiez; Gabriela Ceretta Flôres; Pâmella Pluta; Adriane Cristina Bernat Kolankiewicz (2020). Cancer patient´s care transition database.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.11831343.v3
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Elisiane Lorenzini; Julia Estela Willrich Boell; Nelly D. Oelke; Caroline Donini Rodrigues; Letícia Flores Trindade; Vanessa Dalsasso Batista Winter; Michelle Mariah Malkiewiez; Gabriela Ceretta Flôres; Pâmella Pluta; Adriane Cristina Bernat Kolankiewicz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains information of 213 cancer patients undergoing clinical or surgical treatment characterized on sociodemographic and clinical data as well as data from the Care Transition Measure (CTM 15-Brazil). Data collection was carried out 7 to 30 days after their discharge from hospital from June to August 2019. Understanding these data can contribute to improving quality of care transitions and avoiding hospital readmissions. To this end, this dataset contains a broad array of variables:

    *gender

    *age group

    *place of residence

    *race

    *marital status

    *schooling

    *paid work activity

    *type of treatment

    *cancer staging

    *metastasis

    *comorbidities

    *main complaint

    *continue use medication

    *diagnosis

    *cancer type

    *diagnostic year

    *oncology treatment

    *first hospitalization

    *readmission in the last 30 days

    *number of hospitalizations in the last 30 days

    *readmission in the last 6 months

    *number of hospitalizations in the last 6 months

    *readmission in the last year

    *number of hospitalizations in the last year

    *questions 1-15 from CTM 15-Brazil

    The data are presented as a single Excel XLSX file: cancer patient´s care transitions dataset.xlsx.

    The analyses of the present dataset have the potential to generate hospital readmission prevention strategies to be implemented by the hospital team. Researchers who are interested in CTs of cancer patients can extensively explore the variables described here.

    The project from which these data were extracted was approved by the institution’s research ethics committee (approval n. 3.266.259/2019) at Associação Hospital de Caridade Ijuí, Rio Grande do Sul, Brazil.

  17. CHB-MIT Scalp EEG Database

    • physionet.org
    Updated Jun 9, 2010
    + more versions
    Cite
    John Guttag (2010). CHB-MIT Scalp EEG Database [Dataset]. http://doi.org/10.13026/C2K01R
    Dataset updated
    Jun 9, 2010
    Authors
    John Guttag
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    This database, collected at the Children’s Hospital Boston, consists of EEG recordings from pediatric subjects with intractable seizures. Subjects were monitored for up to several days following withdrawal of anti-seizure medication in order to characterize their seizures and assess their candidacy for surgical intervention. The recordings are grouped into 23 cases and were collected from 22 subjects (5 males, ages 3–22; and 17 females, ages 1.5–19).

  18.

    Data from: MIT-BIH Arrhythmia Database

    • physionet.org
    • opendatalab.com
    Updated Feb 24, 2005
    Cite
    George Moody; Roger Mark (2005). MIT-BIH Arrhythmia Database [Dataset]. http://doi.org/10.13026/C2F305
    Dataset updated
    Feb 24, 2005
    Authors
    George Moody; Roger Mark
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. Twenty-three recordings were chosen at random from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston's Beth Israel Hospital; the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well-represented in a small random sample.

  19.

    Padchest complete dataset

    • bimcv.cipf.es
    Updated Jul 21, 2023
    Cite
    (2023). Padchest complete dataset [Dataset]. https://bimcv.cipf.es/bimcv-projects/padchest/
    Dataset updated
    Jul 21, 2023
    License

    http://bimcv.cipf.es/bimcv-projects/padchest/padchest-dataset-research-use-agreement/

    Description

    A labeled, large-scale, high-resolution chest X-ray dataset for automated exploration of medical images along with their associated reports. The dataset includes more than 160,000 images from 67,000 patients that were interpreted and reported by radiologists at Hospital San Juan (Spain) from 2009 to 2017, covering six different position views and additional information on image acquisition and patient demography (1.2 TB).

  20.

    Tamil Call Center Data for Healthcare AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Tamil Call Center Data for Healthcare AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/healthcare-call-center-conversation-tamil-india
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Tamil Call Center Speech Dataset for the healthcare industry is built to support the development of Tamil speech recognition, spoken language understanding, and conversational AI systems. It contains 30 hours of unscripted, real-world conversations intended for building ASR models for medical and wellness-related customer service.

    Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.

    Speech Data

    The dataset features 30 hours of dual-channel call center conversations between native Tamil speakers. The recordings cover a variety of healthcare support topics.

    Participant Diversity:
    Speakers: 60 verified native Tamil speakers from our contributor community.
    Regions: Diverse regions across Tamil Nadu to ensure broad dialectal representation.
    Participant Profile: Age range of 18–70 with a gender mix of 60% male and 40% female.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted conversations.
    Call Duration: Each session ranges from 5 to 15 minutes.
    Audio Format: WAV format, stereo, 16-bit depth at 8 kHz and 16 kHz sample rates.
    Recording Environment: Captured in clear conditions without background noise or echo.

    Topic Diversity

    The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).

    Inbound Calls:
    Appointment Scheduling
    New Patient Registration
    Surgical Consultation
    Dietary Advice and Consultations
    Insurance Coverage Inquiries
    Follow-up Treatment Requests, and more
    Outbound Calls:
    Appointment Reminders
    Preventive Care Campaigns
    Test Results & Lab Reports
    Health Risk Assessment Calls
    Vaccination Updates
    Wellness Subscription Outreach, and more

    These real-world interactions help build speech models that understand healthcare domain nuances and user intent.

    Transcription

    Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.

    Transcription Includes:
    Speaker-identified Dialogues
    Time-coded Segments
    Non-speech Annotations (e.g., silence, cough)
    High transcription accuracy, with a word error rate below 5%, backed by dual-layer QA checks.

    Metadata

    Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.

    Participant Metadata: ID, gender, age, region, accent, and dialect.
    Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.

    Usage and Applications

    This dataset can be used across a range of healthcare and voice AI use cases:

Cite
Ashish Sahani (2022). Hospital Admissions Data [Dataset]. https://www.kaggle.com/datasets/ashishsahani/hospital-admissions-data

Hospital Admissions Data

Two Year Hospital Admissions and Discharge Data from Hero DMC Heart Institute

Available download formats: zip (522833 bytes)
Dataset updated
Jan 21, 2022
Authors
Ashish Sahani
License

Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically

Description

This dataset is provided under a Creative Commons license (Attribution-NonCommercial-ShareAlike 4.0 International, CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/

Context

These data were collected from patients admitted over a period of two years (1 April 2017 to 31 March 2019) at Hero DMC Heart Institute, Unit of Dayanand Medical College and Hospital, Ludhiana, Punjab, India. This is a tertiary care medical college and hospital. During the study period, the cardiology unit had 14,845 admissions corresponding to 12,238 patients. Of these, 1,921 patients had multiple admissions.
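
The admission counts above imply some simple arithmetic about repeat admissions. The following is a minimal sketch that uses only the three figures quoted in this description; the variable names are illustrative and are not columns of the dataset.

```python
# Figures quoted in the dataset description.
total_admissions = 14_845
unique_patients = 12_238
repeat_patients = 1_921  # patients with more than one admission

# Each patient accounts for one first admission; the remainder are repeats.
repeat_admissions = total_admissions - unique_patients

# Patients who were admitted exactly once during the study period.
single_admission_patients = unique_patients - repeat_patients

# Mean number of admissions among patients admitted more than once:
# their first admissions plus all repeat admissions, per repeat patient.
mean_admissions_repeat = (repeat_patients + repeat_admissions) / repeat_patients

print(repeat_admissions, single_admission_patients, round(mean_admissions_repeat, 2))
```

This only restates the published totals; per-patient admission counts would have to come from the data files themselves.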

Specifically, data were related to patients; date of admission; date of discharge; demographics, such as age, sex, locality (rural or urban); type of admission (emergency or outpatient); patient history, including smoking, alcohol, diabetes mellitus (DM), hypertension (HTN), prior coronary artery disease (CAD), prior cardiomyopathy (CMP), and chronic kidney disease (CKD); and lab parameters corresponding to hemoglobin (HB), total lymphocyte count (TLC), platelets, glucose, urea, creatinine, brain natriuretic peptide (BNP), raised cardiac enzymes (RCE), and ejection fraction (EF). Other comorbidities and features (28 features), including heart failure, STEMI, and pulmonary embolism, were recorded and analyzed.

Shock was defined as systolic blood pressure < 90 mmHg when the cause of shock was anything other than cardiac. Patients in shock due to cardiac causes were classified as cardiogenic shock. Patients in shock due to multifactorial pathophysiology (cardiac and non-cardiac) were counted in both categories. The outcome, indicating whether the patient was discharged or expired in the hospital, was also recorded.
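
The labeling rule described above can be sketched as a small Python function. This is an illustration only, not the authors' code: the function name, its parameters, and the label strings are hypothetical, and it assumes the 90 mmHg systolic threshold applies to cardiogenic shock as well, which the description implies but does not state outright.

```python
def classify_shock(systolic_bp_mmhg, cardiac_cause, non_cardiac_cause):
    """Return the set of shock labels for one admission (hypothetical helper).

    Assumption: a patient is "in shock" only when systolic BP is below 90 mmHg.
    """
    labels = set()
    if systolic_bp_mmhg >= 90:
        return labels  # not in shock under the stated threshold
    if cardiac_cause:
        labels.add("cardiogenic shock")
    if non_cardiac_cause:
        labels.add("shock")
    # A multifactorial case (both flags set) receives both labels.
    return labels


print(classify_shock(85, cardiac_cause=True, non_cardiac_cause=True))
```

Returning a set rather than a single label reflects the statement that multifactorial cases are counted in both categories.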

Further details about this dataset can be found here: https://doi.org/10.3390/diagnostics12020241

If you use this dataset in academic research, all publications arising from it must cite the following paper: Bollepalli, S.C.; Sahani, A.K.; Aslam, N.; Mohan, B.; Kulkarni, K.; Goyal, A.; Singh, B.; Singh, G.; Mittal, A.; Tandon, R.; Chhabra, S.T.; Wander, G.S.; Armoundas, A.A. An Optimized Machine Learning Model Accurately Predicts In-Hospital Outcomes at Admission to a Cardiac Unit. Diagnostics 2022, 12, 241. https://doi.org/10.3390/diagnostics12020241

If you intend to use this data for commercial purposes, explicit written permission is required from the data providers.

Content

table_headings.csv has explanatory names of all columns.

Acknowledgements

Data were collected from Hero Dayanand Medical College Heart Institute, Unit of Dayanand Medical College and Hospital, Ludhiana, Punjab, India.

Inspiration

For any questions about the data or collaborations please contact ashish.sahani@iitrpr.ac.in
