27 datasets found

Malaria disease and grading system dataset from public hospitals reflecting...
data.niaid.nih.gov
datadryad.org
zip
Updated Nov 10, 2023
Cite
Temitope Olufunmi Atoyebi; Rashidah Funke Olanrewaju; N. V. Blamah; Emmanuel Chinanu Uwazie (2023). Malaria disease and grading system dataset from public hospitals reflecting complicated and uncomplicated conditions [Dataset]. http://doi.org/10.5061/dryad.4xgxd25gn
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.4xgxd25gn
Dataset updated
Nov 10, 2023
Dataset provided by
Nasarawa State University
Authors
Temitope Olufunmi Atoyebi; Rashidah Funke Olanrewaju; N. V. Blamah; Emmanuel Chinanu Uwazie
License
https://spdx.org/licenses/CC0-1.0.html
Description
Malaria is the leading cause of death in the African region. Data mining can help extract valuable knowledge from available data in the healthcare sector. This makes it possible to train models to predict patient health faster than in clinical trials. Various machine learning algorithms, such as K-Nearest Neighbors, Bayes Theorem, Logistic Regression, Support Vector Machines, and Multinomial Naïve Bayes (MNB), have been applied to malaria datasets from public hospitals, but there are still limitations in modeling with the multinomial Naïve Bayes algorithm. This study applies the MNB model to explore the relationship between 15 relevant attributes of public hospital data. The goal is to examine how the dependency between attributes affects the performance of the classifier. MNB creates a transparent and reliable graphical representation of the relationships between attributes, with the ability to predict new situations. The model (MNB) has 97% accuracy. It is concluded that this model outperforms the GNB classifier, which has 100% accuracy, and the RF, which also has 100% accuracy.

Methods Prior to the collection of data, the researcher was guided by the ethical training certification on data collection and the rights to confidentiality and privacy, as overseen by the Institutional Review Board (IRB). Data was collected from the manual archives of hospitals purposively selected using a stratified sampling technique, then transformed to electronic form and stored in a MySQL database called malaria. Each patient file was extracted and reviewed for signs and symptoms of malaria, then checked for a laboratory confirmation result from diagnosis. The data was divided into two tables: the first table, data1, contains data for use in phase 1 of the classification, while the second table, data2, contains data for use in phase 2 of the classification.

Data Source Collection The malaria incidence data set was obtained from public hospitals and covers 2017 to 2021. These are the data used for modeling and analysis. Geographical location and socio-economic factors available for patients living in those areas were also taken into account. Naïve Bayes (Multinomial) is the model used to analyze the collected data for malaria disease prediction and grading.

Data Preprocessing: Data preprocessing shall be done to remove noise and outliers.

Transformation: The data shall be transformed from analog to electronic records.

Data Partitioning The collected data will be divided into two portions: one portion shall be extracted as a training set, while the other portion will be used for testing. One training portion shall be taken from a table stored in the database and called training set 1, while the training portion taken from another table in the database shall be called training set 2. The dataset was split into two parts: 70% for training and 30% for testing. Then, using MNB classification algorithms implemented in Python, the models were trained on the training sample. The resulting models were tested on the remaining 30% of the data, and the results were compared with the other machine learning models using the standard metrics.

Classification and prediction: Based on the nature of the variables in the dataset, this study will use Naïve Bayes (Multinomial) classification techniques in two phases: classification phase 1 and classification phase 2. The operation of the framework is illustrated as follows:
i. Data collection and preprocessing shall be done.
ii. Preprocessed data shall be stored in training set 1 and training set 2. These datasets shall be used during classification.
iii. The test data shall be stored in a database test data set.
iv. Part of the test data set must be classified using classifier 1 and the remaining part must be classified with classifier 2, as follows:

Classifier phase 1: It classifies records into positive or negative classes. If the patient has malaria, the patient is classified as positive (P); if the patient does not have malaria, the patient is classified as negative (N).

Classifier phase 2: It classifies only the records that classifier 1 has classified as positive, and further classifies them into the complicated and uncomplicated class labels. The classifier will also capture data on environmental factors, genetics, gender and age, and cultural and socio-economic variables. The system will be designed so that the core parameters, as determining factors, must supply their values.
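The two-phase scheme described above can be sketched as a small cascade. This is only an illustration under stated assumptions: the four count features, every training record, and the names `TinyMultinomialNB` and `grade` are invented here and are not the study's 15 attributes or its code; the classifier is a plain-Python multinomial Naïve Bayes with Laplace smoothing rather than whatever library the authors used.

```python
import math
from collections import Counter


class TinyMultinomialNB:
    """Multinomial Naive Bayes on count features with Laplace smoothing."""

    def fit(self, rows, labels, alpha=1.0):
        self.classes = sorted(set(labels))
        n_features = len(rows[0])
        class_sizes = Counter(labels)
        # Per-class sum of each feature's counts.
        totals = {c: [0.0] * n_features for c in self.classes}
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                totals[label][j] += value
        self.log_prior = {c: math.log(class_sizes[c] / len(labels)) for c in self.classes}
        self.log_lik = {}
        for c in self.classes:
            denom = sum(totals[c]) + alpha * n_features
            self.log_lik[c] = [math.log((t + alpha) / denom) for t in totals[c]]
        return self

    def predict_one(self, row):
        def score(c):
            return self.log_prior[c] + sum(v * w for v, w in zip(row, self.log_lik[c]))
        return max(self.classes, key=score)


# Hypothetical features: [fever, positive_smear, severe_signs, non_malarial_findings]
phase1_rows = [[3, 4, 0, 0], [4, 5, 3, 0], [3, 3, 1, 1],
               [1, 0, 0, 3], [0, 0, 0, 4], [2, 0, 0, 2]]
phase1_labels = ["P", "P", "P", "N", "N", "N"]

# Phase 2 is trained on positive cases only.
phase2_rows = [[3, 4, 0, 0], [3, 3, 1, 1], [4, 5, 3, 0], [2, 2, 4, 0]]
phase2_labels = ["uncomplicated", "uncomplicated", "complicated", "complicated"]

phase1 = TinyMultinomialNB().fit(phase1_rows, phase1_labels)
phase2 = TinyMultinomialNB().fit(phase2_rows, phase2_labels)


def grade(row):
    """Phase 1 screens for malaria; phase 2 grades only the positives."""
    if phase1.predict_one(row) == "N":
        return "N"
    return phase2.predict_one(row)


for record in ([1, 0, 0, 3], [3, 3, 5, 0], [3, 4, 0, 1]):
    print(record, "->", grade(record))
```

Keeping the two classifiers separate means phase 2 never sees negative cases, which mirrors the description that only records labelled positive by classifier 1 are graded further.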
Patient Episode Dataset for Wales (PEDW)
web.dev.hdruk.cloud
Updated Oct 8, 2024
Cite
Digital Health and Care Wales (DHCW) (2024). Patient Episode Dataset for Wales (PEDW) [Dataset]. https://web.dev.hdruk.cloud/dataset/318
Dataset updated
Oct 8, 2024
Dataset authored and provided by
Digital Health and Care Wales (DHCW)
License
https://saildatabank.com/data/apply-to-work-with-the-data/
Description
NHS Wales hospital admissions (inpatients and day cases) dataset comprising attendance and clinical information for all hospital admissions, including diagnoses and operations performed. Includes spell and episode level data.

The data are collected and coded at each hospital. Administrative information, such as specialty of care and admission and discharge dates, is collected from the central PAS (Patient Administrative System). After the patient is discharged, the handwritten patient notes are transcribed by a clinical coder into medical coding terminology (ICD10 and OPCS).

The data held in PEDW is of interest to public health services since it can provide information regarding both health service utilisation and also the incidence and prevalence of disease. However, since PEDW was created to track hospital activity from the point of view of payments for services, rather than epidemiological analysis, the use of PEDW for public health work is not straightforward. For example:

- Counts will vary depending on the number of diagnosis fields used, e.g. primary only or all fields.
- There are a number of different things that can be counted in PEDW, e.g. individual episodes of care, admissions, discharges, periods of continuous care (groups of episodes), patients or procedures.
- When looking at diagnoses or procedures, the number will vary depending on whether you look only at the primary diagnosis / procedure field or also include the secondary fields.
- Coding practices vary. In particular, coding practices for recording secondary diagnoses are likely to vary between hospitals. This makes regional variations more difficult to interpret. The validation process led by the Corporate Health Improvement Programme and implemented by Digital Health and Care Wales (DHCW) aims to address some of these inconsistencies.
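To make the counting pitfalls concrete, here is a minimal sketch on invented records. The field names (`patient`, `admission`, `primary`, `secondary`), the diagnosis codes, and the helper `count` are all hypothetical and do not reflect the actual PEDW schema; the point is only that the same question yields different numbers depending on the unit counted and the diagnosis fields searched.

```python
# Hypothetical episode records; field names and codes are illustrative only.
EPISODES = [
    {"patient": "A", "admission": 1, "primary": "J18", "secondary": ["I10"]},
    {"patient": "A", "admission": 1, "primary": "I50", "secondary": ["J18"]},
    {"patient": "B", "admission": 2, "primary": "S72", "secondary": ["J18", "E11"]},
    {"patient": "C", "admission": 3, "primary": "J18", "secondary": []},
    {"patient": "C", "admission": 4, "primary": "J18", "secondary": []},
]


def matching_episodes(code, primary_only):
    """Yield (index, episode) pairs whose searched diagnosis fields contain `code`."""
    for index, ep in enumerate(EPISODES):
        fields = [ep["primary"]] if primary_only else [ep["primary"], *ep["secondary"]]
        if code in fields:
            yield index, ep


def count(code, unit, primary_only):
    """Count distinct episodes, admissions, or patients carrying `code`."""
    keys = set()
    for index, ep in matching_episodes(code, primary_only):
        keys.add(index if unit == "episode" else ep[unit])
    return len(keys)


for unit in ("episode", "admission", "patient"):
    print(unit, count("J18", unit, primary_only=True), count("J18", unit, primary_only=False))
```

On this toy data the count for one code ranges from 2 (patients, primary field only) to 5 (episodes, all fields), which is why the unit and the fields searched need to be stated alongside any figure.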

Due to the complexity and pitfalls of PEDW, it is recommended that any PEDW requests for public health purposes are discussed with a member of the SAIL team. In turn, the SAIL team will seek advice from DHCW if required.

This dataset requires additional governance approvals from the data provider before data can be provisioned to a SAIL project.
Example of variations in actual mortality rates under internal...
plos.figshare.com
xls
Updated Jun 9, 2023
Cite
Martin Roessler; Jochen Schmitt; Olaf Schoffer (2023). Example of variations in actual mortality rates under internal standardization: Parameter values. [Dataset]. http://doi.org/10.1371/journal.pone.0257003.t007
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0257003.t007
Dataset updated
Jun 9, 2023
Dataset provided by
PLOS ONE
Authors
Martin Roessler; Jochen Schmitt; Olaf Schoffer
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Example of variations in actual mortality rates under internal standardization: Parameter values.
Hospital Structural Measures
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Hospital Structural Measures [Dataset]. https://www.johnsnowlabs.com/marketplace/hospital-structural-measures/
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Time period covered
2016 - 2018
Area covered
United States
Description
This dataset includes a list of hospitals and the availability of structural measures at each hospital. A structural measure reflects the environment in which hospitals care for patients. For example, one measure records whether or not a hospital participates in a Cardiac Surgery Registry.
NZ Facilities - Dataset - data.govt.nz - discover and use data
portal.zero.govt.nz
catalogue.data.govt.nz
Updated Jun 14, 2021
Cite
(2021). NZ Facilities - Dataset - data.govt.nz - discover and use data [Dataset]. https://portal.zero.govt.nz/77d6ef04507c10508fcfc67a7c24be32/dataset/nz-facilities2
Dataset updated
Jun 14, 2021
Description
This dataset provides boundaries of facilities, currently hospitals and schools, within mainland New Zealand, originally sourced in early 2021 from a combination of NationalMap and authoritative sources, including NZ Ministry of Education and NZ Ministry of Health. A facility represents a particular activity such as a hospital or school. A facility boundary represents the extent of the land which appears to be used by a facility. A facility boundary can be different to corresponding cadastral parcel polygons because a facility can span across multiple parcels or be located in only part of a parcel. For example, a parcel owned by the crown can include multiple schools and other facilities such as parks and reserves.

Facility boundaries in this dataset were used to apply hospital and school building names to the NZ Building Outlines dataset published on the LINZ Data Service. A more detailed description of NZ Facilities can be found in the NZ Facilities Data Dictionary. This Data Dictionary also includes information on how NZ Facilities was used to support the attribution of NZ Building Outlines. NZ Facilities contains data sourced from NationalMap, Ministry of Education and Ministry of Health licensed for reuse under CC BY 4.0.

Related data
NZ Building Outlines - provides current building outlines only, derived from the latest LINZ aerial imagery.
NZ Building Outlines (All Sources) - contains all combinations of building outlines from multiple years of imagery that have existed since the beginning of this dataset, and the dates when each building outline existed in the associated aerial imagery.

APIs and web services This dataset is available via ArcGIS Online and ArcGIS REST services, as well as our standard APIs. Related links: LDS APIs and OGC web services; ArcGIS Online map services.
Example of variations in case mix under internal standardization: Parameter...
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Martin Roessler; Jochen Schmitt; Olaf Schoffer (2023). Example of variations in case mix under internal standardization: Parameter values. [Dataset]. http://doi.org/10.1371/journal.pone.0257003.t005
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0257003.t005
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Martin Roessler; Jochen Schmitt; Olaf Schoffer
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Example of variations in case mix under internal standardization: Parameter values.
Hospital Discharge Records database
www-acc.healthinformationportal.eu
healthinformationportal.eu
html
Updated Jan 10, 2023
Cite
Ministero della Salute Italiano (2023). Hospital Discharge Records database [Dataset]. https://www-acc.healthinformationportal.eu/services/find-data?page=26
Available download formats: html
Dataset updated
Jan 10, 2023
Dataset authored and provided by
Ministero della Salute Italiano
Variables measured
sex, title, topics, acronym, country, funding, language, data_owners, description, contact_name, and 16 more
Measurement technique
Hospitalization statistics of the hospitals of the National Health System
Dataset funded by
Public funding
Description
The information flow of the Hospital Discharge database (SDO flow) is the tool for collecting information relating to all hospitalization episodes provided in public and private hospitals throughout the national territory.

Originally created for purely administrative purposes in the hospital setting, the SDO, thanks to the wealth of administrative and clinical information it contains, has become an indispensable tool for a wide range of analyses and elaborations. These range from supporting health planning activities and monitoring the provision of hospital assistance and the Essential Levels of Assistance, to proxy analyses of other levels of assistance, to more strictly clinical-epidemiological and outcome analyses. In this regard, the SDO database is a fundamental element of the National Outcomes Program (PNE).

The information collected includes the patient's personal characteristics (including age, sex, residence, level of education), characteristics of the hospitalization (for example institution and discharge discipline, hospitalization regime, method of discharge, booking date, priority class of hospitalization) and clinical features (e.g. main diagnosis, concomitant diagnoses, diagnostic or therapeutic procedures).

Information relating to drugs administered during hospitalization or adverse reactions to them (subject to other specific information flows) is excluded from the discharge form.
Antimicrobial resistance surveillance report, Hypothetical Hospital,...
figshare.com
pdf
Updated Jan 12, 2021
Cite
Cherry Lim (2021). Antimicrobial resistance surveillance report, Hypothetical Hospital, Hypothetical Country, 01 Jan 2016 to 31 Dec 2016 [Dataset]. http://doi.org/10.6084/m9.figshare.12117348.v2
Available download formats: pdf
Unique identifier
https://doi.org/10.6084/m9.figshare.12117348.v2
Dataset updated
Jan 12, 2021
Dataset provided by
figshare
Authors
Cherry Lim
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
AutoMated tool for Antimicrobial resistance Surveillance System (AMASS) was developed as an offline, open-access and easy-to-use application that allows a hospital to perform data analysis independently and generate isolate-based and sample-based surveillance reports stratified by infection origin from routinely collected electronic databases. AMASS performs data analysis and generates reports automatically. The application can be downloaded from https://www.amass.website

This is a repository of the files for the example data files. The Example_Dataset_2 is a data set generated for AMASS users. This example data set was created to represent a large data set from a 1000-bed hospital, containing microbiology and hospital admission data. The data files can be found under the "Example_Dataset_2" folder of AMASS. The attached four files here are:
1) The AMR surveillance report automatically generated from AMASS using hospital admission and microbiology data of Example_Dataset_2
2) Aggregated summary data in .csv format automatically generated from AMASS
3) Data dictionary for microbiology data file of the example data files
4) Data dictionary for hospital admission data of the example data files
MyHospitals Profile Data - Number of Beds
researchdata.edu.au
data.gov.au
Updated Jun 28, 2023
Cite
National Health Performance Authority (2023). MyHospitals Profile Data - Number of Beds [Dataset]. https://researchdata.edu.au/myhospitals-profile-data-number-beds/2737683
Dataset updated
Jun 28, 2023
Dataset provided by
Australian Urban Research Infrastructure Network (AURIN)
Authors
National Health Performance Authority
License
Attribution-NonCommercial-NoDerivs 3.0 (CC BY-NC-ND 3.0): https://creativecommons.org/licenses/by-nc-nd/3.0/
License information was derived automatically
Description
MyHospitals provides performance information for public and private hospitals in Australia. You can also compare the performance of these hospitals and find information about hospitals near you.

The annual average number of beds available to be used by an admitted patient was grouped into the following categories: fewer than 50, 50-100, 100-200, 200-500 and more than 500. These data are as reported by states and territories to the NPHED, and are referred to in statistical publications (including Australian hospital statistics) as 'average available beds'. The average number of available beds presented may differ from counts published elsewhere. For example, counts based on bed numbers at a specified date such as 30 June may differ from the average available beds over the reporting period. Comparability of bed numbers can be affected by the range and types of patients treated by a hospital. For example, hospitals may have different proportions of beds available for general versus special purposes (such as beds or cots used exclusively for intensive care). Bed counts also include chairs for same-day admissions.
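The bed-size grouping above can be sketched as a small lookup. The category labels come from the description, but the handling of the shared boundary values is an assumption: the source lists overlapping ranges (50-100, 100-200, 200-500), so this sketch treats each upper bound as inclusive, which is consistent with "fewer than 50" and "more than 500". The function name `bed_category` is hypothetical.

```python
def bed_category(average_available_beds: float) -> str:
    """Map an annual average bed count to the published size category.

    Assumption: upper bounds are inclusive (100 -> "50-100", 500 -> "200-500").
    """
    if average_available_beds < 0:
        raise ValueError("bed count cannot be negative")
    if average_available_beds < 50:
        return "fewer than 50"
    if average_available_beds <= 100:
        return "50-100"
    if average_available_beds <= 200:
        return "100-200"
    if average_available_beds <= 500:
        return "200-500"
    return "more than 500"


for beds in (12, 50, 100, 150.5, 500, 730):
    print(beds, "->", bed_category(beds))
```

Because the input is an average over the reporting period, it may be fractional, which is why the function accepts a float rather than an integer.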

Data is current as of December 2015. Data sourced from: http://www.myhospitals.gov.au/about-the-data/download-data
Table_1_Rule-Based Models for Risk Estimation and Analysis of In-hospital...
figshare.com
frontiersin.figshare.com
xlsx
Updated Jun 8, 2023
Oliver Haas; Andreas Maier; Eva Rothgang (2023). Table_1_Rule-Based Models for Risk Estimation and Analysis of In-hospital Mortality in Emergency and Critical Care.XLSX [Dataset]. http://doi.org/10.3389/fmed.2021.785711.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fmed.2021.785711.s001
Dataset updated
Jun 8, 2023
Dataset provided by
Frontiers
Authors
Oliver Haas; Andreas Maier; Eva Rothgang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We propose a novel method that uses associative classification and odds ratios to predict in-hospital mortality in emergency and critical care. Manual mortality risk scores have previously been used to assess the care needed for each patient and their need for palliative measures. Automated approaches allow providers to get a quick and objective estimation based on electronic health records. We use association rule mining to find relevant patterns in the dataset. The odds ratio is used instead of classical association rule mining metrics as a quality measure to analyze association instead of frequency. The resulting measures are used to estimate the in-hospital mortality risk. We compare two prediction models: one minimal model with socio-demographic factors that are available at the time of admission and can be provided by the patients themselves, namely gender, ethnicity, type of insurance, language, and marital status, and a full model that additionally includes clinical information like diagnoses, medication, and procedures. The method was tested and validated on MIMIC-IV, a publicly available clinical dataset. The minimal prediction model achieved an area under the receiver operating characteristic curve value of 0.69, while the full prediction model achieved a value of 0.98. The models serve different purposes. The minimal model can be used as a first risk assessment based on patient-reported information. The full model expands on this and provides an updated risk assessment each time a new variable occurs in the clinical case. In addition, the rules in the models allow us to analyze the dataset based on data-backed rules. We provide several examples of interesting rules, including rules that hint at errors in the underlying data, rules that correspond to existing epidemiological research, and rules that were previously unknown and can serve as starting points for future studies.
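As a rough illustration of why an odds ratio measures association rather than frequency, the sketch below scores two invented rules of the form "pattern implies in-hospital death" from their 2x2 contingency counts. All counts and the function names are hypothetical and are not taken from the paper or from MIMIC-IV.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a rule "pattern -> outcome" from a 2x2 table.

    a: pattern present, outcome present    b: pattern present, outcome absent
    c: pattern absent,  outcome present    d: pattern absent,  outcome absent
    """
    if b == 0 or c == 0:
        raise ZeroDivisionError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


def support(a: int, b: int, c: int, d: int) -> float:
    """Classical frequency-based metric: share of all cases matching the rule."""
    return a / (a + b + c + d)


# Hypothetical counts for two rules.
frequent_rule = (100, 900, 100, 900)   # common pattern, no association
rare_rule = (20, 20, 180, 1780)        # rare pattern, strong association

for name, table in (("frequent", frequent_rule), ("rare", rare_rule)):
    print(name, "support =", support(*table), "odds ratio =", round(odds_ratio(*table), 2))
```

The frequent rule has five times the support of the rare one yet an odds ratio of exactly 1, while the rare rule's odds ratio is close to 10, so ranking by odds ratio surfaces the rule that actually separates outcomes.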
mimic-iii-clinical-database-demo-1.4
kaggle.com
Updated Apr 1, 2025
Cite
Montassar bellah (2025). mimic-iii-clinical-database-demo-1.4 [Dataset]. https://www.kaggle.com/datasets/montassarba/mimic-iii-clinical-database-demo-1-4
Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 1, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Montassar bellah
Description
Abstract MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012 [1]. The MIMIC-III Clinical Database is available on PhysioNet (doi: 10.13026/C2XW26). Though deidentified, MIMIC-III contains detailed information regarding the care of real patients, and as such requires credentialing before access. To allow researchers to ascertain whether the database is suitable for their work, we have manually curated a demo subset, which contains information for 100 patients also present in the MIMIC-III Clinical Database. Notably, the demo dataset does not include free-text notes.

Background In recent years there has been a concerted move towards the adoption of digital health record systems in hospitals. Despite this advance, interoperability of digital systems remains an open issue, leading to challenges in data integration. As a result, the potential that hospital data offers in terms of understanding and improving care is yet to be fully realized.

MIMIC-III integrates deidentified, comprehensive clinical data of patients admitted to the Beth Israel Deaconess Medical Center in Boston, Massachusetts, and makes it widely accessible to researchers internationally under a data use agreement. The open nature of the data allows clinical studies to be reproduced and improved in ways that would not otherwise be possible.

The MIMIC-III database was populated with data that had been acquired during routine hospital care, so there was no associated burden on caregivers and no interference with their workflow. For more information on the collection of the data, see the MIMIC-III Clinical Database page.

Methods The demo dataset contains all intensive care unit (ICU) stays for 100 patients. These patients were selected randomly from the subset of patients in the dataset who eventually die. Consequently, all patients will have a date of death (DOD). However, patients do not necessarily die during an individual hospital admission or ICU stay.

This project was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because the project did not impact clinical care and all protected health information was deidentified.

Data Description MIMIC-III is a relational database consisting of 26 tables. For a detailed description of the database structure, see the MIMIC-III Clinical Database page. The demo shares an identical schema, except all rows in the NOTEEVENTS table have been removed.

The data files are distributed in comma separated value (CSV) format following the RFC 4180 standard. Notably, string fields which contain commas, newlines, and/or double quotes are encapsulated by double quotes ("). Actual double quotes in the data are escaped using an additional double quote. For example, the string she said "the patient was notified at 6pm" would be stored in the CSV as "she said ""the patient was notified at 6pm""". More detail is provided on the RFC 4180 description page: https://tools.ietf.org/html/rfc4180
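The quoting rule can be checked with Python's standard `csv` module, whose default dialect doubles embedded quotes in the way described above. This is a small sketch using the example string from the text; the variable names are illustrative.

```python
import csv
import io

original = 'she said "the patient was notified at 6pm"'

# Write one single-field row; the default dialect quotes the field because it
# contains double quotes, and escapes each embedded quote by doubling it.
buffer = io.StringIO()
csv.writer(buffer).writerow([original])
encoded = buffer.getvalue().rstrip("\r\n")
print(encoded)

# Reading the line back recovers the original string unchanged.
decoded = next(csv.reader(io.StringIO(encoded)))[0]
print(decoded == original)
```

Parsing the files with a CSV-aware reader rather than splitting on commas matters here, because quoted fields may themselves contain commas and newlines.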

Usage Notes The MIMIC-III demo provides researchers with an opportunity to review the structure and content of MIMIC-III before deciding whether or not to carry out an analysis on the full dataset.

CSV files can be opened natively using any text editor or spreadsheet program. However, some tables are large, and it may be preferable to navigate the data stored in a relational database. One alternative is to create an SQLite database using the CSV files. SQLite is a lightweight database format which stores all constituent tables in a single file, and SQLite databases interoperate well with a number of software tools.
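One way to build such a database is with Python's standard `csv` and `sqlite3` modules. The sketch below is an assumption-laden illustration, not part of the dataset's tooling: the helper `load_csv`, the table name, and the three sample rows are invented, and an in-memory database stands in for a file on disk. Columns are created without declared types, so values are stored as text.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header row and insert every data row."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


# Invented sample standing in for one of the demo CSV files.
sample = io.StringIO("subject_id,gender\n1,F\n2,M\n3,F\n")

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist the database
load_csv(conn, "patients", sample)

total = conn.execute('SELECT COUNT(*) FROM "patients"').fetchone()[0]
by_gender = dict(conn.execute('SELECT gender, COUNT(*) FROM "patients" GROUP BY gender'))
print(total, by_gender)
```

With real files, open each one with `open(path, newline="")` so the `csv` module handles quoted newlines correctly, and call `load_csv` once per table.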

DB Browser for SQLite is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite. We have found this tool to be useful for navigating SQLite files. Information regarding installation of the software and creation of the database can be found online: https://sqlitebrowser.org/

Release Notes Release notes for the demo follow the release notes for the MIMIC-III database.

Acknowledgements This research and development was supported by grants NIH-R01-EB017205, NIH-R01-EB001659, and NIH-R01-GM104987 from the National Institutes of Health. The authors would also like to thank Philips Healthcare and staff at the Beth Israel Deaconess Medical Center, Boston, for supporting database development, and Ken Pierce for providing ongoing support for the MIMIC research community.

Conflicts of Interest The authors declare no competing financial interests.

References Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Mo...
Top 3 Non-Local Acute Care Hospitals Accessed by Local Residents, Fiscal...
open.alberta.ca
Updated Oct 21, 2015
Cite
(2015). Top 3 Non-Local Acute Care Hospitals Accessed by Local Residents, Fiscal Year 2017/2018 - Open Government [Dataset]. https://open.alberta.ca/dataset/top-3-non-local-acute-care-hospitals-accessed-by-local-residents-fiscal-year-2017-2018
Dataset updated
Oct 21, 2015
Description
This table provides inpatient separations made by local area residents to the top three accessed non-local facilities. The data is provided for the most recent fiscal year available. An inpatient separation from a health care facility occurs anytime a patient (or resident) leaves because of death, discharge, sign-out against medical advice or transfer. The number of separations is the most commonly used measure of the utilization of hospital services. Separations, rather than admissions, are used because hospital abstracts for inpatient care are based on information gathered at the time of discharge. This indicator dataset contains information at both Local Geographic Area (for example, Lacombe, Red Deer, Calgary West Bow, etc.) and Alberta levels. Local geographic area refers to 132 geographic areas created by Alberta Health (AH) and Alberta Health Services (AHS) based on census boundaries. This table is part of the "Alberta Health Primary Health Care - Community Profiles" report published March 2019.
Our Future Health Linked Health Records Data
healthdatagateway.org
Updated Jul 6, 2024
Cite
Our Future Health (2024). Our Future Health Linked Health Records Data [Dataset]. https://healthdatagateway.org/dataset/889
Dataset updated
Jul 6, 2024
Dataset authored and provided by
Our Future Health
License
https://research.ourfuturehealth.org.uk/apply-to-access-the-data/
Description
Our Future Health is a prospective, observational cohort study of the general adult population of the United Kingdom (UK). The programme aims to support a wide range of observational health research. We gather personal, health and lifestyle information from each participant through a self-completed baseline health questionnaire and at an in-person clinic visit. We will further link this data to other health-related data sets. Participants have also given consent for us to recontact them, for example to invite them to take part in further or repeat data collections, or other embedded studies such as clinical trials.

The Our Future Health programme is currently open to all adults (18 years and older) living in the UK. In July 2022, we started recruiting participants in England and will continue to expand across the rest of the UK. The data we’ve gathered so far (March 2025) includes linked NHS England clinical data on 1,151,453 participants.

Additional linked datasets are available:
- ‘Baseline Health Questionnaire Data’, which contains baseline demographic information and responses to our health questionnaire from 1,414,260 participants.
- ‘Genotype Array Data’, which includes genotype array data on 707,522 variants from a subset of 651,050 participants.
- ‘Clinical Measurements Data’, which contains clinical data from 1,025,498 participants.

The data is stored in the Our Future Health Trusted Research Environment. We de-identify all participant data we gather before it’s available for use. All researchers will need to become registered researchers at Our Future Health and have an approved research study before they're given access to the data.

We aim to collect a variety of data types from up to 5 million adult participants from across the UK. We hope to make more data types available on a quarterly basis.
Outpatient Imaging Efficiency Core Measures by Hospital
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Outpatient Imaging Efficiency Core Measures by Hospital [Dataset]. https://www.johnsnowlabs.com/marketplace/outpatient-imaging-efficiency-core-measures-by-hospital/
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Time period covered
2016 - 2023
Area covered
United States
Description
This dataset includes the hospital data for the Outpatient Imaging Efficiency Core Measures. These Core Measures give information about the hospitals' use of medical imaging tests for outpatients. Examples of medical imaging tests include CT scans, MRIs, and mammograms.
Top 3 Non-Local Acute Care Hospitals Accessed by Local Residents, Fiscal...
data.urbandatacentre.ca
Updated Oct 22, 2024
Cite
(2024). Top 3 Non-Local Acute Care Hospitals Accessed by Local Residents, Fiscal Year 2013/2014 - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-3446a2eb-6561-437a-806d-7655636f3f2e
Dataset updated
Oct 22, 2024
Description
This table provides inpatient separations made by local area residents to the top three accessed non-local facilities. The data is provided for the most recent fiscal year available. An inpatient separation from a health care facility occurs anytime a patient (or resident) leaves because of death, discharge, sign-out against medical advice or transfer. The number of separations is the most commonly used measure of the utilization of hospital services. Separations, rather than admissions, are used because hospital abstracts for inpatient care are based on information gathered at the time of discharge. This indicator dataset contains information at both Local Geographic Area (for example, Lacombe, Red Deer, Calgary West Bow, etc.) and Alberta levels. Local geographic area refers to 132 geographic areas created by Alberta Health (AH) and Alberta Health Services (AHS) based on census boundaries. This table is part of the "Alberta Health Primary Health Care - Community Profiles" report published in March 2015.
CT School Learning Model Indicators by County (7-day metrics) - ARCHIVE
data.ct.gov
s.cnmilf.com
application/rdfxml +5
Updated Oct 8, 2020
Cite
Department of Public Health (2020). CT School Learning Model Indicators by County (7-day metrics) - ARCHIVE [Dataset]. https://data.ct.gov/Health-and-Human-Services/CT-School-Learning-Model-Indicators-by-County-7-da/rpph-4ysy
Available download formats: json, csv, application/rdfxml, application/rssxml, xml, tsv
Dataset updated
Oct 8, 2020
Dataset authored and provided by
Department of Public Health
License
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Area covered
Connecticut
Description
DPH note about change from 7-day to 14-day metrics: As of 10/15/2020, this dataset is no longer being updated. Starting on 10/15/2020, the school learning model indicator metrics will be calculated using a 14-day average rather than a 7-day average. The new school learning model indicators dataset using 14-day averages can be accessed here: https://data.ct.gov/Health-and-Human-Services/CT-School-Learning-Model-Indicators-by-County-14-d/e4bh-ax24

As you know, we are learning more about COVID-19 all the time, including the best ways to measure COVID-19 activity in our communities. CT DPH has decided to shift to 14-day rates because these are more stable, particularly at the town level, as compared to 7-day rates. In addition, since the school indicators were initially published by DPH last summer, CDC has recommended 14-day rates and other states (e.g., Massachusetts) have started to implement 14-day metrics for monitoring COVID transmission as well.

With respect to geography, we also have learned that many people are looking at the town-level data to inform decision making, despite emphasis on the county-level metrics in the published addenda. This is understandable as there has been variation within counties in COVID-19 activity (for example, rates that are higher in one town than in most other towns in the county).

This dataset includes the leading and secondary metrics identified by the Connecticut Department of Public Health (DPH) and the Connecticut State Department of Education (CSDE) to support local district decision-making on the level of in-person, hybrid (blended), and remote learning model for Pre K-12 education.

Data represent daily averages for each week by date of specimen collection (cases and positivity), date of hospital admission, or date of ED visit. Hospitalization data come from the Connecticut Hospital Association and are based on hospital location, not county of patient residence. COVID-19-like illness includes fever and cough or shortness of breath or difficulty breathing or the presence of coronavirus diagnosis code and excludes patients with influenza-like illness. All data are preliminary.

These data are updated weekly; the previous week period for each dataset is the previous Sunday-Saturday, known as an MMWR week (https://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf). The date listed is the date the dataset was last updated and corresponds to a reporting period of the previous MMWR week. For instance, the data for 8/20/2020 corresponds to a reporting period of 8/9/2020-8/15/2020.
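The date arithmetic in the paragraph above can be sketched in a few lines of Python. This is only an illustration: the function name `previous_mmwr_week` is hypothetical, and it assumes the reporting period is the last complete Sunday-Saturday week before the week containing the update date.

```python
from datetime import date, timedelta


def previous_mmwr_week(updated):
    """Return (start, end) of the last complete Sunday-Saturday week
    before the week that contains the given update date."""
    # date.weekday() uses Monday=0 ... Sunday=6; shift so that Sunday=0.
    days_since_sunday = (updated.weekday() + 1) % 7
    current_week_start = updated - timedelta(days=days_since_sunday)
    start = current_week_start - timedelta(days=7)  # previous Sunday
    end = start + timedelta(days=6)                 # following Saturday
    return start, end


start, end = previous_mmwr_week(date(2020, 8, 20))
print(start.isoformat(), end.isoformat())
```

For the update date 8/20/2020 mentioned above, this yields 8/9/2020 to 8/15/2020, matching the example in the description.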

These metrics were adapted from recommendations by the Harvard Global Institute and supplemented by existing DPH measures.

For national data on COVID-19, see COVID View, the national weekly surveillance summary of U.S. COVID-19 activity, at https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html

Notes: 9/25/2020: Data for Mansfield and Middletown for the week of Sept 13-19 were unavailable at the time of reporting due to delays in lab reporting.
Inpatients formally detained in hospitals under the Mental Health Act 1983...
digital.nhs.uk
csv, pdf, xlsx
Updated Nov 30, 2016
Cite
(2016). Inpatients formally detained in hospitals under the Mental Health Act 1983 and patients subject to Supervised Community Treatment [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/inpatients-formally-detained-in-hospitals-under-the-mental-health-act-1983-and-patients-subject-to-supervised-community-treatment
Available download formats: pdf (145.4 kB), csv (152.8 kB), xlsx (48.3 kB), xlsx (123.5 kB), pdf (499.9 kB), pdf (309.8 kB)
Dataset updated
Nov 30, 2016
License
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Time period covered
Apr 1, 2015 - Mar 31, 2016
Area covered
England
Description
This publication summarises information collected about uses of The Mental Health Act (1983) ('The Act'), as amended by The Mental Health Act 2007 ('The 2007 Act') and by other legislation, during 2015/16. Under The Act, people with a mental disorder may formally be detained in hospital in the interests of their own health or safety, or can be treated in the community but subject to recall to hospital when necessary for assessment and/or treatment under a Community Treatment Order (sometimes referred to as 'Supervised Community Treatment' or 'SCT'). The release consists of a report providing high level statistics at a national level, which is accompanied by reference data tables and a machine readable file including key measures at provider level. The publication also makes reference to relevant figures from other data sources, including equalities information from the Mental Health and Learning Disabilities Dataset (MHLDDS) and Data on the Use of section 136 Mental Health Act 1983 collected and published by the National Police Chiefs' Council (NPCC). The Mental Health Bulletin 2015/16 is published on the same day as this report. Whilst this report remains the official source of figures for the year, the Mental Health Bulletin publication presents several complementary measures, broken down by age, gender, ethnic group and CCG, detail that is not available from the collection that this publication is based upon. In future years, information on uses of The Act will be sourced from the same information used in the Mental Health Bulletin. This will allow uses of The Act to be fully understood in the wider context of service use for the first time. Details of this change, along with example analysis of what will now be possible using the MHSDS, can be found in a special report included in this publication called Mental Health Act Statistics, Improved reporting for better care.
Hospital Inpatient Discharges (SPARCS De-Identified): 2009
health.data.ny.gov
data.wu.ac.at
application/rdfxml +5
Updated Jul 12, 2017
Cite
New York State Department of Health (2017). Hospital Inpatient Discharges (SPARCS De-Identified): 2009 [Dataset]. https://health.data.ny.gov/Health/Hospital-Inpatient-Discharges-SPARCS-De-Identified/q6hk-esrj
Available download formats: csv, tsv, application/rdfxml, application/rssxml, xml, json
Dataset updated
Jul 12, 2017
Dataset authored and provided by
New York State Department of Health
Description
The Statewide Planning and Research Cooperative System (SPARCS) Inpatient De-identified dataset contains discharge level detail on patient characteristics, diagnoses, treatments, services, charges and costs. This data contains basic record level detail regarding the discharge; however, the data does not contain protected health information (PHI) under the Health Insurance Portability and Accountability Act (HIPAA). The health information is not individually identifiable; all data elements considered identifiable have been redacted. For example, the direct identifiers regarding a date have the day and month portion of the date removed.
Type 2 Diabetes Dataset
ieee-dataport.org
Updated Jan 17, 2024
Cite
Kabrambam Rupabanta Singh (2024). Type 2 Diabetes Dataset [Dataset]. http://doi.org/10.21227/xm4p-nx87
Unique identifier
https://doi.org/10.21227/xm4p-nx87
Dataset updated
Jan 17, 2024
Dataset provided by
IEEE Dataport
Authors
Kabrambam Rupabanta Singh
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This diabetes dataset was collected from 2000 people at the Frankfurt Hospital, Germany. There are eight features in the dataset. Among the 2000 samples, 684 people are diabetes patients and the rest of them are normal. This dataset is available in the Kaggle repository. The eight features are given below.
i. Pregnancies: To express the number of pregnancies
ii. Glucose: To express the glucose level in blood
iii. BloodPressure: To express the blood pressure measurement
iv. SkinThickness: To express the thickness of the skin
v. Insulin: To express the insulin level in blood
vi. BMI: To express the body mass index
vii. DiabetesPedigreeFunction: To express the diabetes percentage
viii. Age: To express the age
Neovascular age related macular degeneration at University Hospitals...
healthdatagateway.org
unknown
Updated Dec 2, 2023
Cite
University Hospitals Birmingham NHS Foundation Trust (2023). Neovascular age related macular degeneration at University Hospitals Birmingham [Dataset]. https://healthdatagateway.org/dataset/90
Available download formats: unknown
Dataset updated
Dec 2, 2023
Dataset provided by
National Health Service: https://www.nhs.uk/
University Hospitals Birmingham NHS Foundation Trust: http://www.uhb.nhs.uk/
Authors
University Hospitals Birmingham NHS Foundation Trust
License
https://www.insight.hdrhub.org/
Description
Background: Age-related macular degeneration (AMD) is a degenerative disease of the human retina affecting individuals over the age of 55 years. AMD is the leading cause of blindness in industrialized countries. Worldwide, the number of people with AMD is predicted to increase from 196 million in 2020 to 288 million by 2040.

The UHB AMD Dataset is a longitudinal dataset consisting of routinely collected imaging and clinical metadata from patients receiving treatment for age-related macular degeneration (AMD) at UHB, from 2007 to the present.

This dataset encompasses all patients at UHB who have received at least one injection of Lucentis (ranibizumab), Eylea (aflibercept) or Avastin. The dataset includes data from both eyes in each case; for example, it includes data from fellow eyes that are not receiving injections. For these reasons, the dataset includes longitudinal data from a mixture of eyes with both “dry” and “wet” AMD. Clinical metadata includes demographic information, visual acuities (predominantly measured with Early Treatment Diabetic Retinopathy Study (ETDRS) charts), treatment, and outcomes.

This dataset is continuously updated; as of October 2021, it consisted of 15,063 eyes receiving treatment for AMD. This is a large single-centre database of patients with AMD and covers more than a decade of follow-up for these patients.

Geography: The Queen Elizabeth Hospital is one of the largest single-site hospitals in the United Kingdom, with 1,215 inpatient beds. It is part of one of the largest teaching trusts in England (University Hospitals Birmingham). It is set within the West Midlands and has a catchment population of circa 5.9 million. The region includes a diverse ethnic and socio-economic mix, with a higher than UK average proportion of minority ethnic groups. It has a large number of elderly residents but is the youngest population in the UK. There are particularly high rates of diabetes, physical inactivity, obesity, and smoking.

Data source: Ophthalmology department at Queen Elizabeth Hospital, University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom.

Cite

Temitope Olufunmi Atoyebi; Rashidah Funke Olanrewaju; N. V. Blamah; Emmanuel Chinanu Uwazie (2023). Malaria disease and grading system dataset from public hospitals reflecting complicated and uncomplicated conditions [Dataset]. http://doi.org/10.5061/dryad.4xgxd25gn

Malaria disease and grading system dataset from public hospitals reflecting complicated and uncomplicated conditions

Available download formats: zip

Unique identifier

https://doi.org/10.5061/dryad.4xgxd25gn

Dataset updated

Nov 10, 2023

Dataset provided by

Nasarawa State University

Authors

Temitope Olufunmi Atoyebi; Rashidah Funke Olanrewaju; N. V. Blamah; Emmanuel Chinanu Uwazie

License

https://spdx.org/licenses/CC0-1.0.html

Description

Malaria is the leading cause of death in the African region. Data mining can help extract valuable knowledge from available data in the healthcare sector. This makes it possible to train models to predict patient health faster than in clinical trials. Various machine learning algorithms, such as K-Nearest Neighbors, Bayes Theorem, Logistic Regression, Support Vector Machines, and Multinomial Naïve Bayes (MNB), have been applied to malaria datasets from public hospitals, but there are still limitations in modeling with the multinomial Naïve Bayes algorithm. This study applies the MNB model to explore the relationship between 15 relevant attributes of public hospital data. The goal is to examine how the dependency between attributes affects the performance of the classifier. MNB creates a transparent and reliable graphical representation of the relationships between attributes, with the ability to predict new situations. The model (MNB) has 97% accuracy. It is concluded that this model outperforms the GNB classifier, which has 100% accuracy, and the RF, which also has 100% accuracy.

Methods

Prior to data collection, the researcher was guided by the ethical training certification on data collection and the right to confidentiality and privacy required by the Institutional Review Board (IRB). Data were collected from the manual archives of hospitals purposively selected using a stratified sampling technique, transformed to electronic form, and stored in a MySQL database called malaria. Each patient file was extracted and reviewed for signs and symptoms of malaria, then checked for the laboratory confirmation result from diagnosis. The data were divided into two tables: the first table, data1, contains data for use in phase 1 of the classification, while the second table, data2, contains data for use in phase 2 of the classification.

Data Source Collection

The malaria incidence dataset was obtained from public hospitals from 2017 to 2021.
These are the data used for modeling and analysis. Geographical location and socio-economic factors available for patients living in those areas are also taken into account. Multinomial Naïve Bayes is the model used to analyze the collected data for malaria disease prediction and grading.

Data Preprocessing: Data preprocessing shall be done to remove noise and outliers.

Transformation: The data shall be transformed from analog to electronic records.

Data Partitioning: The collected data will be divided into two portions: one portion shall be extracted as a training set, while the other portion will be used for testing. The training portion taken from one database table shall be called training set 1, while the training portion taken from another database table shall be called training set 2. The dataset was split into two parts: 70% for training and 30% for testing. Then, using MNB classification algorithms implemented in Python, the models were trained on the training sample. The resulting models were tested on the remaining 30% of the data, and the results were compared with those of the other machine learning models using standard metrics.

Classification and prediction: Based on the nature of the variables in the dataset, this study will use Multinomial Naïve Bayes classification in two phases, classification phase 1 and classification phase 2. The operation of the framework is as follows:
i. Data collection and preprocessing shall be done.
ii. Preprocessed data shall be stored in training set 1 and training set 2. These datasets shall be used during classification.
iii. The test data set shall be stored in the database as the test data set.
iv. Part of the test data set must be classified using classifier 1 and the remaining part must be classified with classifier 2, as follows:
Classifier phase 1: It classifies records into positive or negative classes. A patient who has malaria is classified as positive (P), while a patient who does not have malaria is classified as negative (N).
Classifier phase 2: It classifies only the records that classifier 1 has classified as positive, further dividing them into complicated and uncomplicated class labels. The classifier will also capture data on environmental factors, genetics, gender and age, and cultural and socio-economic variables. The system will be designed such that the core parameters, as determining factors, must supply their values.
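The 70/30 partition and the two-phase classification described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `TinyMultinomialNB`, `split_70_30`, `grade`, the three count features, and the toy rows are all hypothetical, and the class is a bare multinomial naive Bayes with add-one smoothing written only for this example.

```python
import math
import random


class TinyMultinomialNB:
    """Hypothetical minimal multinomial naive Bayes with add-one smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        n_features = len(rows[0])
        self.class_counts = {c: 0 for c in self.classes}
        # Per-class sum of each count feature.
        self.feature_totals = {c: [0] * n_features for c in self.classes}
        for row, label in zip(rows, labels):
            self.class_counts[label] += 1
            for j, value in enumerate(row):
                self.feature_totals[label][j] += value
        self.n_samples = len(rows)
        return self

    def predict_one(self, row):
        best_class, best_score = None, -math.inf
        for c in self.classes:
            totals = self.feature_totals[c]
            denom = sum(totals) + len(totals)  # add-one smoothing
            score = math.log(self.class_counts[c] / self.n_samples)  # log prior
            for j, value in enumerate(row):
                score += value * math.log((totals[j] + 1) / denom)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


def split_70_30(rows, labels, seed=0):
    """Shuffle the indices and return a 70% training / 30% test partition."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = len(indices) * 7 // 10
    train, test = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def grade(row, phase1, phase2):
    """Phase 1 decides P/N; phase 2 grades only the positive cases."""
    if phase1.predict_one(row) == "N":
        return "N"
    return phase2.predict_one(row)


# Toy rows (assumed features): counts of [normal findings, mild symptoms, severe symptoms].
rows = [[5, 1, 0], [4, 0, 0], [6, 1, 0],
        [1, 4, 0], [0, 5, 1], [1, 6, 0],
        [0, 2, 5], [1, 1, 6], [0, 3, 4]]
grades = ["N", "N", "N",
          "uncomplicated", "uncomplicated", "uncomplicated",
          "complicated", "complicated", "complicated"]

# The toy set is too small to hold out 30%, so both phases are fitted on all rows here.
phase1 = TinyMultinomialNB().fit(rows, ["N" if g == "N" else "P" for g in grades])
positives = [(r, g) for r, g in zip(rows, grades) if g != "N"]
phase2 = TinyMultinomialNB().fit([r for r, _ in positives], [g for _, g in positives])

for new_row in ([5, 0, 0], [1, 5, 0], [0, 2, 6]):
    print(new_row, "->", grade(new_row, phase1, phase2))
```

On a real table, `split_70_30` would be applied first and each phase fitted on the training portion only; a library implementation of multinomial naive Bayes could replace the hand-written class.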
