51 datasets found

Healthcare Diagnostics Dataset
paperswithcode.com
Updated Mar 7, 2025
Cite
(2025). Healthcare Diagnostics Dataset [Dataset]. https://paperswithcode.com/dataset/healthcare-diagnostics
Explore at:
Dataset updated
Mar 7, 2025
Description
Problem Statement

A healthcare provider faced challenges in diagnosing diseases from medical images due to the increasing volume of imaging data and the limited availability of skilled radiologists. Manual analysis of X-rays, MRIs, and CT scans was time-intensive, prone to inconsistencies, and delayed critical diagnoses. The provider needed an automated solution to assist radiologists in early disease detection and improve diagnostic efficiency.

Challenge

Automating medical image analysis came with the following challenges:

Accurately identifying subtle anomalies in medical images, which often require expert interpretation.

Ensuring the system’s reliability and compliance with stringent healthcare standards.

Integrating the solution with existing healthcare workflows without disrupting radiologists’ processes.

Solution Provided

An AI-powered diagnostic system was developed using Convolutional Neural Networks (CNN) and computer vision technologies. The solution was designed to:

Analyze medical images to detect early signs of diseases such as tumors, fractures, and infections.

Highlight areas of concern for radiologists, enabling faster decision-making.

Integrate seamlessly with hospital systems, including PACS (Picture Archiving and Communication System) and EHR (Electronic Health Records).

Development Steps

Data Collection

Compiled a diverse dataset of anonymized medical images, including X-rays, MRIs, and CT scans, along with corresponding diagnoses from expert radiologists.

Preprocessing

Normalized and annotated images to highlight regions of interest, ensuring high-quality input for model training.

Model Training

Trained a Convolutional Neural Network (CNN) to identify patterns and anomalies in medical images. Used transfer learning and augmentation techniques to enhance model robustness.

Validation

Tested the model on unseen medical images to evaluate diagnostic accuracy, sensitivity, and specificity.

Deployment

Integrated the trained AI model into the healthcare provider’s imaging systems, providing real-time diagnostic assistance.

Monitoring & Improvement

Established a feedback loop to continually update the model with new cases, improving performance over time.
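The validation step above names accuracy, sensitivity, and specificity without defining them. The sketch below shows how these three metrics are conventionally computed from binary predictions. The function name and the sample labels are hypothetical and are not taken from the case study.

```python
def diagnostic_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of diseased cases the model catches.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: share of healthy cases the model correctly clears.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical held-out labels and model outputs, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
```

Reporting sensitivity and specificity separately matters in screening settings, because a single accuracy figure can hide a high rate of missed cases when disease prevalence is low.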

Results

Increased Diagnostic Accuracy

Achieved an 18% improvement in diagnostic accuracy, reducing the likelihood of misdiagnoses.

Expedited Diagnosis Process

Automated image analysis significantly shortened the time required for diagnosis, enabling quicker treatment decisions.

Enhanced Patient Outcomes

Early and accurate disease detection improved treatment efficacy and patient recovery rates.

Reduced Radiologist Workload

The AI system alleviated the burden on radiologists by automating routine analysis, allowing them to focus on complex cases.

Scalable Solution

The system demonstrated scalability, handling large volumes of imaging data efficiently across multiple facilities.
Global Data Regulation Diagnostic Survey Dataset 2021 - Afghanistan, Angola,...
microdata.worldbank.org
catalog.ihsn.org
+1more
Updated Oct 26, 2023
Cite
World Bank (2023). Global Data Regulation Diagnostic Survey Dataset 2021 - Afghanistan, Angola, Argentina...and 77 more [Dataset]. https://microdata.worldbank.org/index.php/catalog/3866
Explore at:
Dataset updated
Oct 26, 2023
Dataset authored and provided by
World Bank: http://worldbank.org/
Time period covered
2020
Area covered
Angola, Afghanistan, Argentina...and 77 more
Description
Abstract

The Global Data Regulation Diagnostic provides a comprehensive assessment of the quality of the data governance environment. Diagnostic results show that countries have put in greater effort in adopting enabler regulatory practices than in safeguard regulatory practices. However, the regulatory development of both enablers and safeguards remains at an intermediate stage: 47 percent of enabler good practices and 41 percent of good safeguard practices are adopted across countries. Under the enabler and safeguard pillars, the diagnostic covers dimensions of e-commerce/e-transactions, enablers for public intent data, enablers for private intent data, safeguards for personal and nonpersonal data, cybersecurity and cybercrime, as well as cross-border data flows. Across all these dimensions, no income group demonstrates advanced regulatory frameworks, indicating significant room for further improvement of the data governance environment.

The Global Data Regulation Diagnostic is the first comprehensive assessment of laws and regulations on data governance. It covers enabler and safeguard regulatory practices in 80 countries, providing indicators to assess and compare their performance. This Global Data Regulation Diagnostic develops objective and standardized indicators to measure the regulatory environment for the data economy across countries. The indicators aim to serve as a diagnostic tool so countries can assess and compare their performance vis-à-vis other countries. Understanding the gap with global regulatory good practices is a necessary first step for governments when identifying and prioritizing reforms.

Geographic coverage

80 countries

Analysis unit

Country

Kind of data

Observation data/ratings [obs]

Sampling procedure

The diagnostic is based on a detailed assessment of domestic laws, regulations, and administrative requirements in 80 countries selected to ensure balanced coverage across income groups, regions, and different levels of digital technology development. Data are further verified through detailed desk research of legal texts, reflecting the regulatory status of each country as of June 1, 2020.

Mode of data collection

Mail Questionnaire [mail]

Research instrument

The questionnaire comprises 37 questions designed to determine if a country has adopted good regulatory practice on data governance. The responses are then scored and assigned a normative interpretation. Related questions fall into seven clusters so that when the scores are averaged, each cluster provides an overall sense of how it performs in its corresponding regulatory and legal dimensions. These seven dimensions are: (1) E-commerce/e-transaction; (2) Enablers for public intent data; (3) Enablers for private intent data; (4) Safeguards for personal data; (5) Safeguards for nonpersonal data; (6) Cybersecurity and cybercrime; (7) Cross-border data transfers.
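The scoring described above (score each response, then average related questions within a cluster) can be sketched as follows. This is only an illustration: the question IDs, the score values, and the question-to-cluster mapping are hypothetical, and the source does not state the actual scoring scale.

```python
from collections import defaultdict
from statistics import mean


def cluster_scores(question_scores, question_cluster):
    """Average per-question scores within each regulatory cluster."""
    grouped = defaultdict(list)
    for question, score in question_scores.items():
        grouped[question_cluster[question]].append(score)
    return {cluster: mean(values) for cluster, values in grouped.items()}


# Hypothetical scores for one country (scale assumed, not from the source).
scores = {"Q1": 1.0, "Q2": 0.5, "Q3": 0.0, "Q4": 1.0}
# Hypothetical assignment of questions to two of the seven dimensions.
clusters = {
    "Q1": "Safeguards for personal data",
    "Q2": "Safeguards for personal data",
    "Q3": "Cross-border data transfers",
    "Q4": "Cross-border data transfers",
}
result = cluster_scores(scores, clusters)
```

Averaging within clusters keeps dimensions with many questions from dominating a country's overall profile.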

Response rate

100%
Multiple Myeloma Dataset (MM-dataset)
data.mendeley.com
Updated Dec 23, 2019
Cite
Multiple Myeloma Dataset (MM-dataset) [Dataset]. https://data.mendeley.com/datasets/7wpcv7kp6f/1
Explore at:
Unique identifier
https://doi.org/10.17632/7wpcv7kp6f.1
Dataset updated
Dec 23, 2019
Authors
Rima GUILAL
License
Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Description
The Multiple Myeloma dataset (MM-dataset) is a new multi-class database with 59 features for 203 patient records, categorized into 9 labels (stages of MM cancer) assigned by hematology specialists. It is made public to allow comparative experiments with other research works.

Multiple Myeloma (MM) is a type of blood cancer that affects the plasma cells in bone marrow. Its diagnosis is difficult in the early stage and depends on several medical exams and tests, so the process is very long and can discourage patients. This may be the principal problem.

In the literature, all the research proposed to assist medical diagnosis of multiple myeloma (MM) is based on genetic databases. We therefore propose our new dataset, which contains the results of different MM diagnosis exams and can be used to detect clinical and para-clinical factors for the diagnosis of MM.
Diabetes Diagnosis Dataset
kaggle.com
Updated Jul 31, 2024
Cite
Abhay Ayare (2024). Diabetes Diagnosis Dataset [Dataset]. https://www.kaggle.com/datasets/abhayayare/diabetes-diagnosis-dataset
Explore at:
Dataset updated
Jul 31, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Abhay Ayare
Description
This dataset provides comprehensive information about patients' demographics and their corresponding medical metrics related to diabetes. It is designed for use in predictive analysis, machine learning models, and educational purposes. The data includes patient IDs, age, gender, country, blood glucose levels, insulin levels, and diagnosis results.
Data from: A Diagnostic Approach for Electro-Mechanical Actuators in...
catalog.data.gov
datadiscoverystudio.org
+2more
Updated Apr 11, 2025
Cite
Dashlink (2025). A Diagnostic Approach for Electro-Mechanical Actuators in Aerospace Systems [Dataset]. https://catalog.data.gov/dataset/a-diagnostic-approach-for-electro-mechanical-actuators-in-aerospace-systems
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description
Electro-mechanical actuators (EMA) are finding increasing use in aerospace applications, especially with the trend towards all-electric aircraft and spacecraft designs. However, electro-mechanical actuators still lack the knowledge base accumulated for other fielded actuator types, particularly with regard to fault detection and characterization. This paper presents a thorough analysis of some of the critical failure modes documented for EMAs and describes experiments conducted on detecting and isolating a subset of them. The list of failures has been prepared through an extensive Failure Modes and Criticality Analysis (FMECA) reference, literature review, and accessible industry experience. Methods for data acquisition and validation of algorithms on EMA test stands are described. A variety of condition indicators were developed that enabled detection, identification, and isolation among the various fault modes. A diagnostic algorithm based on an artificial neural network is shown to operate successfully using these condition indicators. Furthermore, the robustness of these diagnostic routines to sensor faults is demonstrated by showing their ability to distinguish between sensor faults and component failures. The paper concludes with a roadmap leading from this effort towards developing successful prognostic algorithms for electromechanical actuators.
Data from: Integrated Diagnostic/Prognostic Experimental Setup for Capacitor...
data.nasa.gov
s.cnmilf.com
+2more
Updated Mar 31, 2025
Cite
data.nasa.gov (2025). Integrated Diagnostic/Prognostic Experimental Setup for Capacitor Degradation and Health Monitoring [Dataset]. https://data.nasa.gov/dataset/integrated-diagnostic-prognostic-experimental-setup-for-capacitor-degradation-and-health-m
Explore at:
Dataset updated
Mar 31, 2025
Dataset provided by
NASA: http://nasa.gov/
Description
This paper proposes experiments and setups for studying diagnosis and prognosis of electrolytic capacitors in DC-DC power converters. Electrolytic capacitors and power MOSFETs have higher failure rates than other components in DC-DC converter systems. Currently, our work focuses on experimental analysis and modeling of electrolytic capacitor degradation and its effects on the output of DC-DC converter systems. The output degradation is typically measured by the increase in equivalent series resistance and decrease in capacitance, leading to output ripple currents. Typically, the ripple current effects dominate, and they can have adverse effects on downstream components. A model-based approach to studying degradation phenomena enables us to combine the physics-based modeling of the DC-DC converter with physics-of-failure models of capacitor degradation, and predict using stochastic simulation methods how system performance deteriorates with time. Degradation experiments were conducted where electrolytic capacitors were subjected to electrical and thermal stress to accelerate the aging of the system. This more systematic analysis may provide a more general and accurate method for computing the remaining useful life (RUL) of the component and the converter system.
Major Diagnostic Categories Summary
data.ca.gov
data.chhs.ca.gov
+1more
csv, docx, zip
Updated Aug 29, 2024
Cite
Department of Health Care Access and Information (2024). Major Diagnostic Categories Summary [Dataset]. https://data.ca.gov/dataset/major-diagnostic-categories-summary
Explore at:
Available download formats: docx, csv, zip
Dataset updated
Aug 29, 2024
Dataset authored and provided by
Department of Health Care Access and Information
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset provides the adjusted length of stay, type of care, discharges with valid charges, charges by hospital, licensure of bed, and Major Diagnostic Category (MDC).
IPPS for all Diagnosis Related Groups - FY 2017
kaggle.com
zip
Updated Aug 30, 2019
Cite
Isaac (2019). IPPS for all Diagnosis Related Groups - FY 2017 [Dataset]. https://www.kaggle.com/datasets/isaacmo1/ipps-for-all-diagnosis-related-groups-fy-2017
Explore at:
Available download formats: zip (5051470 bytes)
Dataset updated
Aug 30, 2019
Authors
Isaac
License
https://www.usa.gov/government-works/
Description
This data set and initial kernel are my first data set and analysis done in R. I’m using the data from data.CMS.Gov for the Inpatient Prospective Payment System Provider Summary for All Diagnosis-Related Groups (DRG) - FY 2017. This data was published on the day that I started the analysis, 8/28/2019. It can be found here.

I recently started to teach myself R through books and online tutorials, and the goal is to use these skills to stand out in my job and create future opportunities by leveraging data science skills. Eventually, I want to delve into Machine Learning algorithms and how to apply them in the healthcare space. For now, the low-hanging fruit is data analysis in a field I’m intimately familiar with.

A number of the code chunks here are mimicked from books or from others online that have done similar analysis. As I’m learning, I’m trying to emulate some of the best practices and techniques that others are using until I get familiar enough to understand and apply them to more unique problems. This is part of the reason why I’ve chosen a data set that was released today.

Any constructive critiques or suggestions are greatly appreciated. Also, if you can do more with the data, it would be greatly appreciated as I'll be able to see how others process and analyze the data.
ViMedical_Disease Dataset
paperswithcode.com
Updated Jul 27, 2024
Cite
(2024). ViMedical_Disease Dataset [Dataset]. https://paperswithcode.com/dataset/vimedical-disease
Explore at:
Dataset updated
Jul 27, 2024
Description
This dataset contains over 12K questions and symptoms related to various common diseases in Vietnamese. It is designed to aid in the classification of medical symptoms and provide preliminary disease identification. The dataset covers a wide range of diseases, including cardiovascular, digestive, neurological, dermatological, endocrine, and others.

For more information and updates about the dataset, please refer to the main repository here.

This dataset can be used for:

Data analysis

Building disease prediction models

Creating chatbots

Providing information to users

The dataset has two columns:

Disease: The name of the disease in Vietnamese.

Question: Questions and descriptions of disease symptoms in Vietnamese, often posed as a query seeking information about a possible diagnosis.
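As a rough sketch of working with the two-column layout described above, the snippet below groups questions by disease using only the Python standard library. It assumes the data is distributed as a CSV file whose header row uses exactly the names Disease and Question; the sample rows are placeholders, not real records.

```python
import csv
import io
from collections import defaultdict


def questions_by_disease(handle):
    """Group the Question column by the Disease column."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        grouped[row["Disease"]].append(row["Question"])
    return dict(grouped)


# Placeholder rows standing in for the real file (assumed CSV with a header row).
sample = io.StringIO(
    "Disease,Question\n"
    "disease_a,placeholder question 1\n"
    "disease_a,placeholder question 2\n"
    "disease_b,placeholder question 3\n"
)
grouped = questions_by_disease(sample)
```

Checking the number of questions per disease in this way is a quick first look at class balance before training a classifier.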

Important Notes: This dataset provides information on disease symptoms, not official medical diagnoses. Users should consult a doctor for proper diagnosis and treatment.
Inpatient Prospective Payment System (IPPS) Provider Summary for the Top 100...
catalog.data.gov
data.virginia.gov
+2more
Updated Sep 6, 2023
Cite
Department of Health & Human Services (2023). Inpatient Prospective Payment System (IPPS) Provider Summary for the Top 100 Diagnosis-Related Groups (DRG) [Dataset]. https://catalog.data.gov/dataset/inpatient-prospective-payment-system-ipps-provider-summary-for-the-top-100-diagnosis-relat
Explore at:
Dataset updated
Sep 6, 2023
Dataset provided by
United States Department of Health and Human Services: http://www.hhs.gov/
Description
A provider level summary of Inpatient Prospective Payment System (IPPS) discharges, average charges and average Medicare payments for the Top 100 Diagnosis-Related Groups (DRG)
Model-based Diagnostics for Propellant Loading Systems - Dataset - NASA Open...
data.staging.idas-ds1.appdat.jsc.nasa.gov
data.nasa.gov
Updated Feb 19, 2025
Cite
nasa.gov (2025). Model-based Diagnostics for Propellant Loading Systems - Dataset - NASA Open Data Portal [Dataset]. https://data.staging.idas-ds1.appdat.jsc.nasa.gov/dataset/model-based-diagnostics-for-propellant-loading-systems
Explore at:
Dataset updated
Feb 19, 2025
Dataset provided by
NASA: http://nasa.gov/
Description
The loading of spacecraft propellants is a complex, risky operation. Therefore, diagnostic solutions are necessary to quickly identify when a fault occurs, so that recovery actions can be taken or an abort procedure can be initiated. Model-based diagnosis solutions, established using an in-depth analysis and understanding of the underlying physical processes, offer the advanced capability to quickly detect and isolate faults, identify their severity, and predict their effects on system performance. We develop a physics-based model of a cryogenic propellant loading system, which describes the complex dynamics of liquid hydrogen filling from a storage tank to an external vehicle tank, as well as the influence of different faults on this process. The model takes into account the main physical processes such as highly non-equilibrium condensation and evaporation of the hydrogen vapor, pressurization, and also the dynamics of liquid hydrogen and vapor flows inside the system in the presence of helium gas. Since the model incorporates multiple faults in the system, it provides a suitable framework for model-based diagnostics and prognostics algorithms. Using this model, we analyze the effects of faults on the system, derive symbolic fault signatures for the purposes of fault isolation, and perform fault identification using a particle filter approach. We demonstrate the detection, isolation, and identification of a number of faults using simulation-based experiments.
Data from: An Event-based Distributed Diagnosis Framework using Structural...
catalog.data.gov
datasets.ai
+2more
Updated Apr 11, 2025
Cite
Dashlink (2025). An Event-based Distributed Diagnosis Framework using Structural Model Decomposition [Dataset]. https://catalog.data.gov/dataset/an-event-based-distributed-diagnosis-framework-using-structural-model-decomposition
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description
Complex engineering systems require efficient on-line fault diagnosis methodologies to improve safety and reduce maintenance costs. Traditionally, diagnosis approaches are centralized, but these solutions do not scale well. Also, centralized diagnosis solutions are difficult to implement on increasingly prevalent distributed, networked embedded systems. This paper presents a distributed diagnosis framework for physical systems with continuous behavior. Using Possible Conflicts, a structural model decomposition method from the Artificial Intelligence model-based diagnosis (DX) community, we develop a distributed diagnoser design algorithm to build local event-based diagnosers. These diagnosers are constructed based on global diagnosability analysis of the system, enabling them to generate local diagnosis results that are globally correct without the use of a centralized coordinator. We also use Possible Conflicts to design local parameter estimators that are integrated with the local diagnosers to form a comprehensive distributed diagnosis framework. Hence, this is a fully distributed approach to fault detection, isolation, and identification. We evaluate the developed scheme on a four-wheeled rover for different design scenarios to show the advantages of using Possible Conflicts, and generate on-line diagnosis results in simulation to demonstrate the approach.
Dataset for: Quantifying how diagnostic test accuracy depends on threshold...
search.datacite.org
Updated Jul 31, 2019
Cite
Hayley Elizabeth Jones; Constantine Gatsonis; Thomas A Trikalinos; Nicky J Welton; Tony Ades (2019). Dataset for: Quantifying how diagnostic test accuracy depends on threshold in a meta-analysis [Dataset]. http://doi.org/10.6084/m9.figshare.8267015
Explore at:
Unique identifier
https://doi.org/10.6084/m9.figshare.8267015
Dataset updated
Jul 31, 2019
Dataset provided by
DataCite: https://www.datacite.org/
Wiley
Authors
Hayley Elizabeth Jones; Constantine Gatsonis; Thomas A Trikalinos; Nicky J Welton; Tony Ades
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Tests for disease often produce a continuous measure, such as the concentration of some biomarker in a blood sample. In clinical practice, a threshold C is selected such that results, say, greater than C are declared positive, and those less than C negative. Measures of test accuracy such as sensitivity and specificity depend crucially on C, and the optimal value of this threshold is usually a key question for clinical practice. Standard methods for meta-analysis of test accuracy (i) do not provide summary estimates of accuracy at each threshold, precluding selection of the optimal threshold, and further (ii) do not make use of all available data. We describe a multinomial meta-analysis model that can take any number of pairs of sensitivity and specificity from each study and explicitly quantifies how accuracy depends on C. Our model assumes that some pre-specified or Box-Cox transformation of test results in the diseased and disease-free populations has a logistic distribution. The Box-Cox transformation parameter can be estimated from the data, allowing for a flexible range of underlying distributions. We parameterise in terms of the means and scale parameters of the two logistic distributions. In addition to credible intervals for the pooled sensitivity and specificity across all thresholds, we produce prediction intervals, allowing for between-study heterogeneity in all parameters. We demonstrate the model using two case study meta-analyses, examining the accuracy of tests for acute heart failure and pre-eclampsia. We show how the model can be extended to explore reasons for heterogeneity using study-level covariates.
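To make the dependence on the threshold C concrete, the sketch below evaluates sensitivity and specificity at a single threshold under the distributional assumption the abstract states: a Box-Cox transform of test results is logistic in each population, and results greater than C are declared positive. It is a single-study illustration, not the multinomial meta-analysis model itself. All function names, parameter names, and numeric values are hypothetical.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value; lam = 0 gives the natural log."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def logistic_cdf(z):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def accuracy_at_threshold(c, mu_d, s_d, mu_n, s_n, lam=0.0):
    """Sensitivity and specificity when results greater than c are positive.

    mu_d, s_d: mean and scale of transformed results in the diseased group.
    mu_n, s_n: mean and scale of transformed results in the disease-free group.
    """
    g = box_cox(c, lam)
    sensitivity = 1.0 - logistic_cdf((g - mu_d) / s_d)  # P(result > c | diseased)
    specificity = logistic_cdf((g - mu_n) / s_n)        # P(result <= c | disease-free)
    return sensitivity, specificity


# Hypothetical parameters on the log scale (lam = 0), for illustration only.
for threshold in (math.exp(4.5), math.exp(5.0), math.exp(5.5)):
    sens, spec = accuracy_at_threshold(threshold, 6.0, 0.5, 4.0, 0.5)
    print(f"C = {threshold:7.1f}  sensitivity = {sens:.3f}  specificity = {spec:.3f}")
```

Raising C trades sensitivity for specificity, which is why a summary estimate at every threshold is needed before an optimal value can be chosen.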
‘Diagnostic Procedure Codes ( Procedure Group 1119)’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Diagnostic Procedure Codes ( Procedure Group 1119)’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-diagnostic-procedure-codes-procedure-group-1119-f4eb/latest
Explore at:
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Diagnostic Procedure Codes ( Procedure Group 1119)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/0116d7da-b5d7-4436-ad4e-d93a5c4eb428 on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Most codes in this group do not appear on the Prioritized List of Health Services; however, they are covered when billed with a diagnosis code from OHP's Diagnostic Workup File (Code Group 6032). This list does not guarantee coverage. For specific coverage information, call the Provider Services Unit at 1-800-336-6016.

--- Original source retains full ownership of the source dataset ---
Data from: Improving Distributed Diagnosis Through Structural Model...
catalog.data.gov
data.staging.idas-ds1.appdat.jsc.nasa.gov
+1more
Updated Apr 10, 2025
Cite
Dashlink (2025). Improving Distributed Diagnosis Through Structural Model Decomposition [Dataset]. https://catalog.data.gov/dataset/improving-distributed-diagnosis-through-structural-model-decomposition
Explore at:
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
Complex engineering systems require efficient fault diagnosis methodologies, but centralized approaches do not scale well, and this motivates the development of distributed solutions. This work presents an event-based approach for distributed diagnosis of abrupt parametric faults in continuous systems, by using the structural model decomposition capabilities provided by Possible Conflicts. We develop a distributed diagnosis algorithm that uses residuals, computed by extending Possible Conflicts, to build local event-based diagnosers based on global diagnosability analysis that generate globally correct local diagnosis results. The proposed approach is applied to a multi-tank system, and results demonstrate an improvement in the design of local diagnosers. Since local diagnosers use only a subset of the residuals, and use subsystem models to compute residuals (instead of the global system model), the local diagnosers are more efficient than previously developed distributed approaches.
Simulation to optimize the laboratory diagnosis of bacteremia: event times...
data.niaid.nih.gov
zenodo.org
Updated Apr 18, 2024
Cite
Gerada, Alessandro (2024). Simulation to optimize the laboratory diagnosis of bacteremia: event times dataset from observational and retrospective studies [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10991329
Explore at:
Dataset updated
Apr 18, 2024
Dataset provided by
Roberts, Gareth
Gerada, Alessandro
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction

Dataset consisting of three .csv files generated from observational and retrospective data analysis for the manuscript titled: Simulation to optimize the laboratory diagnosis of bacteremia, by Gerada et al. Dataset consists of event time-stamps for blood cultures. All identifiers have been stripped.

Dataset summary

observational_data.csv – time-stamps generated from direct observation of blood culture processing. Note that data is in wide format, so that column names contain the event name. Also, events are not related to a particular specimen, therefore, column lengths are not equal.

retrospecitve_audit.csv – time-stamps generated from manual retrospective audit of laboratory information system. Column Lab indicates the incubator. Rows are per specimen.

retrospective_data.csv – time-stamps pulled automatically from laboratory information system (without manual curation). Rows are per specimen.
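Because observational_data.csv is described as wide format with columns of unequal length, reading it row by row would pair unrelated events. A minimal sketch of reading each column independently is shown below. The column names and timestamps in the sample are hypothetical, and blank cells are assumed to mark the end of a shorter column.

```python
import csv
import io


def read_wide_columns(handle):
    """Read a wide-format CSV into one list per column, skipping blank cells."""
    reader = csv.reader(handle)
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, cell in zip(header, row):
            if cell.strip():  # shorter columns are assumed to be padded with blanks
                columns[name].append(cell.strip())
    return columns


# Hypothetical event columns and timestamps standing in for the real file.
sample = io.StringIO(
    "collected,loaded\n"
    "2021-01-01 08:00,2021-01-01 09:10\n"
    "2021-01-01 08:30,\n"
)
columns = read_wide_columns(sample)
```

Keeping each event as its own list matches the note that events are not tied to a particular specimen, so values in the same row should not be treated as a matched pair.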
Data from: Distributed Diagnosis in Uncertain Environments Using Dynamic...
catalog.data.gov
datasets.ai
+4more
Updated Apr 10, 2025
Cite
Dashlink (2025). Distributed Diagnosis in Uncertain Environments Using Dynamic Bayesian Networks [Dataset]. https://catalog.data.gov/dataset/distributed-diagnosis-in-uncertain-environments-using-dynamic-bayesian-networks
Explore at:
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
This paper presents a distributed Bayesian fault diagnosis scheme for physical systems. Our diagnoser design is based on a procedure for factoring the global system bond graph (BG) into a set of structurally observable bond graph factors (BG-Fs). Each BG-F is systematically translated into a corresponding DBN Factor (DBN-F), which is then used in its corresponding local diagnoser for quantitative fault detection, isolation, and identification. By construction, the random variables in each DBN-F are conditionally independent of the random variables in all other DBN-Fs, given a subset of communicated measurements considered as system inputs. Each DBN-F and BG-F pair is used to derive a local diagnoser that generates globally correct diagnosis results by local analysis. Together, the local diagnosers diagnose all single faults of interest in the system. We demonstrate on an electrical system how our distributed diagnosis scheme is computationally more efficient than its centralized counterpart, but without compromising the accuracy of the diagnosis results.
HES-DID Data Linkage Report
digital.nhs.uk
pdf
Updated Jul 7, 2016
Cite
(2016). HES-DID Data Linkage Report [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/hes-did-data-linkage-report
Explore at:
Available download formats: pdf (210.8 kB), pdf (165.5 kB)
Dataset updated
Jul 7, 2016
License
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Time period covered
Apr 1, 2015 - Feb 29, 2016
Area covered
England
Description
This is the latest statistical publication of linked HES (Hospital Episode Statistics) and DID (Diagnostic Imaging Dataset) data held by the Health and Social Care Information Centre. The HES-DID linkage provides the ability to undertake national (within England) analysis along acute patient pathways to understand typical imaging requirements for given procedures, and/or the outcomes after particular imaging has been undertaken, thereby enabling a much deeper understanding of outcomes of imaging and to allow assessment of variation in practice. This publication aims to highlight to users the availability of this updated linkage and provide users of the data with some standard information to assess their analysis approach against. The two data sets have been linked using specific patient identifiers collected in HES and DID. The linkage allows the data sets to be linked from April 2012 when the DID data was first collected; however this report focuses on patients who were present in either data set for the period April 2015-February 2016 only. For DID this is provisional 2015/16 data. For HES this is provisional 2015/16 data. The linkage used for this publication was created on 06 June 2016 and released together with this publication on 07 July 2016.
‘Anemia Diagnosis Dataset’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Anemia Diagnosis Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-anemia-diagnosis-dataset-93fe/7d52fab6/?iid=003-489&v=presentation
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Anemia Diagnosis Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/saurabhshahane/anemia-diagnosis-dataset on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

This data set presents the prevalence of different types of anemia, including its severity and association with the age and gender of the study population, with CBC data set parameters as variables. We generated the dataset from complete blood count (CBC) tests performed by a Hematology analyzer to determine the prevalence of different types of anemia treated at the Eureka diagnostic center in Lucknow, India. All the procedures for the CBC test were done following standard operating protocols defined for the Hematology analyzer. For the CBC investigation, 400 patient samples were randomly selected to compute the dataset from the patients who visited the Eureka diagnostic center in Lucknow for various clinical examinations. The diagnostic center performs 4–8 CBC investigations a day on average. During the data collection period from September 2020 to December 2020, 1000 CBC investigations were performed, out of which 400 random samples were selected. We included adult males and females who are not pregnant and older than 15 years of age in the study population. Infants, young children less than 10 years old and pregnant women were excluded from the study due to factors such as variable CBC test values. After excluding the above stated persons from the randomly chosen sample of 400 patients, we were left with 364 patients in the final data set.

Acknowledgements

Vohra, Rajan; pahareeya, jankisharan; Hussain, Abir (2021), “Complete Blood Count Anemia Diagnosis”, Mendeley Data, V1, doi: 10.17632/dy9mfjchm7.1

--- Original source retains full ownership of the source dataset ---
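The sample sizes quoted in the Context paragraph above can be checked with a few lines of Python. This is only a sketch of the arithmetic and of simple random sampling without replacement; the seed and the integer IDs are arbitrary assumptions, and the source does not say how its random selection was actually performed.

```python
import random

# Figures quoted in the dataset description.
TOTAL_INVESTIGATIONS = 1000  # CBC investigations during the collection period
SAMPLE_SIZE = 400            # randomly selected patient samples
FINAL_SIZE = 364             # patients left after exclusions

# Simple random sampling without replacement over hypothetical record IDs.
rng = random.Random(0)  # arbitrary seed, used only for repeatability
sampled_ids = rng.sample(range(TOTAL_INVESTIGATIONS), SAMPLE_SIZE)

excluded = SAMPLE_SIZE - FINAL_SIZE
sampling_fraction = SAMPLE_SIZE / TOTAL_INVESTIGATIONS
retained_fraction = FINAL_SIZE / SAMPLE_SIZE

print(f"sampled {len(sampled_ids)} of {TOTAL_INVESTIGATIONS} ({sampling_fraction:.0%})")
print(f"excluded {excluded}, retained {retained_fraction:.0%} of the sample")
```

On these figures, 36 of the 400 sampled patients were excluded, leaving 91% of the sample in the final data set.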
Medical Imaging (CT-Xray) Colorization New Dataset
kaggle.com
Updated Mar 18, 2025
Shuvo Kumar Basak-4004.o (2025). Medical Imaging (CT-Xray) Colorization New Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/11072909
Unique identifier
https://doi.org/10.34740/kaggle/dsv/11072909
Dataset updated
Mar 18, 2025
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Shuvo Kumar Basak-4004.o
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Medical Imaging (CT-Xray) Colorization New Dataset 🩺💻🖼️

This dataset provides a collection of medical imaging data, including both CT (Computed Tomography) and X-ray images, with an added focus on colorization techniques. The goal of this dataset is to facilitate the enhancement of diagnostic processes by applying various colorization techniques to grayscale medical images, allowing researchers and machine learning models to explore the effects of color in radiology.

Key Features:

CT and X-ray Images 🏥: Contains both CT scans and X-ray images, widely used in medical diagnostics.
Colorized Medical Images 🌈: Each image has been colorized using advanced methods to improve visual interpretation and analysis, including details that might not be immediately obvious in grayscale images.
New Dataset 📊: This dataset is newly created to provide high-quality colorized medical imaging, ideal for training AI models in medical image analysis and enhancing diagnostic accuracy.

Methods Used for Colorization:

Basic Color Map Application 🎨: Applying standard color maps to highlight structures in CT and X-ray images.
Adaptive Histogram Equalization (CLAHE) 🔍: Adaptive enhancement to improve contrast and highlight important features, especially in medical contexts.
Contrast Stretching 📈: Adjusting image intensity to enhance visual details and improve diagnostic quality.
Gaussian Blur 🌀: Applied to reduce noise, offering a smoother image for better processing.
Edge Detection (Canny) ✨: Detecting edges and contours, useful for identifying specific features in medical scans.
Random Color Palettes 🎨: Using randomized color schemes for unique visual representations.
Gamma Correction 🌟: Adjusting image brightness to reveal more information hidden in the shadows.
LUT (Lookup Table) Color Mapping 💡: Applying predefined color lookups for visually appealing representations.
Alpha Blending 🔶: Blending colorized regions based on certain thresholds to highlight structures or anomalies.
3D Rendering 🔺: For creating 3D-like visualizations from 2D scans.
Heatmap Visualization 🔥: Highlighting areas of interest, such as anomalies or tumors, using heatmap color gradients.
Interactive Segmentation 🖱️: Interactive visualizations that help in segmenting regions of interest in medical images.

Applications 🏥💡

This dataset has numerous applications, particularly in the field of medical image analysis, AI development, and diagnostic improvement. Some of the major applications include:

Medical Diagnostics Enhancement 🔍:

Colorization can aid radiologists in interpreting CT and X-ray images by making abnormalities more visible. Helps in visualizing tumors, fractures, or other anomalies, especially in cases where grayscale images are hard to interpret.

AI and Machine Learning for Healthcare 🤖:

Used for training deep learning models in image segmentation, detection, and classification of diseases (e.g., cancer detection). AI models can be trained on these colorized images to improve accuracy in diagnostic tools, leading to early disease detection.

Medical Image Enhancement 🖼️:

Enables improved contrast, better detail visibility, and highlighting of specific anatomical regions using color. Colorization may improve the accuracy of radiological assessments by allowing professionals to more easily spot abnormalities and changes over time.

Data Augmentation for Model Training 📚:

The colorized images can serve as an additional data source for training AI models, increasing model robustness through synthetic data generation. Various colorization methods (like heatmaps and random palettes) can be used to augment image variations, improving model performance under different conditions.

Visualizing Anomalies for Anomaly Detection 🔥:

Heatmap visualization helps detect subtle and hidden anomalies by coloring the areas of interest with intensity, enabling faster identification of potential issues. Edge detection and segmentation techniques enhance the ability to detect the edges and boundaries of tumors, fractures, and other critical features.

3D Image Rendering for Detailed Analysis 🧠:

3D rend...
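Three of the colorization methods named in this entry (contrast stretching, gamma correction, and LUT color mapping) can be sketched in plain Python on a tiny grayscale grid. This is an illustrative sketch only: the function names, the 2x2 sample image, the gamma value, and the blue-to-red lookup table are assumptions, and the dataset page does not state which implementation or parameters were actually used.

```python
def contrast_stretch(img):
    """Linearly rescale pixel values so they span the full 0..255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        # A flat image has no contrast to stretch.
        return [[0 for _ in row] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def gamma_correct(img, gamma):
    """Apply out = 255 * (in / 255) ** gamma; gamma < 1 brightens dark regions."""
    return [[round(255 * (p / 255) ** gamma) for p in row] for row in img]


def build_lut():
    """Hypothetical 256-entry lookup table running from blue (0) to red (255)."""
    return [(v, 0, 255 - v) for v in range(256)]


def apply_lut(img, lut):
    """Replace each gray level with the RGB triple stored in the lookup table."""
    return [[lut[p] for p in row] for row in img]


# Made-up 2x2 grayscale "scan" with values in 0..255.
gray = [[50, 100], [150, 200]]

stretched = contrast_stretch(gray)           # full-range intensities
brightened = gamma_correct(stretched, 0.5)   # lift the darker pixels
colorized = apply_lut(brightened, build_lut())

print(stretched)
print(brightened)
print(colorized)
```

In practice these operations would normally be applied with an image library to full-size arrays; the nested-list version here just makes each per-pixel formula explicit.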


(2025). Healthcare Diagnostics Dataset [Dataset]. https://paperswithcode.com/dataset/healthcare-diagnostics

Healthcare Diagnostics Dataset


Dataset updated

Mar 7, 2025

Description

Problem Statement


A healthcare provider faced challenges in diagnosing diseases from medical images due to the increasing volume of imaging data and the limited availability of skilled radiologists. Manual analysis of X-rays, MRIs, and CT scans was time-intensive, prone to inconsistencies, and delayed critical diagnoses. The provider needed an automated solution to assist radiologists in early disease detection and improve diagnostic efficiency.

Challenge

Automating medical image analysis came with the following challenges:

Accurately identifying subtle anomalies in medical images, which often require expert interpretation.

Ensuring the system’s reliability and compliance with stringent healthcare standards.

Integrating the solution with existing healthcare workflows without disrupting radiologists’ processes.

Solution Provided

An AI-powered diagnostic system was developed using Convolutional Neural Networks (CNN) and computer vision technologies. The solution was designed to:

Analyze medical images to detect early signs of diseases such as tumors, fractures, and infections.

Highlight areas of concern for radiologists, enabling faster decision-making.

Integrate seamlessly with hospital systems, including PACS (Picture Archiving and Communication System) and EHR (Electronic Health Records).

Development Steps

Data Collection

Compiled a diverse dataset of anonymized medical images, including X-rays, MRIs, and CT scans, along with corresponding diagnoses from expert radiologists.

Preprocessing

Normalized and annotated images to highlight regions of interest, ensuring high-quality input for model training.

Model Training

Trained a Convolutional Neural Network (CNN) to identify patterns and anomalies in medical images. Used transfer learning and augmentation techniques to enhance model robustness.

Validation

Tested the model on unseen medical images to evaluate diagnostic accuracy, sensitivity, and specificity.

Deployment

Integrated the trained AI model into the healthcare provider’s imaging systems, providing real-time diagnostic assistance.

Monitoring & Improvement

Established a feedback loop to continually update the model with new cases, improving performance over time.
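The Validation step above names diagnostic accuracy, sensitivity, and specificity as its evaluation measures. A minimal sketch of how these are computed from confusion-matrix counts follows; the counts are made up purely for illustration and are not results from this case study, and `diagnostic_metrics` is a hypothetical helper name.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of diseased cases the model flags (true positive rate).
        "sensitivity": tp / (tp + fn),
        # Share of healthy cases the model clears (true negative rate).
        "specificity": tn / (tn + fp),
    }


# Made-up counts for 100 diseased and 100 healthy test images.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting sensitivity and specificity alongside accuracy matters when disease prevalence is low, because a model can reach high accuracy while still missing most positive cases.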

Results

Increased Diagnostic Accuracy

Achieved an 18% improvement in diagnostic accuracy, reducing the likelihood of misdiagnoses.

Expedited Diagnosis Process

Automated image analysis significantly shortened the time required for diagnosis, enabling quicker treatment decisions.

Enhanced Patient Outcomes

Early and accurate disease detection improved treatment efficacy and patient recovery rates.

Reduced Radiologist Workload

The AI system alleviated the burden on radiologists by automating routine analysis, allowing them to focus on complex cases.

Scalable Solution

The system demonstrated scalability, handling large volumes of imaging data efficiently across multiple facilities.

