https://www.pioneerdatahub.co.uk/data/data-request-process/
OMOP dataset: Hospital COVID patients: severity, acuity, therapies, outcomes. Dataset number 2.0
Coronavirus disease 2019 (COVID-19) was identified in January 2020. Currently, there have been more than 65 million cases & more than 1.5 million deaths worldwide. Some individuals experience severe manifestations of infection, including viral pneumonia, adult respiratory distress syndrome (ARDS) & death. There is a pressing need for tools to stratify patients, to identify those at greatest risk. Acuity scores are composite scores which help identify patients who are more unwell to support & prioritise clinical care. There are no validated acuity scores for COVID-19 & it is unclear whether standard tools are accurate enough to provide this support. This secondary care COVID OMOP dataset contains granular demographic, morbidity, serial acuity and outcome data to inform risk prediction tools in COVID-19.
PIONEER geography: The West Midlands (WM) has a population of 5.9 million & includes a diverse ethnic & socio-economic mix. There is a higher than average percentage of minority ethnic groups. WM has a large number of elderly residents but is the youngest population in the UK. Each day >100,000 people are treated in hospital, see their GP or are cared for by the NHS. The West Midlands was one of the hardest hit regions for COVID admissions in both wave 1 & 2.
EHR. University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds & 100 ITU beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”. UHB has cared for >5000 COVID admissions to date. This is a subset of data in OMOP format.
Scope: All COVID swab confirmed hospitalised patients to UHB from January – August 2020. The dataset includes highly granular patient demographics & co-morbidities taken from ICD-10 & SNOMED-CT codes. Serial, structured data pertaining to care process (timings, staff grades, specialty review, wards), presenting complaint, acuity, all physiology readings (pulse, blood pressure, respiratory rate, oxygen saturations), all blood results, microbiology, all prescribed & administered treatments (fluids, antibiotics, inotropes, vasopressors, organ support), all outcomes.
Available supplementary data: Health data preceding & following admission event. Matched “non-COVID” controls; ambulance, 111, 999 data, synthetic data. Further OMOP data available as an additional service.
Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform & load) process, Clinical expertise, Patient & end-user access, Purchaser access, Regulatory requirements, Data-driven trials, “fast screen” services.
This dataset is grouped by service provider specialty and provides the number of recipients, number of claims, and dollar amount for given diagnosis claims. It is restricted to claims with a service date between 01/2012 and 12/2017, to claims with a primary diagnosis, and to the 100 most frequent diagnosis codes marked as the primary diagnosis of a claim. Provider is the rendering provider marked in the claim. Provider specialty is the primary specialty of the rendering provider. This data is for research purposes and is not intended to be used for reporting. Due to differences in geographic aggregation, time period considerations, and units of analysis, these numbers may differ from those reported by FSSA.
https://www.pioneerdatahub.co.uk/data/data-request-process/
The acute-care pathway (from the emergency department (ED) through acute medical units or ambulatory care and on to wards) is the most visible aspect of the hospital health-care system to most patients. Acute hospital admissions are increasing yearly and overcrowded emergency departments and high bed occupancy rates are associated with a range of adverse patient outcomes. Predicted growth in demand for acute care driven by an ageing population and increasing multimorbidity is likely to exacerbate these problems in the absence of innovation to improve the processes of care.
Key targets for Emergency Medicine services are changing, moving away from previous 4-hour targets. This will likely impact the assessment of patients admitted to hospital through Emergency Departments.
This data set provides highly granular patient level information, showing the day-to-day variation in case mix and acuity. The data includes detailed demography, co-morbidity, symptoms, longitudinal acuity scores, physiology and laboratory results, all investigations, prescriptions, diagnoses and outcomes. It could be used to develop new pathways or understand the prevalence or severity of specific disease presentations.
PIONEER geography: The West Midlands (WM) has a population of 5.9 million & includes a diverse ethnic & socio-economic mix.
Electronic Health Record: University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds & an expanded 250 ITU bed capacity during COVID. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”.
Scope: All patients with a medical emergency admitted to hospital, flowing through the acute medical unit. Longitudinal & individually linked, so that the preceding & subsequent health journey can be mapped & healthcare utilisation prior to & after admission understood. The dataset includes patient demographics, co-morbidities taken from ICD-10 & SNOMED-CT codes. Serial, structured data pertaining to process of care (timings, admissions, wards and readmissions), physiology readings (NEWS2 score and clinical frailty scale), Charlson comorbidity index and time dimensions.
Available supplementary data: Matched controls; ambulance data, OMOP data, synthetic data.
Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform & load) process, Clinical expertise, Patient & end-user access, Purchaser access, Regulatory requirements, Data-driven trials, “fast screen” services.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains counts of inpatient visits that ended in a discharge to hospice care. The counts cover individuals aged 18 or over with a discharge disposition of home or facility hospice care. Total counts for each year can be viewed by patient characteristics, including age group, county of residence, primary payer type, diagnosis category, and patient sex/race/ethnicity. The disease categories are circulatory conditions, diabetes, malignant/benign neoplasms, malnutrition, neurodegenerative disease, renal failure or other kidney diagnoses, and respiratory conditions. The categories represent common groupings of diagnoses seen in other studies related to hospice care and were created by grouping together relevant MS-DRG codes in the HCAI inpatient data.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset can help us better understand the prevalence and treatment outcomes of childhood allergies over an extended period. It reports, from retrospective data recorded by healthcare providers, the number of individuals with asthma, atopic dermatitis, allergic rhinitis and food allergies, and it includes columns that show how these outcomes differ across demographics such as gender, race and ethnicity. Examining the data further can reveal patterns among diagnosed cases, which may inform new treatment and prevention strategies for severe allergic reactions in children.
Assess what kind of questions you want to answer using this data - do you want to focus on one particular type of allergy or analyze them together? Do you want a descriptive analysis or would an analysis that looks for correlations between conditions be more appropriate?
Once you have determined your research question(s), identify which variables in the dataset are pertinent to your inquiry and assess any outliers that may need further investigation or filtering during analysis. Also consider independent variables or confounding factors that might affect your results, as well as existing hypotheses on the topic that could guide your expectations.
Be aware of potential sources of bias in information reported by healthcare providers, such as difficulties in disease identification (i.e. allergies may be misdiagnosed). Note also that many allergy cases may go unreported or unrecorded because of issues such as lack of access to, or awareness of, healthcare. Use the largest available datasets where possible: larger samples reduce random error, although they do not by themselves remove systematic bias.
Begin collecting relevant data from the columns covering medical history (allergy diagnosis start and end dates, etc.), patient demographics (gender factor, ethnicity factor, etc.), and treatment trends and outcomes (first asthma Rx date, last asthma Rx date, number of asthma Rx, etc.). To get the most insight from this data, all of these factors must be taken into account; if there is not enough evidence, explore other reliable sources too.
Structure and organize the collected data so that it can be easily accessed later, for example by creating separate sheets or tabs for different categories (such as patient and treatment information) or an individual sheet for each subject, depending on how much information needs collecting. Designing formula-based functions will make life easier and save time and energy when analyzing the large amounts of data stored within the workbook. Remember larger sample sizes provide more
- Use the dataset to identify risk factors or patterns in childhood allergies that can inform preventative and treatment measures.
- Investigate the correlation between demographic characteristics (e.g., age, gender) and diagnosis or severity of childhood allergies by using cross-tabs or other statistical techniques on the data provided in this dataset.
- Analyze longitudinal trends in treatment outcomes for various types of childhood allergy, such as asthma, atopic dermatitis and food allergy by comparing patient results over time (i.e., looking at pre-treatment diagnosis and post-treatment diagnoses)
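The cross-tabulation suggested in the list above can be sketched with the Python standard library alone. This is a minimal illustration, not the dataset authors' method: `GENDER_FACTOR` is a column named in this dataset's column listing, whereas `ASTHMA_DIAGNOSED` and the inline sample rows are hypothetical placeholders that would need to be replaced with real columns and data from food-allergy-analysis-Zenodo.csv.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; real rows would be read from the CSV file instead.
# ASTHMA_DIAGNOSED is an illustrative column name, not one confirmed by the source.
SAMPLE_CSV = """GENDER_FACTOR,ASTHMA_DIAGNOSED
Female,yes
Male,no
Female,no
Male,yes
Male,yes
"""


def crosstab(rows, row_key, col_key):
    """Count how often each (row_key value, col_key value) pair occurs."""
    return Counter((row[row_key], row[col_key]) for row in rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts = crosstab(rows, "GENDER_FACTOR", "ASTHMA_DIAGNOSED")

# Print one line per combination, sorted for a stable order.
for (gender, diagnosed), n in sorted(counts.items()):
    print(f"{gender:<8}{diagnosed:<5}{n}")
```

A table of raw counts like this only describes association; any claim about risk factors would still need to account for the confounders and reporting biases noted earlier.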
If you use this dataset in your research, please credit the original authors. Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: food-allergy-analysis-Zenodo.csv

| Column name | Description |
|:----------------------------|:--------------------------------------------------------------|
| BIRTH_YEAR | Year of birth of the patient. (Integer) |
| GENDER_FACTOR ...
https://www.pioneerdatahub.co.uk/data/data-request-process/
Background: Acute compartment syndrome (ACS) is an emergency orthopaedic condition wherein a rapid rise in compartmental pressure compromises blood perfusion to the tissues leading to ischaemia and muscle necrosis. This serious condition is often misdiagnosed or associated with significant diagnostic delay, and can lead to limb amputations and death.
The most common cause of ACS is high impact trauma, especially fractures of the lower limbs, which account for 40% of ACS cases. ACS is a challenge to diagnose and treat effectively, with differing clinical thresholds being utilised, which can result in unnecessary fasciotomy. The highly granular synthetic data for over 900 patients with ACS provide the following key parameters to support critical research into this condition:
PIONEER geography: The West Midlands (WM) has a population of 5.9 million & includes a diverse ethnic & socio-economic mix. University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds & an expanded 250 ITU bed capacity during COVID. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”.
Scope: Enabling data-driven research and machine learning models towards improving the diagnosis of Acute compartment syndrome. Longitudinal & individually linked, so that the preceding & subsequent health journey can be mapped & healthcare utilisation prior to & after admission understood. The dataset includes highly granular patient demographics, physiological parameters, muscle biomarkers, blood biomarkers and co-morbidities taken from ICD-10 & SNOMED-CT codes. Serial, structured data pertaining to process of care (timings and admissions), presenting complaint, lab analysis results (eGFR, troponin, CRP, INR, ABG glucose), systolic and diastolic blood pressures, procedures and surgery details.
Available supplementary data: ACS cohort, Matched controls; ambulance, OMOP data.
Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform & load) process, Clinical expertise, Patient & end-user access, Purchaser access, Regulatory requirements, Data-driven trials, “fast screen” services.
https://www.pioneerdatahub.co.uk/data/data-request-process/
Lactate is a chemical produced by the body as cells consume energy - in times of stress more lactate is produced. In the past, we thought that lactate was just a waste product, but more recently we have learned that lactate has an important role to play in the body.
People suffering from certain severe illnesses may have a high ‘lactate’ level in their blood. This is particularly common in the following:
Severe infections which the body cannot properly control (sepsis)
People who have sustained severe injuries (traumatic injury)
People who are critically unwell with other illnesses (needing treatment in an intensive care unit)
Some patients will develop a high lactate level when they are in hospital. Doctors recognise that this indicates the patient is becoming more unwell, but it is often challenging to know exactly what is causing the lactate level to be raised.
Raised lactate level has been associated with worse outcome in other syndromes, including major trauma and undifferentiated critical illness; however healthy individuals may generate very high lactate levels during strenuous exercise from which they recover without any harm. It is unclear whether lactate in itself is harmful to patients. This dataset provides unique insight into the potential role of lactate as not only a biomarker but a therapeutic target in acute illness.
PIONEER geography: The West Midlands (WM) has a population of 5.9 million and includes a diverse ethnic and socio-economic mix.
EHR. UHB is one of the largest NHS Trusts in England, providing direct acute services and specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds and an expanded 250 ITU bed capacity during COVID. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary and secondary care record (Your Care Connected) and a patient portal “My Health”.
Scope: Longitudinal and individually linked, so that the preceding and subsequent health journey can be mapped and healthcare utilisation prior to and after admission understood. The dataset includes highly granular patient demographics and co-morbidities taken from ICD-10 and SNOMED-CT codes. Serial, structured data pertaining to process of care (timings, admissions, wards), presenting complaint, physiology readings (BMI, temperature and weight), sample analysis results (blood sodium level, lactate, haemoglobin, oxygen saturations, and others), drugs administered and all outcomes.
Available supplementary data: Matched controls; ambulance, OMOP data, synthetic data.
Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform and load) process, Clinical expertise, Patient and end-user access, Purchaser access, Regulatory requirements, Data-driven trials, “fast screen” services.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Context
The dataset tabulates the Fidelity population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Fidelity. The dataset can be used to understand the population distribution of Fidelity by age. For example, using this dataset, we can identify the largest age group in Fidelity.
Key observations
The largest age group in Fidelity, MO was for the group of age 25-29 years with a population of 62 (16.67%), according to the 2021 American Community Survey. At the same time, the smallest age group in Fidelity, MO was the 75-79 years with a population of 3 (0.81%). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
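The shares quoted above follow from dividing each age group's count by the total population. A minimal Python sketch of that arithmetic is given below. The total of 372 is not stated in this summary; it is inferred here from 62 being 16.67% of the population, so treat it as an assumption rather than a published figure.

```python
def share_percent(group_count, total_population):
    """Return a group's share of the total population, as a percentage."""
    return 100 * group_count / total_population


# Assumption: inferred from 62 / 0.1667, not stated in the source text.
ASSUMED_TOTAL = 372

largest = round(share_percent(62, ASSUMED_TOTAL), 2)   # 25-29 years group
smallest = round(share_percent(3, ASSUMED_TOTAL), 2)   # 75-79 years group
print(largest, smallest)
```

With counts this small, the margin of error on the underlying estimates matters more than the second decimal place of the percentage.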
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Fidelity Population by Age. You can refer to the same here
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This dataset comprises electrocardiogram (ECG) data organized into three distinct categories based on patient cardiac health. The data were collected by the National Heart Foundation Bangladesh (NHFB) from June 2023 to December 2023.
1. Arrhythmia Patients: This category contains ECG data from individuals diagnosed with cardiac arrhythmias, characterized by irregular heart rhythms. The data within this category may encompass various types of arrhythmias, requiring further sub-classification depending on the specific research objectives.
2. Myocardial Patients: This category encompasses ECG data from patients experiencing myocardial issues, most likely referring to myocardial infarction (heart attack) or other diseases affecting the myocardium (heart muscle). The specific myocardial conditions represented within this category may require further specification depending on the dataset's scope and purpose.
3. Normal Patients: This category serves as a control group and includes ECG data from individuals deemed to have healthy cardiac function. These individuals exhibit no clinically significant ECG abnormalities or diagnosed cardiac conditions.
Dataset Structure:
The dataset is structured into three folders, each corresponding to a specific patient category: "Arrhythmia Patient," "Myocardial Patient," and "Normal Patient."
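As a rough illustration of how this layout might be consumed, the Python sketch below counts the files in each of the three category folders. The folder names are taken from the description above; the root path `ecg_dataset` and the idea that each file in a folder is one recording are hypothetical, since the file format is not stated here.

```python
from pathlib import Path

# Folder names as given in the dataset description.
CATEGORY_FOLDERS = ("Arrhythmia Patient", "Myocardial Patient", "Normal Patient")


def count_recordings(root):
    """Map each category folder name to the number of files directly inside it."""
    root = Path(root)
    counts = {}
    for name in CATEGORY_FOLDERS:
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(1 for p in folder.iterdir() if p.is_file())
        else:
            # A missing folder is reported as zero rather than raising an error.
            counts[name] = 0
    return counts


if __name__ == "__main__":
    # "ecg_dataset" is a hypothetical location for the unpacked dataset.
    print(count_recordings("ecg_dataset"))
```

Checking the per-class counts first is a cheap way to spot class imbalance before training any classifier on the folders.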
Potential Applications:
This dataset can be utilized for various research and educational purposes, including:
Developing and evaluating algorithms for automated arrhythmia detection and classification.
Investigating the ECG characteristics associated with different myocardial conditions.
Training machine learning models for cardiac disease diagnosis and risk stratification.
Educating students and healthcare professionals on ECG interpretation and cardiac pathologies.
Further Information:
Detailed information regarding the data acquisition protocol, ECG recording parameters, patient demographics, and data annotation procedures is essential for comprehensive dataset utilization. Accessing relevant documentation accompanying the dataset is crucial for ensuring appropriate data interpretation and analysis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
M, male; F, female; TBI, traumatic brain injury; EDH, epidural hematoma; SDH, subdural hematoma; ICH, intracerebral hematoma; BG, basal ganglion; F, frontal; T, temporal; P, parietal; DC, decompressive craniectomy; Uni+HR, unilateral craniectomy+removal of hematoma; Bil+HR, bilateral craniectomy+removal of hematoma; CIS, cranial index of symmetry; CAD, computer-assisted design.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Objectives: The primary goal of this dataset is to enable the automated segmentation and quantification of atherosclerotic plaque features in OCT images. Cardiovascular disease, with atherosclerosis at its core, remains a global health challenge. Accurate identification of vulnerable plaques is crucial for preventing acute cardiovascular events such as myocardial infarction and stroke. OCT imaging provides high-resolution insights into plaque morphology but is often constrained by manual interpretation challenges. This dataset, curated with diverse annotations of key plaque morphological features, aims to facilitate the development and evaluation of machine learning models for precise plaque analysis. By advancing segmentation capabilities, this dataset contributes to improved diagnostics and therapeutic strategies in cardiovascular care.
Ethical Approval: The dataset complies with ethical standards, adhering to the Declaration of Helsinki. Ethical approval was granted by the Local Ethical Committee of the Research Institute for Complex Issues of Cardiovascular Diseases (Kemerovo, Russia) under protocol code 2022/06 (approved on June 30, 2022). All participants provided informed consent. Data collection involved patients aged 18 years or older, ensuring balanced gender representation and inclusion of various comorbid conditions for comprehensive clinical relevance (refer to Table 1).
Description: The dataset consists of OCT images acquired from 103 patients across two cardiovascular research centers. These images, collected over one year, represent a diverse array of imaging devices and patient demographics. The dataset includes 25,698 annotated slices, each capturing key plaque morphological features. These features include lumen (LM), fibrous cap (FC), lipid core (LC), and vasa vasorum (VV). The images vary in dimensions from 704 x 704 to 1024 x 1024 pixels, reflecting differences in anatomical characteristics and imaging conditions. Annotations were performed using Supervisely, with meticulous double-verification processes to ensure accuracy.
Annotation Method: Two cardiologists annotated the dataset, identifying plaque features using binary masks. The annotations underwent a review and double-verification by a senior cardiologist and technical specialist, enhancing precision and consistency. The morphological features segmented include the vascular lumen, fibrous cap, lipid core, and vasa vasorum, each providing critical insights into plaque stability and cardiovascular risk.
Dataset Split: A 5-fold cross-validation technique was employed for dataset splitting, ensuring robust model evaluation while preventing data leakage. Approximately 80% of images were allocated for training in each fold, with the remaining 20% reserved for testing (refer to Table 2). This method allowed a balanced and comprehensive assessment of segmentation performance across the dataset.
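One common way to avoid leakage in a 5-fold split is to assign whole patients, rather than individual slices, to folds, so that no patient contributes images to both the training and test subsets of a fold. The Python sketch below illustrates that idea under stated assumptions: the description above does not say how the folds were actually built, and the `patient_id` values, the round-robin assignment, the three slices per patient and the fixed seed are all hypothetical.

```python
import random


def patient_level_folds(patient_ids, n_folds=5, seed=0):
    """Shuffle unique patient IDs and deal them round-robin into n_folds groups."""
    unique_ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique_ids)
    return [unique_ids[i::n_folds] for i in range(n_folds)]


def split_images(images, test_patients):
    """Split (patient_id, image_name) pairs into train and test lists."""
    test_set = set(test_patients)
    train = [img for img in images if img[0] not in test_set]
    test = [img for img in images if img[0] in test_set]
    return train, test


# Hypothetical data: 103 patients with 3 slices each.
images = [(pid, f"slice_{k}") for pid in range(103) for k in range(3)]
folds = patient_level_folds([pid for pid, _ in images])

for fold_number, test_patients in enumerate(folds, start=1):
    train, test = split_images(images, test_patients)
    print(fold_number, len(train), len(test))
```

Equal patient counts per fold do not guarantee an exact 80/20 image split, because in a real dataset patients contribute different numbers of slices.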
Access to the Study: Further information about this study, including curated source code, dataset details, and trained models, can be accessed through the following repositories:
Table 1. Baseline characteristics of patients included in the study.
Parameter | Value |
Sex: | |
Male, n (%) | 77 (74.7) |
Female, n (%) | 26 (25.3) |
Median Age, years [min – max] | 69 [43 – 83] |
Arterial hypertension, n (%) | 92 (89.3) |
Diabetes Mellitus, n (%) | 22 (21.4) |
Myocardial Infarction, n (%) | 22 (21.4) |
Polyvascular Disease, n (%) | 29 (28.2) |
Angina Pectoris: | |
Silent ischemia, n (%) | 9 (8.7) |
Functional class 1, n (%) | 24 (23.3) |
Functional class 2, n (%) | 55 (53.4) |
Functional class 3, n (%) | 15 (14.6) |
Table 2. Image and plaque morphological feature distributions across folds and subsets.
Fold | Subset | LM | FC | LC | VV | Total objects | Total images |
1 | Train | 17264 | 5610 | 5576 | 328 | 28778 | 16901 |
1 | Test | 4544 | 1616 | 1616 | 122 | 7898 | 4492 |
2 | Train | 17554 | 5709 | 5690 | 237 | 29190 | 17207 |
2 | Test | 4254 | 1517 | 1502 | 213 | 7486 | 4186 |
3 | Train | 17220 | 5600 | 5565 | 407 | 28792 | 16962 |
3 | Test | 4588 | 1626 | 1627 | 43 | 7884 | 4431 |
4 | Train | 17813 | 5724 | 5686 | 416 | 29639 | 17473 |
4 | Test | 3995 | 1502 | 1506 | 34 | 7037 | 3920 |
5 | Train | 17381 | 6261 | 6251 | 412 | 30305 | 17029 |
5 | Test | 4427 | 965 | 941 | 38 | 6371 | 4364 |
https://www.pioneerdatahub.co.uk/data/data-request-process/
PIONEER: The impact of ethnicity and multi-morbidity on COVID-related outcomes; a primary care supplemented hospitalised dataset. Dataset number 3.0
Coronavirus disease 2019 (COVID-19) was identified in January 2020. Currently, there have been more than 65 million cases and more than 1.5 million deaths worldwide. Some individuals experience severe manifestations of infection, including viral pneumonia, adult respiratory distress syndrome (ARDS) and death. Evidence suggests that older patients, those from some ethnic minority groups and those with multiple long-term health conditions have worse outcomes. This secondary care COVID dataset contains granular demographic and morbidity data, supplemented from primary care records, to add to the understanding of patient factors on disease outcomes.
PIONEER geography: The West Midlands (WM) has a population of 5.9 million & includes a diverse ethnic & socio-economic mix. There is a higher than average percentage of minority ethnic groups. WM has a large number of elderly residents but is the youngest population in the UK. Each day >100,000 people are treated in hospital, see their GP or are cared for by the NHS. The West Midlands was one of the hardest hit regions for COVID admissions in both wave 1 and 2.
EHR. University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds & 100 ITU beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”. UHB has cared for >5000 COVID admissions to date.
Scope: All COVID swab confirmed hospitalised patients to UHB from January – May 2020. The dataset includes highly granular patient demographics & co-morbidities taken from ICD-10 & SNOMED-CT codes but also primary care records and clinic letters. Serial, structured data pertaining to care process (timings, staff grades, specialty review, wards), presenting complaint, acuity, all physiology readings (pulse, blood pressure, respiratory rate, oxygen saturations), all blood results, microbiology, all prescribed & administered treatments (fluids, antibiotics, inotropes, vasopressors, organ support), all outcomes. Linked images available (radiographs, CT, MRI, ultrasound).
Available supplementary data: Health data preceding and following admission event. Matched “non-COVID” controls; ambulance, 111, 999 data, synthetic data.
Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform & load) process, Clinical expertise, Patient & end-user access, Purchaser access, Regulatory requirements, Data-driven trials, “fast screen” services.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Patient demographics and co-morbidities in the 6-month pre-index period and during selection period.