17 datasets found

CDE Patient Demographics
find.data.gov.scot
dtechtive.com
Updated May 31, 2023
Cite
BARTS HEALTH (2023). CDE Patient Demographics [Dataset]. https://find.data.gov.scot/datasets/25890
Dataset updated
May 31, 2023
Dataset provided by
BARTS HEALTH
Description
Locally defined dataset containing a full list of patient registrations held within the Trust's EHR system. Details extend to include GP details and patient identifiers.
Patient-Level Information and Costing Systems - Integrated Data Set
standards.nhs.uk
Updated Jun 19, 2024
Cite
NHS England (2024). Patient-Level Information and Costing Systems - Integrated Data Set [Dataset]. https://standards.nhs.uk/published-standards/patientlevel-information-and-costing-systems-integrated-data-set
Dataset updated
Jun 19, 2024
Dataset provided by
National Health Service (https://www.nhs.uk/)
Authors
NHS England
Description
Part of a set of two collections for patient-level costing, which provide a consistent approach to reporting cost information at patient level.
COVID-19 Case Surveillance Public Use Data
data.cdc.gov
paperswithcode.com
Updated Jul 9, 2024
Cite
CDC Data, Analytics and Visualization Task Force (2024). COVID-19 Case Surveillance Public Use Data [Dataset]. https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf
Available download formats: application/rdfxml, tsv, csv, json, xml, application/rssxml
Dataset updated
Jul 9, 2024
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Authors
CDC Data, Analytics and Visualization Task Force
License
https://www.usa.gov/government-works
Description
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.

Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.

This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.

CDC has three COVID-19 case surveillance datasets:
COVID-19 Case Surveillance Public Use Data with Geography: Public use, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. (19 data elements)
COVID-19 Case Surveillance Public Use Data: Public use, patient-level dataset with clinical and symptom data and demographics, with no geographic data. (12 data elements)
COVID-19 Case Surveillance Restricted Access Detailed Data: Restricted access, patient-level dataset with clinical and symptom data, demographics, and state and county of residence. Access requires a registration process and a data use agreement. (33 data elements)
The following apply to all three datasets:
Data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
Data are considered provisional by CDC and are subject to change until the data are reconciled and verified with the state and territorial data providers.
Some data cells are suppressed to protect individual privacy.
The datasets will include all cases with the earliest date available in each record (date received by CDC or date related to illness/specimen collection) at least 14 days prior to the creation of the current datasets. This 14-day lag allows case reporting to be stabilized and ensures that time-dependent outcome data are accurately captured.
Datasets are updated monthly.
Datasets are created using CDC’s Policy on Public Health Research and Nonresearch Data Management and Access and include protections designed to protect individual privacy.
For more information about data collection and reporting, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/about-us-cases-deaths.html.
For more information about the COVID-19 case surveillance data, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/faq-surveillance.html
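
The 14-day inclusion rule described in the list above can be illustrated with a short Python sketch. It is only a reading of the rule under stated assumptions, not CDC's implementation: the helper names `earliest_date` and `is_included` are hypothetical, a record is represented simply as a list of its candidate dates, and records with no known date are assumed to be excluded.

```python
from datetime import date, timedelta

# Lag between the earliest date on a record and dataset creation.
LAG = timedelta(days=14)


def earliest_date(record_dates):
    """Return the earliest known date on a record, or None if all are missing."""
    known = [d for d in record_dates if d is not None]
    return min(known) if known else None


def is_included(record_dates, created_on):
    """True if the record's earliest date is at least 14 days before creation.

    Assumption: a record without any known date is excluded.
    """
    first = earliest_date(record_dates)
    return first is not None and first <= created_on - LAG


created = date(2024, 1, 15)
print(is_included([date(2024, 1, 1), None], created))  # earliest date is exactly 14 days before
print(is_included([date(2024, 1, 2)], created))        # earliest date is only 13 days before
```

Treating the lag as inclusive ("at least 14 days") means a record dated exactly 14 days before creation is kept.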

Overview

The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.

For more information: NNDSS Supports the COVID-19 Response | CDC.

The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.

COVID-19 Case Reports

COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.

All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.

Data are Considered Provisional

The COVID-19 case surveillance data are dynamic; case reports can be modified at any time by the jurisdictions sharing COVID-19 data with CDC. CDC may update prior cases shared with CDC based on any updated information from jurisdictions. For instance, as new information is gathered about previously reported cases, health departments provide updated data to CDC. As more information and data become available, analyses might find changes in surveillance data and trends during a previously reported time window. Data may also be shared late with CDC due to the volume of COVID-19 cases.
Annual finalized data: To create the final NNDSS data used in the annual tables, CDC works carefully with the reporting jurisdictions to reconcile the data received during the year until each state or territorial epidemiologist confirms that the data from their area are correct.
Access Addressing Gaps in Public Health Reporting of Race and Ethnicity for COVID-19, a report from the Council of State and Territorial Epidemiologists, to better understand the challenges in completing race and ethnicity data for COVID-19 and recommendations for improvement.

Data Limitations

To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.

Data Quality Assurance Procedures

CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
Questions that have been left unanswered (blank) on the case report form are reclassified to a Missing value, if applicable to the question. For example, in the question “Was the individual hospitalized?” where the possible answer choices include “Yes,” “No,” or “Unknown,” the blank value is recoded to Missing because the case report form did not include a response to the question.
Logic checks are performed for date data. If an illogical date has been provided, CDC reviews the data with the reporting jurisdiction. For example, if a symptom onset date in the future is reported to CDC, this value is set to null until the reporting jurisdiction updates the date appropriately.
Additional data quality processing to recode free text data is ongoing. Data on symptoms, race and ethnicity, and healthcare worker status have been prioritized.
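
The first two cleaning steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CDC's actual pipeline: the field names `hosp_yn` and `onset_dt`, the `clean_record` helper, and the representation of a case record as a dictionary are all hypothetical.

```python
from datetime import date

MISSING = "Missing"


def clean_record(record, today):
    """Apply two illustrative cleaning rules to a single case record.

    Assumed (hypothetical) fields:
      hosp_yn  - answer to "Was the individual hospitalized?"
      onset_dt - symptom onset date as a datetime.date, or None
    """
    cleaned = dict(record)

    # Rule 1: a blank answer is recoded to the Missing category.
    answer = cleaned.get("hosp_yn")
    if answer is None or str(answer).strip() == "":
        cleaned["hosp_yn"] = MISSING

    # Rule 2: an onset date in the future is illogical, so it is set to null
    # until the reporting jurisdiction supplies a corrected value.
    onset = cleaned.get("onset_dt")
    if onset is not None and onset > today:
        cleaned["onset_dt"] = None

    return cleaned


example = clean_record(
    {"hosp_yn": "", "onset_dt": date(2031, 1, 1)},
    today=date(2024, 7, 1),
)
print(example)
```

The function returns a copy rather than editing the record in place, so the originally reported values remain available for review with the jurisdiction.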

Data Suppression

To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.

For questions, please contact Ask SRRG (eocevent394@cdc.gov).

Additional COVID-19 Data

COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
PMC-Patients Dataset
figshare.com
application/x-gzip
Updated Nov 6, 2023
Cite
Zhengyun Zhao (2023). PMC-Patients Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.24504115.v1
Available download formats: application/x-gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.24504115.v1
Dataset updated
Nov 6, 2023
Dataset provided by
figshare
Authors
Zhengyun Zhao
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
PMC-Patients Dataset
The core file of our dataset, containing the patient summaries, demographics, and relational annotations.

### PMC-Patients.json
Patient summaries are presented as a json file, which is a list of dictionaries with the following keys:
- patient_id: string. A continuous id of patients, starting from 0.
- patient_uid: string. Unique ID for each patient, with format PMID-x, where PMID is the PubMed Identifier of source article of the note and x denotes index of the note in source article.
- PMID: string. PMID for source article.
- file_path: string. File path of xml file of source article.
- title: string. Source article title.
- patient: string. Patient note.
- age: list of tuples. Each entry is in format (value, unit) where value is a float number and unit is in 'year', 'month', 'week', 'day' and 'hour' indicating age unit. For example, [[1.0, 'year'], [2.0, 'month']] indicating the patient is a one-year- and two-month-old infant.
- gender: 'M' or 'F'. Male or Female.
- relevant_articles: dict. The key is PMID of the relevant articles and the corresponding value is its relevance score (2 or 1 as defined in the "Methods" section).
- similar_patients: dict. The key is patient_uid of the similar patients and the corresponding value is its similarity score (2 or 1 as defined in the "Methods" section).
PMC patient notes
redivis.com
Updated Apr 16, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Stanford Center for Population Health Sciences (2024). PMC patient notes [Dataset]. http://doi.org/10.57761/8y6b-nw25
Available download formats: stata, application/jsonl, arrow, sas, spss, avro, csv, parquet
Unique identifier
https://doi.org/10.57761/8y6b-nw25
Dataset updated
Apr 16, 2024
Dataset provided by
Redivis Inc.
Authors
Stanford Center for Population Health Sciences
Description
Abstract

Sample dataset for a training webinar on Tuesday, April 16, 2024.

Methodology

PMC-Patients is a first-of-its-kind dataset consisting of 167k patient summaries extracted from case reports in PubMed Central (PMC), 3.1M patient-article relevance and 293k patient-patient similarity annotations defined by PubMed citation graph.

Dataset Description

- **Homepage:** https://github.com/pmc-patients/pmc-patients

- **Repository:** https://github.com/pmc-patients/pmc-patients

- **Paper:** https://arxiv.org/pdf/2202.13876.pdf

- **Leaderboard:** https://pmc-patients.github.io/

- **Point of Contact:** zhengyun21@mails.tsinghua.edu.cn

Dataset Structure

This file contains all information about patient summaries in PMC-Patients, with the following columns:

`patient_id`: string. A continuous id of patients, starting from 0.

`patient_uid`: string. Unique ID for each patient, with format PMID-x, where PMID is the PubMed Identifier of the source article of the patient and x denotes index of the patient in source article.

`PMID`: string. PMID for source article.

`file_path`: string. File path of xml file of source article.

`title`: string. Source article title.

`patient`: string. Patient summary.

`age`: list of tuples. Each entry is in format `(value, unit)` where value is a float number and unit is in 'year', 'month', 'week', 'day' and 'hour' indicating age unit. For example, `[[1.0, 'year'], [2.0, 'month']]` indicating the patient is a one-year- and two-month-old infant.

`gender`: 'M' or 'F'. Male or Female.

`relevant_articles`: dict. The key is PMID of the relevant articles and the corresponding value is its relevance score (2 or 1 as defined in the "Methods" section).

`similar_patients`: dict. The key is patient_uid of the similar patients and the corresponding value is its similarity score (2 or 1 as defined in the "Methods" section).
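
As an illustration of how the `age` and `patient_uid` columns described above might be consumed, here is a minimal Python sketch. The helper names `age_in_years` and `split_patient_uid`, the unit conversion factors (a 365.25-day year, 12 equal months), and the sample identifier are assumptions made for the example; they are not part of the dataset or its official tooling.

```python
import math

# Assumed conversion factors from each age unit to years.
YEARS_PER_UNIT = {
    "year": 1.0,
    "month": 1.0 / 12.0,
    "week": 7.0 / 365.25,
    "day": 1.0 / 365.25,
    "hour": 1.0 / (365.25 * 24.0),
}


def age_in_years(age):
    """Collapse a list of (value, unit) entries into a single age in years."""
    return sum(value * YEARS_PER_UNIT[unit] for value, unit in age)


def split_patient_uid(patient_uid):
    """Split a 'PMID-x' identifier into (PMID string, integer index x)."""
    pmid, _, index = patient_uid.rpartition("-")
    return pmid, int(index)


# The example from the column description: one year and two months old.
infant_age = age_in_years([[1.0, "year"], [2.0, "month"]])
print(math.isclose(infant_age, 1.0 + 2.0 / 12.0))

# Hypothetical identifier, used only to show the format.
print(split_patient_uid("12345678-1"))
```

Splitting on the last hyphen keeps the PMID as a string, matching the column type given above.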

Dataset Creation

If you are interested in the collection of PMC-Patients and reproducing our baselines, please refer to [this repository](https://github.com/zhao-zy15/PMC-Patients).
Patient-Level Information and Costing Systems - Integrated Data Set - Update...
standards.nhs.uk
Updated May 1, 2024
Cite
NHS England (2024). Patient-Level Information and Costing Systems - Integrated Data Set - Update to DAPB4000 [Dataset]. https://standards.nhs.uk/future-standards/patientlevel-information-and-costing-systems-integrated-data-set-update-to-dapb4000
Dataset updated
May 1, 2024
Dataset provided by
National Health Service (https://www.nhs.uk/)
Authors
NHS England
Description
A submission to cover annual changes to the PLICS integrated data set.
COVID-19 Patient Impact & Hospital Capacity Data
kaggle.com
Updated Mar 10, 2021
Cite
aditirajagopal (2021). COVID-19 Patient Impact & Hospital Capacity Data [Dataset]. https://www.kaggle.com/aditirajagopal/covid19-patient-impact-hospital-capacity-data
Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 10, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
aditirajagopal
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Source: https://healthdata.gov/dataset/covid-19-reported-patient-impact-and-hospital-capacity-facility

The following dataset provides facility-level data for hospital utilization aggregated on a weekly basis (Friday to Thursday). These are derived from reports with facility-level granularity across two main sources: (1) HHS TeleTracking, and (2) reporting provided directly to HHS Protect by state/territorial health departments on behalf of their healthcare facilities.

The hospital population includes all hospitals registered with Centers for Medicare & Medicaid Services (CMS) as of June 1, 2020. It includes non-CMS hospitals that have reported since July 15, 2020. It does not include psychiatric, rehabilitation, Indian Health Service (IHS) facilities, U.S. Department of Veterans Affairs (VA) facilities, Defense Health Agency (DHA) facilities, and religious non-medical facilities.

For a given entry, the term “collection_week” signifies the start of the period that is aggregated. For example, a “collection_week” of 2020-11-20 means the average/sum/coverage of the elements captured from that given facility starting and including Friday, November 20, 2020, and ending and including reports for Thursday, November 26, 2020.

Each reported element name ends with one of three suffixes: “_coverage”, “_sum”, or “_avg”.

A “_coverage” suffix denotes how many times the facility reported that element during that collection week. A “_sum” suffix denotes the sum of the reports provided for that facility for that element during that collection week. An “_avg” suffix denotes the average of the reports provided for that facility for that element during that collection week.

The file will be updated weekly. No statistical analysis is applied to impute non-response. For averages, calculations are based on the number of values collected for a given hospital in that collection week. Suppression is applied to the file for sums and averages less than four (4). In these cases, the field will be replaced with “-999,999”.
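
The conventions described above (the Friday-to-Thursday collection week, the three field-name endings, and the suppression placeholder) can be sketched in Python as follows. This is a rough illustration, not code from HHS: the helper names and the sample column name are hypothetical, and it is assumed that the placeholder may appear either as the text “-999,999” or as the number -999999.

```python
from datetime import date, timedelta

SUPPRESSED = -999999  # placeholder used for suppressed sums and averages
SUFFIXES = ("_coverage", "_sum", "_avg")


def collection_week_range(collection_week):
    """Return the first (Friday) and last (Thursday) day of a collection week."""
    start = date.fromisoformat(collection_week)
    return start, start + timedelta(days=6)


def split_suffix(column):
    """Split a column name into (element name, kind); kind is None if no suffix."""
    for suffix in SUFFIXES:
        if column.endswith(suffix):
            return column[: -len(suffix)], suffix[1:]
    return column, None


def unsuppress(value):
    """Convert the suppression placeholder to None; return other values as floats."""
    if value is None:
        return None
    number = float(str(value).replace(",", ""))
    return None if number == SUPPRESSED else number


start, end = collection_week_range("2020-11-20")
print(start, end)

# Hypothetical column name, used only to show the suffix handling.
print(split_suffix("total_beds_7_day_avg"))
print(unsuppress("-999,999"), unsuppress(12))
```

Mapping the placeholder to `None` before any aggregation avoids accidentally averaging -999,999 into real values.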

This data is preliminary and subject to change as more data become available. Data is available starting on July 31, 2020.

Sometimes, reports for a given facility will be provided to both HHS TeleTracking and HHS Protect. When this occurs, to ensure that there are not duplicate reports, deduplication is applied according to prioritization rules within HHS Protect.

For influenza fields listed in the file, the current HHS guidance marks these fields as optional. As a result, coverage of these elements varies.
MIMIC-IV
physionet.org
Updated Oct 11, 2024
Cite
Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Brian Gow; Benjamin Moody; Steven Horng; Leo Anthony Celi; Roger Mark (2024). MIMIC-IV [Dataset]. http://doi.org/10.13026/kpb9-mt58
Unique identifier
https://doi.org/10.13026/kpb9-mt58
Dataset updated
Oct 11, 2024
Authors
Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Brian Gow; Benjamin Moody; Steven Horng; Leo Anthony Celi; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy. Here we present Medical Information Mart for Intensive Care (MIMIC)-IV, a large deidentified dataset of patients admitted to the emergency department or an intensive care unit at the Beth Israel Deaconess Medical Center in Boston, MA. MIMIC-IV contains data for over 65,000 patients admitted to an ICU and over 200,000 patients admitted to the emergency department. MIMIC-IV incorporates contemporary data and adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
ARCHIVED: COVID-19 Cases by Population Characteristics Over Time
catalog.data.gov
data.sfgov.org
Updated Mar 29, 2025
Cite
data.sfgov.org (2025). ARCHIVED: COVID-19 Cases by Population Characteristics Over Time [Dataset]. https://catalog.data.gov/dataset/covid-19-cases-by-population-characteristics-over-time
Dataset updated
Mar 29, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY
This archived dataset includes data for population characteristics that are no longer being reported publicly. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”.

B. HOW THE DATASET IS CREATED
Data on the population characteristics of COVID-19 cases are from:
* Case interviews
* Laboratories
* Medical providers

These multiple streams of data are merged, deduplicated, and undergo data verification processes.

Race/ethnicity
* We include all race/ethnicity categories that are collected for COVID-19 cases.
* The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.

Gender
* The City collects information on gender identity using these guidelines.

Skilled Nursing Facility (SNF) occupancy
* A Skilled Nursing Facility (SNF) is a type of long-term care facility that provides care to individuals, generally in their 60s and older, who need functional assistance in their daily lives.
* This dataset includes data for COVID-19 cases reported in Skilled Nursing Facilities (SNFs) through 12/31/2022, archived on 1/5/2023. These data were identified where “Characteristic_Type” = ‘Skilled Nursing Facility Occupancy’.

Sexual orientation
* The City began asking adults 18 years old or older for their sexual orientation identification during case interviews as of April 28, 2020. Sexual orientation data prior to this date is unavailable.
* The City doesn’t collect or report information about sexual orientation for persons under 12 years of age.
* Case investigation interviews transitioned to the California Department of Public Health, Virtual Assistant information gathering beginning December 2021. The Virtual Assistant is only sent to adults who are 18+ years old.

Learn more about our data collection guidelines pertaining to sexual orientation.

Comorbidities
* Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.

Homelessness
Persons are identified as homeless based on several data sources:
* self-reported living situation
* the location at the time of testing
* Department of Public Health homelessness and health databases
* Residents in Single-Room Occupancy hotels are not included in these figures.

These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.

Single Room Occupancy (SRO) tenancy
* SRO buildings are defined by the San Francisco Housing Code as having six or more "residential guest rooms" which may be attached to shared bathrooms, kitchens, and living spaces.
* The details of a person's living arrangements are verified during case interviews.

Transmission Type
* Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.

C. UPDATE PROCESS
This dataset has been archived and will no longer update as of 9/11/2023.

D. HOW TO USE THIS DATASET
Population estimates are only available for age groups and race/ethnicity categories. San Francisco po
Short-term PM2.5 exposure and early-readmission risk in Heart Failure...
catalog.data.gov
Updated Nov 15, 2024
Cite
U.S. EPA Office of Research and Development (ORD) (2024). Short-term PM2.5 exposure and early-readmission risk in Heart Failure Patients [Dataset]. https://catalog.data.gov/dataset/short-term-pm2-5-exposure-and-early-readmission-risk-in-heart-failure-patients
Dataset updated
Nov 15, 2024
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
In this manuscript, EPA researchers used high resolution (1x1 km) modeled air quality data from a model built by Harvard collaborators to estimate the association between short-term exposure to air pollution and the occurrence of 30-day readmissions in a heart failure population. The heart failure population was taken from patients presenting to a University of North Carolina Healthcare System (UNCHCS) affiliated hospital or clinic that reported electronic health records to the Carolina Data Warehouse for Health (CDW-H). A description of the variables used in this analysis is available in the data dictionary (L:/PRIV/EPHD_CRB/Cavin/CARES/Data Dictonaries/HF short term PM25 and readmissions data dictionary.xlsx) associated with this manuscript. Analysis code is available in L:/PRIV/EPHD_CRB/Cavin/CARES/Project Analytic Code/Lauren Wyatt/DailyPM_HF_readmission. This dataset is not publicly accessible because it contains PII in the form of electronic health records. Data can be accessed with an approved IRB.
This dataset is associated with the following publication: Wyatt, L., A. Weaver, J. Moyer, J. Schwartz, Q. Di, D. Diazsanchez, W. Cascio, and C. Ward-Caviness. Short-term PM2.5 exposure and early-readmission risk: A retrospective cohort study in North Carolina Heart Failure Patients. American Heart Journal. Mosby Year Book Incorporated, Orlando, FL, USA, 248: 130-138, (2022).
AIHW - Mental Health Services - Emergency Department Presentations by...
data.aurin.org.au
Updated Mar 6, 2025
Cite
(2025). AIHW - Mental Health Services - Emergency Department Presentations by Demographics (SA3) 2014-2018 - Dataset - AURIN [Dataset]. https://data.aurin.org.au/dataset/au-govt-aihw-aihw-mental-hlth-serv-emrgncy-presentations-demo-sa3-2014-18-sa3-2016
Dataset updated
Mar 6, 2025
License
Attribution 3.0 (CC BY 3.0) (https://creativecommons.org/licenses/by/3.0/)
License information was derived automatically
Description
This dataset presents the footprint of the number of emergency department presentations in public hospitals by patient demographics and location. Mental health-related emergency department (ED) presentations are defined as presentations to public hospital EDs that have a principal diagnosis of mental and behavioural disorders. However, the definition does not fully capture all potential mental health-related presentations to EDs such as intentional self-harm, as intent can be difficult to identify in an ED environment and can also be difficult to code. The data spans the financial years of 2014-2018 and is aggregated to Statistical Area Level 3 (SA3) geographic areas from the 2016 Australian Statistical Geography Standard (ASGS).

State and territory health authorities collect a core set of nationally comparable information on most public hospital ED presentations in their jurisdiction, which is compiled annually into the National Non-Admitted Patient Emergency Department Care Database (NNAPEDCD). The data reported for 2014–15 to 2017–18 is sourced from the NNAPEDCD. Information about mental health-related services provided in EDs prior to 2014–15 was supplied directly to the Australian Institute of Health and Welfare (AIHW) by states and territories.

Mental health services in Australia (MHSA) provides a picture of the national response of the health and welfare service system to the mental health care needs of Australians. MHSA is updated progressively throughout each year as data becomes available. The data accompanies the Mental Health Services - In Brief 2018 Web Report. For further information about this dataset, visit the data source: Australian Institute of Health and Welfare - Mental health services in Australia Data Tables. Please note: AURIN has spatially enabled the original data.
[Archived] COVID-19 Deaths by Population Characteristics Over Time
healthdata.gov
data.sfgov.org
Updated Apr 8, 2025
Cite
data.sfgov.org (2025). [Archived] COVID-19 Deaths by Population Characteristics Over Time [Dataset]. https://healthdata.gov/dataset/-Archived-COVID-19-Deaths-by-Population-Characteri/hs5f-amst
Available download formats: csv, json, xml, application/rssxml, tsv, application/rdfxml
Dataset updated
Apr 8, 2025
Dataset provided by
data.sfgov.org
Description
As of July 2nd, 2024 the COVID-19 Deaths by Population Characteristics Over Time dataset has been retired. This dataset is archived and will no longer update. We will be publishing a cumulative deaths by population characteristics dataset that will update moving forward.

A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics and by date. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals for previous days may increase or decrease. More recent data is less reliable.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.

B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists (https://preparedness.cste.org/wp-content/uploads/2022/12/CSTE-Revised-Classification-of-COVID-19-associated-Deaths.Final_.11.22.22.pdf). Death certificates are maintained by the California Department of Public Health.

Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates

Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.

To protect resident privacy, we summarize COVID-19 data by only one characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.

Data notes on each population characteristic type are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.

Gender * The City collects information on gender identity using these guidelines.

C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.

Dataset will not update on the business day following any federal holiday.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of deaths on each date.

New deaths are the count of deaths within that characteristic group on that specific date. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.
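The relationship between new and cumulative deaths described above can be sketched in a few lines of Python. This is a minimal illustration, not the City's pipeline: the row layout, the sample values, and the function name `cumulative_by_group` are assumptions made for the example, and the real column names should be checked against the dataset itself.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical rows: (date, characteristic_type, characteristic_group, new_deaths).
# The values are made up for illustration only.
rows = [
    ("2021-01-01", "Age Group", "65-74", 1),
    ("2021-01-01", "Age Group", "75+", 2),
    ("2021-01-02", "Age Group", "65-74", 0),
    ("2021-01-02", "Age Group", "75+", 1),
    ("2021-01-01", "Gender", "Female", 3),
]

def cumulative_by_group(rows, characteristic_type):
    """Filter to one characteristic type, then build a running total per group.

    Returns {group: [(date, new_deaths, cumulative_deaths), ...]} in date order.
    """
    by_group = defaultdict(list)
    # ISO-formatted date strings sort chronologically, so sorted() orders by date.
    for day, ctype, group, new in sorted(rows):
        if ctype == characteristic_type:
            by_group[group].append((day, new))
    result = {}
    for group, series in by_group.items():
        running = accumulate(new for _, new in series)
        result[group] = [(d, n, c) for (d, n), c in zip(series, running)]
    return result

age = cumulative_by_group(rows, "Age Group")
```

Filtering on the characteristic type first matters: summing across types would count the same deaths more than once, since each death appears under several characteristics.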

This data may not be immediately available for more recent deaths. Data updates as more information becomes available.

To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.

E. CHANGE LOG
9/11/2023 - on this date, we began using an updated definition of a COVID-19 death to align with the California Department o
‘COVID-19 Cases by Population Characteristics Over Time’ analyzed by...
analyst-2.ai
Updated Feb 15, 2022
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘COVID-19 Cases by Population Characteristics Over Time’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-covid-19-cases-by-population-characteristics-over-time-097d/6c8f14dd/?iid=004-510&v=presentation
Explore at:
Dataset updated
Feb 15, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘COVID-19 Cases by Population Characteristics Over Time’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/a3291d85-0076-43c5-a59c-df49480cdc6d on 13 February 2022.

--- Dataset description provided by original source is as follows ---

Note: On January 22, 2022, system updates to improve the timeliness and accuracy of San Francisco COVID-19 cases and deaths data were implemented. You might see some fluctuations in historic data as a result of this change. Due to the changes, starting on January 22, 2022, the number of new cases reported daily will be higher than under the old system as cases that would have taken longer to process will be reported earlier.

A. SUMMARY This dataset shows San Francisco COVID-19 cases by population characteristics and by specimen collection date. Cases are included on the date the positive test was collected.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how cases have been distributed among different subgroups. This information can reveal trends and disparities among groups.

Data is lagged by five days, meaning the most recent specimen collection date included is 5 days prior to today. Tests take time to process and report, so more recent data is less reliable.
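As a small sketch of what the five-day lag means in practice, the helper below computes the most recent specimen collection date that would be included on a given day and drops anything newer. The names `LAG_DAYS`, `latest_included_date`, and `drop_unreported` are hypothetical, and the row layout (date first) is an assumption.

```python
from datetime import date, timedelta

LAG_DAYS = 5  # lag stated in the dataset description

def latest_included_date(today):
    """Most recent specimen collection date included when viewed on `today`."""
    return today - timedelta(days=LAG_DAYS)

def drop_unreported(rows, today):
    """Keep only rows whose first field (a date) falls on or before the cutoff."""
    cutoff = latest_included_date(today)
    return [row for row in rows if row[0] <= cutoff]

sample = [(date(2022, 2, 7), 12), (date(2022, 2, 8), 9), (date(2022, 2, 10), 4)]
kept = drop_unreported(sample, today=date(2022, 2, 13))
```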

B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases and deaths are from:
* Case interviews
* Laboratories
* Medical providers

These multiple streams of data are merged, deduplicated, and undergo data verification processes. This data may not be immediately available for recently reported cases because of the time needed to process tests and validate cases. Daily case totals on previous days may increase or decrease. Learn more.

Data are continually updated to maximize completeness of information and reporting on San Francisco residents with COVID-19.

Data notes on each population characteristic type are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases. * The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.

Sexual orientation * Sexual orientation data is collected from individuals who are 18 years old or older. These individuals can choose whether to provide this information during case interviews. Learn more about our data collection guidelines. * The City began asking for this information on April 28, 2020.

Gender * The City collects information on gender identity using these guidelines.

Comorbidities * Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.

Transmission type * Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.
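The transmission-type rule above reads as a three-way decision, which can be written out as follows. This is a sketch under assumptions: the function name, the parameter names, and the exact label strings are illustrative and may not match the values stored in the dataset.

```python
from typing import Optional

def transmission_category(interviewed: bool,
                          contact_with_known_case: Optional[bool]) -> str:
    """Classify a confirmed case following the rule described in the text.

    `contact_with_known_case` is None when the question was not asked.
    The returned labels are illustrative, not the dataset's literal values.
    """
    if not interviewed or contact_with_known_case is None:
        return "unknown"
    if contact_with_known_case:
        return "contact with a known case"
    return "community transmission"
```

Note that "community transmission" here means only that the person reported no known contact, so it depends on what the interviewee could recall.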

Homelessness Persons are identified as homeless based on several data sources:
* self-reported living situation
* the location at the time of testing
* Department of Public Health homelessness and health databases
Residents in Single-Room Occupancy hotels are not included in these figures. These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.

Skilled Nursing Facility (SNF) occupancy * A Skilled Nursing

--- Original source retains full ownership of the source dataset ---
COVID-19 Deaths by Population Characteristics
catalog.data.gov
data.sfgov.org
Updated Jun 29, 2025
data.sfgov.org (2025). COVID-19 Deaths by Population Characteristics [Dataset]. https://catalog.data.gov/dataset/covid-19-deaths-by-population-characteristics
Explore at:
Dataset updated
Jun 29, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals may increase or decrease.

Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.

B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists. Death certificates are maintained by the California Department of Public Health.

Data on the population characteristics of COVID-19 deaths are from:
* Case reports
* Medical records
* Electronic lab reports
* Death certificates

Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.

To protect resident privacy, we summarize COVID-19 data by only one population characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.

Data notes on select population characteristic types are listed below.

Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.

Gender * The City collects information on gender identity using these guidelines.

C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.

Dataset will not update on the business day following any federal holiday.

D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a dataset based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2018-2022 5-year American Community Survey (ACS).

This dataset includes several characteristic types. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of cumulative deaths.

Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.

To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.

E. CHANGE LOG
Synthetic Suicide Prevention Dataset with SDoH
datasets.ai
datahub.va.gov
Updated Aug 27, 2024
Department of Veterans Affairs (2024). Synthetic Suicide Prevention Dataset with SDoH [Dataset]. https://datasets.ai/datasets/synthetic-suicide-prevention-dataset-with-sdoh
Explore at:
Dataset updated
Aug 27, 2024
Dataset authored and provided by
Department of Veterans Affairs
Description
The included dataset contains 10,000 synthetic Veteran patient records generated by Synthea. The scope of the data includes over 500 clinical concepts across 90 disease modules, as well as additional social determinants of health (SDoH) data elements that are not traditionally tracked in electronic health records. Each synthetic patient conceptually represents one Veteran in the existing US population; each Veteran has a name, sociodemographic profile, a series of documented clinical encounters and diagnoses, as well as associated cost and payer data. To learn more about Synthea, please visit the Synthea wiki at https://github.com/synthetichealth/synthea/wiki. To find a description of how this dataset is organized by data type, please visit the Synthea CSV File Data Dictionary at https://github.com/synthetichealth/synthea/wiki/CSV-File-Data-Dictionary.
Violence Reduction - Victim Demographics - Aggregated
data.cityofchicago.org
s.cnmilf.com
Updated Jul 13, 2025
City of Chicago (2025). Violence Reduction - Victim Demographics - Aggregated [Dataset]. https://data.cityofchicago.org/Public-Safety/Violence-Reduction-Victim-Demographics-Aggregated/gj7a-742p
Explore at:
Available download formats: application/rssxml, csv, json, application/rdfxml, xml, tsv
Dataset updated
Jul 13, 2025
Dataset authored and provided by
City of Chicago
Description
This dataset contains aggregate data on violent index victimizations at the quarter level of each year (i.e., January – March, April – June, July – September, October – December), from 2001 to the present (1991 to present for Homicides), with a focus on those related to gun violence. Index crimes are 10 crime types selected by the FBI (codes 1-4) for special focus due to their seriousness and frequency. This dataset includes only those index crimes that involve bodily harm or the threat of bodily harm and are reported to the Chicago Police Department (CPD). Each row is aggregated up to victimization type, age group, sex, race, and whether the victimization was domestic-related. Aggregating at the quarter level provides large enough blocks of incidents to protect anonymity while allowing the end user to observe inter-year and intra-year variation. Any row where there were fewer than three incidents during a given quarter has been deleted to help prevent re-identification of victims. For example, if there were three domestic criminal sexual assaults during January to March 2020, all victims associated with those incidents have been removed from this dataset. Human trafficking victimizations have been aggregated separately due to the extremely small number of victimizations.
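The quarter-level aggregation and small-cell suppression described above can be sketched as follows. This is an illustration under assumptions, not the City's actual process: the record layout, the category labels, and the function names are hypothetical, and the threshold follows the "fewer than three incidents" wording (the worked example in the text could be read as a stricter cutoff, so `MIN_INCIDENTS` is left adjustable).

```python
from collections import Counter
from datetime import date

MIN_INCIDENTS = 3  # rows with fewer incidents than this are dropped

def quarter_of(d):
    """Map a date to (year, quarter), where quarter 1 is January to March."""
    return d.year, (d.month - 1) // 3 + 1

def aggregate_with_suppression(victimizations, min_incidents=MIN_INCIDENTS):
    """Count victimizations per (year, quarter, type, age, sex, race, domestic)
    and drop any row whose count falls below the threshold."""
    counts = Counter()
    for when, vtype, age_group, sex, race, domestic in victimizations:
        year, q = quarter_of(when)
        counts[(year, q, vtype, age_group, sex, race, domestic)] += 1
    return {key: n for key, n in counts.items() if n >= min_incidents}

# Made-up records; category labels are placeholders.
sample = (
    [(date(2020, 1, 5), "HOMICIDE", "20-29", "M", "GROUP_A", False)] * 3
    + [(date(2020, 2, 1), "ROBBERY", "30-39", "F", "GROUP_B", False)] * 2
    + [(date(2020, 4, 1), "HOMICIDE", "20-29", "M", "GROUP_A", False)]
)
table = aggregate_with_suppression(sample)
```

Because suppressed rows are removed rather than masked, totals computed from the published table can undercount the true number of victimizations.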

This dataset includes a "GUNSHOT_INJURY_I" column to indicate whether the victimization involved a shooting, showing either Yes ("Y"), No ("N"), or Unknown ("UNKNOWN"). For homicides, injury descriptions are available dating back to 1991, so the "shooting" column will read either "Y" or "N" to indicate whether the homicide was a fatal shooting or not. For non-fatal shootings, data is only available as of 2010. As a result, for any non-fatal shootings that occurred from 2010 to the present, the shooting column will read as “Y.” Non-fatal shooting victims will not be included in this dataset prior to 2010; they will be included in the authorized dataset, but with "UNKNOWN" in the shooting column.

The dataset is refreshed daily, but excludes the most recent complete day to allow CPD time to gather the best available information. Each time the dataset is refreshed, records can change as CPD learns more about each victimization, especially those victimizations that are most recent. The data on the Mayor's Office Violence Reduction Dashboard is updated daily with an approximately 48-hour lag. As cases are passed from the initial reporting officer to the investigating detectives, some recorded data about incidents and victimizations may change once additional information arises. Regularly updated datasets on the City's public portal may change to reflect new or corrected information.

How does this dataset classify victims?

The methodology by which this dataset classifies victims of violent crime differs by victimization type:

Homicide and non-fatal shooting victims: A victimization is considered a homicide victimization or non-fatal shooting victimization depending on its presence in CPD's homicide victims data table or its shooting victims data table. A victimization is considered a homicide only if it is present in CPD's homicide data table, while a victimization is considered a non-fatal shooting only if it is present in CPD's shooting data tables and absent from CPD's homicide data table.

To determine the IUCR code of homicide and non-fatal shooting victimizations, we defer to the incident IUCR code available in CPD's Crimes, 2001-present dataset (available on the City's open data portal). If the IUCR code in CPD's Crimes dataset is inconsistent with the homicide/non-fatal shooting categorization, we defer to CPD's Victims dataset.

For a criminal homicide, the only sensible IUCR codes are 0110 (first-degree murder) or 0130 (second-degree murder). For a non-fatal shooting, a sensible IUCR code must signify a criminal sexual assault, a robbery, or, most commonly, an aggravated battery. In rare instances, the IUCR codes in CPD's Crimes and Victims datasets do not align with the homicide/non-fatal shooting categorization:

In instances where a homicide victimization does not correspond to an IUCR code 0110 or 0130, we set the IUCR code to "01XX" to indicate that the victimization was a homicide but we do not know whether it was a first-degree murder (IUCR code = 0110) or a second-degree murder (IUCR code = 0130).

When a non-fatal shooting victimization does not correspond to an IUCR code that signifies a criminal sexual assault, robbery, or aggravated battery, we enter “UNK” in the IUCR column, “YES” in the GUNSHOT_I column, and “NON-FATAL” in the PRIMARY column to indicate that the victim was non-fatally shot, but the precise IUCR code is unknown.
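The two fallback rules above can be expressed as small functions. This is a sketch, not the City's code: the function names and the dictionary-based record are assumptions, and the set of "sensible" non-fatal shooting codes must be supplied by the caller because the text does not enumerate them. Column names follow the paragraph above.

```python
HOMICIDE_CODES = {"0110", "0130"}  # first- and second-degree murder, per the text

def resolve_homicide_iucr(incident_iucr):
    """Keep a sensible homicide code; otherwise fall back to '01XX'."""
    return incident_iucr if incident_iucr in HOMICIDE_CODES else "01XX"

def apply_nonfatal_fallback(record, sensible_codes):
    """Return a copy of `record`, overriding three columns when the IUCR code
    does not signify a criminal sexual assault, robbery, or aggravated battery.

    `sensible_codes` is a caller-supplied set of such codes (an assumption;
    the source text does not list them).
    """
    if record["IUCR"] in sensible_codes:
        return dict(record)
    return {**record, "IUCR": "UNK", "GUNSHOT_I": "YES", "PRIMARY": "NON-FATAL"}

# "041A" is a placeholder value here, not checked against the official IUCR list.
placeholder_codes = {"041A"}
fixed = apply_nonfatal_fallback(
    {"IUCR": "0820", "GUNSHOT_I": "YES", "PRIMARY": "THEFT"}, placeholder_codes
)
```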

Other violent crime victims: For other violent crime types, we refer to the IUCR classification that exists in CPD's victim table, with only one exception:

When there is an incident that is associated with no victim with a matching IUCR code, we assume that this is an error. Every crime should have at least 1 victim with a matching IUCR code. In these cases, we change the IUCR code to reflect the incident IUCR code because CPD's incident table is considered to be more reliable than the victim table.

Note: All businesses identified as victims in CPD data have been removed from this dataset.

Note: The definition of “homicide” (shooting or otherwise) does not include justifiable homicide or involuntary manslaughter. This dataset also excludes any cases that CPD considers to be “unfounded” or “noncriminal.”

Note: In some instances, the police department's raw incident-level data and victim-level data that were inputs into this dataset do not align on the type of crime that occurred. In those instances, this dataset attempts to correct mismatches between incident and victim specific crime types. When it is not possible to determine which victims are associated with the most recent crime determination, the dataset will show empty cells in the respective demographic fields (age, sex, race, etc.).

Note: The initial reporting officer usually asks victims to report demographic data. If victims are unable to recall, the reporting officer will use their best judgment. “Unknown” can be reported if it is truly unknown.
Means of Transportation to Work
catalog.data.gov
data-usdot.opendata.arcgis.com
Updated Dec 19, 2024
Bureau of Transportation Statistics (BTS) (Point of Contact) (2024). Means of Transportation to Work [Dataset]. https://catalog.data.gov/dataset/means-of-transportation-to-work2
Explore at:
Dataset updated
Dec 19, 2024
Dataset provided by
Bureau of Transportation Statistics (http://www.rita.dot.gov/bts)
Description
The Means of Transportation to Work dataset was compiled using information from December 31, 2023 and updated December 12, 2024 from the Bureau of Transportation Statistics (BTS) and is part of the U.S. Department of Transportation (USDOT)/Bureau of Transportation Statistics (BTS) National Transportation Atlas Database (NTAD). The Means of Transportation to Work table from the 2023 American Community Survey (ACS) 5-year estimates was joined to 2023 tract-level geographies for all 50 States, District of Columbia and Puerto Rico provided by the Census Bureau. A new file was created that combines the demographic variables from the former with the cartographic boundaries of the latter. The national level census tract layer contains data on the number and percentage of commuters (workers 16 years and over) that used various transportation modes to get to work.
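The join described above (a table of commuter counts attached to tract boundaries, with counts and percentages per mode) can be sketched with plain dictionaries. Everything here is an assumption made for illustration: the tract identifier, the mode names, the geometry placeholder, and the function name `join_and_percent` are hypothetical and do not reflect the actual NTAD field names.

```python
def join_and_percent(acs_counts, tract_geometries):
    """Join per-tract commuter counts to tract geometries by tract ID and
    compute each mode's share of all commuters in that tract.

    Tracts missing from either input are skipped (an inner join).
    """
    joined = {}
    for tract_id, counts in acs_counts.items():
        if tract_id not in tract_geometries:
            continue
        total = sum(counts.values())
        percent = {
            mode: (100.0 * n / total if total else 0.0)
            for mode, n in counts.items()
        }
        joined[tract_id] = {
            "geometry": tract_geometries[tract_id],
            "count": counts,
            "percent": percent,
        }
    return joined

# Made-up inputs: one tract with counts and a placeholder geometry string.
acs = {"tract-001": {"drove_alone": 60, "transit": 30, "walked": 10},
       "tract-002": {"drove_alone": 5, "transit": 0, "walked": 0}}
geoms = {"tract-001": "POLYGON(...)"}
layer = join_and_percent(acs, geoms)
```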