Locally defined dataset containing a full list of patient registrations held within the Trust's EHR system. Details extend to include GP details and patient identifers.
Part of a set of two collections for patient-level costing, which provide a consistent approach to reporting cost information at patient level.
https://www.usa.gov/government-workshttps://www.usa.gov/government-works
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.
Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.
This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.
The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.
For more information:
NNDSS Supports the COVID-19 Response | CDC.
The deidentified data in the “COVID-19 Case Surveillance Public Use Data” include demographic characteristics, any exposure history, disease severity indicators and outcomes, clinical data, laboratory diagnostic test results, and presence of any underlying medical conditions and risk behaviors. All data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
COVID-19 case reports have been routinely submitted using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19 included. Current versions of these case definitions are available here: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/.
All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for laboratory-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. Case reporting using this new form is ongoing among U.S. states and territories.
To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.
CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<5) records and indirect identifiers (e.g., date of first positive specimen). Suppression includes rare combinations of demographic characteristics (sex, age group, race/ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.
For questions, please contact Ask SRRG (eocevent394@cdc.gov).
COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
json
file, which is a list of dictionaries with the following keys:- patient_id
: string. A continuous id of patients, starting from 0.- patient_uid
: string. Unique ID for each patient, with format PMID-x, where PMID is the PubMed Identifier of source article of the note and x denotes index of the note in source article.- PMID
: string. PMID for source article.- file_path
: string. File path of xml file of source article.- title
: string. Source article title.- patient
: string. Patient note.- age
: list of tuples. Each entry is in format (value, unit)
where value is a float number and unit is in 'year', 'month', 'week', 'day' and 'hour' indicating age unit. For example, [[1.0, 'year'], [2.0, 'month']]
indicating the patient is a one-year- and two-month-old infant.- gender
: 'M' or 'F'. Male or Female.- relevant_articles
: dict. The key is PMID of the relevant articles and the corresponding value is its relevance score (2 or 1 as defined in the Methods'' section).- `similar_patients`: dict. The key is patient_uid of the similar patients and the corresponding value is its similarity score (2 or 1 as defined in the
Methods'' section).Sample dataset for training webinar on Tuesday, April 16 2024.
PMC-Patients is a first-of-its-kind dataset consisting of 167k patient summaries extracted from case reports in PubMed Central (PMC), 3.1M patient-article relevance and 293k patient-patient similarity annotations defined by PubMed citation graph.
Dataset Description - **Homepage:** https://github.com/pmc-patients/pmc-patients
- **Repository:** https://github.com/pmc-patients/pmc-patients
- **Paper:** https://arxiv.org/pdf/2202.13876.pdf
- **Leaderboard:** https://pmc-patients.github.io/
- **Point of Contact:** zhengyun21@mails.tsinghua.edu.cn Dataset Structure This file contains all information about patients summaries in PMC-Patients, with the following columns:
%3C!-- --%3E
Dataset Creation
If you are interested in the collection of PMC-Patients and reproducing our baselines, please refer to [this repository](https://github.com/zhao-zy15/PMC-Patients).
A submission to cover annual changes to the PLICS integrated data set.
In this manuscript EPA researchers used high resolution (1x1 km) modeled air quality data from a model built by Harvard collaborators to estimate the association between short-term exposure to air pollution and the occurrence of 30-day readmissions in a heart failure population. The heart failure population was taken from patients presenting to a University of North Carolina Healthcare System (UNCHCS) affiliated hospital or clinic that reported electronic health records to the Carolina Data Warehouse for Health (CDW-H). A description of the variables used in this analysis are available in the data dictionary (L:/PRIV/EPHD_CRB/Cavin/CARES/Data Dictonaries/HF short term PM25 and readmissions data dictionary.xlsx) associated with this manuscript. Analysis code is available in L:/PRIV/EPHD_CRB/Cavin/CARES/Project Analytic Code/Lauren Wyatt/DailyPM_HF_readmission. This dataset is not publicly accessible because: Dataset is PII in the form of electronic health records. It can be accessed through the following means: Data can be accessed with an approved IRB. Format: In this manuscript EPA researchers used high resolution (1x1 km) modeled air quality data from a model built by Harvard collaborators to estimate the association between short-term exposure to air pollution and the occurrence of 30-day readmissions in a heart failure population. The heart failure population was taken from patients presenting to a University of North Carolina Healthcare System (UNCHCS) affiliated hospital or clinic that reported electronic health records to the Carolina Data Warehouse for Health (CDW-H). A description of the variables used in this analysis are available in the data dictionary (L:/PRIV/EPHD_CRB/Cavin/CARES/Data Dictonaries/HF short term PM25 and readmissions data dictionary.xlsx) associated with this manuscript. Analysis code is available in L:/PRIV/EPHD_CRB/Cavin/CARES/Project Analytic Code/Lauren Wyatt/DailyPM_HF_readmission. This dataset is associated with the following publication: Wyatt, L., A. Weaver, J. Moyer, J. Schwartz, Q. Di, D. Diazsanchez, W. Cascio, and C. Ward-Caviness. Short-term PM2.5 exposure and early-readmission risk: A retrospective cohort study in North Carolina Heart Failure Patients. American Heart Journal. Mosby Year Book Incorporated, Orlando, FL, USA, 248: 130-138, (2022).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘COVID-19 Cases by Population Characteristics Over Time’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/a3291d85-0076-43c5-a59c-df49480cdc6d on 13 February 2022.
--- Dataset description provided by original source is as follows ---
Note: On January 22, 2022, system updates to improve the timeliness and accuracy of San Francisco COVID-19 cases and deaths data were implemented. You might see some fluctuations in historic data as a result of this change. Due to the changes, starting on January 22, 2022, the number of new cases reported daily will be higher than under the old system as cases that would have taken longer to process will be reported earlier.
A. SUMMARY This dataset shows San Francisco COVID-19 cases by population characteristics and by specimen collection date. Cases are included on the date the positive test was collected.
Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how cases have been distributed among different subgroups. This information can reveal trends and disparities among groups.
Data is lagged by five days, meaning the most recent specimen collection date included is 5 days prior to today. Tests take time to process and report, so more recent data is less reliable.
B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases and deaths are from: * Case interviews * Laboratories * Medical providers
These multiple streams of data are merged, deduplicated, and undergo data verification processes. This data may not be immediately available for recently reported cases because of the time needed to process tests and validate cases. Daily case totals on previous days may increase or decrease. Learn more.
Data are continually updated to maximize completeness of information and reporting on San Francisco residents with COVID-19.
Data notes on each population characteristic type is listed below.
Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases. * The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups.
Sexual orientation * Sexual orientation data is collected from individuals who are 18 years old or older. These individuals can choose whether to provide this information during case interviews. Learn more about our data collection guidelines. * The City began asking for this information on April 28, 2020.
Gender * The City collects information on gender identity using these guidelines.
Comorbidities * Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death.
Transmission type * Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown.
Homelessness
Persons are identified as homeless based on several data sources:
* self-reported living situation
* the location at the time of testing
* Department of Public Health homelessness and health databases
* Residents in Single-Room Occupancy hotels are not included in these figures.
These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions.
Skilled Nursing Facility (SNF) occupancy * A Skilled Nursing
--- Original source retains full ownership of the source dataset ---
As of July 2nd, 2024 the COVID-19 Deaths by Population Characteristics Over Time dataset has been retired. This dataset is archived and will no longer update. We will be publishing a cumulative deaths by population characteristics dataset that will update moving forward.
A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics and by date. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals for previous days may increase or decrease. More recent data is less reliable.
Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups.
B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national https://preparedness.cste.org/wp-content/uploads/2022/12/CSTE-Revised-Classification-of-COVID-19-associated-Deaths.Final_.11.22.22.pdf">Council of State and Territorial Epidemiologists. Death certificates are maintained by the California Department of Public Health.
Data on the population characteristics of COVID-19 deaths are from: *Case reports *Medical records *Electronic lab reports *Death certificates
Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths.
To protect resident privacy, we summarize COVID-19 data by only one characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more.
Data notes on each population characteristic type is listed below.
Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases.
Gender * The City collects information on gender identity using these guidelines.
C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week.
Dataset will not update on the business day following any federal holiday.
D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).
This dataset includes many different types of characteristics. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of deaths on each date.
New deaths are the count of deaths within that characteristic group on that specific date. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed.
This data may not be immediately available for more recent deaths. Data updates as more information becomes available.
To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset.
E. CHANGE LOG
A. SUMMARY This archived dataset includes data for population characteristics that are no longer being reported publicly. The date on which each population characteristic type was archived can be found in the field “data_loaded_at”. B. HOW THE DATASET IS CREATED Data on the population characteristics of COVID-19 cases are from: * Case interviews * Laboratories * Medical providers These multiple streams of data are merged, deduplicated, and undergo data verification processes. Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases. * The population estimates for the "Other" or “Multi-racial” groups should be considered with caution. The Census definition is likely not exactly aligned with how the City collects this data. For that reason, we do not recommend calculating population rates for these groups. Gender * The City collects information on gender identity using these guidelines. Skilled Nursing Facility (SNF) occupancy * A Skilled Nursing Facility (SNF) is a type of long-term care facility that provides care to individuals, generally in their 60s and older, who need functional assistance in their daily lives. * This dataset includes data for COVID-19 cases reported in Skilled Nursing Facilities (SNFs) through 12/31/2022, archived on 1/5/2023. These data were identified where “Characteristic_Type” = ‘Skilled Nursing Facility Occupancy’. Sexual orientation * The City began asking adults 18 years old or older for their sexual orientation identification during case interviews as of April 28, 2020. Sexual orientation data prior to this date is unavailable. * The City doesn’t collect or report information about sexual orientation for persons under 12 years of age. * Case investigation interviews transitioned to the California Department of Public Health, Virtual Assistant information gathering beginning December 2021. The Virtual Assistant is only sent to adults who are 18+ years old. Learn more about our data collection guidelines pertaining to sexual orientation. Comorbidities * Underlying conditions are reported when a person has one or more underlying health conditions at the time of diagnosis or death. Homelessness Persons are identified as homeless based on several data sources: * self-reported living situation * the location at the time of testing * Department of Public Health homelessness and health databases * Residents in Single-Room Occupancy hotels are not included in these figures. These methods serve as an estimate of persons experiencing homelessness. They may not meet other homelessness definitions. Single Room Occupancy (SRO) tenancy * SRO buildings are defined by the San Francisco Housing Code as having six or more "residential guest rooms" which may be attached to shared bathrooms, kitchens, and living spaces. * The details of a person's living arrangements are verified during case interviews. Transmission Type * Information on transmission of COVID-19 is based on case interviews with individuals who have a confirmed positive test. Individuals are asked if they have been in close contact with a known COVID-19 case. If they answer yes, transmission category is recorded as contact with a known case. If they report no contact with a known case, transmission category is recorded as community transmission. If the case is not interviewed or was not asked the question, they are counted as unknown. C. UPDATE PROCESS This dataset has been archived and will no longer update as of 9/11/2023. D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco po
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset presents the footprint of the number of emergency department presentations in public hospitals by patient demographics and location. Mental health-related emergency department (ED) presentations are defined as presentations to public hospital EDs that have a principal diagnosis of mental and behavioural disorders. However, the definition does not fully capture all potential mental health-related presentations to EDs such as intentional self-harm, as intent can be difficult to identify in an ED environment and can also be difficult to code. The data spans the financial years of 2014-2018 and is aggregated to Statistical Area Level 3 (SA3) geographic areas from the 2016 Australian Statistical Geography Standard (ASGS). State and territory health authorities collect a core set of nationally comparable information on most public hospital ED presentations in their jurisdiction, which is compiled annually into the National Non-Admitted Patient Emergency Department Care Database (NNAPEDCD). The data reported for 2014–15 to 2017–18 is sourced from the NNAPEDCD. Information about mental health-related services provided in EDs prior to 2014–15 was supplied directly to the Australian Institute of Health and Welfare (AIHW) by states and territories. Mental health services in Australia (MHSA) provides a picture of the national response of the health and welfare service system to the mental health care needs of Australians. MHSA is updated progressively throughout each year as data becomes available. The data accompanies the Mental Health Services - In Brief 2018 Web Report. For further information about this dataset, visit the data source:Australian Institute of Health and Welfare - Mental health services in Australia Data Tables. Please note: AURIN has spatially enabled the original data.
A. SUMMARY This dataset shows San Francisco COVID-19 deaths by population characteristics. This data may not be immediately available for recently reported deaths. Data updates as more information becomes available. Because of this, death totals may increase or decrease. Population characteristics are subgroups, or demographic cross-sections, like age, race, or gender. The City tracks how deaths have been distributed among different subgroups. This information can reveal trends and disparities among groups. B. HOW THE DATASET IS CREATED As of January 1, 2023, COVID-19 deaths are defined as persons who had COVID-19 listed as a cause of death or a significant condition contributing to their death on their death certificate. This definition is in alignment with the California Department of Public Health and the national Council of State and Territorial Epidemiologists. Death certificates are maintained by the California Department of Public Health. Data on the population characteristics of COVID-19 deaths are from: Case reports Medical records Electronic lab reports Death certificates Data are continually updated to maximize completeness of information and reporting on San Francisco COVID-19 deaths. To protect resident privacy, we summarize COVID-19 data by only one population characteristic at a time. Data are not shown until cumulative citywide deaths reach five or more. Data notes on select population characteristic types are listed below. Race/ethnicity * We include all race/ethnicity categories that are collected for COVID-19 cases. Gender * The City collects information on gender identity using these guidelines. C. UPDATE PROCESS Updates automatically at 06:30 and 07:30 AM Pacific Time on Wednesday each week. Dataset will not update on the business day following any federal holiday. D. HOW TO USE THIS DATASET Population estimates are only available for age groups and race/ethnicity categories. San Francisco population estimates for race/ethnicity and age groups can be found in a dataset based on the San Francisco Population and Demographic Census dataset.These population estimates are from the 2018-2022 5-year American Community Survey (ACS). This dataset includes several characteristic types. Filter the “Characteristic Type” column to explore a topic area. Then, the “Characteristic Group” column shows each group or category within that topic area and the number of cumulative deaths. Cumulative deaths are the running total of all San Francisco COVID-19 deaths in that characteristic group up to the date listed. To explore data on the total number of deaths, use the COVID-19 Deaths Over Time dataset. E. CHANGE LOG
Not seeing a result you expected?
Learn how you can add new datasets to our index.
Locally defined dataset containing a full list of patient registrations held within the Trust's EHR system. Details extend to include GP details and patient identifers.