100+ datasets found
  1. Infectious Diseases by Disease, County, Year, and Sex

    • data.chhs.ca.gov
    • data.ca.gov
    • +3 more
    csv, zip
    Updated Nov 7, 2025
    Cite
    California Department of Public Health (2025). Infectious Diseases by Disease, County, Year, and Sex [Dataset]. https://data.chhs.ca.gov/dataset/infectious-disease
    Available download formats: zip, csv(12953665)
    Dataset updated
    Nov 7, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    These data contain case counts and rates for selected communicable diseases (listed in the data dictionary) that met the surveillance case definition for that disease and were reported for California residents, by disease, county, year, and sex. The data represent cases with an estimated illness onset date from 2001 through the last year indicated from California Confidential Morbidity Reports and/or Laboratory Reports. Data captured represent reportable case counts as of the date indicated in the “Temporal Coverage” section below, so the data presented may differ from previous publications due to delays inherent to case reporting, laboratory reporting, and epidemiologic investigation.

  2. HIV: annual data

    • gov.uk
    Updated Oct 7, 2025
    Cite
    UK Health Security Agency (2025). HIV: annual data [Dataset]. https://www.gov.uk/government/statistics/hiv-annual-data-tables
    Dataset updated
    Oct 7, 2025
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    UK Health Security Agency
    Description

    Data on all HIV diagnoses, AIDS and deaths among people diagnosed with HIV are collected from HIV outpatient clinics, laboratories and other healthcare settings. Data relating to people living with HIV is collected from HIV outpatient clinics. Data relates to England, Wales, Northern Ireland and Scotland, unless stated.

    HIV testing, pre-exposure prophylaxis, and post-exposure prophylaxis data relates to activity at sexual health services in England only.


    Our statistical practice is regulated by the Office for Statistics Regulation (OSR). The OSR sets the standards of trustworthiness, quality and value in the Code of Practice for Statistics (https://code.statisticsauthority.gov.uk/) that all producers of Official Statistics should adhere to.

    Additional information on HIV surveillance can be found in the HIV Action Plan for England monitoring and evaluation framework reports. Other HIV in the UK reports published by Public Health England (PHE) are available online.

  3. Coronavirus (Covid-19) Data in the United States

    • github.com
    • openicpsr.org
    • +4 more
    csv
    Cite
    New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://github.com/nytimes/covid-19-data
    Available download formats: csv
    Dataset provided by
    New York Times
    License

    https://github.com/nytimes/covid-19-data/blob/master/LICENSE

    Description

    The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.

    Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.

    We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.

    The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
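
    Because the files hold cumulative counts, per-day increases have to be derived by differencing. The following is a minimal sketch, assuming rows with "date", "state", and "cases" keys (these key names and all sample values are illustrative assumptions, not taken from the repository):

```python
from collections import defaultdict


def daily_new_cases(rows):
    """Convert cumulative case counts into per-day increases for each state.

    `rows` is an iterable of dicts with the assumed keys "date" (ISO string),
    "state", and "cases" (cumulative count).
    """
    by_state = defaultdict(list)
    for row in rows:
        by_state[row["state"]].append((row["date"], int(row["cases"])))

    result = {}
    for state, series in by_state.items():
        series.sort()  # ISO dates sort chronologically as plain strings
        previous = 0
        daily = []
        for day, cumulative in series:
            # Keep the raw difference: a downward revision of the cumulative
            # series shows up as a negative value instead of being hidden.
            daily.append((day, cumulative - previous))
            previous = cumulative
        result[state] = daily
    return result


# Hypothetical sample rows, for illustration only.
sample = [
    {"date": "2020-01-21", "state": "Washington", "cases": "1"},
    {"date": "2020-01-22", "state": "Washington", "cases": "1"},
    {"date": "2020-01-23", "state": "Washington", "cases": "3"},
]
print(daily_new_cases(sample)["Washington"])
```

    Negative differences are left in place on purpose, since clamping them to zero would hide corrections made by the reporting agencies.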

  4. US HAI Infections by State

    • kaggle.com
    zip
    Updated Jan 23, 2023
    Cite
    The Devastator (2023). US HAI Infections by State [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-hai-infections-by-state
    Available download formats: zip(13073 bytes)
    Dataset updated
    Jan 23, 2023
    Authors
    The Devastator
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    US HAI Infections by State

    Investigating Prevalence and Prevention of Healthcare-Associated Infections

    By Health [source]

    About this dataset

    This dataset provides an overview of Healthcare-Associated Infections (HAIs) across all US states. HAIs, which include central line and urinary catheter infections, can be devastating to patient care and outcomes. The data are collected by the Centers for Disease Control and Prevention (CDC) through the National Healthcare Safety Network (NHSN). The dataset contains infection rates along with measure names, scores, footnotes, and measure start and end dates. It offers an opportunity to better understand the prevalence of HAIs at the state level in order to improve the patient safety measures used in hospitals nationwide.

    How to use the dataset

    This dataset provides valuable insights into healthcare-associated infections (HAIs) across the United States. By understanding which states have higher rates of HAIs, we can better understand the overall state of health care in the country.

    To get started using this data set, take a look at some of the columns included: Measure Name, Score, Footnote and Measure Start and End Date. The measure name will give you an overview of what type of infection is being measured in each state. Each measurement has an associated score that tells you how well each state is doing with respect to other states when it comes to preventing these infections from occurring. The footnote gives more information about specific details surrounding that particular HAI measure for that state (for instance, if all or only some hospitals are included in the measure). Finally, the start and end dates tell you when the measure began and ended in regards to data collection for each state.

    Once you have explored these columns, look more closely at the values each column contains, for example which states have a high number of infections related to surgical procedures compared with others. Thinking critically about this data will reveal trends among states and how they compare in providing quality health care services within their facilities.

    By exploring these trends further with charts or graphs, you can better determine which areas need improvement so that preventative measures can be developed against further infections in hospitals across all US states.

    Research Ideas

    • Creating a state-by-state map of healthcare-associated infection rates in order to identify which states have the highest and lowest rates of HAIs.
    • Developing a predictive model to determine the likelihood of an infection in a particular hospital based on data from all the other hospitals in the same state, allowing hospitals to adjust their safety protocols accordingly.
    • Constructing an infographic displaying different points picked up within this dataset such as what are common sources of infection, breakdowns by states and types, etc

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: Open Database License (ODbL) v1.0

    You are free to:
    • Share - copy and redistribute the material in any medium or format.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.
    • No Derivatives - If you remix, transform, or build upon the material, you may not distribute the modified material.
    • No additional restrictions - You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

    Columns

    File: Healthcare_Associated_Infections_-_State.csv | Column name | Description | |:-----------------------|:------------------------------------------------------------------------------------------------------------------------|...

  5. NNDSS - Table II. Chlamydia trachomatis infection to Coccidioidomycosis

    • data.cdc.gov
    • data.virginia.gov
    • +6 more
    csv, xlsx, xml
    Updated Jan 2, 2019
    + more versions
    Cite
    Division of Health Informatics and Surveillance (DHIS), Centers for Disease Control and Prevention (2019). NNDSS - Table II. Chlamydia trachomatis infection to Coccidioidomycosis [Dataset]. https://data.cdc.gov/NNDSS/NNDSS-Table-II-Chlamydia-trachomatis-infection-to-/5egk-p6rd
    Available download formats: xlsx, csv, xml
    Dataset updated
    Jan 2, 2019
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Authors
    Division of Health Informatics and Surveillance (DHIS), Centers for Disease Control and Prevention
    Description

    NNDSS - Table II. Chlamydia trachomatis infection to Coccidioidomycosis - 2018. In this Table, provisional cases of selected notifiable diseases (≥1,000 cases reported during the preceding year), and selected low frequency diseases are displayed. The Table includes total number of cases reported in the United States, by region and by states or territory.

    Note:

    This table contains provisional cases of selected national notifiable diseases from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data from the 50 states, New York City, the District of Columbia and the U.S. territories are collated and published weekly on the NNDSS Data and Statistics web page (https://wwwn.cdc.gov/nndss/data-and-statistics.html). Cases reported by state health departments to CDC for weekly publication are provisional because of the time needed to complete case follow-up. Therefore, numbers presented in later weeks may reflect changes made to these counts as additional information becomes available. The national surveillance case definitions used to define a case are available on the NNDSS web site at https://wwwn.cdc.gov/nndss/. Information about the weekly provisional data and guides to interpreting data are available at: https://wwwn.cdc.gov/nndss/infectious-tables.html

    Footnotes:

    C.N.M.I.: Commonwealth of Northern Mariana Islands. U: Unavailable. —: No reported cases. N: Not reportable. NA: Not Available. NN: Not Nationally Notifiable. NP: Nationally notifiable but not published. Cum: Cumulative year-to-date counts. Med: Median. Max: Maximum.

  6. California Infectious Disease Cases

    • kaggle.com
    zip
    Updated Jan 24, 2023
    Cite
    The Devastator (2023). California Infectious Disease Cases [Dataset]. https://www.kaggle.com/datasets/thedevastator/california-infectious-disease-cases
    Available download formats: zip(2093378 bytes)
    Dataset updated
    Jan 24, 2023
    Authors
    The Devastator
    Area covered
    California
    Description

    California Infectious Disease Cases

    Rates and Counts By County, Disease, Sex, and Year (2001-2014)

    By Health [source]

    About this dataset

    This dataset provides comprehensive information on the number and rate of infectious diseases in California. Focusing on counties, sexes, and various diseases between 2001-2014, it offers powerful insights into the health status of its citizens. Its data also reveals trends in the spread of common illnesses in this state. Whether you are an epidemiologist looking to inform public health policy or a researcher seeking to investigate particular illnesses within certain populations, this dataset contains all the necessary information to answer your questions. Explore it today and discover hidden stories waiting to be uncovered!

    How to use the dataset

    This dataset contains counts and rates of infectious diseases in California by county, disease, sex, and year. This dataset can be used to generate trends to understand the changes in incidence of different types of diseases over time and across counties or between sexes.

    To use this dataset:
    • Select the columns you are interested in exploring; these could include Disease, County, Sex or Year.
    • Filter out the rows that do not relate to your question, for example by restricting to a specific county or disease.
    • Examine the average rate per 100,000 people for each group you selected, as well as its lower and upper confidence intervals (CI).
    • Use Rate as your dependent variable for analysis; Population is likely also an important determining factor. Make sure to check whether any Rates have 'unstable' flags.
    • Visualise or statistically analyse your data using suitable methods, such as descriptive statistics (mean, median, mode) for comparisons between two or more groups, or correlation- or regression-based models when comparing one variable with another over time.
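
    The select, filter, and check-for-unstable steps above can be sketched as follows. This is a rough illustration only: the column names Disease, County, Year, and Rate follow the column list for this file, while the "Unstable" key, its encoding (any non-empty marker counts as unstable), and every sample value are assumptions.

```python
from statistics import mean


def stable_rates(rows, disease, county):
    """Return sorted (year, rate) pairs for one disease and county,
    skipping rows whose rate carries an 'unstable' flag."""
    selected = []
    for row in rows:
        if row["Disease"] != disease or row["County"] != county:
            continue
        # Assumption: an empty or missing flag means the rate is stable.
        if (row.get("Unstable") or "").strip():
            continue
        selected.append((int(row["Year"]), float(row["Rate"])))
    return sorted(selected)


# Hypothetical rows, for illustration only.
sample_rows = [
    {"Disease": "Disease A", "County": "County X", "Year": "2002", "Rate": "4.0", "Unstable": ""},
    {"Disease": "Disease A", "County": "County X", "Year": "2001", "Rate": "2.0", "Unstable": ""},
    {"Disease": "Disease A", "County": "County X", "Year": "2003", "Rate": "9.0", "Unstable": "*"},
    {"Disease": "Disease B", "County": "County X", "Year": "2001", "Rate": "7.0", "Unstable": ""},
]

pairs = stable_rates(sample_rows, "Disease A", "County X")
average_rate = mean(rate for _, rate in pairs)
print(pairs, average_rate)
```

    In practice the rows would come from reading the CSV file (for example with `csv.DictReader`), and the flag encoding should be checked against the data dictionary before relying on this filter.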

    Research Ideas

    • Analyzing the geographic spread of infectious diseases over time to identify areas in need of increased education, resources, and care.
    • Comparing rates of disease by sex to identify and understand any gender-based differences in infectious disease cases.
    • Using the Unstable column to determine whether a particular county or region needs further study of a certain type of infectious disease due to unusual spikes or drops in rate or count during a specific year

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: Infectious_Disease_Cases_by_County_Year_and_Sex_2001-2014.csv

    | Column name | Description |
    |:------------|:------------|
    | Disease | The type of infectious disease reported. (String) |
    | County | The county in California where the cases were reported. (String) |
    | Year | The year in which the cases were reported. (Integer) |
    | Sex | The gender of the individuals who contracted the disease. (String) |
    | Population | The population size of the county in which the cases were reported. (Integer) |
    | Rate | The rate of infection per 100 thousand people living in the county. (Float) |
    | CI.lower | The lower confidence interval associated with the rate of infection. (Float) |
    | CI.upper | The upper confidence interval associated with the rate of infection. (Float) |
    ...

  7. COVID-19 Post-Vaccination Infection Data (ARCHIVED)

    • data.chhs.ca.gov
    • data.ca.gov
    • +4 more
    csv, xlsx, zip
    Updated Nov 7, 2025
    Cite
    California Department of Public Health (2025). COVID-19 Post-Vaccination Infection Data (ARCHIVED) [Dataset]. https://data.chhs.ca.gov/dataset/covid-19-post-vaccination-infection-data
    Available download formats: csv(38212), zip, csv(90508), csv(78921), xlsx(11056)
    Dataset updated
    Nov 7, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    Note: This dataset is no longer being updated due to the end of the COVID-19 Public Health Emergency.

    The California Department of Public Health (CDPH) is identifying vaccination status of COVID-19 cases, hospitalizations, and deaths by analyzing the state immunization registry and registry of confirmed COVID-19 cases. Post-vaccination cases are individuals who have a positive SARS-CoV-2 molecular test (e.g. PCR) at least 14 days after they have completed their primary vaccination series.
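
    The 14-day rule in this definition can be expressed as a simple date comparison. The sketch below is illustrative only; the function and parameter names are hypothetical and are not part of any CDPH tooling.

```python
from datetime import date, timedelta

# "At least 14 days after" completing the primary series, per the description.
MIN_DAYS_AFTER_SERIES = 14


def is_post_vaccination_case(positive_test_date, series_completed_date):
    """Return True when a positive molecular test falls at least 14 days
    after completion of the primary vaccination series."""
    if series_completed_date is None:
        # No completed primary series on record, so not a post-vaccination case.
        return False
    elapsed = positive_test_date - series_completed_date
    return elapsed >= timedelta(days=MIN_DAYS_AFTER_SERIES)


print(is_post_vaccination_case(date(2021, 6, 15), date(2021, 6, 1)))
print(is_post_vaccination_case(date(2021, 6, 14), date(2021, 6, 1)))
```

    The comparison is inclusive because the description says "at least 14 days"; a test exactly 14 days after completion qualifies.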

    Tracking cases of COVID-19 that occur after vaccination is important for monitoring the impact of immunization campaigns. While COVID-19 vaccines are safe and effective, some cases are still expected in persons who have been vaccinated, as no vaccine is 100% effective. For more information, please see https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/COVID-19/Post-Vaccine-COVID19-Cases.aspx

    Post-vaccination infection data is updated monthly and includes data on cases, hospitalizations, and deaths among the unvaccinated and the vaccinated. Partially vaccinated individuals are excluded. To account for reporting and processing delays, there is at least a one-month lag in provided data (for example data published on 9/9/22 will include data through 7/31/22).

    Notes:

    • On September 9, 2022, the post-vaccination data was changed to compare unvaccinated persons with those who have completed at least a primary series, for persons age 5+. These data will be updated monthly (first Thursday of the month) and include at least a one-month lag.

    • On February 2, 2022, the post-vaccination data was changed to distinguish between vaccination with a primary series only versus vaccinated and boosted. The previous dataset has been uploaded as an archived table. Additionally, the lag on this data has been extended to 14 days.

    • On November 29, 2021, the denominator for calculating vaccine coverage was changed from age 16+ to age 12+ to reflect new vaccine eligibility criteria. The previous dataset based on age 16+ denominators has been uploaded as an archived table.

  8. NNDSS - TABLE 1Q. Hepatitis B, perinatal infection to Hepatitis C, acute,...

    • catalog.data.gov
    • data.virginia.gov
    • +2 more
    Updated Jul 9, 2025
    + more versions
    Cite
    Centers for Disease Control and Prevention (2025). NNDSS - TABLE 1Q. Hepatitis B, perinatal infection to Hepatitis C, acute, Probable [Dataset]. https://catalog.data.gov/dataset/nndss-table-1q-hepatitis-b-perinatal-infection-to-hepatitis-c-acute-probable-2ee3e
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    NNDSS - TABLE 1Q. Hepatitis B, perinatal infection to Hepatitis C, acute, Probable - 2022. In this Table, provisional cases* of notifiable diseases are displayed for United States, U.S. territories, and Non-U.S. residents.

    Notes:

    • These are weekly cases of selected infectious national notifiable diseases, from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data reported by the 50 states, New York City, the District of Columbia, and the U.S. territories are collated and published weekly as numbered tables available at https://www.cdc.gov/nndss/data-statistics/index.html. Cases reported by state health departments to CDC for weekly publication are subject to ongoing revision of information and delayed reporting. Therefore, numbers listed in later weeks may reflect changes made to these counts as additional information becomes available. Case counts in the tables are presented as published each week. See also Guide to Interpreting Provisional and Finalized NNDSS Data at https://www.cdc.gov/nndss/docs/Readers-Guide-WONDER-Tables-20210421-508.pdf.
    • Notices, errata, and other notes are available in the Notice To Data Users page at https://wonder.cdc.gov/nndss/NTR.html.
    • The list of national notifiable infectious diseases and conditions and their national surveillance case definitions are available at https://ndc.services.cdc.gov/. This list incorporates the Council of State and Territorial Epidemiologists (CSTE) position statements approved by CSTE for national surveillance.

    Footnotes:

    *Case counts for reporting years 2021 and 2022 are provisional and subject to change. Cases are assigned to the reporting jurisdiction submitting the case to NNDSS, if the case's country of usual residence is the U.S., a U.S. territory, unknown, or null (i.e. country not reported); otherwise, the case is assigned to the 'Non-U.S. Residents' category. Country of usual residence is currently not reported by all jurisdictions or for all conditions. For further information on interpretation of these data, see https://www.cdc.gov/nndss/docs/Readers-Guide-WONDER-Tables-20210421-508.pdf.

    †Previous 52 week maximum and cumulative YTD are determined from periods of time when the condition was reportable in the jurisdiction (i.e., may be less than 52 weeks of data or incomplete YTD data).

    U: Unavailable. The reporting jurisdiction was unable to send the data to CDC or CDC was unable to process the data.
    -: No reported cases. The reporting jurisdiction did not submit any cases to CDC.
    N: Not reportable. The disease or condition was not reportable by law, statute, or regulation in the reporting jurisdiction.
    NN: Not nationally notifiable. This condition was not designated as being nationally notifiable.
    NP: Nationally notifiable but not published.
    NC: Not calculated. There is insufficient data available to support the calculation of this statistic.
    Cum: Cumulative year-to-date counts.
    Max: Maximum. Maximum case count during the previous 52 weeks.

  9. Data from: Prediction of new HIV infection in men who have sex with men...

    • datasetcatalog.nlm.nih.gov
    • tandf.figshare.com
    Updated Mar 10, 2025
    Cite
    Zhang, Cong; Zou, Lei; Lin, Bing; Shi, Guiqian; Deng, Jielian; Tao, Yi; Li, Kangjie; Xie, Biao; Wang, Qian; Zeng, Haijiao; Zhong, Xiaoni (2025). Prediction of new HIV infection in men who have sex with men based on machine learning: secondary analysis of a prospective cohort study from Western China [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002059811
    Dataset updated
    Mar 10, 2025
    Authors
    Zhang, Cong; Zou, Lei; Lin, Bing; Shi, Guiqian; Deng, Jielian; Tao, Yi; Li, Kangjie; Xie, Biao; Wang, Qian; Zeng, Haijiao; Zhong, Xiaoni
    Area covered
    Western China
    Description

    This study aimed to construct a machine learning model to predict new HIV infections in HIV-negative men who have sex with men (MSM). This is a secondary analysis of a previous randomized clinical trial that evaluated the preventive effects of PrEP on new HIV infection in MSM. During 2013–2015, 1455 HIV-negative MSM were enrolled. Participants were divided into a treatment group and a control group and regularly followed up until they seroconverted to HIV positive or reached the 2-year endpoint. Five machine-learning approaches were applied to predict the risk of HIV infection. Model performance was evaluated using Harrell's C-index and the area under the receiver operating characteristic curve (AUC) and validated in an external validation cohort. To explain the model, Shapley additive explanation (SHAP) values were calculated and visualized. During the observation period, 102 MSM developed HIV infection. Thirteen parameters were selected to construct the model. The random survival forest model showed the best performance in the validation cohort, with a C-index of 0.7013, and could significantly categorize MSM into three groups. The model indicated that MSM with younger age, receptive anal intercourse, and multiple male sexual partners had an increased risk of HIV infection, and those with higher AIDS knowledge scores had a lower risk. We presented a machine learning-based model to predict the risk of HIV infection in MSM. This model could be applied to identify MSM who are at higher risk of developing HIV infection.

  10. ARCHIVED: COVID-19 Cases and Deaths Summarized by Geography

    • data.sfgov.org
    Updated Sep 11, 2023
    + more versions
    Cite
    Department of Public Health - Population Health Division (2023). ARCHIVED: COVID-19 Cases and Deaths Summarized by Geography [Dataset]. https://data.sfgov.org/COVID-19/ARCHIVED-COVID-19-Cases-and-Deaths-Summarized-by-G/tpyr-dvnc
    Available download formats: xml, csv, kml, kmz, application/geo+json, xlsx
    Dataset updated
    Sep 11, 2023
    Dataset authored and provided by
    Department of Public Health - Population Health Division
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0 (http://www.opendatacommons.org/licenses/pddl/1.0/)
    License information was derived automatically

    Description

    A. SUMMARY Medical provider confirmed COVID-19 cases and confirmed COVID-19 related deaths in San Francisco, CA aggregated by several different geographic areas and normalized by 2016-2020 American Community Survey (ACS) 5-year estimates for population data to calculate rate per 10,000 residents.

    On September 12, 2021, a new case definition of COVID-19 was introduced that includes criteria for enumerating new infections after previous probable or confirmed infections (also known as reinfections). A reinfection is defined as a confirmed positive PCR lab test more than 90 days after a positive PCR or antigen test. The first reinfection case was identified on December 7, 2021.
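
    The reinfection rule stated above reduces to a date difference. A minimal sketch, with hypothetical function and parameter names:

```python
from datetime import date, timedelta

# "More than 90 days" after the earlier positive test, per the case definition.
REINFECTION_GAP_DAYS = 90


def is_reinfection(new_pcr_positive_date, prior_positive_date):
    """Return True when a confirmed positive PCR test occurs more than
    90 days after an earlier positive PCR or antigen test."""
    gap = new_pcr_positive_date - prior_positive_date
    return gap > timedelta(days=REINFECTION_GAP_DAYS)


prior = date(2021, 9, 1)
print(is_reinfection(prior + timedelta(days=90), prior))
print(is_reinfection(prior + timedelta(days=91), prior))
```

    The comparison is strict because the definition says "more than 90 days"; a positive test on day 90 exactly does not qualify under this reading.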

    Cases and deaths are both mapped to the residence of the individual, not to where they were infected or died. For example, a person who was infected at work in San Francisco but lives in the East Bay is not counted as an SF case, and a person who dies at Zuckerberg San Francisco General but is from another county is not counted in this dataset.

    Dataset is cumulative and covers cases going back to 3/2/2020 when testing began.

    Geographic areas summarized are:
    1. Analysis Neighborhoods
    2. Census Tracts
    3. Census Zip Code Tabulation Areas

    B. HOW THE DATASET IS CREATED Addresses from medical data are geocoded by the San Francisco Department of Public Health (SFDPH). Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area. The 2016-2020 American Community Survey (ACS) population estimates provided by the Census are used to create a rate equal to ([count] / [acs_population]) * 10000, representing the number of cases per 10,000 residents.
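
    The rate formula can be written out directly. This is a sketch only; the function name is hypothetical, and the handling of missing values is an assumption rather than documented SFDPH behavior.

```python
def rate_per_10k(count, acs_population):
    """Cases per 10,000 residents: (count / acs_population) * 10000.

    Returns None when the count is missing or the population is missing
    or zero (an assumed convention, chosen to avoid dividing by zero).
    """
    if count is None or not acs_population:
        return None
    return count / acs_population * 10000


# Hypothetical numbers: 50 cases in an area of 20,000 residents.
print(rate_per_10k(50, 20000))
```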

    C. UPDATE PROCESS Geographic analysis is scripted by SFDPH staff and synced to this dataset daily at 7:30 Pacific Time.

    D. HOW TO USE THIS DATASET San Francisco population estimates for geographic regions can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).

    Privacy rules in effect
    To protect privacy, certain rules are in effect:
    1. Case counts greater than 0 and less than 10 are dropped; these will be null (blank) values.
    2. Death counts greater than 0 and less than 10 are dropped; these will be null (blank) values.
    3. Cases and deaths are dropped altogether for areas where acs_population < 1000.

    Rate suppression in effect where counts are lower than 20
    Rates are not calculated unless the case count is greater than or equal to 20. Rates are generally unstable at small numbers, so we avoid calculating them directly. We advise you to apply the same approach as this is best practice in epidemiology.
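
    For readers who want to apply the same suppression to their own aggregates, the privacy and rate rules can be sketched as one function. The function name and the use of None for null (blank) values are assumptions; the thresholds come from the rules above.

```python
def apply_suppression(cases, deaths, acs_population):
    """Return (cases, deaths, rate) with suppressed values set to None.

    Rules applied, in order:
      - areas with acs_population < 1000: cases and deaths dropped altogether
      - case or death counts greater than 0 and less than 10: dropped
      - rate per 10,000 only calculated when the case count is >= 20
    """
    if acs_population < 1000:
        return None, None, None
    if 0 < cases < 10:
        cases = None
    if 0 < deaths < 10:
        deaths = None
    rate = None
    if cases is not None and cases >= 20:
        rate = cases / acs_population * 10000
    return cases, deaths, rate


# Hypothetical inputs, for illustration only.
print(apply_suppression(5, 0, 5000))
print(apply_suppression(15, 3, 5000))
print(apply_suppression(40, 12, 8000))
```

    A count of exactly 0 is kept rather than suppressed, since the rules only drop counts strictly between 0 and 10.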

    A note on Census ZIP Code Tabulation Areas (ZCTAs)
    ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are areal representations of routes. Read how the Census develops ZCTAs on their website.

    Row included for Citywide case counts, incidence rate, and deaths
    A single row is included that has the Citywide case counts and incidence rate. This can be used for comparisons. Citywide will capture all cases regardless of address quality. While some cases cannot be mapped to sub-areas like Census Tracts, ongoing data quality efforts result in improved mapping on a rolling basis.

    E. CHANGE LOG

    • 9/11/2023 - data on COVID-19 cases and deaths summarized by geography are no longer being updated. This data is currently through 9/6/2023 and will not include any new data after this date.
    • 4/6/2023 - the State implemented system updates to improve the integrity of historical data.
    • 2/21/2023 - system updates to improve reliability and accuracy of cases data were implemented.
    • 1/31/2023 - updated “acs_population” column to reflect the 2020 Census Bureau American Community Survey (ACS) San Francisco Population estimates.
    • 1/31/2023 - implemented system updates to streamline and improve our geo-coded data, resulting in small shifts in our case and death data by geography.
    • 1/31/2023 - renamed column “last_updated_at” to “data_as_of”.
    • 2/23/2022 - the New Cases Map dashboard began pulling from this dataset. To access Cases by Geography Over Time, please refer to this dataset.
    • 1/22/2022 - system updates to improve timeliness and accuracy of cases and deaths data were implemented.
    • 7/15/2022 - reinfections added to cases dataset. See section SUMMARY for more information on how reinfections are identified.
    • 4/16/2021 - dataset updated to refresh with a five-day data lag.

  11. d

    Johns Hopkins COVID-19 Case Tracker

    • data.world
    • kaggle.com
    csv, zip
    Updated Dec 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Dec 3, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 22, 2020 - Mar 9, 2023
    Area covered
    Description

    Updates

    • Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

    • April 9, 2020

      • The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.
    • April 20, 2020

      • Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.
    • April 29, 2020

      • The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
    • September 1st, 2020

      • Johns Hopkins is now providing counts for the five New York City counties individually.
    • February 12, 2021

      • The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."
      • Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.
    • February 16, 2021

      • Johns Hopkins has reconciled Ohio's historical deaths data with the state.

    Overview

    The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

    The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
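
    The per-100,000 rates mentioned above can be sketched as follows. This is a minimal illustration, not the AP's actual code; the function name `rate_per_100k` and the sample figures are hypothetical.

```python
from typing import Optional


def rate_per_100k(count: int, population: int) -> Optional[float]:
    """Return count per 100,000 residents, or None if the population is unusable."""
    if population <= 0:
        # A missing or zero population estimate cannot produce a rate.
        return None
    return count * 100_000 / population


# Hypothetical county: 250 cumulative cases among 50,000 residents.
example_rate = rate_per_100k(250, 50_000)
print(example_rate)
```

    Returning None for a non-positive population keeps rows without a usable population estimate from producing a misleading rate.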

    This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    Queries

    Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic

    Interactive

    The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

    @(https://datawrapper.dwcdn.net/nRyaf/15/)

    Interactive Embed Code

    <iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
    

    Caveats

    • This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.
    • In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.
    • In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county"
    • This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.
    • Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
    • Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.
    • The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

    • Johns Hopkins timeseries data

      • Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
      • Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here.
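
    Because the timeseries holds cumulative snapshots, a source that revises its history downward shows up as a negative day-over-day change. The sketch below illustrates one way to spot that; the function names and sample numbers are made up for illustration and are not part of the AP or Johns Hopkins tooling.

```python
def daily_changes(cumulative):
    """Difference consecutive cumulative snapshots into day-over-day changes."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def downward_revisions(cumulative):
    """Return snapshot indexes at which the cumulative total went down."""
    changes = daily_changes(cumulative)
    return [i + 1 for i, change in enumerate(changes) if change < 0]


# Made-up cumulative counts for five consecutive days.
snapshots = [100, 112, 120, 117, 130]
changes = daily_changes(snapshots)
revised_at = downward_revisions(snapshots)
print(changes, revised_at)
```

    A flagged index is a prompt to check the source's revision notes, not proof of an error.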

    Attribution

    This data should be credited to Johns Hopkins University COVID-19 tracking project

  12. Table_1_The Effect of International Travel Arrivals on the New HIV...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Feb 16, 2022
    Cite
    Jing, Wenzhan; Liu, Jue; Liu, Min; Du, Min; Yuan, Jie (2022). Table_1_The Effect of International Travel Arrivals on the New HIV Infections in 15–49 Years Aged Group Among 109 Countries or Territories From 2000 to 2018.DOCX [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000308567
    Explore at:
    Dataset updated
    Feb 16, 2022
    Authors
    Jing, Wenzhan; Liu, Jue; Liu, Min; Du, Min; Yuan, Jie
    Description

    Objective: The prevalent international travel may have an impact on new HIV infections, but related studies were lacking. We aimed to explore the association between international travel arrivals and new HIV infections in 15–49 years aged group from 2000 to 2018, to make tailored implications for HIV prevention.

    Methods: We obtained the data of new HIV infections from the Joint United Nations Programme on HIV/AIDS and international travel arrivals from the World Bank. Correlation analysis was used to explore the relation briefly. Log-linear models were built to analyze the association between international travel arrivals and new HIV infections.

    Results: International travel arrivals were positively correlated with new HIV infections (correlation coefficients: 0.916, p < 0.001). After controlling population density, the median age of the total population (years), socio-demographic index (SDI), travel-related mandatory HIV testing, HIV-related restrictions, and antiretroviral therapy coverage, there were 6.61% (95% CI: 5.73, 7.50; p < 0.001) percentage changes in new HIV infections of 15–49 years aged group associated with a 1 million increase in international travel arrivals.

    Conclusions: Higher international travel arrivals were correlated with new HIV infections in 15–49 years aged group. Therefore, multipronged structural and effective strategies and management should be implemented and strengthened.

  13. Data from: Get the News Out Loudly and Quickly: The Influence of the Media...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 26, 2013
    Cite
    Mummert, Anna; Weiss, Howard (2013). Get the News Out Loudly and Quickly: The Influence of the Media on Limiting Emerging Infectious Disease Outbreaks [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001719274
    Explore at:
    Dataset updated
    Aug 26, 2013
    Authors
    Mummert, Anna; Weiss, Howard
    Description

    During outbreaks of infectious diseases with high morbidity and mortality, individuals closely follow media reports of the outbreak. Many will attempt to minimize contacts with other individuals in order to protect themselves from infection and possibly death. This process is called social distancing. Social distancing strategies include restricting socializing and travel, and using barrier protections. We use modeling to show that for short-term outbreaks, social distancing can have a large influence on reducing outbreak morbidity and mortality. In particular, public health agencies working together with the media can significantly reduce the severity of an outbreak by providing timely accounts of new infections and deaths. Our models show that the most effective strategy to reduce infections is to provide this information as early as possible, though providing it well into the course of the outbreak can still have a significant effect. However, our models for long-term outbreaks indicate that reporting historic infection data can result in more infections than with no reporting at all. We examine three types of media influence and we illustrate the media influence with a simulated outbreak of a generic emerging infectious disease in a small city. Social distancing can never be complete; however, for a spectrum of outbreaks, we show that leaving isolation (stopping applying social distancing measures) for up to 4 hours each day has modest effect on the overall morbidity and mortality.

  14. Influenza in New York 2009-2018

    • kaggle.com
    zip
    Updated Apr 2, 2020
    Cite
    Juan Carlos Galvez (2020). Influenza in New York 2009-2018 [Dataset]. https://www.kaggle.com/titustitus/h1n1-new-york-2009
    Explore at:
    Available download formats: zip (477966 bytes)
    Dataset updated
    Apr 2, 2020
    Authors
    Juan Carlos Galvez
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    In the context of COVID-19, information on similar infections such as influenza can be very valuable to a data scientist. New York is one of the cities most affected by the COVID-19 pandemic, and knowledge of the distribution of previous infections could be relevant for predicting future spread or developing efficient sampling methods.

    Content

    The dataset contains weekly counts of infections (positive tests) in New York counties during the period Oct 2009-Mar 2019. The months covered are Jan, Feb, Mar, Apr, May, Oct, Nov, and Dec. Other county-level variables are included, such as the number of hospital beds, unemployment rate, population, average income, median age, and total yearly expenditure on hospital interventions (see the variable description). All information is based on relevant sources. The dataset combines the datasets listed below:

    1. Weekly infections by county: https://data.world/healthdatany/jr8b-6gh6/workspace/file?filename=influenza-laboratory-confirmed-cases-by-county-beginning-2009-10-season-1.csv
    2. Area of counties: https://www.health.ny.gov/statistics/vital_statistics/2006/table02.htm
    3. Population size: https://catalog.data.gov/dataset/annual-population-estimates-for-new-york-state-and-counties-beginning-1970
    4. Number of adult care facility beds: https://health.data.ny.gov/Health/Adult-Care-Facility-Map/6wkx-ptu4
    5. Age-related data: https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?src=CF
    6. Income data: https://en.wikipedia.org/wiki/List_of_New_York_locations_by_per_capita_income
    7. Labour data: https://labor.ny.gov/stats/lslaus.shtm
    8. Information about hospital beds and services: https://health.data.ny.gov/Health/Health-Facility-Certification-Information/2g9y-7kqm
    9. Health expenditure by illness: https://health.data.ny.gov/Health/Hospital-Inpatient-Cost-Transparency-Beginning-200/7dtz-qxmr

    Inspiration

    Testing has proven to be one of the most relevant tools for fighting the spread of a virus. Statistics provides efficient tools for estimating the total number of infections; in particular, sampling methods may significantly reduce the cost of testing. This dataset is intended as a tool for understanding the distribution of positive tests in the state of New York, in order to design sampling methods that could significantly reduce the estimation error.

  15. Counts of Salmonella infection reported in UNITED STATES OF AMERICA:...

    • tycho.pitt.edu
    • data.niaid.nih.gov
    Updated Apr 1, 2018
    Cite
    Willem G Van Panhuis; Anne L Cross; Donald S Burke (2018). Counts of Salmonella infection reported in UNITED STATES OF AMERICA: 1999-2017 [Dataset]. https://www.tycho.pitt.edu/dataset/US.302231008
    Explore at:
    Dataset updated
    Apr 1, 2018
    Dataset provided by
    Project Tycho, University of Pittsburgh
    Authors
    Willem G Van Panhuis; Anne L Cross; Donald S Burke
    Time period covered
    1999 - 2017
    Area covered
    United States
    Description

    Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.

    Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.

    Depending on the intended use of a dataset, we recommend a few data processing steps before analysis: - Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. - Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
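
    The recommendation to separate cumulative from fixed-interval series could look roughly like the sketch below. Only the attribute name "PartOfCumulativeCountSeries" comes from the description above; its encoding as "1"/"0" strings, the other field names, and the sample rows are assumptions made for illustration.

```python
def split_series(rows):
    """Split rows into (cumulative, fixed_interval) lists by the series flag."""
    cumulative, fixed_interval = [], []
    for row in rows:
        # Assumption: the flag is stored as the string "1" for cumulative rows.
        if row["PartOfCumulativeCountSeries"] == "1":
            cumulative.append(row)
        else:
            fixed_interval.append(row)
    return cumulative, fixed_interval


# Made-up rows; the "start", "end" and "count" keys are hypothetical.
rows = [
    {"start": "2010-01-01", "end": "2010-01-07", "count": 3, "PartOfCumulativeCountSeries": "1"},
    {"start": "2010-01-01", "end": "2010-01-14", "count": 8, "PartOfCumulativeCountSeries": "1"},
    {"start": "2010-01-03", "end": "2010-01-09", "count": 4, "PartOfCumulativeCountSeries": "0"},
]
cumulative_rows, fixed_rows = split_series(rows)
print(len(cumulative_rows), len(fixed_rows))
```

    Keeping the two groups apart avoids summing overlapping cumulative intervals as if they were independent counts.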

  16. Counts of Meningococcal infectious disease reported in UNITED STATES OF...

    • tycho.pitt.edu
    • data.niaid.nih.gov
    Updated Apr 1, 2018
    + more versions
    Cite
    Willem G Van Panhuis; Anne L Cross; Donald S Burke (2018). Counts of Meningococcal infectious disease reported in UNITED STATES OF AMERICA: 1951-2010 [Dataset]. https://www.tycho.pitt.edu/dataset/US.23511006
    Explore at:
    Dataset updated
    Apr 1, 2018
    Dataset provided by
    Project Tycho, University of Pittsburgh
    Authors
    Willem G Van Panhuis; Anne L Cross; Donald S Burke
    Time period covered
    1951 - 2010
    Area covered
    United States
    Description

    Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.

    Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.

    Depending on the intended use of a dataset, we recommend a few data processing steps before analysis: - Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported. - Separate cumulative from non-cumulative time interval series. Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
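
    As a rough illustration of the missing-data step, the sketch below adds the weekly intervals that are absent from a fixed-interval series while leaving reported zeros untouched. It assumes weekly intervals keyed by their start date; the function name and sample values are hypothetical.

```python
from datetime import date, timedelta


def fill_missing_weeks(counts):
    """Given {week_start: count}, return a copy with absent weeks set to None."""
    if not counts:
        return {}
    filled = {}
    current, last = min(counts), max(counts)
    while current <= last:
        # None marks "no count reported"; a reported 0 is kept as 0.
        filled[current] = counts.get(current)
        current += timedelta(days=7)
    return filled


# Made-up weekly counts with one unreported week (2010-01-17).
reported = {date(2010, 1, 3): 4, date(2010, 1, 10): 0, date(2010, 1, 24): 2}
filled = fill_missing_weeks(reported)
print(filled)
```

    Using None rather than 0 for the added intervals preserves the distinction between "nothing reported" and "zero cases reported".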

  17. NNDSS - TABLE 1K. Ehrlichiosis and Anaplasmosis, Anaplasma phagocytophilum...

    • catalog.data.gov
    • data.virginia.gov
    • +5more
    Updated Jul 9, 2025
    + more versions
    Cite
    Centers for Disease Control and Prevention (2025). NNDSS - TABLE 1K. Ehrlichiosis and Anaplasmosis, Anaplasma phagocytophilum infection to Ehrlichia chaffeensis infection [Dataset]. https://catalog.data.gov/dataset/nndss-table-1k-ehrlichiosis-and-anaplasmosis-anaplasma-phagocytophilum-infection-to-ehrlic
    Explore at:
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    NNDSS - TABLE 1K. Ehrlichiosis and Anaplasmosis, Anaplasma phagocytophilum infection to Ehrlichia chaffeensis infection - 2020. In this Table, provisional cases* of notifiable diseases are displayed for United States, U.S. territories, and Non-U.S. residents.

    Note: This table contains provisional cases of national notifiable diseases from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data from the 50 states, New York City, the District of Columbia and the U.S. territories are collated and published weekly on the NNDSS Data and Statistics web page (https://wwwn.cdc.gov/nndss/data-and-statistics.html). Cases reported by state health departments to CDC for weekly publication are provisional because of the time needed to complete case follow-up. Therefore, numbers presented in later weeks may reflect changes made to these counts as additional information becomes available. The national surveillance case definitions used to define a case are available on the NNDSS web site at https://wwwn.cdc.gov/nndss/. Information about the weekly provisional data and guides to interpreting data are available at: https://wwwn.cdc.gov/nndss/infectious-tables.html.

    Footnotes:

    • U: Unavailable. The reporting jurisdiction was unable to send the data to CDC or CDC was unable to process the data.
    • -: No reported cases. The reporting jurisdiction did not submit any cases to CDC.
    • N: Not reportable. The disease or condition was not reportable by law, statute, or regulation in the reporting jurisdiction.
    • NN: Not nationally notifiable. This condition was not designated as being nationally notifiable.
    • NP: Nationally notifiable but not published.
    • NC: Not calculated. There is insufficient data available to support the calculation of this statistic.
    • Cum: Cumulative year-to-date counts.
    • Max: Maximum. Maximum case count during the previous 52 weeks.
    • * Case counts for reporting years 2019 and 2020 are provisional and subject to change.
    • † Previous 52 week maximum and cumulative YTD are determined from periods of time when the condition was reportable in the jurisdiction (i.e., may be less than 52 weeks of data or incomplete YTD data).

    Cases are assigned to the reporting jurisdiction submitting the case to NNDSS, if the case's country of usual residence is the U.S., a U.S. territory, unknown, or null (i.e. country not reported); otherwise, the case is assigned to the 'Non-U.S. Residents' category. Country of usual residence is currently not reported by all jurisdictions or for all conditions. For further information on interpretation of these data, see https://wwwn.cdc.gov/nndss/document/Users_guide_WONDER_tables_cleared_final.pdf.
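
    When reading these tables programmatically, the footnote codes need to be kept apart from numeric counts. The sketch below shows one possible approach; the code meanings are taken from the footnotes above, while the function name, the comma-stripping, and the choice to treat "-" as a flag rather than as zero are assumptions.

```python
NNDSS_FLAGS = {
    "U": "Unavailable",
    "-": "No reported cases",
    "N": "Not reportable",
    "NN": "Not nationally notifiable",
    "NP": "Nationally notifiable but not published",
    "NC": "Not calculated",
}


def parse_cell(raw):
    """Return (count, flag_meaning); exactly one of the two is None."""
    text = raw.strip()
    if text in NNDSS_FLAGS:
        return None, NNDSS_FLAGS[text]
    # Assumption: counts may carry thousands separators such as "1,204".
    return int(text.replace(",", "")), None


parsed = [parse_cell(cell) for cell in ["1,204", "NN", " - "]]
print(parsed)
```

    Whether "-" should be converted to 0 depends on the analysis; the sketch leaves that decision to the caller.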

  18. New York State Statewide COVID-19 Testing

    • health.data.ny.gov
    csv, xlsx, xml
    Updated Dec 2, 2025
    + more versions
    Cite
    New York State Department of Health (2025). New York State Statewide COVID-19 Testing [Dataset]. https://health.data.ny.gov/Health/New-York-State-Statewide-COVID-19-Testing/jvfi-ffup
    Explore at:
    Available download formats: xlsx, csv, xml
    Dataset updated
    Dec 2, 2025
    Dataset authored and provided by
    New York State Department of Health
    Area covered
    New York
    Description

    This dataset includes information on the number of positive tests of individuals for COVID-19 infection performed in New York State beginning March 1, 2020, when the first case of COVID-19 was identified in the state. The primary goal of publishing this dataset is to provide users timely information about local disease spread and reporting of positive cases. The data will be updated daily, reflecting tests reported by 12:00 am (midnight) three days prior. Data are published on a three-day lag in order to allow all test results to be reported.

    Reporting of SARS-CoV-2 laboratory testing results is mandated under Part 2 of the New York State Sanitary Code. Clinical laboratories, as defined in Public Health Law (PHL) § 571, electronically report test results to the New York State Department of Health (DOH) via the Electronic Clinical Laboratory Reporting System (ECLRS). The DOH Division of Epidemiology’s Bureau of Surveillance and Data System (BSDS) monitors ECLRS reporting and ensures that all results are accurate.

    Test counts are based on specimen collection date. A person may have multiple specimens tested on one day; these are counted one time. That is, if two specimens are collected from an individual at the same time and then evaluated, the outcome of the evaluation of those two samples to diagnose the individual is counted as a single test of one person, even though the specimens may be tested separately. All positive test results that are at least 90 days apart are counted as cases/new positives.
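
    The 90-day rule can be sketched for a single person as follows. This is an interpretation, not the Department of Health's implementation: it assumes the gap is measured from the most recent positive that was itself counted, and the function name and dates are invented.

```python
from datetime import date


def count_new_positives(collection_dates, min_gap_days=90):
    """Return the positive-test dates that would count as separate cases."""
    counted = []
    for current in sorted(collection_dates):
        # Assumption: compare against the last counted positive, not the last test.
        if not counted or (current - counted[-1]).days >= min_gap_days:
            counted.append(current)
    return counted


# Made-up positive specimen collection dates for one person.
positives = [date(2021, 1, 1), date(2021, 1, 20), date(2021, 4, 1), date(2021, 5, 1)]
cases = count_new_positives(positives)
print(cases)
```

    Measuring the gap from the previous positive of any kind would give different results for people with long runs of positive tests.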

    New positive test counts are assigned to a county based on this order of preference: 1) the patient’s address, 2) the ordering healthcare provider/campus address, or 3) the ordering facility/campus address.
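
    The order of preference above amounts to taking the first address that is available. A minimal sketch, with hypothetical argument names and county values:

```python
def assign_county(patient=None, provider=None, facility=None):
    """Pick a county using patient, then provider, then facility address."""
    for candidate in (patient, provider, facility):
        if candidate:  # skips None and empty strings
            return candidate
    return None


# The patient's county is missing, so the provider's county is used.
chosen = assign_county(patient=None, provider="Albany", facility="Erie")
print(chosen)
```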

    Archived versions of the reinfections dataset are also available: First infections - https://health.data.ny.gov/d/xdss-u53e Reinfections - https://health.data.ny.gov/d/7aaj-cdtu

  19. WHO COVID-19 cases - dataset

    • kaggle.com
    zip
    Updated Nov 19, 2024
    Cite
    Iron Wolf (2024). WHO COVID-19 cases - dataset [Dataset]. https://www.kaggle.com/datasets/ironwolf437/who-covid-19-cases-dataset/code
    Explore at:
    Available download formats: zip (597020 bytes)
    Dataset updated
    Nov 19, 2024
    Authors
    Iron Wolf
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Overview:

    • This dataset includes the number of coronavirus cases and deaths, and the time of recording each case by country.
    • I have added a new column, continent, to provide more information about the number of infections in each continent.

    Objective:

    The objective of this dataset is to:

    1. Perform a data analysis to find out the number of cases and deaths in each country and continent, and compare the times when the number of cases or deaths was available.
    2. Create a machine learning model (if possible) to train the model to predict deaths based on the cases and deaths.

    • The dataset is allowed to be modified during the analysis.

    Details of the columns:

    |   | Columns | Description | Type |
    |---|---------|-------------|------|
    | 1 | Date_reported | Date of recording the number of cases, whether infected or dead. | Categorical |
    | 2 | Country_code | A standardized code (ISO-3166) for the country (e.g., US for the United States). | Categorical |
    | 3 | Country | The full name of the country where the data is collected. | Categorical |
    | 4 | Continent | The continent on which the country is located (e.g., Asia, Europe). | Categorical |
    | 5 | WHO_region | The specific WHO region that the country belongs to (e.g., AFRO, EMRO, EURO). | Categorical |
    | 6 | New_cases | The number of newly reported COVID-19 cases on the reporting date. | Numerical (float) |
    | 7 | Cumulative_cases | The total number of COVID-19 cases reported in the country to date. | Numerical (int) |
    | 8 | New_deaths | The number of newly reported deaths due to COVID-19 on the reporting date. | Numerical (float) |
    | 9 | Cumulative_deaths | The total number of deaths due to COVID-19 reported in the country to date. | Numerical (int) |
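
    The column list above maps naturally onto a typed record. The sketch below is one possible representation: the column names and int/float types follow the table, while the class name, the handling of empty cells as missing values, and the sample row are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WhoRecord:
    date_reported: str  # kept as text; the date format is not stated
    country_code: str
    country: str
    continent: str
    who_region: str
    new_cases: Optional[float]
    cumulative_cases: int
    new_deaths: Optional[float]
    cumulative_deaths: int


def optional_float(text: str) -> Optional[float]:
    """Assumption: an empty cell means no value was reported."""
    text = text.strip()
    return float(text) if text else None


def parse_row(row: dict) -> WhoRecord:
    """Convert one CSV row (column name -> text) into a typed record."""
    return WhoRecord(
        date_reported=row["Date_reported"],
        country_code=row["Country_code"],
        country=row["Country"],
        continent=row["Continent"],
        who_region=row["WHO_region"],
        new_cases=optional_float(row["New_cases"]),
        cumulative_cases=int(row["Cumulative_cases"]),
        new_deaths=optional_float(row["New_deaths"]),
        cumulative_deaths=int(row["Cumulative_deaths"]),
    )


# Made-up sample row, for illustration only.
sample = {
    "Date_reported": "2020-03-01", "Country_code": "XX", "Country": "Exampleland",
    "Continent": "Europe", "WHO_region": "EURO", "New_cases": "",
    "Cumulative_cases": "12", "New_deaths": "1.0", "Cumulative_deaths": "3",
}
record = parse_row(sample)
print(record)
```

    Rows read with `csv.DictReader` from the downloaded file could be passed to `parse_row` directly, provided the file's headers match the column names listed above.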

  20. NNDSS - Table I. infrequently reported notifiable diseases

    • catalog.data.gov
    • data.virginia.gov
    • +7more
    Updated Jun 28, 2025
    Cite
    Centers for Disease Control and Prevention (2025). NNDSS - Table I. infrequently reported notifiable diseases [Dataset]. https://catalog.data.gov/dataset/nndss-table-i-infrequently-reported-notifiable-diseases
    Explore at:
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    NNDSS - Table I. infrequently reported notifiable diseases - 2018. This table displays provisional cases of selected infrequently reported notifiable diseases (<1,000 cases reported during the preceding year). This table excludes U.S. territories.

    Notice: The case counts for Haemophilus influenzae, invasive disease "Nontypeable" and "Non-b serotype" were switched for 2018 weeks 1-52.

    Note: These are provisional cases of selected national notifiable diseases from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data from the 50 states, New York City, and the District of Columbia are collated and published weekly on the NNDSS Data and Statistics web page (https://wwwn.cdc.gov/nndss/data-and-statistics.html). Cases reported by state health departments to CDC for weekly publication are provisional because of the time needed to complete case follow-up. Therefore, numbers presented in later weeks may reflect changes made to these counts as additional information becomes available. The national surveillance case definitions used to define a case are available on the NNDSS web site at https://wwwn.cdc.gov/nndss/. Information about the weekly provisional data and guides to interpreting data are available at: https://wwwn.cdc.gov/nndss/infectious-tables.html.

    Footnotes:
    —: No reported cases. N: Not reportable. NA: Not available. NN: Not Nationally Notifiable. NP: Nationally notifiable but not published. Cum: Cumulative year-to-date counts.
    Case counts for reporting years 2017 and 2018 are provisional and subject to change. Data for years 2013 through 2016 are finalized. For further information on interpretation of these data, see http://wwwn.cdc.gov/nndss/document/ProvisionalNationaNotifiableDiseasesSurveillanceData20100927.pdf.
    † This table does not include cases from the U.S. territories.
    § Calculated by summing the incidence counts for the current week, the 2 weeks preceding the current week, and the 2 weeks following the current week, for a total of 5 preceding years. Additional information is available at https://wwwn.cdc.gov/nndss/document/5yearweeklyaverage.pdf.
    ¶ Not reportable in all jurisdictions. Data from states where the condition is not reportable are excluded from this table, except for the arboviral diseases and influenza-associated pediatric mortality. Reporting exceptions are available at http://wwwn.cdc.gov/nndss/downloads.html.
    ** Please refer to CDC WONDER for weekly updates to the footnote for this condition.
    †† Please refer to CDC WONDER for weekly updates to the footnote for this condition.
    §§ Novel influenza A virus infections are human infections with influenza A viruses that are different from currently circulating human seasonal influenza viruses. With the exception of one avian lineage influenza A (H7N2) virus, all novel influenza A virus infections reported to CDC since 2013 have been variant influenza viruses.
    ¶¶ Prior to 2018, cases of paratyphoid fever were included with salmonellosis cases (see Table II).
    *** Prior to 2015, CDC's National Notifiable Diseases Surveillance System (NNDSS) did not receive electronic data about incident cases of specific viral hemorrhagic fevers; instead, data were collected in aggregate as "viral hemorrhagic fevers". NNDSS was updated beginning in 2015 to receive data for each of the viral hemorrhagic fevers listed.
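
The § footnote describes a windowed sum: for each of the 5 preceding years, add the counts for the current week number, the 2 weeks before it, and the 2 weeks after it. The following is a minimal Python sketch of that summation, not CDC's implementation. The function names, the `(year, week)` dictionary layout, and the fixed 52-week year are all assumptions made for illustration, and dividing the sum by the number of contributing weeks to obtain an average is likewise an assumption; the linked 5yearweeklyaverage.pdf is the authoritative description.

```python
from typing import Dict, Optional, Tuple

# Assumption: every surveillance year has 52 weeks (53-week years are ignored).
WEEKS_PER_YEAR = 52

# Hypothetical layout: counts[(year, week)] -> weekly case count.
Counts = Dict[Tuple[int, int], int]


def five_year_window_sum(counts: Counts, year: int, week: int,
                         n_years: int = 5, half_window: int = 2) -> Tuple[int, int]:
    """Sum counts for week-2..week+2 in each of the n_years preceding `year`.

    Returns (total, n_terms), where n_terms is how many (year, week) entries
    were actually present in `counts`.
    """
    total = 0
    n_terms = 0
    for y in range(year - n_years, year):
        for offset in range(-half_window, half_window + 1):
            # Zero-based week index; may fall before week 1 or after week 52.
            idx = (week - 1) + offset
            # Roll over into the adjacent year when the window crosses a boundary.
            yy = y + idx // WEEKS_PER_YEAR
            ww = idx % WEEKS_PER_YEAR + 1
            if (yy, ww) in counts:
                total += counts[(yy, ww)]
                n_terms += 1
    return total, n_terms


def five_year_weekly_average(counts: Counts, year: int, week: int) -> Optional[float]:
    """Assumed averaging step: window sum divided by the number of weeks found."""
    total, n_terms = five_year_window_sum(counts, year, week)
    return total / n_terms if n_terms else None


if __name__ == "__main__":
    # Synthetic data only: one case per week for 2013 through 2017.
    demo = {(y, w): 1 for y in range(2013, 2018) for w in range(1, 53)}
    print(five_year_window_sum(demo, 2018, 10))
    print(five_year_weekly_average(demo, 2018, 10))
```

Missing weeks are skipped rather than treated as zero, so a window that reaches outside the available data contributes fewer than 25 terms; whether that matches the published figures is not stated in the footnote.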
