These data contain case counts and rates for selected communicable diseases (listed in the data dictionary) that met the surveillance case definition for that disease and were reported for California residents, by disease, county, year, and sex. The data represent cases with an estimated illness onset date from 2001 through the last year indicated from California Confidential Morbidity Reports and/or Laboratory Reports. Data captured represent reportable case counts as of the date indicated in the “Temporal Coverage” section below, so the data presented may differ from previous publications due to delays inherent to case reporting, laboratory reporting, and epidemiologic investigation.
The following slide set is available to download for presentational use:
Data on all HIV diagnoses, AIDS and deaths among people diagnosed with HIV are collected from HIV outpatient clinics, laboratories and other healthcare settings. Data relating to people living with HIV are collected from HIV outpatient clinics. Data relate to England, Wales, Northern Ireland and Scotland, unless stated.
HIV testing, pre-exposure prophylaxis, and post-exposure prophylaxis data relate to activity at sexual health services in England only.
View the pre-release access lists for these statistics.
Previous reports, data tables and slide sets are also available for:
Our statistical practice is regulated by the Office for Statistics Regulation (OSR). The OSR sets the standards of trustworthiness, quality and value in the Code of Practice for Statistics (https://code.statisticsauthority.gov.uk/) that all producers of Official Statistics should adhere to.
Additional information on HIV surveillance can be found in the HIV Action Plan for England monitoring and evaluation framework reports. Other HIV in the UK reports published by Public Health England (PHE) are available online.
License: https://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
By Health [source]
This dataset provides an overview of Healthcare-Associated Infections (HAIs) across all US states. HAIs span multiple areas, such as central line and urinary catheter infections, and can be devastating to patient care and outcomes. The data within this set is collected by the Centers for Disease Control and Prevention (CDC) through the National Healthcare Safety Network (NHSN). It contains information on infection rates along with other important data such as measure names, scores, footnotes and measure start and end dates. This dataset presents an opportunity to better understand the prevalence of HAIs at the state level in order to improve the patient safety measures used in hospitals nationwide.
This dataset provides valuable insights into healthcare-associated infections (HAIs) across the United States. By understanding which states have higher rates of HAIs, we can better understand the overall state of health care in the country.
To get started using this data set, take a look at some of the columns included: Measure Name, Score, Footnote and Measure Start and End Date. The measure name will give you an overview of what type of infection is being measured in each state. Each measurement has an associated score that tells you how well each state is doing with respect to other states when it comes to preventing these infections from occurring. The footnote gives more information about specific details surrounding that particular HAI measure for that state (for instance, if all or only some hospitals are included in the measure). Finally, the start and end dates tell you when the measure began and ended in regards to data collection for each state.
Once you have explored some of these columns, look more closely at the data points each column contains, for example which states have a high number of infections related to surgical procedures compared with others. Examining the data this way can reveal trends among states and how they compare in the quality of care provided within their facilities.
By exploring these trends further with visuals such as charts or graphs, you can better determine which areas need improvement, so that preventative measures can be developed against further incidents in hospitals across all US states.
- Creating a state-by-state map of healthcare-associated infection rates in order to identify which states have the highest and lowest rates of HAIs.
- Developing a predictive model to determine the likelihood of an infection in a particular hospital based on data from all the other hospitals in the same state, allowing hospitals to adjust their safety protocols accordingly.
- Constructing an infographic displaying different points picked up within this dataset such as what are common sources of infection, breakdowns by states and types, etc
If you use this dataset in your research, please credit the original authors. Data Source
License: Open Database License (ODbL) v1.0
- You are free to:
  - Share - copy and redistribute the material in any medium or format.
  - Adapt - remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit - Provide a link to the license, and indicate if changes were made.
  - ShareAlike - You must distribute your contributions under the same license as the original.
  - Keep intact - all notices that refer to this license, including copyright notices.
  - No additional restrictions - You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
File: Healthcare_Associated_Infections_-_State.csv
| Column name | Description |
|:------------|:------------|
...
NNDSS - Table II. Chlamydia trachomatis infection to Coccidioidomycosis - 2018. In this Table, provisional cases of selected notifiable diseases (≥1,000 cases reported during the preceding year) and selected low frequency diseases are displayed. The Table includes the total number of cases reported in the United States, by region and by state or territory.
Note:
This table contains provisional cases of selected national notifiable diseases from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data from the 50 states, New York City, the District of Columbia and the U.S. territories are collated and published weekly on the NNDSS Data and Statistics web page (https://wwwn.cdc.gov/nndss/data-and-statistics.html). Cases reported by state health departments to CDC for weekly publication are provisional because of the time needed to complete case follow-up. Therefore, numbers presented in later weeks may reflect changes made to these counts as additional information becomes available. The national surveillance case definitions used to define a case are available on the NNDSS web site at https://wwwn.cdc.gov/nndss/. Information about the weekly provisional data and guides to interpreting data are available at: https://wwwn.cdc.gov/nndss/infectious-tables.html
Footnotes:
C.N.M.I.: Commonwealth of Northern Mariana Islands. U: Unavailable. —: No reported cases. N: Not reportable. NA: Not Available. NN: Not Nationally Notifiable. NP: Nationally notifiable but not published. Cum: Cumulative year-to-date counts. Med: Median. Max: Maximum.
By Health [source]
This dataset provides comprehensive information on the number and rate of infectious diseases in California. Focusing on counties, sexes, and various diseases between 2001-2014, it offers powerful insights into the health status of its citizens. Its data also reveals trends in the spread of common illnesses in this state. Whether you are an epidemiologist looking to inform public health policy or a researcher seeking to investigate particular illnesses within certain populations, this dataset contains all the necessary information to answer your questions. Explore it today and discover hidden stories waiting to be uncovered!
This dataset contains counts and rates of infectious diseases in California by county, disease, sex, and year. This dataset can be used to generate trends to understand the changes in incidence of different types of diseases over time and across counties or between sexes.
To use this dataset:
- Select the columns you are interested in exploring. These could include Disease, County, Sex or Year.
- Filter out the rows that do not relate to your question, for example by filtering on a specific county or disease.
- Examine the average rate per 100,000 people for each group you selected, as well as its lower and upper confidence intervals (CI).
- Use Rate as your dependent variable for analysis; Population is likely also an important determining factor. Make sure to check whether any Rates have 'unstable' flags.
- Visualise or statistically analyse your data using suitable methods, such as descriptive statistics (means, medians, modes, etc.) for comparison between two or more groups, or correlation/regression-based models when comparing one variable to another over time.
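The steps above can be sketched with the Python standard library. This is only an illustration: the sample rows are invented, and the encoding of the Unstable flag (a `*` for unstable, empty otherwise) is an assumption, since the file's actual flag values are not shown here.

```python
import csv
import io

# Hypothetical sample rows. Column names follow the data dictionary;
# the values and the "*" encoding of the Unstable flag are assumptions.
SAMPLE = """Disease,County,Year,Sex,Population,Rate,CI.lower,CI.upper,Unstable
Salmonellosis,Alameda,2013,Female,790000,12.5,10.2,15.3,
Salmonellosis,Alameda,2013,Male,760000,11.1,8.9,13.7,
Salmonellosis,Alpine,2013,Female,550,181.8,4.6,1013.0,*
Pertussis,Alameda,2013,Female,790000,9.4,7.4,11.8,
"""

def stable_rates(csv_text, disease, year):
    """Return (county, sex, rate, ci_lower, ci_upper) tuples for rows that
    match the disease and year, skipping rows flagged as unstable."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Disease"] != disease or int(row["Year"]) != year:
            continue
        if row["Unstable"].strip():  # any non-empty flag is treated as unstable
            continue
        rows.append((row["County"], row["Sex"], float(row["Rate"]),
                     float(row["CI.lower"]), float(row["CI.upper"])))
    return rows

result = stable_rates(SAMPLE, "Salmonellosis", 2013)
for county, sex, rate, lower, upper in result:
    print(f"{county} {sex}: {rate} per 100,000 (CI {lower}-{upper})")
```

Dropping unstable rows before comparing groups is one possible choice; depending on the question, it may be preferable to keep them and report the wide confidence interval instead.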
- Analyzing the geographic spread of infectious diseases over time to identify areas in need of increased education, resources, and care.
- Comparing rates of disease by sex to identify and understand any gender-based differences in infectious disease cases.
- Using the Unstable column to determine whether a particular county or region needs further study of a certain type of infectious disease due to unusual spikes or drops in rate or count during a specific year
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors
- You are free to:
  - Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt - remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit - Provide a link to the license, and indicate if changes were made.
  - ShareAlike - You must distribute your contributions under the same license as the original.
  - Keep intact - all notices that refer to this license, including copyright notices.
File: Infectious_Disease_Cases_by_County_Year_and_Sex_2001-2014.csv
| Column name | Description |
|:------------|:------------|
| Disease | The type of infectious disease reported. (String) |
| County | The county in California where the cases were reported. (String) |
| Year | The year in which the cases were reported. (Integer) |
| Sex | The gender of the individuals who contracted the disease. (String) |
| Population | The population size of the county in which the cases were reported. (Integer) |
| Rate | The rate of infection per 100 thousand people living in the county. (Float) |
| CI.lower | The lower confidence interval associated with the rate of infection. (Float) |
| CI.upper | The upper confidence interval associated with the rate of infection. (Float) |
...
Note: This dataset is no longer being updated due to the end of the COVID-19 Public Health Emergency.
The California Department of Public Health (CDPH) is identifying vaccination status of COVID-19 cases, hospitalizations, and deaths by analyzing the state immunization registry and registry of confirmed COVID-19 cases. Post-vaccination cases are individuals who have a positive SARS-CoV-2 molecular test (e.g. PCR) at least 14 days after they have completed their primary vaccination series.
Tracking cases of COVID-19 that occur after vaccination is important for monitoring the impact of immunization campaigns. While COVID-19 vaccines are safe and effective, some cases are still expected in persons who have been vaccinated, as no vaccine is 100% effective. For more information, please see https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/COVID-19/Post-Vaccine-COVID19-Cases.aspx
Post-vaccination infection data is updated monthly and includes data on cases, hospitalizations, and deaths among the unvaccinated and the vaccinated. Partially vaccinated individuals are excluded. To account for reporting and processing delays, there is at least a one-month lag in provided data (for example data published on 9/9/22 will include data through 7/31/22).
Notes:
On September 9, 2022, the post-vaccination data was changed to compare unvaccinated persons with those who have completed at least a primary series, for persons age 5+. These data will be updated monthly (first Thursday of the month) and include at least a one-month lag.
On February 2, 2022, the post-vaccination data was changed to distinguish between vaccination with a primary series only versus vaccinated and boosted. The previous dataset has been uploaded as an archived table. Additionally, the lag on this data was extended to 14 days.
On November 29, 2021, the denominator for calculating vaccine coverage was changed from age 16+ to age 12+ to reflect new vaccine eligibility criteria. The previous dataset based on age 16+ denominators has been uploaded as an archived table.
NNDSS - TABLE 1Q. Hepatitis B, perinatal infection to Hepatitis C, acute, Probable - 2022. In this Table, provisional cases* of notifiable diseases are displayed for United States, U.S. territories, and Non-U.S. residents.
Notes:
- These are weekly cases of selected infectious national notifiable diseases, from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data reported by the 50 states, New York City, the District of Columbia, and the U.S. territories are collated and published weekly as numbered tables available at https://www.cdc.gov/nndss/data-statistics/index.html. Cases reported by state health departments to CDC for weekly publication are subject to ongoing revision of information and delayed reporting. Therefore, numbers listed in later weeks may reflect changes made to these counts as additional information becomes available. Case counts in the tables are presented as published each week. See also Guide to Interpreting Provisional and Finalized NNDSS Data at https://www.cdc.gov/nndss/docs/Readers-Guide-WONDER-Tables-20210421-508.pdf.
- Notices, errata, and other notes are available in the Notice To Data Users page at https://wonder.cdc.gov/nndss/NTR.html.
- The list of national notifiable infectious diseases and conditions and their national surveillance case definitions are available at https://ndc.services.cdc.gov/. This list incorporates the Council of State and Territorial Epidemiologists (CSTE) position statements approved by CSTE for national surveillance.
Footnotes:
*Case counts for reporting years 2021 and 2022 are provisional and subject to change. Cases are assigned to the reporting jurisdiction submitting the case to NNDSS, if the case's country of usual residence is the U.S., a U.S. territory, unknown, or null (i.e. country not reported); otherwise, the case is assigned to the 'Non-U.S. Residents' category. Country of usual residence is currently not reported by all jurisdictions or for all conditions. For further information on interpretation of these data, see https://www.cdc.gov/nndss/docs/Readers-Guide-WONDER-Tables-20210421-508.pdf.
†Previous 52 week maximum and cumulative YTD are determined from periods of time when the condition was reportable in the jurisdiction (i.e., may be less than 52 weeks of data or incomplete YTD data).
U: Unavailable. The reporting jurisdiction was unable to send the data to CDC or CDC was unable to process the data.
-: No reported cases. The reporting jurisdiction did not submit any cases to CDC.
N: Not reportable. The disease or condition was not reportable by law, statute, or regulation in the reporting jurisdiction.
NN: Not nationally notifiable. This condition was not designated as being nationally notifiable.
NP: Nationally notifiable but not published.
NC: Not calculated. There is insufficient data available to support the calculation of this statistic.
Cum: Cumulative year-to-date counts.
Max: Maximum. Maximum case count during the previous 52 weeks.
This study aimed to construct a machine learning model to predict new HIV infections in HIV-negative men who have sex with men (MSM). This is a secondary analysis of a previous randomized clinical trial that evaluated the preventive effects of PrEP on new HIV infection in MSM. During 2013–2015, 1455 HIV-negative MSM were enrolled. Participants were divided into a treatment group and a control group and regularly followed up until they seroconverted to HIV positive or until the 2-year endpoint was reached. Five machine-learning approaches were applied to predict the risk of HIV infection. Model performance was evaluated using Harrell’s C-index and the area under the receiver operating characteristic curve (AUC) and validated in an external validation cohort. To explain the model, Shapley additive explanation (SHAP) values were calculated and visualized. During the observation period, 102 MSM developed HIV infection. Thirteen parameters were selected to construct the model. The random survival forest model showed the best performance in the validation cohort, with a C-index of 0.7013, and could significantly categorize MSM into three groups. Our model indicated that MSM with younger age, receptive anal intercourse, and multiple male sexual partners had an increased risk of HIV infection, and those with higher AIDS knowledge scores had a lower risk. We presented a machine learning-based model to predict the risk of HIV infection in MSM. This model could be applied to recognize MSM who are at a higher risk of developing HIV infection.
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY
Medical provider confirmed COVID-19 cases and confirmed COVID-19 related deaths in San Francisco, CA aggregated by several different geographic areas and normalized by 2016-2020 American Community Survey (ACS) 5-year estimates for population data to calculate rate per 10,000 residents.
On September 12, 2021, a new case definition of COVID-19 was introduced that includes criteria for enumerating new infections after previous probable or confirmed infections (also known as reinfections). A reinfection is defined as a confirmed positive PCR lab test more than 90 days after a positive PCR or antigen test. The first reinfection case was identified on December 7, 2021.
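The 90-day rule in the reinfection definition can be expressed as a small date check. This is a minimal sketch; the function and constant names are hypothetical and are not taken from SFDPH's own scripts.

```python
from datetime import date

REINFECTION_GAP_DAYS = 90  # threshold taken from the case definition above

def is_reinfection(previous_positive: date, new_positive_pcr: date) -> bool:
    """True if the new confirmed PCR positive falls more than 90 days
    after the earlier positive PCR or antigen test."""
    return (new_positive_pcr - previous_positive).days > REINFECTION_GAP_DAYS

print(is_reinfection(date(2021, 9, 1), date(2021, 12, 7)))
print(is_reinfection(date(2021, 9, 1), date(2021, 11, 30)))
```

The comparison is strict, so a test exactly 90 days after the earlier positive is not counted, following the "more than 90 days" wording.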
Cases and deaths are both mapped to the residence of the individual, not to where they were infected or died. For example, if someone was infected at work in San Francisco but lives in the East Bay, that case is not counted as an SF case. Likewise, if someone dies in Zuckerberg San Francisco General but is from another county, that death is not counted in this dataset.
Dataset is cumulative and covers cases going back to 3/2/2020 when testing began.
Geographic areas summarized are:
1. Analysis Neighborhoods
2. Census Tracts
3. Census Zip Code Tabulation Areas
B. HOW THE DATASET IS CREATED
Addresses from medical data are geocoded by the San Francisco Department of Public Health (SFDPH). Those addresses are spatially joined to the geographic areas. Counts are generated based on the number of address points that match each geographic area. The 2016-2020 American Community Survey (ACS) population estimates provided by the Census are used to create a rate equal to ([count] / [acs_population]) * 10000, representing the number of cases per 10,000 residents.
C. UPDATE PROCESS
Geographic analysis is scripted by SFDPH staff and synced to this dataset daily at 7:30 Pacific Time.
D. HOW TO USE THIS DATASET
San Francisco population estimates for geographic regions can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).
Privacy rules in effect
To protect privacy, certain rules are in effect:
1. Case counts greater than 0 and less than 10 are dropped; these will be null (blank) values.
2. Death counts greater than 0 and less than 10 are dropped; these will be null (blank) values.
3. Cases and deaths are dropped altogether for areas where acs_population < 1000.
Rate suppression in effect where counts are lower than 20
Rates are not calculated unless the case count is greater than or equal to 20. Rates are generally unstable at small numbers, so we avoid calculating them directly. We advise you to apply the same approach as this is best practice in epidemiology.
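The rate formula and the suppression rules can be combined in one small function. The sketch below uses a hypothetical function name and applies the rules in the order they are listed; the department's actual script is not shown in this description and may differ.

```python
from typing import Optional, Tuple

def published_values(count: int, acs_population: int) -> Tuple[Optional[int], Optional[float]]:
    """Return (count, rate per 10,000 residents) as they would be published.
    None stands for a suppressed (null/blank) value."""
    if acs_population < 1000:
        return None, None   # area too small: dropped altogether
    if 0 < count < 10:
        return None, None   # small non-zero counts are dropped
    if count < 20:
        return count, None  # count shown, rate too unstable to calculate
    return count, count * 10000 / acs_population

print(published_values(250, 50000))
print(published_values(15, 50000))
print(published_values(5, 50000))
```

A count of zero passes through as 0 with no rate, since the privacy rule only drops counts strictly between 0 and 10.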
A note on Census ZIP Code Tabulation Areas (ZCTAs)
ZIP Code Tabulation Areas are special boundaries created by the U.S. Census based on ZIP Codes developed by the USPS. They are not, however, the same thing. ZCTAs are areal representations of routes. Read how the Census develops ZCTAs on their website.
Row included for Citywide case counts, incidence rate, and deaths
A single row is included that has the Citywide case counts and incidence rate. This can be used for comparisons. Citywide will capture all cases regardless of address quality. While some cases cannot be mapped to sub-areas like Census Tracts, ongoing data quality efforts result in improved mapping on a rolling basis.
E. CHANGE LOG
Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.
April 9, 2020
April 20, 2020
April 29, 2020
September 1st, 2020
February 12, 2021
new_deaths column.
February 16, 2021
The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.
The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.
The AP is updating this dataset hourly at 45 minutes past the hour.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic
Filter cases by state here
Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac
Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases by capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true
Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.
Pull the 100 counties with the highest per-capita confirmed cases here
Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
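The hotspot queries above rely on a 7-day rolling average of new cases per capita. The sketch below shows one way to compute that figure from cumulative counts. It is not AP's query: the function name is hypothetical, and scaling to 100,000 people is an assumption chosen to match the rate convention described earlier.

```python
def rolling_new_cases_per_100k(cumulative, population, window=7):
    """cumulative: daily cumulative case counts, oldest first.
    Returns one value per full window: the average number of new cases
    per day over the window, scaled to 100,000 residents."""
    # Day-over-day differences turn cumulative counts into new cases.
    daily_new = [b - a for a, b in zip(cumulative, cumulative[1:])]
    averages = []
    for end in range(window, len(daily_new) + 1):
        window_sum = sum(daily_new[end - window:end])
        averages.append(window_sum / window * 100_000 / population)
    return averages

counts = [0, 7, 14, 21, 28, 35, 42, 49]  # hypothetical cumulative counts
print(rolling_new_cases_per_100k(counts, population=100_000))
print(rolling_new_cases_per_100k(counts, population=1_000))
```

The second call uses the same case counts with a population of 1,000 and produces a rate 100 times higher, which is why very small counties can rise to the top of a per-capita ranking.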
The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.
Johns Hopkins timeseries data
- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
This data should be credited to Johns Hopkins University COVID-19 tracking project
Objective: Widespread international travel may have an impact on new HIV infections, but related studies are lacking. We aimed to explore the association between international travel arrivals and new HIV infections in the 15–49 age group from 2000 to 2018, to inform tailored HIV prevention.
Methods: We obtained data on new HIV infections from the Joint United Nations Programme on HIV/AIDS and on international travel arrivals from the World Bank. Correlation analysis was used for an initial exploration of the relationship. Log-linear models were built to analyze the association between international travel arrivals and new HIV infections.
Results: International travel arrivals were positively correlated with new HIV infections (correlation coefficient: 0.916, p < 0.001). After controlling for population density, the median age of the total population (years), socio-demographic index (SDI), travel-related mandatory HIV testing, HIV-related restrictions, and antiretroviral therapy coverage, a 1 million increase in international travel arrivals was associated with a 6.61% (95% CI: 5.73, 7.50; p < 0.001) change in new HIV infections in the 15–49 age group.
Conclusions: Higher international travel arrivals were correlated with new HIV infections in the 15–49 age group. Therefore, multipronged structural and effective strategies and management should be implemented and strengthened.
During outbreaks of infectious diseases with high morbidity and mortality, individuals closely follow media reports of the outbreak. Many will attempt to minimize contacts with other individuals in order to protect themselves from infection and possibly death. This process is called social distancing. Social distancing strategies include restricting socializing and travel, and using barrier protections. We use modeling to show that for short-term outbreaks, social distancing can have a large influence on reducing outbreak morbidity and mortality. In particular, public health agencies working together with the media can significantly reduce the severity of an outbreak by providing timely accounts of new infections and deaths. Our models show that the most effective strategy to reduce infections is to provide this information as early as possible, though providing it well into the course of the outbreak can still have a significant effect. However, our models for long-term outbreaks indicate that reporting historic infection data can result in more infections than with no reporting at all. We examine three types of media influence and we illustrate the media influence with a simulated outbreak of a generic emerging infectious disease in a small city. Social distancing can never be complete; however, for a spectrum of outbreaks, we show that leaving isolation (stopping applying social distancing measures) for up to 4 hours each day has a modest effect on the overall morbidity and mortality.
License: https://creativecommons.org/publicdomain/zero/1.0/
In the Context of COVID-19 information of similar infections like influenza can be very valuable to a data scientist. New York is one of the most affected cities in the COVID-19 pandemia and the knowledge of the distribution of previous infections could be relevant in order to predict future spreadings or develop efficient sampling methods.
The dataset contains weekly information of infections (positive test) in New York Counties during the period Oct 2009-Mar 2019. The months studied are Jan, Feb, Mar, Apr, May, Oct, Nov, Dec. There are included other variables by County like the amount of hospital beds, unemployment rate, population, average income, Median age,Total expenditure per Year in hospital interventions...( See variable description). All information is based on relevant sources. The dataset is a combination of different datasets i list below: 1. Weekly of infections by county: https://data.world/healthdatany/jr8b-6gh6/workspace/file?filename=influenza-laboratory-confirmed-cases-by-county-beginning-2009-10-season-1.csv 2. Area of Counties:https://www.health.ny.gov/statistics/vital_statistics/2006/table02.htm 3. Population size: https://catalog.data.gov/dataset/annual-population-estimates-for-new-york-state-and-counties-beginning-1970 4. Number of Adult care facilities beds: https://health.data.ny.gov/Health/Adult-Care-Facility-Map/6wkx-ptu4 5. Age related data: https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?src=CF 6. Income data: https://en.wikipedia.org/wiki/List_of_New_York_locations_by_per_capita_income 7. Labour data: https://labor.ny.gov/stats/lslaus.shtm 8. Information about hospitals beds and services: https://health.data.ny.gov/Health/Health-Facility-Certification-Information/2g9y-7kqm 9. Health expenditure by illness: https://health.data.ny.gov/Health/Hospital-Inpatient-Cost-Transparency-Beginning-200/7dtz-qxmr
Testing has proven to be one of the most relevant tools for fighting the spread of a virus. Statistics provides efficient tools for estimating the total number of infections; in particular, sampling methods can significantly reduce the cost of testing. This dataset is intended as a tool for understanding the distribution of positive tests in the state of New York, in order to design sampling methods that significantly reduce the estimation error.
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
- Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
- Separate cumulative from non-cumulative time interval series: Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
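The two preprocessing steps described above can be sketched in Python. This is a minimal illustration on toy records, not Project Tycho's own tooling. Only the attribute name PartOfCumulativeCountSeries comes from the description; the keys "start", "end" and "count", the 0/1 encoding of the attribute, and the weekly interval length are assumptions made for the example.

```python
from datetime import date, timedelta

# Toy records. Only "PartOfCumulativeCountSeries" is named in the dataset
# description; "start", "end" and "count" are hypothetical key names.
records = [
    # Fixed weekly intervals (the week starting 2020-01-15 was never reported).
    {"start": date(2020, 1, 1), "end": date(2020, 1, 7), "count": 5, "PartOfCumulativeCountSeries": 0},
    {"start": date(2020, 1, 8), "end": date(2020, 1, 14), "count": 0, "PartOfCumulativeCountSeries": 0},
    {"start": date(2020, 1, 22), "end": date(2020, 1, 28), "count": 3, "PartOfCumulativeCountSeries": 0},
    # Cumulative intervals: same start date, different end dates.
    {"start": date(2020, 1, 1), "end": date(2020, 1, 7), "count": 5, "PartOfCumulativeCountSeries": 1},
    {"start": date(2020, 1, 1), "end": date(2020, 1, 14), "count": 5, "PartOfCumulativeCountSeries": 1},
]


def split_series(rows):
    """Separate cumulative rows from fixed-interval rows."""
    cumulative = [r for r in rows if r["PartOfCumulativeCountSeries"] == 1]
    fixed = [r for r in rows if r["PartOfCumulativeCountSeries"] == 0]
    return cumulative, fixed


def fill_missing_weeks(fixed_rows):
    """Return (start, count) for every week between the first and last
    reported week. Unreported weeks get None; reported zeros stay 0."""
    by_start = {r["start"]: r["count"] for r in fixed_rows}
    current, last = min(by_start), max(by_start)
    filled = []
    while current <= last:
        filled.append((current, by_start.get(current)))
        current += timedelta(days=7)
    return filled


cumulative, fixed = split_series(records)
weekly = fill_missing_weeks(fixed)
for start, count in weekly:
    print(start, "not reported" if count is None else count)
```

Keeping None distinct from 0 preserves the difference between an interval with no report and an interval with a reported count of zero.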
NNDSS - TABLE 1K. Ehrlichiosis and Anaplasmosis, Anaplasma phagocytophilum infection to Ehrlichia chaffeensis infection - 2020. In this Table, provisional cases* of notifiable diseases are displayed for United States, U.S. territories, and Non-U.S. residents.
Note: This table contains provisional cases of national notifiable diseases from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data from the 50 states, New York City, the District of Columbia and the U.S. territories are collated and published weekly on the NNDSS Data and Statistics web page (https://wwwn.cdc.gov/nndss/data-and-statistics.html). Cases reported by state health departments to CDC for weekly publication are provisional because of the time needed to complete case follow-up. Therefore, numbers presented in later weeks may reflect changes made to these counts as additional information becomes available. The national surveillance case definitions used to define a case are available on the NNDSS web site at https://wwwn.cdc.gov/nndss/. Information about the weekly provisional data and guides to interpreting data are available at: https://wwwn.cdc.gov/nndss/infectious-tables.html.
Footnotes:
U: Unavailable — The reporting jurisdiction was unable to send the data to CDC or CDC was unable to process the data.
-: No reported cases — The reporting jurisdiction did not submit any cases to CDC.
N: Not reportable — The disease or condition was not reportable by law, statute, or regulation in the reporting jurisdiction.
NN: Not nationally notifiable — This condition was not designated as being nationally notifiable.
NP: Nationally notifiable but not published.
NC: Not calculated — There is insufficient data available to support the calculation of this statistic.
Cum: Cumulative year-to-date counts.
Max: Maximum — Maximum case count during the previous 52 weeks.
* Case counts for reporting years 2019 and 2020 are provisional and subject to change.
Cases are assigned to the reporting jurisdiction submitting the case to NNDSS, if the case's country of usual residence is the U.S., a U.S. territory, unknown, or null (i.e. country not reported); otherwise, the case is assigned to the 'Non-U.S. Residents' category. Country of usual residence is currently not reported by all jurisdictions or for all conditions. For further information on interpretation of these data, see https://wwwn.cdc.gov/nndss/document/Users_guide_WONDER_tables_cleared_final.pdf. †Previous 52 week maximum and cumulative YTD are determined from periods of time when the condition was reportable in the jurisdiction (i.e., may be less than 52 weeks of data or incomplete YTD data).
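When analyzing a table like this, the footnote codes need to be kept apart from numeric counts. The sketch below shows one way to do that in Python. It assumes, for illustration only, that a flag appears in the same cell where a count would otherwise be; the function name parse_cell is hypothetical and the actual file layout may differ.

```python
# Footnote codes and meanings as listed in the table notes.
FLAG_MEANINGS = {
    "U": "Unavailable",
    "-": "No reported cases",
    "N": "Not reportable",
    "NN": "Not nationally notifiable",
    "NP": "Nationally notifiable but not published",
    "NC": "Not calculated",
}


def parse_cell(raw):
    """Return (count, flag) for one table cell.

    Assumption: a cell holds either an integer count or one footnote code.
    Flagged cells yield count=None so they are never mistaken for numbers.
    """
    text = raw.strip()
    if text in FLAG_MEANINGS:
        return None, text
    return int(text.replace(",", "")), None


for cell in ["12", "1,204", "NN", "-"]:
    count, flag = parse_cell(cell)
    label = FLAG_MEANINGS[flag] if flag else "count"
    print(repr(cell), count, label)
```

Whether a "-" cell should later be treated as zero is left to the analyst; the sketch only records the flag.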
This dataset includes information on the number of positive tests of individuals for COVID-19 infection performed in New York State beginning March 1, 2020, when the first case of COVID-19 was identified in the state. The primary goal of publishing this dataset is to provide users timely information about local disease spread and reporting of positive cases. The data will be updated daily, reflecting tests reported by 12:00 am (midnight) three days prior. Data are published on a three-day lag in order to allow all test results to be reported.
Reporting of SARS-CoV-2 laboratory testing results is mandated under Part 2 of the New York State Sanitary Code. Clinical laboratories, as defined in Public Health Law (PHL) § 571, electronically report test results to the New York State Department of Health (DOH) via the Electronic Clinical Laboratory Reporting System (ECLRS). The DOH Division of Epidemiology’s Bureau of Surveillance and Data System (BSDS) monitors ECLRS reporting and ensures that all results are accurate.
Test counts are based on specimen collection date. A person may have multiple specimens tested on one day; these are counted once. That is, if two specimens are collected from an individual at the same time and then evaluated, the outcome of evaluating those two samples to diagnose the individual is counted as a single test of one person, even though the specimens may be tested separately. All positive test results that are at least 90 days apart are counted as cases/new positives.
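The counting rule above can be illustrated with a short Python sketch for a single person. It is not the Department of Health's implementation. The function name is hypothetical, and the sketch assumes that the 90-day gap is measured from the most recently counted positive, which the description does not spell out.

```python
from datetime import date


def count_new_positives(positive_dates, min_gap_days=90):
    """Return the collection dates that count as new positives for one person.

    positive_dates: specimen collection dates of that person's positive results.
    Assumption: the gap is measured from the last *counted* positive.
    """
    counted = []
    # set() collapses several specimens collected on the same day into one.
    for collected in sorted(set(positive_dates)):
        if not counted or (collected - counted[-1]).days >= min_gap_days:
            counted.append(collected)
    return counted


history = [
    date(2020, 3, 1),   # first positive
    date(2020, 3, 1),   # second specimen on the same day, counted once
    date(2020, 4, 1),   # 31 days later, not a new positive
    date(2020, 6, 15),  # 106 days after the first, counted again
]
print(count_new_positives(history))
```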
New positive test counts are assigned to a county based on this order of preference: 1) the patient’s address, 2) the ordering healthcare provider/campus address, or 3) the ordering facility/campus address.
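The order of preference above amounts to taking the first available address. A minimal Python sketch, with hypothetical field names and with the county already derived from each address:

```python
def assign_county(record):
    """Pick the county for a positive test using the stated order of preference.

    The three keys are hypothetical names for the county derived from
    1) the patient's address, 2) the ordering provider/campus address,
    3) the ordering facility/campus address.
    """
    for key in ("patient_county", "provider_county", "facility_county"):
        county = record.get(key)
        if county:  # skip missing or empty values
            return county
    return None  # no usable address at all


print(assign_county({"patient_county": "", "provider_county": "Albany", "facility_county": "Erie"}))
```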
Archived versions of the reinfections dataset are also available: First infections - https://health.data.ny.gov/d/xdss-u53e Reinfections - https://health.data.ny.gov/d/7aaj-cdtu
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The objectives of this dataset are to:
1. Perform a data analysis to find the number of cases and deaths in each country and continent, and compare the times at which the number of cases or deaths was available.
2. Create a machine learning model (if possible) that predicts deaths from the case and death counts.
| | Columns | Description | Type |
|---|---|---|---|
| 1 | Date_reported | Date of recording the number of cases, whether infected or dead. | Categorical |
| 2 | Country_code | A standardized code (ISO-3166) for the country (e.g., US for the United States). | Categorical |
| 3 | Country | The full name of the country where the data is collected. | Categorical |
| 4 | Continent | The continent on which the country is located (e.g., Asia, Europe). | Categorical |
| 5 | WHO_region | The specific WHO region that the country belongs to (e.g., AFRO, EMRO, EURO). | Categorical |
| 6 | New_cases | The number of newly reported COVID-19 cases on the reporting date. | Numerical (float) |
| 7 | Cumulative_cases | The total number of COVID-19 cases reported in the country to date. | Numerical (int) |
| 8 | New_deaths | The number of newly reported deaths due to COVID-19 on the reporting date. | Numerical (float) |
| 9 | Cumulative_deaths | The total number of deaths due to COVID-19 reported in the country to date. | Numerical (int) |
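As a possible first step of the data analysis, the relationship between New_cases and Cumulative_cases can be checked per country. The Python sketch below uses the column names from the table on invented toy rows. It assumes that dates are ISO-formatted strings and that a missing New_cases value means no new cases; neither assumption is confirmed by the dataset description.

```python
def cumulative_mismatches(rows):
    """Return (Country_code, Date_reported) pairs where Cumulative_cases
    differs from the running sum of New_cases for that country."""
    running = {}
    mismatches = []
    # ISO date strings sort chronologically, so a plain sort is enough.
    for row in sorted(rows, key=lambda r: (r["Country_code"], r["Date_reported"])):
        code = row["Country_code"]
        new_cases = row["New_cases"] or 0  # assumption: missing means zero
        running[code] = running.get(code, 0) + int(new_cases)
        if running[code] != row["Cumulative_cases"]:
            mismatches.append((code, row["Date_reported"]))
    return mismatches


# Invented toy rows, not real figures.
toy_rows = [
    {"Date_reported": "2020-03-01", "Country_code": "US", "New_cases": 3.0, "Cumulative_cases": 3},
    {"Date_reported": "2020-03-02", "Country_code": "US", "New_cases": 2.0, "Cumulative_cases": 5},
    {"Date_reported": "2020-03-01", "Country_code": "FR", "New_cases": 1.0, "Cumulative_cases": 1},
    {"Date_reported": "2020-03-02", "Country_code": "FR", "New_cases": None, "Cumulative_cases": 1},
    {"Date_reported": "2020-03-03", "Country_code": "FR", "New_cases": 4.0, "Cumulative_cases": 6},
]
print(cumulative_mismatches(toy_rows))
```

Rows flagged by such a check would need to be inspected before being used to train a model.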
NNDSS - Table I. infrequently reported notifiable diseases - 2018. In this Table, provisional cases of selected infrequently reported notifiable diseases (<1,000 cases reported during the preceding year) are displayed. This table excludes U.S. territories.
Notice: The case counts for Haemophilus influenzae, invasive disease "Nontypeable" and "Non-b serotype" were switched for 2018 weeks 1-52.
Note: These are provisional cases of selected national notifiable diseases from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data from the 50 states, New York City, and the District of Columbia are collated and published weekly on the NNDSS Data and Statistics web page (https://wwwn.cdc.gov/nndss/data-and-statistics.html). Cases reported by state health departments to CDC for weekly publication are provisional because of the time needed to complete case follow-up. Therefore, numbers presented in later weeks may reflect changes made to these counts as additional information becomes available. The national surveillance case definitions used to define a case are available on the NNDSS web site at https://wwwn.cdc.gov/nndss/. Information about the weekly provisional data and guides to interpreting data are available at: https://wwwn.cdc.gov/nndss/infectious-tables.html.
Footnotes:
—: No reported cases.
N: Not reportable.
NA: Not available.
NN: Not Nationally Notifiable.
NP: Nationally notifiable but not published.
Cum: Cumulative year-to-date counts.
Case counts for reporting years 2017 and 2018 are provisional and subject to change. Data for years 2013 through 2016 are finalized. For further information on interpretation of these data, see http://wwwn.cdc.gov/nndss/document/ProvisionalNationaNotifiableDiseasesSurveillanceData20100927.pdf.
† This table does not include cases from the U.S. territories.
§ Calculated by summing the incidence counts for the current week, the 2 weeks preceding the current week, and the 2 weeks following the current week, for a total of 5 preceding years. Additional information is available at https://wwwn.cdc.gov/nndss/document/5yearweeklyaverage.pdf.
¶ Not reportable in all jurisdictions. Data from states where the condition is not reportable are excluded from this table, except for the arboviral diseases and influenza-associated pediatric mortality. Reporting exceptions are available at http://wwwn.cdc.gov/nndss/downloads.html.
** Please refer to the CDC WONDER for weekly updates to the footnote for this condition.
†† Please refer to the CDC WONDER for weekly updates to the footnote for this condition.
§§ Novel influenza A virus infections are human infections with influenza A viruses that are different from currently circulating human seasonal influenza viruses. With the exception of one avian lineage influenza A (H7N2) virus, all novel influenza A virus infections reported to CDC since 2013 have been variant influenza viruses.
¶¶ Prior to 2018, cases of paratyphoid fever were included with salmonellosis cases (see Table II).
*** Prior to 2015, CDC's National Notifiable Diseases Surveillance System (NNDSS) did not receive electronic data about incident cases of specific viral hemorrhagic fevers; instead data were collected in aggregate as "viral hemorrhagic fevers". NNDSS was updated beginning in 2015 to receive data for each of the viral hemorrhagic fevers listed.
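The windowed sum described in the § footnote can be sketched in Python as follows. The function name and the (year, week) dictionary layout are hypothetical. The footnote only describes the summation; dividing by the 25 weeks summed to obtain an average is an assumption here, and the sketch does not handle windows that cross a year boundary.

```python
def five_year_window_sum(counts, year, week, years_back=5, half_width=2):
    """Sum counts for the current week and the 2 weeks on either side,
    over each of the 5 preceding years.

    counts: dict mapping (year, week) -> incidence count.
    Limitation: weeks outside the dict (e.g. week 0 or 53) contribute 0,
    so windows near a year boundary are not wrapped into adjacent years.
    """
    total = 0
    for past_year in range(year - years_back, year):
        for nearby_week in range(week - half_width, week + half_width + 1):
            total += counts.get((past_year, nearby_week), 0)
    return total


# Toy data: one case in every week of 2013 through 2017.
toy_counts = {(y, w): 1 for y in range(2013, 2018) for w in range(1, 53)}
window_sum = five_year_window_sum(toy_counts, year=2018, week=10)
weeks_summed = 5 * 5  # 5 weeks per year over 5 years
print(window_sum, window_sum / weeks_summed)  # the division is an assumed averaging step
```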