The COVID Tracking Project collects information from 50 US states, the District of Columbia, and 5 other US territories to provide the most comprehensive testing data we can collect for the novel coronavirus, SARS-CoV-2. We attempt to include positive and negative results, pending tests, and total people tested for each state or district currently reporting that data.
Testing is a crucial part of any public health response, and sharing test data is essential to understanding this outbreak. The CDC is currently not publishing complete testing data, so we’re doing our best to collect it from each state and provide it to the public. The information is patchy and inconsistent, so we’re being transparent about what we find and how we handle it—the spreadsheet includes our live comments about changing data and how we’re working with incomplete information.
From here, you can also learn about our methodology, see who makes this, and find out what information states provide and how we handle it.
The AP has requested a timeseries dataset reporting daily counts for distributed and administered vaccines in the U.S. from the CDC. In the absence of that dataset, we are storing daily snapshots of the cumulative counts provided by the CDC COVID Data Tracker and compiling a timeseries dataset here. This process has captured cumulative counts going back to January 4th and daily counts of new doses administered and distributed going back to January 5th. The timeseries dataset also includes seven-day rolling average calculations for the daily metrics.
We have identified a few instances of decreasing cumulative counts in this timeseries, which result in single-day negative counts. We are treating these instances as corrections, and include the negative counts in the rolling averages.
We are investigating the cumulative count decreases and will update the timeseries dataset if necessary with additional information from the CDC. When the CDC provides its own timeseries dataset we will make that available here.
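The derivation described above (daily counts taken from cumulative snapshots, negative days kept as corrections, and a seven-day rolling average) can be sketched roughly as follows. This is an illustration only, not the AP's actual pipeline; the function names and the sample numbers are hypothetical.

```python
def daily_counts(cumulative):
    """Turn cumulative snapshots into day-over-day changes.

    A decrease in the cumulative count yields a negative daily value,
    which is kept as a correction rather than clipped to zero.
    """
    return [curr - prev for prev, curr in zip(cumulative, cumulative[1:])]


def rolling_average(values, window=7):
    """Trailing mean over `window` days; None until a full window exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


# Hypothetical cumulative doses administered, one snapshot per day.
cumulative = [100, 150, 210, 205, 260, 330, 400, 480, 560]
daily = daily_counts(cumulative)   # includes one negative correction day
averages = rolling_average(daily)  # negative day is included in the mean
```

Because the first snapshot has no prior day to compare against, daily counts start one day after the cumulative counts, which matches the January 4th and January 5th start dates mentioned above.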
The AP is using data provided by the Centers for Disease Control and Prevention to report vaccine doses distributed and administered in the United States.
This data is from the CDC's COVID Data Tracker, which is updated daily. However, keep in mind that healthcare providers can report doses to federal, state, territorial, and local agencies up to 72 hours after doses are administered.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
The AP has designed an interactive map to track COVID-19 vaccine counts reported by the CDC: https://interactives.ap.org/embeds/TUVpf/14/
From the CDC:
- Numbers reported on CDC’s website are validated through a submission process with each jurisdiction and may differ from numbers posted on other websites.
- Differences between reporting jurisdictions and CDC’s website may occur due to the timing of reporting and website updates.
- The process used for reporting doses distributed or people vaccinated displayed by other websites may differ.
A. SUMMARY It is the policy of the San Francisco Department of Public Health to comply with patient/client/resident rights regarding Protected Health Information (PHI) as set forth in the Health Insurance Portability and Accountability Act of 1996 (HIPAA). These guidelines exist to provide guidance only as they relate to the public release of COVID-19 data through the tracker webpages, so that public reporting of de-identified information on residents’ health status, demographic and other characteristics, and geographical information reflects consistent reporting practices and meaningful differences in health outcomes, conditions that impact health, and delivery of services, while safeguarding patient/client/resident rights regarding PHI.
COVID-19 related data will be released routinely in a variety of data products related to the tracker, including datasets through SF OpenData. Some data products may include data by county or smaller analysis unit such as ZIP code, neighborhood, or census tract.
Download the attached PDF for the policy.
https://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
The COVID Symptom Tracker was designed by doctors and scientists at King's College London (KCL) and Guy's and St Thomas' Hospitals, working in partnership with ZOE Global. The research is led by Dr Tim Spector, professor of genetic epidemiology at KCL and director of TwinsUK.
The COVID Symptom Tracker (https://covid.joinzoe.com/) mobile application was designed by doctors and scientists at King's College London and Guy's and St Thomas’ Hospitals, working in partnership with ZOE Global Ltd, a health science company. This research is led by Dr Tim Spector, professor of genetic epidemiology at King’s College London and director of TwinsUK, a scientific study of 15,000 identical and non-identical twins that has been running for nearly three decades. The dataset schema includes:
- Demographic Information (Year of Birth, Gender, Height, Weight, Postcode)
- Health Screening Questions (Activity, Heart Disease, Diabetes, Lung Disease, Smoking Status, Kidney Disease, Chemotherapy, Immunosuppressants, Corticosteroids, Blood Pressure Medications, Previous COVID, COVID Symptoms, Needs Help, Housebound Problems, Help Availability, Mobility Aid)
- COVID Testing Conducted
- How You Feel?
- Symptom Description
- Location Information (Home, Hospital, Back From Hospital)
- Treatment Received
Dataset Access Request: https://healthdatagateway.org/detail/9b604483-9cdc-41b2-b82c-14ee3dd705f6
April 29, 2020
October 13, 2020
The COVID Tracking Project is releasing more precise total testing counts, and has changed the way it is distributing the data that ends up on this site. Previously, total testing had been represented by positive tests plus negative tests. As states are beginning to report more specific testing counts, The COVID Tracking Project is moving toward reporting those numbers directly.
This may make it more difficult to compare your state against others in terms of positivity rate, but the net effect is we now have more precise counts:
Total Test Encounters: Total tests increase by one for every individual that is tested that day. Additional tests for that individual on that day (i.e., multiple swabs taken at the same time) are not included
Total PCR Specimens: Total tests increase by one for every testing sample retrieved from an individual. Multiple samples from an individual on a single day can be included in the count
Unique People Tested: Total tests increase by one the first time an individual is tested. The count will not increase in later days if that individual is tested again – even months later
These three totals are not all available for every state. The COVID Tracking Project prioritizes the different count types for each state in this order:
Total Test Encounters
Total PCR Specimens
Unique People Tested
If the state does not provide any of those totals directly, The COVID Tracking Project falls back to the initial calculation of total tests that it has provided up to this point: positive + negative tests.
One of the above total counts will be the number present in the cumulative_total_test_results and total_test_results_increase columns.
The positivity rates provided on this site will divide confirmed cases by one of these total_test_results columns.
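The fallback order and the positivity calculation described above can be sketched roughly as follows. This is a sketch under stated assumptions, not The COVID Tracking Project's code: the field names (total_test_encounters, total_pcr_specimens, unique_people_tested, positive, negative) are hypothetical stand-ins for whatever a state record actually contains, and "positive" is assumed to hold confirmed cases.

```python
# Preferred count types, highest priority first (hypothetical field names).
PRIORITY = ("total_test_encounters", "total_pcr_specimens", "unique_people_tested")


def total_test_results(row):
    """Pick the first available total; fall back to positive + negative."""
    for field in PRIORITY:
        value = row.get(field)
        if value is not None:
            return value
    positive = row.get("positive")
    negative = row.get("negative")
    if positive is None or negative is None:
        return None  # nothing usable was reported
    return positive + negative


def positivity_rate(row):
    """Confirmed cases divided by the chosen total; None if undefined."""
    total = total_test_results(row)
    if not total:
        return None
    return row["positive"] / total
```

Because the denominator can be encounters, specimens, or unique people depending on the state, positivity rates computed this way are not strictly comparable across states.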
The AP is using data collected by the COVID Tracking Project to measure COVID-19 testing across the United States.
The COVID Tracking Project data is available at the state level in the United States. The AP has paired this data with population figures and has calculated testing rates and death rates per 1,000 people.
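As a rough illustration of the per-1,000 rates mentioned above (the function name and sample figures are hypothetical, and this is not the AP's own code):

```python
def rate_per_1000(count, population):
    """Scale a raw count (tests or deaths) to a rate per 1,000 residents."""
    if not population:
        return None  # avoid dividing by a missing or zero population
    return count * 1000 / population


# Hypothetical state: 5,000 tests in a population of 1,000,000.
example_rate = rate_per_1000(5000, 1_000_000)
```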
This data is from The COVID Tracking Project API that is updated regularly throughout the day. Like all organizations dealing with data, The COVID Tracking Project is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find The COVID Tracking Project daily data reports, and a clean version of their feed.
A Note on timing:
- The COVID Tracking Project updates regularly throughout the day, but state numbers will come in at different times. The entire Tracking Project dataset will be updated between 4 and 5 p.m. EDT daily. Keep this time in mind when reporting on stories comparing states. At certain times of day, one state may be more up to date than another. We have included the date_modified timestamp for state-level data, which represents the last time the state updated its data. The date_checked value in the state-level data reflects the last time The COVID Tracking Project checked the state source. We have also included the last_modified timestamp for the national-level data, which marks the last time the national data was updated.
The AP is updating this dataset hourly at 45 minutes past the hour.
total_people_tested counts do not include pending tests. They are the total number of tests that have returned positive or negative.
This data should be credited to The COVID Tracking Project.
Nicky Forster — nforster@ap.org
FasterCures, a center of the Milken Institute, is currently tracking the development of treatments and vaccines for COVID-19 (coronavirus). The tracker contains an aggregation of publicly-available information from validated sources.
https://covid-19tracker.milkeninstitute.org/
https://mdl.library.utoronto.ca/covid-19/data https://covid-19tracker.milkeninstitute.org/ https://airtable.com/shrSAi6t5WFwqo3GM/tblEzPQS5fnc0FHYR/viwDBH7b6FjmIBX5x?blocks=bipZFzhJ7wHPv7x9z
Covid-19 Pandemic
The COVID Tracking Project was a volunteer organization launched from The Atlantic and dedicated to collecting and publishing the data required to understand the COVID-19 outbreak in the United States. Our dataset was in use by national and local news organizations across the United States and by research projects and agencies worldwide. On August 12, 2020, we launched the Long-Term-Care COVID Tracker with weekly data back to May 28, 2020, but the work of compiling the dataset began much earlier. In mid-April 2020, a team within The COVID Tracking Project started collecting long-term care data from every state that reported it. The aim of our work on long-term-care (LTC) data was to ensure that the pandemic’s impact on residents and workers in a broad range of LTC facilities was entered into the historical record. Every Thursday evening until March 4, 2021, this dedicated team of volunteers gathered COVID-19 case and death data of long-term-care facility residents and staff from state a...
https://creativecommons.org/publicdomain/zero/1.0/
Update: covid19india.org stopped collecting data on October 31, 2021. For more information, please read the blog post in which they cite their reasons. If I find another source, I will add it as a separate dataset. This dataset is sourced from https://www.covid19india.org/. Read their About page for more details on how they collected this information.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Governments are taking a wide range of measures in response to the COVID-19 outbreak. The Oxford COVID-19 Government Response Tracker (OxCGRT) aims to track and compare government responses to the coronavirus outbreak worldwide, rigorously and consistently.
The OxCGRT systematically collects information on several different common policy responses governments have taken, scores the stringency of such measures, and aggregates these scores into a common Stringency Index. For more, please visit https://www.bsg.ox.ac.uk/research/research-projects/oxford-covid-19-government-response-tracker
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset contains the COVID-19 Government Response Tracker for 2020. Data from Blavatnik School of Government, University of Oxford. Follow datasource.kapsarc.org for timely data to advance energy economics research. Data cited at: Hale, Thomas and Samuel Webster (2020). Oxford COVID-19 Government Response Tracker. Note: For a detailed description of indicators and changes, please see the attachment.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘COVID-19 State Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/nightranger77/covid19-state-data on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This dataset is a per-state amalgamation of demographic, public health and other relevant predictors for COVID-19.
Used positive, death and totalTestResults from the API for, respectively, Infected, Deaths and Tested in this dataset.
Please read the documentation of the API for more context on those columns
Density is people per meter squared https://worldpopulationreview.com/states/
https://worldpopulationreview.com/states/gdp-by-state/
https://worldpopulationreview.com/states/per-capita-income-by-state/
https://en.wikipedia.org/wiki/List_of_U.S._states_by_Gini_coefficient
Rates from Feb 2020 and are percentage of labor force
https://www.bls.gov/web/laus/laumstrk.htm
Ratio is Male / Female
https://www.kff.org/other/state-indicator/distribution-by-gender/
https://worldpopulationreview.com/states/smoking-rates-by-state/
Death rate per 100,000 people
https://www.cdc.gov/nchs/pressroom/sosmap/flu_pneumonia_mortality/flu_pneumonia.htm
Death rate per 100,000 people
https://www.cdc.gov/nchs/pressroom/sosmap/lung_disease_mortality/lung_disease.htm
https://www.kff.org/other/state-indicator/total-active-physicians/
https://www.kff.org/other/state-indicator/total-hospitals
Includes spending for all health care services and products by state of residence. Hospital spending is included and reflects the total net revenue. Costs such as insurance, administration, research, and construction expenses are not included.
https://www.kff.org/other/state-indicator/avg-annual-growth-per-capita/
Pollution: Average exposure of the general public to particulate matter of 2.5 microns or less (PM2.5) measured in micrograms per cubic meter (3-year estimate)
https://www.americashealthrankings.org/explore/annual/measure/air/state/ALL
For each state, number of medium and large airports https://en.wikipedia.org/wiki/List_of_the_busiest_airports_in_the_United_States
Note that FL was incorrect in the table, but is corrected in the Hottest States paragraph
https://worldpopulationreview.com/states/average-temperatures-by-state/
District of Columbia temperature computed as the average of Maryland and Virginia
Urbanization as a percentage of the population https://www.icip.iastate.edu/tables/population/urban-pct-states
https://www.kff.org/other/state-indicator/distribution-by-age/
Schools that haven't closed are marked NaN https://www.edweek.org/ew/section/multimedia/map-coronavirus-and-school-closures.html
Note that some datasets above did not contain data for the District of Columbia; this missing data was found via Google searches and entered manually.
--- Original source retains full ownership of the source dataset ---
The COVID Racial Data Tracker advocated for, collected, published, and analyzed racial data on the COVID-19 pandemic across the United States. It was a collaboration between the COVID Tracking Project and the Boston University Center for Antiracist Research. This project began when Dr. Ibram X. Kendi, director of the BU Center for Antiracist Research, wrote a series of essays in The Atlantic about the urgent need to gather racial and ethnic demographic data to understand the outbreak and protect vulnerable communities. On April 12, 2020, we started collecting race and ethnicity data from every state that reported it. On April 15, we launched that dataset as the first iteration of the COVID Racial Data Tracker. We updated this data twice per week.
Daily global COVID-19 data for all countries, provided by the Johns Hopkins University (JHU) Center for Systems Science and Engineering (CSSE). If you want the updated version of the data, you can access our daily updated data by entering an API key via Altadata.
In this data product, you may find the latest and historical global daily data on the COVID-19 pandemic for all countries.
The COVID‑19 pandemic, also known as the coronavirus pandemic, is an ongoing global pandemic of coronavirus disease 2019 (COVID‑19), caused by severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2). The outbreak was first identified in December 2019 in Wuhan, China. The World Health Organization declared the outbreak a Public Health Emergency of International Concern on 30 January 2020 and a pandemic on 11 March. As of 12 August 2020, more than 20.2 million cases of COVID‑19 have been reported in more than 188 countries and territories, resulting in more than 741,000 deaths; more than 12.5 million people have recovered.
The Johns Hopkins Coronavirus Resource Center is a continuously updated source of COVID-19 data and expert guidance. They aggregate and analyze the best data available on COVID-19 - including cases, as well as testing, contact tracing and vaccine efforts - to help the public, policymakers and healthcare professionals worldwide respond to the pandemic.
https://spdx.org/licenses/CC0-1.0.html
Management of the COVID-19 pandemic has proven to be a significant challenge to policy makers. This is in large part due to uneven reporting and the absence of open-access visualization tools to present and analyze local trends as well as infer healthcare needs. Here we report the development of CovidCounties.org, an interactive web application that depicts daily disease trends at the level of US counties using time series plots and maps. This application is accompanied by a manually curated dataset that catalogs all major public policy actions made at the state-level, as well as technical validation of the primary data. Finally, the underlying code for the site is also provided as open source, enabling others to validate and learn from this work.
Methods Data related to state-wide implementation of social-distancing policies were manually curated by web search and independently reviewed by a second author; disagreements were rare and resolved by discussion. Government websites were prioritized as sources of truth where feasible; otherwise, news reports covering state-wide proclamations were used. All citations are captured in the data file.
Ground truth data used in the validation were manually curated from states’ Department of Public Health websites. Citations of the validation data are included in the data file.
To confirm global accessibility of covidcounties.org, we used dareboost.com to perform loading speed tests from 14 cities across the globe using three different devices: Google Chrome via desktop, iPhone 6s/7/8, and Samsung Galaxy S6.
After over two years of public reporting, the State Profile Report will no longer be produced and distributed after February 2023. The final release was on February 23, 2023. We want to thank everyone who contributed to the design, production, and review of this report and we hope that it provided insight into the data trends throughout the COVID-19 pandemic. Data about COVID-19 will continue to be updated at CDC’s COVID Data Tracker.
The State Profile Report (SPR) is generated by the Data Strategy and Execution Workgroup in the Joint Coordination Cell, in collaboration with the White House. It is managed by an interagency team with representatives from multiple agencies and offices (including the United States Department of Health and Human Services (HHS), the Centers for Disease Control and Prevention, the HHS Assistant Secretary for Preparedness and Response, and the Indian Health Service). The SPR provides easily interpretable information on key indicators for each state, down to the county level. It is a weekly snapshot in time that:
- Focuses on recent outcomes in the last seven days and changes relative to the month prior
- Provides additional contextual information at the county level for each state, and includes national level information
- Supports rapid visual interpretation of results with color thresholds
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains forecasted weekly numbers of reported COVID-19 incident cases, incident deaths, and cumulative deaths in the United States, previously reported on COVID Data Tracker (https://covid.cdc.gov/covid-data-tracker/#datatracker-home). These forecasts were generated using mathematical models by CDC partners in the COVID-19 Forecast Hub (https://covid19forecasthub.org/doc/ensemble/). A CDC ensemble model was produced every week using the submitted models from that week at the national, and state/territory level.
This dataset is intended to mirror the observed and forecasted data, previously available for download on the CDC’s COVID Data Tracker. Mortality forecasts for both new and cumulative reported COVID-19 deaths were produced at the state and territory level and national level. Forecasts of new reported COVID-19 cases were produced at the county, state/territory, and national level. Please note that this dataset is not complete for every model, date, location or combination thereof. Specifically, county level submissions for COVID-19 incident cases were accepted, but not required, and are missing or incomplete for many models and dates. State and territory-level forecasts are more complete, but not all models submitted forecasts for all locations, dates, and targets (new reported deaths, new reported cases, and cumulative reported deaths). Forecasts for COVID-19 incident cases were discontinued in February 2022. Forecasts for COVID-19 cumulative and incident deaths were discontinued in March 2023.
The Marshall Project, the nonprofit investigative newsroom dedicated to the U.S. criminal justice system, has partnered with The Associated Press to compile data on the prevalence of COVID-19 infection in prisons across the country. The Associated Press is sharing this data as the most comprehensive current national source of COVID-19 outbreaks in state and federal prisons.
Lawyers, criminal justice reform advocates and families of the incarcerated have worried about what was happening in prisons across the nation as coronavirus began to take hold in the communities outside. Data collected by The Marshall Project and AP shows that hundreds of thousands of prisoners, workers, correctional officers and staff have caught the illness as prisons became the center of some of the country’s largest outbreaks. And thousands of people — most of them incarcerated — have died.
In December, as COVID-19 cases spiked across the U.S., the news organizations also shared cumulative rates of infection among prison populations, to better gauge the total effects of the pandemic on prison populations. The analysis found that by mid-December, one in five state and federal prisoners in the United States had tested positive for the coronavirus -- a rate more than four times higher than the general population.
This data, which is updated weekly, is an effort to track how those people have been affected and where the crisis has hit the hardest.
The data tracks the number of COVID-19 tests administered to people incarcerated in all state and federal prisons, as well as the staff in those facilities. It is collected on a weekly basis by Marshall Project and AP reporters who contact each prison agency directly and verify published figures with officials.
Each week, the reporters ask every prison agency for the total number of coronavirus tests administered to its staff members and prisoners, the cumulative number who tested positive among staff and prisoners, and the numbers of deaths for each group.
The time series data is aggregated to the system level; there is one record for each prison agency on each date of collection. Not all departments could provide data for the exact date requested, and the data indicates the date for the figures.
To estimate the rate of infection among prisoners, we collected population data for each prison system before the pandemic, roughly in mid-March, in April, June, July, August, September and October. Beginning the week of July 28, we updated all prisoner population numbers, reflecting the number of incarcerated adults in state or federal prisons. Prior to that, population figures may have included additional populations, such as prisoners housed in other facilities, which were not captured in our COVID-19 data. In states with unified prison and jail systems, we include both detainees awaiting trial and sentenced prisoners.
To estimate the rate of infection among prison employees, we collected staffing numbers for each system. Where current data was not publicly available, we acquired other numbers through our reporting, including calling agencies or from state budget documents. In six states, we were unable to find recent staffing figures: Alaska, Hawaii, Kentucky, Maryland, Montana, Utah.
To calculate the cumulative COVID-19 impact on prisoner and prison worker populations, we aggregated prisoner and staff COVID case and death data up through Dec. 15. Because population snapshots do not account for movement in and out of prisons since March, and because many systems have significantly slowed the number of new people being sent to prison, it’s difficult to estimate the total number of people who have been held in a state system since March. To be conservative, we calculated our rates of infection using the largest prisoner population snapshots we had during this time period.
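The conservative rate described above, which divides cumulative cases by the largest available population snapshot, can be sketched as follows. The function name and the sample figures are hypothetical and are not taken from the dataset.

```python
def conservative_infection_rate(cumulative_cases, population_snapshots):
    """Divide cumulative cases by the largest population snapshot.

    Using the largest denominator gives the lowest (most conservative)
    rate among the available snapshots.
    """
    snapshots = [p for p in population_snapshots if p]  # drop missing or zero
    if not snapshots:
        return None
    return cumulative_cases / max(snapshots)


# Hypothetical system: 200 cumulative cases, three population snapshots.
rate = conservative_infection_rate(200, [900, 1000, 800])
```

Choosing the maximum snapshot understates the rate if the true number of people who passed through the system was smaller, which is the direction of error the text above intends.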
As with all COVID-19 data, our understanding of the spread and impact of the virus is limited by the availability of testing. Epidemiology and public health experts say that aside from a few states that have recently begun aggressively testing in prisons, it is likely that there are more cases of COVID-19 circulating undetected in facilities. Sixteen prison systems, including the Federal Bureau of Prisons, would not release information about how many prisoners they are testing.
Corrections departments in Indiana, Kansas, Montana, North Dakota and Wisconsin report coronavirus testing and case data for juvenile facilities; West Virginia reports figures for juvenile facilities and jails. For consistency of comparison with other state prison systems, we removed those facilities from our data that had been included prior to July 28. For these states we have also removed staff data. Similarly, Pennsylvania’s coronavirus data includes testing and cases for those who have been released on parole. We removed these tests and cases for prisoners from the data prior to July 28. The staff cases remain.
There are four tables in this data:
covid_prison_cases.csv contains weekly time series data on tests, infections and deaths in prisons. The first dates in the table are on March 26. Any questions that a prison agency could not or would not answer are left blank.
prison_populations.csv contains snapshots of the population of people incarcerated in each of these prison systems for whom data on COVID testing and cases are available. This varies by state and may not always be the entire number of people incarcerated in each system. In some states, it may include other populations, such as those on parole or held in state-run jails. This data is primarily for use in calculating rates of testing and infection, and we would not recommend using these numbers to compare the change in how many people are being held in each prison system.
staff_populations.csv contains a one-time, recent snapshot of the headcount of workers for each prison agency, collected as close to April 15 as possible.
covid_prison_rates.csv contains the rates of cases and deaths for prisoners. There is one row for every state and federal prison system and an additional row with the National totals.
The Associated Press and The Marshall Project have created several queries to help you use this data:
Get your state's prison COVID data: Provides each week's data from just your state and calculates a cases-per-100000-prisoners rate, a deaths-per-100000-prisoners rate, a cases-per-100000-workers rate and a deaths-per-100000-workers rate here
Rank all systems' most recent data by cases per 100,000 prisoners here
Find what percentage of your state's total cases and deaths -- as reported by Johns Hopkins University -- occurred within the prison system here
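For readers working from the CSV files rather than the hosted queries, a ranking by cases per 100,000 prisoners could be approximated along these lines. This is a sketch, not the AP or Marshall Project query: the record keys (name, as_of_date, total_prisoner_cases) are assumed for illustration and should be checked against the actual column headers, and dates are assumed to be ISO-formatted strings so that they sort correctly as text.

```python
def latest_per_system(records):
    """Keep only the most recent record for each prison system."""
    latest = {}
    for rec in records:
        name = rec["name"]
        if name not in latest or rec["as_of_date"] > latest[name]["as_of_date"]:
            latest[name] = rec
    return latest


def rank_by_case_rate(records, populations):
    """Rank systems by cumulative prisoner cases per 100,000 prisoners."""
    ranked = []
    for name, rec in latest_per_system(records).items():
        pop = populations.get(name)
        cases = rec.get("total_prisoner_cases")
        if not pop or cases is None:
            continue  # skip systems with no population or unreported cases
        ranked.append((name, cases * 100_000 / pop))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample rows and populations.
records = [
    {"name": "A", "as_of_date": "2020-06-01", "total_prisoner_cases": 10},
    {"name": "A", "as_of_date": "2020-06-08", "total_prisoner_cases": 30},
    {"name": "B", "as_of_date": "2020-06-08", "total_prisoner_cases": 50},
]
populations = {"A": 1000, "B": 10000}
ranking = rank_by_case_rate(records, populations)
```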
In stories, attribute this data to: “According to an analysis of state prison cases by The Marshall Project, a nonprofit investigative newsroom dedicated to the U.S. criminal justice system, and The Associated Press.”
Many reporters and editors at The Marshall Project and The Associated Press contributed to this data, including: Katie Park, Tom Meagher, Weihua Li, Gabe Isman, Cary Aspinwall, Keri Blakinger, Jake Bleiberg, Andrew R. Calderón, Maurice Chammah, Andrew DeMillo, Eli Hager, Jamiles Lartey, Claudia Lauer, Nicole Lewis, Humera Lodhi, Colleen Long, Joseph Neff, Michelle Pitcher, Alysia Santo, Beth Schwartzapfel, Damini Sharma, Colleen Slevin, Christie Thompson, Abbie VanSickle, Adria Watson, Andrew Welsh-Huggins.
If you have questions about the data, please email The Marshall Project at info+covidtracker@themarshallproject.org or file a Github issue.