The Marshall Project, the nonprofit investigative newsroom dedicated to the U.S. criminal justice system, has partnered with The Associated Press to compile data on the prevalence of COVID-19 infection in prisons across the country. The Associated Press is sharing this data as the most comprehensive current national source of COVID-19 outbreaks in state and federal prisons.
Lawyers, criminal justice reform advocates and families of the incarcerated have worried about what was happening in prisons across the nation as coronavirus began to take hold in the communities outside. Data collected by The Marshall Project and AP shows that hundreds of thousands of prisoners, workers, correctional officers and staff have caught the illness as prisons became the center of some of the country’s largest outbreaks. And thousands of people — most of them incarcerated — have died.
In December, as COVID-19 cases spiked across the U.S., the news organizations also shared cumulative rates of infection among prison populations, to better gauge the total effects of the pandemic on prison populations. The analysis found that by mid-December, one in five state and federal prisoners in the United States had tested positive for the coronavirus -- a rate more than four times higher than the general population.
This data, which is updated weekly, is an effort to track how those people have been affected and where the crisis has hit the hardest.
The data tracks the number of COVID-19 tests administered to people incarcerated in all state and federal prisons, as well as the staff in those facilities. It is collected on a weekly basis by Marshall Project and AP reporters who contact each prison agency directly and verify published figures with officials.
Each week, the reporters ask every prison agency for the total number of coronavirus tests administered to its staff members and prisoners, the cumulative number who tested positive among staff and prisoners, and the numbers of deaths for each group.
The time series data is aggregated to the system level; there is one record for each prison agency on each date of collection. Not all departments could provide data for the exact date requested, and the data indicates the date for the figures.
To estimate the rate of infection among prisoners, we collected population data for each prison system before the pandemic, roughly in mid-March, in April, June, July, August, September and October. Beginning the week of July 28, we updated all prisoner population numbers, reflecting the number of incarcerated adults in state or federal prisons. Prior to that, population figures may have included additional populations, such as prisoners housed in other facilities, which were not captured in our COVID-19 data. In states with unified prison and jail systems, we include both detainees awaiting trial and sentenced prisoners.
To estimate the rate of infection among prison employees, we collected staffing numbers for each system. Where current data was not publicly available, we acquired other numbers through our reporting, including by calling agencies or consulting state budget documents. In six states, we were unable to find recent staffing figures: Alaska, Hawaii, Kentucky, Maryland, Montana and Utah.
To calculate the cumulative COVID-19 impact on prisoner and prison worker populations, we aggregated prisoner and staff COVID case and death data up through Dec. 15. Because population snapshots do not account for movement in and out of prisons since March, and because many systems have significantly slowed the number of new people being sent to prison, it’s difficult to estimate the total number of people who have been held in a state system since March. To be conservative, we calculated our rates of infection using the largest prisoner population snapshots we had during this time period.
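The conservative rate described above can be sketched in a few lines of Python. This is a minimal illustration, not the newsrooms' actual code: the function name and the sample figures are hypothetical, and the only rule taken from the text is that the largest available population snapshot serves as the denominator.

```python
def cumulative_rate_per_100k(cumulative_cases, population_snapshots):
    """Cumulative cases per 100,000 prisoners.

    Uses the largest population snapshot as the denominator, which
    yields the lowest (most conservative) rate.
    """
    denominator = max(population_snapshots)
    if denominator <= 0:
        raise ValueError("population snapshots must be positive")
    return cumulative_cases * 100_000 / denominator


# Hypothetical figures for a single prison system (not real data).
snapshots = [10_000, 9_500, 9_000]
rate = cumulative_rate_per_100k(2_000, snapshots)
print(f"{rate:.0f} cases per 100,000 prisoners")
```

Choosing the largest snapshot means the true share of people infected is, if anything, understated for systems whose populations shrank during the pandemic.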
As with all COVID-19 data, our understanding of the spread and impact of the virus is limited by the availability of testing. Epidemiology and public health experts say that aside from a few states that have recently begun aggressively testing in prisons, it is likely that there are more cases of COVID-19 circulating undetected in facilities. Sixteen prison systems, including the Federal Bureau of Prisons, would not release information about how many prisoners they are testing.
Corrections departments in Indiana, Kansas, Montana, North Dakota and Wisconsin report coronavirus testing and case data for juvenile facilities; West Virginia reports figures for juvenile facilities and jails. For consistency of comparison with other state prison systems, we removed those facilities from our data that had been included prior to July 28. For these states we have also removed staff data. Similarly, Pennsylvania’s coronavirus data includes testing and cases for those who have been released on parole. We removed these tests and cases for prisoners from the data prior to July 28. The staff cases remain.
There are four tables in this data:
covid_prison_cases.csv
contains weekly time series data on tests, infections and deaths in prisons. The first dates in the table are on March 26. Any questions that a prison agency could not or would not answer are left blank.
prison_populations.csv
contains snapshots of the population of people incarcerated in each of these prison systems for whom data on COVID testing and cases are available. This varies by state and may not always be the entire number of people incarcerated in each system. In some states, it may include other populations, such as those on parole or held in state-run jails. This data is primarily for use in calculating rates of testing and infection, and we would not recommend using these numbers to compare the change in how many people are being held in each prison system.
staff_populations.csv
contains a one-time, recent snapshot of the headcount of workers for each prison agency, collected as close to April 15 as possible.
covid_prison_rates.csv
contains the rates of cases and deaths for prisoners. There is one row for every state and federal prison system and an additional row with the National totals.
The Associated Press and The Marshall Project have created several queries to help you use this data:
Get your state's prison COVID data: Provides each week's data from just your state and calculates a cases-per-100,000-prisoners rate, a deaths-per-100,000-prisoners rate, a cases-per-100,000-workers rate and a deaths-per-100,000-workers rate
Rank all systems' most recent data by cases per 100,000 prisoners
Find what percentage of your state's total cases and deaths -- as reported by Johns Hopkins University -- occurred within the prison system
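For readers working from the CSV files rather than the hosted queries, the ranking query can be approximated as below. This is a sketch under stated assumptions: the dictionary keys (`name`, `cases`, `population`) and the three systems are hypothetical placeholders, not the real column names or figures.

```python
def rank_by_case_rate(systems):
    """Return systems sorted by cases per 100,000 prisoners, highest first."""
    ranked = []
    for system in systems:
        rate = system["cases"] * 100_000 / system["population"]
        # Copy the record so the caller's data is left untouched.
        ranked.append({**system, "cases_per_100k": rate})
    return sorted(ranked, key=lambda s: s["cases_per_100k"], reverse=True)


# Hypothetical most-recent figures (not real data).
systems = [
    {"name": "System A", "cases": 500, "population": 10_000},
    {"name": "System B", "cases": 300, "population": 2_000},
    {"name": "System C", "cases": 50, "population": 5_000},
]

for row in rank_by_case_rate(systems):
    print(row["name"], round(row["cases_per_100k"]))
```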
In stories, attribute this data to: “According to an analysis of state prison cases by The Marshall Project, a nonprofit investigative newsroom dedicated to the U.S. criminal justice system, and The Associated Press.”
Many reporters and editors at The Marshall Project and The Associated Press contributed to this data, including: Katie Park, Tom Meagher, Weihua Li, Gabe Isman, Cary Aspinwall, Keri Blakinger, Jake Bleiberg, Andrew R. Calderón, Maurice Chammah, Andrew DeMillo, Eli Hager, Jamiles Lartey, Claudia Lauer, Nicole Lewis, Humera Lodhi, Colleen Long, Joseph Neff, Michelle Pitcher, Alysia Santo, Beth Schwartzapfel, Damini Sharma, Colleen Slevin, Christie Thompson, Abbie VanSickle, Adria Watson, Andrew Welsh-Huggins.
If you have questions about the data, please email The Marshall Project at info+covidtracker@themarshallproject.org or file a Github issue.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Anomaly detection has recently become an important problem in many industrial and financial applications. In several instances, the data to be analyzed for possible anomalies is located at multiple sites and cannot be merged due to practical constraints such as bandwidth limitations and proprietary concerns. At the same time, the size of data sets affects prediction quality in almost all data mining applications. In such circumstances, distributed data mining algorithms may be used to extract information from multiple data sites in order to make better predictions. In the absence of theoretical guarantees, however, the degree to which data decentralization affects the performance of these algorithms is not known, which reduces the data-providing participants' incentive to cooperate. This creates a metaphorical 'prisoners' dilemma' in the context of data mining. In this work, we propose a novel general framework for distributed anomaly detection with theoretical performance guarantees. Our algorithmic approach combines existing anomaly detection procedures with a novel method for computing global statistics using local sufficient statistics. We show that the performance of such a distributed approach is indistinguishable from that of a centralized instantiation of the same anomaly detection algorithm, a condition that we call zero information loss. We further report experimental results on synthetic as well as real-world data to demonstrate the viability of our approach. The remaining content of this presentation is presented in Fig. 1.
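To make the idea of combining local sufficient statistics concrete, here is a small Python sketch. It is an illustration only, not the authors' algorithm: the z-score detector, the function names and the sample values are all assumptions. Each site shares only a count, a sum and a sum of squares, and the pooled mean and standard deviation agree with what a centralized computation would produce, up to floating-point rounding.

```python
import math


def local_stats(values):
    """Sufficient statistics one site would share: (count, sum, sum of squares)."""
    return len(values), sum(values), sum(v * v for v in values)


def global_mean_std(all_stats):
    """Combine per-site statistics into a global mean and population std."""
    n = sum(s[0] for s in all_stats)
    total = sum(s[1] for s in all_stats)
    total_sq = sum(s[2] for s in all_stats)
    mean = total / n
    variance = max(total_sq / n - mean * mean, 0.0)  # guard against rounding
    return mean, math.sqrt(variance)


def is_anomaly(x, mean, std, threshold=3.0):
    """Simple z-score rule, used here as a stand-in detector (an assumption)."""
    return abs(x - mean) > threshold * std


# Hypothetical data held at three sites; raw values never leave a site.
sites = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean, std = global_mean_std([local_stats(site) for site in sites])
print(mean, std, is_anomaly(20.0, mean, std))
```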
https://www.usa.gov/government-works/
This is a dataset of prisoner mugshots and associated data (height, weight, etc.). The copyright status is public domain, since it's produced by the government, the photographs do not have sufficient artistic merit, and a mere collection of facts isn't copyrightable.
The source is the Illinois Dept. of Corrections. In total, there are 68149 entries, of which a few hundred have shoddy data.
It's useful for neural network training, since it has pictures from both front and side, and they're (manually) labeled with date of birth, name (useful for clustering), weight, height, hair color, eye color, sex, race, and some various goodies such as sentence duration and whether they're sex offenders.
Here is the readme file:
---BEGIN README---
Scraped from the Illinois DOC.
https://www.idoc.state.il.us/subsections/search/inms_print.asp?idoc=
https://www.idoc.state.il.us/subsections/search/pub_showfront.asp?idoc=
https://www.idoc.state.il.us/subsections/search/pub_showside.asp?idoc=
paste -d '\n' <(sed 's|^|http://www.idoc.state.il.us/subsections/search/pub_showside.asp?idoc=|' ids.txt) <(sed -e 's|^| out=|' -e 's|$|.jpg|' ids.txt) > showside.txt
paste -d '\n' <(sed 's|^|http://www.idoc.state.il.us/subsections/search/pub_showfront.asp?idoc=|' ids.txt) <(sed -e 's|^| out=|' -e 's|$|.jpg|' ids.txt) > showfront.txt
paste -d '\n' <(sed 's|^|http://www.idoc.state.il.us/subsections/search/inms_print.asp?idoc=|' ids.txt) <(sed -e 's|^| out=|' -e 's|$|.html|' ids.txt) > inmates_print.txt
aria2c -i ../inmates_print.txt -j4 -x4 -l ../log-$(pwd|rev|cut -d/ -f 1|rev)-$(date +%s).txt
Then use htmltocsv.py to get the csv. Note that the script is very poorly written and may have errors. It also doesn't do anything with the warrant-related info, although there are some commented-out lines which may be relevant.
Also note that it assumes all the HTML files are located in the inmates directory, and overwrites any csv files in csv if there are any.
front.7z contains mugshots from the front
side.7z contains mugshots from the side
inmates.7z contains all the html files
csv contains the html files converted to CSV
The reason for packaging the images is that many torrent clients would otherwise crash if attempting to load the torrent.
All CSV files contain headers describing the nature of the columns. For person.csv, the id is unique. For marks.csv and sentencing.csv, it is not.
Note that the CSV files use semicolons as delimiters and also end with a trailing semicolon. If this is unsuitable, edit the arr2csvR function in htmltocsv.py.
There are 68149 inmates in total, although some (a few hundred) are marked as "Unknown"/"N/A"/"" in one or more fields.
The "height" column has been processed to contain the height in inches, rather than the height in feet and inches expressed as "X ft YY in."
Some inmates were marked "Not Available"; this has been replaced with "N/A".
Likewise, the "weight" column has been altered "XXX lbs." -> "XXX". Again, some are marked "N/A".
The "date of birth" column has some inmates marked as "Not Available" and others as "". There doesn't appear to be any pattern. It may be related to the institution they are kept in. Otherwise, the format is MM/DD/YYYY.
The "weight" column is often rounded to the nearest 5 lbs.
Statistics for hair:
43305 Black
17371 Brown
2887 Blonde or Strawberry
2539 Gray or Partially Gray
740 Red or Auburn
624 Bald
396 Not Available
209 Salt and Pepper
70 White
7 Sandy
1 Unknown
Statistics for sex:
63409 Male
4740 Female
Statistics for race:
37991 Black
20992 White
8637 Hispanic
235 Asian
104 Amer Indian
94 Unknown
92 Bi-Racial
4
Statistics for eyes:
51714 Brown
7808 Blue
4259 Hazel
2469 Green
1382 Black
420 Not Available
87 Gray
9 Maroon
1 Unknown
---END README---
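Because the README says the CSV files are semicolon-delimited and end every line with a trailing semicolon, a plain CSV reader will see one extra empty column. The following Python sketch shows one way to handle that. The function names and the sample rows are hypothetical; only the delimiter, the trailing semicolon and the "N/A" marker come from the README.

```python
import csv
import io


def read_semicolon_csv(text):
    """Parse semicolon-delimited text whose lines end with a trailing ';'."""
    rows = []
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if row and row[-1] == "":
            row = row[:-1]  # drop the empty field created by the trailing ';'
        if row:
            rows.append(row)
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def to_int_or_none(value):
    """Convert a numeric field, treating 'N/A' and blanks as missing."""
    return None if value in ("N/A", "") else int(value)


# Hypothetical excerpt in the shape the README describes (not real records).
sample = "id;height;weight;\nX00001;70;180;\nX00002;N/A;N/A;\n"
people = read_semicolon_csv(sample)
print([to_int_or_none(p["height"]) for p in people])
```

To read a real file, the same function could be given the file's contents, for example the text of person.csv opened with `newline=""`.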
Here is a formal summary:
---BEGIN SUMMARY---
Documentation:
Title: Illinois DOC dataset
Source Information
-- Creators: Illinois DOC
-- Illinois Department of Corrections
1301 Concordia Court
P.O. Box 19277
Springfield, IL 62794-9277
(217) 558-2200 x 2008
-- Donor: Anonymous
-- Date: 2019
Past Usage:
-- None
Relevant Information:
-- All CSV files contain headers describing the nature of the columns. For person.csv, the id is unique. For marks.csv and sentencing.csv, it is not.
-- Note that the CSV files use semicolons as delimiters and also end with a trailing semicolon. If this is unsuitable, edit the arr2csvR function in htmltocsv...
https://www.icpsr.umich.edu/web/ICPSR/studies/9571/terms
The United Nations began its World Crime Surveys in 1978. The first survey collected statistics on a small range of offenses and on the criminal justice process for the years 1970-1975. The second survey collected data on a wide range of offenses, offenders, and criminal justice process data for the years 1975-1980. Several factors make these two collections difficult to use in combination. Some 25 percent of those countries responding to the first survey did not respond to the second and, similarly, some 30 percent of those responding to the second survey did not respond to the first. In addition, many questions asked in the second survey were not asked in the first survey. This data collection represents the efforts of the investigators to combine, revise, and recheck the data of the first two surveys. The data are divided into two parts. Part 1 comprises all data on offenses and on some criminal justice personnel. Crime data are entered for 1970 through 1980. In most cases 1975 is entered twice, since both surveys collected data for this year. Part 2 includes data on offenders, prosecutions, convictions, and prisons. Data are entered for 1970 through 1980, for every even year.
Adult correctional services, custodial and community supervision, average counts of offenders in federal programs, Canada and regions, five years of data.
This study assessed the effects of male inmate religiosity on post-release community adjustment and investigated the circumstances under which these effects were most likely to take place. The researcher carried out this study by adding Federal Bureau of Investigation criminal history information to an existing database (Clear et al.) that studied the relationship between an inmate's religiousness and his adjustment to the correctional setting. Four types of information were used in this study. The first three types were obtained by the original research team and included an inmate values and religiousness instrument, a pre-release questionnaire, and a three-month post-release follow-up phone survey. The fourth type of information, official criminal history reports, was later added to the original dataset by the principal investigator for this study. The prisoner values survey collected information on what the respondent would do if a friend sold drugs from the cell or if inmates of his race attacked others. Respondents were also asked if they thought God was revealed in the scriptures, if they shared their faith with others, and if they took active part in religious services. Information collected from the pre-release questionnaire included whether the respondent attended group therapy, religious groups with whom he would live, types of treatment programs he would participate in after prison, employment plans, how often he would go to church, whether he would be angry more in prison or in the free world, and whether he would be more afraid of being attacked in prison or in the free world. 
Each inmate also described his criminal history and indicated whether he thought he was able to do things as well as most others, whether he was satisfied with himself on the whole or felt that he was a failure, whether religion was talked about in the home, how often he attended religious services, whether he had friends who were religious while growing up, whether he had friends who were religious while in prison, and how often he participated in religious inmate counseling, religious services, in-prison religious seminars, and community service projects. The three-month post-release follow-up phone survey collected information on whether the respondent was involved with a church group, if the respondent was working for pay, if the respondent and his household received public assistance, if he attended religious services since his release, with whom the respondent was living, and types of treatment programs attended. Official post-release criminal records include information on the offenses the respondent was arrested and incarcerated for, prior arrests and incarcerations, rearrests, outcomes of offenses of rearrests, follow-up period to first rearrest, prison adjustment indicator, self-esteem indicator, time served, and measurements of the respondent's level of religious belief and personal identity. Demographic variables include respondent's faith, race, marital status, education, age at first arrest and incarceration, and age at incarceration for rearrest.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 4 rows and is filtered to the book "Who is Nelson Mandela? : the prisoner who gave the world hope". It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.
The Freedom in the World 1972-2010 dataset, produced by a US based organisation, Freedom House, contains data on political rights and civil liberties for countries. Numerical ratings of between 1 and 7 are allocated to each country or territory, with 1 representing the most free and 7 the least free. The status designation of Free, Partly Free, or Not Free, which is determined by the combination of the political rights and civil liberties ratings, indicates the general state of freedom in a country or territory.
The total number of points awarded to the political rights and civil liberties checklists determines the political rights and civil liberties ratings for each country in the Freedom House dataset. Each point total corresponds to a rating of 1 through 7, with 1 representing the highest and 7 the lowest level of freedom. Each pair of political rights and civil liberties ratings is averaged to determine an overall status of "Free," "Partly Free," or "Not Free." Those whose ratings average 1.0 to 2.5 are considered Free, 3.0 to 5.0 Partly Free, and 5.5 to 7.0 Not Free. The designations of Free, Partly Free, and Not Free each cover a broad third of the available raw points. Therefore, countries and territories within any one category, especially those at either end of the category, can have quite different human rights situations. In order to see the distinctions within each category, a country or territory's political rights and civil liberties ratings should be examined. For example, countries at the lowest end of the Free category (2 in political rights and 3 in civil liberties, or 3 in political rights and 2 in civil liberties) differ from those at the upper end of the Free group (1 for both political rights and civil liberties). Also, a designation of Free does not mean that a country enjoys perfect freedom or lacks serious problems, only that it enjoys comparably more freedom than Partly Free or Not Free (or some other Free) countries.
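The averaging rule described above is simple enough to express directly. The following Python sketch is illustrative only; the function name is hypothetical, and the thresholds are the ones stated in the text (1.0 to 2.5 Free, 3.0 to 5.0 Partly Free, 5.5 to 7.0 Not Free).

```python
def freedom_status(political_rights, civil_liberties):
    """Map a pair of 1-7 ratings to the overall status designation."""
    for rating in (political_rights, civil_liberties):
        if not isinstance(rating, int) or not 1 <= rating <= 7:
            raise ValueError("ratings must be integers from 1 to 7")
    # Two integer ratings always average to a multiple of 0.5,
    # so no value can fall between the published bands.
    average = (political_rights + civil_liberties) / 2
    if average <= 2.5:
        return "Free"
    if average <= 5.0:
        return "Partly Free"
    return "Not Free"


print(freedom_status(2, 3))  # lowest end of the Free category
print(freedom_status(5, 6))
```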
General Characteristics of Each Political Rights and Civil Liberties Rating: Political Rights Rating of 1 -- Countries and territories that receive a rating of 1 for political rights come closest to ensuring the freedoms embodied in the checklist questions, beginning with free and fair elections. Those who are elected rule, there are competitive parties or other political groupings, and the opposition plays an important role and has actual power. Minority groups have reasonable self-government or can participate in the government through informal consensus. Rating of 2 -- Countries and territories rated 2 in political rights are less free than those rated 1. Such factors as political corruption, violence, political discrimination against minorities, and foreign or military influence on politics may be present and weaken the quality of freedom. Ratings of 3, 4, 5 -- The same conditions that undermine freedom in countries and territories with a rating of 2 may also weaken political rights in those with a rating of 3, 4, or 5. Other damaging elements can include civil war, heavy military involvement in politics, lingering royal power, unfair elections, and one-party dominance. However, states and territories in these categories may still enjoy some elements of political rights, including the freedom to organize quasi-political groups, reasonably free referendums, or other significant means of popular influence on government. Rating of 6 -- Countries and territories with political rights rated 6 have systems ruled by military juntas, one-party dictatorships, religious hierarchies, or autocrats. These regimes may allow only a minimal manifestation of political rights, such as some degree of representation or autonomy for minorities. A few states are traditional monarchies that mitigate their relative lack of political rights through the use of consultation with their subjects, tolerance of political discussion, and acceptance of public petitions. 
Rating of 7 -- For countries and territories with a rating of 7, political rights are absent or virtually nonexistent as a result of the extremely oppressive nature of the regime or severe oppression in combination with civil war. States and territories in this group may also be marked by extreme violence or warlord rule that dominates political power in the absence of an authoritative, functioning central government. Civil Liberties Rating of 1 -- Countries and territories that receive a rating of 1 come closest to ensuring the freedoms expressed in the civil liberties checklist, including freedom of expression, assembly, association, education, and religion. They are distinguished by an established and generally equitable system of rule of law. Countries and territories with this rating enjoy free economic activity and tend to strive for equality of opportunity. Rating of 2 -- States and territories with a rating of 2 have deficiencies in a few aspects of civil liberties, but are still relatively free. Ratings of 3, 4, 5 -- Countries and territories that have received a rating of 3, 4, or 5 range from those that are in at least partial compliance with virtually all checklist standards to those with a combination of high or medium scores for some questions and low or very low scores on other questions. The level of oppression increases at each successive rating level, including in the areas of censorship, political terror, and the prevention of free association. There are also many cases in which groups opposed to the state engage in political terror that undermines other freedoms. Therefore, a poor rating for a country is not necessarily a comment on the intentions of the government, but may reflect real restrictions on liberty caused by nongovernmental actors. 
Rating of 6 -- People in countries and territories with a rating of 6 experience severely restricted rights of expression and association, and there are almost always political prisoners and other manifestations of political terror. These countries may be characterized by a few partial rights, such as some religious and social freedoms, some highly restricted private business activity, and relatively free private discussion. Rating of 7 -- States and territories with a rating of 7 have virtually no freedom. An overwhelming and justified fear of repression characterizes these societies. Countries and territories generally have ratings in political rights and civil liberties that are within two ratings numbers of each other. Without a well-developed civil society, it is difficult, if not impossible, to have an atmosphere supportive of political rights. Consequently, there is no country in the survey with a rating of 6 or 7 for civil liberties and, at the same time, a rating of 1 or 2 for political rights.
The units of analysis in the survey are countries
Observation data/ratings [obs]
Other [oth]
Adult correctional services, custodial and community supervision, average counts of adults in provincial and territorial programs, five years of data.
https://www.icpsr.umich.edu/web/ICPSR/studies/34926/terms
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. The purpose of this evaluation was to determine the effectiveness of the global positioning system (GPS) monitoring of high-risk gang offenders (HRGOs) who are placed on parole. The study focuses on HRGOs who were released from prison and placed on parole supervision with GPS monitoring in six California jurisdictions between March 2006 and October 2009. A propensity score procedure was performed using a sample of offenders drawn from the same six communities who were not placed on GPS monitoring. The matching procedure resulted in a final sample of 784 subjects (392 treatment and 392 control). The study used six primary sources to collect data: 1) the California Department of Corrections and Rehabilitation (CDCR) data management system, 2) official arrest records, 3) parole supervision records, 4) GPS monitoring data, 5) a CDCR parole agent (PA) survey, and 6) CDCR cost information.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is an Official Statistics bulletin produced by statisticians in the Ministry of Justice, Home Office and the Office for National Statistics. It brings together, for the first time, a range of official statistics from across the crime and criminal justice system, providing an overview of sexual offending in England and Wales. The report is structured to highlight: the victim experience; the police role in recording and detecting the crimes; how the various criminal justice agencies deal with an offender once identified; and the criminal histories of sex offenders.
Providing such an overview presents a number of challenges, not least that the available information comes from different sources that do not necessarily cover the same period, the same people (victims or offenders) or the same offences. This is explained further in the report.
Based on aggregated data from the ‘Crime Survey for England and Wales’ in 2009/10, 2010/11 and 2011/12, on average, 2.5 per cent of females and 0.4 per cent of males said that they had been a victim of a sexual offence (including attempts) in the previous 12 months. This represents around 473,000 adults being victims of sexual offences (around 404,000 females and 72,000 males) on average per year. These experiences span the full spectrum of sexual offences, ranging from the most serious offences of rape and sexual assault, to other sexual offences like indecent exposure and unwanted touching. The vast majority of incidents reported by respondents to the survey fell into the other sexual offences category.
It is estimated that 0.5 per cent of females report being a victim of the most serious offences of rape or sexual assault by penetration in the previous 12 months, equivalent to around 85,000 victims on average per year. Among males, less than 0.1 per cent (around 12,000) report being a victim of the same types of offences in the previous 12 months.
Around one in twenty females (aged 16 to 59) reported being a victim of a most serious sexual offence since the age of 16. Extending this to include other sexual offences such as sexual threats, unwanted touching or indecent exposure, this increased to one in five females reporting being a victim since the age of 16.
Around 90 per cent of victims of the most serious sexual offences in the previous year knew the perpetrator, compared with less than half for other sexual offences.
Females who had reported being victims of the most serious sexual offences in the last year were asked, regarding the most recent incident, whether or not they had reported the incident to the police. Only 15 per cent of victims of such offences said that they had done so. Frequently cited reasons for not reporting the crime were that it was ‘embarrassing’, they ‘didn’t think the police could do much to help’, that the incident was ‘too trivial or not worth reporting’, or that they saw it as a ‘private/family matter and not police business’
In 2011/12, the police recorded a total of 53,700 sexual offences across England and Wales. The most serious sexual offences of ‘rape’ (16,000 offences) and ‘sexual assault’ (22,100 offences) accounted for 71 per cent of sexual offences recorded by the police. This differs markedly from victims responding to the CSEW in 2011/12, the majority of whom were reporting being victims of other sexual offences outside the most serious category.
This reflects the fact that victims are more likely to report the most serious sexual offences to the police and, as such, the police and broader criminal justice system (CJS) tend to deal largely with the most serious end of the spectrum of sexual offending. The majority of the other sexual crimes recorded by the police related to ‘exposure or voyeurism’ (7,000) and ‘sexual activity with minors’ (5,800).
Trends in recorded crime statistics can be influenced by whether victims feel able to and decide to report such offences to the police, and by changes in police recording practices. For example, while there was a 17 per cent decrease in recorded sexual offences between 2005/06 and 2008/09, there was a seven per cent increase between 2008/09 and 2010/11. The latter increase may in part be due to greater encouragement by the police to victims to come forward and improvements in police recording, rather than an increase in the level of victimisation.
After the initial recording of a crime, the police may later decide that no crime took place as more details about the case emerge. In 2011/12, there were 4,155 offences initially recorded as sexual offences that the police later decided were not crimes. There are strict guidelines that set out circumstances under which a crime report may be ‘no crimed’. The ‘no-crime’ rate for sexual offences (7.2 per cent) compare
The Marshall Project, the nonprofit investigative newsroom dedicated to the U.S. criminal justice system, has partnered with The Associated Press to compile data on the prevalence of COVID-19 infection in prisons across the country. The Associated Press is sharing this data as the most comprehensive current national source of COVID-19 outbreaks in state and federal prisons.
Lawyers, criminal justice reform advocates and families of the incarcerated have worried about what was happening in prisons across the nation as coronavirus began to take hold in the communities outside. Data collected by The Marshall Project and AP shows that hundreds of thousands of prisoners, workers, correctional officers and staff have caught the illness as prisons became the center of some of the country’s largest outbreaks. And thousands of people — most of them incarcerated — have died.
In December, as COVID-19 cases spiked across the U.S., the news organizations also shared cumulative rates of infection among prison populations, to better gauge the total effects of the pandemic on prison populations. The analysis found that by mid-December, one in five state and federal prisoners in the United States had tested positive for the coronavirus -- a rate more than four times higher than the general population.
This data, which is updated weekly, is an effort to track how those people have been affected and where the crisis has hit the hardest.
The data tracks the number of COVID-19 tests administered to people incarcerated in all state and federal prisons, as well as the staff in those facilities. It is collected on a weekly basis by Marshall Project and AP reporters who contact each prison agency directly and verify published figures with officials.
Each week, the reporters ask every prison agency for the total number of coronavirus tests administered to its staff members and prisoners, the cumulative number who tested positive among staff and prisoners, and the numbers of deaths for each group.
The time series data is aggregated to the system level; there is one record for each prison agency on each date of collection. Not all departments could provide data for the exact date requested, and the data indicates the date for the figures.
To estimate the rate of infection among prisoners, we collected population data for each prison system before the pandemic, roughly in mid-March, and again in April, June, July, August, September and October. Beginning the week of July 28, we updated all prisoner population numbers, reflecting the number of incarcerated adults in state or federal prisons. Prior to that, population figures may have included additional populations, such as prisoners housed in other facilities, which were not captured in our COVID-19 data. In states with unified prison and jail systems, we include both detainees awaiting trial and sentenced prisoners.
To estimate the rate of infection among prison employees, we collected staffing numbers for each system. Where current data was not publicly available, we acquired other numbers through our reporting, including by calling agencies or consulting state budget documents. In six states, we were unable to find recent staffing figures: Alaska, Hawaii, Kentucky, Maryland, Montana and Utah.
To calculate the cumulative COVID-19 impact on prisoner and prison worker populations, we aggregated prisoner and staff COVID case and death data up through Dec. 15. Because population snapshots do not account for movement in and out of prisons since March, and because many systems have significantly reduced the number of new people being sent to prison, it’s difficult to estimate the total number of people who have been held in a state system since March. To be conservative, we calculated our rates of infection using the largest prisoner population snapshots we had during this time period.
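The conservative choice described above can be sketched in a few lines of Python. This is only an illustration, not the newsrooms' actual code: the function name and the sample numbers are hypothetical. It shows that dividing cumulative cases by the largest available population snapshot yields the lowest possible rate.

```python
def conservative_infection_rate(cumulative_cases, population_snapshots):
    """Return cumulative cases per 100,000 prisoners, using the largest
    population snapshot as the denominator (the most conservative choice)."""
    # Skip months where no population figure was collected (None).
    known = [p for p in population_snapshots if p is not None]
    if not known:
        return None  # no population data, so no rate can be computed
    denominator = max(known)
    if denominator <= 0:
        return None
    return cumulative_cases * 100_000 / denominator


# Made-up numbers for illustration only (not taken from the dataset).
snapshots = [10_000, 9_500, None, 8_000]
rate = conservative_infection_rate(2_000, snapshots)
print(rate)  # 2,000 cases over the largest snapshot, 10,000 prisoners
```

Using a smaller snapshot, such as 8,000, would produce a higher rate from the same case count, which is why the largest snapshot is the cautious denominator.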
As with all COVID-19 data, our understanding of the spread and impact of the virus is limited by the availability of testing. Epidemiology and public health experts say that aside from a few states that have recently begun aggressively testing in prisons, it is likely that there are more cases of COVID-19 circulating undetected in facilities. Sixteen prison systems, including the Federal Bureau of Prisons, would not release information about how many prisoners they are testing.
Corrections departments in Indiana, Kansas, Montana, North Dakota and Wisconsin report coronavirus testing and case data for juvenile facilities; West Virginia reports figures for juvenile facilities and jails. For consistency of comparison with other state prison systems, we removed those facilities from our data that had been included prior to July 28. For these states we have also removed staff data. Similarly, Pennsylvania’s coronavirus data includes testing and cases for those who have been released on parole. We removed these tests and cases for prisoners from the data prior to July 28. The staff cases remain.
There are four tables in this data:
covid_prison_cases.csv
contains weekly time series data on tests, infections and deaths in prisons. The first dates in the table are on March 26. Any questions that a prison agency could not or would not answer are left blank.
prison_populations.csv
contains snapshots of the population of people incarcerated in each of these prison systems for whom data on COVID testing and cases are available. This varies by state and may not always be the entire number of people incarcerated in each system. In some states, it may include other populations, such as those on parole or held in state-run jails. This data is primarily for use in calculating rates of testing and infection, and we would not recommend using these numbers to compare the change in how many people are being held in each prison system.
staff_populations.csv
contains a one-time, recent snapshot of the headcount of workers for each prison agency, collected as close to April 15 as possible.
covid_prison_rates.csv
contains the rates of cases and deaths for prisoners. There is one row for every state and federal prison system and an additional row with the national totals.
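Because unanswered questions are left blank in covid_prison_cases.csv, it may help to read blanks as missing values rather than zeros. The following is a minimal sketch using only Python's standard library. The column names and the sample rows are assumptions made up for illustration and may not match the real file's headers.

```python
import csv
import io


def parse_count(value):
    """Convert a CSV cell to int, treating blank cells as missing (None)."""
    value = value.strip()
    return int(value) if value else None


# Tiny stand-in for covid_prison_cases.csv. The column names and values
# here are hypothetical, not copied from the real file.
sample = io.StringIO(
    "name,as_of_date,total_prisoner_cases,total_prisoner_deaths\n"
    "State A,2020-03-26,3,\n"
    "State A,2020-04-02,10,1\n"
)

rows = []
for record in csv.DictReader(sample):
    rows.append({
        "name": record["name"],
        "as_of_date": record["as_of_date"],
        "cases": parse_count(record["total_prisoner_cases"]),
        "deaths": parse_count(record["total_prisoner_deaths"]),
    })

# A blank answer stays None instead of silently becoming 0.
print(rows[0]["deaths"])  # None
```

Keeping blanks as None makes it harder to mistake "the agency did not answer" for "the agency reported zero" when summing or computing rates.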
The Associated Press and The Marshall Project have created several queries to help you use this data:
Get your state's prison COVID data: Provides each week's data from just your state and calculates a cases-per-100,000-prisoners rate, a deaths-per-100,000-prisoners rate, a cases-per-100,000-workers rate and a deaths-per-100,000-workers rate.
Rank all systems' most recent data by cases per 100,000 prisoners.
Find what percentage of your state's total cases and deaths -- as reported by Johns Hopkins University -- occurred within the prison system.
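For readers working outside the hosted queries, the calculations they describe could be approximated roughly as follows. This is a hedged sketch, not the queries themselves: the field names (`name`, `as_of_date`, `cases`), the helper functions and all numbers are hypothetical.

```python
def per_100k(count, population):
    """Rate per 100,000 people; None if either input is missing or zero."""
    if count is None or not population:
        return None
    return count * 100_000 / population


def percent_of_total(prison_count, statewide_count):
    """Share of a statewide total that occurred in the prison system."""
    if prison_count is None or not statewide_count:
        return None
    return prison_count * 100 / statewide_count


def rank_latest(records, populations):
    """Rank systems by cases per 100,000 prisoners, using each system's
    most recent record. ISO dates (YYYY-MM-DD) sort correctly as strings."""
    latest = {}
    for rec in records:
        name = rec["name"]
        if name not in latest or rec["as_of_date"] > latest[name]["as_of_date"]:
            latest[name] = rec
    ranked = []
    for name, rec in latest.items():
        rate = per_100k(rec["cases"], populations.get(name))
        if rate is not None:  # skip systems with blank case counts
            ranked.append((name, rate))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Made-up records for illustration only.
records = [
    {"name": "State A", "as_of_date": "2020-06-02", "cases": 50},
    {"name": "State A", "as_of_date": "2020-06-09", "cases": 80},
    {"name": "State B", "as_of_date": "2020-06-09", "cases": 30},
    {"name": "State C", "as_of_date": "2020-06-09", "cases": None},
]
populations = {"State A": 40_000, "State B": 10_000, "State C": 5_000}

ranking = rank_latest(records, populations)
print(ranking)
print(percent_of_total(80, 1_600))
```

In this example State B ranks above State A despite having fewer cases, because its prisoner population is smaller. State C is left out because its case count is blank.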
In stories, attribute this data to: “According to an analysis of state prison cases by The Marshall Project, a nonprofit investigative newsroom dedicated to the U.S. criminal justice system, and The Associated Press.”
Many reporters and editors at The Marshall Project and The Associated Press contributed to this data, including: Katie Park, Tom Meagher, Weihua Li, Gabe Isman, Cary Aspinwall, Keri Blakinger, Jake Bleiberg, Andrew R. Calderón, Maurice Chammah, Andrew DeMillo, Eli Hager, Jamiles Lartey, Claudia Lauer, Nicole Lewis, Humera Lodhi, Colleen Long, Joseph Neff, Michelle Pitcher, Alysia Santo, Beth Schwartzapfel, Damini Sharma, Colleen Slevin, Christie Thompson, Abbie VanSickle, Adria Watson, Andrew Welsh-Huggins.
If you have questions about the data, please email The Marshall Project at info+covidtracker@themarshallproject.org or file a Github issue.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.