Written by C. Andrew Warren
The week of March 15, 2020 was a tough one for the US criminal justice system. On Tuesday the 17th, no Covid-19 cases had been detected in US prisons. By Wednesday, prison staff had tested positive in both Alabama and Michigan. Thursday brought an additional staff member in Wisconsin, and the first case discovered in an ICE detention center. By the end of that week, cases had been found in prisons from Massachusetts to Georgia, from California to the federal prison system.
In the course of our normal work at Recidiviz, we try to help criminal justice agencies use data to answer three questions:
- Where are we now?
- How did we get here?
- Where are we heading?
The world is very different now than it was six weeks ago, but these questions are more important than ever. Criminal justice leaders and staff are making the most important decisions of their careers, and without complete information.
The question we’ve focused the most on recently at Recidiviz is #2 — how did we get here? What happened between the week of March 15th and today?
This is an urgent question. States can’t understand what they’re seeing today, or what might be coming, without seeing the trajectory of the past few weeks in each of their facilities. And as a broader criminal justice community, we can’t work out what has helped or hurt outcomes for staff and incarcerated people in time for the next wave of Covid-19 without that same history.
Unfortunately, that information is hard to come by.
The disappearing past
As we’ve hunted down jail and prison outbreak data to power our tools for practitioners, we’ve discovered a large and growing set of problems in Covid-19 criminal justice data.
First, many jails and some prison systems aren’t publishing case numbers. Those that do often report different metrics from one another — there’s no standard set of measures being reported out.
Second, many systems update case numbers in the same place each day, overwriting the previous day’s figures and unintentionally erasing the history of how the outbreak developed. And many states that do leave earlier numbers up later go back and revise them, which invalidates some datasets collected at the time.
Our own research into just ten facilities required poring over the Michigan and Bureau of Prisons websites on the Internet Archive, interrogating Google’s web cache of press releases for the state of Vermont, and guessing likely file URLs for Ohio’s Covid-19 update PDFs.
All of this makes for messy and missing data, and makes it nearly impossible to form a complete picture of how the Covid outbreak has spread throughout the justice system.
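To make the overwrite and revision problems concrete, here is a minimal Python sketch of an append-only log of published counts. It is an illustration only, not a description of how any agency or any of the projects mentioned here stores data. The function names, the field names, and the "Facility A" figures are all hypothetical.

```python
from datetime import date

def record_snapshot(history, facility, report_date, cases, observed_on):
    """Append one observation; earlier observations are never overwritten."""
    history.append({
        "facility": facility,
        "report_date": report_date,   # the day the count refers to
        "cases": cases,
        "observed_on": observed_on,   # the day we saw it published
    })

def current_series(history, facility):
    """Most recently observed count for each report date (revisions win)."""
    latest = {}
    for row in sorted(history, key=lambda r: r["observed_on"]):
        if row["facility"] == facility:
            latest[row["report_date"]] = row["cases"]
    return sorted(latest.items())

def revised_dates(history, facility):
    """Report dates whose published count changed after it was first seen."""
    seen = {}
    for row in history:
        if row["facility"] == facility:
            seen.setdefault(row["report_date"], set()).add(row["cases"])
    return sorted(d for d, values in seen.items() if len(values) > 1)

# Made-up example data for a hypothetical facility.
history = []
record_snapshot(history, "Facility A", date(2020, 4, 1), 3, date(2020, 4, 1))
record_snapshot(history, "Facility A", date(2020, 4, 2), 5, date(2020, 4, 2))
# Two days later the April 1 figure is revised upward.
record_snapshot(history, "Facility A", date(2020, 4, 1), 4, date(2020, 4, 3))
```

Because every observation is kept alongside the date it was seen, the daily trajectory survives a page being overwritten, and a later revision shows up as a second row for the same report date instead of silently replacing the first.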
It takes a village
Fortunately, a constellation of groups in the criminal justice space have been putting together data on Covid-19 in correctional facilities since the start of the outbreak.
The team behind CovidPrisonData.com, for example, has built out web scraping infrastructure for daily data collection from over 30 state prison systems. The Covid-19 Behind Bars Data Project, from UCLA Law, has been aggregating case and death data from facilities around the country for much of the last month.
The Prison Policy Initiative and The Justice Collaborative have both been pulling together data around criminal justice policy shifts made by counties and states, and the Marshall Project has been collecting information on both policy shifts and state-level cases in prisons. Meanwhile, the Vera Institute has been collecting information on the shifts in jail populations.
Each of these public datasets adds a significant piece to the puzzle of how the outbreak has played out. As we’ve used some of these different datasets to validate our internal tools, we’ve also rounded them out with a combination of manual data collection and historical checks for several of the areas where we found gaps.
We’re now publishing the full aggregated dataset we’ve built on prison and jail cases back to the community, along with a list of the organizations who have done the heavy lifting. It’s been an impressive example of how this community has pulled together in the face of a difficult, chaotic, and unprecedented time.
We’ll continue to update this aggregate dataset with public data as it becomes available, and to add additional sources that make sense. We hope it will be helpful — for states running their own research efforts, for policymakers trying to decide what’s worked and what hasn’t, and for other non-profits and researchers who are trying to help make the world a better place.
Any other group is more than welcome to wrap this data up with their own. And please let us know if you know of data that might be useful to incorporate by reaching out to firstname.lastname@example.org.
Note: The dataset being released includes public data made available on state and county websites and collected by volunteers. It does not include internal data from Recidiviz or our users.