Pivot Tables and the Students Who (Should) Love Them
Note from the editor: Allen Arthur holds an M.A. in Social Journalism ('16) from the CUNY Graduate School of Journalism. He spent his year at the School learning from formerly incarcerated people, starting a Medium publication called “Greylined” and selling an investigative story to the Marshall Project in the process. He left hoping to continue work on his culminating project, a zine built with formerly incarcerated people to help those currently in prison prepare for reentry. You can see his final M.A. presentation here.
Let’s start in Tennessee.
Tennessee is one of a few states that have a statute regarding “safekeeping.” This statute allows the transfer of pretrial county jail detainees to state prisons if the local facilities are “insufficient” for their care. The idea is a noble one: if someone is schizophrenic, let’s say, an underfunded, overcrowded rural county jail probably isn’t the best place for them. However, the actual practice hasn’t quite worked out like that. Someone knowledgeable contacted me about problems with the mental health services these people actually receive once they are transferred and the conditions under which they live.
I don’t want to share too much detail because the project is still in progress. I can say, however, that one of my steps was to get a list of every safekeeper in Tennessee since 2011. I got a list of well over 200 names with their county of origin, date of safekeeping designation, and the facility they were sent to. In this case, all of those things are big variables with big questions.
Before I could answer those questions, I had to stare down an Excel sheet. No matter how meticulously I clean it, that sheet is still going to make me anxious. Still, I had lunch with it and started asking it questions: Are men or women designated as safekeepers more frequently? Are some counties sending all or most of the people? How many people go to which facilities? Are there years with more or fewer safekeepers? Did some counties vary wildly year to year?
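Most of those questions come down to counting rows by a single column. Here is a minimal Python sketch of that idea, assuming the list has been exported as rows with `county`, `facility`, and `year` fields. The field names and every value below are invented placeholders, not the actual Tennessee data, and `tally` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical rows: field names and values are placeholders,
# not records from the real safekeeper list.
rows = [
    {"county": "County A", "facility": "Facility X", "year": 2011},
    {"county": "County A", "facility": "Facility X", "year": 2012},
    {"county": "County B", "facility": "Facility Y", "year": 2012},
    {"county": "County A", "facility": "Facility Y", "year": 2013},
]


def tally(rows, field):
    """Count how many rows share each value of one field."""
    return Counter(row[field] for row in rows)


by_county = tally(rows, "county")      # which counties send the most people?
by_facility = tally(rows, "facility")  # how many people go to which facility?
by_year = tally(rows, "year")          # are some years busier than others?

# List counties from most to fewest people sent.
for county, count in by_county.most_common():
    print(county, count)
```

Each tally is the scripted equivalent of sorting a column and reading off the subtotals in a spreadsheet.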
In the Social Journalism program at CUNY J-School, we were treated to two data classes. One was with Terry Parris Jr. of ProPublica, and it focused mainly on using data to measure the success of what we were doing online. The second was with Miguel Paz, steward of this publication, and it focused mainly on how data can be used to tell a story. In each class, we cleaned, sorted, and dug into messy Magic Eye Excel sheets, focusing and blurring until the abstract became tangible through numbers. Three months after graduation, I am now working on this safekeeping story for a major publication, and I’ve got to say: these classes are indispensable.
Thanks to Terry and Miguel, I had the skills to start getting answers. I used Tabula to pull Excel tables out of a dreary PDF. I sorted. I filtered. I got subtotals. I made graphs. Though the words might make my classmates shiver, I even made pivot tables. Sometimes I found interesting patterns. Sometimes I found nothing. Either way, the answer directed my work. If a question I had could be answered no — has the ratio of people sent to a particular facility changed year to year? Not really — I could pull back from that angle. If the answer was yes — are some counties sending disproportionate numbers of people to the state? Indeed they are — I could then ask the biggest question journalists can ask: Why?
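For anyone who would rather script it than click through Excel, a pivot table is just a grid of counts: one row per county, one column per year. The sketch below builds that grid with Python's standard library and then computes each county's share of the total, which is one way to check whether some counties send a disproportionate number of people. The data is made up for illustration, and `pivot_counts` and `county_shares` are hypothetical names, not part of any tool mentioned here.

```python
from collections import defaultdict

# Hypothetical (county, year) pairs; placeholders, not the real list.
rows = [
    ("County A", 2011),
    ("County A", 2011),
    ("County A", 2012),
    ("County B", 2012),
    ("County C", 2011),
]


def pivot_counts(rows):
    """Build a county-by-year grid of counts, like a spreadsheet pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for county, year in rows:
        table[county][year] += 1
    return table


def county_shares(table):
    """Return each county's fraction of all rows in the table."""
    totals = {county: sum(years.values()) for county, years in table.items()}
    grand_total = sum(totals.values())
    return {county: total / grand_total for county, total in totals.items()}


table = pivot_counts(rows)
shares = county_shares(table)
years = sorted({year for _, year in rows})

# Print the grid: one row per county, one column per year, plus the share.
print("county", *years, "share", sep="\t")
for county in sorted(table):
    counts = [table[county][year] for year in years]
    print(county, *counts, f"{shares[county]:.0%}", sep="\t")
```

Reading across a row shows whether a county varies year to year. Reading down the share column shows which counties dominate the list.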
This data analysis helped me formulate questions for the Department of Corrections. It helped me narrow my inquiry down to a few counties instead of looking all over Tennessee. By doing this, I also had the background needed to talk to public defenders, sheriffs, judges, and district attorneys in those counties. And this was not highly complex methodology. I just asked the data the questions I was curious about and, sensical or not, I built out the results. Sure, three out of four didn’t lead to much, but the gems I pulled out confirmed, overturned, or inspired hypotheses for this investigation.
So, students of Miguel and Terry, you might think those tables and columns and delimiters are computer nerdery that only emerges for mathy stories about climate change or budget expenditures. Then, one day, vulnerable people are in a dicey situation in a state with 95 counties and you have a messy list. Blur your eyes. There are lives in there.