Photo of police at the Pittsburgh G20 summit by Cory Cousins

Why automated journalism will save lives

Scott Ogle
Sep 9, 2014

The case for a less human process

“You know who dies in the most population-dense areas? Black men. You know who dies in the least population dense areas? Mentally ill men. It’s not to say there aren’t dangerous and desperate criminals killed across the line. But African-Americans and the mentally ill people make up a huge percentage of people killed by police.”

How might the discourse surrounding police brutality differ if we had access to comprehensive historical data and statistics, instead of just perceptions?

Several weeks ago, D. Brian Burghart, editor of the Reno News & Review, wrote about his two-year-old effort to compile a database of fatal encounters with police. He set out in response to the complete absence of comprehensive public data about killings by police in the United States, a gap he believes to be intentional.

It’s sadly easy to see why governments might hide this information: there’s little incentive to collect and report it beyond a 1994 law that has been almost completely ignored (a single report was issued in 2001).

Burghart argues, however, that journalists have played a role in this problem too, by failing to report on police use of force and by letting that use of force, especially against minorities, come to seem normal. One of journalism’s key functions is agenda setting: shaping the public conversation. Just because the problem has existed for a while doesn’t make it any less of a problem, and it’s journalists who are positioned to remind us that these things are still happening, not somewhere else but down the street, in our own neighborhoods. Sadly, this doesn’t happen nearly often enough.

There have been a number of high-profile cases around the country where this problem has come to light, with Albuquerque, NM and Ferguson, MO the obvious examples, but there doesn’t seem to have been any change in the way these areas are policed even once the use of force is in the spotlight. Albuquerque has been in the news for use of force more than once: first in 2011, when the Police Executive Research Forum issued a study on the department’s use of force, and again this year, when the Albuquerque Police Department shot and killed a homeless man for illegal camping. The Department of Justice issued a report after the incident stating that the APD’s use of excessive force is “a pattern…not isolated or sporadic.”

What’s shocking about Albuquerque is that after each incident the police continued to use excessive force, sometimes within the same week. The status quo of little oversight and non-existent data is simply not working. But this raises another question: why did we have to wait for a government report to learn this? I’m confident that many of these findings would have all but jumped out of the data.

The root of the problem is that these incidents are not well documented by police departments themselves, but matters are compounded by a lack of good reporting on the issue. Nearly every daily paper has a crime beat reporter, but few seem to be tracking statistics and fewer still track them in any centralized repository.


Here’s where I think part of the problem lies: the current system of police reporting relies too heavily on human effort, on crime reporters obtaining police reports and deciding what’s newsworthy. I think the real story, the trends and aggregate statistics, is hidden in the individual reports and just isn’t visible when all we get are isolated stories about incidents. There’s a darker side to this as well: crime reporters often can’t say everything they want to for fear of damaging their professional relationships with police departments.

What’s more, I think that a lot of bias and racism can hide in decisions about what is newsworthy. But what if we had software doing the grunt work of this job instead of reporters and editors? The hard truths of the American law enforcement system might be more obvious.

One change in the legal process could enable this on a large scale — a requirement that police departments release daily briefs in a machine-readable format.

The same reasoning applies to all kinds of local data, from traffic data to road maintenance spending to community events. The data is already there; it just needs to be machine-readable.
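To make this concrete, here’s a minimal sketch of what consuming such a daily brief might look like. The JSON schema is entirely invented for illustration — no real department publishes these field names — but it shows how trivially software could filter and tally incidents once the data is machine-readable:

```python
import json

# A hypothetical daily police brief. The schema is invented for
# illustration; the point is only that it is structured and parseable.
DAILY_BRIEF = """
[
  {"case_id": "2014-0901", "offense": "traffic violation", "force_used": false},
  {"case_id": "2014-0902", "offense": "trespassing", "force_used": true,
   "force_type": "taser"},
  {"case_id": "2014-0903", "offense": "armed robbery", "force_used": true,
   "force_type": "firearm"}
]
"""

def force_incidents(brief_json):
    """Return only the incidents in which officers used force."""
    return [r for r in json.loads(brief_json) if r.get("force_used")]

for r in force_incidents(DAILY_BRIEF):
    print(r["case_id"], r["force_type"])
```

A script like this, run daily against every department’s feed, is all it would take to build the centralized repository that crime reporters currently lack.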

I think it’s important that this change happen from the bottom up—at the level of local government and police—for a number of reasons. One is that raw, local data is simply more transparent. But more importantly, local municipalities can simply act faster and get more done than state and national governments can.

The specifics of the format don’t matter; what matters is that software can ingest the data and do much of the legwork of reporting automatically, without human intervention. At a minimum, the software should log and track the data; beyond that, there’s no reason at least a preliminary story couldn’t be generated for any offense more serious than, say, a traffic violation.
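Generating that preliminary story is the easy part. The standard technique is template filling: slot structured fields into boilerplate prose. The sketch below uses invented field names and numbers, but the mechanism is the same one behind real earthquake and sports bots:

```python
# A minimal sketch of template-based story generation. The record
# fields and counts below are hypothetical.
STORY_TEMPLATE = (
    "Police responded to a report of {offense} near {location} on {date}. "
    "Officers used {force_type} force during the arrest. "
    "This is the {nth} use-of-force incident logged in {location} this year."
)

def ordinal(n):
    """Turn 3 into '3rd', 11 into '11th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def draft_story(record, prior_count):
    """Fill the template from a structured incident record."""
    return STORY_TEMPLATE.format(nth=ordinal(prior_count + 1), **record)

record = {"offense": "trespassing", "location": "downtown Albuquerque",
          "date": "March 16", "force_type": "lethal"}
print(draft_story(record, prior_count=22))
```

Notice that the template surfaces the running count automatically — exactly the aggregate context that gets lost when each incident is written up as a one-off story.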

This kind of robot journalism is already becoming commonplace on other, more mundane beats. In L.A., a story is published automatically every time there’s an earthquake above a given magnitude. The same has been done successfully with business reporting and local sports, even fantasy sports. Kevin Roose wrote an article for New York Magazine about this phenomenon, arguing emphatically that robot journalism is a good thing for journalists and readers alike, and I place myself squarely in the pro-robot (probot?) camp as well.

However, Roose barely touches on a deeper reason that robot reporting will benefit us all. Journalism has only ever incidentally been about the hard work of fact gathering; the point of it all has always been the distinctly human labor of sense-making and storytelling, and of facilitation: of the flow of information and of the public conversation. The more we can offload the mechanical parts of journalism onto machines, the more journalists can focus on the interesting, truly creative work, and the more we can eliminate the human bias that keeps some of these stories from surfacing.

In short, we ought to be using software to amplify journalists’ ability to effect social change.

All the attempts at automation in journalism so far have centered on converting structured data into small, human-readable reports. The next step is to think about ways to automate the gathering and synthesis of information for deeper storytelling.

I suspect that there’s a whole class of issues that could, at least in part, be covered by machines, and I suspect that we’d all be better off if they were. Issues of crime, government spending and the environment all come to mind. It’s not unimaginable to have software scraping public spending records for patterns indicative of corruption, or pulling publicly available oil spill reports and tracking which companies are the worst offenders. None of this infringes on any meaningful human activity, only the grunt work that enables it.
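The spending example doesn’t require anything exotic. Even a crude statistical screen — here, flagging any payment several times larger than a vendor’s typical invoice — would surface leads for a human reporter to chase. The vendors and dollar amounts below are invented; a real system would scrape actual municipal records:

```python
import statistics

# Toy public spending records: (vendor, payment amount). Invented data.
payments = [
    ("Acme Paving", 12000), ("Acme Paving", 11500), ("Acme Paving", 98000),
    ("City Lumber", 4300), ("City Lumber", 4100), ("City Lumber", 4500),
]

def flag_outliers(records, factor=3.0):
    """Flag payments more than `factor` times the vendor's median payment."""
    by_vendor = {}
    for vendor, amount in records:
        by_vendor.setdefault(vendor, []).append(amount)
    flagged = []
    for vendor, amount in records:
        if amount > factor * statistics.median(by_vendor[vendor]):
            flagged.append((vendor, amount))
    return flagged

print(flag_outliers(payments))  # the $98,000 Acme Paving payment stands out
```

A flagged payment isn’t a story, of course — it’s a tip. The machine does the sifting; the reporter does the reporting.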

A lot of what I’m suggesting is technologically feasible today, given enough resources. However, I don’t think it will truly enter the mainstream until far more data is released in machine-readable formats. PDF reports ought to be considered insufficient, and thankfully, increasingly they are.

This brings us back to the original question: how might the discussion be different if we had comprehensive data about police use of force? I don’t think it’s hyperbole to say that, at least indirectly, data could and would save lives. If the facts of police brutality had been staring us in the face all along, I’d like to think we would have been forced to have this conversation long ago, and a continuous stream of new data would force us to keep having it until police stop killing so many black men and mentally ill people.

The beauty of an approach that focuses on machine-readable data at the lowest levels of government, I think, is that it will most enable change to happen from the ground up, within communities.

It’s time to let the machines take over; we’ll all be better off for it.



Scott Ogle

Software Engineer at Pulselocker. Former journalism student, news hacker, internet foolosopher.