Open your government: the importance of data understanding

In an age when the public interpretation of a cell phone video can make the difference between a cop shooting a criminal and a cop murdering a man and planting evidence, people care more than ever about government transparency. From the widely shared stories of individuals like Tamir Rice and Eric Garner to dedicated content sections on major news sites like Huffington Post [11], the public is refusing to keep misconduct quiet. One group that has made policy transparency its mission is Black Lives Matter (BLM). In late January of this year, BLM activist DeRay Mckesson and his colleagues at Campaign Zero expanded their already ambitious project to improve policing policy in urban areas across the nation [1]. The Police Use of Force Project is “the first open-source database of police use of force policies for the 100 largest U.S. city police departments” [2]. However, news of the project didn’t travel far outside the BLM movement. Its roots and goals are complicated enough, and introducing this data to a world that often isn’t familiar with the idea of “open source” makes them even more so. But regardless of how you feel about the tactics and work of BLM, paying attention to government data is important: it can be used to track government performance, understand population demographics, find out where federal and state money is going, and more. To understand and request data is to keep the government accountable.

I especially want to stress data understanding. Simply having data available is important, but unless we have tools we can use to analyze it fully and correctly, it’s not going to do much for us. Consider the water crisis in Flint, Michigan. To save money, the city’s government changed the source of its water from the Detroit system to the Flint River in April 2014, and now we’re seeing the results. Many Flint children are showing increased amounts of lead in their blood, which, according to the Mayo Clinic, can cause developmental delays and learning difficulties [12], and the increase is thought to be a direct result of the change in water source. A reporter at the Detroit Free Press spoke to Dr. Eden Wells, who said, “‘If I knew then what I know now … We needed a much more robust analysis…We should have torn it up and apart and gone up and down with that (data)’” [10]. Dr. Wells’s words make it clear that better data understanding could have led to earlier recognition of the effects of the switch to the Flint River, and that understanding starts with citizens.

According to an April 2015 study done by the Pew Research Center, “Minorities [less than a majority, not necessarily racial minorities] of Americans say they pay a lot of attention to how governments share data with the public and relatively few say they are aware of examples where government has done a good (or bad) job sharing data. Less than one quarter use government data to monitor how government performs in several different domains” [3]. Because only a minority of the population is using the data that we’re given, we haven’t yet reached the critical mass that will demand greater accountability from the government, and those paying attention to data right now are largely experts in statistics. Data analysis is seen as a practice that is too hard for the average person, but, as I’ll discuss in a later post, this isn’t the case at all. For now, just understand that, as citizens, we’re not paying attention to the information that we’re being given (or not being given).

It’s time we start.

But before we get into any more specific examples, let’s go over some key ideas in the open data movement. Below is a list of key terms that will help you understand more fully what open data means:

Key terms:

  • open data: data that anyone can freely access, use, modify, and share for any purpose [4]
  • data analytics: using data to find patterns and trends in order to make better decisions and evaluate performance [5]
  • open government: a government that is highly transparent and focuses on accountability to the people [6]

If I could sum up all of these terms in three words, they would be transparency, communication, and access. That’s what the BLM Police Use of Force Project is about, and that’s what open government data can be about, too.

Fig. 1: The districts where libraries are present and absent in Indiana. (

As it stands right now, most of the government data that’s been released to the public falls into categories such as agriculture, energy, population, legislation, business, and poverty/welfare. These topics are general, but they encompass much of the information that a person wondering about the government’s actions would want to know. For instance, if I wanted to find out where all of the libraries in my home state of Indiana were, I could go to and search for “libraries.” Fig. 1 is what I would see.

Before looking at the open data, I had no idea that there were areas of the state that didn’t have libraries. This type of information could be useful if I were part of a nonprofit that focused on literacy — I would now know areas where there may be accessibility problems.

As another example, let’s say that I’m a Hispanic mother of two who’s thinking about moving to Indiana. I want my kids to have a strong community of Hispanic peers, so I could jump to the website of the National Center for Education Statistics and see which areas would fit that criterion. Fig. 2 is what I would see. Though there isn’t a school district in Indiana where Hispanic/Latino is the largest population demographic, the Indianapolis Public Schools district has the largest Hispanic/Latino student population [7]. Racial makeup isn’t the only factor I would want to consider when making a decision on schools, but it’s an important piece of the puzzle.

Fig. 2: A slice of the ethnic makeup of several Indiana school districts, sorted from greatest to least population of Hispanic or Latino students. (National Center for Education Statistics)
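For readers who would rather script this than click through a website, here is a minimal Python sketch of the same idea: ordering districts from greatest to least. The district names, counts, and column names are all invented placeholders, not figures from the National Center for Education Statistics. The sketch also shows why it helps to check both the raw count and the share of students, since the two can rank districts differently.

```python
# Placeholder rows: names and numbers are invented for illustration only,
# not taken from the National Center for Education Statistics.
districts = [
    {"district": "District A", "hispanic_students": 120, "total_students": 1000},
    {"district": "District B", "hispanic_students": 450, "total_students": 3000},
    {"district": "District C", "hispanic_students": 300, "total_students": 1200},
]


def sort_by(rows, column):
    """Return the rows ordered from greatest to least value in one column."""
    return sorted(rows, key=lambda row: row[column], reverse=True)


# Rank by the raw number of Hispanic or Latino students.
by_count = sort_by(districts, "hispanic_students")

# Add each district's share of Hispanic or Latino students, then rank by that.
for row in districts:
    row["hispanic_share"] = row["hispanic_students"] / row["total_students"]
by_share = sort_by(districts, "hispanic_share")

print("By count:", [row["district"] for row in by_count])
print("By share:", [row["district"] for row in by_share])
```

In this made-up example the largest district tops the list by count, while a smaller district tops it by share. Which ranking matters depends on the question you are asking.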

Even though all of this data is available, Indiana doesn’t offer police data as a searchable topic, as you can see in Fig. 3. This doesn’t seem likely to change any time soon: a bill currently going through the Indiana House of Representatives would keep the majority of police body camera footage private unless the public were to argue for its release [13]. Until the public has enough data understanding to make that argument, projects like Campaign Zero’s Police Use of Force Project can fill the gap.

Fig. 3: Topics of statistics that are searchable for the state of Indiana. (

The project’s website describes it as an effort to find out “how police use of force policies help to enable police violence” [8]. Its organizers are using the data they have available to monitor the ways in which specific policy language in police departments can sometimes lead to physical altercations, specifically, in the case of BLM and Campaign Zero, against people of color.

Fig. 4: A graphic showing the cities that comply with certain aspects of Campaign Zero. (

The campaign itself is a great example of everyday citizens gathering data to make police departments more accessible. Even better, if you scroll down past the chart featured in Fig. 4, you’ll see that you can download the original source data used to create the graphics on the site. The data includes links to the original messages from Chiefs of Police and other individuals within the departments. As Campaign Zero exemplifies, it’s not enough to just release the data; to be truly accountable, you have to source it as well.

So this is all well and good: we have organizations working to make the government more accountable, and cities such as Washington, D.C., are updating their open data policies in response to pushback [9]. But how can you look at the data yourself? I’ve told you how you can find the data on government websites, but in terms of actual analysis, what can be done?

Well, a lot. My favorite program to use for data analysis is Microsoft Excel, although if you aren’t working on a computer with Microsoft Office, Google Sheets, part of Google Drive (a free, cloud-based service), will work just as well. If there’s another spreadsheet-based program or website that you think is great, let us know in the comments.

I’ll be talking more specifically about different data analysis techniques in a later post, but for now, here’s your task:

Go to or another open data site and find a report that’s interesting to you. That could be the list of abandoned wells in Colorado, a report on the number of kids in kindergarten in California, or anything else. All that matters is that it’s data and that you want to learn more about it. Once you’ve found it, download the original data as an Excel file and open it in Google Drive, Excel, or another program. I encourage you to go and look up some techniques for data analytics, but if you want to wait until my next post on this, just look around at the data. See what kinds of information were collected, and look for any big trends you can spot. You might find yourself frustrated that you can’t see all the data, or that you can’t learn everything about everything by looking at it. That’s why paying attention to data is so important: hopefully, the more we demand, the more we’ll be given.
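If you’re comfortable with a little code, the same “look around” step can be done in a few lines of Python. This is only a sketch under some assumptions: it expects the data as CSV text (most spreadsheet programs can save a sheet as CSV), and the sample rows and column names below are invented placeholders, not taken from any real report. For each column, it counts how many cells are filled in and, where the values are numbers, reports the smallest and largest.

```python
import csv
import io

# Placeholder data standing in for a downloaded file. These rows are invented
# for illustration and do not come from any real government report.
SAMPLE_CSV = """county,wells,year_surveyed
Adams,12,2014
Baker,,2015
Clark,7,2015
"""


def summarize(rows):
    """For each column, count filled-in cells and find the numeric range, if any."""
    collected = {}
    for row in rows:
        for column, value in row.items():
            info = collected.setdefault(column, {"filled": 0, "numbers": []})
            if value is None or value.strip() == "":
                continue  # blank cell: treat as missing
            info["filled"] += 1
            try:
                info["numbers"].append(float(value))
            except ValueError:
                pass  # not a number (for example, a name)
    result = {}
    for column, info in collected.items():
        numbers = info["numbers"]
        result[column] = {
            "filled": info["filled"],
            "min": min(numbers) if numbers else None,
            "max": max(numbers) if numbers else None,
        }
    return result


# Read the sample text as if it were a file, using the first line as headers.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
report = summarize(rows)

print("Rows:", len(rows))
for column, stats in report.items():
    print(column, stats)
```

To try it on your own download, replace `io.StringIO(SAMPLE_CSV)` with a file opened via `open("your_file.csv", newline="")`, where `your_file.csv` is a stand-in for whatever you saved. A column with fewer filled cells than there are rows is a good first hint about what the data leaves out.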

Above all, learn something new. I’m looking forward to hearing about what you find!