How will citizens obtain election news / educate themselves about politics in the future?


I started thinking seriously about working on real problems like the one above, to see if I could have an impact on how the US does democracy. The following post was written about a year and a half ago, when I first got inspired by the idea.

I wasn’t 100% sure what shape this project would take, but I knew I wanted to start looking into / researching this space. Not as an academic, but as a concerned citizen recognizing that something was broken or starting to look broken in our country. I didn’t know exactly what I was seeing, but I was at least able to set a direction forward for myself.

Voter education, I decided, seemed more than ever like a problem worth exploring. Not only in the US, but around the world. Democracy assumes that the majority of people make good decisions about the people they place in government, and the issues they vote on. And it follows that if we’re going to entrust the citizens of our country to make good decisions, we need to make sure the systems are in place to make it as easy as possible for them to do this with information / news that is as accurate as possible. Most efforts toward “voter education,” it seems, work toward educating people on particular issues while also trying to persuade them what the right decision is. I knew from the beginning that my effort was going to focus on figuring out how to help “educate” voters without standing in the way by trying to persuade them which decision is the right one.

I put it on the backburner for a little while but kept thinking about how to put my project together.

The Research

The idea for my project is to create a direct link between a citizen and the news they consume so that they can easily inform themselves about the issues / candidates they will be voting on in their next election. I’m not talking about just knowing who is running for President, Senator, or the House of Representatives at the federal level, races which already feel like they saturate the news to no end. I’m also talking about all the judges, local representatives, levies, and other local issues that will be on the voter’s ballot. Most of the larger websites I browsed for my research appeared to concentrate only on issues and candidates at the federal level.

I wasn’t 100% certain what I would need in order to make this happen. It was in fact the first time I had ever really considered digging in any real depth into how my government actually operates, beyond, of course, voting and attending a handful of driver education courses to keep points off of my license for traffic violations. All the signals were pointing to a need to open a discussion about this with my local Board of Elections office. After a few brief calls to the Cuyahoga County Board of Elections (NE Ohio) I figured out I needed to contact the individual within the Board of Elections who handles all “public records requests”. Before I contacted this person I still wasn’t sure what to ask for. I figured I’d start by requesting all the ballots distributed within the county. After all, the ballots would contain exactly the information I needed to fulfill the goals of this project. Just match each user up with his/her ballot and we’re done, right? Once I contacted this individual I first confirmed it was going to be ok to do this (to request “sample” ballots, which are essentially the final ballot with a big watermark in the middle), and then I simply requested that all of the ballots be sent to me.

This wasn’t for the November 2016 election; it was still too early for that. Instead I decided to do a “dry run” of what the process would be like once the November 2016 ballots were ready, so I requested all ballots from the recent Nov. 2015 election. They told me by email that the information would be put in the mail and arrive within a few days. A few days later I received a CD in the mail containing a digital copy of all of the sample ballots in PDF format.

Once I had this information, I was finally able to start putting together my project. However, early on in working with the ballots it became obvious I was missing something. Each ballot lists the voting precinct on every page. So technically I probably could, through a lot of (perhaps manual) work, parse through all ballots in order to obtain a mapping between the issues / candidates on the ballots and the voting precinct. But parsing the data would’ve likely been too much work.

As a side note I should mention that earlier in my research I learned that the final ballots for each county throughout the state of Ohio generally will not be “finalized” until approximately 1 month before the election. Manually parsing almost 1000 PDF ballots and mapping all of their issues to voting precincts, starting 1 month away from an election, would probably not leave me enough time to (a) get all the data together, and (b) have it up on the website a reasonable amount of time before the election. After all, this is a side project, so the time I can spend on it is limited.

So I went back to the Board of Elections to ask if there was any simple way to get the relationship between the voting precinct of a ballot and the issues on that ballot. The folks I had already made contact with were unable to help me, so I decided to contact someone within their technology department. I was lucky to hear back from the CIO, who explained that Cuyahoga County produces a report, published on their website, which shows results by precinct / by issue. He made it clear that other counties would likely not all handle this information the same way, so I would have to work this out case by case with each of the other county Boards of Elections. By then time was already becoming very valuable, so it was at that point that I had to make an executive decision and restrict the “November 2016” deployment to Cuyahoga County only. I reviewed the report he pointed me to online, and saw that it was a fixed-width, plain text report that a simple script could probably parse to extract the relevant data.

The one hurdle I had to get over was that the report is only ever published to their website AFTER the results come in, so this became the second part of my “Public Records Request” template. For each Board of Elections I interface with, I need all the sample ballots and a report comparable to the one I received from the Cuyahoga County BOE. I was instructed that I would need to request a version of this report with “zeroed out” results, which would likely not be available until approximately 1 month prior to election day. For the time being (during my dry run) I was happy to just pull the old report from the website to build my POC.

Here’s an example of what that report looks like:

I’ll get into the details of the architecture in a future post, but for now it should suffice to say that I had all the pieces in place, and was able to build my POC. It took me a few hours to write all the scripts I needed in order to parse through the above report. Ultimately all I needed was a mapping between the issues on the ballot and the voting precinct. I could’ve probably done without the ballots themselves in order to get a bare-bones version of this concept up and running, but I figured they would be helpful for voters who wanted to see the actual ballots. This way, if nothing else, a user can confirm that the information about their ballot published on the website does in fact correlate 1 to 1 with the issues on their actual ballot.
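To give a rough idea of what such a parsing script might look like, here is a minimal sketch in Python. The column positions, precinct names, and contest names below are all made up for illustration; the real report has its own layout (plus headers and totals that would need to be skipped), so treat this as an assumption-laden example rather than the script I actually ran.

```python
from collections import defaultdict

# Hypothetical column layout. The real report's columns are different and
# would have to be read off the actual file.
PRECINCT_COLS = slice(0, 20)
CONTEST_COLS = slice(20, 60)


def parse_report(text):
    """Build {precinct: set of contests} from fixed-width report lines."""
    contests_by_precinct = defaultdict(set)
    for line in text.splitlines():
        precinct = line[PRECINCT_COLS].strip()
        contest = line[CONTEST_COLS].strip()
        # Skip blank or short lines that lack either field.
        if precinct and contest:
            contests_by_precinct[precinct].add(contest)
    return dict(contests_by_precinct)


# Made-up sample rows: precinct (20 chars), contest (40 chars), zeroed result.
sample = "\n".join([
    f"{'CLEVELAND-01-A':<20}{'CITY TAX LEVY':<40}{0:>8}",
    f"{'CLEVELAND-01-A':<20}{'COUNTY ISSUE 2':<40}{0:>8}",
    f"{'EUCLID-01-A':<20}{'COUNTY ISSUE 2':<40}{0:>8}",
])

mapping = parse_report(sample)
```

The vote counts are ignored entirely, which is why a “zeroed out” version of the report would work just as well for this purpose.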

I should take a step back here to explain why the mapping was necessary. For those who are not very familiar with how ballots are organized, each issue that shows up on a ballot may be distributed to many or only a few of the voting precincts within a county. You could think of it as a bunch of overlapping circles. When 5 different circles overlap a particular voting precinct, all 5 of those circles (which represent issues on the ballot) will show up on the ballot of the voting precinct that sits under them. Other voting precincts may have only 3 circles overlapping them, or maybe dozens. This is due to all the different levels of granularity at which any given issue may turn up on a given ballot. There are of course different groups of voting precincts that will all vote for a given candidate for House of Representatives. Or for a candidate running for an area’s school board. These different geographic regions all overlap in many different ways. The “voting precinct” appears to be the “lowest level of granularity” by which all other levels of granularity are defined. In other words, it appears that there are no cases where an issue on a ballot will be voted on by one half of a voting precinct, but not the other half.
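The overlapping-circles picture can be expressed with plain sets. In the sketch below (Python, with invented contest and precinct names), each “circle” is the set of precincts that vote on a contest, and a precinct’s ballot is just every circle that covers it. This is only an illustration of the idea, not the data model the project actually uses.

```python
from collections import defaultdict

# Hypothetical data: each contest ("circle") covers a set of precincts.
precincts_by_contest = {
    "President": {"CLE-01-A", "CLE-01-B", "EUC-01-A"},
    "Cleveland Tax Levy": {"CLE-01-A", "CLE-01-B"},
    "Euclid School Board": {"EUC-01-A"},
}


def ballots_by_precinct(precincts_by_contest):
    """Invert contest -> precincts into precinct -> contests on its ballot."""
    ballots = defaultdict(set)
    for contest, precincts in precincts_by_contest.items():
        for precinct in precincts:
            ballots[precinct].add(contest)
    return dict(ballots)


ballots = ballots_by_precinct(precincts_by_contest)
```

Because the precinct is the smallest unit, a contest either covers a whole precinct or none of it, so set membership is all that is needed here.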

After all is said and done there will be an overlap of many issues across different voting precincts. It’s important that I had categorized this information because once I had a story I wanted to distribute to users’ news feeds, I needed to know precisely which users to distribute that story to. In other words, if a story is regarding a Tax Levy that only affects people in the city of Cleveland, I don’t think anyone in the city of Euclid will need that news in their news feed.

We’re talking about a lot of data, though. Cuyahoga County alone is made up of almost 1000 voting precincts. And each election will likely have hundreds of unique issues being voted on across the county, divvied up differently amongst all of the different voting precincts. And I haven’t even started thinking about how I’m going to handle the other 87 counties when I get around to doing that. Because of the sheer volume of data, coupled with the need to “categorize” each news story submission under the applicable ballot issue / candidate race, one of the concepts I wanted to incorporate into this project is crowdsourcing. Neither I nor any small group of people I could recruit would be able to accumulate and post all of the stories for all of the issues being voted on in all voting precincts across Cuyahoga County, let alone the entire State of Ohio (all 88 counties). I already felt like I had my hands full with the ~1000 ballots of Cuyahoga County, but I’m not backing down from the idea that the ultimate objective is to eventually scale this project up so that we’ll be able to go through this same process for all counties in Ohio, at least for all 3 of the major elections we have in any given year. Once I have a “cookie cutter” process in place with each of the counties across the state, my hope is that every election cycle the project will go like clockwork.

So I foresaw pretty early on that crowdsourcing is the way to go here. I’ll get into some of the issues that could obviously come up with a crowdsourced app in a future post, but for now it should be sufficient to explain that there is a page on the app where anyone using the app can a) choose their county, b) choose their voting precinct, c) pick the issue on their ballot that a story pertains to, and d) submit a title and url for any web resource / web-page containing a story about the applicable issue on their ballot.
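As a rough illustration, a submission from that page could be represented and sanity-checked along these lines. This is a hedged Python sketch: the `StorySubmission` class, the `validate` function, and the specific checks are my own hypothetical names and choices for this example, not the app’s actual code.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class StorySubmission:
    """The four things a user provides, plus the story title (hypothetical)."""
    county: str
    precinct: str
    contest: str
    title: str
    url: str


def validate(submission, contests_by_precinct):
    """Return a list of problems; an empty list means the submission looks ok."""
    problems = []
    ballot = contests_by_precinct.get(submission.precinct)
    if ballot is None:
        problems.append("unknown precinct")
    elif submission.contest not in ballot:
        problems.append("contest is not on this precinct's ballot")
    if not submission.title.strip():
        problems.append("missing title")
    parsed = urlparse(submission.url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("url must be an http(s) link")
    return problems


# Made-up ballot data and submission for illustration.
contests_by_precinct = {"CLE-01-A": {"President", "Cleveland Tax Levy"}}
good = StorySubmission("Cuyahoga", "CLE-01-A", "Cleveland Tax Levy",
                       "What the levy would fund", "https://example.com/levy")
```

Checking the chosen issue against the precinct’s ballot is what ties a crowdsourced story back to the precinct-to-issue mapping.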

Once I had the mapping in place on the backend I could ensure submitted stories would be added to the news feeds of all individuals whose ballots also contain the same issue or issues. So, in other words, the architecture was set up in such a way that (for example) if I submit a story related to the Presidential race, everyone throughout the state of Ohio should see that story in their feed. If the story I submit is related to a city-wide tax levy, then only the people whose voting precincts lie within that city will see that story in their feeds.
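Here is a small Python sketch of that routing rule. The user names, precinct names, and the `recipients` function are assumptions made for the example; the real backend may store and query this very differently.

```python
def recipients(contest, contests_by_precinct, precinct_by_user):
    """Return the users whose precinct's ballot includes the given contest."""
    return {
        user
        for user, precinct in precinct_by_user.items()
        if contest in contests_by_precinct.get(precinct, set())
    }


# Hypothetical data: two precincts and three users.
contests_by_precinct = {
    "CLE-01-A": {"President", "Cleveland Tax Levy"},
    "EUC-01-A": {"President"},
}
precinct_by_user = {"alice": "CLE-01-A", "bob": "EUC-01-A", "carol": "CLE-01-A"}

levy_readers = recipients("Cleveland Tax Levy", contests_by_precinct, precinct_by_user)
president_readers = recipients("President", contests_by_precinct, precinct_by_user)
```

With this data the levy story reaches only the two Cleveland users, while the Presidential story reaches everyone.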

Stay tuned for future posts on this project! I have several already partially written, and I will certainly be linking to the existing project in those future posts, but was anxious to start getting feedback, if possible, on what I’ve written so far. I certainly haven’t been able to tell the whole story here. There are a number of nuances and ideas that I glossed over or simply omitted for the time being. Future posts will try to handle them more thoroughly.
