Building CampaignHawk: Thinking About Data (Part 10)

Now that we have a map and some data, it’s time to look at the data and see what we can do with it. Since autopublish is still active in our app, we have access to our collection without setting up publish or subscribe, which is pretty handy at this point in the process. Meteor also allows you to run Mongo queries in the browser console, which is a great feature.

In the browser console, we can run the following query, which returns an Array of 4263 items:

VoterData.find().fetch() // returns Array[4263]

If we check out what one of those array items looks like, we can get a sense for the structure of our data. I’ve included a screenshot of one of those items below.

Most of this data looks pretty straightforward. The addresses have also been geocoded for us, so that’s going to save us a lot of headache in the future.

There are a few things that are less than obvious. The history field holds a percentage that relates to something, and the party field is a zero, which also means something that isn’t immediately clear.

I’ll start with history, which is a metric that is somewhat subjectively calculated by Tech Roanoke as the likelihood of that voter voting in the future. Voter turnout in the U.S. is just over 50% and it’s much lower for small elections, so this is a very important number for campaigns with limited resources — you want to target the people most likely to vote. A person with a higher percentage is more likely to vote.

The party number is how that person publicly identifies their political beliefs. A zero is someone about whom nothing is known. A six is someone who has actively declined to state their political beliefs. As a practical matter, a six is the same as a zero.

The numbers one through five represent a spectrum, where one is staunchly Democratic and five is steadfastly Republican. That of course means that a three is somewhere in the middle and does not lean in any direction.
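To keep these codes straight while exploring the data, a small lookup helper can be handy. This is only a sketch: describeParty and hasKnownParty are hypothetical names, and the labels for two and four are my own guess at the wording, since the data only tells us that they sit between the ends of the spectrum and the middle.

```javascript
// Hypothetical labels for the party codes described above.
// The wording for 2 and 4 is an assumption; only 1, 3 and 5 are spelled out.
const PARTY_LABELS = {
  1: 'Staunch Democrat',
  2: 'Leans Democrat',
  3: 'Middle',
  4: 'Leans Republican',
  5: 'Steadfast Republican',
};

// Translate a numeric party code into a readable label.
function describeParty(party) {
  // 0 (nothing known) and 6 (declined to state) are treated the same.
  if (party === 0 || party === 6) return 'Unknown';
  return PARTY_LABELS[party] || 'Invalid code';
}

// True only for voters who sit somewhere on the one-to-five spectrum.
function hasKnownParty(voter) {
  return voter.party >= 1 && voter.party <= 5;
}
```

Something like VoterData.find().fetch().filter(hasKnownParty) in the browser console would then show how many voters we actually know something about.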

So now we have to give some thought to how we’re going to represent this data. We’re also going to have to change some of our wireframes because this data does not match what I had anticipated.

New Data Layers

The first one to look at is the Republican-Democrat divide. Previously, I was planning to split the map into districts and color them red or blue. This is no longer practical for two reasons: 1) the data is now more segmented, and 2) the campaign managers are interested in more granular data than the general political tendencies of a large geographic region. Something like a choropleth might be more useful (image below).

A choropleth of the United States, generated with D3.

We can probably make a data layer that aggregates all voters within a designated area and averages their beliefs (excluding zeros and sixes), then on zoom in, that designated area gets more granular.
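Here is a rough sketch of that aggregation in plain JavaScript. It assumes each voter document exposes its geocoded position as lat and lng (hypothetical field names, I haven’t settled on the real ones yet) and it uses a simple square grid measured in degrees as the “designated area”. The function name averagePartyByCell is made up for illustration.

```javascript
// Group voters into square grid cells and average the party score per cell.
// Assumes hypothetical `lat` and `lng` fields and a `party` code from 0 to 6.
function averagePartyByCell(voters, cellSize) {
  const cells = new Map();
  for (const voter of voters) {
    // Skip zeros and sixes: we know nothing about those voters.
    if (voter.party < 1 || voter.party > 5) continue;
    const row = Math.floor(voter.lat / cellSize);
    const col = Math.floor(voter.lng / cellSize);
    const key = row + ':' + col;
    const cell = cells.get(key) || { row: row, col: col, sum: 0, count: 0 };
    cell.sum += voter.party;
    cell.count += 1;
    cells.set(key, cell);
  }
  // Turn the running sums into averages (1 = Democratic, 5 = Republican).
  return Array.from(cells.values()).map(function (cell) {
    return {
      row: cell.row,
      col: cell.col,
      count: cell.count,
      average: cell.sum / cell.count,
    };
  });
}
```

Passing a smaller cellSize as the user zooms in gives the more granular view; the average for each cell could then drive the fill color of that area.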

We might also want to filter for only certain voters. Maybe you’re a campaign manager for a Democrat and you really want to send your volunteers to fire up your base of core voters (the ones). It would be good to know exactly where those “ones” are.

Another type of map that might be helpful is a density map (below).

This density map comes from CartoDB’s website.

If resources are limited and you have to send people to only the areas that have the highest density of target voters, a visualization like this could be very helpful.
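As a sketch of the numbers behind such a map, we could count target voters per grid cell and keep only the densest cells. The names densestCells and isTarget are hypothetical, as are the lat and lng fields, which I’m assuming hold the geocoded position of each voter.

```javascript
// Count target voters per grid cell and return the most crowded cells first.
// `isTarget` is any function that decides whether a voter is worth a visit.
function densestCells(voters, isTarget, cellSize, limit) {
  const counts = new Map();
  for (const voter of voters) {
    if (!isTarget(voter)) continue;
    // Assumes hypothetical `lat` and `lng` fields on each voter.
    const key =
      Math.floor(voter.lat / cellSize) + ':' + Math.floor(voter.lng / cellSize);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return Array.from(counts, function (entry) {
    return { cell: entry[0], count: entry[1] };
  })
    .sort(function (a, b) { return b.count - a.count; })
    .slice(0, limit);
}
```

For the Democratic campaign manager looking for the “ones”, isTarget could be as simple as a function that returns voter.party === 1.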

Age (birthday) is an important metric for elections, so I’ve been told. Since the ages in an average geographic area will roughly follow a bell curve, we could represent this with a gradient.

That said, although a gradient would look really nice, it wouldn’t be very helpful for someone trying to run a campaign. If we think about how the user will actually want to use this, she will probably want to do one of three things, or all of them: 1) filter out people of a certain age group, 2) see the density of certain age groups based on location, 3) divide people by age based on decile or quartile.
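A minimal sketch of the first and third options might look like the following. I’m assuming the birthday field is (or can be turned into) a JavaScript Date; the helper names ageOn, filterByAge and splitByAgeQuantile are hypothetical. The reference date is passed in explicitly so the results don’t silently change from one day to the next.

```javascript
// Age in whole years on a given day. Assumes `birthday` is a Date object.
function ageOn(birthday, today) {
  const born = new Date(birthday);
  let age = today.getFullYear() - born.getFullYear();
  const hadBirthday =
    today.getMonth() > born.getMonth() ||
    (today.getMonth() === born.getMonth() && today.getDate() >= born.getDate());
  if (!hadBirthday) age -= 1;
  return age;
}

// Keep only voters whose age falls inside [minAge, maxAge].
function filterByAge(voters, minAge, maxAge, today) {
  return voters.filter(function (voter) {
    const age = ageOn(voter.birthday, today);
    return age >= minAge && age <= maxAge;
  });
}

// Sort by age (youngest first) and cut into equally sized groups:
// groups = 4 gives quartiles, groups = 10 gives deciles.
function splitByAgeQuantile(voters, groups, today) {
  const sorted = voters.slice().sort(function (a, b) {
    return ageOn(a.birthday, today) - ageOn(b.birthday, today);
  });
  const result = [];
  for (let i = 0; i < groups; i++) {
    const start = Math.floor((i * sorted.length) / groups);
    const end = Math.floor(((i + 1) * sorted.length) / groups);
    result.push(sorted.slice(start, end));
  }
  return result;
}
```

Note that the groups are cut by position in the sorted list, so voters of the same age can land on either side of a boundary. That is probably fine for a map layer, but worth keeping in mind.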

History should be a slider that allows you to filter people out based on their voting behavior. If you want to target only those people who have a 90% chance of voting, you can move the slider to the right to 90% and all those people who do not meet the criteria will be filtered out of the results.
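The logic behind the slider is just a threshold filter. Here is a sketch, assuming the history field is stored as a plain number between 0 and 100 (if it turns out to be a fraction or a string, it would need converting first). Both function names are hypothetical.

```javascript
// Keep only voters at or above the slider value.
// Assumes `history` is a number from 0 to 100.
function filterByHistory(voters, minPercent) {
  return voters.filter(function (voter) {
    return voter.history >= minPercent;
  });
}

// How many voters survive at each slider position; handy for showing
// a live count next to the slider.
function countByThreshold(voters, thresholds) {
  return thresholds.map(function (minPercent) {
    return {
      minPercent: minPercent,
      count: filterByHistory(voters, minPercent).length,
    };
  });
}
```

With the slider at 90, filterByHistory(voters, 90) keeps the voters at 90% and above and drops everyone else.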

That’s about enough contemplation for one day. Time to start implementing these data layers.

Next Steps

Start playing around with Mapbox data layers and see what we can do with them. Their docs are about as robust as I’ve ever seen, so I imagine these data layers will be relatively easy to implement. Once we have a few of them implemented, we’ll have to figure out a way to toggle them on and off.