Machine Learning in Action
If you’ve been following me here on Medium or on my blog, you’ll have noticed that I’m constantly digging into emerging technologies: physics analogies applied to marketing, pivot tables applied to health statistics, DNA innovations, and my latest interest, A.I. and machine learning applied to lead qualification. Before I delve further into that last subject, let me take a step back and walk you through an application of machine learning to demonstrate how this all comes together.
First, if you’ve seen some of my work you know that I love data and I love politics. Whether it’s putting out an infographic on terrorist attacks or demonstrating correlations between lack of time and bad health habits — data drives my waking moments! Let’s go back to politics and try something out.
Using data from Data for Democracy I was able to put together county-level demographic info matched to Presidential election results from last November. In previous posts I showed you how to do a pivot table in Excel. Pivot tables are great for finding correlations and insights between a few data points.
New tools are emerging as a data-miner’s best friend. Among them is BigML.com. They’ve been around for a while now, but their free access for limited processing and their open-source Python API bindings make them a brilliant leader in the space. Using the election and demographic data at the county level, I can use BigML.com to show me correlations like this negative correlation between Median Housing Costs and % of voters for Trump in each county.
It’s not a strong correlation but there’s something there. Of course, that sort of correlation is pretty easy to do in Excel: (=CORREL(range1,range2)). Here’s where we get to the exciting part.
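A quick aside for readers who prefer code to spreadsheets: the Pearson correlation that Excel’s CORREL returns can be sketched in a few lines of plain Python. The function name and the five sample counties below are made up for illustration; they are not the real county data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient, the statistic Excel's CORREL returns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five made-up counties (NOT the real data set).
median_housing_cost = [650, 800, 950, 1200, 1500]
pct_trump = [72.0, 65.0, 58.0, 49.0, 35.0]

r = pearson(median_housing_cost, pct_trump)
print(f"r = {r:.3f}")  # a negative r means the two columns move in opposite directions
```

A value near -1 or +1 signals a strong relationship and a value near 0 a weak one. The toy numbers here are deliberately tidy, so they produce a much stronger negative correlation than the real counties show.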
Imagine if you could look at a dozen data points and find paths of connections through them all. Think of it like a Rube Goldberg machine with random pieces mapping together to form a unique set of characteristics.
BigML.com helps us do this incredibly quickly (thank you Moore’s law!). Using their modeling tool I was able to identify some key demographic points that would predict a county with a Trump win.
The tree-like structure shows you how the model splits counties on demographic stats and the probability attached to each branch. One of the most important stats determining whether or not a county voted for Donald Trump was the % of people in the county who had never married. This makes sense given the high rate of non-marrieds in urban areas and college towns, who, it seems, didn’t cast their vote for Trump.
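To give a feel for how a tree model decides that one stat matters most, here is a minimal sketch of the general idea: try every cutoff on a single field and keep the one that separates the two outcomes most cleanly, measured by Gini impurity. This is a generic textbook illustration, not BigML’s actual implementation, and the eight counties are invented.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best 'value <= threshold' split."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # baseline: impurity with no split at all
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # identical values cannot be separated by a threshold
        threshold = (lo + hi) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        # Weight each side's impurity by how many counties land on it.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy counties (NOT the real data): % never married and the winner.
pct_never_married = [22, 25, 27, 30, 38, 41, 45, 52]
winner = ["Trump", "Trump", "Trump", "Trump", "Clinton", "Trump", "Clinton", "Clinton"]

threshold, impurity = best_split(pct_never_married, winner)
print(threshold, impurity)
```

A real tree repeats this search across every field, picks the overall winner as the top split, and then does the same again inside each branch.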
You can see the correlation is moderately strong in the chart here. But note that the tree map from BigML.com goes into more depth and shows that the lower the Median Housing Costs, the higher the likelihood of a Trump win in those counties. Next is a county categorization called the Woodard American Nation, from Colin Woodard’s 2011 book American Nations. The model excludes Yankeedom here. The next most relevant factor was a lower % of Hispanics in that county.
All of this is intuitive, but it helps to clarify where to place bets on specific targeting for future elections. There are many other paths on the tree model, but that path was the “thickest,” indicating the highest probability.
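One way to think about a single path through the tree is as a chain of conditions that a county either satisfies or doesn’t. The sketch below writes the path described above as a list of rules and measures two things: how many counties follow it (its support) and what share of those voted for Trump (its confidence). The field names, cutoff values, and counties are all hypothetical; the real model’s thresholds are not reproduced here.

```python
# Each condition is (field, operator, value). Names and cutoffs are hypothetical.
path = [
    ("pct_never_married", "<=", 30.0),
    ("median_housing_cost", "<=", 900.0),
    ("american_nation", "!=", "Yankeedom"),
    ("pct_hispanic", "<=", 10.0),
]

OPS = {
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    "!=": lambda a, b: a != b,
}


def follows(county, path):
    """True when the county satisfies every condition along the path."""
    return all(OPS[op](county[field], value) for field, op, value in path)


def path_stats(counties, path, target="Trump"):
    """Return (support, confidence) for one path through the tree."""
    matched = [c for c in counties if follows(c, path)]
    support = len(matched) / len(counties)
    confidence = (
        sum(c["winner"] == target for c in matched) / len(matched) if matched else 0.0
    )
    return support, confidence


def county(nm, cost, nation, hisp, winner):
    """Small helper to build one made-up county record."""
    return {"pct_never_married": nm, "median_housing_cost": cost,
            "american_nation": nation, "pct_hispanic": hisp, "winner": winner}


# Seven invented counties (NOT the real data).
counties = [
    county(24, 700, "Greater Appalachia", 4, "Trump"),
    county(28, 850, "Midlands", 8, "Trump"),
    county(29, 880, "Midlands", 6, "Clinton"),
    county(26, 1400, "Midlands", 5, "Clinton"),   # fails the housing-cost rule
    county(44, 1100, "El Norte", 22, "Clinton"),  # fails the first rule
    county(27, 750, "El Norte", 35, "Trump"),     # fails the Hispanic-share rule
    county(25, 800, "Yankeedom", 3, "Clinton"),   # excluded by region
]

support, confidence = path_stats(counties, path)
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```

Reading a tree this way makes it easier to compare paths: a path worth betting on covers a lot of counties and predicts the same winner for most of them.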
Of course we can do the same thing for Hillary.
Here you can see that, as you might expect, the % of Never Marrieds is again a predictor, but this time it has a hard stop at the lower end. Education plays a role here, and counties in the Woodard American Nation category “Greater Appalachia” should be avoided were you to predict a Hillary win.
Note that the adage “garbage-in, garbage-out” should be adhered to strictly. I fed this model A LOT of data points, so the error rate here is above 10% for each of my examples. If I narrow it down to 4% for each candidate, I get the results seen in the adjacent comparison.
From all of these points I think it’s clear that we have some good learnings but throwing in all data points can ruin your momentum.
I’ve found that the best bet is to build models on like-minded stats and then build in other categorical elements.
That’s just a sample of what BigML.com can do. In the coming weeks I’ll be doing some exciting things with deep data learning, sales and marketing. Stay tuned.
— about the author:
Justin Hart is a senior executive consultant.
His primary objective: plumb the depths of cutting-edge technologies and translate those into C-suite strategies to improve marketing and sales teams.
shorter version: mktg + bizdev + ai
Justin is a recognized industry speaker on modern marketing trends. He is currently working with several companies applying advanced tech tools like machine learning and artificial intelligence to business funnel basics.
You can find his work online at justinhart.biz.
Email Justin at justinhart.biz at gmail.
On twitter @justin_hart.
On Medium Justin Hart
Justin has over 20 years of experience as a senior executive of established and start-up companies and even political campaigns (as senior digital director to the Mitt Romney campaign). He currently resides in Southern California.