The BigDataGame - Predicting the Super Bowl with Machine Learning

How an enterprise middleware company jumped headfirst into Football analytics

“Darn, we have been beaten up by a bunch of (cute) puppies!” read the email from Isabelle Mauny, WSO2's VP of Product Management.

When I joined WSO2 seven months ago, I joined as a simple writer (or so I thought). If someone had told me that I’d be shrugging on Football armour and arguing ESPN stats over lunch, I’d have been pretty surprised. After all, it’s enterprise software — isn’t it?

Nevertheless, that’s exactly what happened. We’ve just spent the last three weeks using high-tech machine learning to build a prediction engine for the Super Bowl. We christened the thing BigDataGame and hooked it up to a web page for the entire Internet to see. And now that the stats are in, it turns out we did a better job than Microsoft Cortana.

Not only has it been a technical victory, but it's also been a great marketing campaign, one where everyone involved had fun doing something totally outside their usual sphere of operations. How did that happen?

Everything began when Saliya Withana, who heads the Digital Marketing team, fired off an email about advertising at the Super Bowl. Saliya is a very cool guy, stocky, impeccably polite, with a thing for those Peruvian cajón drums. A large portion of WSO2's clients are based in the US (think eBay, Boeing), and keeping an eye out for potential advertising opportunities is always a thing here.

In this case, Saliya was looking at the massive spikes of search traffic pouring in around Super Bowl keywords. Could we do something with Super Bowl data and tap into that? And (typical Saliya) — can we do it in a down-to-earth way?

Part of Saliya’s original email

In the background, a team at our office at Trace Expert City was crunching on a new product called Machine Learner. Machine Learner was designed to look at data and build statistical models for making predictions, and machine learning as a whole is a very hot topic right now — what with Google and Facebook using it to figure out Go and all that.

Nirmal Fernando, who leads the Machine Learner team, fired off a reply to Saliya: we couldn't advertise at the Super Bowl, but we could predict it. +1s all around. From a content marketing perspective (which is where I come in), there was a lot of value there. While Machine Learner is pretty good, it's new, which means zero published case studies or documents backing it up. A project like this means content.

Marketing stuff for the data products is largely my responsibility, so while I wasn't doing much to help at this stage, I started hanging around the Trace office, trying to understand what Nirmal and Thamali Wijewardhana were up to as they started hammering away at this Super Bowl prediction engine.

And it was an interesting challenge, because none of us — save for Kern Rikhi and Grant Thompson — really knew anything about Football. Granted, we’d all played Rugby at some point, so the rules were easy to understand. Now the question was, how hard was it to predict this game?

The industry standard, so to speak

Sports analytics systems are huge in the US. Baseball's taken most of the limelight (Moneyball and Nate Silver's PECOTA system are arguably the most famous), but Football analytics has also been around since the early 2000s. FootballOutsiders.com is said to have published the first truly advanced metric in this field in 2003. It was called DVOA (defense-adjusted value over average), and it judged a player's success on each play, compared to the league average, based on a number of variables (including the strength of the opponent). Then Pro Football Focus came along with a more sophisticated player grading system. Advanced NFL Stats (now Advanced Football Analytics) did the same. Bill Barnwell of Grantland created the Speed Score.

And so on and so forth. Today, everyone's in the game, from Microsoft to eBay to Electronic Arts. EA is pretty cool, because they use their Madden NFL engine to simulate the game. Microsoft is cooler, because they have Cortana, and Cortana is…well, if you've ever played Halo, you'll understand. No life is complete until you've sprayed bullet hell over the Covenant and the Flood with her voice in your ear.

The real giants, however, are the likes of 538 (screenshot above), ESPN, Oddshark and IronRank. Most of these guys use variants of a rating system called Elo, which was initially designed to evaluate chess players. The algorithm was so successful that it's at the heart of competitive ranking now: Major League Baseball, basketball, League of Legends, Dota 2, you name it.
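If you're curious, the core of Elo is small enough to fit in a few lines. Here's a rough Python sketch of the standard update rule as Arpad Elo framed it for chess; it's not what 538 or ESPN actually run, just the basic idea:

```python
# The standard Elo update -- a sketch of the idea, not any particular
# site's implementation.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player/team A beats B, given their current ratings."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update(rating: float, expected: float, actual: float, k: float = 20) -> float:
    """Move the rating towards the observed result; K controls how fast it reacts."""
    return rating + k * (actual - expected)

# Example: a 1600-rated team beats a 1500-rated one
exp_a = expected_score(1600, 1500)            # ~0.64
new_a = update(1600, exp_a, actual=1.0)       # winner's rating goes up a little
new_b = update(1500, 1 - exp_a, actual=0.0)   # loser's rating drops by the same amount
print(round(exp_a, 2), round(new_a, 1), round(new_b, 1))
```

The K-factor decides how quickly ratings react to new results; the sites above layer their own adjustments (home field, margin of victory and so on) on top of this core.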

Our system was going to be different. Machine Learner trains a model based on data. We throw inputs at the model, we get predictions out. Easier said than done.

Now, there are three things you need for this kind of operation. The first is a) the model, which lets us throw data at it, get an output and basically make the prediction. The second is b) the dataset, which is usually a bunch of inputs and their known outputs. The third, and the real hero of this story, is c) the algorithm, which takes the dataset, reads it, and builds and rebuilds the model, with the model getting a bit more accurate with each run.
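To make that triad concrete, here's a minimal sketch in scikit-learn rather than Machine Learner itself; the stats columns are made up and the algorithm is just a placeholder, but the shape of the operation is the same:

```python
# A minimal sketch of the dataset / algorithm / model triad, in scikit-learn
# rather than WSO2 Machine Learner. Column names and values are hypothetical.
import pandas as pd
from sklearn.linear_model import LinearRegression

# b) the dataset: a bunch of inputs and their known outputs
data = pd.DataFrame({
    "home_yards_per_game": [380.0, 351.0, 402.0, 330.0],
    "away_yards_per_game": [360.0, 390.0, 345.0, 410.0],
    "home_turnover_diff":  [5, -2, 8, -6],
    "home_point_margin":   [7, -3, 14, -10],   # the output we want to predict
})
X = data.drop(columns="home_point_margin")
y = data["home_point_margin"]

# c) the algorithm: reads the dataset and builds the model
# (plain linear regression as a stand-in; the algorithm we actually settled on
#  comes a little further down)
algorithm = LinearRegression()
model = algorithm.fit(X, y)   # a) the model: throw inputs at it, get a prediction out

print(model.predict(X.iloc[[0]]))   # predicted margin for the first matchup
```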

A bit of advice from Kern and Grant meant we now knew where to look, especially with regard to the plays. We wrote about this process here, but long story short, Thamali ended up scraping pro-football-reference.com for the data. If anyone from that site is reading this, thanks, guys — job well done.
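For what it's worth, this is roughly what that kind of scraping looks like in Python. It isn't Thamali's actual script, and the URL and table layout here are assumptions on my part, but pandas will happily slurp HTML tables like the ones pro-football-reference.com publishes:

```python
# A rough sketch of the scraping step -- not the actual script used.
# The season-page URL and the position of the results table are assumptions.
import pandas as pd

URL = "https://www.pro-football-reference.com/years/2015/games.htm"  # hypothetical
tables = pd.read_html(URL)     # parse every <table> on the page into DataFrames
games = tables[0]              # assume the first table is the schedule/results
games.to_csv("nfl_2015_games.csv", index=False)
print(games.head())
```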

The next step was the algorithm. I am not an engineer, so for real pedal-to-the-metal detail, I’d recommend you read Nirmal’s post on Data-Informed.com.

This is pretty heavy stuff if you're not in the field, so if you aren't, let me give you a less technical explanation: after messing about with a few algorithms, we settled on using Random Forest regression. This works by building a whole lot of deep decision trees from the dataset and outputting the mean of their predictions. It's a way of compensating for the high-variance nature of individual decision trees: the deep trees keep the bias low, and averaging across many of them cancels out much of the variance, so you generally end up with a model that performs a lot better than any single tree.
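If you want to see that variance-cancelling effect for yourself, here's a small scikit-learn experiment on synthetic data (not our football dataset): a single fully-grown tree against a forest of them.

```python
# Single deep tree vs. a random forest on noisy synthetic data --
# an illustration of the averaging effect, nothing football-specific.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(600, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=600)  # noisy target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One fully-grown decision tree: low bias, but high variance (it memorises noise)
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# A forest of such trees: each is grown on a bootstrap sample with random feature
# subsets, and the forest outputs the mean of their predictions
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)

print("single tree MSE:  ", mean_squared_error(y_test, tree.predict(X_test)))
print("random forest MSE:", mean_squared_error(y_test, forest.predict(X_test)))
```

Run it a few times and the forest should consistently beat the single tree on the held-out data, which is the whole point.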

I had a bit of trouble understanding this until I came across this MATLAB script, which generates visuals of the random forest algorithm at work (like the one above). Each leaf is a value of the target variable, reached by following the input-variable splits along the path from the root down to that leaf.
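If you don't have MATLAB handy, scikit-learn can dump a tree as plain text, which gets the same idea across. This is a made-up example, not the script above:

```python
# Peeking inside one tree of a fitted forest. Data and feature names are made up.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import export_text

rng = np.random.RandomState(1)
X = rng.uniform(0, 1, size=(200, 3))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

forest = RandomForestRegressor(n_estimators=50, max_depth=3, random_state=1).fit(X, y)

# Print the first tree in the ensemble: every "value: ..." line is a leaf, i.e.
# the prediction you get by following the splits from the root down to it.
print(export_text(forest.estimators_[0], feature_names=["a", "b", "c"]))
```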

Random Forests are a favourite among data scientists. 'The Unreasonable Effectiveness of Random Forests' by Ahmed El Deeb lists why: they're incredibly accurate and very easy to apply.

The ML team’s magic worked: not only did we end up with a solid Random Forest regression implementation, we also ended up shipping it with the next release of Machine Learner.

We now had a model that could predict the Super Bowl; the next step was presenting it. Iwantha Lekamge, Thimuth Amarakoon and Madhura Mendis stepped up to build the site. A few hasty meetings and group chats later, the API was hooked up, everything was scrolling nicely, the grass was green on that side of the web page, and we were crunching away.

Things settled into a routine. Every week, Nirmal would drop by with the latest predictions. We’d then dissect them. I’d go away, watch the matches and blog about them. Kern would occasionally drop by and make corrections. Arguments were made, presented and lost when Sanjiva happened.

In the meantime, BigDataGame went places. People on Reddit struck up a discussion on the thing. Nirmal managed to get an article out. Saliya wrote about it. Harindu De Alwis, events point man and videographer extraordinaire, had us run up and down in full gear and came out with this amazing video.

Mind you, all of this was in the middle of some pretty heavy changes to our product strategy, redesigns and company meetings, and background prep for the upcoming WSO2Con Asia.

7 of the 11 playoff games, the Super Bowl included, went to the teams we predicted as having the highest probability of winning that particular matchup.

But honestly, nobody saw the Broncos winning. Our model gave the Panthers a 57.40% chance of winning. If you’d been on that mail thread, you’d have seen heavy support for the Panthers; Kern in particular pointed out that Peyton Manning was old and slow.

The only one who disagreed was Grant. In the style of a true sports analyst, he sent us all this email:

It looked like our Predictor failed the final test. We were beaten by puppies.

If there’s one thing I’ve learned about prediction over the past few months, it’s that it’s a game of probability. A coin has a 50% chance of coming up heads when flipped. That doesn’t mean it will come up heads exactly 5 times out of every 10 flips; it means that, on average, you’d expect about 5 heads and 5 tails.

Now think back to that earlier statement: 7 out of 11 games went to the team we predicted as having the higher probability of winning. In some cases those probabilities were separated by minute percentages. The game could easily have gone the other way.

And we still would have been right. Because, as it turns out, what we were predicting wasn’t who would win. What we were predicting was who had the better chance of winning. That 57.40% chance of the Panthers winning was — the other side of the coin — a 42.60% chance for the Broncos to nail the game.
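You can sanity-check that intuition with a few lines of simulation; nothing football-specific here, just a coin weighted 57.40/42.60:

```python
# Simulate the matchup many times and count how often the "underdog" side
# comes up. Plain Python, no football data involved.
import random

random.seed(7)
p_panthers = 0.574
trials = 100_000
broncos_wins = sum(random.random() >= p_panthers for _ in range(trials))
print(f"Broncos win in {broncos_wins / trials:.1%} of simulated games")  # ~42.6%
```

The Broncos side comes up in a little over four out of every ten simulated games. An upset, yes, but hardly a shocking one.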

Looking back, we could have presented this better. But it’s a pretty boring story when you’re just presenting numbers. “Give me a one-handed economist,” said the frustrated American President, not knowing that “on the other hand” is the nature of this game. The story is better told when you can come out and say “the machine says this man will win”.

My satisfaction lies in knowing that we got so close to 538, the Golden Boy of the prediction crowd — closer than Microsoft. And that’s saying something; this, after all, is very much a 20% project that ballooned into something bigger. Cortana really should have known better.

The fact that I get to work at a place where crazy moonshots like this are possible, and oddball ideas are nurtured instead of being crushed? That’s the icing on the cake.

Yudhanjaya Wijeratne

Data scientist, public policy and tech, @LIRNEasia. Nebula Award nominated author. Numbercaste (2017) / the Inhuman Race (2018). @yudhanjaya on Twitter.