During the football season, managers of fantasy football teams struggle with the same question: who do I start? If you Google “who do I start in fantasy football,” you’ll find an incredible amount of brain space gets dedicated to figuring this out, but most of the information sources are just people’s opinions.
Like their brethren in the real world, fantasy football GMs are finding that analytics can help them make better decisions. A couple years ago, I encountered a New York Times article by Boris Chen that showed how data science can help you pick your best Fantasy Football team.
Each week, the world’s Fantasy Football experts come up with their rankings for each position. People rely on these rankings, but as Chen explains, rankings don’t really tell you how much better a player is than another player:
[A]ll ranked lists share a flaw. They imply a strict monotonic ordering and do not illustrate the true distance between players. A list implies QB1 > QB2 > QB3, whereas the reality might be QB1 >> QB2 = QB3.
To address this, Chen applies a clustering algorithm called the Gaussian mixture model to the rankings and produces a visualization of the results:
Through clustering, we now see groups or tiers of similarly ranked players. Decision-making is now simple: favor players in higher-ranked tiers, but consider players in the same tier to be roughly equivalent.
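To make the clustering step concrete, here's a minimal sketch of fitting a one-dimensional Gaussian mixture to expert ranks with scikit-learn. The `avg_ranks` values are made up for illustration; the real analysis aggregates a full table of expert rankings per position.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical average expert ranks for eight quarterbacks (lower = better).
avg_ranks = np.array([1.2, 2.8, 3.1, 6.5, 7.0, 7.4, 11.9, 13.2]).reshape(-1, 1)

# Fit a 1-D Gaussian mixture; n_components is the number of tiers to find.
gmm = GaussianMixture(n_components=3, random_state=0).fit(avg_ranks)
labels = gmm.predict(avg_ranks)

# Relabel components so tier 1 holds the best (lowest) average ranks.
tier_of = {comp: tier + 1
           for tier, comp in enumerate(np.argsort(gmm.means_.ravel()))}
tiers = np.array([tier_of[label] for label in labels])
print(tiers)
```

Players sharing a tier number are the "roughly equivalent" groups described above; the choice of `n_components` controls how fine-grained the tiers are.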
I became a fan of these tiers and found that using them saved me a lot of time. I would spend less time twiddling with my roster every time I had a new thought. During the draft, instead of worrying about silly things like bye weeks or picking a certain position by a certain round, I focused only on picking players in higher tiers. This seemed to give me the best overall roster and also gave me enough depth to trade with other fantasy players.
It could be luck, but I think tiers helped me climb to the top of both my fantasy leagues the past couple of years. However, I figured there would be a better way to generate these clusters. It turned out the most promising solution came from a set of statistics I was trained to ignore: projected scores.
Projected Scores vs. Rankings
I’ve always been skeptical about projected scores, a sentiment this quote from Datascope summarizes perfectly:
“ESPN’s fantasy football projections are way off. They’re projecting Kelvin Benjamin will score 18.4 points in my league. He’s only scored over 18 points twice, and one of those times was 18.2 points. He averages 13 points a game. How do they come up with this stuff?!”
However, after doing more research, it turns out that aggregating different projection sources can actually yield better results than rankings. FantasyFootballAnalytics recently did a comparison of projections vs. rankings that showed projections were more accurate for quarterbacks, running backs, wide receivers, and tight ends.
Creating Tiers Based on Projections
It seemed like the natural thing to do would be to cluster based on projected scores instead of rankings, so I applied a Gaussian Mixture Model to a set of projections provided by Fantasy Football Analytics.
Below is a visualization of the QB projection tiers for Week 17:
Comparison of Methods
To keep things simple, I’ll use the term Rank Tiers when I’m talking about tiers generated from rankings and Projection Tiers when I’m talking about tiers generated from projections.
Since Chen’s blog no longer has published tiers for Weeks 1–16, I used Chen’s code to generate Rank Tiers myself. I stuck to the default number of tiers used in his code, but it’s possible that the tiers I generated differ slightly from the ones that he published during the season.
To compare the two techniques for the 2015 season, we’ll use Chen’s concept of tier accuracy, the percentage of tiers that actually came in scoring higher than a lower tier.
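One plausible reading of that metric can be sketched as follows: group players by tier, compute each tier's mean actual score, and count how often a tier outscored the tier below it. The tier assignments and point totals here are made up for illustration.

```python
def tier_accuracy(tiers, points):
    """Fraction of adjacent tier pairs where the higher tier's mean
    actual score beat the next tier down (one plausible reading of
    Chen's tier-accuracy metric)."""
    by_tier = {}
    for t, p in zip(tiers, points):
        by_tier.setdefault(t, []).append(p)
    means = [sum(v) / len(v) for _, v in sorted(by_tier.items())]
    pairs = list(zip(means, means[1:]))
    return sum(hi > lo for hi, lo in pairs) / len(pairs)

# Hypothetical tier assignments and actual points for six players.
tiers = [1, 1, 2, 2, 3, 3]
points = [22.0, 18.5, 19.0, 23.0, 9.8, 11.1]
print(tier_accuracy(tiers, points))  # tier 2 outscored tier 1 -> 0.5
```

A perfectly calibrated set of tiers would score 1.0: every tier's players would, on average, outscore the tier below.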
Accuracy of Rank Tiers
Here are the accuracies for each position on a football team using Rank Tiers:
QB       RB       WR       TE       K        DST
63.4%    70.2%    72.5%    65.5%    46.1%    62.0%
Accuracy of Projection Tiers
Similarly, here are the accuracies for each position using Projection Tiers:
QB       RB       WR       TE       K        DST
65.2%    75.5%    75.1%    71.1%    50.2%    62.2%
For the 2015 season, it appears that Projection Tiers outperform Rank Tiers for all the positions listed above.
I wanted to test this on more data, but I wasn’t able to find ranking data for previous seasons. Instead, I turned to bootstrapping, which creates many simulations of the observed data by randomly resampling it with replacement. I created many simulations of the 2015 data, generated Rank Tiers and Projection Tiers for each, and counted how often Projection Tiers had a higher tier accuracy than Rank Tiers.
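A rough sketch of that bootstrap comparison, using only the standard library: resample the observed weeks with replacement and count how often projections come out ahead. The weekly accuracy differences in `diffs` are made-up numbers for illustration, not the actual 2015 results.

```python
import random

# Hypothetical weekly accuracy differences (Projection Tiers minus
# Rank Tiers) for one position -- illustrative numbers only.
diffs = [0.02, -0.01, 0.05, 0.03, -0.02, 0.04, 0.01, -0.03]

random.seed(0)
n_boot = 10_000
wins = 0
for _ in range(n_boot):
    # Resample the observed weeks with replacement.
    sample = [random.choice(diffs) for _ in diffs]
    # Count resamples where projections come out ahead on average.
    if sum(sample) / len(sample) > 0:
        wins += 1

print(f"Projection Tiers more accurate in {wins / n_boot:.1%} of resamples")
```

The real analysis resamples the player data and regenerates both sets of tiers per simulation, but the counting logic is the same.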
The numbers below show, for each position, the percentage of bootstrap simulations in which Projection Tiers were more accurate than Rank Tiers.
QB       RB       WR       TE       K        DST
54.0%    53.6%    57.5%    47.5%    46.9%    37.9%
Interestingly enough, Projection Tiers seem to do better for QBs, RBs, and WRs, but worse for the other positions. In particular, they perform significantly worse for Defense, which seems to be corroborated by the fact that projections perform worse than rankings for Defense (and Kickers).
I tried combining the two models by summing the two tiers together to create a different ordering, but this performed really poorly against the bootstrapped data.
Despite the fact that a more complicated method of combining tiers ultimately didn’t work, we nevertheless have a fallback method of stacking the two models together: use Projection Tiers for quarterbacks, running backs, and wide receivers, and then Rank Tiers for the remaining positions. This fallback model is a strict improvement over Chen’s model.
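The fallback model is just a per-position switch between the two sets of tiers. A minimal sketch (the function and constant names here are my own, not from the original code):

```python
# Positions where Projection Tiers won the bootstrap comparison.
USE_PROJECTIONS = {"QB", "RB", "WR"}

def pick_tiers(position, projection_tiers, rank_tiers):
    """Fallback model: Projection Tiers for QB/RB/WR, Rank Tiers otherwise."""
    return projection_tiers if position in USE_PROJECTIONS else rank_tiers

print(pick_tiers("RB", "projection", "rank"))   # -> projection
print(pick_tiers("DST", "projection", "rank"))  # -> rank
```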
It seems that applying clustering techniques to projections may be helpful! I’m planning on posting these tiers throughout the upcoming season so we’ll be able to see how they perform.
With that said, there are probably better ways to cluster the players than Gaussian Mixture Models, which assume that the underlying data come from a mixture of Gaussian distributions. Because it’s not clear that this assumption holds, there’s probably room for improvement here. I did try picking the best tiers from a set of many randomly generated partitions, but found the runtime was too long to reach tier accuracy comparable to the GMM-based approach.
I have some additional ideas on how to take this forward, such as draft analysis, customizing towards particular league scoring settings, and better ways to partition the data. I also would like to compare these techniques across other seasons, but I haven’t been able to find historical data from FantasyPros yet.
Finally, I’ve published the code on GitHub, so you should be able to replicate my results. Let me know if you have any feedback!
Thanks to Andrew Ho and Jonah Sinick at Signal Data Science for reading drafts of this post and for helping me learn the techniques needed to conduct this analysis.