The WFTDA Rankings System is Kinda Broken
With the release of the June 2016 rankings and playoff seeding we now have a full picture of the 2016 season and the impact of outlier scores on Playoffs…and it’s kind of a mess.
As we have covered on this site before, as leagues are added to WFTDA, weights increase. When WFTDA first published weights within their current ranking system, in March 2014, 1st place was Gotham with a weight of 3.86. Heading into 2016 Playoffs Gotham still sits in 1st, but that weight has shot up to 7.34 — almost double.
When the ranking system was first devised, it would have been hard to foresee how much WFTDA would grow: from 149 leagues in February 2013 to 290 in the latest ranking. Again, almost double.
The speed of this growth has exposed imperfections in the ranking and scoring system (no rankings system is perfect). Outlier scores are having an increasingly large impact, and will continue to unless a correction is made or the number of leagues levels off or contracts.
What has this looked like in real time this year? Teams that played Demolition, Glasgow or Rocky Mountain early on all typically received outlier scores (scores much higher than their averages). This has happened in the past (Windy City or New Hampshire in 2015, for example), but as weights increase year to year those outlier scores are pushing higher and the butterfly effect of those scores is reaching further into the rankings. For example, anyone who played Newcastle after they got in an early strength factor challenge against Glasgow most likely also received an outlier score.
Within the WFTDA ranking system the measure of choice is means — an average of your scores over the previous 12 months. When weights were under 4, this was a good system for finding “the centre” of a team as it gave a solid understanding of a team’s strength. As weights increase though, means can easily be pushed off of their centres by outliers. There are a handful of teams this year who typically pull in scores around 400 riding single game boosts of 600–700+.
A downside to using means can also be seen in unavoidable rankings match ups that are unequal: think of the 8–9 v 1 seed match up at D1 Playoffs. The winner of the 8–9 bout goes on to face the 1 seed and will most likely pick up a very low score there. Teams who carry those negative outliers will be pushed off of their centres, so within the current system there is actually an incentive to lose that 8–9 bout. No rankings system should ever incentivize losing.
I am not a WFTDA member and although I have heard rumblings of a half weight off season being brought online to correct some of these issues, I hope that WFTDA will seriously look at moving to a ranking based on median scores as a way to control for outliers and keep teams at their centres.
For people who want to know the difference between these two systems an example is below:
Team A WFTDA Game Scores: 390 / 398 / 702 / 404 / 356 / 430 / 387 = Mean (Avg) 438.14
Team A WFTDA Game Scores (sorted): 356 / 387 / 390 / 398 / 404 / 430 / 702 = Median 398 *(With an even number of games you average your two middle scores)
The Median is the middle score in the collection of scores. WFTDA wouldn’t have to change their scoring system (weights), or the sample size (12 months), just how they interpret this set. In the mean system Team A’s centre is inflated by one outlier score — in the median system their centre is more evident. Within the current ranking system this team would rank 27th in the mean system, but 37th in a median system. That’s a major difference and it would see a team enter Playoffs as a 7 seed or a 10 seed.
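For anyone who wants to check the arithmetic, here is a minimal Python sketch of the comparison above. It takes Team A's seven game scores as given and only changes how they are summarised; it is not WFTDA's actual calculation, and the variable names are made up for illustration.

```python
from statistics import mean, median

# Team A's seven game scores from the example above (one high outlier: 702).
team_a_scores = [390, 398, 702, 404, 356, 430, 387]

# Current approach: the plain average of every score in the window.
mean_centre = mean(team_a_scores)

# Proposed approach: the middle score once the list is sorted.
# statistics.median averages the two middle scores for an even count.
median_centre = median(team_a_scores)

print(f"mean:   {mean_centre:.2f}")   # mean:   438.14
print(f"median: {median_centre}")     # median: 398
```

The scoring inputs are identical in both cases; only the last step, turning a list of scores into one number, is different.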
On the flip side, if Team A received a score of around 100 against a 1 seed in Playoffs (98 in the example below), the effect of a negative outlier would be severely lessened and they would be closer to their centre.
Team A WFTDA Game Scores: 390 / 398 / 98 / 404 / 356 / 430 / 387 = Mean (Avg) 351.86
Team A Median WFTDA Game Scores: 98 / 356 / 387 / 390 / 398 / 404 / 430 = 390 Median
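To see how much a single game moves each measure, the sketch below starts from Team A's six ordinary scores and adds one outlier at a time, low (98) and then high (702). The `shift` helper is hypothetical, written only for this illustration, and again this is a rough check on the numbers rather than anything WFTDA uses.

```python
from statistics import mean, median

# Team A's six ordinary scores, with no outlier game included.
typical_scores = [390, 398, 404, 356, 430, 387]

def shift(scores, outlier):
    """Return (mean shift, median shift) caused by adding one extra score."""
    with_outlier = scores + [outlier]
    mean_shift = mean(with_outlier) - mean(scores)
    median_shift = median(with_outlier) - median(scores)
    return mean_shift, median_shift

for outlier in (98, 702):
    mean_shift, median_shift = shift(typical_scores, outlier)
    print(f"{outlier}: mean moves {mean_shift:+.2f}, median moves {median_shift:+.2f}")
```

With six scores the median is the average of the two middle ones (390 and 398, so 394). Either outlier drags the mean by more than 40 points, while the median moves by only 4.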
This would be the difference between 47th and 40th — a score set like this could decide a Division 1 Playoff seeding just on the basis of how it is interpreted.
Although the top 20 is largely immune to outlier scores (for now) because a massive drop off is less likely and their scores are already so high, this has seriously impacted the 20–40 half of the Division 1 Playoffs this year, as well as D2 Playoffs. Maybe there is an advantage to introducing a bit of chaos, and it is far more likely now that we will see upsets this year, but ranking systems are designed to find the “centre” of a team and place them accordingly. Within the current system that is getting harder to do and gaming the system is getting easier (hunt inflated weights + play fewer games and ride outliers).
A move to medians won’t fix everything, but it is a relatively painless solution that could be implemented very easily after playoffs. It may also be a great solution to combine with efforts to establish an off season like the rumoured half weights. Other scoring issues still exist (strength factor challenges have been a mess this year), but regardless, any move to control for outliers will have a positive benefit. Other options are also a lot more work. Reducing the number of eligible leagues by increasing the number of eligible games needed would bring weights and outliers down but hurt growth, and introducing a whole new algorithm would be brutal.
There are a lot of brilliant minds at WFTDA. I hope they hear this out and make this switch so we can get back to teams being ranked and weighted closer to their actual strengths.