The Power of Crowd Predictions — Using World Cup Pools Data from Torneo

Cathy Ha
Published in Sports Analytics
Jul 17, 2018


On Sunday, the 2018 World Cup concluded with fans and teams celebrating (and sulking) under a blanket of rain and golden confetti in Moscow, Russia.

It was an entertaining tournament with many upsets and surprises. Many of the clear favorites were knocked out to the shock of their fans: Brazil (beaten by Belgium in the quarter-finals), Germany (unable to escape the curse of the defending champions), Spain (a narrow penalty-shootout loss to Russia), and Argentina (outgunned by France in the Round of 16). Meanwhile, underdogs like Croatia and Belgium (ok, debatable, but not many people saw it coming) crept their way into the top 3.

How good is the average person at predicting outcomes for the World Cup? Where does the power of collective opinion fail, and where does it succeed?

We dig into the 11,911 predictions from 222 users in 28 different pools on our platform, Torneo, to answer these questions.

(Note: unlike traditional bracket-style pools, users on Torneo made predictions for the World Cup on a match-by-match basis, as each match became available.)

How Good Were Torneo User Predictions?

On a collective basis, 53% of the predictions had correct outcomes and 9.3% had correct scores.
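These two rates are straightforward to compute from raw prediction records. Below is a minimal Python sketch; the record fields (`pred_home`, `pred_away`, `home`, `away`) are hypothetical, not Torneo’s actual schema:

```python
# Minimal sketch: outcome accuracy vs. exact-score accuracy.
# Field names are hypothetical, not Torneo's actual schema.

def sign(x):
    """-1, 0, or 1 -- encodes away win, draw, or home win."""
    return (x > 0) - (x < 0)

def accuracy(predictions):
    """Return (outcome_accuracy, score_accuracy) over a list of dicts,
    each holding a predicted and an actual scoreline."""
    outcome_hits = score_hits = 0
    for p in predictions:
        if sign(p["pred_home"] - p["pred_away"]) == sign(p["home"] - p["away"]):
            outcome_hits += 1
        if (p["pred_home"], p["pred_away"]) == (p["home"], p["away"]):
            score_hits += 1
    n = len(predictions)
    return outcome_hits / n, score_hits / n
```

An exact-score hit always counts as an outcome hit too, which is why outcome accuracy is necessarily the higher of the two numbers.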

Of the users who had more than 40 picks (total of 64 matches available):

  • twenty-one (21) users predicted more than 60% of outcomes correctly;
  • two (2) users predicted more than 20% of scores correctly.

Distribution of Outcome Prediction Accuracy by User

Looking at the distribution of correct outcome % and correct score % by user, we see a small group of users who are really bad at predicting outcomes (fewer than 45% of outcomes correct), and a small group who lead the pack at predicting scores (it seems very difficult to predict more than 15% of scores correctly).

Average Outcome Prediction Accuracy by Stage

But is an outcome prediction accuracy of 53% actually good? It’s hard to tell when you’re looking at the results overall because there are three potential outcomes during the Group Stage (win for either team, or tie), and two potential outcomes during the Knockout Stage (win for either team).

Breaking prediction accuracy down by stage, we can see that Torneo users predicted 51% of match outcomes correctly during the Group Stage and 64% during the Knockout Stage — compared to 33.3% and 50% for random guessing, respectively.

And what about a score prediction accuracy of 9.3%? Assuming that team scores can range from 0 to 4 (leaving out the occasional 5s and 6s), a random guess would have a 4% chance of being correct.
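That 4% baseline falls out of simple counting — with both teams’ scores drawn uniformly from 0 to 4, there are 25 equally likely scorelines:

```python
# 5 possible goal counts per team (0-4) -> 5 * 5 = 25 equally likely
# scorelines, so a uniform random guess is exactly right 1 in 25 times.
scorelines = [(home, away) for home in range(5) for away in range(5)]
random_score_chance = 1 / len(scorelines)  # 0.04, i.e. 4%
```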

So, Torneo users were much better than random at predicting World Cup match outcomes — which shouldn’t be a surprise, considering the slew of information available on the internet and the likelihood that a good portion of participants are either somewhat knowledgeable about soccer or have friends or family who were able to help.

Which Matches Were the Most Difficult to Predict?

Most of the unexpected match outcomes came in the Group Stage, where some of the favorites flopped completely:

  • Brazil tied with Switzerland in the Group Stage
  • Germany failed to make it out of the Group Stage as a result of losing to Korea Republic and Mexico
  • Spain tied with Morocco in the Group Stage, then were knocked out by Russia in the Round of 16
  • Argentina lost to Croatia, and tied with Iceland in the Group Stage
  • Portugal tied with Iran in the Group Stage

Average Outcome Prediction Accuracy by Match — 10 Most Unexpected Outcomes

As for scores, we can get a sense of which matches had scores that were difficult to predict by averaging the absolute difference between user score predictions and actual match scores.
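A sketch of that difficulty metric — the exact formula isn’t spelled out here, so this assumes summing the home and away errors per prediction before averaging:

```python
# Difficulty of a match's score: mean absolute difference between each
# user's predicted scoreline and the actual one. Higher = harder.

def score_difficulty(predictions, actual):
    """predictions: list of (home, away) guesses; actual: (home, away)."""
    total = sum(abs(ph - actual[0]) + abs(pa - actual[1])
                for ph, pa in predictions)
    return total / len(predictions)
```

A 6–1 blowout that everyone pegged as a narrow win scores high on this metric even when most users got the outcome right.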

Among the 10 most difficult scores to predict, we see some of the same matches as in the “unexpected outcomes” analysis above, as well as matches with scores on the extreme high end:

  • England’s 6 goals against Panama
  • Belgium’s 5 goals against Tunisia
  • Russia’s 5-goal World Cup opener against Saudi Arabia

Total Score Prediction Difference — 10 Most Unexpected Scores

Although Torneo users’ predictions were, on average, better than random, they were pretty bad at forecasting extreme events — such as favorite teams losing matches they really should have won, and unusually high scores.

Collective Consensus, and Dissent

What if we let the majority vote represent the opinion of all users, for each match? For example, if 98% of predictions were for Belgium winning in the Belgium vs. Japan match, then the group prediction would be for Belgium.

Looking at collective opinion this way, consensus from Torneo users would have predicted 42 (66%) of the 64 matches correctly.

But how much disagreement, or dissent, was there for each match? We can get an idea of group consensus by dividing the % of predictions for the majority opinion by the random baseline — 33.3% for the Group Stage (three possible outcomes) and 50% for the Knockout Stage (two possible outcomes). We can call this the “Consensus Index” — a higher index indicates stronger consensus.
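As a sketch, assuming the index is computed per match from the list of predicted outcomes:

```python
# "Consensus Index": share of predictions backing the majority outcome,
# divided by the random-guess baseline for the stage.
from collections import Counter

def consensus_index(outcomes, knockout=False):
    """outcomes: predicted outcomes for one match, e.g. ["home", "draw"]."""
    majority_count = Counter(outcomes).most_common(1)[0][1]
    majority_share = majority_count / len(outcomes)
    baseline = 0.5 if knockout else 1 / 3
    return majority_share / baseline
```

An index of 1.0 means the majority is no larger than chance would produce; 3.0 (Group Stage) or 2.0 (Knockout Stage) means unanimity.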

Matches with the Lowest, and Highest Consensus Index — top and bottom 5

The matches that generated the most debate tend to be between two lesser-known teams, while the matches with the highest consensus tend to be those where one team is the obvious favorite — which all makes sense.

The better question is — what is the relationship between group consensus and prediction accuracy? Looking at the median Consensus Index for correct and incorrect group predictions, it seems like there is slightly more consensus for the correct group predictions, but not much more.

Applying the same idea to score predictions, what if we averaged the scores of all predictions for each match and compared that “crowd opinion” to actual scores? The result: five (5) correct, or 7.8% of the 64 match scores — actually worse than the 9.3% accuracy of individual predictions mentioned above.
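The article doesn’t specify how the averaged crowd scores were turned into whole-goal predictions; one plausible sketch rounds each team’s mean predicted score to the nearest goal:

```python
# Crowd score for one match: round the mean predicted goals for each team.
# Note: Python's round() uses banker's rounding at .5 ties.

def crowd_score(predictions):
    """predictions: list of (home, away) guesses -> (home, away) crowd score."""
    n = len(predictions)
    avg_home = sum(home for home, _ in predictions) / n
    avg_away = sum(away for _, away in predictions) / n
    return round(avg_home), round(avg_away)
```

Averaging pulls the crowd toward mid-range scorelines, which is one plausible reason this method does worse on extreme results than individual guesses do.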

How good is the average person at predicting outcomes for the World Cup? Better than random. Collective opinion seemed to do even better, at least for predicting match outcomes. Yet most people failed at predicting some of the “unlikely” outcomes and extreme high scores in this tournament, with three of the favorite teams not even making it to the Quarter-final.

When France beat Croatia in the final, everything seemed right again in the world of probabilities, yet it was almost gut-wrenching for those cheering for the underdog (myself included) to watch a small country’s fairy tale get snuffed out in 90-ish minutes. But in this probabilistic world, and in a game so heavily rooted in randomness, we can imagine that if that own goal hadn’t happened, if the VAR ruling had gone the other way, or if Croatia’s goalkeeper had been in better shape — perhaps it could have been a different team celebrating their moment in history.
