Hey There, Neighbor!

The semi-finals are upon us as we continue our analysis of the World Cup using Google Cloud with a preview of France and Belgium. The two nations share a 620-kilometer border, and one can only imagine the fever pitch this match has created on both sides of it.

On a historical note, although this will mark the 75th time the two neighbors have played one another, it is only the third time they have met in World Cup action, with France winning the two previous matches in 1938 and 1986. However, Belgium defeated France most recently, 4–3 in a 2015 friendly.

Regardless of what happens between England and Croatia in the other semi-final, plenty of pundits feel the winner of this match between Les Bleus and the Red Devils will at the very least be the odds-on favorite to become 2018 World Cup champion.

France advanced after eliminating Uruguay 2–0 in the quarters, a result called beforehand not only by our model but also by Newton the Parrott. Meanwhile, if Belgium didn’t already have the world’s attention prior to the quarter-final match with Brazil, they certainly do now after a marvelous 2–1 win to eliminate the five-time champions. This is only the second time Belgium has reached the semi-finals, with the only other appearance coming in 1986. France has reached the semi-finals for the first time since 2006 and for the sixth time overall.

From a player availability standpoint, Belgium will be without right wing-back Thomas Meunier, who is serving a yellow card suspension for this match. This is a key loss for Belgium and a gain for France, who will have nearly a full squad available (the exception is Djibril Sidibe, who is questionable with an ankle injury).

No matter which way you slice it, this is a tough one to call, even for the models. France is the deeper team — and arguably the more talented of the two. Belgium is stubborn — and arguably the tougher of the two teams.

Predictions

Thus far through the tournament, we’ve shared the results of one particular type of model: one with a lot of features based on historical team data, supplemented with player stats. For this game, we decided to explore a few different predictive models: the aforementioned mix of team and player stats, a model using strictly team stats, a model using strictly player stats, and a projection based on ratings from simple box scores, very much like the classic Elo rating.

All of these models favor France, but none gives them better than a 56% chance against their north-eastern neighbors.

  • Our original model gives the French a 54.8% chance (1.73 xG) of advancing, meaning Belgium come in at 45.2% (1.67 xG).
  • Our model using strictly team-related features gives France a slightly bigger edge, at 55.3% (1.66 xG) and Belgium at 44.7% (1.19 xG).
  • Our Elo projection goes a few points closer, giving France a 52.6% chance and the Belgians a 47.4% chance.
  • And finally, our player-based model, which assumes the starting XIs listed below and will be updated when we have final team sheets, has this game at almost a coin flip: France 51.0% (2.43 xG), Belgium 49.0% (2.28 xG).

Assumed French XI: Hugo Lloris, Benjamin Pavard, Raphael Varane, Samuel Umtiti, Lucas Hernández, Paul Pogba, N’Golo Kanté, Blaise Matuidi, Antoine Griezmann, Kylian Mbappé, Olivier Giroud

Assumed Belgian XI: Thibaut Courtois, Toby Alderweireld, Vincent Kompany, Jan Vertonghen, Yannick Carrasco, Axel Witsel, Marouane Fellaini, Nacer Chadli, Kevin De Bruyne, Eden Hazard, Romelu Lukaku
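
The Elo-style projection can be made concrete with the classic Elo expected-score formula. The sketch below is an assumption about the general shape of such a rating, not the model behind the numbers above: the function names, the 400-point scale, and the baseline rating of 1500 are illustrative choices. It shows how a rating gap maps to a win probability, and what gap a 52.6% chance would imply under that formula.

```python
import math


def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Classic Elo expected score for team A against team B (between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def rating_gap_for_probability(p: float, scale: float = 400.0) -> float:
    """Invert the formula: the rating gap (A minus B) implied by win probability p."""
    return scale * math.log10(p / (1.0 - p))


# Hypothetical illustration: the gap that would produce France's 52.6% chance.
gap = rating_gap_for_probability(0.526)
print(f"Implied rating gap: {gap:.1f} points")

# Round trip: feeding the gap back in recovers the probability.
# The 1500 baseline is arbitrary; only the difference matters.
print(f"Recovered probability: {elo_expected_score(1500 + gap, 1500):.3f}")
```

Under these assumptions, a 52.6% chance corresponds to a gap of only about 18 rating points, which is another way of seeing how closely the two sides are rated.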

UPDATED NUMBERS

Mousa Dembele is in for Yannick Carrasco.

  • Our player-based model now has France at a 50.6% win probability (2.38 xG) and Belgium at 49.4% (2.33 xG).
  • Our combined model shows France at 54.3% (1.71 xG) and Belgium at 45.7% (1.68 xG).

Understanding Performance

Now a bit of background on the historical performance of these models. All of the non-Elo projections are built with scikit-learn on a dataset of 20,491 games, a set that contains no ties. This set is randomly split into training and testing sets, with 4,098 games (about 20%) set aside for testing.

  • Our original model correctly picked 69% of our 4,098 test games
  • A model using strictly team-based features correctly picked 67%
  • Our player-based model correctly picked 63%
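
To put those percentages in context, here is a minimal sketch of a random hold-out split of the same size, plus a rough estimate of how much an accuracy measured on 4,098 games can wobble by chance. It uses only the Python standard library so that it runs without dependencies; it is not the pipeline used for the models above. The seed, the variable names, and the simple binomial approximation are all assumptions made for illustration.

```python
import math
import random

N_GAMES = 20_491  # dataset size quoted in the text
N_TEST = 4_098    # held-out games quoted in the text

# Random split by shuffling game indices; the seed is an arbitrary choice.
indices = list(range(N_GAMES))
random.Random(42).shuffle(indices)
test_idx, train_idx = indices[:N_TEST], indices[N_TEST:]
print(f"train: {len(train_idx)} games, test: {len(test_idx)} games")


def accuracy_standard_error(accuracy: float, n: int) -> float:
    """Binomial standard error of an accuracy measured on n independent games."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)


# Accuracies as reported in the list above.
reported = {"original (team + player)": 0.69, "team only": 0.67, "player only": 0.63}
for name, acc in reported.items():
    margin = 1.96 * accuracy_standard_error(acc, N_TEST)  # approx. 95% interval
    print(f"{name}: {acc:.0%} +/- {margin:.1%}")
```

With a test set this large the margin comes out at roughly one and a half percentage points per model, so the gap between 69% and 63% is unlikely to be noise, while 69% versus 67% is a closer call. Because all three models are scored on the same games, a paired comparison would be the more careful check. In scikit-learn, passing an integer such as `test_size=4098` to `train_test_split` requests a hold-out set of exactly that many samples.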

The takeaway here is that on historical data our original model, which blends team and player features, has proven to be the most accurate. However, before Sunday’s final we’ll take a deeper dive into how these models have differed during this World Cup and explore which has been the most accurate at predicting current games.

538 also expects an extremely tight match — with a very slight lean toward France

Bing gives France a tiny bit more cheese

Google Search expects a close encounter — with extra time certainly possible — and a slight edge for France

Enjoy the match!
