This is an interactive dashboard. You can click to play with it.

World Cup visualized: The most valuable players

Check out the players who touched the ball most often with this interactive dashboard. Data from Opta Sports, analyzed with BigQuery, visualized with Data Studio.

Felipe Hoffa
Jun 29, 2018 · 3 min read

With the Opta Sports fútbol dataset we can find out each time a player touched the ball — and where in the field they where at that time.

This dataset is not available publicly, but you can license it too by contacting Opta. Check out Eric Schmidt’s Analyzing the World Cup using Google Cloud series of posts for even more info on this.

With a quick query we can find out who are the players that touched most the ball during these last few days:

The top players by # of touches in 2018 (at the screenshot time)

We can also jump back to 2014:

Or just looking at the top attackers:

The top players by # of touches in the top third of the field. 2014 and 2018.

Defenders:

The top players by # of touches in the bottom third of the field. 2014 and 2018.

And the ones that build the game on the middle field:

The top players by # of touches in the middle third of the field. 2014 and 2018.

Or you can just check out your favorite team:

The top players by # of touches in the top third of the field. Chile. Only 2014.

How I Built this Visualization

My query:

  • Count the number of times each player touched the ball.
  • Check their position (0–100) and assign it one of the 3 thirds of the field.
#standardSQL
WITH team_names AS (
SELECT team_id, REGEXP_REPLACE(MIN(name), r'C..te', 'Cote') name
FROM (
SELECT away_team_id team_id, away_team_name name
FROM `cloude-sandbox.galacticos.games`
WHERE competition_id = 4
UNION ALL
SELECT home_team_id team_id, home_team_name name
FROM `cloude-sandbox.galacticos.games`
WHERE competition_id = 4
)
GROUP BY 1
)
, player_touches AS (
SELECT COUNT(*) touches
, (SELECT name FROM team_names WHERE team_id=a.team_id) team
, LEAST(FLOOR(x/33.3333),2.0) x_group
, (SELECT CONCAT(MAX(name)) FROM `cloude-sandbox.galacticos.sqauds` WHERE player_id = CONCAT('p', CAST(a.player_id AS STRING))) player
, EXTRACT(YEAR FROM event_timestamp) year
FROM `cloude-sandbox.galacticos.events` a
WHERE competition_id = 4
AND x>0 AND y>0
AND EXTRACT( YEAR FROM event_timestamp) IN (2014,2018)
GROUP BY team, x_group, player, year
)
SELECT SUM(touches) touches, team, player
, SUM(IF(x_group=1, touches,0)) middlefield
, SUM(IF(x_group=0, touches,0)) defense
, SUM(IF(x_group=2, touches,0)) attack
, year
FROM player_touches
WHERE NOT player IS null
GROUP BY team, player, year
ORDER BY 1 DESC

Next steps

Check out more Data Studio news at /r/GoogleDataStudio/, maintained by Minhaz Kazi.

Want more stories? Check my Medium, follow me on twitter, and subscribe to reddit.com/r/bigquery. And try BigQuery — every month you get a full terabyte of analysis for free.

Felipe Hoffa

Written by

Developer Advocate @Google. Originally from Chile, now in San Francisco and around the world. Let’s talk data.