Nitty-gritty insights on the FIFA World Cup

Yudhisteer Chintaram
3 min read · Sep 11, 2022


With the 2022 FIFA World Cup less than three months away, I decided to look at past World Cup data for insights into who might win this year’s tournament.

After some data cleaning and preprocessing, I draw a ridge plot of the number of goals scored over the years for a few selected countries. Brazil, Germany, France, and Uruguay show heavy right tails, which suggests they have had more high-scoring tournaments than the other countries.

With boxplots, we observe that the mean number of goals scored per country (white dot) seems higher for Brazil, Argentina, and Uruguay, whose means are pulled to the right by a few outliers.

The USA and England seem to score the fewest goals of the selected countries.
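To make the "mean pulled to the right by outliers" reading concrete, here is a minimal sketch using only Python's standard library. The goal counts are made-up values for a hypothetical country, not figures from the dataset, and the 1.5 × IQR rule is the usual boxplot convention rather than something the plots above are confirmed to use.

```python
import statistics

def summarize(goals):
    """Return the mean, median, and 1.5*IQR outliers of a list of goal counts."""
    # Quartiles as a boxplot would compute them (default 'exclusive' method).
    q1, _, q3 = statistics.quantiles(goals, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(goals),
        "median": statistics.median(goals),
        "outliers": [g for g in goals if g < low or g > high],
    }

# Hypothetical goals per tournament for one country (made-up values).
goals_per_tournament = [4, 5, 6, 5, 7, 22]
summary = summarize(goals_per_tournament)
print(summary)
```

In this toy sample the single 22-goal tournament is flagged as an outlier and lifts the mean well above the median, which is the same effect the white dots show for Brazil, Argentina, and Uruguay.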

Next, we plot a stacked area chart of the total goals scored by all countries from 1930 to 2014. First, we observe a gap at 1942 and 1946 where the area is zero: the World Cups planned for those years were canceled because of the Second World War.

Second, the total number of goals scored by all countries clearly trends upward. Note that 32 teams qualified for the 2022 World Cup, but the number of teams has not been constant across tournaments, so part of the increase may simply reflect more teams and more matches.

After WWII, the total number of goals rises sharply. It falls off by 1970, but the overall trend is a gradual increase.
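Because the field size changes between tournaments, one way to check the trend is to look at goals per match alongside the raw total. The sketch below is only an illustration: the tournament labels `T1` and `T2` and all scores are hypothetical, and `goals_by_tournament` is a helper name introduced here, not part of the original analysis.

```python
from collections import defaultdict

def goals_by_tournament(matches):
    """Aggregate total goals, match count, and goals per match per tournament."""
    totals = defaultdict(lambda: {"goals": 0, "matches": 0})
    for tournament, home_goals, away_goals in matches:
        totals[tournament]["goals"] += home_goals + away_goals
        totals[tournament]["matches"] += 1
    return {
        tournament: {
            "goals": t["goals"],
            "matches": t["matches"],
            "per_match": t["goals"] / t["matches"],
        }
        for tournament, t in totals.items()
    }

# Hypothetical matches: (tournament, home goals, away goals). Not real results.
matches = [
    ("T1", 3, 1), ("T1", 2, 0),
    ("T2", 1, 1), ("T2", 2, 1), ("T2", 0, 0), ("T2", 3, 0),
]
stats = goals_by_tournament(matches)
for tournament, row in stats.items():
    print(tournament, row)
```

In this toy data the larger tournament has more goals in total but fewer goals per match, so a rising total on its own does not show that teams are scoring more freely.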

We also want to gauge a country’s strength by the maximum number of World Cup goals scored by its individual players. I draw a strip plot for a few selected countries.

Germany is in first place, with Miroslav Klose on 16 goals. Brazil is second with Ronaldo’s 15 goals, and Germany appears again in third with Gerd Müller’s 14.

Note how the distribution of points for France, Peru, England, and Spain differs from the rest. The top scorers for these countries look like outliers. It is as if, once in a lifetime, a gifted player is born who breaks all records by a significant margin.

Germany and Brazil seem to have the best scorers over the years.
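The ranking read off the strip plot can be sketched as a small aggregation: find each country's top scorer and measure how far ahead of the runner-up they are. The three named rows use the goal counts quoted above. The "Country X" rows are hypothetical placeholders added to illustrate a lone outlier, and `top_scorers` is a helper name introduced for this example.

```python
from collections import defaultdict

# (country, player, World Cup goals)
records = [
    ("Germany", "Miroslav Klose", 16),   # from the article
    ("Germany", "Gerd Müller", 14),      # from the article
    ("Brazil", "Ronaldo", 15),           # from the article
    ("Country X", "Player X1", 13),      # hypothetical
    ("Country X", "Player X2", 5),       # hypothetical
    ("Country X", "Player X3", 4),       # hypothetical
]

def top_scorers(records):
    """Return (country, best player, goals, gap to runner-up), best first."""
    by_country = defaultdict(list)
    for country, player, goals in records:
        by_country[country].append((goals, player))

    summary = []
    for country, scorers in by_country.items():
        scorers.sort(reverse=True)
        best_goals, best_player = scorers[0]
        # The gap is undefined when only one scorer is listed for a country.
        gap = best_goals - scorers[1][0] if len(scorers) > 1 else None
        summary.append((country, best_player, best_goals, gap))

    summary.sort(key=lambda row: row[2], reverse=True)
    return summary

ranking = top_scorers(records)
for row in ranking:
    print(row)
```

A large gap, as for the hypothetical Country X, is the pattern described above for France, Peru, England, and Spain: one player far ahead of everyone else from the same country.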

This is #day1 of my #100dataviz project on data science and storytelling with data. It was inspired by Hannah Yan Han, whose articles have been a great source of inspiration. I welcome feedback of any kind, as well as ideas for topics you would like me to explore. Thank you for reading!
