Exploratory Data Analysis in Sports Analytics (Part-2)

Making more informed, “better” decisions

Yavuz Selim Sefunc
Analytics Vidhya
4 min read · Dec 30, 2020

Photo by Travel Nomades on Unsplash

In Part 1, we preprocessed the football dataset. In this part, we perform exploratory data analysis. The dataset contains 79 explanatory variables covering a wide range of betting attributes. A GIF of the dataset is shown below.

Existing Dataset (1)

We will analyze Liverpool, the 2019–2020 Premier League champions. I will aggregate the dataset in several ways to present it in a summarized format. Some of these aggregations are:

  • Time aggregation / Time Slot aggregation
  • Day aggregation
  • Month aggregation
  • Day of Week / Weekday & Weekend aggregation
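The features behind these aggregations can be derived from a single kick-off timestamp. Below is a minimal sketch, assuming a pandas DataFrame with a datetime column that I call `kickoff` here. The column names and the sample rows are hypothetical and are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset's column names are not shown in this article.
matches = pd.DataFrame({
    "kickoff": pd.to_datetime([
        "2019-08-09 20:00",  # a Friday
        "2019-12-26 20:00",  # a Thursday
        "2020-01-11 17:30",  # a Saturday
        "2020-07-22 20:15",  # a Wednesday
    ]),
})

# Calendar features used by the aggregations listed above.
matches["hour"] = matches["kickoff"].dt.hour
matches["day"] = matches["kickoff"].dt.day              # day of the month (1-31)
matches["month"] = matches["kickoff"].dt.month          # 1 = January
matches["day_of_week"] = matches["kickoff"].dt.day_name()
matches["is_weekend"] = matches["kickoff"].dt.dayofweek >= 5  # Saturday = 5, Sunday = 6

print(matches)
```

Each new column can then be used as a grouping key for one of the aggregations.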

1- Plotting Part

Plot Code (2)

I am using Plotly with Python. The code above illustrates how line plots can be made. I will compare the over/under 2.5 goals strategy across all aggregations.

Sample aggregation function (3)
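An aggregation function of this kind could look roughly like the sketch below. It assumes a column holding the total goals per match, which I call `total_goals` here; the function name, column names, and sample data are all hypothetical.

```python
import pandas as pd

def over_under_share(df, by, goals_col="total_goals", line=2.5):
    """Per group: number of matches and the share finishing over/under the goal line."""
    out = (
        df.assign(over=df[goals_col] > line)   # True when the match went over the line
          .groupby(by)["over"]
          .agg(matches="count", over_share="mean")  # mean of booleans = share of True
          .reset_index()
    )
    out["under_share"] = 1 - out["over_share"]
    return out

# Hypothetical sample data, not taken from the real dataset.
sample = pd.DataFrame({
    "month": [8, 8, 12, 12, 12],
    "total_goals": [5, 2, 1, 4, 0],
})
summary = over_under_share(sample, by="month")
print(summary)
```

The same function can be reused for every aggregation by changing the `by` argument, for example to a time slot or day-of-week column.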

2- Time / Time Slot aggregation

The first aggregation uses the “Time / Time Slot” features. The time of day at which a match is played appears to affect how it turns out. The first plot shows which kick-off time, 15:00 or 20:15, is more likely to end with over 2.5 goals. I also categorize kick-off times into morning, afternoon, and night. The second plot gives more intuition about how kick-off time affects the number of goals in a match.
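The time slot categorization can be sketched with `pd.cut`. The exact cut-off hours are not stated in this article, so the bin edges below (before 12:00 is morning, 12:00 to 17:59 is afternoon, 18:00 onward is night) are an assumption, and the sample hours are hypothetical.

```python
import pandas as pd

# Hypothetical kick-off hours (24-hour clock).
kickoff_hours = pd.Series([11, 12, 15, 20], name="hour")

# Assumed bin edges: [0, 12) morning, [12, 18) afternoon, [18, 24) night.
time_slot = pd.cut(
    kickoff_hours,
    bins=[0, 12, 18, 24],
    right=False,  # intervals are closed on the left, so 12 falls in "afternoon"
    labels=["morning", "afternoon", "night"],
)
print(time_slot.tolist())
```

Changing the `bins` list is enough to try different definitions of morning, afternoon, and night.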

Time aggregation result (4)
Time Plot (5)
Time Slot Plot (6)

3- Day aggregation

The second aggregation uses the “Day” attribute. The graph suggests that matches played at the end of the month are more likely to finish over 2.5 goals. Overall, the pattern on the other days is roughly mirror-symmetric.

Day Plot (7)

4- Month aggregation

The third aggregation uses the “Month” column. The line graph shows that matches in the winter months are more likely to finish under 2.5 goals. In contrast to the figures for the other bet types, matches in the summer months have more goals.

Month Plot (8)

5- Day of Week / Weekday & Weekend aggregation

The final aggregation uses the “Day of Week / Weekday & Weekend” features. The day-of-week plot indicates that matches on Wednesday and Saturday are more likely to finish over 2.5 goals. On the other days, the pattern is broadly similar.

Day of Week Plot (9)
Weekday / Weekend Plot (10)

Summary

In this part, we did some exploratory data analysis on the football dataset to discover patterns and trends in specific bet features. All the code is available in my GitHub account. Thank you for reading. Please let me know if you have any feedback. For Part 1, follow the link below.
