Scraping xG Data for Almost Any League in the World

It’s all about knowing where to search and not about writing super complex crawlers.

Sergi Lehkyi
Geek Culture

Photo by Steven Wright on Unsplash

I recently found a really nice football dataset, so it is time to switch off from work for a bit and have some fun with the numbers.

The xG (expected goals) metric is no longer anything new in the world of football. You can find it on almost every football web portal, but normally only for the top championships, and it is hard to get all of the data in one place to build a dataset and run your own analysis on it.

After some research I found a nice database with xG data for 40 different competitions, starting from the 2016 season (though not every league has data going back that far). The dataset is free and available to anyone. All you have to do is download it.
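To give an idea of how little code "just download it" can mean, here is a minimal sketch using only the Python standard library. It assumes the data comes as a single CSV file with one row per match and columns named `league`, `team1`, `team2`, `xg1` and `xg2`. Those column names, the helper names and the sample rows are my assumptions for illustration, so check them against the actual file before relying on them.

```python
import csv
import io
import urllib.request
from collections import defaultdict


def download_csv(url: str) -> str:
    """Fetch a CSV file over HTTP and return it as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_matches(csv_text: str) -> list[dict]:
    """Parse match rows, keeping only those where both xG values are present."""
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed column names: league, team1, team2, xg1, xg2.
        if not row.get("xg1") or not row.get("xg2"):
            continue  # skip rows with no xG recorded
        matches.append({
            "league": row["league"],
            "team1": row["team1"],
            "team2": row["team2"],
            "xg1": float(row["xg1"]),
            "xg2": float(row["xg2"]),
        })
    return matches


def average_xg_per_match(matches: list[dict]) -> dict[str, float]:
    """Average total xG (both teams combined) per match, grouped by league."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for m in matches:
        totals[m["league"]] += m["xg1"] + m["xg2"]
        counts[m["league"]] += 1
    return {league: totals[league] / counts[league] for league in totals}


# Tiny made-up sample in the assumed layout, so the sketch runs offline.
SAMPLE = """league,team1,team2,xg1,xg2
League A,Team 1,Team 2,1.5,0.5
League A,Team 3,Team 4,2.0,1.0
League B,Team 5,Team 6,,
"""

if __name__ == "__main__":
    print(average_xg_per_match(parse_matches(SAMPLE)))
```

With the real file you would replace `SAMPLE` with `download_csv(url)`, where `url` is the download link from the dataset page. Rows without xG values are skipped rather than treated as zeros, so leagues with partial coverage do not drag the averages down.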

The dataset is prepared and maintained by FiveThirtyEight, a website that collects a lot of data from many different sources across many different fields. Luckily for us, they also collect football data (or, as they call it, soccer… oh, those Americans…) and use it to predict match outcomes. A friend of mine in the betting business says those predictions are quite useless. But anyway, they developed their own SPI rating and use it to rank different…

Sergi Lehkyi
Geek Culture

Data and Cloud Developer, love technology in general, maybe too much humor and never too serious, based in amazing Barcelona