Scraping xG Data for Almost Any League in the World
It’s all about knowing where to look, not about writing super complex crawlers.
I found a really nice football dataset, so it is time to switch off from work for a while and have some fun with the numbers.
The xG (expected goals) metric is no longer anything new in the world of football. You can find it on almost every football web portal, but normally only for the top championships, which makes it hard to gather enough data to build a dataset and run your own analysis on it.
After some research, I found a nice database with xG data for 40 different competitions, starting from the 2016 season (though not every league has data going back that far). The dataset is free and available to anyone. All you have to do is download it.
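Once the file is downloaded, a first look at the numbers needs very little code. The snippet below is only a minimal sketch: it assumes the download is a CSV with one row per match, and the column names (`league`, `team1`, `team2`, `xg1`, `xg2`) as well as the sample rows are made up for illustration, so check them against the real file before relying on them.

```python
import csv
import io
from collections import defaultdict

# Made-up rows for illustration only; column names are assumptions,
# not confirmed against the real download.
SAMPLE_CSV = """season,league,team1,team2,xg1,xg2
2016,League A,Team 1,Team 2,1.50,0.50
2016,League A,Team 3,Team 4,,
2017,League B,Team 5,Team 6,2.00,1.00
"""


def mean_total_xg_by_league(csv_file):
    """Return {league: mean combined xG per match}, skipping rows without xG."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        # Older seasons or smaller leagues may have empty xG cells.
        if not row["xg1"] or not row["xg2"]:
            continue
        totals[row["league"]] += float(row["xg1"]) + float(row["xg2"])
        counts[row["league"]] += 1
    return {league: totals[league] / counts[league] for league in totals}


result = mean_total_xg_by_league(io.StringIO(SAMPLE_CSV))
print(result)
```

To run it on the real download, pass an opened file instead of the in-memory sample, for example `open("matches.csv", newline="")`, where `matches.csv` is a placeholder for whatever name you saved the file under.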
The dataset is prepared and maintained by the FiveThirtyEight website, which collects a lot of different data from many sources and across many fields. Luckily for us, they also collect football data (or, as they call it, soccer… oh, those Americans…) and use it to predict match outcomes. A friend of mine in the betting business says those predictions are quite useless. But anyway, they developed their own SPI rating and use it to rank different…