Accurately Predicting Football with Python & SQL
And betting on the outcomes
“I’ve got a hunch that Chelsea is going to win” isn’t a very good argument for why you should place a bet.
“I’m using an algorithm that is proven to accurately predict outcomes better than the bookmakers” is much better.
To predict these outcomes, my co-developer (Estèphe Corlin) and I built a data warehouse over the course of the 2021/22 football season. It stores bookies’ odds, our own odds, player stats, fixture outcomes and more. Below is the statement that creates our “HomeOdds” table in the warehouse:
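CREATE TABLE Football.HomeOdds(
    FixtureId INT
  , TeamId INT
  , Market VARCHAR(50)
  , Odd DECIMAL(4,2)                 -- two decimal places, so values up to 99.99
  , DataTimeStamp DATETIME NOT NULL
    -- Composite key: at most one row per fixture, team and market
  , CONSTRAINT PK_BookieOdds PRIMARY KEY (FixtureId, TeamId, Market)
);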
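To show what the composite primary key means in practice, here is a minimal Python sketch that writes to a table of the same shape. It is not our production loader: it uses an in-memory SQLite database as a stand-in for the warehouse (so there is no "Football" schema), and the function name `save_home_odd`, the fixture and team ids and the market label are all made up for illustration.

```python
import sqlite3
from datetime import datetime, timezone

# SQLite stands in for the real warehouse; it has no "Football" schema,
# so the table is simply called HomeOdds here.
DDL = """
CREATE TABLE HomeOdds(
    FixtureId INT NOT NULL
  , TeamId INT NOT NULL
  , Market VARCHAR(50) NOT NULL
  , Odd DECIMAL(4,2)
  , DataTimeStamp DATETIME NOT NULL
  , CONSTRAINT PK_BookieOdds PRIMARY KEY (FixtureId, TeamId, Market)
)
"""

# INSERT OR REPLACE overwrites any existing row with the same primary key.
UPSERT = """
INSERT OR REPLACE INTO HomeOdds (FixtureId, TeamId, Market, Odd, DataTimeStamp)
VALUES (?, ?, ?, ?, ?)
"""


def save_home_odd(conn, fixture_id, team_id, market, odd):
    """Store an odd, replacing the row for the same fixture, team and market."""
    stamp = datetime.now(timezone.utc).isoformat()
    conn.execute(UPSERT, (fixture_id, team_id, market, odd, stamp))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(DDL)

# Hypothetical ids and market name, purely for illustration.
save_home_odd(conn, 1001, 49, "Match Winner", 1.85)
save_home_odd(conn, 1001, 49, "Match Winner", 1.90)  # same key: row is replaced

rows = conn.execute(
    "SELECT FixtureId, TeamId, Market, Odd FROM HomeOdds"
).fetchall()
print(rows)
```

Because the key is (FixtureId, TeamId, Market), saving the same fixture, team and market twice leaves a single row holding the latest odd rather than a duplicate.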
The data warehouse is a fully managed relational database that contains five seasons of data. It is populated through a mixture of backfills, web scraping and our own algorithmic calculations of predicted outcomes.
These calculations are based on a team’s xG (expected goals), its current form, its opponent’s ability to defend and whether it is playing home or away. More details of this calculation can be found in the article below.
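To give a feel for how those four inputs could be combined, here is a rough Python sketch. It is not the exact formula we use: it assumes the inputs are simply multiplied together into an expected goal count and that goals follow independent Poisson distributions, which is a common textbook simplification. The function names, the `home_boost` value and every number below are illustrative assumptions.

```python
import math


def expected_goals(team_xg, form, opponent_defence, is_home, home_boost=1.1):
    """Scale a team's xG by form, opponent defence and home advantage.

    form and opponent_defence are multipliers around 1.0 (hypothetical
    scale): above 1.0 means better form or a weaker opposing defence.
    """
    venue = home_boost if is_home else 1.0
    return team_xg * form * opponent_defence * venue


def poisson_pmf(k, lam):
    """Probability of exactly k goals when lam goals are expected."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def match_probabilities(home_lambda, away_lambda, max_goals=10):
    """Sum scoreline probabilities into home win, draw and away win.

    Scorelines above max_goals are ignored, so the three values add up
    to very slightly less than 1.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_lambda) * poisson_pmf(a, away_lambda)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


def fair_decimal_odd(probability):
    """Decimal odd with no bookmaker margin for a given probability."""
    return 1.0 / probability


# Made-up inputs for a single fixture.
home_lambda = expected_goals(1.6, form=1.1, opponent_defence=1.05, is_home=True)
away_lambda = expected_goals(1.2, form=0.9, opponent_defence=0.95, is_home=False)

p_home, p_draw, p_away = match_probabilities(home_lambda, away_lambda)
print(f"home {p_home:.3f}, draw {p_draw:.3f}, away {p_away:.3f}")
print(f"fair home odd: {fair_decimal_odd(p_home):.2f}")
```

Comparing an odd produced this way against the bookies’ odd for the same fixture and market is the kind of check the warehouse is designed to support.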