Augmented ELO Rating Models in Football

Sam Iyer-Sequeira

Published in

Football Applied

10 min read · Jul 8, 2024

Note: I strongly recommend reading the following article before this one:

Predictive Modelling Football Games using Python and Excel ⚽️


In the realm of predictive sports analytics, ELO ratings have long served as a fundamental tool for assessing team performance and predicting outcomes. Traditionally, ELO models use a constant k-factor across all matches, which doesn’t fully capture the varying significance of games, especially in high-stakes tournaments like the Champions League and World Cup. To address this limitation, we’ve developed an augmented ELO ratings model that adjusts the k-factor based on the importance of each match stage.

· Augmented EUROS k-factor model
· Augmented World Cup k-factor model
· Comparing New and Old ELO Ratings
· Europe’s Dominance

Augmented EUROS k-factor model

Our approach assigns a different k-factor to each tournament stage: 10 for group-stage matches, 20 for the round of 16, 25 for quarter-finals, 30 for semi-finals, and 40 for the final. Any match without a recognised stage code falls back to a k-factor of 5. Because the k-factor caps how many rating points a single match can move, a result in the final shifts a team's rating up to four times as much as the same result in the group stage, so the model weights performances in the decisive matches more heavily.

import pandas as pd  # data loading and manipulation (the only library the code below uses)

euros = pd.read_csv("/Users/user/Downloads/filename.csv", index_col=0)

# Collect every team that appears in either column, with whitespace stripped
unique_teams = pd.concat([euros['Team1'].str.strip(), euros['Team2'].str.strip()]).unique()

# Print unique teams
print(unique_teams)

# Count how many matches each team has played, as either Team1 or Team2
euros_count = {}
for team in unique_teams:
    euros_count[team] = ((euros['Team1'].str.strip() == team) | (euros['Team2'].str.strip() == team)).sum()

# Turn the counts into a table sorted from most to fewest matches
euros_count_df = pd.DataFrame(list(euros_count.items()), columns=['Team', 'MatchesPlayed'])
euros_count_df = euros_count_df.sort_values(by='MatchesPlayed', ascending=False)

print(euros_count_df)

unique_teams = pd.concat([euros['Team1'], euros['Team2']]).str.strip().unique()
elo_ratings = {team: 1000 for team in unique_teams}

# Load the match data used for the rating update (pandas is already imported above)
matches = pd.read_csv("/Users/user/Downloads/filename.csv", index_col=0)

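# Function to calculate the expected score and Elo change for one team
def calculate_elo_rating(team1_rating, team2_rating, outcome, k_factor):
    # Expected score of team1 against team2 (a value between 0 and 1)
    expected1 = 1 / (1 + 10 ** ((team2_rating - team1_rating) / 400))
    # Rating change: k-factor times (actual score - expected score)
    elo_change = k_factor * (outcome - expected1)
    return expected1, elo_change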

# Function to determine K-factor based on playoff value (tournament stage code)
def determine_k_factor(playoff):
    if playoff == 1:
        return 10  # Group stage
    elif playoff == 2:
        return 20  # Round of 16
    elif playoff == 3:
        return 25  # Quarter-final
    elif playoff == 4:
        return 30  # Semi-final
    elif playoff == 5:
        return 40  # Final
    else:
        return 5  # Default K-factor if playoff value is not in the specified range

# Function to update Elo ratings
def update_elo_ratings(matches, elo_ratings):
    for index, match in matches.iterrows():
        team1 = str(match['Team1']).strip()
        team2 = str(match['Team2']).strip()
        
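        # Normalise the winner's name; anything other than team1 or team2 is treated as a draw
        winner = str(match['Winner']).strip() if pd.notna(match['Winner']) else ''

        # Elo scores must lie between 0 and 1: win = 1, draw = 0.5, loss = 0
        if winner == team1:
            outcome_team1 = 1  # Win for team1
            outcome_team2 = 0  # Loss for team2
        elif winner == team2:
            outcome_team1 = 0  # Loss for team1
            outcome_team2 = 1  # Win for team2
        else:
            outcome_team1 = 0.5  # Draw
            outcome_team2 = 0.5  # Draw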

        # Determine K-factor based on playoff value
        k_factor = determine_k_factor(match['playoff'])

        # Get current Elo ratings
        team1_rating = elo_ratings.get(team1, 1000)
        team2_rating = elo_ratings.get(team2, 1000)

        # Calculate Elo changes and expected outcomes
        expected1, elo_change1 = calculate_elo_rating(team1_rating, team2_rating, outcome_team1, k_factor)
        expected2, elo_change2 = calculate_elo_rating(team2_rating, team1_rating, outcome_team2, k_factor)

        # Update Elo ratings in the dictionary
        elo_ratings[team1] += elo_change1
        elo_ratings[team2] += elo_change2

        # Also update the Elo ratings and expected outcomes in the DataFrame
        matches.at[index, 'team1_rating'] = team1_rating
        matches.at[index, 'team2_rating'] = team2_rating
        matches.at[index, 'team1_newrating'] = elo_ratings[team1]
        matches.at[index, 'team2_newrating'] = elo_ratings[team2]
        matches.at[index, 'team1_expected'] = expected1
        matches.at[index, 'team2_expected'] = expected2
        matches.at[index, 'outcometeam1'] = outcome_team1
        matches.at[index, 'outcometeam2'] = outcome_team2
    
    return elo_ratings

# Extract unique teams
unique_teams = pd.concat([matches['Team1'], matches['Team2']]).astype(str).str.strip().unique()

# Initialize Elo ratings dictionary
elo_ratings = {team: 1000 for team in unique_teams}

# Initialize Elo ratings columns in the matches DataFrame
matches['team1_rating'] = matches['Team1'].astype(str).map(lambda x: elo_ratings.get(x.strip(), 1000)).astype('float64')
matches['team2_rating'] = matches['Team2'].astype(str).map(lambda x: elo_ratings.get(x.strip(), 1000)).astype('float64')
matches['team1_newrating'] = None
matches['team2_newrating'] = None
matches['team1_expected'] = None
matches['team2_expected'] = None
matches['outcometeam1'] = None
matches['outcometeam2'] = None

# Update Elo ratings based on matches data
elo_ratings = update_elo_ratings(matches, elo_ratings)

# Display updated Elo ratings
for team, rating in elo_ratings.items():
    print(f"{team}: {rating}")

# Print the updated DataFrame
print(matches[['id', 'Team1', 'Team2', 'team1_rating', 'team2_rating', 'team1_newrating', 'team2_newrating', 'team1_expected', 'team2_expected', 'outcometeam1', 'outcometeam2']])

# Save the updated DataFrame to a CSV file
output_file = 'euro20.csv'
matches.to_csv(output_file, index=False)

import matplotlib.pyplot as plt
import seaborn as sns

eloeuro_ratings = {
    'Poland' : 1055.6592497237868,
    'Russia' : 1015.082540029809,
    'Germany' : 1179.1503524980567,
    'Netherlands' : 1067.0523829713766,
    'Republic of Ireland' : 993.2949696678498,
    'Spain' : 1285.4389172340748,
    'France' : 1229.06843027853,
    'Ukraine' : 1034.6319061441015,
    'Greece' : 1003.6946280443649,
    'Denmark' : 1076.703968907782,
    'Italy' : 1264.2920667557817,
    'Sweden' : 1023.3393897093288,
    'Portugal' : 1268.5293250997686,
    'Croatia' : 1059.4434058365405,
    'Czech Republic' : 1038.7044912904296,
    'England' : 1263.7997108558698,
    'Albania' : 1007.3048822491611,
    'Wales' : 1081.476200471279,
    'Turkey' : 1055.1799307282065,
    'Belgium' : 1125.8874189209232,
    'Austria' : 1032.9587926998472,
    'Romania' : 1003.022074120873,
    'Iceland' : 1042.7358909479742,
    'Slovakia' : 1027.0930755516622,
    'Northern Ireland' : 996.7390620750681,
    'Hungary' : 1034.578215981436,
    'Switzerland' : 1140.4690022954724,
    'Scotland' : 995.2637226665787,
    'Finland' : 1006.0024814578551,
    'Slovenia' : 1014.4447087764835,
    'Serbia' : 1009.0422498580773,
    'Georgia' : 1014.72265084901,
    'North Macedonia' : 985.1939053026409
}

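# Build a table from the final ratings and sort it from highest to lowest
elo_df = pd.DataFrame(list(eloeuro_ratings.items()), columns=['Team', 'Elo Rating'])
elo_df = elo_df.sort_values(by='Elo Rating', ascending=False)

# Save the ratings table (not the raw match data) to a CSV file
output_file = 'euroelo4.csv'
elo_df.to_csv(output_file, index=False)

# Print DataFrame to ensure correctness
print(elo_df)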

The provided code implements this augmented ELO ratings model to evaluate and update the performance ratings of football teams based on their match outcomes. It begins by importing necessary libraries and loading match data from CSV files. The code identifies unique teams, calculates the number of matches played by each team, and initializes their ELO ratings at 1000.

The core functionality revolves around updating ELO ratings based on match results. The calculate_elo_rating function computes a team's expected score, 1 / (1 + 10^((opponent rating - own rating) / 400)), and the resulting rating change, k × (actual score - expected score). The k-factor is set by the stage of the tournament, so matches in later stages move team ratings more than group-stage matches.
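As a minimal sketch of this arithmetic, the snippet below applies the standard Elo formulas to a single match at each stage. It assumes the conventional score scale (1 for a win, 0.5 for a draw, 0 for a loss). The names `expected_score`, `rating_change`, and `STAGE_K` are illustrative helpers rather than part of the original code; the k-factors are the stage values listed earlier.

```python
# Illustrative sketch: standard Elo arithmetic with stage-dependent k-factors.
# expected_score, rating_change and STAGE_K are hypothetical names for this example.

STAGE_K = {
    "group": 10,
    "round_of_16": 20,
    "quarter_final": 25,
    "semi_final": 30,
    "final": 40,
}


def expected_score(rating, opponent_rating):
    """Expected score (between 0 and 1) for the team rated `rating`."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))


def rating_change(rating, opponent_rating, score, k_factor):
    """Rating change for one result; score is 1 (win), 0.5 (draw) or 0 (loss)."""
    return k_factor * (score - expected_score(rating, opponent_rating))


# Two evenly rated teams: the winner's gain scales directly with the stage.
gains = {stage: rating_change(1000, 1000, 1, k) for stage, k in STAGE_K.items()}
for stage, gain in gains.items():
    print(f"{stage}: winner gains {gain:.1f} points")
```

With equal ratings the winner gains half the k-factor, so a win in the final is worth four times a group-stage win (20 points versus 5).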

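The update_elo_ratings function iterates through each match, retrieves current ratings, determines outcomes (win, loss, draw), computes ELO changes, and updates ratings accordingly. It also writes each team's pre-match rating, post-match rating, and expected score back into the DataFrame. Because ratings are updated match by match in the order the matches appear in the data, each result is judged against the ratings the two teams held at that point.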
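The loop can be sketched on a tiny, made-up set of matches to show how ratings carry over from one match to the next. Everything in this example is hypothetical: the teams "A", "B" and "C", their results, and the names `K_BY_PLAYOFF` and `DEFAULT_K`. It assumes the standard 1 / 0.5 / 0 score scale and uses plain dictionaries instead of a DataFrame.

```python
# Illustrative sketch of the update loop on made-up matches (not tournament data).

K_BY_PLAYOFF = {1: 10, 2: 20, 3: 25, 4: 30, 5: 40}  # stage code -> k-factor
DEFAULT_K = 5  # fallback when the stage code is not recognised

matches = [
    {"Team1": "A", "Team2": "B", "Winner": "A", "playoff": 1},
    {"Team1": "B", "Team2": "C", "Winner": None, "playoff": 1},  # draw
    {"Team1": "A", "Team2": "C", "Winner": "C", "playoff": 5},
]

ratings = {"A": 1000.0, "B": 1000.0, "C": 1000.0}

for match in matches:
    team1, team2 = match["Team1"], match["Team2"]

    # Score for team1: 1 = win, 0.5 = draw, 0 = loss
    if match["Winner"] == team1:
        score1 = 1.0
    elif match["Winner"] == team2:
        score1 = 0.0
    else:
        score1 = 0.5

    k = K_BY_PLAYOFF.get(match["playoff"], DEFAULT_K)
    expected1 = 1 / (1 + 10 ** ((ratings[team2] - ratings[team1]) / 400))
    change = k * (score1 - expected1)

    # Team2's change is the mirror image, so total rating points are conserved
    ratings[team1] += change
    ratings[team2] -= change

for team, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{team}: {rating:.2f}")
```

In this toy run, team C finishes on top despite drawing its first match, because its one win came in a match with a k-factor of 40 while A's win came at a k-factor of 10.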

After processing all matches, the updated ELO ratings for each team are displayed and saved to a new CSV file. These ratings not only quantify team performances but also highlight trends and strengths relative to tournament stages. For example, higher-rated teams like Spain demonstrate consistent success in critical matches, reflecting their ability to perform under pressure and strategically manage game dynamics.

Semi-Finals:

Spain v Portugal

England v Netherlands

Final:

Spain v England

Winner: Spain

Spain emerges as the favorite in the updated Elo rating model because it holds the highest rating of any team (about 1285). The model only sees results, so this rating reflects Spain's consistent success in knockout stages, where the higher k-factors reward wins most heavily. On the pitch, that record is usually credited to Spain's tactical acumen, cohesive teamwork, and possession-based tiki-taka style, which controls the tempo of the game and limits opponents' opportunities.
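One way to read a forecast off the ratings is to plug two teams' ratings into the expected-score formula. The sketch below does this for the projected final, using the Spain and England ratings listed in the ratings dictionary above. `expected_score` is an illustrative helper, and the result is an Elo expected score (which blends wins and draws), not a pure win probability.

```python
# Illustrative sketch: expected score for the projected final from the listed ratings.

def expected_score(rating, opponent_rating):
    """Standard Elo expected score for the team rated `rating`."""
    return 1 / (1 + 10 ** ((opponent_rating - rating) / 400))


spain = 1285.4389172340748    # rating listed for Spain
england = 1263.7997108558698  # rating listed for England

spain_expected = expected_score(spain, england)
england_expected = expected_score(england, spain)

print(f"Spain expected score:   {spain_expected:.3f}")
print(f"England expected score: {england_expected:.3f}")
```

The gap of roughly 22 rating points gives Spain only a slight edge, a little above 0.5, so the model treats the final as close to even.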

Augmented World Cup k-factor model

In comparing the old and new Elo ratings for international football teams, the adjustments made in the new model, particularly the incorporation of variable K-factors for different stages of the tournament, reveal nuanced shifts in team evaluations. Argentina stands out with a substantial increase in its Elo rating, rising from 1217.62 to 1378.95. This surge reflects Argentina’s strong performance in knockout stages, where crucial matches are weighted more heavily with higher K-factors. This adjustment highlights Argentina’s ability to deliver under pressure, influencing their overall rating positively.

The provided code leverages an Elo ratings system to assess and update the performance of football teams based on match outcomes from two datasets: one for World Cup matches (`wc`) and another for unspecified tournament matches (`cl`). Initially, the code imports necessary libraries and loads the match data from CSV files. It identifies unique teams involved in these tournaments, ensuring data integrity by stripping whitespace and handling non-numeric values appropriately.

The core functionality revolves around the Elo ratings calculations. It defines a `calculate_elo_rating` function to determine the expected outcomes and Elo changes based on the ratings of the competing teams and a specified k-factor. This k-factor varies depending on the stage of the tournament, with higher values assigned to more critical matches like finals and lower values to group stages or less decisive encounters.

The `update_elo_ratings` function iterates through each match in the datasets (`cl` and `wc`). It retrieves current Elo ratings for the home and away teams, determines the match outcome (win, loss, or draw), and updates the Elo ratings accordingly using the calculated Elo changes. The updated ratings and expected outcomes are logged back into the respective DataFrame columns (`team1_newrating`, `team2_newrating`, etc.) for further analysis.

After processing all matches, the updated Elo ratings for each team are displayed, demonstrating how well each team performed relative to expectations throughout the tournament. Additionally, the final Elo ratings are saved into new CSV files (`euro22.csv` and `wcelo4.csv`) for future reference or additional analysis. This approach not only captures the competitive dynamics of football tournaments but also provides a quantitative basis for assessing team performance and predicting future outcomes based on historical data.

Like Argentina, France has seen a notable increase, from 1190.49 to 1344.23, underscoring their consistent success across both group and knockout stages. The new model's recognition of France's prowess in critical matches amplifies their rating, affirming their status as a formidable competitor.

Germany, rising from 1166.41 to 1291.82, also benefits significantly from the new rating methodology, which emphasizes performance in high-stakes knockout games. These adjustments reflect Germany's ability to maintain strong performances throughout tournaments.

Conversely, teams like Italy show minimal change (1010.70 to 1011.00), indicating that the stage weighting made little difference to their rating.

Overall, the introduction of variable K-factors in the new Elo rating model enriches its capacity to capture the intricacies of international football dynamics, providing a more refined assessment of teams' tournament performances and their corresponding Elo ratings.

Comparing New and Old ELO Ratings

In comparing the old and augmented ELO ratings for international football teams, significant adjustments are evident, particularly in teams’ evaluations during critical tournament stages. Teams like Argentina and France show substantial increases in their ratings, reflecting strong performances in knockout matches where higher k-factors are applied. This adjustment underscores their ability to excel in decisive moments, thereby enhancing their overall ratings significantly.
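The size of these shifts can be tabulated directly. The sketch below uses only the old and new World Cup ratings quoted earlier in this article for four teams; the variable names are illustrative.

```python
# Illustrative sketch: rating changes for the four teams whose old and new
# World Cup ratings are quoted in the article.

old_elo = {"Argentina": 1217.62, "France": 1190.49, "Germany": 1166.41, "Italy": 1010.70}
new_elo = {"Argentina": 1378.95, "France": 1344.23, "Germany": 1291.82, "Italy": 1011.00}

# Absolute change under the variable k-factor model, rounded to two decimals
changes = {team: round(new_elo[team] - old_elo[team], 2) for team in old_elo}

# Rank teams from largest to smallest change
ranked = sorted(changes.items(), key=lambda item: item[1], reverse=True)

for team, change in ranked:
    percent = 100 * change / old_elo[team]
    print(f"{team:<10} {old_elo[team]:>8.2f} -> {new_elo[team]:>8.2f}  ({change:+.2f}, {percent:+.1f}%)")
```

Argentina, France, and Germany each gain more than 100 points, while Italy's rating moves by less than one point.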

Europe’s Dominance

European teams’ dominance in World Cup tournaments can be attributed to several interconnected factors, including robust footballing infrastructure, historical legacy, and strategic development initiatives.

Firstly, European countries often boast well-established footballing infrastructures that support grassroots development, professional leagues, and extensive youth academies. These infrastructures nurture talent from a young age, providing systematic training and development pathways that cultivate skilled players capable of competing at the highest international levels. This structured approach ensures a consistent supply of talented players who are well-prepared for the rigours of elite football competition.

Moreover, European nations benefit from a rich footballing heritage and a deep-rooted cultural affinity for the sport. Football holds a central place in European culture, with passionate fan bases, strong club rivalries, and a long-standing tradition of excellence. This cultural backdrop fosters a competitive environment that motivates players to excel and achieve success on the global stage.

Strategically, European football associations and clubs invest heavily in youth development programs, coaching education, and sports science research. These investments not only enhance player skills but also optimize training methodologies and tactical innovations. The integration of advanced technologies and analytics further refines coaching strategies and player performance assessments, giving European teams a competitive edge in international competitions.

Additionally, the structure of European football leagues, such as the English Premier League, La Liga, Bundesliga, Serie A, and Ligue 1, among others, provides a high level of competition and exposure for players. The intensity and quality of these leagues prepare players mentally and physically for the demands of major tournaments like the World Cup.

In conclusion, the augmented ELO ratings models developed for both the UEFA Euros and FIFA World Cup tournaments represent a significant advancement in assessing team performance across varying stages of competition. By integrating variable k-factors that reflect the importance of matches, from group stages to finals, these models offer a nuanced understanding of team dynamics under pressure. Spain’s consistent rise in the rankings exemplifies how adept teams can leverage strategic prowess and tactical execution to dominate critical encounters. Moreover, the robustness of these models lies in their ability to capture not only historical performance but also predictive insights, paving the way for more accurate tournament prognostications and deeper analyses of football’s evolving landscape.