The heartfelt story of me building a League of Legends win interpreter for hard-stuck Silver II players (100% not me)

Liam Isaacs
Mar 22 · 9 min read

INTRODUCTION

League of Legends is a 5v5 team-based game where each player can select a champion (out of 154 options) and role (top, jungle, mid, adc/bot, support/bot). The game’s competitive scene is vibrant and global, with a *definitely not toxic* ranked match-making option where players can compete amongst themselves and climb the ELO ladder to rank up. I wanted to see if a machine learning model could answer the common question players might have about their matches: “what did I do this game that helped or hurt my chances of winning, and how much did each thing that happened matter?”

Before I go any further, please try it out yourself here. Note that the algorithms sweat and grind to compute and the API has rate limits, so in the beta stages the algo does kind of crash if more than 2 people are using it 😰😰

This blog post will be more 💫journey-oriented💫 and less in the nitty-gritty with 🐍robot snek🐍 (python3, I prefer to refer to it this way), but please feel free to look at all the code here.

How to grapple with big ambitions?

On the night of March 8th, I found myself in a similar state of mind to how I start most of my projects: the crash after a brainstorm, the slow, saxophone-playing-on-a-lonely-street pain of trying to actually do the thing.

I had thought it’d make more sense if I tried to “sketch” an image of what I wanted before ironing it out with algorithmic finesse.

Liam’s first sketch (possibly ever) of a web app, circa March 2021

Even after this, I struggled to see which part of the display to start with. I decided to just try to make the first one-line part of the match display using Flask/Django, html with some javascript. No machine learning, no algorithms, just EDA.

Reading player data, passing to app.py

player data-->app.py

We start by defining player data

# in a .py file that's not app.py
# playerdata.py
import pandas as pd
from riotwatcher import LolWatcher, ApiError


class game_info():
    def __init__(self, api_key, name, region, game_id):
        self.api_key = api_key
        self.name = name
        self.region = region
        self.game_id = game_id
        watcher = LolWatcher(self.api_key)
        self.user = watcher.summoner.by_name(region, name)

    def match_data(self):
        watcher = LolWatcher(self.api_key)
        self.matches = watcher.match.matchlist_by_account(
            self.region, self.user['accountId'])
        self.this_match = watcher.match.by_id(self.region, self.game_id)
        m = self.this_match

        def get_data():
            n = []
            for row in m['participants']:  # for player in match
                m_row = {}
                m_row['kills'] = row['stats']['kills']
                # so on and so forth for every piece of data you want
                n.append(m_row)
            return n

        n = get_data()
        # assumption: the template further down indexes the result with
        # .loc, so the rows are handed back as a pandas DataFrame
        return pd.DataFrame(n)

We can then do something like

# player data --> app.py
# in app.py
from flask import Flask, request, render_template
from models.playerdata import game_info
from riotwatcher import LolWatcher, ApiError

app = Flask(__name__)
api_key = 'YOUR-RIOT-API-KEY'  # placeholder, not part of the original snippet
watcher = LolWatcher(api_key)  # the snippet used `watcher` without defining it


@app.route('/initial_search_page', methods=["GET", "POST"])
def riot_api_call():
    # get search input, like 'Doublelift'
    if request.method == 'POST':
        form = request.form
        for key in form:
            name = form[key]
        region = 'na1'
        user = watcher.summoner.by_name(region, name)
        matches = watcher.match.matchlist_by_account(region,
                                                     user['accountId'])

        game_ids = []
        game_amount = 3
        for i in range(game_amount):
            game_ids.append(matches['matches'][i]['gameId'])

        dfs = {}
        for game_id in game_ids:
            dfs[game_id] = game_info(api_key, name, region,
                                     game_id).match_data()

Since game_info() is a class with the match_data() method, we can use that on any number of games. Object-oriented programming has graced us once again.

Creating the form page

  • initial_search_page.html

We create a form to get the name we will search, like “Doublelift”

<form id="riot-api-form" action="{{ url_for('riot_api_call') }}" method="POST">
  <!-- method and name are needed for request.form to receive the value -->
  <input name="name" placeholder="Doublelift..." type="text">
</form>

And let’s say that the riot_api_call() thing from above loads another html page,

@app.route('/initial_search_page', methods=["GET", "POST"])
def riot_api_call():
    ...
    ...
    # the template below also reads `name`, so it is passed along with dfs
    return render_template('public/separate.html', dfs=dfs, name=name)

In our separate.html we can just run a for loop to display each match.

Here’s a quick snippet of the html:

{% for a in dfs: %}
  {% set D_df = dfs[a] %}
  <div class="gamelist">
    <div class="gameitem">
      {# '胜利' is Chinese for "victory" #}
      <div class="gameitem {{ 'win' if
          D_df.loc[D_df['summonerName'] == name]['win'].values[0] ==
          '胜利' else 'lose' }}">
        <div class="content">
          <div class="gamestats">
            <div class="gametype">
              {{ D_df.loc[D_df['summonerName'] == name]['gameMode'].values[0] }}
            </div>
            <div class="timestamp">
              {% if D_df.loc[D_df['summonerName'] == name]['lastGamePlayedWhen'].values[0] != 0: %}
                {{ D_df.loc[D_df['summonerName'] == name]['lastGamePlayedWhen'].values[0] }} days ago
              {% else %}
                today
              {% endif %}

After enough back-and-forth, way more than Jerry Seinfeld (I do not watch Seinfeld; is this sort of reference appropriate?), I got to:

Going from match_data() to my take on OP.gg’s one-line statistic

Display advanced stats on click

I started by writing the JavaScript needed to fade an HTML div in and out on click

var show_deets = document.querySelectorAll('.game_deets')
var btn = document.querySelectorAll('.content')
for (let i = 0; i < btn.length; i++) {
  btn[i].addEventListener("click", function() {
    if (show_deets[i].dataset.state !== 'faded_in') {
      unfade(show_deets[i])
    } else {
      fade(show_deets[i])
    }
  });
}
// where fade, unfade are functions to change opacity

To define game_deets, the html is basically a table

My take on op.gg’s advanced statistics table

How do Web Apps work as story-tellers?

Here’s the point where we meaningfully diverge from mimicking OP.gg, blitz.gg, and apps like them. It’s a big moment✨✨

What I want to say before my meaningful departure from the status quo:

Web design focuses on reconstruction. When we design a web app for an experience taken from somewhere else, we re-tell the story of that experience; for instance, ordering Papa Johns through their app is just a brief version of the old days where you had to GO and SEE Papa John 🥵

In our context, a League of Legends Web App for displaying game statistics is re-telling what happened that game, without having to play it. OP.gg is really interesting because it splits the re-told story of a game into 2 layers: (1) a brief summary; and (2) advanced stats below.

In my re-telling of the narrative, I chose not to focus on Tier or a ranking of 1st to 10th, both of which OP.gg has. On one hand, it makes sense that Tier is a good intuition for win/loss, since if you are matched up against a player ranked far above you, you will most likely lose. On the other hand, knowing that is not always a helpful piece of information: more often than not, it’s used as “I lost because this player was Diamond and I’m Silver”, not as “I lost because the Diamond player did ___ to beat me.” An algorithmic linear regression (lr) ranking of 1st to 10th based off features of the game has the same effect. You reflect on yourself thinking: “Oh I got 1st, I do not know why but I will just do more of that” or “I got last, what a useless game”. Both Tier and the ranking are judgements that do not help you critique yourself.

There are two glaring problems with this sort of analysis: (1) you reduce your “story of a game” to KDA, dmg to champions, wards, CS, items; and (2) you have no idea how much these features actually matter. That’s what the linear regression score on OP.gg tries to address, but since it comes across as a black-box model with no coefficients shown, it doesn’t entirely get there.

The Machine Learning model, and the struggles of not having data “judge” the player

The idea behind the algorithm I’m about to show you isn’t efficiency or elegance; it’s just the humble desire to make a machine learning algorithm that isn’t about judging people for no reason.

Having just made another app that judges movies for no reason, I started off on a bit of the wrong foot. There’s some fun in accidentally writing an algorithm that for weeks keeps scoring Fifty Shades of Grey abnormally high, but that’s because you immediately think “how stupid this algorithm is being, how come it doesn’t rate the Twilight movies as high?”

Here, though, I’m trying to be serious — there’s absolutely nothing funny about League of Legends. What that means is people who use this algorithm will not immediately question it. It makes sense then to make the algorithm transparent.

I thought to use a logistic regression model, one that can deduce what and by how much a set of features mattered to a given result (like win/loss). Coupled with SHAP force plots, we can see what’s pushing the model’s result back & forth, as if to say to the player “I’m not saying do this or do that but this is what might have done that.”

An example SHAP plot for a Kaisa game I played yesterday

Now we can see the algorithm’s opinion about this game. 5 deaths? Kind of was good not to die so much this game. Maybe should’ve taken rift herald, or bought less wards. Looks like vision control might have been valuable.
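
For a linear model like logistic regression, the pushes in a force plot have a simple form if you assume the features are independent: each one is the feature's coefficient times how far this game sits from the average game, measured in log-odds. Here is a minimal sketch of that idea. It does not use the shap library or my real model, and every coefficient, mean, and feature name in it is made up for illustration.

```python
import math

# Hypothetical coefficients (log-odds per unit) and training-set means.
# None of these numbers come from the real model.
COEF = {"kills": 0.20, "deaths": -0.30, "wards_placed": 0.05}
MEAN = {"kills": 5.0, "deaths": 6.0, "wards_placed": 10.0}
INTERCEPT = 0.1


def log_odds(features):
    """Linear part of a logistic regression: intercept + sum(coef * value)."""
    return INTERCEPT + sum(COEF[k] * features[k] for k in COEF)


def win_probability(features):
    """Squash the log-odds through the sigmoid to get P(win)."""
    return 1.0 / (1.0 + math.exp(-log_odds(features)))


def contributions(features):
    """Per-feature push away from the average game, in log-odds units."""
    return {k: COEF[k] * (features[k] - MEAN[k]) for k in COEF}


game = {"kills": 8.0, "deaths": 5.0, "wards_placed": 4.0}
contrib = contributions(game)

# Biggest pushes first, like reading a force plot from the middle outwards.
for name, value in sorted(contrib.items(), key=lambda kv: -abs(kv[1])):
    direction = "toward a win" if value > 0 else "toward a loss"
    print(f"{name:>13}: {value:+.2f} ({direction})")
```

The pushes always add up to the gap between this game's log-odds and the average game's log-odds, which is why the plot can be read as "what moved the prediction, and by how much."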

The last part was to add-in the What is this? functionality. I wanted to actively tell the person where the data is coming from. In order to do that, I had to collect it.

ENDING WITH THE BEGINNING: Data Collection

Having started at the end, let’s end at the beginning. I ended this project doing data collection. Yes, that’s right. Sorry, I have to self-affirm and flex 🦦🦦

The input to our model is crucially important. Although I built it initially just with some random dataset with 100k Korean challenger games, I quickly realized: there’s no way games at the highest level of play will be helpful to a person in NA in Silver II; and sure, 100k games is great, but how much do I gotta over-sample user games to really personalize these results?

After talking with one of my wonderful professors (since Liam’s stats … not so good), we thought that some kind of collaborative filtering would be helpful. I walked away thinking “what the hell is collaborative filtering”, but here it roughly means grouping the player with players like them. This goes all the way back to the start of this essay: League of Legends is 5v5, you can pick one of 154 champions and a role, and you are ranked in a division by tier (IRON IV, SILVER II, for example). These are all ripe categories for filtering, but we need a healthy dataset that has that information.

I will spare you the details, but I spent around a week constantly running some code to get data (~1000 games) for each division by tier. I uploaded the dataset to Kaggle here. The data collection code is here.
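
Most of that week was spent waiting on rate limits. As a rough idea of what a collection loop has to do, here is a sketch of a sliding-window throttle. The names Throttle and collect are hypothetical, this is not my actual collection code, and the 100 calls per 120 seconds in the example is an assumed limit, so check what your own API key allows.

```python
import time
from collections import deque


class Throttle:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.stamps = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        now = self.clock()
        # Forget calls that have already left the window.
        while self.stamps and now - self.stamps[0] >= self.period:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_calls:
            # Window is full: sleep until the oldest call expires.
            self.sleep(self.period - (now - self.stamps[0]))
            now = self.clock()
            self.stamps.popleft()
        self.stamps.append(now)


def collect(game_ids, fetch, throttle):
    """Fetch every game id, pausing whenever the throttle says so."""
    rows = []
    for game_id in game_ids:
        throttle.wait()
        rows.append(fetch(game_id))
    return rows


# Stand-in fetch function; a real one would call the match API.
throttle = Throttle(max_calls=100, period=120)  # assumed limit
rows = collect(range(3), lambda game_id: {"gameId": game_id}, throttle)
print(rows)
```

The clock and sleep arguments exist only so the throttle can be tested without actually waiting.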

From there, we can just “filter” the input data to the model for each game we want a SHAP plot for.
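
A sketch of what that filtering step could look like: keep only the games from players in the same tier, role, and champion as the game being explained, and loosen the match when too few games are left. The column names, the relaxation order, and the min_rows threshold are my assumptions for this example, not what the project necessarily does.

```python
def filter_similar_games(games, tier, role, champion, min_rows=50):
    """Return training rows from players similar to the one being explained.

    Tries the strictest match first (tier + role + champion) and relaxes it
    until at least `min_rows` games remain.
    """
    attempts = [
        {"tier": tier, "role": role, "champion": champion},
        {"tier": tier, "role": role},
        {"tier": tier},
    ]
    for wanted in attempts:
        subset = [g for g in games
                  if all(g[key] == value for key, value in wanted.items())]
        if len(subset) >= min_rows:
            return subset
    return list(games)  # nothing matched well enough; fall back to everything


# Toy rows; a real dataset would carry the full feature set per game.
games = [
    {"tier": "SILVER", "role": "jungle", "champion": "Kayn", "win": 1},
    {"tier": "SILVER", "role": "jungle", "champion": "Kayn", "win": 0},
    {"tier": "SILVER", "role": "jungle", "champion": "Hecarim", "win": 1},
    {"tier": "SILVER", "role": "mid", "champion": "Ahri", "win": 0},
    {"tier": "DIAMOND", "role": "jungle", "champion": "Kayn", "win": 1},
]
similar = filter_similar_games(games, "SILVER", "jungle", "Kayn", min_rows=3)
print(len(similar))
```

The model for a given game would then be fit on the filtered rows rather than on the whole dataset.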

In summary

What this model does well

  1. works well for mid to low elo junglers playing meta champions (where there’s a healthy amount of data that can be easily filtered collaboratively);
  2. can take into account and quantify how much a feature matters in a given win/loss;
  3. can take into account a given player’s play style;
  4. can provide pretty good feedback — ward less, take more dragons, that sort of thing.

Drawbacks of the model:

  1. data volume is quite low — the more data, the better — and for smaller samples the model is just going to overfit;
  2. data collection pipeline requires a lot of maintenance since the game updates constantly (part of this is a drawback to API rate limits more than the model);
  3. data model takes a long time to run and can only serve a few people at a time (due to rate limits but also general model design; the project might be better off caching user data in a database or something);
  4. 🌑🌑🌑WE NEED A DARK MODE🌑🌑🌑
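
On the caching idea in the third drawback: a finished match never changes, so it only needs to be fetched once. Here is a minimal sketch using SQLite from the standard library. MatchCache and fake_fetch are hypothetical names for this example, and the app does not currently do this.

```python
import json
import sqlite3


class MatchCache:
    """Tiny SQLite-backed cache keyed by game id."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS matches "
            "(game_id TEXT PRIMARY KEY, body TEXT)"
        )

    def get(self, game_id, fetch):
        row = self.db.execute(
            "SELECT body FROM matches WHERE game_id = ?", (str(game_id),)
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # cache hit: no API call
        data = fetch(game_id)          # cache miss: one API call
        self.db.execute(
            "INSERT INTO matches (game_id, body) VALUES (?, ?)",
            (str(game_id), json.dumps(data)),
        )
        self.db.commit()
        return data


calls = []


def fake_fetch(game_id):
    """Stand-in for a real API request; records how often it is hit."""
    calls.append(game_id)
    return {"gameId": game_id, "kills": 7}


cache = MatchCache()
first = cache.get(123, fake_fetch)
second = cache.get(123, fake_fetch)
print(len(calls))
```

Passing a file path instead of ":memory:" would keep the cache between runs, so repeat searches for the same player would cost no API calls at all.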

— — — — THANK YOU

Thank you for reading. Please find the app here

Interested in working together?🦥 Please contact me @ liamnisaacs@gmail.com !

Want to see more of my work?🦫 See @ liamisaacs.com

Note: this project was completed as part of the Metis data science intensive 3-month bootcamp program, which focuses on project-oriented application of machine learning and statistical design to my own inquiries about the world and its data.

Analytics Vidhya


Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

Liam Isaacs

Written by

叶秋 pianist & data scientist, liamisaacs.com
