The Treasure Chest of League of Legends: Riot’s APIs

Aug 19, 2017

In the previous post, we explained how limited the meta-game knowledge available online is. We will now see how we can try to answer some of the questions about the meta-game by collecting a massive number of matches through Riot’s APIs, and what the current limitations are.

a visual representation of Riot’s APIs’ goldmine

The Scale of League of Legends

Riot has created a truly massive success with League of Legends, with more than a hundred million monthly players. Handling that many users is already a serious challenge in and of itself, but Riot goes even further and saves a ton of data for its players: profile information of course, but also mastery pages, runes, key bindings, ranks, match history, chat history, teams, even replays, and probably a lot more. All of this to make the player experience the best it can be.

This is an insane amount of data. We’re talking hundreds of millions of matches saved each month; billions, if not tens of billions, of matches per season; and maybe more than a hundred billion matches since the game’s creation. Imagine the size of the book that would contain the chat logs of a hundred billion matches. Even saving 1% of that data is already crazy. But the craziest part is that Riot gives access to some of this data, this treasure trove, to third parties that wish to develop applications related to League of Legends. Oh, and it’s free to use.

Of course, there are limitations. Riot does not give you unfettered access to their data, but only to a subset that they consider an acceptable trade-off between player privacy, game fairness, and the possibilities offered to third parties. For instance, it is considered acceptable to have access to the ranked results of a player, even if you are not that player. Normal games are considered private, and therefore only applications that a player has explicitly authorized should be able to use that data (this is not possible at the moment, but is apparently in the works). However, where a player places their wards is not considered an acceptable data point, because you could abuse it to expose where enemy players ward and where they are blind, which could result in an unfair advantage.

Riot’s APIs

The way to get that filtered data is through a list of APIs with clearly defined models for inputs and outputs. There are 49 operations grouped into 12 sets based on their purpose. For instance, there are 3 operations revolving around summoners: one for finding summoners by name, one for finding them by summonerId, and one for finding them by accountId. In this article, we’ll present 3 core sets of operations: the Match-v3, Summoner-v3, and League-v3 sets.

League v3

It is possible to get the list of challenger and master players directly from the Developer APIs. The main argument required for these endpoints is the queueId corresponding to the queue we want the challengers of. For instance, the ranked soloQ queueId is 420. If we use that queueId with these endpoints, we’ll obtain the list of challengers or masters for soloQ. However, there is one minor issue with the data returned by these operations: the data for each player only includes the summonerId, and not the accountId (more on the difference between the two in the next section). That would be fine, if not for the fact that the match operations need the accountId to work. This means that any data we get from these endpoints needs to be completed using the Summoner v3 set of operations.
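The two-step lookup described above can be sketched in Python as follows. This is a minimal sketch, not Riot’s reference code: the response shape (an `entries` list whose items carry the summonerId in a `playerOrTeamId` field) is an assumption to check against the official reference, and both function names are made up for illustration.

```python
def summoner_ids_from_league(league):
    """Collect the summonerIds found in a challenger/master league response.

    Assumes the response is a dict with an "entries" list, and that each
    entry stores the summonerId under "playerOrTeamId" (assumed field names).
    """
    return [entry["playerOrTeamId"] for entry in league.get("entries", [])]


def missing_account_ids(summoner_ids, known_accounts):
    """Return the summonerIds that still need a Summoner v3 lookup.

    known_accounts maps summonerId -> accountId for players already resolved.
    """
    return [sid for sid in summoner_ids if sid not in known_accounts]


# Hand-written sample standing in for a real API response.
sample_league = {
    "entries": [
        {"playerOrTeamId": "48828232", "leaguePoints": 512},
        {"playerOrTeamId": "123", "leaguePoints": 498},
    ]
}
ids = summoner_ids_from_league(sample_league)
todo = missing_account_ids(ids, known_accounts={"123": "9001"})
```

Keeping a local summonerId-to-accountId mapping means each challenger only costs one Summoner v3 call, which matters once rate limits come into play.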

Summoner v3

There are three ways to get information about a summoner, as there are 3 core descriptors of a summoner.

The first descriptor is the name of the summoner. Using the summoner’s name, it is possible to retrieve the associated account (remember that summoner names are unique within a region). This is the purpose of the /by-name endpoint. This endpoint is mostly used by 3rd party services where users enter their summoner name to get detailed information about themselves, like op.gg or lolking.net.

The second descriptor is the summonerId. It is the id associated with the account inside League of Legends (it looks something like this: 48828232). The corresponding endpoint is mostly used to refresh data about already known summoners.

The third descriptor is the accountId. Until relatively recently, the summonerId was the sole identifier necessary to work with the APIs. However, Riot introduced a new identifier called the accountId, which should be unique across all of Riot Games’ game(s). Quite a few endpoints have migrated to using this id instead, as we’ll see further below.

You can find more information about the summoner endpoint here.
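To make the three lookup paths concrete, here is a small Python sketch that builds the request URL for each descriptor. The host pattern, the path segments other than /by-name, and the `euw1` region code are assumptions based on the usual layout of the v3 API, and `summoner_url` is a hypothetical helper; check the official reference before relying on them.

```python
from urllib.parse import quote

# Assumed URL layout for the Summoner v3 set of operations.
BASE = "https://{region}.api.riotgames.com/lol/summoner/v3/summoners"


def summoner_url(region, *, name=None, summoner_id=None, account_id=None):
    """Build the lookup URL for exactly one of the three descriptors."""
    given = [v for v in (name, summoner_id, account_id) if v is not None]
    if len(given) != 1:
        raise ValueError("pass exactly one of name, summoner_id, account_id")
    base = BASE.format(region=region)
    if name is not None:
        # Summoner names can contain spaces, so they must be URL-encoded.
        return f"{base}/by-name/{quote(name, safe='')}"
    if account_id is not None:
        return f"{base}/by-account/{account_id}"
    return f"{base}/{summoner_id}"


by_name = summoner_url("euw1", name="Some Name")
by_id = summoner_url("euw1", summoner_id=48828232)
by_account = summoner_url("euw1", account_id=1234)
```

Whichever path is used, the interesting part of the answer for the rest of this article is the accountId, since that is what the match operations expect.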

Match v3 — matchlists

There are two ways to get a list of matches from the Riot Developer APIs. The first is to get the recent matches using the /recent endpoint, which will return the last 20 matches played by a given accountId. This endpoint is rather simple to use, but it has severe limitations that can make using it painful. First of all, it returns the past 20 matches across all existing queues, so you’ll have a mix of solo and flex queues, and games played not only on Summoner’s Rift but also on Twisted Treeline. The /recent endpoint will also return the ids of normal games, which are anonymized, and of bot games. Basically, it’s a clusterf*ck of possible types of matches that’s relatively painful to make any sense of.

The second endpoint is much more powerful, as it gives you the ability to filter by queue, champion, start date, end date, and more. The catch is that, since it lets you get a lot more information about a player, it only returns ranked data. The tradeoff is worth it, though, as it gives you so much more control over the data you get. This is the endpoint to use if you want to do data science with League of Legends, whereas the other endpoint seems to be designed for services that aim at improving on the match history that Riot provides.
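As a rough illustration of those filters, the Python sketch below builds the query string for a filtered match list request. The parameter names (`queue`, `champion`, `beginTime`, `endTime`) and the use of epoch milliseconds for dates are assumptions to verify against the official reference; `matchlist_query` is a hypothetical helper.

```python
from urllib.parse import urlencode


def matchlist_query(queues=(), champions=(), begin_time=None, end_time=None):
    """Build the query string for a filtered match list request.

    queues and champions are iterables of numeric ids; begin_time and
    end_time are assumed to be epoch timestamps in milliseconds.
    """
    params = [("queue", q) for q in queues]
    params += [("champion", c) for c in champions]
    if begin_time is not None:
        params.append(("beginTime", begin_time))
    if end_time is not None:
        params.append(("endTime", end_time))
    return urlencode(params)


# Ranked soloQ games (queueId 420) on one champion since a given date.
# The champion id 64 is only an example value.
query = matchlist_query(queues=[420], champions=[64], begin_time=1500000000000)
```

Repeating a key once per value (rather than joining values with commas) is the convention assumed here for passing several queues or champions at once.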

You can find more information about the match list endpoint here.

Match v3 — match

Given a matchId, it is possible to obtain detailed information about its associated match. It really is quite detailed. The match object returned by this endpoint contains, among other things: the bans and their order; which team got the first drake, herald, turret, baron, elder, and inhibitor; the runes and masteries of every player; which lane and side they were playing; what their total gold was; what the head-to-head gold and xp differences were for each player, in slices of 5 minutes; counter-jungling stats; the number of wards dropped and purchased; how many quadra and pentakills there were; etc. It really is a goldmine.

There are so many interesting features that this endpoint is enough for most analyses you could think of. For instance, you can chart objective control by champion, determining which champions have the most impact (if any) when it comes to dragon control, baron, or turrets. You can extract core traits about champions from the participant timeline object. For instance, does this champion generally have more cs than its counterpart during the first 5 mins, 10 mins, 15 mins? Or does this champion have more minions in the enemy jungle than average? You can do analysis on the masteries too! For each champion, which top mastery yields the highest win rate, if any is better? What’s the overall distribution of the highest winrate top mastery for champions? (aka, what’s the most overpowered mastery?) How does using the highest winrate mastery correlate with champion representation? (aka, does this mastery break the game?) And these are questions that I came up with in less than a minute; there’s plenty left.
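Taking the first question above as an example, here is a minimal Python sketch that computes, per champion, how often that champion’s team took the first dragon. It assumes each match object has a `teams` list with `teamId` and `firstDragon` fields, and a `participants` list with `teamId` and `championId` fields; these names are assumptions about the match model, and the sample matches are hand-written, not real data.

```python
from collections import defaultdict


def first_dragon_rate_by_champion(matches):
    """Return {championId: share of games where its team took first dragon}."""
    games = defaultdict(int)
    dragons = defaultdict(int)
    for match in matches:
        # Map each team to whether it secured the first dragon.
        first_dragon = {t["teamId"]: t["firstDragon"] for t in match["teams"]}
        for p in match["participants"]:
            games[p["championId"]] += 1
            if first_dragon.get(p["teamId"], False):
                dragons[p["championId"]] += 1
    return {champ: dragons[champ] / games[champ] for champ in games}


# Two tiny hand-written matches, with one participant per team for brevity.
sample_matches = [
    {
        "teams": [{"teamId": 100, "firstDragon": True},
                  {"teamId": 200, "firstDragon": False}],
        "participants": [{"teamId": 100, "championId": 64},
                         {"teamId": 200, "championId": 1}],
    },
    {
        "teams": [{"teamId": 100, "firstDragon": False},
                  {"teamId": 200, "firstDragon": True}],
        "participants": [{"teamId": 100, "championId": 2},
                         {"teamId": 200, "championId": 64}],
    },
]
rates = first_dragon_rate_by_champion(sample_matches)
```

On real data you would want to restrict this to junglers, or at least compare each rate against the 50% baseline, before reading anything into it.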

You can find more information about the match endpoint here, and you should check it out, seriously.

Match v3 — timeline

The match data from the previous endpoint is only helpful to get a general idea of what a typical match looks like in terms of compositions and their relative power. Mind you, it is extremely helpful in that regard, but it can’t give you clear indicators about the “shape” of a match. To get information about the general progression of a match, you’ll need to have access to the timeline of the match. The timeline is a pretty complete list of events that happened during the game (e.g. a ward was dropped by player X at moment T1, or player Y killed player Z with the help of player X at moment T2), as well as snapshots of the state of each player for every minute of the match.

If the previous endpoint was insane in terms of how much you could learn about the meta-game, the /timelines endpoint is probably even crazier for learning about how a game evolves. There are events relating to champion kills, elite monster kills, ward drops, ward kills, skill level ups, item purchases, sells, and undos, as well as item destructions (e.g. you know when consumables are used!), and finally you also have access to the time each building is destroyed. And that’s only events; you also have the participant snapshots, which give you each player’s position, gold, xp, cs, jungle cs, and level for every minute of the game.

Here are some examples of applications for this data: When does each role do its first back? Its second? What’s the average GPM for each role over time? What about the XP? How much gold do people have when they back? When do players drop their wards? And when are they blind? How long does a control ward last on average? When is the first drake taken depending on the type of drake? Is there a gap between the respawn of the drake and its next death? How long is it depending on the dragon type? When does each role shine in terms of gold and XP? How much does a death really cost? You can also try to know when the support roams, or what the ganking dynamics are (e.g. when does mid focus top or bot, etc.).

What’s the winrate associated with a first blood in the 3rd, 4th, 5th, 6th minute? How many games have a gold advantage of X at time T? How does a gold advantage of X at time T evolve over time? How does a gold advantage of X at time T translate into winrate?
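The gold advantage questions can be approached with a few lines of Python over the per-minute snapshots. This is a sketch under assumptions: it supposes the timeline is a dict with a `frames` list, that each frame has a `participantFrames` mapping whose values carry `participantId` and `totalGold`, and that participants 1 to 5 are on the blue side. All of these should be checked against the official reference, and the sample timeline is hand-written.

```python
def team_gold_diff_by_minute(timeline, blue_ids=frozenset(range(1, 6))):
    """Return the blue-minus-red total gold difference for each frame.

    blue_ids is the set of participantIds assumed to be on the blue side.
    """
    diffs = []
    for frame in timeline["frames"]:
        blue = red = 0
        for snapshot in frame["participantFrames"].values():
            if snapshot["participantId"] in blue_ids:
                blue += snapshot["totalGold"]
            else:
                red += snapshot["totalGold"]
        diffs.append(blue - red)
    return diffs


# Hand-written timeline with one player per side and two frames.
sample_timeline = {
    "frames": [
        {"participantFrames": {
            "1": {"participantId": 1, "totalGold": 500},
            "6": {"participantId": 6, "totalGold": 500}}},
        {"participantFrames": {
            "1": {"participantId": 1, "totalGold": 900},
            "6": {"participantId": 6, "totalGold": 650}}},
    ]
}
gold_diff = team_gold_diff_by_minute(sample_timeline)
```

Computing this list for many matches and joining it with the match result is enough to estimate how a gold advantage of X at time T translates into winrate.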

This timeline data is awesome, but also strongly restricted. In the past, this endpoint also included kills of lesser monsters like the blue buff and red buff, which gave you a general idea of how large a window you had to counter-jungle these camps. You’d also get information about the wolf spirit ward spawned on smite. However, the team in charge of this endpoint removed these events, probably over concerns that one could easily detect whether a player used unusual routes that delayed the taking of the buffs, exposing them to counter-jungling strategies that would not be used otherwise (i.e. this could provide an unfair advantage). These privacy concerns are also the reason why the positions of the dropped wards are not provided (to avoid exposing the “sweet spots” that players favor for their wards).

You can find more information about the timeline endpoint here.

Interacting with the APIs

The use of these APIs is actually not completely open to everyone. In order to use them, 3rd parties must submit their applications for review to Riot, which will determine whether the intent of the application is good or not. Once an application is accepted, it is granted an API key, which needs to be included with every request so that Riot can determine which application is asking for what. These API keys are also used to limit the rate at which applications can request information from the API, so that 3rd party applications can’t overload the system and impact each other, or worse, impact the internal services of Riot. We’ll go a bit more in depth about these rate limits in the next article.
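Attaching the key to every request could look like the following Python sketch, which uses only the standard library. Sending the key in an `X-Riot-Token` header is an assumption about how the API expects it; the URL and the key are placeholders, and `build_request` and `fetch_json` are hypothetical helpers.

```python
import json
import urllib.request


def build_request(url, api_key):
    """Create a request that carries the API key in a header (assumed name)."""
    return urllib.request.Request(url, headers={"X-Riot-Token": api_key})


def fetch_json(url, api_key):
    """Send the request and decode the JSON body. Not called in this sketch."""
    with urllib.request.urlopen(build_request(url, api_key)) as response:
        return json.load(response)


# Placeholder URL and key, for illustration only.
request = build_request(
    "https://euw1.api.riotgames.com/lol/summoner/v3/summoners/48828232",
    api_key="YOUR-API-KEY",
)
```

Funnelling every call through a single helper like `fetch_json` also gives you one obvious place to add rate limiting later.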

Conclusion

In this article, we’ve introduced a few of the endpoints which are accessible using the APIs Riot exposes to third parties, as well as the general rules that one needs to respect to use them (respect of privacy, no unfair advantage, and rate limits). In the next article, we’ll see how to build a data scraper using these APIs.