Playing around League of Legends with Neo4j — Prologue

Jimmy Crequer
Neo4j Developer Blog
5 min readAug 22, 2019

It has been a while (few years actually) since I wanted to use Neo4j, but until now I couldn’t find an interesting dataset to get my hands on.

A few days ago, the League of Legends European Championship (LEC) 2019 Summer Regular Season just ended and this is the time, looking into playoffs coming up next week, to look back on what happened during the last ten weeks. I came across a site which gathers a lot of interesting information about the League of Legends competitive scene and decided to scrape some data from here.

The League of Legends European Championship

This post aims to import simple data and to build a small graph in Neo4j, then execute some Cypher queries and see it we can get some insights from the data.

1. Build the graph

1.1. Import the data

You can find the data in the Github repo.

Copy the 3 files into the import folder of your Neo4j installation, then run the following queries.

Note : I didn’t include the scraping part in this post because I am planning to write a dedicated post about it in the future (after I clean up my code).

The teams

LOAD CSV WITH HEADERS FROM "file:///teams.csv" AS row
MERGE (t:Team { nameShort: row.nameShort, nameFull: row.nameFull })

10 teams are participating in the 2019 LEC Summer Season, namely G2 Esports, Fnatic, Splyce, FC Schalke 04, Rogue, Team Vitality, SK Gaming, Origen, Misfits Gaming and Excel Esports.

The matches

LOAD CSV WITH HEADERS FROM "file:///matches.csv" AS row
MATCH (t1:Team { nameShort: row.team1 }),
(t2:Team { nameShort: row.team2 }),
(winner:Team { nameShort: row.winner })
MERGE (m:Match { date: row.date, team1: row.team1, team2: row.team2 })
MERGE (t1)-[:PLAYED]->(m)<-[:PLAYED]-(t2)
MERGE (winner)-[:WON]->(m)

91 matches were played this season so far over 9 weeks, with an additional tiebreaker game at the end. We link the match to the teams with a PLAYED relation and add an additional relationship representing which team WON .

The casters

LOAD CSV WITH HEADERS FROM "file:///casters.csv" AS row
MATCH (m:Match { team1: row.team1, team2: row.team2 })
MERGE (c:Caster { name: row.casterName, type: row.casterType })
MERGE (c)-[:CASTED]->(m)

Lastly, we create the caster nodes and add a relationship to the matches they CASTED .

1.2. Check the graph

The schema is very simple, so let’s look a bit closer to one match.

MATCH (m:Match)
WITH m
LIMIT 1
RETURN m, (m)--(:Team), (m)--(:Caster)
A single match and its relationships

A match (purple node) was played on June 7 by Splyce and G2 Esports (orange nodes), and eventually won by G2 Esports. Ender and Quickshot (blue nodes) were casting this match.

2. Sample Queries

2.1. Number of matches casted per Caster

MATCH (m:Match)
WITH COUNT(m) AS numberOfMatchesTotal
MATCH (c:Caster)-[:CASTED]-(m:Match)
RETURN c.name AS casterName,
COUNT(m) AS numberOfMatchesCasted,
COUNT(m) * 100.0 / numberOfMatchesTotal AS percentageOfMatches
ORDER BY numberOfMatchesCasted DESC

Looking from the bottom of the table, PapaSmithy and Foxdrop actually casted only one week worth of matches, so we expected to see them at the bottom. Though, I was surprised that Quickshot casted so few matches compared to the others casters (20 matches compared to 30+ matches of the top 5 casters).

2.2. Teams casted by Medic & Vedius

MATCH (:Caster {name: "Medic"})-[:CASTED]->(m:Match)<-[:CASTED]-(:Caster {name: "Vedius"})
WITH m
MATCH (t:Team)-[:PLAYED]->(m)
RETURN t.nameFull AS teamName,
COUNT(*) AS numberOfTimesCasted
ORDER BY numberOfTimesCasted DESC

In League of Legends, usually two casters comment together and the better their synergy, the more they make the game entertaining. I have a personal preference for the “MediVedi” duo, composed of Medic and Vedius.

I was interested to know if we could find something interesting among the teams they casted together. We can see that they casted more SK Gaming and Origen compared to Excel Esports and Team Vitality.

2.3. Matches which had a tri-cast

MATCH (t:Team)-[:PLAYED]->(m:Match)<-[:CASTED]-(c:Caster)
WITH m,
COLLECT(DISTINCT t.nameFull) AS matchup,
COLLECT(DISTINCT c.name) AS cast
WHERE size(cast) = 3
RETURN m.date AS date,
matchup[0] + " VS " + matchup[1] AS Match,
cast AS Cast
ORDER by date ASC

Since tri-cast usually happen when a match is important, it is understandable that the more you progress into the season, the more matches matter. By looking at the results from the Cypher query, we notice that the number of matches that was tri-casted was higher in August, which also corresponds to the last month of the season.

Noticeably, Vedius was part of 14 among the 21 casting trios.

2.4. Casters who never casted together

MATCH (c1:Caster),
(c2:Caster)
WHERE id(c1) < id(c2)
AND NOT( (c1)-[:CASTED]->(:Match)<-[:CASTED]-(c2) )
RETURN c1.name,
c2.name

Because Neo4j is built around relationships between entities, we can also look for relationships that don’t exist. This is useful when building a recommendation engine for example.

In this case, “casters who never casted together” means that there is no relation (c1)-[:CASTED]->(:Match)<-[:CASTED]-(c2).

3. Conclusion

Using Neo4j to build a graph from simple data, I was able to play around simple Cypher queries and get some interesting insights from the data. Some of the queries are, I think, actually very hard to do in SQL, but that would deserve its own post.

In the next steps, I will build a more complex graph using richer data such as players, champions, picks & bans, items builds, detailed stats for each match, and add other regional leagues and international competitions to do even more interesting analysis.

--

--