Player Graphs in Destiny 2

Published in Social Media: Theories, Ethics, and Analytics

Oct 4, 2020

Determining player types in games is very much an open research question. Methods for doing so can help game designers balance their games and/or retain players. (Note: several papers address these topics, many of them from 2020; one from last month, Player-Centered AI for Automatic Game Personalization: Open Problems, is particularly relevant.)

Given that games are extremely expensive to make and that live games (games that are supported for years) are becoming more the norm in the industry, it makes sense that developers would want to create systems that can help retain players via e.g. automated personalization of game content.

One method of doing so would be to determine types of players and then provide each player with content similar to the content they “enjoy” playing, so that people continue to return to your game.

Graph Structure, Visualization, and Statistics

Data:

Data was captured from the Bungie API using snowball sampling. I started with a single Destiny player (myself) and pulled down my last 200 competitive PvP matches. I extracted data about how I performed in those matches and then pulled down 200 matches for each other player I played with. The data was stored in a CSV file with nested indices of matchID, teamID, and playerID, where each row of the dataset is one person’s statistics for a given match.
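To make the sampling procedure concrete, here is a minimal sketch of snowball sampling. It is not the scraper I actually ran: fetch_matches is a hypothetical stand-in for whatever wraps the Bungie API, and the 'participants' key is an assumed field listing everyone in a match.

```python
from collections import deque

def snowball_sample(seed_player, fetch_matches, matches_per_player=200, max_depth=1):
    """Collect per-player match rows, starting from one seed player.

    fetch_matches(player_id, limit) is a HYPOTHETICAL callable (not a real
    Bungie API function). It is assumed to return a list of dicts with
    'matchID', 'teamID', 'playerID', and 'participants' keys.
    """
    seen = {seed_player}
    queue = deque([(seed_player, 0)])
    rows = []
    while queue:
        player, depth = queue.popleft()
        for match in fetch_matches(player, matches_per_player):
            # One row per (player, match), mirroring the CSV layout
            rows.append({k: match[k] for k in ("matchID", "teamID", "playerID")})
            # Only expand outward from players within max_depth of the seed
            if depth < max_depth:
                for other in match["participants"]:
                    if other not in seen:
                        seen.add(other)
                        queue.append((other, depth + 1))
    return rows

# Toy stand-in for the API so the sketch runs offline
fake_history = {
    "me": [{"matchID": 1, "teamID": 0, "playerID": "me",
            "participants": ["me", "ally", "rival"]}],
    "ally": [{"matchID": 2, "teamID": 1, "playerID": "ally",
              "participants": ["ally", "stranger"]}],
    "rival": [],
}

def fake_fetch(player_id, limit):
    return fake_history.get(player_id, [])[:limit]

rows = snowball_sample("me", fake_fetch)
```

With max_depth=1 the sample stops at the people the seed player met directly, which matches the one-hop collection described above. It also makes the dependence on the seed player easy to see.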

Structure:

To turn the flat data above into a graph, we use two types of nodes: weapon nodes and player nodes. A player node is the player’s unique ID; a weapon node is a type of weapon the person used in a match of the collected data.

verboseEdgeList = [(PlayerId, WeaponType), ... for each weapon for each player for each match]
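As a runnable illustration of the edge list, here is a small sketch over a handful of made-up rows. The 'weapons' column name and the sample rows are assumptions for the example; the real CSV columns may be laid out differently.

```python
# Hypothetical flat rows: one row per player per match. The 'weapons'
# key is an assumed column holding the weapon types used in that match.
match_rows = [
    {"matchID": 1, "playerID": "aadharna", "weapons": ["PR", "Shot"]},
    {"matchID": 2, "playerID": "aadharna", "weapons": ["PR", "Melee"]},
    {"matchID": 1, "playerID": "other_player", "weapons": ["HC"]},
]

# One (player, weapon) edge for each weapon, for each player, for each match
verboseEdgeList = [
    (row["playerID"], weapon)
    for row in match_rows
    for weapon in row["weapons"]
]
```

The same (player, weapon) pair shows up once per match in which it was used, which is why the list is "verbose".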

Repeated edges are then collapsed into a single weighted edge (i.e. if two edges are the same, they are merged and the edge’s weight is increased).

from collections import defaultdict

# Count how many times each (player, weapon) pair appears
edgeCount = defaultdict(int)
for (PlayerId, WeaponType) in verboseEdgeList:
    edgeCount[(PlayerId, WeaponType)] += 1

# Collapse the repeated edges into one weighted edge per pair
weightedEdges = []
for (PlayerId, WeaponType), w in edgeCount.items():
    weightedEdges.append((PlayerId, WeaponType, w))

# weightedEdges
# [('aadharna', 'PR', 180),    # pulse rifle, 180 games used
#  ('aadharna', 'Shot', 188),  # shotgun,     188 games used
#  ('aadharna', 'Melee', 132), # melee
#  ('aadharna', 'Sup', 138),   # super
#  ('aadharna', 'sword', 78),  # sword
#  ... ]

We can then use these pieces to actually define our graph using the python-igraph library. (Note, I am planning on switching to NetworkX shortly since graphs in NX will naturally be accepted into neural network graph libraries.)

import igraph

g = igraph.Graph.TupleList(weightedEdges, weights=True)

Visualization:

The drawn edge weight is the player’s usage count for that weapon, run through the natural log (since the distribution is highly skewed), normalized, and shifted by 0.25 (to ensure that lightly weighted edges are still visible in the graph).
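The transformation above can be sketched as follows. The normalization step is an assumption here: this version divides by the largest log value, which may not be the exact scaling used for the figure.

```python
import math

def display_weights(raw_weights, shift=0.25):
    """Map raw usage counts to drawable edge widths.

    Assumption: 'normalized' means dividing by the largest log value,
    so the widths fall in the range [shift, 1 + shift].
    """
    logs = [math.log(w) for w in raw_weights]
    max_log = max(logs)
    if max_log == 0:  # every count is 1, so every log is 0
        return [shift for _ in logs]
    return [log_w / max_log + shift for log_w in logs]

# Usage counts taken from the example edges earlier in the post
widths = display_weights([180, 188, 132, 138, 78])
```

A weapon used in a single match has a log of 0, so without the shift its edge would be drawn with zero width.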

Weighted bipartite player-weapon graph. Red nodes are players; Blue nodes are weapons. This is a subset of the whole extracted player population since Medium wouldn’t allow the entire graph here.

Since players only connect to weapons, the graph is clearly bipartite. The two types of nodes are differentiated by color in the visualization. The layout algorithm (reingold_tilford_circular) separated out some of the player nodes; by inspection, these are players who do not use hand cannons (the most prevalent weapon type in the competitive playlist).
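The bipartite claim can also be checked mechanically rather than by eye. Below is a minimal sketch using a breadth-first two-coloring; the helper names and the toy edges are made up for illustration and are not part of the original pipeline.

```python
from collections import defaultdict, deque

def build_adjacency(weighted_edges):
    """Undirected adjacency sets from (u, v, weight) tuples."""
    adj = defaultdict(set)
    for u, v, _w in weighted_edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)

def two_color(adj):
    """Return a node -> 0/1 coloring if the graph is bipartite, else None."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in color:
                    color[nb] = 1 - color[node]  # opposite side of the split
                    queue.append(nb)
                elif color[nb] == color[node]:
                    return None  # an edge joins two same-colored nodes
    return color

toy_edges = [("p1", "HC", 3), ("p1", "Shot", 2), ("p2", "HC", 5)]
coloring = two_color(build_adjacency(toy_edges))
triangle = two_color(build_adjacency([("a", "b", 1), ("b", "c", 1), ("a", "c", 1)]))
```

On the toy edges, both players end up on one side and both weapons on the other, while the triangle (which has an odd cycle) is rejected.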

Statistics:

Since the above graph is bipartite, I expected its diameter (the longest of the shortest paths between any pair of nodes) to be small; surprisingly, the diameter is 6.
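To see how a diameter of 6 can arise in a player-weapon graph, here is a small sketch that computes the diameter with breadth-first search. The toy edges are invented for illustration (they are not drawn from the dataset), and the function assumes the graph is connected.

```python
from collections import defaultdict, deque

def bfs_distances(adj, source):
    """Shortest hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def diameter(edges):
    """Longest shortest path, assuming the graph is connected."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    adj = dict(adj)
    return max(max(bfs_distances(adj, n).values()) for n in adj)

# Toy chain: p1 only uses A, p4 only uses C, and they are linked
# through players who each share one weapon with the next.
toy_edges = [("p1", "A"), ("p2", "A"), ("p2", "B"),
             ("p3", "B"), ("p3", "C"), ("p4", "C")]
toy_diameter = diameter(toy_edges)  # p1-A-p2-B-p3-C-p4 is 6 hops
```

In other words, a diameter of 6 is consistent with a few players whose weapon choices barely overlap with everyone else's.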

We can further observe the degree distribution of the big graph:

Degree distribution of the graph with the entire player population. The mode of the distribution (which looks like a Rayleigh distribution) is 4, meaning that the most common kind of player uses 4 unique weapon types/abilities.
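For reference, a degree distribution like the one plotted can be tallied directly from the weighted edge list. This sketch assumes each (player, weapon) pair appears exactly once in that list and that the player is the first element of each tuple; the toy edges are made up.

```python
from collections import Counter

def player_degree_distribution(weighted_edges):
    """Map 'number of distinct weapon types' -> 'number of players'."""
    # A player's degree is the number of weighted edges that start at them
    degree = Counter(player for player, _weapon, _w in weighted_edges)
    return Counter(degree.values())

toy_edges = [("p1", "HC", 3), ("p1", "Shot", 2),
             ("p2", "HC", 5), ("p2", "Shot", 1),
             ("p3", "HC", 4)]
distribution = player_degree_distribution(toy_edges)
mode_degree = distribution.most_common(1)[0][0]
```

Here two of the three toy players use two weapon types, so the mode is 2.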

In addition to getting the degree distribution of the (big) graph, we can also get the (eigenvector) centrality of each weapon node.

Node centrality for each weapon type. Centrality seems to be a reasonable stand-in for the type of weapon the community determines to be the best.
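For readers unfamiliar with eigenvector centrality, the idea can be sketched with plain power iteration. This is an illustrative stand-in, not the library routine used for the figure: the function, its normalization (unit Euclidean length), and the toy edges are all assumptions for the example.

```python
import math
from collections import defaultdict

def eigenvector_centrality(weighted_edges, iterations=200, tol=1e-10):
    """Weighted eigenvector centrality via power iteration."""
    adj = defaultdict(dict)
    for u, v, w in weighted_edges:
        adj[u][v] = w
        adj[v][u] = w
    x = {node: 1.0 for node in adj}
    for _ in range(iterations):
        # Iterate with (A + I) instead of A: a bipartite graph has a
        # symmetric spectrum, so plain power iteration can oscillate.
        new_x = {n: x[n] + sum(w * x[nb] for nb, w in adj[n].items())
                 for n in adj}
        norm = math.sqrt(sum(v * v for v in new_x.values()))
        new_x = {n: v / norm for n, v in new_x.items()}
        if sum(abs(new_x[n] - x[n]) for n in adj) < tol:
            return new_x
        x = new_x
    return x

# Toy data: everyone uses hand cannons (HC), one player also uses a shotgun
toy_edges = [("p1", "HC", 5), ("p2", "HC", 4), ("p3", "HC", 3), ("p3", "Shot", 1)]
centrality = eigenvector_centrality(toy_edges)
most_central = max(centrality, key=centrality.get)
```

A node scores highly when it is connected, with heavy edges, to other high-scoring nodes, which is why a weapon that many active players lean on ends up central.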

Weapons that are central to the graph are likely “meta” weapon types, i.e. the types that the population has determined to be the best and therefore uses. Treating centrality as a stand-in for what the community considers the “best” is a reasonable assumption, since this data is drawn from the competitive playlist, where people are matched against others at their skill level. Furthermore, since the competitive playlist has unique rewards that you can only get by reaching a specific rank, people make deliberate choices about the weapons they use in order to get as much of an advantage as possible. As such, there will also (probably) be clusters inherent in this graph that describe playstyle. This idea is further supported by the degree distribution, which shows that many people use only a few weapons/abilities.

From the outset, you can infer two types of players: specialists (who use very few weapons, e.g. 4) and generalists (who use many, e.g. more than 10). Furthermore, the fact that there are 4–5 central nodes in the graph supports the idea that the majority of people use only a few weapon types. The generalist, however, is a far rarer player.
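The specialist/generalist split can be written down as a simple rule of thumb. The cutoffs below come from the rough "e.g." numbers in the paragraph above (at most 4 weapon types, more than 10); they are illustrative rather than tuned, and the "in-between" label is an addition for players who fit neither description.

```python
from collections import defaultdict

def classify_players(weighted_edges, specialist_max=4, generalist_min=11):
    """Label each player by how many distinct weapon types they use."""
    weapons_used = defaultdict(set)
    for player, weapon, _w in weighted_edges:
        weapons_used[player].add(weapon)
    labels = {}
    for player, weapons in weapons_used.items():
        n = len(weapons)
        if n <= specialist_max:
            labels[player] = "specialist"
        elif n >= generalist_min:
            labels[player] = "generalist"
        else:
            labels[player] = "in-between"  # assumed catch-all label
    return labels

# Made-up players: one uses 3 weapon types, one uses 12, one uses 7
toy_edges = (
    [("focused", f"w{i}", 1) for i in range(3)]
    + [("varied", f"w{i}", 1) for i in range(12)]
    + [("middling", f"w{i}", 1) for i in range(7)]
)
labels = classify_players(toy_edges)
```

This is exactly the kind of hand-written "expert" rule that a clustering algorithm would ideally replace.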

Rather than rely upon “expert” knowledge (from myself), it will be interesting to apply a proper graph clustering algorithm to this graph and see what it comes up with.

Limitation of this method

One natural limitation of this method is that in this graph, I am only considering one mode of the game. Since weapons in Destiny are each designed to fill a different niche (with some overlap), it is reasonable to infer a playstyle from the weapons a person uses. However, this graph only takes into account the weapon type and not other important data, e.g. average kill distance, time in air, deaths, or TrueSkill rating (a Bayesian rating system in the spirit of chess’s Elo, developed at Microsoft Research and used in Bungie games since the Halo days). This missing data is important in qualifying a playstyle. For example, people in the top skill bracket often “play the game in the air”, unlike normal players. Adding the above data to the graph would require the graph to have multiple types of edges and nodes.

Ethical Concerns

So, the sampling technique I used has some inherent potential ethical issues. For example, is the extracted data biased due to the first individual that I sampled data from? Is the data also biased towards a single platform (in this case, yes, the PC)? Is the data drawn equally from different parts of the world? Inherently, we can say no here since not all of the world has disposable funds to use to buy and play video games.

Furthermore, there are ethical issues that might be enabled by systems that seek to automatically classify players, given the data- and engagement-focused metrics that get tracked for every player. If the play patterns of “influential” players were tracked (which they are), such information could be used to determine whether someone is likely to leave. And if it was detected that they were going to leave, it is easy to imagine a system that identifies a person’s playstyle and then offers them a “deal” on a purchase that bolsters their character’s playstyle, thereby giving that person a shot of new vigor in the game.