Inferring latent structure in player graphs

Last time, we created a graph of player/weapon interactions in Destiny 2 and discussed measures of centrality. Let's now try to gain insight about players from that graph by inferring play styles from weapon usage.

Determining player types in games is very much an open research question. Such methods can inform a game designer's ability to balance their game and/or help designers retain players.

As a reminder, determining latent structure in data (especially without labels) is difficult for a variety of reasons. People are not static (i.e. I can play a game in a multitude of ways), and the latent structure we hope is present (in this case, play-styles) can be an emergent property rather than something explicitly designed for. For example, the core experience of playing Destiny is similar across the three classes, yielding only nuanced differences, whereas Hearthstone decks are constructed to feel and play dramatically differently from one deck archetype to the next.

However, the fact that different classes and weapon types exist in Destiny implies that there exist different ways of playing the game. Playing the game with shotguns results in vastly different play patterns than using a sniper rifle. Therefore, we're going to try to tease that structure out of the graph.

Since we are looking for a feature that is completely latent, there does not exist an oracle to provide an answer. This rules out more traditional methods like supervised (and semi-supervised) learning. Instead, validating the findings will rely on "expert" intuition and/or replicating the results with different clustering techniques. If we were simply using the clustering as an embedding for a downstream task (e.g. skill prediction), then we could treat our clusters as a black-box embedding and simply validate on the final task.

Learning the latent structure of a player’s play-style could enable researchers/designers to learn about what types of players form teams that play well together (“well” is subjective, but it could be win-rate, self-reported enjoyment, etc.). If we knew how people play well together, then we could form teams of players who we hypothesize would be synergistic thereby providing greater engagement with the game.

As a reminder, here’s the bipartite player-weapon graph from previously:

Shrunk player-weapon graph. Blue (named) nodes are weapons; red nodes are players.

And since we’re going to dive into the graph, here’s a list of the weapon node name abbreviations and what they mean. Unnamed nodes are unique players.

The way we’re going to measure clustering is through Modularity.

This idea that true community structure in a network corresponds to a statistically surprising arrangement of edges, can be quantified by using the measure known as modularity. The modularity is, up to a multiplicative constant, the number of edges falling within groups minus the expected number in an equivalent network with edges placed at random. [1]

And one can search for latent structure in a graph by partitioning the network such that modularity is positive (and ideally high). Mathematically, the modularity, Q, is defined [1] as:
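For reference, this is Newman's formulation [1], written here in standard notation, where Q is the modularity score and B is the so-called modularity matrix:

```latex
Q = \frac{1}{4m}\,\mathbf{s}^{\mathsf{T}} B \,\mathbf{s},
\qquad
B_{ij} = A_{ij} - \frac{k_i k_j}{2m}
```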

A is the adjacency matrix; k_i is the degree of node i; s is a membership vector for a two-way partition (s_j = +1 if node j is in the first group, -1 otherwise); m is the total number of edges in the network.

Note that Q has exactly the form of a Quadratic Programming objective [2] (a very well-studied class of problems that also encompasses algorithms like Support Vector Machines [3]), though the ±1 constraint on s makes maximizing it combinatorial rather than convex. Anyway, we're going to use Q to detect clusters in our player-weapon graph. Specifically, since weapons define the primary method through which players interact with the game, we'll be looking at how the weapons cluster together. Once we have weapon clusters, classifying players into those clusters could be as simple as a majority vote over which cluster the player is most connected to.
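A minimal sketch of that majority-vote step. The weapon-to-cluster map and the per-player edge weights below are made up for illustration; in practice they would come from the detected clusters and the graph's edge weights:

```python
from collections import Counter

# Hypothetical weapon -> cluster assignment (from community detection).
weapon_cluster = {"HC": 0, "SR": 0, "AR": 1, "SMG": 1, "LFR": 2}

# Hypothetical weighted edges from one player to each weapon (e.g. kills).
player_edges = {"HC": 120, "SR": 45, "AR": 30, "LFR": 5}

# Weighted majority vote: sum edge weights per cluster, take the max.
votes = Counter()
for weapon, weight in player_edges.items():
    votes[weapon_cluster[weapon]] += weight

player_cluster = votes.most_common(1)[0][0]
print(player_cluster)  # → 0 (the HC/SR cluster dominates this player)
```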

We can decompose the modularity matrix into its eigenvectors and eigenvalues, and split the graph according to the sign of its leading eigenvector. We then have two subgraphs, and we can recursively repeat the process until no further split improves the modularity score [4]. Alternatively, we can stop the recursion when we reach a pre-defined number of communities, k, and then project these clusterings back onto the original graph.
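One level of that spectral split can be sketched in a few lines of numpy. The toy graph here (two triangles joined by a bridge edge) stands in for the real player-weapon graph:

```python
import numpy as np

def leading_eigenvector_split(A):
    """One level of Newman's spectral bisection: split nodes by the sign
    of the leading eigenvector of the modularity matrix B."""
    k = A.sum(axis=1)                 # node degrees
    m = A.sum() / 2.0                 # total number of edges
    B = A - np.outer(k, k) / (2.0 * m)
    vals, vecs = np.linalg.eigh(B)
    v = vecs[:, np.argmax(vals)]      # eigenvector of the largest eigenvalue
    s = np.where(v >= 0, 1, -1)       # +/-1 membership vector
    Q = s @ B @ s / (4.0 * m)         # modularity of this two-way split
    return s, Q

# Two triangles joined by one edge: a clean two-community toy graph.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

s, Q = leading_eigenvector_split(A)
print(s, Q)  # splits nodes 0-2 from 3-5, with positive modularity
```

To recurse, one would apply the same procedure to each subgraph (using the generalized modularity matrix for subdivisions, per [4]) until Q stops improving.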

If we apply community detection via leading_eigenvector decomposition, without indicating a stopping point, we can arrive at the following projection:

Eigenvector clustering (of the full graph) has 10 clusters and stops when modularity is maximized

Due to overlap in the layout algorithm, it's hard to see the colors of the weapon nodes (which determine a play style), so here's a table:

It’s difficult to validate the above clustering since we’re looking at latent structure. But from an expert’s point of view, at least three of these clusters make sense. Hand Cannons (HC) and Sniper Rifles (SR) share a cluster (mint green), and this reflects a real weapon composition in the game used by “top” players. Both weapon types are high-damage single-shot guns, where HCs cover close-range encounters and SRs cover long-range encounters. Furthermore, Linear Fusion Rifles are a weapon type unlike any other, so it makes sense they’d be their own cluster. Finally, Grenades being their own cluster also makes sense since the percentage of players who got grenade kills is less than 1%. However, we might also expect the singleton clusters around AR, Side, and SMG to merge, since those weapon types are similar (differing mainly in their effective ranges).

Therefore, if we place a limit on the number of clusters found we might see some consolidation of individual clusters. With a cluster limit of 4, we achieve the following graph:

Eigenvector clustering (of the full graph) that stops when 4 clusters have been found.

In this case, AR and Side merge into one cluster, and the HC/SR cluster envelops Grenades and SMGs as well (surprisingly).

For another example, we can run a modularity classification on the graph in Gephi, but the output might not be interpretable, and it’s not clear which method Gephi uses to partition the modularity matrix.

Modularity classification with the Fruchterman Reingold layout algorithm

This has been a qualitative validation of the clusterings that relies on my understanding of the game. If we forgo some level of understandability and instead just use the clustering as input for a downstream problem, we can look for other ways of clustering the data based on more traditional machine learning techniques.

In that vein, there is one last method that I wanted to try but didn’t have time to. Normally an adjacency matrix, A, is a square matrix with shape |nodes| x |nodes| (in our case, 22637 x 22637). However, here we can take advantage of the bipartite nature of the graph and cut a view into A that is of size |playerNodes| x |weaponNodes| (22617 x 20). We could then feed this smaller matrix into an algorithm like DBSCAN and see what comes out.
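A rough sketch of that idea using scikit-learn's DBSCAN. The tiny 6x4 matrix below stands in for the real 22617 x 20 biadjacency view, and the `eps`/`min_samples` values are placeholders that would need tuning on real data:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical mini biadjacency view: rows are players, columns are
# weapon types (values could be kill counts; here simple 0/1 usage flags).
X = np.array([
    [1, 1, 0, 0],   # players who favour the first two weapon types
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],   # players who favour the last two weapon types
    [0, 0, 1, 1],
    [0, 0, 0, 1],
])

# Density-based clustering over player rows; -1 would mark noise points.
labels = DBSCAN(eps=1.1, min_samples=2).fit_predict(X)
print(labels)  # first three players in one cluster, last three in another
```

On the real matrix, the resulting player clusters could then be compared against the weapon clusters found via modularity, as a cross-validation of the latent structure.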

There are certainly reasonable objections to my methodology and to the clusters I’ve extracted. These are:

  • Because of the snowball sampling (see the previous post, linked at the start), my population could be skewed toward players of roughly the same skill as the few individuals I initially sampled. Given that those players are similar (at least according to Bungie’s matchmaking), they may also play similarly, leaving the sample without enough diversity.
  • These graphs are missing important semantic information (e.g. averageKillDistance, kill-death-ratio, timeInAir, etc.) that could differentiate between players who have the same weapon preference but use them differently.
  • I do not account for friends playing together in a pre-made team (and therefore presumably being able to coordinate) vs individuals playing on a team by themselves in this analysis.

Now, a reasonable question here is: so what? Why did we go through all this work? Well, in addition to being able to cluster weapons together by how people use them, we can also classify each player into one of these clusters by, say, a majority vote over how many ties the person has to each cluster. This gives us a method to describe how each player uses the weapons in the game, which in turn can be fed to the design and balance teams to tune the game based on player engagement. Furthermore, we could feed this information to downstream tasks such as: “If person A belongs to cluster B, which has some distinctive play style (say, they use LFRs), then who else should we pair person A with to increase their odds of winning the match?” Although not as apparent as predicting protected characteristics (e.g. sex, gender, SSN, etc.), there are potential ethical issues here. Given that we can identify people based on their online habits [5], could people be identified based on their play habits?

Furthermore, just because we can identify behaviours based on engagement statistics, should we? There’s certainly no end to interesting problems in this data. We have skill as a tracked metric: how do the graph clusters look if we first partition the graph by a minimum skill level? Do higher-skilled players use more or fewer weapons? Can you use a person’s playstyle to predict their skill level? Should skill-based information even be used in a matchmaking setting? The push-and-pull, at least in the Destiny community, waffles between skill-based and connection-based matchmaking. But it should certainly be possible to build a system that does playstyle-based matchmaking, and building such a system could enable entirely new games and game-modes to be created.
