Analyzing Steam tags shows the game features and themes that are most successful

Nick Yee
Sep 4, 2019 · 6 min read

We recently linked up the game titles from our data set with available metadata from places like Steam and IGDB. As part of this, we’ve been playing around a lot with Steam tags. In this blog post, I’ll show you what happened when we tried to visualize how Steam tags are related to each other.

A quick word on Steam tags

For every game on Steam, gamers can attach tags (i.e., keywords) to its webpage. The interface offers autocomplete suggestions as you start typing, but users are allowed to enter any character string. So for example, on Europa Universalis IV’s page, the top tags are “Grand Strategy”, “Strategy”, “Historical”, etc. Steam shows the top 20 tags for every game, and the exact count of each tag can be found on SteamSpy.

To generate the data set, we looked up the Steam tags for all the game titles that have been mentioned at least 5 times in the Gamer Motivation Profile (with data from over 350,000 gamers) and exist on Steam — which came out to be 2,129 game titles. The Steam tags data we analyzed was gathered in mid-December 2017.
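The selection rule above can be sketched in a few lines of Python. This is only an illustration of the filter as described: the function name, the input shapes, and the toy titles and counts are assumptions, not our actual pipeline or data.

```python
from typing import Dict, List, Set

MIN_MENTIONS = 5  # threshold described in the paragraph above


def select_titles(mention_counts: Dict[str, int], steam_titles: Set[str]) -> List[str]:
    """Keep titles mentioned at least MIN_MENTIONS times that also exist on Steam."""
    return sorted(
        title
        for title, count in mention_counts.items()
        if count >= MIN_MENTIONS and title in steam_titles
    )


# Toy inputs, invented for illustration only.
mentions = {"Europa Universalis IV": 12, "Obscure Game": 2, "Console Exclusive": 40}
on_steam = {"Europa Universalis IV", "Obscure Game"}

selected = select_titles(mentions, on_steam)
print(selected)
```

A title has to pass both conditions, so a frequently mentioned game that is not on Steam is dropped just like a Steam game that is rarely mentioned.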

Defining tag relationships

There isn’t one “right” way of defining how close or similar two things are. For example, if we were to draw out your social network, that graph would look different depending on whether we defined closeness as the length of each relationship, how much you care about each person, how often you interact, or how geographically close you are.

The same is true here for the Steam tags data, and we present one reasonable approach to analyzing the data. In our analysis, we defined tags as being “close” if they tend to appear together across games at similar proportions. Or put another way, as we look at how Tag A is used across all the games, which other tags are used in the most similar proportions in those games?
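For the data science folks, here is one way this notion of closeness could be computed. The sketch converts each game's tag counts into proportions and then compares two tags by the cosine similarity of their per-game proportions. Cosine similarity is an assumption made for illustration, since other measures would fit the same definition, and the function names and toy counts are hypothetical.

```python
import math
from typing import Dict

Counts = Dict[str, Dict[str, int]]    # game -> tag -> raw tag count
Shares = Dict[str, Dict[str, float]]  # outer key -> inner key -> proportion


def tag_proportions(game_tags: Counts) -> Shares:
    """Convert raw counts into each tag's share of a game's total tag votes."""
    props: Shares = {}
    for game, counts in game_tags.items():
        total = sum(counts.values())
        if total > 0:
            props[game] = {tag: c / total for tag, c in counts.items()}
    return props


def tag_vectors(props: Shares) -> Shares:
    """Pivot game -> tag -> share into tag -> game -> share."""
    vectors: Shares = {}
    for game, shares in props.items():
        for tag, share in shares.items():
            vectors.setdefault(tag, {})[game] = share
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse per-game vectors."""
    dot = sum(value * b[game] for game, value in a.items() if game in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy counts, invented for illustration only.
games = {
    "Game A": {"Strategy": 60, "Historical": 40},
    "Game B": {"Strategy": 30, "Historical": 20, "Shooter": 50},
    "Game C": {"Shooter": 80, "Action": 20},
}

vectors = tag_vectors(tag_proportions(games))
sim_strategy_historical = cosine(vectors["Strategy"], vectors["Historical"])
sim_strategy_shooter = cosine(vectors["Strategy"], vectors["Shooter"])
sim_strategy_action = cosine(vectors["Strategy"], vectors["Action"])
```

In this toy data, “Strategy” and “Historical” rise and fall together across games, so they score as very close, while “Strategy” and “Action” never share a game and score zero.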

Data processing notes

There’s a lot of data processing that goes on in any big data and network analysis, and we present the details here for data science folks or those who are curious. Others should feel free to skip this section.

Final Tally: We started with 321 tags across 2,129 games and the cleaned data set consisted of 279 tags across 2,070 games.

Visualizing Steam tags: The basics

The network graph shows the strongest relationships for each tag. Here are the basics.

Dots represent tags. The bigger the dot/label, the more often that tag appears on Steam.

Lines represent how closely related two tags are. The thicker the line, the more likely the two tags are to appear together in Steam games at similar proportions. For each tag, only its most salient relationships are shown.
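One simple way to keep only the most salient relationships is to retain each tag's k strongest links. The sketch below does this under stated assumptions: the value of k, the rule that an edge survives if either endpoint ranks it in its top k, and the toy weights are all hypothetical and not necessarily what was used for the chart.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def strongest_edges(weights: Dict[Edge, float], k: int = 2) -> Set[Edge]:
    """Keep an edge if it is among the k strongest links of either endpoint."""
    neighbors = defaultdict(list)
    for (a, b), w in weights.items():
        neighbors[a].append((w, b))
        neighbors[b].append((w, a))

    kept: Set[Edge] = set()
    for tag, candidates in neighbors.items():
        # Sort by weight, strongest first, and keep the top k for this tag.
        for _, other in sorted(candidates, reverse=True)[:k]:
            kept.add(tuple(sorted((tag, other))))
    return kept


# Toy similarity weights, invented for illustration only.
weights = {
    ("Anime", "Romance"): 0.9,
    ("Anime", "Visual Novel"): 0.8,
    ("Romance", "Visual Novel"): 0.7,
    ("Anime", "Shooter"): 0.1,
    ("Shooter", "Action"): 0.6,
}

edges = strongest_edges(weights, k=1)
```

With k=1, the weak “Anime”-“Shooter” link disappears because neither tag ranks it as its strongest relationship.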

Want a hi-res version? You can download the hi-res version here. It’s 7500 x 7500 (1 MB).

The layout algorithm tries to make every edge visible and about the same length. This means there aren’t hidden edges between overlapping dots (e.g., there isn’t a line hiding between “Space” and “Turn-Based”).

Colors represent local communities of highly-related tags. A community is a set of tags that form a cohesive subgroup via shared linkages, similar to identifiable cliques in a high school cafeteria. We identified 17 communities with more than 3 members, and each of these was given a different color.

Proximity between dots (if they are unlinked) does NOT indicate a relationship. Similar to how metro maps prioritize stop sequences rather than the actual distances traveled, the network graph optimizes linkage layout. For example, on the right edge of the map, “Hunting” is close to “Top-Down Shooter”, but because they are unlinked, their relative proximity is not an indication that these two tags are related.

Some highlights to jumpstart your own exploration

There’s a lot going on in the chart, but here are some observations to help you explore.

Broad, mainstream tags are more central; niche tags are more peripheral

Because the most common tags tend to co-occur with other common tags, these tags are drawn together to form a dense, inner core. As the graph generation algorithm untangles all the knots, the graph quickly establishes a hierarchy from broad, mainstream tags to niche, granular tags. While the most generic tags are in the middle of the network (e.g., “Action”, “Shooter”), the more niche and granular tags lie further away in the peripheries (e.g., “Romance” at the top).

Island nations

Isolated tags form islands at the edges of the chart. These tend to be niche tags that are not well-connected to the main network. There are 9 islands in the chart, and 2 specific islands are worth pointing out. The “Superhero” island is notable for having multiple relatively-frequent tags that are nevertheless disconnected from the main network. And the “Board/Card Game” island has the distinction of being the only island with more than 3 nodes. The more nodes a community has, the more likely it is to be connected to the main network, so it is rare to find large islands. This implies that these two groups of Steam tags (and their associated games) are very conceptually distinct from most video games.
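In graph terms, an island is a connected component that is separate from the largest component. Here is a minimal sketch of how islands could be found from a list of edges, using a plain breadth-first search. The edge list is invented for illustration and does not reproduce the actual chart.

```python
from collections import defaultdict, deque
from typing import Iterable, List, Set, Tuple


def connected_components(edges: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Return the connected components of an undirected graph, largest first."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy edges, invented for illustration only.
toy_edges = [
    ("Action", "Shooter"), ("Shooter", "FPS"),
    ("Action", "Adventure"), ("Adventure", "RPG"),
    ("Superhero", "Comic Book"),
    ("Board Game", "Card Game"), ("Card Game", "Tabletop"), ("Tabletop", "Chess"),
]

components = connected_components(toy_edges)
main_network, islands = components[0], components[1:]
```

Tags with no edges at all never appear in the edge list, so they would need to be added separately if you want to count them as single-node islands.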

Thick connections are support beams for local communities

The thickest edges within each community reveal the key features that anchor the community, like the support beams in a building. For example, the “Visual Novel” community is anchored by the beams related to “Anime-Romance”, the triangular beam of “Nudity-Mature”, and the beam of “Choices Matter-Multiple Endings”. In this sense, the chart visually distills genres into their key ingredients.

Next-door neighbors reveal pivot points

Even though they are both in the Strategy genre, the Turn-Based Tactical community (green cluster) is distinct from the Economic Base-Building community (red cluster). Moreover, there are surprisingly few cross-over points between the two communities — they are held in close proximity by other nodes in neighboring communities. If you look closely, there are only 3 bridges between these neighbors: Medieval-Historical, RTS-Base-Building, and RTS-Economy. This provides a guideline on tried-and-tested ways to expand into a different community of players.
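Finding the bridges between two communities amounts to listing the edges whose endpoints sit in different communities. The sketch below assumes you already have a tag-to-community assignment. The memberships shown are guesses made for illustration (the chart itself is the authority), and “Turn-Based Tactics” and “City Builder” are placeholder tags added to fill out the example.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]


def bridges_between(
    edges: Iterable[Edge], community: Dict[str, str], first: str, second: str
) -> List[Edge]:
    """List edges with one endpoint in `first` and the other in `second`."""
    wanted = {first, second}
    return [
        (u, v)
        for u, v in edges
        if {community.get(u), community.get(v)} == wanted
    ]


# Hypothetical community assignment, guessed for illustration only.
community = {
    "Turn-Based Tactics": "green",
    "RTS": "green",
    "Medieval": "green",
    "Historical": "red",
    "Base-Building": "red",
    "Economy": "red",
    "City Builder": "red",
}

# Toy edges, invented for illustration only.
toy_edges = [
    ("Turn-Based Tactics", "RTS"),
    ("Turn-Based Tactics", "Medieval"),
    ("Medieval", "Historical"),
    ("RTS", "Base-Building"),
    ("RTS", "Economy"),
    ("Economy", "City Builder"),
]

bridges = bridges_between(toy_edges, community, "green", "red")
```

Edges inside a single community, such as “Economy”-“City Builder” here, are ignored, which leaves only the cross-over points.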

It’s a roadmap of the most successful recipes

As an aggregation of Steam tags across the roughly 2,000 most popular Steam games, the chart creates a roadmap of game features and themes that have been successful combinations. Starting from any dot, the nearest 1-hop tags represent the best bets to make in terms of both gamer expectations and combinations that have proven to be successful. The nearest 2-hop and 3-hop tags (particularly when crossing communities) are then riskier bets that may nevertheless create new and appealing gaming spaces (especially when the intermediate nodes are included to create a cohesive experience).
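If you have the graph as a list of edges, the 1-hop, 2-hop, and 3-hop tags for any starting tag can be read off with a breadth-first search. This is a minimal sketch under that assumption. The function name is hypothetical and the edges are invented for illustration, so they do not match the real chart.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple


def tags_by_hop(
    edges: Iterable[Tuple[str, str]], start: str, max_hops: int = 3
) -> Dict[int, Set[str]]:
    """Group tags by their hop distance from `start`, up to `max_hops`."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the requested radius
        for neighbor in graph[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    hops: Dict[int, Set[str]] = defaultdict(set)
    for tag, d in distance.items():
        if d > 0:
            hops[d].add(tag)
    return dict(hops)


# Toy edges, invented for illustration only.
toy_edges = [
    ("Visual Novel", "Anime"), ("Anime", "Romance"),
    ("Visual Novel", "Choices Matter"), ("Choices Matter", "Multiple Endings"),
    ("Multiple Endings", "Story Rich"), ("Story Rich", "Adventure"),
]

hops = tags_by_hop(toy_edges, "Visual Novel")
```

In this toy graph, “Anime” and “Choices Matter” are the 1-hop bets for “Visual Novel”, while “Story Rich” sits 3 hops away and “Adventure” falls outside the 3-hop radius.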

This article originally appeared on Quantic Foundry. Read it here.

ironSource LevelUp

Analysis and opinion from game industry leaders
