Which genre dominated video games in the 2000s?

Teejay Abam
INST414: Data Science Techniques
May 11, 2022

The purpose of this assignment, and of this story, is to find out which video game genre dominated the first decade of the 21st century. Video games rose in fame and popularity alongside modern technology during this period. Looking at the genres that helped propel this era could therefore help video game companies decide what direction(s) to take in the coming decades.

The network data comes from crowdsourced data on HowLongToBeat.com, which contains information on a vast number of video games, such as platform capability, estimated playtime, genres, and platforms. I am focusing on video games created between 2000 and 2010. The nodes in this network represent the video games from that era, grouped by genre, while the edges represent the connections between them based on their genres. I built the network by subsetting the dataset so that it contained only the game_id and the genre, then exported it with the nx.write_graphml function so that the graph could be drawn in Gephi.
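The build-and-export step described above can be sketched roughly as follows. This is a minimal illustration, not the exact code used for the assignment: the sample rows, the `build_genre_graph` helper, the `game_` node prefix, and the output file name are all hypothetical, and it assumes each game is linked to a node for its genre.

```python
import os
import tempfile

import networkx as nx

# Hypothetical sample rows (game_id, genre). The real rows come from the
# HowLongToBeat.com dataset after subsetting it to these two columns.
rows = [
    (1, "Action"),
    (2, "Action"),
    (3, "Sports"),
    (4, "Simulation, Strategy"),
    (5, "Action"),
]


def build_genre_graph(rows):
    """Link each game node to the node for its genre (an assumed structure)."""
    graph = nx.Graph()
    for game_id, genre in rows:
        game_node = f"game_{game_id}"
        graph.add_node(game_node, kind="game")
        graph.add_node(genre, kind="genre")
        graph.add_edge(game_node, genre)
    return graph


graph = build_genre_graph(rows)

# Write the graph to a GraphML file, which Gephi can open directly.
out_path = os.path.join(tempfile.gettempdir(), "games_by_genre.graphml")
nx.write_graphml(graph, out_path)
```

Opening the resulting .graphml file in Gephi then gives a graph that can be laid out, sized, and colored by hand.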

The graph is large, with more than 1,000 nodes and more than 8,000 edges. I sized each node according to its centrality. The purpose of the graph is to show what the video games have in common based on the genres they are listed under. From there, we can determine which genres, or sets of genres, were the most popular from 2000 to 2010.

In the graph, the three most important nodes are the dark-blue, lighter-blue, and green nodes. I see these as the three most central genres. Using the degree centrality function in Python, I was able to get the top 10 most central nodes.

In this image, the three most important nodes are the most central ones, which are Action, Sports, and Simulation/Strategy.
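For readers who want to reproduce the ranking, here is a small sketch of how the most central nodes can be pulled out with NetworkX's `nx.degree_centrality`. The toy graph and the `top_central_nodes` helper are made up for illustration, so the numbers it produces are not the ones from my dataset.

```python
import networkx as nx

# Toy graph for illustration only: six hypothetical games linked to genres.
graph = nx.Graph()
graph.add_edges_from([
    ("game_1", "Action"),
    ("game_2", "Action"),
    ("game_3", "Action"),
    ("game_4", "Sports"),
    ("game_5", "Sports"),
    ("game_6", "Simulation, Strategy"),
])


def top_central_nodes(graph, n=10):
    """Return the n nodes with the highest degree centrality, highest first."""
    # Degree centrality is a node's degree divided by (number of nodes - 1).
    centrality = nx.degree_centrality(graph)
    ranked = sorted(centrality.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


top_nodes = top_central_nodes(graph, n=2)
for node, score in top_nodes:
    print(f"{node}: {score:.3f}")
```

On the full graph, calling the helper with `n=10` would give the top 10 list mentioned above.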

The software used for this network analysis was the NetworkX module in Python and the Gephi application. With NetworkX, I created a graph with all the information needed, while Gephi was used to visualize the graph more clearly.

A bug that I encountered came up during the iteration process. It came down to the data itself, since a lot of video games had nothing in the genre section. Getting accurate results was difficult because “NA” had a significantly high centrality. So I took my time clearing the NA values from all the genre columns, which made the data clearer and easier to use and improved the accuracy of the results. I also renamed the columns to make them more readable, because most of the column names were a little too complicated. For my Gephi graph, I added color to distinguish the least central nodes from the most central ones, ranging from green (least central) through yellow (middle) to blue (most central).
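The cleaning step can be sketched with pandas along these lines. The original column names and the `clean_games` helper below are hypothetical stand-ins, and the sketch assumes missing genres show up either as true missing values or as the literal text "NA".

```python
import pandas as pd

# Hypothetical raw data with hard-to-read column names and missing genres.
raw = pd.DataFrame({
    "Game ID (internal)": [1, 2, 3, 4],
    "Genre(s) listed": ["Action", None, "NA", "Sports"],
})


def clean_games(df):
    """Rename columns to short names and drop rows without a usable genre."""
    renamed = df.rename(columns={
        "Game ID (internal)": "game_id",
        "Genre(s) listed": "genre",
    })
    # Keep only rows whose genre is present and is not the placeholder "NA".
    has_genre = renamed["genre"].notna() & (renamed["genre"] != "NA")
    return renamed[has_genre].reset_index(drop=True)


cleaned = clean_games(raw)
print(cleaned)
```

Dropping these rows before building the graph keeps "NA" from showing up as a node at all, so it can no longer rank among the most central ones.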

A limitation of this network analysis is that many of the observations in the dataset had to be removed as part of the cleaning process. Even though this improves the accuracy of the results, the remaining data may fall short of painting a full picture of the question we want to answer.

Conclusion:

Nonetheless, from the data given, it seems that the genre that dominated the start of the 21st century was action-based video games, which have the highest degree centrality with a value of 0.575 and appear in the graph as the dark-blue node. Sports games (0.326) and Simulation, Strategy games (0.117) are the second and third most dominant. From here, we can tell which genres were the most popular during this time and which are most likely to be played the most in the years to come.
