Analyzing the Structure of a Video Game Network
I have always been an avid video game player. I play games across many genres, including sports games (Madden, NBA 2K, NHL, MLB The Show), Mario games (Mario Kart, Mario Odyssey, Mario Party, etc.), and many more. However, I had never thought about how games are connected to one another, so I decided to research those connections. In today’s digital age, understanding the structure of a network is valuable to a range of stakeholders, from industry analysts to game developers. In this analysis, I dive into the structure of a network derived from a dataset of video games, with the aim of uncovering insights that could inform decision-making in the gaming industry.
Question Exploration:
With this dataset, the question I decided to investigate is: What are the predominant genres in the video game industry, and how do they intersect across different titles? This question highlights the genres that provide the most content and help video game companies make money. The stakeholders most likely to ask this question are video game development companies. If they can identify which genres are most common, they can build stronger game development strategies, marketing campaigns, and platform investments.
Dataset Description:
The dataset I found comprises video game titles along with other fields, including their associated genres. The fields are: Title, Release Date, Developer, Publisher, Genres, Product Rating, User Score, User Rating, Genres Splitted, and Platforms Info. This analysis focuses on the Title and Genres Splitted fields. The dataset is relevant to the question because it records which genres each game belongs to, which makes it possible to compare genres across games.
Data Collection:
The dataset was obtained from Kaggle, a platform that provides free datasets for purposes including analysis and research. I searched through Kaggle’s video game datasets until I found one with a CSV file well suited to building a network representation. Here is the link to the Kaggle dataset: https://www.kaggle.com/datasets/beridzeg45/video-games?resource=download
Node and Edge Representation:
In my graph, a node represents an individual video game title. An edge connects two nodes when the two games share at least one genre. Edges are weighted by the total number of genres the two games share. For example, the games “Ziggurat (2012)” and “Frantix: A Puzzle Adventure” have an edge with a weight of 1 because both list action as one of their genres in the Genres Splitted column. As another example, the games “4X4 EVO 2” and “MotoGP 2 (2001)” have an edge with a weight of 3 because they have 3 genres in common.
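As a rough illustration of this edge definition, here is a minimal sketch that computes edge weights from genre overlap. It is separate from the script in the repository: the titles and genre sets are made up for the example, `shared_genre_edges` is a hypothetical helper name, and it uses plain Python sets rather than networkx.

```python
from itertools import combinations

# Hypothetical titles and genre sets, invented for illustration only.
games = {
    "Game A": {"Action", "Puzzle"},
    "Game B": {"Action", "Adventure"},
    "Game C": {"Racing", "Simulation", "Automobile"},
    "Game D": {"Racing", "Simulation", "Automobile", "Motorcycle"},
}

def shared_genre_edges(games):
    """Return {(title_1, title_2): weight} for every pair sharing a genre.

    The weight is the number of genres the two games have in common.
    Pairs with no genre in common get no edge at all.
    """
    edges = {}
    for (a, genres_a), (b, genres_b) in combinations(games.items(), 2):
        weight = len(genres_a & genres_b)  # size of the set intersection
        if weight > 0:
            edges[(a, b)] = weight
    return edges

edges = shared_genre_edges(games)
for pair, weight in edges.items():
    print(pair, weight)
```

In this toy data, Game A and Game B share one genre and get an edge of weight 1, while Game C and Game D share three and get an edge of weight 3. Comparing every pair of games grows quadratically with the number of titles, which is one reason to work with a subset of a large dataset.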
Defining “Importance”:
In the context of this graph, importance is measured by the total weight of the edges connected to a node, which signifies its influence within the network. The higher a node’s total weight, the more important it is. The lower its total weight, the less important it is.
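This measure is often called weighted degree. The sketch below shows the arithmetic on a handful of made-up edges. The edge list and the helper name `weighted_degree` are assumptions for illustration, not taken from the repository.

```python
from collections import defaultdict

# Hypothetical weighted edges: (title, title) -> number of shared genres.
edges = {
    ("Game A", "Game B"): 1,
    ("Game A", "Game C"): 2,
    ("Game B", "Game C"): 3,
    ("Game C", "Game D"): 1,
}

def weighted_degree(edges):
    """Sum the weights of all edges touching each node."""
    totals = defaultdict(int)
    for (a, b), weight in edges.items():
        # An undirected edge counts toward both of its endpoints.
        totals[a] += weight
        totals[b] += weight
    return dict(totals)

importance = weighted_degree(edges)
print(importance)
```

Here Game C is the most important node because its three edges add up to 2 + 3 + 1 = 6. For a graph already built in networkx, `G.degree(weight="weight")` gives the same per-node totals.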
Data Cleanup and Limitations:
The data cleanup involved removing the columns that were unnecessary for this analysis. The limitations include potential bias due to dataset size and coverage, as well as the subjective categorization of genres. Because I did not choose the games in this dataset, its creator may have left out many games from a particular genre or filled the dataset with games from a favorite genre. Below is the code for the data cleanup:
Here is the output of the dataframe at this point of the script:
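For readers who want to try a similar cleanup, here is a minimal sketch. It is not the script from the repository. It uses Python’s built-in csv module instead of a dataframe, the three sample rows are invented, and it assumes the Genres Splitted column stores each game’s genres as a Python-style list written out as text. Skipping rows with no genres is also an assumption of this sketch.

```python
import ast
import csv
import io

# Hypothetical extract; the real file has more columns and many more rows.
raw_csv = io.StringIO('''Title,Developer,Genres Splitted
Game A,Studio X,"['Action', 'Puzzle']"
Game B,Studio Y,"['Action']"
Game C,Studio Z,
''')

def clean_rows(file_obj):
    """Keep only the title and parsed genres for each usable row."""
    cleaned = []
    for row in csv.DictReader(file_obj):
        raw_genres = row["Genres Splitted"].strip()
        if not raw_genres:
            continue  # assumption: rows with no genre information are skipped
        # Assumption: the cell holds text such as "['Action', 'Puzzle']".
        genres = ast.literal_eval(raw_genres)
        cleaned.append({"Title": row["Title"], "Genres": set(genres)})
    return cleaned

rows = clean_rows(raw_csv)
for row in rows:
    print(row["Title"], sorted(row["Genres"]))
```

With pandas, selecting the two columns would look like `df[["Title", "Genres Splitted"]]`.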
Visualization:
The next step of my script was to build the network using the Python library networkx. The following image is the code I wrote to create the network, with comments explaining what each part does:
Important Nodes:
My next step was to list the 50 most important nodes and the genre with the most connections. Here is the code I wrote to do so:
Here is the output of this code:
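The ranking step can be sketched in a few lines. This is an illustration rather than the script from the repository. The scores and genre sets are made up, the helper names `top_nodes` and `genre_counts` are hypothetical, and “the genre with the most connections” is interpreted here as the genre that appears in the most games.

```python
from collections import Counter

# Hypothetical importance scores (total edge weight per game).
importance = {"Game A": 3, "Game B": 4, "Game C": 6, "Game D": 1}

# Hypothetical genre sets for the same games.
genres = {
    "Game A": {"Action", "Puzzle"},
    "Game B": {"Action", "Adventure"},
    "Game C": {"Action", "Racing"},
    "Game D": {"Strategy"},
}

def top_nodes(importance, n):
    """Return the n (title, score) pairs with the highest scores."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]

def genre_counts(genres):
    """Count how many games list each genre."""
    return Counter(g for game_genres in genres.values() for g in game_genres)

top = top_nodes(importance, 2)
most_common_genre, game_count = genre_counts(genres).most_common(1)[0]
print(top)
print(most_common_genre, game_count)
```

Calling `top_nodes(importance, 50)` on the full set of scores would give a top 50 list like the one described above.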
Returning to the Visualization:
The last part of my code draws the graph using the matplotlib package. Here is the code:
As you can see in the code, I shortened the labels because they were too congested. Here is the resulting network visualization:
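One simple way to shorten labels is to truncate long titles and mark the cut with an ellipsis. The sketch below shows that idea. The helper `shorten_label` and the 12-character limit are hypothetical choices for illustration and may differ from what the script in the repository does.

```python
def shorten_label(title, max_len=12):
    """Truncate a title to at most max_len characters, ending in '...'."""
    if len(title) <= max_len:
        return title
    # Reserve three characters for the ellipsis and drop trailing spaces.
    return title[: max_len - 3].rstrip() + "..."

titles = ["Frantix: A Puzzle Adventure", "MotoGP 2 (2001)", "4X4 EVO 2"]

# Map each full title (the node name) to its shortened display label.
labels = {title: shorten_label(title) for title in titles}
for full, short in labels.items():
    print(f"{full!r} -> {short!r}")
```

A dictionary like `labels` can be passed to `networkx.draw_networkx_labels(G, pos, labels=labels)`, so the nodes keep their full titles internally while the plot shows the shorter text.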
Analysis and Findings:
The most common genre in the dataset is action, indicating its prevalence within the video game industry. Most games seem to have some form of action in them, even if it isn’t the main focus of the game. This suggests that gamers are interested in some form of fighting, battling, racing, or whatever else counts as action. A game that does not include action may therefore have a harder time being successful in the eyes of its producer. The visualization shows that even in a set of 500 games, most games are closely related to each other in terms of genre: the nodes form a few very large clusters.
Through node analysis, I identified the top 50 video game titles by total edge weight, revealing the titles with the most genre overlap and influence within the network. Looking back at the dataset, I noticed, to no surprise, that every game in the top 50 has action as one of its genres. This is consistent with the earlier finding that action is the most important genre.
Conclusion and GitHub Repository:
In conclusion, action is clearly the dominant genre in video games, and the games with the most genre connections to other games are action games. The outliers are mostly non-action games, since most games have some form of action. Video game development companies should make sure the games they create include some form of action when they plan their game development strategies, marketing campaigns, and platform investments.
For the code used in this analysis, refer to the following GitHub repository: https://github.com/DrossTheBoss/INST414-ModuleAssignment2