How to Find Video Games Similar to Your Existing Favorites.

Shea Versey
INST414: Data Science Techniques
6 min read · Mar 9, 2024

Playing video games is one of my favorite hobbies. Take three games as examples: Wii Sports, FIFA 14, and Left 4 Dead. Each is unique in some way, yet each shares traits with other titles. So how would you find other games similar to these that you would probably also enjoy? The people invested in this question are players who want to find new games to try. We can use similarity metrics to identify games that resemble your existing favorites, which gives these stakeholders the information they need to decide which new games to purchase and play.

The ideal data to answer this question would cover all current video games across all platforms, with stats and features for each game. Genre is very important because it is one of the strongest indicators that a game resembles others in the same category. Ratings showing how much users liked a game are another important field, since this whole analysis is about finding games you would like. Any other feature that describes an aspect of a game and how it relates to other games would also be useful for analyzing similarity.

Using Kaggle, I located an existing dataset called Video Games Data, which has 16 columns and over 16,000 rows, one row for each unique video game. The dataset is updated annually and was most recently updated 5 months ago.

I use a select few columns as features in this similarity analysis: year of release, genre, global sales, critic score, and user score. Each column is weighted equally after the data frame is normalized using min/max scaling, which rescales every column to the range 0 to 1. This prevents columns with inherently larger ranges of values from skewing the results. The genre column is not numeric, so I created a new column for each genre in the dataset and used a binary value in each cell to indicate whether the game falls under that genre. For this assessment I used the Euclidean distance metric, which is reasonable here because the data has already been normalized to a common scale.
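The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the exact code from the notebook: the column names (`Name`, `Year_of_Release`, `Genre`, `Global_Sales`, `Critic_Score`, `User_Score`), the helper functions `build_features` and `most_similar`, and the tiny hand-made data frame with placeholder games are all hypothetical stand-ins for the real Kaggle data. It also assumes each game name appears only once.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Kaggle data. Names, values, and column labels
# are illustrative assumptions, not rows from the real dataset.
games = pd.DataFrame({
    "Name": ["Game A", "Game B", "Game C", "Game D"],
    "Year_of_Release": [2006, 2009, 2013, 2008],
    "Genre": ["Sports", "Sports", "Shooter", "Shooter"],
    "Global_Sales": [80.0, 30.0, 5.0, 3.5],
    "Critic_Score": [76, 80, 88, 89],
    "User_Score": [8.0, 8.0, 7.5, 8.5],
})

NUMERIC_COLS = ["Year_of_Release", "Global_Sales", "Critic_Score", "User_Score"]


def build_features(df):
    """Min/max scale the numeric columns and one-hot encode the genre."""
    numeric = df[NUMERIC_COLS].astype(float)
    # Guard against division by zero when a column is constant.
    span = (numeric.max() - numeric.min()).replace(0, 1.0)
    scaled = (numeric - numeric.min()) / span
    # One binary column per genre, already on a 0-to-1 scale.
    genres = pd.get_dummies(df["Genre"], prefix="Genre").astype(float)
    features = pd.concat([scaled, genres], axis=1)
    features.index = df["Name"]
    return features


def most_similar(features, target, k=10):
    """Return the k games with the smallest Euclidean distance to target."""
    diffs = features.to_numpy() - features.loc[target].to_numpy()
    dist = pd.Series(np.sqrt((diffs ** 2).sum(axis=1)), index=features.index)
    # Drop the target itself, whose distance is always 0.
    return dist.drop(target).sort_values().head(k)


features = build_features(games)
ranking = most_similar(features, "Game A", k=3)
print(ranking)
```

Because every genre column is either 0 or 1, two games in different genres differ by a full unit in two columns at once. That may help explain why the rankings for each sample game are dominated by titles from the same genre.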

Snapshot of the Normalized Data Frame with the selected columns tested for Euclidean Distance

Now it’s time to see the results of the assessment for each of our sample video games.

Wii Sports:

Code and output for top 10 most similar games to ‘Wii Sports’

The top 10 video games most similar to Wii Sports and their corresponding Euclidean distances from Wii Sports are ranked above. The list from most to least similar is Wii Sports Resort, Wii Fit, Wii Fit Plus, Kinect Sports, Sports Champions, EA Sports Active, Mario Strikers Charged, Hot Shots Golf 3, Skate 3, and Mario Super Sluggers.

It is no surprise to see several other Wii games centered on sports and activity at the very top. In fact, every single video game in that list falls under the sports genre. As a Wii Sports player, you have very likely played some of these games, such as Wii Sports Resort and Wii Fit. If you haven’t, these are two new games you should play if you want the same enjoyment you get from Wii Sports. Some other games on the list are more distinct despite their Euclidean similarity. If you seek a game that is not exactly like Wii Sports, then Sports Champions and Hot Shots Golf could be great options, since these games are on PlayStation and would offer a different playing experience while still matching your interests.

FIFA 14:

Code and output for top 10 most similar games to ‘FIFA 14’

Here are the 10 most similar video games to FIFA 14. The most similar are FIFA 17, FIFA 15, FIFA 16, Tiger Woods PGA Tour 2005, Tiger Woods PGA Tour 06, NHL 17, College Hoops 2k8, Pro Evolution Soccer 2016, 2014 FIFA World Cup Brazil, and Madden NFL 13 in order from most to least similar.

Even less surprisingly than with the Wii Sports results, the top 3 spots go to other games from the same franchise: FIFA 17, FIFA 15, and FIFA 16, respectively. Most people searching for new games are probably not looking for different iterations of the same game; they want something that is at least a little different. If the player is really into soccer games and wants a new spin on one, then Pro Evolution Soccer 2016 and 2014 FIFA World Cup Brazil are two good options to try. If soccer specifically is not what motivates their search, then games like Tiger Woods PGA Tour and NHL 17 are the best options for similar gameplay built around a different professional sport.

Left 4 Dead:

Code and output for top 10 most similar games to ‘Left 4 Dead’

Finally, the top 10 most similar video games to Left 4 Dead can be seen in the image above. The list is as follows: Red Faction, Killzone 2, Resistance: Fall of Man, Gears of War 3, Crysis, Battlefield 3, Far Cry 3, Far Cry, Uncharted 4: A Thief’s End, and Gears of War.

Unlike the first two video games, this is not a sports game but a shooter. What makes Left 4 Dead unique is that it is a zombie shooter with a strong sense of adventure that some shooter games don’t share. For a player looking for a game similar to Left 4 Dead, I would recommend Far Cry and Uncharted first, even though they rank 8th and 9th respectively. This is because some of the games at the top are more army-centered shooters, whereas these two also have an adventure element that relates more closely to Left 4 Dead.

As can be seen from all 3 of these queries, there are more than enough options to choose from, and all of them bear some similarity to the target games. With a little more background research on the results, any player should have success finding more video games to enjoy.

Cleaning this data was a quick and easy process. As mentioned earlier, the dataset came from Kaggle, which generally hosts clean, well-organized data. Preparing the data for this similarity assessment consisted of selecting the relevant columns and replacing troublesome values. Many cells had null values: some in the critic score column, some in the user score column, and some in other fields. Another value I needed to replace was ‘tbd’, which showed up mainly in the user score column. Like the nulls, it had to be replaced with 0 so the data could be normalized and tested. I did not run into any bugs, and I would recommend this dataset for future use.
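A cleaning step along these lines might look like the sketch below. It is only an illustration: the column names, the `clean_scores` helper, and the three placeholder rows are assumptions, and the real notebook may handle the replacement differently.

```python
import numpy as np
import pandas as pd

# Placeholder rows showing the two kinds of troublesome values:
# nulls and the string 'tbd'. Column names are assumed.
raw = pd.DataFrame({
    "Name": ["Game A", "Game B", "Game C"],
    "Critic_Score": [76.0, np.nan, 88.0],
    "User_Score": ["8.0", "tbd", None],
})


def clean_scores(df, score_cols=("Critic_Score", "User_Score")):
    """Return a copy with non-numeric or missing scores replaced by 0."""
    cleaned = df.copy()
    for col in score_cols:
        # errors="coerce" turns strings such as 'tbd' into NaN,
        # so a single fillna(0.0) covers both cases.
        cleaned[col] = pd.to_numeric(cleaned[col], errors="coerce").fillna(0.0)
    return cleaned


cleaned = clean_scores(raw)
print(cleaned)
```

One consequence worth keeping in mind is that a score of 0 sits at the bottom of the scaled range, so after min/max scaling an unrated game looks the same as one with the lowest possible score.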

The limitations of this analysis stem from the dataset. Not every column that might have influenced the results could be used. For example, the publisher column has hundreds of unique values, too many to create a separate column for each. Another limitation is that context is important. A few games that were practically duplicates showed up in the rankings, and some games in the same genre ranked highly yet weren't truly as similar as lower-ranked options. No feature is a perfectly consistent representation of how one game relates to another, which leaves room for inaccuracy.

GitHub Link: https://github.com/skversey/INST414-Module-Assignment-3
