
‘Steam’s Top 10’ Most Positive Rated Games to Play 2019

Daniel Aguilar


Data collected in May 2019 by Nik Davis from the games available on Steam at that time.

Looking For The Top Ten

Which games on Steam, the digital distribution client for video games, were the most positively rated in the first half of 2019, and why?

Nik Davis and The Data

Using Kaggle, the open dataset platform, I started looking for a dataset that covered Steam games along with the ratings for those games. Luckily for me, I found one created by a data scientist named Nik Davis.

Nik constructed a clean dataset using the Steam and SteamSpy APIs. You can find the same dataset I used here for your own observations!

Breaking Down the Data

The dataset contained 27,075 games. Now that is a lot of games to sort through! Luckily, with the power of Python libraries like pandas and NumPy, I was able to create a data frame and start sorting the dataset. The following was my strategy for breaking down the data:

  • Identify which of the 27,075 games in the dataset had the most positive ratings (most popular).
  • Identify which of the games in the dataset had the most negative ratings (least popular).
  • Look for any relationships between the other columns in the dataset and the positive and negative ratings.
  • Finally, use that information to answer the question of which games were the most popular and why.
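A minimal sketch of how the first look at the data might go with pandas. The column names (`positive_ratings`, `negative_ratings`, `average_playtime`, `genres`) are assumptions about the Kaggle file, and the inline CSV is a tiny stand-in: only the two positive-rating counts for Counter-Strike: Global Offensive and Dota 2 come from this post, everything else is made up for illustration.

```python
import io

import pandas as pd

# Stand-in for the Kaggle CSV. Column names are assumed; apart from the two
# positive_ratings values quoted in this post, the numbers are invented.
csv_text = """name,genres,positive_ratings,negative_ratings,average_playtime
Counter-Strike: Global Offensive,Action,2644404,400000,22000
Dota 2,Action,863507,140000,23000
Hypothetical Puzzle Game,Casual,1200,300,90
Hypothetical Indie Game,Indie,450,60,30
"""

# With the real file this would be pd.read_csv("<path to the downloaded CSV>").
games = pd.read_csv(io.StringIO(csv_text))

print(games.shape)   # (rows, columns); the post reports 27,075 rows
print(games.dtypes)  # check that the rating columns were parsed as integers

# Step three of the strategy: how strongly each numeric column tracks
# the number of positive ratings.
numeric = games.select_dtypes("number")
print(numeric.corr()["positive_ratings"].sort_values(ascending=False))
```

A correlation table like this is only a starting point for spotting relationships; the scatter plots later in the post do the same job visually.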

I will use seaborn to present steps one and two, and then Plotly Express (the plotly.express module, usually imported as px) to present my observations for step three.

Top 10 Most Positive Rated Steam Games and Top 10 Most Negative Rated Steam Games

Top 10 most Positive Rated Games

The first step was to apply some conditions to my data frame so that I could sort the data and pull out the top ten most positively rated games you can play on Steam. The following games were the top ten:
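One way to do this kind of ranking in pandas is `DataFrame.nlargest`. The sketch below is an illustration rather than the exact code behind this post: the column names are assumed, the hypothetical games and most of the numbers are invented, and `TOP_N` is set to 2 only because the toy frame is so small (the post uses 10).

```python
import pandas as pd

# Toy frame; only the first two positive_ratings values come from this post.
games = pd.DataFrame(
    {
        "name": [
            "Counter-Strike: Global Offensive",
            "Dota 2",
            "Hypothetical Game C",
            "Hypothetical Game D",
        ],
        "positive_ratings": [2644404, 863507, 5000, 300000],
        "negative_ratings": [400000, 140000, 200000, 1000],
    }
)

TOP_N = 2  # the post uses 10

# nlargest sorts by the given column in descending order and keeps TOP_N rows.
top_positive = games.nlargest(TOP_N, "positive_ratings")[["name", "positive_ratings"]]
top_negative = games.nlargest(TOP_N, "negative_ratings")[["name", "negative_ratings"]]

print(top_positive)
print(top_negative)
```

The same result can be reached with `sort_values(..., ascending=False).head(TOP_N)`; `nlargest` just says it in one call.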

1) Counter-Strike: Global Offensive

2) Dota 2

3) Team Fortress 2

4) PlayerUnknown’s Battlegrounds (PUBG)

5) Garry’s Mod

6) Grand Theft Auto V

7) Payday 2

8) Unturned

9) Terraria

10) Left 4 Dead 2

Visualizations Using Python’s Visualization Library — Seaborn

Luckily for me, the dataset had already been cleaned by its original author, Nik. This saved me a lot of time, because I could go straight into visualizations with the Python visualization library seaborn. I used seaborn to make quick and simple bar plots showing the number of ratings for both the top ten most positively rated games and the top ten most negatively rated games. (I’ll go into more depth on this later in the post.)
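A bar plot of this kind could be sketched with `seaborn.barplot` as follows. This is an assumption about the approach, not the code from the repository: the column names are assumed, the third game and its count are invented, and only the first two rating counts come from this post.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Already-ranked toy data; the last row is made up for illustration.
top_positive = pd.DataFrame(
    {
        "name": ["Counter-Strike: Global Offensive", "Dota 2", "Hypothetical Game"],
        "positive_ratings": [2644404, 863507, 500000],
    }
)

fig, ax = plt.subplots(figsize=(8, 4))

# Horizontal bars keep long game titles readable on the y axis.
sns.barplot(data=top_positive, x="positive_ratings", y="name", ax=ax)
ax.set_title("Positive ratings of the most positively rated games")
ax.set_xlabel("Positive ratings")
ax.set_ylabel("")
fig.tight_layout()
```

In a notebook the figure displays on its own; in a script, drop the `Agg` line and call `plt.show()`.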

BAR PLOTS

Looking at the data, I identified Counter-Strike: Global Offensive as the most positively rated game on Steam, with 2,644,404 positive ratings. Behind it is Dota 2, with 863,507 positive ratings. Since I was mainly interested in the top ten most positively rated Steam games, I won’t be posting a list of the top ten most negatively rated games. They are still easy to read off the bar plot above labeled NEGATIVE RATING FROM TOP 10 MOST NEGATIVE RATED GAMES.

I will, however, go a little more in depth on possible relationships between the ratings and the other columns in the Kaggle dataset, which include average playtime, owners, price, and genres, among others.

SCATTER PLOTS

GIF showing a demo on how to look at the data visualization

Below is a snapshot of the two Plotly scatter plots I created.

After setting up my parameters for px.scatter, the scatter plot function in Plotly Express, I observed some interesting information in both visualizations.

The first thing I had to do was decide what would control the size and color of the bubbles, which meant assigning an attribute to each. Since the scatter plot shows the relationship between average playtime and ratings, I thought it would be interesting to also show the genre of each game and its approximate number of owners. For that reason I mapped the number of owners to the size of each bubble and the genre to its color.

MY OBSERVATION

Each title for the scatter plot has an embedded link! Click the title to interact with the plotly visualization!

Positive Ratings x Average Playtime

Looking at the first scatter plot, you can see immediately that Counter-Strike: Global Offensive sits at the very top, with the highest average playtime and the most positive ratings. Following it are clusters of other games that are predominantly Action (genre color blue), with high positive ratings and high average playtime. These action games also have a high number of owners, shown by the size of each bubble. That is when I concluded that the most popular games on Steam were action games, specifically first-person shooters (FPS), given how many FPS games appear on the top ten list I had filtered out with pandas.

I decided to keep going and look at which games on Steam were the top ten most negatively rated, and to identify possible reasons why.

Negative Ratings x Average Playtime

To my surprise, about half of the games on my top ten most positively rated list also appeared on my new list of the top ten most negatively rated games. At first I was confused, but then it hit me. Those games have a high volume of players and a high average playtime, so they collect not only a lot of positive ratings but also a lot of negative ratings, simply because so many people are actively playing them! This is seen on the scatter plot below, where the bubbles with the most negative ratings still have a high average playtime and a high volume of players. The genre is still Action, which suggests that the majority of Steam users tend to download and play action games overall.
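The overlap between the two lists can be checked directly with a set intersection. This is a sketch under the same assumptions as before: assumed column names, a toy frame in which only the first two positive-rating counts come from this post, and `TOP_N` set to 3 instead of 10 to fit the small sample.

```python
import pandas as pd

# Toy frame; hypothetical games and most numbers are invented.
games = pd.DataFrame(
    {
        "name": [
            "Counter-Strike: Global Offensive",
            "Dota 2",
            "Hypothetical Game C",
            "Hypothetical Game D",
            "Hypothetical Game E",
        ],
        "positive_ratings": [2644404, 863507, 5000, 300000, 100],
        "negative_ratings": [400000, 140000, 200000, 1000, 50],
    }
)

TOP_N = 3  # the post uses 10

top_positive = set(games.nlargest(TOP_N, "positive_ratings")["name"])
top_negative = set(games.nlargest(TOP_N, "negative_ratings")["name"])

# Games that appear on both lists: heavily played titles collect
# many ratings of both kinds.
overlap = top_positive & top_negative
print(sorted(overlap))
```

In this toy sample the two most-played titles land on both lists, which mirrors the pattern described above: raw rating counts follow player volume in both directions.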

It was very interesting to see which games on Steam are the most popular. Although the dataset only runs through May 2019, I would imagine the top ten most positively rated games are still on that list this late in the year. It would be interesting to analyze similar datasets for other platforms, such as other PC clients like Blizzard and Origin, or console games for Xbox One and PlayStation 4.

What do you guys think? What are your favorite games, on Steam or in general? Are you a League of Legends multiplayer online battle arena (MOBA) fanatic, or are you a person who likes solo campaign games like the Dark Souls series? Maybe battle royale games like Fortnite! Let me know!

Here is a link to my GitHub repository so you can check out the code for the data visualizations yourself! Also check out my portfolio!


Daniel Aguilar

Hello! My name is Daniel and I am a United States Marine Reservist and a software engineer!