Steam Reviews empower users to voice their opinions, knowing their voice can make a change. As empowering as that is, most developers know that ideas never execute the way they do on paper. This is why games have terms of service, and why games ship balance changes: using a system in ways it wasn’t designed for is a fact of life. Most games simply patch to fix these kinds of exploits.
But what would happen if there were no patch to rebalance and close the exploits, leaving player-driven content exploitable and used for something other than its original purpose?
Specifically, there are two exploits that manipulate reviews in an unintended (meta) way that only online gamedevs experience, and doubly so if they moderate.
Disclosure (a TL;DR for skimming readers):
To be clear up front: this article is not about review content, but about the weight (impact) of two specific kinds of meta reviews on review percentage scores. We will explain both types of meta reviews below. This in no way expresses a belief that *all* negative reviews are bad.
Now that that’s out of the way, let’s first discuss top-level analytics.
For comparison, our game (Throne of Lies) has only 12 recent reviews for the month and a matching ~1% overall review engagement; we seem to closely match the average.
Well, what does this mean?
- In a game with over 100,000 players, only 1,000 (1%) have actually reviewed, leaving the accuracy to be <1% (why less-than? See below).
- If there are 2,000 players playing per month (not necessarily new) who may or may not have already reviewed (unique to online games with unlimited gameplay), then a “recent reviews” score built from only 15 reviews would have just 0.8% accuracy, before even considering troll reviews, 0.1-hour login-screen reviews, “burned out” reviews (discussed below), or “revenge reviews” (discussed later).
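To put the sampling math above into code, here is a minimal sketch (it uses the illustrative player and review counts from the bullets; the 50/50 positive-share default is my assumption, not the article’s data) that computes review engagement plus the classic 95% normal-approximation margin of error for a sample that small:

```python
import math

def review_sample_stats(players: int, reviews: int, positive_share: float = 0.5):
    """Engagement rate plus a 95% margin of error for a review sample."""
    engagement = reviews / players
    # Normal-approximation margin of error for a proportion (worst case at 0.5).
    moe = 1.96 * math.sqrt(positive_share * (1 - positive_share) / reviews)
    return engagement, moe

# 100,000 players, 1,000 lifetime reviews -> 1% engagement, fairly tight score
eng, moe = review_sample_stats(100_000, 1_000)
print(f"overall: {eng:.1%} engaged, score +/- {moe:.0%}")

# 2,000 monthly players, 15 recent reviews -> the score can swing ~25 points
eng, moe = review_sample_stats(2_000, 15)
print(f"recent:  {eng:.2%} engaged, score +/- {moe:.0%}")
```

The takeaway matches the text: at 15 reviews a month, the “recent” percentage is statistical noise before any manipulation even starts.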
1 — “Burned Out” Reviews
When I was a teenager, I worked customer service in a grocery store. You’d be surprised how many people attempted to return fully eaten steaks, leaving just the T-bones. “We didn’t like the steak,” one customer claimed, after already eating 20 steaks over an extended period of time. Our manager told me to refund him, which I thought was wild. Sure enough, the next month he bought 20 more steaks and returned them for “not being good.” Our manager recognized the exploit and “patched” it by denying refunds after the second attempt.
Unfortunately, no patch has yet arrived to fix this form of review manipulation, leading to “2,000-hour” negative reviewers who continue playing even after reviewing.
Online games are known for updating content, providing unlimited replayability, having a longer shelf life than a linear single-player game, and delivering thousands of hours of replayable content for the same price as a single-player game.
What happens when you play a single-player platformer with an 8-hour shelf life and enjoy the game? Well, perhaps 1% of players will leave a positive review!
Now, what if you played that same game 250 times (2,000 hours)?
Well, you’d get bored! Not because it’s a bad game, but because you OVERPLAYED it because you enjoyed it immensely.
This sounds funny, doesn’t it? If you played the same game for 2,000 hours, it sounds like a “go-figure thumbs-up”. However, with online games, this is not the case:
Online Steam games are doomed to a majority chance that any given negative review is a “burnout review” from someone who liked the game too much. These games’ biggest fans are often the ones who leave a negative review, yet keep playing.
This may boggle your mind; however, think of it this way:
When do you STOP playing an online game with unlimited content?
This is the true cause of these reviews. Sure, the dev could release more content patches, could do x or y, could buff the user’s favorite class — but everyone still has a “quitting point” no matter how targeted the updates may be for the user.
So, when is this quitting point? Hint: It’s not about logic, but emotion.
Anything that tips the emotional scale! It’s often the smallest trigger point, often something very obscure. The top 3 quitting points are:
- Most often, the trigger is simply random toxicity arriving when you’re already at the “played enough” quitting point. Ever been in a relationship you wanted to end, but were waiting for “a sign” to quit? It could be anything! If someone gives you a hard time, there it is. The 2,000-hour negative review will likely say “bad community!” despite having tolerated the community for the first 1,999 hours.
- Response to a moderation disciplinary action, regardless of the form it comes in. Logic does not matter here: you could present the player with blatant evidence, in their own words, of some of the most hateful behavior you’ve ever seen, but being caught triggers a defensive response (which leads to vengeful emotions, which leads to a negative revenge review).
- Most revenge reviews come from players with 100+ hours played, which may stem from an emotional response along the lines of “I don’t deserve this warning (or worse) because I’ve played the game for X hours.” The resulting negative review generally mentions keywords such as “SJW devs” or similar deflection onto others. More about revenge reviews soon.
- Meta reasons: online games always have a community. For example, our Discord has 14,000 players and over 36,000 suggestions to improve the game. When a user has played 2,000 hours and their feedback wasn’t implemented, there will often be a “Devs don’t listen to feedback” review, even if bountiful amounts of community-delivered feedback were implemented.
However, while I say “quitting point,” this average reviewer category keeps playing for months, even years, to come!
“Burned Out” reviews are a depressing reality for online games that deliver massive replayability for the buck, leaving gamedevs with a higher chance of a negative review the more a player enjoys the game.
Let’s go in-depth on the second review type, which often combos with burnout:
2a — “Revenge” Reviews
Ever feel that even a single person full of toxicity or someone cheating can ruin a game for you? You’re not alone. However, there’s a cost to removing these players. The backlash of moderation — even a warning — often results in a “revenge” review.
If you’ve never heard of this, it may be because it only happens to online games with moderation. Our game, for example, boasts one of the most efficient moderation systems in the indie industry. But is this a good thing for gamedevs?
As it turns out, it’s great for the community, but it causes a gamedev to silently bleed without thanks.
It only takes 6 negative reviews per month to bring an online indie game to its knees.
The example above is our own game. To avoid biased considerations, let’s just talk numbers:
- Over the past 2 years, including some mini review bombs, and even with the 2 review types that burn online gamedevs, we’ve still maintained a “Very Positive” overall score.
- In the previous month, our recent-review score swung from as high as 93% to as low as 45%. There were no major changes that month; no loot boxes added, nada.
- This month, we dropped to 33% (mostly negative). However, the #1 thing to notice is this:
Out of 12 (!?) user reviews. Yep. 12.
Off-Topic Rant: Review Aggregation Considerations
Let’s take a quick breather to get distracted. Hey, I’m a dev, not a writer; cut me some slack ;D Remember the analytical report above saying the average number of reviews per month was 15? Well, ours is 10~15. Now, the question is:
If a doctor told you to take a medicine that had been given to 100k people, but only 12 people reported back (“Still alive, for now!”), would you take that medicine?
How about a controversial perspective? Politics :oh no!!:
If I went to a non-swing state (comparable to revenge/burnout reviewers, who likely have a biased opinion) and surveyed only 15 people, would you say that’s accurate enough information to predict who will become the next president?
Recent review numbers (and even overall ones) represent little more than a figurative number, generated from a pool too small and too poorly quality-controlled to offer any form of accuracy.
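To make the small-pool point concrete, here is a quick simulation sketch. The 75% true sentiment and the 24-month window are purely illustrative assumptions: the game’s real quality never changes, yet the monthly “recent review” percentage built from 15 reviews still swings wildly on its own.

```python
import random

def monthly_scores(true_positive_rate: float, reviews_per_month: int,
                   months: int = 24, seed: int = 7) -> list:
    """Simulate monthly 'recent review' percentages for a fixed true sentiment."""
    rng = random.Random(seed)  # seeded for reproducibility
    scores = []
    for _ in range(months):
        positives = sum(rng.random() < true_positive_rate
                        for _ in range(reviews_per_month))
        scores.append(positives / reviews_per_month)
    return scores

# A game players genuinely like 75% of the time, judged on 15 reviews/month:
scores = monthly_scores(0.75, 15)
print(f"min {min(scores):.0%}, max {max(scores):.0%} over 24 months")
```

The gap between the best and worst month is typically tens of percentage points, before a single troll, burnout, or revenge review enters the pool.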
2b — Revenge Reviews (Continued)
Consider that even Steam promotes moderation, as seen in the recent DotA wave and publicly within their Steamworks docs. Casual moderation causes no pain for AAA games, thanks to a sheer number of reviews that a few negatives can’t budge (eg, ARK with 181,000 overall and 968 recent), but indie games don’t have that “feature”:
Quite the opposite: indie devs that moderate online games can only sit silently and bleed. If you don’t moderate an online game with social engagement, then, well, I honestly judge you from a gamedev’s perspective, but I also envy you from an experienced online gamedev’s perspective. Here’s the thing:
Statistically, we issue ~110 disciplinary actions per month (as low as a warning or even a “warning for a warning”), not even including meta actions (like a Discord/forum/discussions mute+).
If we only get 10~15 reviews per month, yet issue 110 disciplinary actions per month …
(See where this is going?)
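Here’s where it’s going, as a back-of-the-envelope sketch. The 5% revenge rate and the organic-review figures below are purely illustrative assumptions (not the author’s data); the point is that even a tiny fraction of 110 monthly actions swamps a 10~15 review month:

```python
def score_after_revenge(organic_reviews: int, organic_positive_rate: float,
                        actions: int, revenge_rate: float) -> float:
    """Recent-review percentage once revenge reviews join the organic ones."""
    revenge_reviews = int(actions * revenge_rate)  # floored; all assumed negative
    positives = organic_reviews * organic_positive_rate
    return positives / (organic_reviews + revenge_reviews)

# 8 organic reviews at 75% positive, plus fallout from 110 disciplinary actions:
print(f"{score_after_revenge(8, 0.75, 110, 0.05):.0%}")  # 75% organic -> ~46% shown
```

With zero revenge reviews the month reads 75% (“Mostly Positive”); with just 5 of 110 disciplined players retaliating, the same month reads “Mixed.”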
When someone receives a warning (or worse), there is a significant chance of triggering a revenge review. The average revenge reviewer continues playing the game for months, even years, after the action is issued and resolved. We have had players revenge review at 400 hours over a warning in 2017, then still be seen actively playing at 2,000+ hours in 2019. This form of review manipulation is destructive to indie games that moderate.
After all, to moderate, it takes:
- a back-end server that costs $$
- massive amounts of resources for a report system
- a report review system
- a report resolution system
- a massive amount of emotional stress (guess what the topic of most negative reviews is; hint: it’s rarely about the game, but about the person who banned them; but I digress)
You get the point: it takes a ton of time, money, and people to make this happen. We don’t want to ban any players unless necessary, as our score will likely go down when we do.
However, it is often necessary: if you don’t issue bans to the bad guys, the good guys will quit instead, and the baddies will carry on.
There comes a point where you must choose between the bad guy, or the people surrounding the bad guy.
We have a #justice NSFW channel in our Discord that streams semi-anonymized bans in realtime, with the reason for each ban, as proof of moderation.
Folks browse this channel all the time, and we rarely get any appeals or “Hey, that looks unfair!” We also include the number of warnings/suspensions on the player’s record to further show what may have pushed an offense over the line into a suspension (eg, what would normally be a warning may become a suspension for someone with a huge suspension history).
However, generally within 24 hours of a disciplinary action being issued, a revenge review appears on Steam; and once the action subsides, the user is almost always seen actively playing afterward, without ever changing the review.
From a different perspective, let’s say you go to your favorite cafe every day for 2 years. One day, you decide to stand up on your chair, point at people, and shout as loud as possible, “F ALL <EXTREMELY HATEFUL RACIST WORDS>” repeatedly.
Well, you’re likely going to get kicked out, banned, or given a “Please don’t do that” warning. Now, after the offender is kicked out, 3 things happen:
- The entire cafe claps, very thankful to the staff. If nothing had been done, everyone would have left.
- The person kicked out writes a Yelp review, “WORST CAFE EVER!”
- 0 of the 100 people in the cafe write a positive Yelp review. Counting the single positive review the cafe already had, that cafe is now going to sit at 50% with only 2 reviews.
[Optional] What often happens in social environments (online games / cafes) is the negative reviewer will also gather friends to further negative review with them. Let’s say 2 more.
The cafe now sits at a 25% rating for saving hundreds; alas, a thankless effort in terms of reviews. With only 4 total reviews, 3 of them manipulated, that cafe is now suffering for doing the right thing.
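The cafe arithmetic generalizes into a one-liner. This sketch (the numbers mirror the hypothetical cafe plus a made-up large chain, not real data) shows why the same 3 revenge negatives flatten a small venue yet barely dent a big one:

```python
def rating_after_incident(prior_positive: int, prior_negative: int,
                          revenge_negatives: int) -> float:
    """Percent-positive rating after revenge negatives pile on."""
    total = prior_positive + prior_negative + revenge_negatives
    return prior_positive / total

# The cafe: 1 existing positive review, then the ejected patron plus 2 friends.
print(f"{rating_after_incident(1, 0, 3):.0%}")        # drops to 25%
# A big chain at 80% positive with 1,000 prior reviews barely moves.
print(f"{rating_after_incident(800, 200, 3):.1%}")    # still ~79.8%
```

This is the ARK-vs-indie asymmetry in miniature: the denominator is the whole defense.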
Dev Emotional Tolls in Exchange for Moderation
While this article does not cover the emotional toll of moderation, or the targeted abuse that generally accompanies it in negative reviews (some spend paragraphs insulting the dev instead of reviewing the game), I can assure you it is a very relevant topic deserving of its own article.
And this is the life of an indie online gamedev: $9.99 for 2,000+ hours’ worth of content, with an impending negative review over some obscurity. Thousands of hours invested into moderation, only to hurt ourselves on the double-edged sword of moderation actions. The many people we save from toxicity and cheaters will never be reflected in reviews: a disciplinary action is a blatant reminder to revenge review, while the “no news is good news” fans are so immersed in happiness that reviewing is simply forgotten.
What can save us?
- Me? We update. We listen to the community. We engage directly. We do as much as we can possibly do with only 1 full-time developer (me): this is how, after all of the above, we maintain a “Very Positive” overall score.
> When it comes to “recent reviews” and the “true” score of a game, it’s up to Valve to come up with a review revamp that considers the “no news is good news” crowd. I, myself, am bad at reviewing the games I love most because I’m immersed in them, forgetting to review at all when I’m happy. There’s no reminder, no curve, no automated “assume positive if played X hours.” Simply accounting for the “no news is good news” crowd would DEMOLISH the pains caused to online indie devs especially, but really to any indie dev without the luxury of 1,000 new reviews per month to curve out the extremes that come from manipulated reviews.
> An alternate way to save us would be to disconnect the “weight” of a review from the review itself, inspiring reviews that actually talk about the game instead of existing to manipulate a game’s score.
- Epic? With the Epic Games Store’s opt-out review system, perhaps they will be able to swoop in and save indie games one day. Or perhaps Valve will one day realize how manipulated current-day reviews are; arguably more manipulated than Greenlight was. If Greenlight was removed for manipulation in a very similar light, it’s only a matter of time before the same realization hits reviews. It’s not so much about the user’s voice; imo, let players speak their voice. But the weight? Let a reasonable algorithm be in charge of that, to better represent a realistic and accurate number, if a number at all.
- Drop moderation altogether? Well, it’s been considered numerous times. If I showed you an NSFW screenshot of just a few common cases for warnings/suspensions, how would you feel about moderation being dropped? Even without context, it’s difficult to imagine any context that would make them less than warning-worthy. We’ve observed that “toxic” crowds most often get triggered simply by playing the game and losing, or by teammates not listening to them (even when the advice is bad). We’ve also observed “racist” crowds, who just seem to spout racist spam without being triggered at all.
- Moderation vs. none: the fact is, revenge reviews are triggered; the action is a reminder to leave a negative review. Toxicity is passive; there is no reminder to review. Without moderation, you will have higher review scores (and more players, since visibility is likely increased). However, your retention will likely be significantly lower, and your community will be very hateful (as seen in other games that don’t moderate as effectively as we do).
For now, online game devs can only do the best they can do, then sit back and bleed, hoping to survive until something is changed.
Online gamedevs are pretty rare — I don’t know many. Let’s be friends! If you found this story interesting and wanna say hi, you can find me on my game Discord as Xblade#4242 @ https://discord.gg/tol — Toss me a DM 🍻