The state of Referrerspamblocker.com

Welcome to what is probably the last Referrerspamblocker blog post. Given the latest wave of new spam and the problems with the outdated segments section, we decided to write a post to give you more insight into what’s happening in the world of Google Analytics spam.

New Language Spam

As you have probably noticed, a new wave of spam domains has entered your Analytics account. A few of the most frequently submitted spam domains are ‘abc.xyz, thenextweb.com, addons.mozilla.org, reddit.com’. However, you won’t find these domains in the blacklist. “So why won’t you add them?”, you might ask. If you visit these domains you will find them to be legitimate websites (abc.xyz is actually the website of Alphabet, the parent company of Google). The spammer now uses valid referrer domains and populates the language field with spam. This results in messages like the following being shown in the language section: “Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!”
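To illustrate why this kind of spam stands out once you look at the language field, here is a minimal Python sketch. It assumes that genuine language values look like short locale codes such as “en-us” or “nl”; the pattern and the function name `is_language_spam` are our own illustration, not something Google Analytics provides.

```python
import re

# Assumption: real language values are short locale codes such as
# "en", "en-us" or "zh-hans-cn". Anything else is treated as suspicious.
LANGUAGE_PATTERN = re.compile(r"[a-z]{2,3}(-[a-z0-9]{2,8})*")


def is_language_spam(value: str) -> bool:
    """Return True when a language value does not look like a locale code."""
    cleaned = value.strip().lower()
    if cleaned == "(not set)":
        # Google Analytics reports missing values as "(not set)".
        return False
    return LANGUAGE_PATTERN.fullmatch(cleaned) is None


if __name__ == "__main__":
    spam = "Secret.ɢoogle.com You are invited! Enter only with this ticket URL."
    for sample in ("en-us", "nl", spam):
        print(repr(sample), is_language_spam(sample))
```

A check like this only looks at the shape of the value, so it would also flag any legitimate but unusual language string. Treat it as a starting point rather than a complete filter.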

Recap: Ghosts and crawlers

In general, Google Analytics spam can be divided into two types:

  1. Crawlers, which actually visit your website, resulting in spam in the source/medium parameter.
  2. Ghosts, which use the Google API to inject spam data without actually visiting your website.

In the early stages of GA spam, we decided to focus on blacklisting spammers based on the source parameter. Most spammers were crawlers and almost all ghosts used the source-parameter to spam. Besides that, it was also a safe way to filter only genuine spam (read more about this aspect in our blog on Moz).
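As a rough sketch of what source-based blacklisting amounts to, the Python below checks a referrer source against a set of blacklisted domains, including their subdomains. The function name and the domain `spam-example.com` are hypothetical and only serve as illustration; this is not the code of our tool.

```python
def is_blacklisted_source(source: str, blacklist: set) -> bool:
    """Return True when the source, or any parent domain of it, is blacklisted."""
    host = source.strip().lower().rstrip(".")
    parts = host.split(".")
    # Check "www.spam-example.com", then "spam-example.com", then "com".
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))


if __name__ == "__main__":
    # Hypothetical blacklist entry, for illustration only.
    blacklist = {"spam-example.com"}
    for source in ("www.spam-example.com", "abc.xyz", "notspam-example.com"):
        print(source, is_blacklisted_source(source, blacklist))
```

Matching on whole domain labels rather than on substrings keeps the filter from catching unrelated domains that merely contain a blacklisted name. It also shows the limit of the approach: a legitimate source such as abc.xyz will never be on the list.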

The evolution of spam

However, times have changed: the spam has evolved, becoming more advanced and harder to identify.

Since Ghosts use the API, they have full ‘freedom’ in populating every parameter. This means they can use any parameter to spread spam. They have already spammed our keyword reports and our page reports, and now the language parameter is the latest trend.

But this isn’t even the bad news.

Where Crawlers can be identified by their source, Ghosts also had one parameter that gave them away: the hostname. Since a Ghost never really visits your website, it can’t guess the hostname of the server of your website and link it to the UA-code it’s spamming. That’s why the ‘hostname = (not set)’-filter was a valid way to filter spam.
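The idea behind a hostname filter can be sketched in a few lines of Python. This is a simplified illustration that assumes you keep your own list of hostnames that legitimately serve your tracking code; the name `is_ghost_hit` and the example hostnames are hypothetical.

```python
def is_ghost_hit(hostname, valid_hostnames) -> bool:
    """Return True when a hit reports a hostname that is not one of ours."""
    name = (hostname or "").strip().lower()
    # "(not set)", an empty value, or a foreign hostname all fail this check.
    return name not in valid_hostnames


if __name__ == "__main__":
    # Hypothetical list of hostnames that serve the tracking code.
    valid = {"www.example.com", "example.com"}
    for hostname in ("www.example.com", "(not set)", None):
        print(hostname, is_ghost_hit(hostname, valid))
```

A filter like this relies entirely on the spammer not knowing which hostname belongs to which UA-code.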

However, this last safe harbor has been ‘lost’ too:

Language spam with a valid Hostname

The spammers have found a way to identify the valid hostnames for a website and link them to the UA-code, probably by combining a Crawler and a Ghost to populate all parameters with valid values.

The bigger they are, the harder they fall?

For a moment, we hoped that we were the only ones experiencing this. Our tool has become known around the world as a way to fight spam, so could the spammers have targeted us directly and linked our hostname to our UA-code manually?

The short answer is: no. We’ve seen the same happening to multiple accounts and organisations. We can only conclude that the battle has been lost.

The only one who can win this battle for us is Google. We hope they’re fast, because this latest evolution is making Google Analytics completely useless.

#unhappyblocking

The community has been great. Both in submitting spam domains, which allows us to keep our blacklist up to date and make more users happy, and also in donating money to keep the project running. In total we have received € 204,95 in donations and we are really grateful for that.

But to be honest, we feel like we have lost the battle. Spammers are becoming more creative in getting spam into your Google Analytics account, and we could implement more features to keep up with them. However, we would have to completely rebuild the tool and we simply can’t invest the time to do this. Our company Stijlbreuk is still young (3 years old) and needs all our attention to make it grow. Unfortunately, Referrerspamblocker.com has taken (and still needs) a lot of our time for maintenance and improvement. We also feel that as a product it’s currently not living up to the high quality standards Stijlbreuk aims to maintain. We hoped that, next to giving back to the community, Referrerspamblocker.com would generate new business opportunities for Stijlbreuk, but unfortunately this hasn’t been the case.

We are sorry to announce we are throwing in the towel. We want to thank you guys for all of the support and kind words! We hope that Google will find a solution to block the spam and that their great free tool (Google Analytics) will not lose value for their user base.

So what does this mean? We will update the blacklist until the end of this year and you will still be able to use the tool. We will shut down referrerspamblocker.com at the beginning of 2017.

We hope you understand.

Kind regards,

Aron, Jeroen & Werner