Would you like a free iPhone with that?

Andy Vermeulen
Apr 22, 2022

Let’s pick up where we left off in my previous story: Oops, we had a subdomain takeover. Should we be worried?

To recap, we discovered 87 subdomain takeovers pointing to GitHub Pages, each hosting ~10k landing pages, with each page including an obfuscated JavaScript file from one of 10 .ru domains, e.g. https://js.ekb-tv.ru/trd. We analysed this script and found that it merely conditionally redirects visitors to https://js.ekb-tv.ru/trds?q=.

Redirect #1: https://js.ekb-tv.ru/trds?q=

This redirect to https://js.ekb-tv.ru/trds?q= serves up a virtually empty HTML page with a JavaScript onload hook to redirect the user a second time.

HTML based redirect to blessedwin.life

Repeatedly fetching this URL keeps returning the same .life domain, but the domain appears to have changed at some point:

  • luckygain.life was the domain we first identified
  • blessedwin.life is the domain used at the time of writing this post

The u=, o=, and cid= arguments always come back with the same values. They appear to stay the same even if we remove cookies and other identifying headers, switch our Internet connection to a VPN, or use tethering.

Cookie being set on the redirect

The bsi cookie returned does change on every reload but even with CyberChef’s Magic recipe we couldn’t work out what this value could represent. It is likely either a random value or encrypted.

Redirect #2: https://blessedwin.life/

The JavaScript redirect takes us to a blue-ish loading page which loads a few other resources and, after two or three seconds, redirects us again to a .top domain.

A fake loading page we were redirected to, which redirects us to a .top domain after a few seconds.

This is where things get interesting. There is a fair amount of JS inside this blue-ish page and not your standard jQuery form handling kind. We will de-obfuscate it later to figure out what is going on and see if we can figure out how it decides where to redirect us to.

Before we do, it is also worth mentioning that the page and server behave differently under varying probing scenarios, most notably:

  • A basic curl command to query the same URL returns an NSFW “It’s better than Tinder!” page. It has an “Unsubscribe” link to a form with vague, questionable intro text which bounces you to Google after completion.
  • Trying to access the website homepage simply shows an “Under construction” message.

A questionable ‘unsubscribe’ page that may just be harvesting email addresses instead?

Redirect #3 & #4: https://owljmf.provewarmmind.top/

The page we are redirected to on the .top domain contains some metadata about our location, IP and ISP but doesn’t appear to do anything useful with it. It merely redirects us again to /web on the same domain and carries forward our session id.

Quick redirect back to /web

That /web endpoint doesn’t like us much either and we are redirected yet again, this time with an HTTP 302 response to rockstorageplace.com.

An HTTP redirect to rockstorageplace.com

At first glance the .top domain we were redirected to earlier appears static, however regular probing reveals that it is rotated every 5–25 minutes. Rotating domains this frequently is likely intended to evade detection and blocking features such as Chrome’s Safe Browsing, which only updates its domain blacklists every 30 minutes. The names appear to be algorithmically generated by concatenating three random words:

  • *.provewarmmind.top
  • *.causedivisionflower.top
  • *.areroomself.top
  • *.doublewasmaterial.top
  • *.pastthinnight.top
  • *.firstringmiss.top
  • *.problemcomparefather.top
  • *.busyboatraise.top

By probing every 5 minutes over the course of several weeks and doing lookups on ICANN databases for identical naming patterns or contact details, we were able to find thousands of similar .com, .net, .top, and .xyz domains.

VirusTotal graph documenting the domains and IPs being rotated every few minutes.
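
The bookkeeping behind this kind of probing can be quite small: record which hostname the chain lands on at each poll and note when the registered domain changes. Below is a minimal sketch, assuming the observations have already been collected. The timestamps and the "abcdef"/"ghijkl" subdomain labels are hypothetical, and the function names are our own; only the domain names come from the list above.

```javascript
// Keep the last two labels, e.g. "owljmf.provewarmmind.top" -> "provewarmmind.top".
function registeredDomain(hostname) {
  return hostname.split(".").slice(-2).join(".");
}

// Reduce a time-ordered list of observations to the points where the
// registered domain changed.
function findRotations(observations) {
  const rotations = [];
  let previous = null;
  for (const { minute, hostname } of observations) {
    const domain = registeredDomain(hostname);
    if (domain !== previous) {
      rotations.push({ minute, domain });
      previous = domain;
    }
  }
  return rotations;
}

// Hypothetical probe log, one entry every 5 minutes.
const probeLog = [
  { minute: 0, hostname: "owljmf.provewarmmind.top" },
  { minute: 5, hostname: "mrfgzh.provewarmmind.top" },
  { minute: 10, hostname: "abcdef.causedivisionflower.top" },
  { minute: 15, hostname: "abcdef.causedivisionflower.top" },
  { minute: 20, hostname: "ghijkl.areroomself.top" },
];

console.log(findRotations(probeLog));
```

A changing subdomain on the same registered domain is not counted as a rotation, which matches the wildcard patterns listed above.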

All domains use NS1.FASTSERVICEDNS.COM and NS2.FASTSERVICEDNS.COM as their name servers. Older domains have their DNS records removed. Only the currently active domain has a single DNS A record to an IP that is also seemingly randomly selected:

  • Usually it picks an IP in the 5.189.217.0/24 range in the Netherlands (AS209813, Fast Content Delivery LTD, EU block)
  • On extremely rare occasions we have seen the IP 31.44.185.251 in Russia (AS35029)
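
To sort observed A records into "the usual range" and "outliers", a small CIDR membership check is enough. This is a sketch with our own helper names; 5.189.217.42 is a made-up address inside the range, used purely for illustration.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  return ip
    .split(".")
    .reduce((acc, octet) => ((acc << 8) | Number(octet)) >>> 0, 0);
}

// True when `ip` falls inside `cidr`, e.g. "5.189.217.0/24".
function inCidr(ip, cidr) {
  const [base, bits] = cidr.split("/");
  const prefix = Number(bits);
  const mask = prefix === 0 ? 0 : (0xffffffff << (32 - prefix)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

console.log(inCidr("5.189.217.42", "5.189.217.0/24"));  // true
console.log(inCidr("31.44.185.251", "5.189.217.0/24")); // false
```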

Redirect #5 & #6: https://rockstorageplace.com

Continuing the redirection circus, rockstorageplace.com does another HTTP 302 Redirect to the /away.php page.

The url= query string provided is clearly a Base64-encoded value but decoding it doesn’t give its contents away. It is either a random binary value or, more likely, encrypted content.

Being redirected yet again to /away.php
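
One quick way to tell "encoded text" from "encrypted or random bytes" after Base64 decoding is to count how many of the decoded bytes are printable ASCII. The sketch below is only a rough heuristic with a helper name of our own, and both sample inputs are hypothetical rather than the real url= value.

```javascript
// Decode a Base64 string and report the fraction of printable ASCII bytes.
// Plain text or JSON scores close to 1; encrypted or random bytes score
// much lower, since only 95 of the 256 possible byte values are printable.
function printableRatio(base64Value) {
  const bytes = Buffer.from(base64Value, "base64");
  if (bytes.length === 0) return 0;
  let printable = 0;
  for (const byte of bytes) {
    if (byte >= 0x20 && byte <= 0x7e) printable += 1;
  }
  return printable / bytes.length;
}

// Hypothetical inputs for illustration only.
const textSample = Buffer.from("https://example.com/landing").toString("base64");
const binarySample = Buffer.from([0x8f, 0x02, 0xd1, 0x9c, 0x00, 0xfe, 0x13, 0xa7]).toString("base64");

console.log(printableRatio(textSample));   // 1
console.log(printableRatio(binarySample)); // 0
```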

The next page, away.php, has a self-explanatory name and does an HTML-based redirect, this time to the Google Play store.

The final redirect to the TikTok app on the Google Play store

The Google Play store URL it sends us to turns out to be for the TikTok app. Yes, this is the legitimate TikTok app, not a knock-off. Presumably this is just a random target to move people away from the attacker-controlled domains.

Chrome, Safari, Firefox and friends

Everything described earlier was the behaviour while using the embedded Chromium browser in Burp Suite from an Australian location.

Trying the entire redirect chain from different browsers (including standalone Chrome) and geographic locations (with a VPN) reveals several less benign redirection targets are in play.

Demystifying where we are being redirected to

Clearly the browser, geolocation and potentially other attributes dictate where a visitor is redirected to. Let’s get back to the obfuscated JavaScript we discovered earlier on the blue-ish .top page and see if we can figure out what is going on.

The beautified JS file is in my accompanying GitHub if you want to follow along.

The first script tag in the HTML appears to contain some session state and configuration:

Configuration data being set

The second HTML script tag contains the bulk of the code. It starts off with a few dependencies that appear to be handpicked. We can spot CryptoJS and a few related functions like MD5 (hashing), EvpKDF (key derivation), Base64 (encoding), AES (encryption), etc:

Loading the CryptoJS library
Adding the MD5 function to CryptoJS
Adding the EvpKDF function to CryptoJS
Adding a Base64 encoding function to CryptoJS
Adding a Cipher capability to CryptoJS
Adding an AES cipher to CryptoJS

De-obfuscation #2.1: Simplify

The actual logic doesn’t start until the CryptoJS dependency setup is complete, and once again this logic is obfuscated. As per my prior blog post we run a few simple Regular Expressions to de-obfuscate the code and make it easier to follow:

  • !![] becomes true
  • ![] becomes false
  • Hexadecimal numbers like 0xc335d become their decimal equivalent 799581
  • Bracket notation function calls like _0xc30cb0['push'] become dot notation _0xc30cb0.push instead
  • We will also normalise hexadecimal identifier names to make parsing with Regular Expressions easier. We again drop the 0x from the prefix and add an underscore postfix, so _0xc30cb0 becomes _c30cb0_.
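
The substitutions above can be sketched as an ordered list of regular expression rules. The exact expressions we ran are not reproduced here, so treat the patterns below as one plausible way to write them. Like any regex-only approach, they would also rewrite matching text inside string literals.

```javascript
// Each rule is a [pattern, replacement] pair, applied in order.
const rules = [
  // !![] -> true (must run before the ![] rule)
  [/!!\[\]/g, "true"],
  // ![] -> false
  [/!\[\]/g, "false"],
  // 0xc335d -> 799581 (the \b keeps identifiers such as _0xc30cb0 untouched)
  [/\b0x[0-9a-fA-F]+\b/g, (hex) => String(parseInt(hex, 16))],
  // x['push'] -> x.push (the lookbehind skips array literals like ['push'])
  [/(?<=[\w$\])])\[(['"])([A-Za-z_$][\w$]*)\1\]/g, ".$2"],
  // _0xc30cb0 -> _c30cb0_
  [/\b_0x([0-9a-fA-F]+)\b/g, "_$1_"],
];

function simplify(source) {
  return rules.reduce(
    (code, [pattern, replacement]) => code.replace(pattern, replacement),
    source
  );
}

const sample = "if (!![]) { _0xc30cb0['push'](0xc335d); } else { flag = ![]; }";
console.log(simplify(sample));
// if (true) { _c30cb0_.push(799581); } else { flag = false; }
```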

De-obfuscation #2.2: Undo shifting

Straight away we can recognise a “shifting” function that is self-invoked as the script loads. We can see a seemingly infinite loop while(true) and push and shift calls to cycle the elements in an array of string values.

What appears to be a shifting function
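
For readers who have not seen this pattern before, here is a stripped-down sketch of what such a shifting function does. The table contents, the target value and the single parseInt checksum are all made up for illustration. The real script derives its stop condition from more involved arithmetic.

```javascript
// The string table as it would appear in the file: deliberately out of order.
const strings = ["shift", "7marker", "location", "push"];

// Rotate the table until a checksum derived from it matches a magic number.
(function unshift(table, target) {
  while (true) {
    // parseInt yields NaN unless the entry starts with digits.
    const checksum = parseInt(table[0], 10);
    if (checksum === target) break;
    // Move the first element to the end and try again.
    table.push(table.shift());
  }
})(strings, 7);

console.log(strings); // [ '7marker', 'location', 'push', 'shift' ]
```

Because the rotation count depends on the table contents, simply reading the array literal from the file gives the wrong index-to-string mapping until the loop has run.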

The array of string values being shifted around is a bit harder to find. It’s not the first line of the script but rather inside a function _0x568a that is declared later in the file. This pattern works because a JavaScript function declaration is hoisted: its identifier is statically bound before any JavaScript in the module is executed. (A function expression assigned to a variable is different and is evaluated in normal top-down execution order.) In the example above, that means both _4573_ and _568a_ are already assigned even though their function declarations appear later in the file.

https://262.ecma-international.org/11.0/ #15.2.1.11 Static Semantics: LexicallyDeclaredNames
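
The difference between the two forms is easy to demonstrate in isolation. The names below are hypothetical and unrelated to the sample.

```javascript
// Calling a function declared further down works, because function
// declarations are bound before the first statement runs.
const early = declaredLater();

// A function expression is only assigned when execution reaches it.
// With `let`, touching the name before that point throws a ReferenceError.
let expressionError = null;
try {
  assignedLater();
} catch (error) {
  expressionError = error.name;
}

function declaredLater() {
  return "strings table";
}

let assignedLater = function () {
  return "not available early";
};

console.log(early);           // strings table
console.log(expressionError); // ReferenceError
```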

In order to make the code harder to follow, the obfuscation is re-declaring _568a_ inside the _568a_ function with the same name. Thanks to JavaScript closures this shadows and changes the definition of _568a_ inside the _568a_ function itself while the global definition of _568a_ remains unchanged. Unfortunately this makes de-obfuscating the code with Regular Expressions challenging as we would have to ensure we evaluate these JavaScript closures correctly.

The strings table hidden in a function declaration further down in the JavaScript module
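
A minimal sketch of the shadowing as described above, using hypothetical names and a made-up table. It shows why a purely textual search-and-replace struggles: the same identifier refers to two different functions depending on where it appears.

```javascript
function getStrings() {
  const table = ["push", "shift", "location"];

  // This inner declaration shadows the outer getStrings, but only inside
  // this function body. Callers outside still see the outer function.
  function getStrings() {
    return table;
  }

  // Resolves to the inner function because of the shadowing.
  return getStrings();
}

const firstCall = getStrings();
const secondCall = getStrings();

console.log(firstCall);                // [ 'push', 'shift', 'location' ]
console.log(firstCall === secondCall); // false
```

The two calls return different array instances, so the outer function really does run again each time rather than having been replaced.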

That is a pain, so instead we introduce a breakpoint with the debugger statement after the shifting function has completed and use the Chrome DevTools to capture the unshifted strings array with JSON.stringify(_c30cb0_). We then substitute the only invocation of the _568a_ function with this array and remove the shifting function altogether.

De-obfuscation #2.3: Removing decode

We can now clearly see the fn_decode function and that it also shadows the identifier fn_decode with a nested function with the same name.

A nested fn_decode function shadowing a global fn_decode function

As we did in my previous blog post, we want to figure out what the two arguments to fn_decode represent. We hope to get rid of all fn_decode calls and substitute them with the resulting string values instead.

In this particular script fn_decode is rarely called directly though and is usually assigned to a new identifier first and then called using that identifier instead. We will replace all those instances with direct calls to fn_decode instead which makes them easier to identify.

Assigning new identifiers to hide calls to the fn_decode function
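
Inlining these aliases can be sketched with two regular expressions: one to find the alias assignments and one to rewrite calls through each alias. This is an assumption about how one might automate it, not the exact expressions we used. It ignores scoping, so an alias name reused for something else in another scope would be rewritten incorrectly.

```javascript
// Find aliases such as `const _1a2b_ = fn_decode;` and rewrite every call
// made through an alias into a direct fn_decode(...) call.
function inlineAliases(source) {
  const aliasPattern = /\b(?:var|let|const)\s+([A-Za-z_$][\w$]*)\s*=\s*fn_decode\s*;/g;
  const aliases = [...source.matchAll(aliasPattern)].map((match) => match[1]);

  // Drop the alias declarations themselves.
  let result = source.replace(aliasPattern, "");

  for (const alias of aliases) {
    // Escape "$" so the alias can be embedded in a regular expression.
    const escaped = alias.replace(/[$]/g, "\\$&");
    // Match the alias only when it is a whole identifier followed by "(".
    const callPattern = new RegExp(`(?<![\\w$.])${escaped}\\(`, "g");
    result = result.replace(callPattern, "fn_decode(");
  }
  return result;
}

const aliased = "const _1a2b_ = fn_decode; go(_1a2b_(301, 'k9'));";
console.log(inlineAliases(aliased).trim());
// go(fn_decode(301, 'k9'));
```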

Again, similar to my previous blog post, we can see the first line inside the nested fn_decode is subtracting 277 from the first argument passed. The first argument again appears to be the index in the strings array of the string we are looking up. We can again “lift” this out of this function and into the callers.

Subtracting 277 from the first argument to fn_decode
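
Lifting the offset into the callers is a one-line rewrite once all calls are direct. The sketch below assumes every first argument is a decimal literal, which holds after the earlier hexadecimal conversion. The matching subtraction inside fn_decode has to be removed at the same time, otherwise the offset is applied twice.

```javascript
// The value subtracted from the first argument inside fn_decode.
const OFFSET = 277;

// Rewrite fn_decode(301, 'k9') to fn_decode(24, 'k9') so that the first
// argument becomes the real index into the strings array.
function liftOffset(source) {
  return source.replace(
    /\bfn_decode\((\d+)\s*,/g,
    (_, index) => `fn_decode(${Number(index) - OFFSET},`
  );
}

console.log(liftOffset("go(fn_decode(301, 'k9'));"));
// go(fn_decode(24, 'k9'));
```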

The second argument unfortunately isn’t as straightforward to make sense of this time. It would appear that the character codes from the second argument are used to alter the characters inside the string array itself. That is, each and every string value is offset or encoded differently. Improving our understanding of this complex logic is unlikely to sufficiently help us de-obfuscate this code further.

De-obfuscation #2.4: Removing decode (using brute force)

We will take a brute force approach instead. We create a map with a string key containing the fn_decode arguments being invoked. To collect the corresponding values we invoke the real fn_decode function. We execute this in the Chrome DevTools console which can evaluate the fn_decode calls for us and return to us the resulting string values.

For each result in our map we can then substitute all occurrences of the key inside the JavaScript file with the relevant calculated value.

A map to be evaluated to match every fn_decode invocation with its output.
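
The same idea can be sketched end to end in a single script. Here fn_decode is a stand-in that indexes a tiny made-up table. In practice the real, untouched fn_decode is what gets evaluated, and the call pattern assumes a numeric first argument and a single-quoted second argument.

```javascript
// Stand-in for the real fn_decode, for illustration only.
const fakeStrings = ["userAgent", "location", "webdriver"];
function fn_decode(index, key) {
  return fakeStrings[index];
}

// 1. Collect every distinct call expression found in the source.
// 2. Evaluate each call once to build a call -> string map.
// 3. Replace each call expression with the quoted result.
function substituteDecodeCalls(source, decode) {
  const callPattern = /\bfn_decode\((\d+)\s*,\s*'([^']*)'\)/g;
  const results = new Map();
  for (const [call, index, key] of source.matchAll(callPattern)) {
    if (!results.has(call)) results.set(call, decode(Number(index), key));
  }
  return source.replace(callPattern, (call) => JSON.stringify(results.get(call)));
}

const obfuscated = "if (navigator[fn_decode(2, 'x1')]) { window[fn_decode(1, 'Zq')] = url; }";
console.log(substituteDecodeCalls(obfuscated, fn_decode));
// if (navigator["webdriver"]) { window["location"] = url; }
```

Using JSON.stringify for the replacement keeps any quotes or backslashes inside the decoded strings correctly escaped.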

Following this substitution we can drop the fn_decode function altogether, as well as several helper functions it depends on, as they are no longer in use.

De-obfuscation #2.5: Simplify again

The JavaScript is now largely de-obfuscated and easy to read, however we will repeat the substitution from #2.1 to replace bracket notation with dot notation for method invocations.

We also rename some variables and functions to aid our understanding of the code and the rest of the discussion below.

JavaScript fingerprinting

With the script largely de-obfuscated, we can clearly see that it contains 91 small functions that each perform some measurement or fingerprinting of the browser. These probes are fairly comprehensive and include:

  • simple navigator object probes for the userAgent, operating system, browser plugins, screen size and pixel depth, supported language, timezone, etc
  • multiple detection mechanisms for browser automation hooks for PhantomJS, Selenium, WebDriver, etc and NodeJS
  • checks to detect web browser security and sandboxing products like Kaspersky, Netdefender, Norton Toolbar, Trend Micro, Avira, Symantec, etc
  • fingerprinting of error handling behaviour, setTimeout and setInterval behaviour, permissions queries, etc
  • evaluating browser support for Canvas, WebGL, speechSynthesis, performance and other HTML5 APIs

One of 91 browser fingerprinting functions

These probes are already fairly well documented in past efforts by Christopher Tarry, Antoine Vastel, vah13, or have equivalent probes implemented in the open source FingerprintJS library, so I won’t review them in detail here.

The script runs all these probes in order and builds a fingerprint object containing the outcome of each probe. The majority of the 91 probe functions return only a 0 or 1 outcome rather than the actual values gathered from the browser.

Compiling the output of all 91 checks into a single fingerprint

The first probe chk, persisted as “a0”, is a bit different: it doesn’t run any unique code but rather seems to return 1 if one of a handful of other checks returns 1. We are not sure what its purpose is but speculate that it marks traffic that is deemed uninteresting.

Different behaviour of the first fingerprinting function
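
The overall shape of the probe aggregation, including the a0 behaviour, can be sketched as follows. Everything here apart from the "a0" key is hypothetical: the three probes, the key names a1 to a3, the choice of which probes feed a0, and the env parameter that stands in for the browser globals so the sketch runs outside a browser.

```javascript
// Hypothetical probes; the real script has 91 of them. Each returns 0 or 1.
const probes = {
  a1: (env) => (env.navigator.webdriver ? 1 : 0),            // automation flag set
  a2: (env) => (env.navigator.plugins.length === 0 ? 1 : 0), // no plugins reported
  a3: (env) => ("callPhantom" in env.window ? 1 : 0),        // PhantomJS hook present
};

function buildFingerprint(env) {
  const fingerprint = {};
  for (const [name, probe] of Object.entries(probes)) {
    try {
      fingerprint[name] = probe(env);
    } catch (error) {
      // Assumption: a probe that throws is recorded as 0.
      fingerprint[name] = 0;
    }
  }
  // a0 runs no check of its own: it is 1 if any probe in a chosen set is 1.
  const summarised = ["a1", "a3"];
  fingerprint.a0 = summarised.some((name) => fingerprint[name] === 1) ? 1 : 0;
  return fingerprint;
}

// A fake environment resembling an automated browser.
const automated = { navigator: { webdriver: true, plugins: [] }, window: {} };
console.log(buildFingerprint(automated)); // { a1: 1, a2: 1, a3: 0, a0: 1 }
```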

The resulting fingerprint of all probes is stringified and encrypted with AES-CBC-PKCS7 using a 16-byte key from the configuration data set earlier.

Stringified browser fingerprint
Configuration data including the AES encryption key used
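
For anyone wanting to reproduce this step outside the browser, the sketch below uses Node's built-in crypto module instead of CryptoJS. It assumes the 16-byte key is used directly as an AES-128 key, that the IV is known, and that the output is Base64-encoded. All three are assumptions, and the key and IV shown are placeholders rather than values from the sample.

```javascript
const nodeCrypto = require("node:crypto");

// Encrypt a stringified fingerprint with AES-128-CBC.
// Node applies PKCS#7 padding by default.
function encryptFingerprint(fingerprint, key, iv) {
  const cipher = nodeCrypto.createCipheriv("aes-128-cbc", key, iv);
  const plaintext = JSON.stringify(fingerprint);
  return Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]).toString("base64");
}

// The reverse operation, useful for checking a captured fp= value
// once the key and IV are known.
function decryptFingerprint(encoded, key, iv) {
  const decipher = nodeCrypto.createDecipheriv("aes-128-cbc", key, iv);
  const bytes = Buffer.concat([decipher.update(encoded, "base64"), decipher.final()]);
  return JSON.parse(bytes.toString("utf8"));
}

// Placeholder key and IV (16 bytes each), not taken from the sample.
const placeholderKey = Buffer.from("0123456789abcdef", "utf8");
const placeholderIv = Buffer.alloc(16, 0);

const encrypted = encryptFingerprint({ a0: 0, a1: 1 }, placeholderKey, placeholderIv);
console.log(decryptFingerprint(encrypted, placeholderKey, placeholderIv)); // { a0: 0, a1: 1 }
```

If the script instead passes the key to CryptoJS as a passphrase, the key and IV would be derived via EvpKDF and this sketch would need adjusting accordingly.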

All url query parameters are then concatenated with the link provided in the configuration data and used to redirect the browser window by setting window.location. The query argument fp= is set to the encrypted fingerprint while f= is always set to 1. For example: https://mrfgzh.provewarmmind.top/bcndknyf/?f=1&sid=t4~oey5aihqlolxuamyxm2sm1rq&fp=48lWzoCPJ(abbreviated)
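
Assembling that URL can be sketched as below. The function name and parameter order are our own, the fp value is the abbreviated one from the example above rather than a complete value, and whether the real script URL-encodes its values is an assumption.

```javascript
// Append the query parameters to the link from the configuration data.
function buildRedirectUrl(link, sessionId, encryptedFingerprint) {
  const params = [
    ["f", "1"],
    ["sid", sessionId],
    ["fp", encryptedFingerprint],
  ];
  const query = params
    .map(([name, value]) => `${name}=${encodeURIComponent(value)}`)
    .join("&");
  return `${link}?${query}`;
}

const redirectUrl = buildRedirectUrl(
  "https://mrfgzh.provewarmmind.top/bcndknyf/",
  "t4~oey5aihqlolxuamyxm2sm1rq",
  "48lWzoCPJ" // abbreviated, not a complete fingerprint
);
console.log(redirectUrl);
// https://mrfgzh.provewarmmind.top/bcndknyf/?f=1&sid=t4~oey5aihqlolxuamyxm2sm1rq&fp=48lWzoCPJ
```

In the browser the script then assigns the resulting string to window.location to trigger the redirect.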

So what are the potential redirection targets?

Unfortunately, we cannot confidently say. The redirection happens server-side and the comprehensive fingerprinting of the browser makes it non-trivial to brute force this logic to discover other potential redirect targets and potentially more malicious payloads.

On each visit, 91 browser attributes are being considered in order to detect spoofed browser user agents, automation scripts, and the presence of anti-virus solutions. As soon as one of these 91 attributes doesn’t match up to a “real” browser controlled by a “real” human running a “desirable” target operating environment, we are sent back to the Google Play store or to a 404 page.

We identified a few potential targets from manual testing (as per the earlier screenshots provided), however we do not have access to a sufficiently large data set of matching “real” browser attributes to brute force this with a reasonable chance of success.

What’s next?

As always, time is the enemy and we may not be able to continue investigating this much further. We definitely do have several more threads that we can pull:

  • We can continue to monitor the redirect chains to discover potentially new IPs, domains, or redirect targets
  • We may build a library of “real” browser fingerprints to be able to brute force the redirections and discover other malicious landing pages
  • We haven’t investigated the “congrats you won an iPhone” landing pages to see what they may reveal
  • We haven’t gone back to look at the shell companies and hosting providers to see what else they may be involved in
  • Reports about the affected IPs and domains have started becoming more frequent on VirusTotal since we started looking into this, so we may be able to correlate our findings with other investigations and malware reports

If time permits and anything noteworthy pops up you can expect a third blog post in the series. Until then!
