Browser as Botnet, or the Coming War on Your Web Browser

One spring afternoon I was having lunch with Nick Briz at a small neighborhood diner near our studio in Chicago. We were throwing around ideas for Radical Networks, an upcoming conference in Brooklyn that we’ve been participating in for the last few years. The event brings together artists, educators, journalists, and activists from all over the world to foster discussion and engagement with topics of communication networks and Internet infrastructure through workshops, performances, invited speakers, and an art show.

We’d both participated in the art show since the festival’s inception, but this year I felt compelled to break into the speaker track. In particular, I was toying with presenting an idea I’d had a few days prior: “What if websites borrowed compute resources from their visitors’ devices while they browsed, as a means of distributed computing?”

Because of the way the web was designed, visiting a website requires your web browser to download and run code served from that website on your device. When you browse Facebook, their JavaScript code runs in your web browser on your machine. The code that gets executed in your browser is, of course, assumed to be code related to the functionality of the site you are browsing. Netflix serves code that allows your browser to access their movie database and stream video content, Twitter serves code that allows you to post, view, and comment on tweets, etc…

Technically, however, there is nothing stopping a website from serving arbitrary code that has nothing to do with your browsing experience. Your web browser will blindly execute whatever JavaScript code it receives from the website you are browsing. What’s to stop high-traffic sites like Facebook and Google from abusing this feature of the web, harvesting massive compute resources from their hundreds of thousands of concurrently connected users for free? Was this idea really feasible in practice? If so, was it being used in the wild?

This post is a report of my trip down this rabbit hole of an idea, and a summary of the talk that I ended up giving at Radical Networks as a result of that research.

Browser as Botnet talk @ Radical Networks 2017

Stepping Back, A Bit About Distributed Computing

Before we go too deep into the implications of borrowing users’ compute resources while they unsuspectingly browse the web, I want to touch on why it would be advantageous to do so in the first place. The example scenario that I’ve posed falls into a field of computer science called distributed computing. Distributed computing is the practice of dividing a problem into small chunks and running them on many different computers in parallel, significantly reducing the time needed to compute the problem. In general, distributed computing offers abundant compute resources like many CPUs, high network bandwidth, and a diverse set of IP addresses. For some tasks, distributed computing provides the opportunity for 1,000 computers to work together to solve a task 1,000x faster than it would take one computer to solve that same task working alone.
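
As a rough illustration of "dividing a problem into small chunks", here is a minimal sketch that splits a range of work items evenly across a number of workers and reports the best-case speedup. The function names are made up for this example, and the speedup figure assumes the work divides perfectly with no coordination or network overhead, which real systems never quite achieve.

```javascript
// Split the half-open range [0, total) into `workers` contiguous chunks
// whose sizes differ by at most one item.
function splitIntoChunks(total, workers) {
  if (!Number.isInteger(total) || total < 0) {
    throw new RangeError("total must be a non-negative integer");
  }
  if (!Number.isInteger(workers) || workers < 1) {
    throw new RangeError("workers must be a positive integer");
  }
  const base = Math.floor(total / workers);
  const remainder = total % workers;
  const chunks = [];
  let start = 0;
  for (let i = 0; i < workers; i++) {
    // The first `remainder` chunks each take one extra item.
    const size = base + (i < remainder ? 1 : 0);
    chunks.push({ start, end: start + size });
    start += size;
  }
  return chunks;
}

// Best-case speedup: serial work divided by the size of the largest chunk,
// since the slowest worker determines when the whole job finishes.
function idealSpeedup(total, workers) {
  const sizes = splitIntoChunks(total, workers).map((c) => c.end - c.start);
  const largest = Math.max(...sizes);
  return largest === 0 ? 1 : total / largest;
}

console.log(splitIntoChunks(10, 3));
console.log(idealSpeedup(1000000, 1000));
```

With one million items and 1,000 workers the ideal speedup comes out to 1,000x, which is the "for some tasks" case described above.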

Serial computing (top) vs distributed computing (bottom)

Distributed computing has a rich history that dates back to ARPANET in the 1960s, with a slew of community and volunteer citizen science projects popping up in the late 1990s and early 2000s (partially thanks to the Berkeley Open Infrastructure for Network Computing, or BOINC software). Projects like SETI@Home, Folding@Home, GIMPS, and many others allow computer users to donate idle time on their computers to cure diseases, study global warming, find large prime numbers, search for alien life, and do many other types of scientific research.

Opposite the idea of volunteer distributed computing is the concept of a botnet. A botnet, a portmanteau of “robot” and “network”, is a distributed compute network where the owners of the participating computers don’t know that their computers are participating in the network. Botnets are associated with hacking and criminal activity and are best known for their use in nefarious activities like distributed denial of service (DDoS), e-mail spamming, spyware, click fraud, and more recently, cryptocurrency mining. Botnet software is usually installed on a user’s machine as a trojan or worm and can persist for months or years without the owner knowing, all the while providing compute cycles and bandwidth to an anonymous third party. Occasionally these botnets grow in size until they control tens of millions of unsuspecting users’ computers and become informally recognized and named by members of the cybersecurity community.

Named botnets

Browser Based Botnets

Imagine a situation where your computer is participating as a node in a botnet, only this time malware isn’t installed as a program on your computer. Rather, it runs in the background of the very browser tab you have open reading this blog post. This method would give malicious JavaScript code full access to the sandboxed web browser API, an increasingly powerful set of web technologies. It would also be transient and difficult to detect once the user has navigated off the website, providing compute resources to the botnet in proportion to the number of concurrent website visitors at any given time. What’s to stop high-traffic websites from leeching resources from their visitors for free for as long as they stay on the site?

A bit of digging revealed that this wasn’t a particularly new idea, and that folks had been talking openly about this technique since at least 2012. MWR Labs conducted research on the subject applied to distributed hash cracking on the web (an idea that I elaborated on in a demo during my talk, code here) and Jeremiah Grossman and Matt Johansen had a great talk at Black Hat USA in 2013 on the subject. Both research groups distributed their experiments to unsuspecting users in a notably devious and ingenious way: ad networks.

Traditional methods of distributed computing involve volunteers or viruses, but the landscape is quite different for browser-based botnets. With our approach, we need to distribute our code to as many web browsers as possible at once. We have a few options:

  • Run a popular website
  • Write a WordPress/Tumblr theme and embed our malicious code in the source
  • Run a free proxy server (or Tor exit node), and inject our code into non-HTTPS traffic
  • Be an ISP and do the same ^
  • Embed our malicious code into popular websites with persistent cross-site scripting (XSS) (illegal)
  • Buy a banner ad

Like those before me, I ventured down the dark path of Internet advertising. Did you know that those pesky banner ads that follow you around the web are often iframes, a special HTML element that allows you to embed web pages into other web pages? That sleazy click-bait photo at the top of your favorite torrent site might not be the innocent .JPG you think it is, but rather a web page in its own right, with the ability to deliver custom JavaScript code that gets executed in your browser.

Here’s the idea: advertising networks connect web content publishers (i.e. blogs, news sites, porn sites, forums) to advertisers. Advertisers pay the ad network per click (CPC) or per impression/view (CPM). The network scrapes money off the top before sending it along to the publishers who host the ads on their platforms. When an advertiser creates a dynamic third-party creative (a fancy name for an embeddable <iframe> advertisement) they have the opportunity to include whatever HTML/CSS/JavaScript they want. Usually advertisers abuse this privilege by including nasty tracking code whose purpose is to identify and record information about the user the advertisement is being served to. But technically, there is nothing stopping the code included in the advertisement from instead delivering a malicious botnet payload aimed at harvesting compute and network resources from the user it is served to. Worse yet, certain ad networks allow you to pay them in Bitcoin, potentially allowing the advertisers distributing a botnet payload to remain anonymous (when done right)!

Doing it Anonymously

Given that researchers had luck exploiting these techniques five years ago, I was curious if it was still possible to do so today, or if browsers and ad networks had wised up to these kinds of shenanigans. In preparation for my talk I found an ad network that supported iframes and wrote some pseudo-malicious bots. My goal was to survey the landscape and see what was possible in this domain, specifically utilizing some of the more modern web browser technologies that have evolved since 2012.

As an extra challenge, I wanted to carry out my experiments in a way that was as anonymous as possible, simulating how a nefarious hacker might do the same. Like most anon activity on the net, I needed to start with an anon email address. For that I chose protonmail.ch, a Swiss email and VPN provider founded by a few privacy/security minded CERN employees. Equipped with an untraceable email address I was able to begin searching for a particularly shady ad network. I had three requirements in a network — it had to support dynamic third-party creatives (iframe advertisements), it had to have a minimal ad review process to avoid getting my ads flagged as malicious, and it had to accept payment in some way that would be difficult to trace back to my true identity. After signing up for about a half-dozen networks I hit the mark with popunder.net, a Russian ad network that I would soon come to learn primarily represents publishers of the pornographic type. Popunder allows you to upload your ads as .zip files containing entire static web pages built with HTML, CSS, and JavaScript. They also had top-notch customer support if you were willing to do a bit of Google Translating.

Google Translate

The bots that I was writing worked by communicating with a central command-and-control server that would coordinate the compute nodes, distribute tasks, log experiment results, etc. For this I needed a cloud server to run my back end Node.js code. Here is where I cheated a bit. There are tons of bulletproof and offshore VPSes available for purchase on the web, almost all of which accept Bitcoin as payment. But for convenience, and because as far as I could tell I wasn’t actually doing anything illegal, I chose to use Amazon Web Services (AWS). A nefarious hacker would have no problem finding an anonymous VPS or using someone else’s server that they already compromised.

For added security I wanted to encrypt the communications between my malicious ad bots and the Node command-and-control server, so I also required an SSL/TLS certificate. Let’s Encrypt provides them for free, but like all SSL certificates, you need to own a domain name to get one. Fortunately, Namecheap.com recently announced a new Bitcoin payment method, so equipped with my anon email address, I created an account and registered a $0.88 “.website” domain paid for in Bitcoin.

Before I deployed the first ads, I wanted to configure some sort of analytics tracking to gather information about the types of users the ads were served to. I was primarily interested in geographic location as well as simple time-on-page and recurring visitor statistics. Google Analytics is the standard analytics tracker, but that doesn’t fit very nicely into my anonymous pipeline — plus, I’d rather not feed the Google beast. Matomo (formerly Piwik) is an open source analytics alternative that can be self-hosted on your own server.

Matomo visitor map. Most traffic from popunder.net came from Russia and the United States.

Once I’d determined my anonymous distribution pipeline I began to author a suite of JavaScript bots to deliver via the ad network. My goal was to write a small collection of CPU and bandwidth benchmarking bots in an attempt to measure the concurrent compute and network resources made available by users’ machines. Essentially, I wanted to find out how powerful a browser-based botnet distributed by an ad network could really be. Turns out… pretty powerful.


Experiments

The popunder.net advertising network offers minimum CPM (“cost per mille”, or price for 1,000 impressions) ad buys for $0.04, so I was able to conduct all of my experiments on a budget. Altogether, I spent less than $100 running ads intermittently over the course of one month.
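
To make the CPM arithmetic concrete, here is a small sketch of how impressions and spend relate. The $0.04 figure is the network's stated minimum, so the impression count below is an upper bound on what a budget could buy; actual bids may well be higher. The function and constant names are illustrative.

```javascript
// CPM is the price paid per 1,000 impressions.
const MIN_CPM_USD = 0.04;

// Cost in USD of serving `impressions` ads at a given CPM.
function costForImpressions(impressions, cpmUsd) {
  return (impressions / 1000) * cpmUsd;
}

// Number of whole impressions a budget buys at a given CPM.
function impressionsForBudget(budgetUsd, cpmUsd) {
  return Math.floor((budgetUsd / cpmUsd) * 1000);
}

console.log(impressionsForBudget(100, MIN_CPM_USD)); // upper bound for a $100 budget
console.log(costForImpressions(50000, MIN_CPM_USD).toFixed(2)); // USD for 50,000 impressions
```

At the minimum bid, a $100 budget works out to roughly 2.5 million impressions.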

What would you do with 100,000 web browsers and an afternoon?

Info bot

The first ad simply logged IP addresses, user agents, and visit duration. I started the ad at 9AM CDT on a Thursday, right before heading to work, and ran it for ~3 hours, turning it off around lunch time to analyze some of the results.

I was shocked to see that the ad had been served to 117,852 web browsers from 30,234 unique IP addresses. Surprisingly, a significant portion of the visitors stayed on the page serving the ad for quite a while, which could provide sizable CPU clock time. Some clients even reported back to the command-and-control server over 24 hours after the ad network had stopped serving the ad, meaning that some poor users still had the tab open. Including these outliers, the average time on ad was 15 minutes!

Time on ad. The long tail is chopped off at 600, but it carries into the tens of thousands.

I summed the number of seconds that all browser clients ran the code served by the ad and the total added up to 327 days. That’s the equivalent of one computer running my ad on one web browser for nearly a year, all in just three hours real-time for just around $15 USD of Bitcoin. Hot. Damn.
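
The kind of aggregation described here can be sketched in a few lines. The record shape ({ ip, seconds }) and the sample values are hypothetical placeholders, not the actual log format or data from the experiment.

```javascript
// Summarize visit records: one record per browser client that loaded the ad.
function summarizeVisits(visits) {
  const uniqueIps = new Set(visits.map((v) => v.ip));
  const totalSeconds = visits.reduce((sum, v) => sum + v.seconds, 0);
  return {
    clients: visits.length,
    uniqueIps: uniqueIps.size,
    meanSeconds: visits.length ? totalSeconds / visits.length : 0,
    // Total "browser time" expressed in days (86,400 seconds per day).
    totalDays: totalSeconds / 86400,
  };
}

// Hypothetical sample: two tabs behind one IP, one tab left open for a full
// day, and one quick bounce. Addresses come from documentation-only ranges.
const sample = [
  { ip: "203.0.113.5", seconds: 120 },
  { ip: "203.0.113.5", seconds: 45 },
  { ip: "198.51.100.7", seconds: 86400 },
  { ip: "192.0.2.44", seconds: 15 },
];

console.log(summarizeVisits(sample));
```

Note how a single long-lived tab dominates both the mean and the total, which is the long-tail effect visible in the time-on-ad chart.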

Hash Bot

So this whole thing worked; an ad network turned out to be a brilliant method of distribution. But how powerful was this network compared to, say, the beefy 4.2GHz CPU of the machine that I was using to develop it? To test this I wrote a hashing bot that calculated the SHA1 hash of random numbers in an infinite loop as quickly as possible.

The web browser’s navigator API exposes the number of logical CPU cores available on a machine (navigator.hardwareConcurrency). I used this number to launch one SHA1 hashing web worker per core and reported the current hash rate of the bot back to my server once a second. Web workers can be thought of as a means of multi-threaded JavaScript (it’s not exactly the same, but it serves the same purpose). They allow CPU-intensive JavaScript code to run in parallel on multiple cores without blocking the main UI thread or interrupting the user’s experience of the website.
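
For a feel of what "hash rate" means here, below is a minimal single-threaded benchmark sketch. It is not the bot from the experiment: it runs under Node.js rather than in a browser, uses Node's built-in crypto module instead of a pure-JavaScript SHA1 implementation, and reads the core count from os.cpus() as a stand-in for navigator.hardwareConcurrency. The function names are made up for illustration.

```javascript
const crypto = require("node:crypto");
const os = require("node:os");

// Hash random numbers with SHA-1 for roughly `durationMs` milliseconds on the
// current thread and return the measured rate in hashes per second.
function measureSha1Rate(durationMs) {
  const start = process.hrtime.bigint();
  const deadline = start + BigInt(Math.round(durationMs)) * 1000000n;
  let hashes = 0;
  let now;
  do {
    // Hash in batches so the clock is not read after every single hash.
    for (let i = 0; i < 1000; i++) {
      crypto.createHash("sha1").update(String(Math.random())).digest();
      hashes++;
    }
    now = process.hrtime.bigint();
  } while (now < deadline);
  return hashes / (Number(now - start) / 1e9);
}

// One worker per logical core, with a floor of one in case the platform
// reports no CPU information.
function workerCount() {
  return Math.max(1, os.cpus().length);
}

const perThread = measureSha1Rate(250);
console.log(`${workerCount()} logical cores, ~${Math.round(perThread)} SHA-1 hashes/s on one thread`);
```

Native hashing is much faster than a JavaScript implementation, so the number this prints is not comparable to the KH/s figures in this post; it only shows how such a measurement can be taken.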

The browser clients that received this ad had 3.67 CPU cores on average, boding well for the possibility of multi-threaded exploitation in-browser. Collectively, the SHA1 botnet averaged 324 concurrently connected clients hashing 8.5 million SHA1 hashes per second as an entire network.

On average, 324 bots were connected to the command-and-control server at any given time

While 8.5 MH/s isn’t actually a notably high hash rate for the task of SHA1 hashing, the relatively slow JavaScript implementation I was using ran on my Intel quad-core CPU at a rate of 8–10 KH/s. The speed of the network offered a 100x increase from my home workstation for a nominal cost.
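
Dividing the reported network totals gives a rough sense of what each client contributed. This sketch assumes every connected client hashed at the same rate and ran one worker per reported core, neither of which is guaranteed; the variable names are illustrative.

```javascript
// Figures reported for the SHA1 botnet.
const networkHashesPerSecond = 8.5e6; // 8.5 MH/s across the whole network
const averageConnectedClients = 324;
const averageCoresPerClient = 3.67;

// Average contribution of a single browser client.
const perClient = networkHashesPerSecond / averageConnectedClients;

// Average contribution of a single worker, assuming one worker per core.
const perWorker = perClient / averageCoresPerClient;

console.log(`~${(perClient / 1000).toFixed(1)} KH/s per client`);
console.log(`~${(perWorker / 1000).toFixed(1)} KH/s per worker`);
```

That comes to roughly 26 KH/s per client and about 7 KH/s per worker, the same order of magnitude as the 8–10 KH/s measured on the workstation.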

The network hash rate was consistent and normally distributed
Bots reported their current hash rate to the server once per second

Monero Miner Bot

While conducting this research, I also found myself conducting, *ahem… cough, cough*, other research on The Pirate Bay 🏴‍☠️. I happened to have my system CPU monitor open because I was testing some botnet code a few minutes before and I noticed something peculiar. When I opened certain links on The Pirate Bay my CPU usage would spike to ~80% on all cores. When I navigated away from those links the usage would fall. Had I found an instance of the very abuse that I was studying live in the wild?

Coinhive XMR miner running on The Pirate Bay

I profiled the suspicious pages using the Firefox developer tools and noticed there were six dedicated web worker threads running a script called CryptonightWASMWrapper. This was an immediate red alert. WASM stands for WebAssembly, a new hyper-optimized bytecode spec that runs in the web browser. It provides near-native speed code execution in the web browser, far faster than JavaScript, and is a compiler target for C and C++ code. I also happened to know that CryptoNight was the name of the hashing algorithm used by the Monero cryptocurrency (XMR) and that it had an interesting quality: CryptoNight is only marginally faster to run on a GPU than on a CPU, offering a speedup of only 2x+ rather than the three orders of magnitude speed increase of other popular proof-of-work algorithms. This means that XMR can be mined rather efficiently on a CPU, and in this case, on my computer, courtesy of code served by The Pirate Bay.

Digging deeper, I found a file called coinhive.min.js. Some Duck Duck Go’ing led me to coinhive.com. Coinhive appeared to be a company that was offering an alternative method of monetization on the web. Sites could use Coinhive to embed XMR miners into their web pages that would borrow their users’ CPUs instead of serving them advertisements. This is fairly unprecedented as far as I know and Coinhive appeared to have just been launched the week before. In fact, first reports of it being used by The Pirate Bay didn’t even start to make waves on the net until the day after I stumbled across it.

The timing of Coinhive’s launch coinciding with my research was impeccable, and the interest that it sparked on the web was encouraging. I created an ad that ran a Coinhive.js miner and ran it for an hour and fifteen minutes. I was able to mine the equivalent of $4.20 🌲 in XMR at the time (~$3 after Coinhive’s cut), although the ad itself cost nearly $10 to run. The price of Monero has jumped ~300% since then so this method may now be approaching profitability.
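
A quick back-of-the-envelope check on those numbers: how much would the XMR price need to rise for the same ad run to break even? This sketch uses the rounded figures above and assumes the amount of XMR mined and the ad price stay the same, which ignores changes in mining difficulty and ad rates.

```javascript
// Rounded figures from the experiment.
const grossUsd = 4.2; // value of XMR mined at the time
const netUsd = 3.0; // approximate value after Coinhive's cut
const adCostUsd = 10.0; // approximate cost of the ad run

// Factor by which the XMR price must grow for net revenue to equal ad cost.
function breakEvenMultiplier(costUsd, revenueUsd) {
  if (revenueUsd <= 0) throw new RangeError("revenue must be positive");
  return costUsd / revenueUsd;
}

const payoutShare = netUsd / grossUsd; // fraction of mined value kept
const multiplier = breakEvenMultiplier(adCostUsd, netUsd);
const percentIncrease = (multiplier - 1) * 100;

console.log(`payout share: ~${(payoutShare * 100).toFixed(0)}%`);
console.log(`break-even at ~${multiplier.toFixed(2)}x the price (a ~${percentIncrease.toFixed(0)}% increase)`);
```

Under those assumptions the break-even point is a little over 3.3x the original price.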

DDoS Bot

Botnets are most associated with distributed denial of service (DDoS) attacks. Botmasters use thousands of machines under their control to flood target servers with enough Internet traffic to render their services unusable or rent access to their botnet for others to do the same. Would the popunder.net ad network give me enough concurrent users to perform a DDoS against one of my own servers?

I rented another t2.micro AWS server and installed stock Nginx to serve a boilerplate website accessible on the net. I then launched a DDoS bot on the ad network that made concurrent HTTP requests to my Nginx server as quickly as possible in an attempt to knock it offline. Nginx was able to handle the ~22K requests per second generated by the bots. The service seemed to operate normally during the attack, which delivered 9,850,049 1KB GET requests from 12,326 unique IP addresses.

I had similar results with an Apache 2 server I set up. The default Apache server was able to fend off the bots and handle an average of ~26K requests per second. Both Nginx and Apache did use ~60–100% of their single CPU during the attack.

Request volume for the entire network

While the attacks didn’t render the services unusable (which is actually pretty relieving), I was able to generate a 5.3GB Nginx logfile in just over an hour. The standard AWS micro instance has 8GB of storage, so for only a few dollars it would likely be trivial to fill the entire disk of a small website that has the default logging behavior enabled.
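
For anyone sizing log rotation on a small server, the figures above allow a rough estimate of how quickly access logs can eat a disk. This sketch assumes the 5.3GB log corresponds to the 9,850,049 requests reported earlier, treats "just over an hour" as one hour, uses decimal gigabytes, and ignores space already taken by the operating system, so the real margin is smaller.

```javascript
const GB = 1e9; // decimal gigabytes

// Figures from the Nginx experiment (see the assumptions in the text above).
const logBytes = 5.3 * GB;
const loggedRequests = 9850049;
const observedHours = 1;
const diskBytes = 8 * GB;

// Average access-log bytes written per request.
const bytesPerRequest = logBytes / loggedRequests;

// Hours until the log alone would fill the disk at the observed growth rate.
function hoursToFill(capacityBytes, bytesWritten, hoursElapsed) {
  return capacityBytes / (bytesWritten / hoursElapsed);
}

console.log(`~${Math.round(bytesPerRequest)} bytes of log per request`);
console.log(`~${hoursToFill(diskBytes, logBytes, observedHours).toFixed(1)} hours to fill 8GB`);
```

At that pace an unrotated log would fill the whole volume in about an hour and a half.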

HTTP request volume was consistent throughout the experiment, which is surprising considering the number of concurrently connected browsers should increase somewhat monotonically as users leave browser tabs open.

This is only speculative, but the t2.micro instance provides low network bandwidth and speed in comparison to their more expensive servers, which may have actually throttled the rate that traffic could reach the server. I haven’t run the experiments on a larger instance, but it is possible that attacks would actually be more effective against servers with more network resources. AWS servers are also known for being stable against DDoS attacks, so perhaps attacking a VPS hosted on another platform would be more successful.

Torrent Bot

Finally, the bot I’m most excited to share: the WebTorrent bot. A few years ago a new protocol for peer-to-peer networking communications was introduced in the browser called WebRTC. WebRTC allows web browsers to exchange video, audio, or arbitrary data with each other directly without the need for a third party server to shuffle the information back and forth. A few engineers quickly implemented the popular BitTorrent protocol over WebRTC and WebTorrent was born. WebTorrent allows users to seed and leech files with hundreds of peers entirely through their web browsers. While this technology brings a wealth of opportunities for distributed networking to the web, it also comes with some significant security concerns. Torrents can be downloaded and uploaded in the background of web pages unbeknownst to users, which can become particularly problematic if the content is illegal or otherwise unwelcome.

To measure the potential of such activity I created a torrent of 1GB of random noise data to seed entirely through the ad network. Users that were served the ad automatically downloaded the 1GB file from other users that also had the ad open in a browser tab. The health of the torrent was determined by the number of connected clients at any given time.

The ad ran for 24 hours reaching 180,175 browser clients from 127,755 unique IP addresses. 328.5 KB were uploaded every second by each browser on average, leading to a 702 Mbps upload speed for the entire network.

Clients had an average seed ratio of 2.24 (106.18 max) and uploaded 25 MB of data each (69.28 GB max). The entire network seeded (uploaded) a whopping 3.15 TB of data in a single day.
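
The per-browser and network-wide upload figures use different units (kilobytes per second versus megabits per second), so a short conversion helps relate them. This sketch assumes decimal units (1 KB = 1,000 bytes, 1 Mbps = 1,000,000 bits per second) and that the network figure is simply the per-browser average multiplied by the number of browsers uploading at once; the names are illustrative.

```javascript
// Convert kilobytes per second to megabits per second (decimal units).
function kilobytesPerSecondToMbps(kBps) {
  return (kBps * 8) / 1000;
}

const perBrowserKBps = 328.5; // average upload rate per browser
const networkMbps = 702; // upload rate for the entire network

const perBrowserMbps = kilobytesPerSecondToMbps(perBrowserKBps);

// Number of simultaneously uploading browsers implied by the two figures.
const impliedConcurrentUploaders = networkMbps / perBrowserMbps;

console.log(`~${perBrowserMbps.toFixed(2)} Mbps per browser`);
console.log(`~${Math.round(impliedConcurrentUploaders)} browsers uploading at once`);
```

Under those assumptions, the two numbers imply somewhere around 267 browsers seeding concurrently.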

WebRTC doesn’t discriminate against metered or cellular network connections. I configured the ad network to only target desktop devices when serving this ad, but there is nothing stopping a malicious actor from using hundreds of gigabytes of network data from your cell phone over an LTE connection and racking up a $10,000 phone bill in the process.

Statistics

An ad network turns out to be a wildly successful method of distribution for browser-based botnet code. Together, the ads that I ran executed custom JavaScript code in web browsers with 11,021 unique user agents from 271,464 IP addresses. They were served from 99,690 unique web pages hosted on 17,112 websites.


Looking Forward

I had my fun. Launching a series of research botnets in unsuspecting users’ browsers was pretty close to an all-time high (all in the name of science of course). I never had any intention of abusing strangers on the web or profiting from these endeavors in any way. My ads were limited in scope and duration and I did not expose the IP addresses or other identifiable information of any of the *victims* of the experiments. I set out to answer a few unsettling questions about the state of the web, the browser, and Internet advertising in an attempt to publish my findings in the open and encourage public discourse about browser-based botnets. What I found was honestly horrifying, and I didn’t even tread into some of the deeper waters of modern web technologies.

2017 brought support for WebAssembly in all major browsers and the opportunity for near native speeds of compiled bytecode running in a multi-threaded(-ish) environment with Web Workers. WebGL and the capability of general purpose GPU computing (GPGPU) with OpenGL shaders, GPU.js, and Deeplearn.js offer hardware-accelerated parallel programming in the browser, ripe for the exploitation of unsuspecting users’ tabs.

Recent hubbub about the Meltdown and Spectre CPU vulnerabilities and their ability to be exploited via JavaScript is haunting given the success of iframe Internet advertisements as a means of distribution for malicious JavaScript code. Other reports of advertisements using browser form auto-fill features to steal username, password, and credit card information from unsuspecting users scare the pants off of me given what I now know about the scale and reach of these ad networks.

Block ads with uBlock Origin or Adblock Plus

There is no doubt more research to be done to better understand the threat we may already be facing in our web browsers and will continue to face in the future. The techniques that I’ve demonstrated in this post are less of an exploit and more a feature of how the web inherently works. As a result, the steps that can be taken to defend yourself against the type of abuse I’m proposing are somewhat limited. My first suggestion is please, please, please BLOCK ADS. If you’ve somehow made it all the way to 2018 without using an ad blocker, 1) wtf… and 2) start today. In all seriousness, I don’t mean to be patronizing. An ad blocker is a necessary tool to preserve your privacy and security on the web and there is no shame in using one. Advertising networks have overstepped their bounds and it’s time to show them that we won’t stand for it.

Blocking ads defends you from the distribution mechanism that we discussed in this post, but you are still vulnerable to code that is hosted by CPU-greedy websites themselves, like The Pirate Bay. The best suggestion that I have for defending against these threats at the moment is to diligently monitor your computer’s CPU usage as you browse, responding to CPU spikes and irregularities as you deem fit. It’s a good habit to keep your system monitor open during regular computer operation so that you can observe the CPU and network usage of your machine at an application level.
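
If you would rather script this than keep a system monitor open, here is a minimal Node.js sketch that samples system-wide CPU usage from the cumulative counters in os.cpus() and flags sustained spikes. It only reports whole-machine usage, not per-application usage, so it complements rather than replaces a proper system monitor. The function names, the 2-second interval, and the 80% threshold are arbitrary choices for illustration.

```javascript
const os = require("node:os");

// Copy the cumulative per-core CPU time counters (in milliseconds).
function snapshot() {
  return os.cpus().map((cpu) => ({ ...cpu.times }));
}

// Fraction of time (0 to 1) each core spent busy between two snapshots.
function busyFractions(before, after) {
  return after.map((times, i) => {
    const prev = before[i];
    const idle = times.idle - prev.idle;
    const total = Object.keys(times).reduce((sum, key) => sum + (times[key] - prev[key]), 0);
    return total > 0 ? 1 - idle / total : 0;
  });
}

// Log average CPU usage every `intervalMs` and flag readings at or above
// `threshold`. Returns the timer so the caller can clearInterval() it.
function monitor(intervalMs = 2000, threshold = 0.8) {
  let previous = snapshot();
  return setInterval(() => {
    const current = snapshot();
    const usage = busyFractions(previous, current);
    previous = current;
    const mean = usage.reduce((a, b) => a + b, 0) / (usage.length || 1);
    const flag = mean >= threshold ? "  <-- spike" : "";
    console.log(`CPU ${(mean * 100).toFixed(0)}%${flag}`);
  }, intervalMs);
}

// Demonstration with synthetic counters: 750 ms busy out of 1,000 ms.
const demoBefore = [{ user: 0, nice: 0, sys: 0, idle: 0, irq: 0 }];
const demoAfter = [{ user: 700, nice: 0, sys: 50, idle: 250, irq: 0 }];
console.log(busyFractions(demoBefore, demoAfter));
```

Calling monitor() starts the periodic logging; open and close suspicious tabs while it runs and watch for readings that rise and fall with them.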

Industry Abuse

In closing, I’ll leave you with a hypothetical situation: an attempt to loosely answer a question posed at the beginning of the post. What would happen if major websites borrowed CPU cycles from their users while they browsed their sites, much like I did with advertising bots? How much free compute might they be able to extract?

Alexa top three website statistics

Google, YouTube, and Facebook are the top three most visited websites on the Internet according to 2016 Alexa rankings. Google.com (the search page itself, not all of the products offered by the company) receives 1.5 billion visitors a day with an average of 8 minutes per visit, or 22,831 years of “browser time” daily. Given the statistics I collected from ~30,000 samples in one of my advertisements, let’s assume each device has ~3.5 CPU cores. That makes Google’s estimated free daily compute resources equivalent to one CPU running 24/7 for 79,908 years. People would pitch a fit if Google.com greedily used 100% of their CPU resources, but would they notice if it used a mere 10%? Doing so would still yield nearly 8,000 years of compute each day. And remember, that’s not the power of Google’s server infrastructure, but rather a loose estimation of the amount of compute they could exploit from their users’ devices entirely for free by virtue of their site’s popularity. Minus, of course, the astronomical legal fees that could come with actually doing it when the public found out about it.
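
The estimate above is simple enough to reproduce in a few lines. This sketch uses the same inputs (1.5 billion daily visits, 8 minutes per visit, ~3.5 cores per device) and a 365-day year; the function and variable names are illustrative.

```javascript
const MINUTES_PER_YEAR = 60 * 24 * 365;

// Total time visitors spend on a site each day, expressed in years.
function browserYearsPerDay(visitsPerDay, minutesPerVisit) {
  return (visitsPerDay * minutesPerVisit) / MINUTES_PER_YEAR;
}

const browserYears = browserYearsPerDay(1.5e9, 8);
const cpuYears = browserYears * 3.5; // every core at 100%
const tenPercentYears = cpuYears * 0.1; // every core at a "mere" 10%

console.log(Math.floor(browserYears)); // 22831
console.log(Math.floor(cpuYears)); // 79908
console.log(Math.floor(tenPercentYears)); // 7990
```

The three values match the 22,831, 79,908, and "nearly 8,000" years quoted above.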

Estimation of free compute that Google could hypothetically harvest from its users’ devices

If you are interested in learning more about this research, a recording of the Radical Networks talk is available to watch on YouTube. A copy of the slides is also available as a PDF on my website. You are welcome to use any resources from this post, the recording of the talk, or the slides in your own work (CC BY-SA).

The information contained in this post is to be used for research and education purposes only. I do not condone its use for illegal purposes. Don’t be a dick.

❤ Brannon

https://brannon.online | https://github.com/brannondorsey | https://twitter.com/brannondorsey