Don’t Scan My Website I: Exploiting an Old Version of Wappalyzer

Published in The Startup, Dec 1, 2020

Disclaimer: I discovered this vulnerability in February 2020, and it was fixed in May 2020 (version 5.10.2 and the new 6.x branch) when the web driver was changed from Zombie.js to Puppeteer.
Initial research was done as part of my work at Dreamlab Technologies.

At work I had to vet several software detection solutions, one of which was Wappalyzer. Given my previous research on scraping software being pwned by malicious websites [1] [2], and given that Wappalyzer is a tool that analyzes third-party websites, the natural question was: could a malicious website pwn me if I ran Wappalyzer against it?

After a few hours of trial and error, I tested the following hypothesis:

What happens if the src attribute of an iframe points to a local file? Would I be able to read the content of that iframe using JavaScript?

For that purpose, I created a web page that dynamically points the iframe source to a local file. If it succeeds, the file contents are inserted into the document:

I made it available at http://localhost:8080. When I visit that page with a real web browser, the iframe doesn’t load and the console displays the following error:

More information about this security measure can be found here. However, what happens when Wappalyzer visits that page?

In this article I’m using version 5.9.34 because it’s the last version of the 5.9 branch available on npm (I installed it using npm install wappalyzer@v5.9.34). For this test, I patched my Wappalyzer installation to display the page content to which Wappalyzer applies its heuristics. Let’s try running Wappalyzer against my malicious website:

The exploit works! In this post we’ll first walk through the full exploitation of this vulnerability and then delve into the technical details of why it happens.

Straight to the exploitation

The proof of concept works and inserts the local file contents into the document body. We can execute JavaScript code, which gives us a lot of freedom: for example, we can create AJAX requests and fetch external resources. Can we fetch any kind of resource? No, only script and (i?)frame resources, but that’s enough (this is explained further in the Technical details sections).

After a bit of testing, the scenario seems unrestricted:

We can add as many iframes as we want, meaning that we can read a lot of files.
Iframes are loaded recursively: iframes inside an iframe will be loaded too.

The second case is interesting and reminds me of the Exploiting the scraper post. The most valuable files on a victim’s machine are usually in the user’s $HOME directory. With the file:// protocol handler we can’t reference relative paths, so we need to know the local user name to build the full path to files in $HOME. Can we do that?

Yes! With the help of Bottle I can build my malicious server. The flow is as follows:

1. Wappalyzer requests my malicious server at http://malicious-server/
2. My malicious server returns the following response:

As shown there, line 9 encodes the contents of /etc/passwd in base64 and line 10 exfiltrates them to my malicious server.

3. Wappalyzer renders this page, executes the JavaScript code, sends the request to http://malicious-server/exfil1 and waits for its response to render it.

4. On my malicious server I receive the exfiltrated data, decode it and read the list of users. I discard common system users and get the name of the local user (in this example it’s existent_user). Wappalyzer is waiting for a response, which in this case will be:

It’s the same logic, this time exfiltrating the user’s private SSH key file to another endpoint.

5. Wappalyzer does the same as in step 3, this time requesting the http://malicious-server/exfil2 endpoint.

6. On my malicious server, I get the exfiltrated file and return an empty HTML page, which means that there’s nothing more to show.

7. Wappalyzer gets it and finishes the rendering process, proceeding to start the analysis logic.
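Step 4 above relies on picking the local user out of the exfiltrated /etc/passwd. The following is a minimal sketch of that filtering, not the code from the actual exploit: the function name `local_users`, the UID threshold of 1000 and the list of non-login shells are all assumptions chosen for illustration, based on common Linux defaults.

```python
import base64

# Assumption: accounts below this UID are system accounts (common Linux default).
MIN_HUMAN_UID = 1000
# Assumption: these shells mark accounts that cannot log in interactively.
NON_LOGIN_SHELLS = {"/usr/sbin/nologin", "/sbin/nologin", "/bin/false"}


def local_users(passwd_b64: str) -> list[str]:
    """Return likely human account names from a base64-encoded passwd file."""
    text = base64.b64decode(passwd_b64).decode("utf-8", errors="replace")
    users = []
    for line in text.splitlines():
        fields = line.split(":")
        # A passwd entry has exactly seven colon-separated fields.
        if line.startswith("#") or len(fields) != 7:
            continue
        name, _, uid, _, _, _, shell = fields
        if not uid.isdigit():
            continue
        if int(uid) >= MIN_HUMAN_UID and shell not in NON_LOGIN_SHELLS:
            users.append(name)
    return users


if __name__ == "__main__":
    sample = (
        "root:x:0:0:root:/root:/bin/bash\n"
        "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
        "existent_user:x:1000:1000::/home/existent_user:/bin/bash\n"
        "nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin\n"
    )
    print(local_users(base64.b64encode(sample.encode()).decode()))
```

On systems that use a different UID range for regular accounts, the threshold would need adjusting.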

The full code of the exploit is available here. I’ve created a video where I target the file ~/secret_file instead of the private SSH key. I’m referencing the server at localhost, but I’ve tested it and it works for remote servers as well.

Takeaways

In terms of exploitation, I’ve only shown two steps, but the chain could be extended to as many as you want, fetching more files from the victim’s $HOME or file system. Using the same premise (iframe src), it’s also possible to turn it into a Client-Side Request Forgery that queries hosts and services reachable by the victim and reads their responses.

In terms of recommendations, always run your security tools in either a virtual machine or a container. As for Wappalyzer, use version >= 6.x.

Below is an explanation of the vulnerability’s root cause and its notification timeline.
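As a quick way to act on that recommendation, the check below is a small sketch that compares an installed Wappalyzer version string against 5.10.2, the first release mentioned in this post as using Puppeteer. The helper name `is_patched` is hypothetical, and it assumes a plain MAJOR.MINOR.PATCH string with an optional leading "v" (no pre-release suffixes).

```python
# First version that no longer uses Zombie.js, according to this post.
FIRST_FIXED = (5, 10, 2)


def is_patched(version: str) -> bool:
    """True if the given Wappalyzer version is at or above 5.10.2."""
    # Assumption: the version looks like "5.9.34" or "v5.9.34".
    parts = tuple(int(p) for p in version.lstrip("v").split(".")[:3])
    return parts >= FIRST_FIXED


if __name__ == "__main__":
    for v in ("5.9.34", "5.10.2", "6.0.0"):
        print(v, "patched" if is_patched(v) else "vulnerable")
```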

Technical details: the root cause

Prior to version 5.10.2, Wappalyzer used Zombie.js as its headless browser to render target websites. However, Zombie.js is not a real web browser; under the hood it uses JSDom to provide JavaScript capabilities.

The JSDom documentation mentions a setting called runScripts that, when set to the value "dangerously", enables the execution of scripts from the target website. Developers are warned to use this setting and value only with trusted content.

Coming back to Zombie.js, let’s see how it uses JSDom. In src/document.js, it sets the behavior for dealing with scripts and remote resources:

In src/index.js, we can see that the features enabled by default are:

So, by default, Zombie.js enables JSDom’s dangerous setting and will load external scripts and iframes. Wappalyzer, by using Zombie.js, inherits this behavior, and that’s why the exploitation worked.

Technical details: the issue

In my view, two points make this possible:

1. The dangerous setting in the JSDom invocation.
2. No validation of resource loading across different protocols and origins (in our test, a page served by an external HTTP server loaded a local file through the file:// protocol).

We contacted the JSDom team about these two points and they replied:

This is not a security vulnerability, as they have explicitly disabled security by setting runScripts: "dangerously".

By security, they mean any kind of security measure. I don’t agree with that: JSDom performs, for example, CORS preflight checks and other browser behaviors that are not affected by the runScripts value. The same should apply to resource loading from HTML tags.

Digging a little deeper into point 2, I created the following proof of concept that does not set runScripts to "dangerously":
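To make the missing validation concrete, here is a minimal sketch of the kind of check a resource loader could apply before fetching a script or (i)frame. It is not JSDom’s or Zombie.js’s API: the function `may_load_subresource` and the `LOCAL_ONLY_SCHEMES` set are hypothetical names, and the policy only covers the protocol boundary discussed in this post (a remote page reaching into file://), not a full same-origin policy.

```python
from urllib.parse import urlsplit

# Assumption: schemes that a remotely served page should never be able to load.
LOCAL_ONLY_SCHEMES = {"file"}


def may_load_subresource(page_url: str, resource_url: str) -> bool:
    """Decide whether a page may load a sub-resource (script, frame, iframe)."""
    page = urlsplit(page_url)
    resource = urlsplit(resource_url)
    if resource.scheme in LOCAL_ONLY_SCHEMES:
        # Only a page that is itself local may reference local files.
        return page.scheme in LOCAL_ONLY_SCHEMES
    # Relative URLs (empty scheme) and network schemes are left to other checks.
    return True


if __name__ == "__main__":
    print(may_load_subresource("http://localhost:8080/", "file:///etc/passwd"))
    print(may_load_subresource("file:///tmp/index.html", "file:///tmp/loadit"))
```

With a check like this in the loader, the iframe in the proof of concept would be refused regardless of the runScripts value.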

The file /tmp/loadit doesn’t exist. Running the proof of concept using node displays:

Even without runScripts, it tries to load the file from the file system. However, without JavaScript being interpreted there’s no way to exfiltrate the content (at the moment).

Vulnerability timeline

February 2020: I discovered it
Mid-May 2020: Shared with Dreamlab Research Team
Late-May 2020: Vulnerability was fixed by changing web driver
June 2020: Notification to JSDom developers
December 2020: Public disclosure

Thanks to Sheila for both reviewing the initial advisory and managing the communication with JSDom developers and Conrad for proofreading this post.