Let’s write a (theoretical) Java Library Worm

Jun 10, 2019

This article is an addendum to “Want to take over the Java ecosystem? All you need is a MITM!”

A natural question might be, “Why should I care if one of my dependencies’ builds could have been compromised by a MITM? Their build isn’t running in my production environment! I’ve got nothing to worry about, right?” What this fails to recognize is that any malicious code executing in the same environment where a library is being produced can be used to compromise that library.

Assuming that a malicious actor can compromise a JAR file in flight to a build server that is used to build and publish a library, how could we use this to compromise the end user? To answer this question, let’s write a hopefully theoretical worm.

How do Java Libraries Get Built?

To understand how our worm will operate, we need a basic understanding of how a Java project’s artifacts normally get produced. Let’s take a simplified look at the series of steps that Maven/Gradle go through to produce their artifacts (i.e. the released files that are published).

The build tool (Maven or Gradle) is launched by Continuous Integration (CI) (e.g. Travis CI, Build Ship, Jenkins, TeamCity, etc.).
The build tool downloads the user-defined dependencies for plugins that are used to modify & expand the basic functionality of the tool. Plugins may be anything from source code formatters to additional compilers (for example for compiling Groovy or Kotlin). These dependencies are downloaded using the user-defined repositories (which may be over HTTP or HTTPS).
These plugins are all JAR files that execute on the JVM as a part of the build process. The build tool classloads the class files inside of the plugins and then executes the main entry point method for the plugin.
If your plugin JARs were downloaded over HTTP, you might now have malicious code executing in your build.
The build tool now looks at the dependencies for the source code that is being compiled. These dependencies are downloaded using a different set of user-defined repositories (which may be over HTTP or HTTPS).
Now that the plugins are loaded and the compile-time dependencies are downloaded, the build begins compiling the source code for the library, creating Java .class files. The same is done for any source code that is used for unit or integration testing the library.
Although some development pipelines skip this step, most run the library’s unit & integration tests at this point. This is done prior to a release as one final check that no last-minute bugs were introduced into the source code before it is published. The build tool does this by loading all the compiled source code and any of the dependent libraries into the JVM and instructing the test framework to begin executing the tests. If, at this point, any of the dependencies contain malicious code, your build is executing malicious code.
Assuming the tests in the previous step have all passed, the build begins assembling the compiled source code (the .class files) into a JAR file (generically called an “artifact”). You can think of a JAR file as basically a ZIP file with a different extension.
Assuming all the previous steps have succeeded, the build tool now uploads the artifacts to an artifact server where other users can then download and use the library in their own code.
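
To make the HTTP-versus-HTTPS distinction in the steps above concrete, here is a minimal sketch of how a project might scan a build file for repository URLs declared over plain HTTP. The class name InsecureRepositoryScan, the method findInsecureUrls, the regular expression, and the repo.example.org host are all assumptions made for illustration. None of this is part of Maven or Gradle, and the pattern will miss some declaration styles and flag some harmless ones (for example a project homepage <url> in a POM).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: flags repository-style URLs that use plain HTTP. */
class InsecureRepositoryScan {

    // Matches either a Maven-style <url>http://...</url> element or a
    // Gradle-style url "http://...", url = uri("http://...") or url("http://...").
    // This pattern is an illustrative assumption, not an exhaustive grammar.
    private static final Pattern HTTP_REPO_URL = Pattern.compile(
            "(?:<url>\\s*|\\burl\\s*[\\s=(]\\s*(?:uri\\s*\\(\\s*)?[\"'])(http://[^\\s\"'<]+)");

    /** Returns every plain-HTTP URL found in the given build file text. */
    static List<String> findInsecureUrls(String buildFileText) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = HTTP_REPO_URL.matcher(buildFileText);
        while (matcher.find()) {
            urls.add(matcher.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up POM fragment: one HTTP repository, one HTTPS repository.
        String pom =
                "<repository><url>http://repo.example.org/maven2</url></repository>"
              + "<repository><url>https://repo.maven.apache.org/maven2</url></repository>";
        // Only the plain-HTTP entry is reported.
        System.out.println(findInsecureUrls(pom));
    }
}
```

A check like this only looks at the text of one build file. Repositories can also be declared in parent POMs, settings files, or plugins, so a clean result here does not prove that every download happens over HTTPS.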

Writing Our Worm

Let’s write a theoretical, generic, MITM-delivered, JAR-based worm together, shall we?

Features

Have it inspect the environment it’s in and only infect artifacts when it’s executing in a CI environment (indicating a release is about to be created). This is easy enough to do as most CI environments have a bunch of environment variables that can be used to fingerprint the context in which the code is executing.
Detect when running in a production environment and begin its nefarious activities. This can easily be done by checking for directories that are only present on developer and build machines (e.g. ~/.gradle or ~/.m2); if they are missing, the code is most likely running in production.
Only ever launch a single instance of itself per system. This is easy enough to achieve.
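
To show how little effort the environment fingerprinting in the first feature takes, here is a minimal sketch that guesses whether a process is running under CI by looking at environment variables. The class name CiFingerprint and the method looksLikeCi are hypothetical. The variable names are ones commonly set by Travis CI (CI, TRAVIS), Jenkins (JENKINS_URL), and TeamCity (TEAMCITY_VERSION), but treat the list as an assumption rather than a complete inventory. Build tools and test frameworks use the same kind of check for perfectly legitimate reasons, which is part of why it would not stand out.

```java
import java.util.List;
import java.util.Map;

/** Illustrative only: guesses whether the current process is running under CI. */
class CiFingerprint {

    // Environment variables commonly set by CI systems. This list is an
    // assumption for illustration; it is neither complete nor authoritative.
    private static final List<String> CI_MARKERS =
            List.of("CI", "TRAVIS", "JENKINS_URL", "TEAMCITY_VERSION");

    /** Returns true if any of the marker variables is present and non-empty. */
    static boolean looksLikeCi(Map<String, String> env) {
        for (String marker : CI_MARKERS) {
            String value = env.get(marker);
            if (value != null && !value.isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Pass the real environment; the result depends on where this runs.
        System.out.println("Looks like CI: " + looksLikeCi(System.getenv()));
    }
}
```

Taking the environment as a Map parameter, instead of calling System.getenv() inside the method, keeps the check easy to exercise with made-up values.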

Our worm will be loaded onto the build server via a MITM of a JAR file being downloaded over HTTP. Once our malicious code is executed, either by being class loaded by the build tool’s plugin infrastructure or by the test runner, our worm gets to work. It begins injecting malicious bytecode into all the .class files that it can find on the system, thus reproducing. When the build tool is done executing the tests, it happily bundles up the now maliciously compromised .class files into the JAR that is published as the official release.
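
The window described above, between compilation and packaging, can be made visible from the defender’s side. The following is a minimal sketch, assuming a build could record SHA-256 digests of its compiled .class files right after compilation and compare them again just before packaging. The class name ClassFileSnapshot and its methods are hypothetical and not part of Maven or Gradle. Because the comparison runs in the same environment as the potentially malicious code, it could itself be tampered with, so read it as an illustration of what changes on disk and not as a reliable defense.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

/** Hypothetical helper: records and compares digests of compiled class files. */
class ClassFileSnapshot {

    /** Maps each .class file under root (as a relative path) to its SHA-256 hex digest. */
    static Map<String, String> snapshot(Path root)
            throws IOException, NoSuchAlgorithmException {
        Map<String, String> digests = new HashMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path path : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(path) && path.toString().endsWith(".class")) {
                    MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
                    byte[] hash = sha256.digest(Files.readAllBytes(path));
                    StringBuilder hex = new StringBuilder();
                    for (byte b : hash) {
                        hex.append(String.format("%02x", b));
                    }
                    digests.put(root.relativize(path).toString(), hex.toString());
                }
            }
        }
        return digests;
    }

    /** Returns the paths that were added, removed, or modified between two snapshots. */
    static Set<String> changed(Map<String, String> before, Map<String, String> after) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, String> entry : before.entrySet()) {
            // Covers both modified files and files missing from the later snapshot.
            if (!entry.getValue().equals(after.get(entry.getKey()))) {
                result.add(entry.getKey());
            }
        }
        for (String path : after.keySet()) {
            if (!before.containsKey(path)) {
                result.add(path); // file appeared after the first snapshot
            }
        }
        return result;
    }
}
```

In this sketch, a build would call snapshot on its class output directory once after compilation and once after the tests finish, then treat any non-empty result from changed as a reason to stop before publishing.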

Once the worm-infected library arrives in the production environment, the system that the code is executing on is pwned.

Has this actually happened in the wild?

There is no evidence that this has actually occurred. That being said, several of the projects that did audit their previous releases came back with an ‘inconclusive’ result. Others simply chose to file a precautionary CVE and didn’t even attempt an audit. As such, it’s impossible to definitively say that an attack of this nature hasn’t occurred.

Is this vulnerability exclusive to the JVM Ecosystem?

Absolutely not. This sort of attack could be pulled off with the same sort of mechanism against any language that has a standard practice of executing tests before a release. I merely chose to explore the JVM ecosystem because I primarily work in this space.