Discovering A 16 Million Download/Week Node.js Package Zero Day for a Capture-The-Flag Challenge

Eugene Lim
Dec 23, 2020 · 7 min read

Background

GovTech’s Cyber Security Group recently organised the STACK the Flags Cybersecurity Capture-the-Flag (CTF) competition from 4th to 6th December 2020. For the web domain, my team wanted to build challenges that addressed real-world issues we have encountered during penetration testing of government web applications and commercial off-the-shelf products.

From my experience, a significant number of vulnerabilities arise from developers’ lack of familiarity with third-party libraries that they use in their code. If these libraries are compromised by malicious actors or applied in an insecure manner, developers can unknowingly introduce devastating weaknesses in their applications. The SolarWinds supply chain attack is a prime example of this.

Node.js is one of the most popular platforms among web developers, and its ecosystem has had its fair share of issues with third-party libraries. The Node package manager, better known as npm, serves more than one hundred billion packages per month and hosts close to one-and-a-half million packages. Part of what makes package managers so huge is the tree-like dependency structure. Every time you install a package in your project, you also install that package’s dependencies, and their dependencies, and so on, sometimes ending up with dozens of packages!

npm’s recent statistics

If a single dependency in this chain is compromised or vulnerable, it can lead to cascading effects on the entire ecosystem. In 2018, a widely-used npm package was taken over by a malicious author who added bitcoin-stealing code targeting the Copay bitcoin wallet. Even though the attacker had a single target in mind, the popular package was downloaded nearly 8 million times in 2.5 months before the malicious code was discovered. In 2019, I presented a tool at Black Hat Asia that sought to identify malicious packages, but it was clear that npm needed to resolve this systematically. Thankfully, the npm ecosystem has improved significantly since then, including the release of new security features and more active monitoring.

Hunting NPM Package Vulnerabilities

With this context in mind, I set out to design a challenge that used a vulnerable npm package. Additionally, I wanted to exploit a prototype pollution vulnerability. To put it simply, prototype pollution involves overwriting the properties of JavaScript objects in an application by polluting the objects’ prototypes. For example, if I overwrote the property that produces an object’s string representation and then printed that object, it would output my overwritten value instead of the actual string representation of that object. This can lead to critical issues depending on the application: imagine what would happen if I overwrote a security-critical property of an object so that it always held a value of my choosing! Nevertheless, as the impact of prototype pollution remains dependent on the application context, few know how to properly exploit it.
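To make the idea concrete, here is a minimal sketch of prototype pollution in plain JavaScript. The property name `isAdmin` and the variable names are hypothetical, chosen only for illustration; the write through `__proto__` stands in for whatever attacker-controlled assignment a vulnerable library performs.

```javascript
// Minimal sketch of prototype pollution. `isAdmin` is a hypothetical
// property name used only for illustration.
const victim = {};                      // an ordinary, empty object
const before = victim.isAdmin;          // no own or inherited property yet

// A write that reaches Object.prototype through any object's __proto__ ...
const attackerControlled = {};
attackerControlled['__proto__']['isAdmin'] = true;

// ... becomes visible on every plain object, including unrelated ones.
const after = victim.isAdmin;
const ownKeys = Object.keys(victim);    // the victim itself is still empty

console.log(before, after, ownKeys);    // prints: undefined true []

// Clean up so the rest of the process is not affected by the demo.
delete Object.prototype.isAdmin;
```

Note that the victim object gains no own properties; the value is inherited, which is why such bugs are easy to miss when inspecting the object directly.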

Next, I applied two tactics to find npm packages that were vulnerable to prototype pollution: pattern matching and functionality grouping.

Pattern Matching

When vulnerable code is written, it often falls into recognisable patterns that can be captured by static scanners. This forms the basis of many tools such as GitHub’s CodeQL, which scans open source codebases for unsafe code patterns. While scanners are used defensively to discover vulnerabilities ahead of time, attackers can also perform their own pattern matching to discover unreported vulnerabilities in open source code.

My tool of choice was grep.app, a speedy regex search engine that trawls over half a million public repositories on GitHub. Since most npm packages host their code on GitHub, I felt confident that it would uncover at least a few vulnerable packages. The next step was to identify a useful regex pattern. I looked up previously-disclosed prototype pollution vulnerabilities in npm packages and found a January 2020 Snyk advisory for the dot-prop package. Next, I checked the GitHub commit that patched the vulnerability.

code diff

dot-prop patched the prototype pollution vulnerability by blacklisting the following keys:

const disallowedKeys = [
    '__proto__',
    'prototype',
    'constructor'
];

Here, there was no obvious code pattern that was inherently vulnerable; it was the lack of a blacklist that made it vulnerable. I decided to zoom out a little and focus on what dot-prop did that required a blacklist in the first place. According to the package description, dot-prop is a package to get, set, or delete a property from a nested object using a dot path.

For example, I could set a property like so:

// Setter
const dotProp = require('dot-prop');

const object = {foo: {bar: 'a'}};
dotProp.set(object, 'foo.bar', 'b');
console.log(object); // => {foo: {bar: 'b'}}

However, the following proof-of-concept would trigger a prototype pollution using dot-prop’s set function:

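// Only reproduces on a dot-prop version that predates the patch
const dotProp = require('dot-prop');

const object = {};
console.log("Before " + object.b); // => Before undefined
dotProp.set(object, '__proto__.b', true);
console.log("After " + {}.b); // => After true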

This worked because the function of dot-prop was to parse a dotted path string as keys in an object and set the values of those keys. Based on what we know about prototype pollution, this is inherently dangerous unless certain keys are blacklisted.
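The sketch below illustrates why a dotted path setter is dangerous by default and how a key blacklist closes the hole. It is a simplified illustration, not dot-prop’s actual source: `setPath`, its `guard` option, and the `pollutedBy...` property names are all hypothetical.

```javascript
// Simplified dotted-path setter; NOT dot-prop's real implementation.
// `setPath`, `guard` and the property names below are hypothetical.
const DISALLOWED_KEYS = new Set(['__proto__', 'prototype', 'constructor']);

function setPath(target, path, value, { guard }) {
  const keys = path.split('.');
  // With the guard on, refuse any path that touches a dangerous key.
  if (guard && keys.some((key) => DISALLOWED_KEYS.has(key))) {
    return target;
  }
  let current = target;
  for (let i = 0; i < keys.length - 1; i++) {
    const key = keys[i];
    // Create intermediate objects as needed, then walk into them.
    if (typeof current[key] !== 'object' || current[key] === null) {
      current[key] = {};
    }
    current = current[key];
  }
  current[keys[keys.length - 1]] = value;
  return target;
}

// Ordinary use works the same with or without the guard.
const normal = setPath({ foo: { bar: 'a' } }, 'foo.bar', 'b', { guard: true });

// Without the guard, walking into '__proto__' lands on Object.prototype.
setPath({}, '__proto__.pollutedByUnguarded', true, { guard: false });
const unguardedLeak = ({}).pollutedByUnguarded;
delete Object.prototype.pollutedByUnguarded; // clean up after the demo

// With the guard, the same path is rejected and nothing leaks.
setPath({}, '__proto__.pollutedByGuarded', true, { guard: true });
const guardedLeak = ({}).pollutedByGuarded;

console.log(normal.foo.bar, unguardedLeak, guardedLeak); // prints: b true undefined
```

Rejecting the whole path is only one possible design; a library could equally skip the offending segment or throw, depending on how strict it wants to be.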

After considering this, I decided to search for patterns that matched other dotted path parsers. I started from the call that dot-prop used to split up dotted paths, although I later discovered that a second variant was commonly used by other dotted path parsers as well. With this approach, I discovered several vulnerable packages, but this required me to manually inspect each package’s code to verify if a blacklist was used. Additionally, not all dotted path parsers used the same variable names to denote the dotted path string, so I probably missed many more.
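As a rough illustration of this kind of pattern matching, the sketch below flags source snippets that split a variable on a dot and records whether the same snippet mentions `__proto__` at all. The regex, the `triage` helper, and the sample snippets are assumptions made for this example, not the exact query used on grep.app, and a hit still needs manual review.

```javascript
// Heuristic triage of source text for dotted-path parsing.
// The regex and sample snippets are illustrative assumptions only.
const SPLIT_ON_DOT = /\b([A-Za-z_$][\w$]*)\.split\(\s*(['"])\.\2\s*\)/;
const GUARD_HINT = /__proto__/;

function triage(source) {
  const match = SPLIT_ON_DOT.exec(source);
  if (!match) {
    return { candidate: false };
  }
  return {
    candidate: true,
    variable: match[1],                     // name of the variable being split
    hasGuardHint: GUARD_HINT.test(source),  // crude sign that a blacklist exists
  };
}

// Hypothetical snippets standing in for files returned by a code search.
const samples = {
  unguarded: "const parts = path.split('.'); obj[parts[0]] = value;",
  guarded: "const parts = key.split(\".\"); if (parts.includes('__proto__')) return;",
  unrelated: "const words = text.split(' ');",
};

const results = {};
for (const [name, source] of Object.entries(samples)) {
  results[name] = triage(source);
}
console.log(results);
```

Because the pattern does not depend on a particular variable name, it catches more parsers than a search for one fixed identifier, at the cost of more false positives to inspect by hand.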

grep.app search with JavaScript filter

Functionality Grouping

I realised that a better approach would be to group npm packages based on their functionality — in the previous case, dotted path parsers. This is because such functionality is unsafe by default unless appropriate blacklists or safeguards are put in place. After looking through the dotted path parsers, I stumbled on a far more prolific group of packages — configuration file parsers.

Configuration files come in various formats such as YAML, JSON, and more. Out of these, TOML and INI are very similar and match this format:

[foo]
bar = "baz"

A typical INI parser would parse this file into the following object:

iniParser.parse(fs.readFileSync('./config.ini', 'utf-8')) // => { foo: { bar: 'baz' } }

However, unless the parser uses a blacklist, the following configuration file would lead to prototype pollution:

[__proto__]
polluted = "polluted"

const parsed = iniParser.parse(fs.readFileSync('./payload.ini', 'utf-8'));
console.log(parsed) // => { }
console.log(parsed.__proto__) // => { polluted: 'polluted' }
console.log({}.polluted) // => polluted
console.log(polluted) // => polluted

Indeed, prototype pollution vulnerabilities have been reported in such parsers previously, but only on an ad-hoc basis. I built my proof-of-concept code to quickly test packages at scale, then used npm’s search function to discover other parsers. The search function supports searching by tags, allowing me to quickly discover multiple vulnerable packages.
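A test harness along these lines could look like the sketch below. It is not the proof-of-concept code from the original research: `parseIni` is a deliberately naive, hypothetical parser written for this example, and `isPollutable` and the canary key are likewise assumptions. The harness feeds a payload to any parse function and checks whether a canary property appeared on `Object.prototype`.

```javascript
// Sketch of a pollution check for INI-style parsers. `parseIni`,
// `isPollutable` and the canary name are hypothetical, for illustration.
const CANARY = 'canary_7f3a';
const PAYLOAD = '[__proto__]\n' + CANARY + ' = "polluted"\n';
const DISALLOWED_SECTIONS = new Set(['__proto__', 'prototype', 'constructor']);

// A deliberately naive parser: sections become nested objects.
function parseIni(text, { safe }) {
  const result = {};
  let section = result;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (!line || line.startsWith(';')) continue;
    const header = /^\[(.+)\]$/.exec(line);
    if (header) {
      const name = header[1];
      if (safe && DISALLOWED_SECTIONS.has(name)) {
        section = {}; // discard everything in a dangerous section
        continue;
      }
      if (typeof result[name] !== 'object' || result[name] === null) {
        result[name] = {};
      }
      section = result[name]; // for '__proto__' this is Object.prototype
      continue;
    }
    const pair = /^([^=]+)=(.*)$/.exec(line);
    if (pair) {
      section[pair[1].trim()] = pair[2].trim().replace(/^"(.*)"$/, '$1');
    }
  }
  return result;
}

// Run any parse function against the payload and report pollution.
function isPollutable(parse) {
  try {
    parse(PAYLOAD);
    return Object.prototype.hasOwnProperty.call(Object.prototype, CANARY);
  } finally {
    delete Object.prototype[CANARY]; // always restore a clean prototype
  }
}

const naiveResult = isPollutable((text) => parseIni(text, { safe: false }));
const safeResult = isPollutable((text) => parseIni(text, { safe: true }));
console.log(naiveResult, safeResult); // prints: true false
```

To test a real package, one would pass that package’s own parse function to `isPollutable` instead of the toy parser; using a unique canary key keeps the check from colliding with legitimate properties.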

One of these was ini, a simple INI parser with a staggering sixteen million downloads per week:

ini’s download statistics

This is because almost 2000 dependent packages use ini, including the npm CLI itself! Since npm comes packaged with each default Node.js installation, this means that every user of Node.js was downloading the vulnerable package as well. Other notable dependents include the Angular CLI and a wrapper package around a cryptography library. While these packages included ini as a dependency, their risk depended on how ini was used; if they did not call the vulnerable function, the vulnerability would not be triggered.

Packages that depend on ini

Although I did not use ini for the challenge, I made sure to responsibly disclose the list of vulnerable packages to npm.

Responsible Disclosure

npm supports a robust responsible disclosure process, including a currently-on-hold vulnerability disclosure program. The open source security company Snyk also provides a simple vulnerability disclosure form, which I used to coordinate the disclosures. Fortunately, the disclosure process for ini went smoothly, with the developer patching the vulnerability in two days.

  • December 6, 2020: Initial disclosure to Snyk
  • December 7, 2020: First response from Snyk
  • December 8, 2020: Disclosure to Developer
  • December 10, 2020: Patch issued
  • December 10, 2020: Disclosure published
  • December 11, 2020: CVE-2020-7788 assigned

Other packages are undergoing responsible disclosure or have already been disclosed.

The vulnerability-hunting process highlighted both the strengths and weaknesses of open source packages. Although open source packages written by third parties can be combed for vulnerabilities or compromised by malicious actors, developers can also quickly find, report, and patch the vulnerabilities. It remains the responsibility of organisations and developers to vet packages before using them. While not everyone can afford the resources needed to inspect the code directly, there are free tools such as Snyk Advisor that use metrics such as update frequency and contribution history to estimate a package’s health. Developers should also vet new versions of packages, especially if they were written by a different author or published at an unusual time.

In the long run, there are no easy answers to open source package security. Nevertheless, organisations can apply sensible measures to effectively secure their projects.

P.S. One of our participants, Yeo Quan Yang, posted an excellent write-up on the challenge that illustrated the intended solution to chain a prototype pollution in a package with a remote code execution gadget in a templating engine. Check it out here!

CSG @ GovTech

GovTech CSG — keeping the Singapore Government’s ICT and Smart Systems safe and secure

CSG — cyber lead for the Singapore Government sector — keeping the Singapore Government’s ICT and Smart Systems safe and secure. Our blog is all about the techniques and technologies in cybersecurity. We post fortnightly. Till then, stay cyber safe, and cyber ready!

Written by Eugene Lim

White Hat && DevSecOps | Awarded Most Valuable Hacker at h1-213 by HackerOne, Verizon Media, and US Air Force
