These software packages contain “presents” you don’t want to open
In the perpetual cat-and-mouse game between the people who build and use software ethically and the malicious hackers looking to exploit any weakness they can find in it, the good guys have made some progress. Modern software, thanks to more rigorous security testing, is more secure. Still a long way from perfect, but better.
That’s the good news. The bad news is that criminal hackers, as always, are endlessly adaptive. Which is why there is now an increasing number of headlines about attacks that don’t involve exploitable vulnerabilities in software made by the good guys.
The bad guys, instead of waiting around for the good guys to make mistakes, are creating their own software components loaded with malware and trying to trick software developers into using them. It’s a version of the so-called “social engineering” attacks that criminals use to try to trick the rest of us into clicking a malicious link in an email or text message.
The attack vector is called a malicious package — a software component designed to look legitimate. For example, as Ars Technica reported a few weeks ago, “An ongoing attack is uploading hundreds of malicious packages to the open source node package manager (NPM) repository in an attempt to infect the devices of developers who rely on code libraries there.”
Attackers focus on repositories (more commonly known as repos) like NPM because that’s where potential victims are — developers looking for components to help build their projects.
And that’s because modern software is much more assembled than written from scratch. While most projects have some proprietary code, on average more than 75% of a project’s code comes from elsewhere: commercial third-party components or open source code, which is usually free but comes with licensing requirements. Those components commonly rely on other components, such as libraries, to function. Those other components are called dependencies.
Developers flock to repos like NPM for good reason. Why spend time and money writing code from scratch if there’s an available component on a repo that does the same thing and is either free or much cheaper? Why reinvent multiple wheels every time you build a software product?
Repos as watering holes
But that also makes repos an attractive attack surface — a bit like a watering hole where predators look for prey. Those components, and the dependencies they rely on, become part of a software supply chain that can be corrupted. If developers don’t carefully vet before they download, they can end up importing one of those malicious packages that could put not just their own organization at risk, but also the customers who buy their infected products.
Mike McGuire, security solutions manager with the software security company Black Duck, said developers are waking up to that risk, but malicious packages still “represent what has historically been a rather significant blind spot until recently. It’s certainly an enticing avenue for attackers.”
“As usual, attackers exploit our inherent trust when it comes to implanting malware,” he said. “We trust that if a package comes from a reputable source, it’s safe. Or if a project is maintained by a reputable person, it’s safe.”
But not always. Just this past week, The Hacker News reported that the administrators of the Python Package Index (PyPI) repository had quarantined, and later removed, the package “aiocpa” after they discovered that a new update to it “included malicious code to exfiltrate private keys via Telegram.” By the time they discovered the problem, the corrupted package had already been downloaded more than 12,000 times.
That’s just one example of why malicious package attacks work. It’s not so much that developers are being careless as it is that they are being tricked in much the same way an unwitting person ends up falling for a malicious link in an email — the packages look legit.
McGuire explained in a recent webinar the three primary ways attackers try to disguise malicious packages.
Typosquatting is misspelling or modifying a popular package name by just one or two letters or characters. His example was crossenv vs. cross-env. “One of these is a perfectly good, popular package,” he said. “The other is a malicious package. I’m willing to bet most of my audience can’t tell which is malicious and which is not. And that is exactly what an attacker wants.”
“If I’m a developer and I know what cross env does, I want to use it, and I look for it on Google or have a package manager look for it, I have a 50/50 chance of getting the wrong one,” he said. “And if I do get the wrong package, it might still have the functionality I want but also have some sort of backdoor in it.”
“This is essentially social engineering,” he said. “If I’m an attacker, I have two packages that look like one another and I hope somebody picks the wrong one.”
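One way to picture a guard against look-alike names is a check that runs before anything gets installed. The sketch below is only an illustration, not a tool McGuire described: the KNOWN_GOOD allowlist, the check_name function and the 0.8 similarity cutoff are all assumptions made for the example. It flags a requested name that matches a vetted package once separators are stripped, or that sits within a character or two of one.

```python
import difflib
import re

# Hypothetical allowlist of packages a team has already vetted.
KNOWN_GOOD = {"cross-env", "express", "lodash", "react"}


def canonical(name: str) -> str:
    """Lowercase the name and drop separators so look-alike spellings collide."""
    return re.sub(r"[-_.\s]", "", name.lower())


def check_name(requested: str, known=KNOWN_GOOD, cutoff: float = 0.8):
    """Return (status, closest vetted name or None).

    status is "ok", "suspicious" or "unknown". The cutoff is an assumed
    similarity threshold, not a recommended value.
    """
    if requested in known:
        return ("ok", requested)
    # Same name once separators are removed: a classic typosquat pattern.
    for good in sorted(known):
        if canonical(good) == canonical(requested):
            return ("suspicious", good)
    # Otherwise fall back to fuzzy matching to catch small edits.
    close = difflib.get_close_matches(requested, sorted(known), n=1, cutoff=cutoff)
    if close:
        return ("suspicious", close[0])
    return ("unknown", None)


if __name__ == "__main__":
    for name in ("cross-env", "crossenv", "expresss", "left-pad"):
        print(name, check_name(name))
```

A name that comes back "unknown" is not necessarily safe. It simply is not on the vetted list yet, which is a prompt for a human to look at it before it goes into a build.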
Brandjacking seeks to exploit both human and package manager behavior. “An attacker will use the name of a popular, legitimate package in one ecosystem and create a malicious version with the same name in a different ecosystem,” McGuire said.
“For example, the threat actor could upload a malicious version of Python SIPPY [Systems Identification Package for Python] to a Rust repo and hope that a Python programmer new to the Rust ecosystem will download it. Programmers can also ask [pip], the Python package manager, to find SIPPY in the package repo or the ecosystem, and if there is one named that in the ecosystem, it will pull it down. That will be a malicious package, and it will be all downhill from there.”
Dependency confusion seeks to exploit the way developers and package managers work. “Developers use third-party software libraries in their projects, and rather than including the library in the package, they include links to the original library downloaded from a private repo,” McGuire said.
“But by default, the package manager might first look for the library on a public repo. And if a package with the exact same name exists on a public repo, that package is going to be downloaded instead of the legitimate package in the private repo.”
“From there, malicious code can be introduced to the application, compromising the security of the entire organization, its customers, and beyond,” he said. “This is taking advantage of the default behavior of package managers. Pretty clever.”
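The mechanics are easier to see in a toy model. The sketch below assumes a resolver that picks the highest version number across every configured index, which is one way the default behavior McGuire describes can play out; real package managers differ in the details. The package name acme-billing, the Candidate class and all three functions are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple  # e.g. (1, 4, 0)
    index: str      # "private" or "public"


def naive_resolve(name, candidates):
    """Pick the highest version across ALL indexes (the risky behavior)."""
    matches = [c for c in candidates if c.name == name]
    return max(matches, key=lambda c: c.version) if matches else None


def scoped_resolve(name, candidates, internal_names):
    """Resolve internal names from the private index only."""
    matches = [c for c in candidates if c.name == name]
    if name in internal_names:
        matches = [c for c in matches if c.index == "private"]
    return max(matches, key=lambda c: c.version) if matches else None


def confusable(candidates, internal_names):
    """List internal names that also exist on the public index."""
    public = {c.name for c in candidates if c.index == "public"}
    return sorted(set(internal_names) & public)


# Hypothetical data: an attacker publishes the internal name publicly
# with an inflated version number.
INTERNAL = {"acme-billing"}
CANDIDATES = [
    Candidate("acme-billing", (1, 4, 0), "private"),
    Candidate("acme-billing", (99, 0, 0), "public"),
    Candidate("requests", (2, 32, 0), "public"),
]

if __name__ == "__main__":
    print("naive :", naive_resolve("acme-billing", CANDIDATES).index)
    print("scoped:", scoped_resolve("acme-billing", CANDIDATES, INTERNAL).index)
    print("at risk:", confusable(CANDIDATES, INTERNAL))
```

In this model the naive resolver pulls the public copy because 99.0.0 beats 1.4.0, while the scoped resolver never considers it. The confusable check is the kind of audit a team could run against its own list of internal names.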
McGuire cited ethical hacker Alex Birsan, who wrote a post on Medium in 2021 about how he was able to use dependency confusion to hack multiple major tech companies including Apple, Microsoft, Yelp, and Tesla.
As Birsan put it, “When downloading and using a package from any of these sources, you are essentially trusting its publisher to run code on your machine. So can this blind trust be exploited by malicious actors? Of course it can.”
Birsan noted that he didn’t do this without notice. “Every single organization targeted during this research […] provided permission to have its security tested, either through public bug bounty programs or through private agreements,” he wrote.
But he wrote that his success rate using dependency confusion “was simply astonishing. From one-off mistakes made by developers on their own machines, to misconfigured internal or cloud-based build servers, to systemically vulnerable development pipelines, one thing was clear: squatting valid internal package names was a nearly sure-fire method to get into the networks of some of the biggest tech companies out there.” He added that he detected dependency confusion in 35 organizations, most with more than 1,000 employees.
No silver bullet, but …
So how can developers avoid building software products that include vulnerabilities they didn’t create?
“There’s no silver bullet for protecting against malicious packages,” McGuire said, but added that doesn’t mean there’s nothing developers can do. “It’s more of a strategy,” he said, and one that can be more effective with automated testing tools such as software composition analysis (SCA), which can detect known vulnerabilities and licensing requirements in open source software components. McGuire’s recommendations, along with the tools or practices that support them, include:
· Track open source packages that you use and ensure that the one you’re using is the one that you intended to include (SCA)
· Evaluate the source/provenance of each package (SCA)
· Protect the build pipelines against tampering (best practices and tools)
· Evaluate release artifacts before shipping them (binary analysis/SCA)
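The first and last recommendations both come down to confirming that the bytes you received are the bytes you meant to use. A minimal sketch of that idea, assuming a team keeps a table of pinned SHA-256 digests for the artifacts it has reviewed, might look like the following. The file name, the PINNED table and the verify_artifact function are hypothetical, and an SCA tool does far more than this.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the artifact bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(name: str, data: bytes, pinned: dict) -> bool:
    """True only if the artifact is pinned AND its digest matches the pin."""
    expected = pinned.get(name)
    if expected is None:
        return False  # unpinned artifacts are rejected, not trusted by default
    return hmac.compare_digest(sha256_hex(data), expected)


# Hypothetical pin recorded when the package was first reviewed.
REVIEWED = b"original contents"
PINNED = {"example-lib-1.0.0.tar.gz": sha256_hex(REVIEWED)}

if __name__ == "__main__":
    print(verify_artifact("example-lib-1.0.0.tar.gz", REVIEWED, PINNED))
    print(verify_artifact("example-lib-1.0.0.tar.gz", b"tampered", PINNED))
    print(verify_artifact("unreviewed-lib-0.1.tar.gz", b"anything", PINNED))
```

Package managers offer built-in versions of this check. pip, for example, has a hash-checking mode enabled with --require-hashes. The point of the sketch is the default: anything that is not pinned, or does not match its pin, fails.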
Or, to paraphrase the line President Ronald Reagan used in dealing with the Soviet Union, verify before you trust any software.