3 New Malicious Packages Found on PyPI

Highly Used Packages Identified Through Text Analysis

Andrew Scott
Ochrona Security
4 min read · Dec 12, 2021

Background

I recently started running large-scale static analysis on a sizable share of the packages on PyPI. Using Bandersnatch from the Python Packaging Authority, it was relatively easy to download roughly 200k of the nearly 330k packages available on PyPI today. For this research, I was careful to exclude exceedingly large distributions, pull only the latest version of each package, and configure a lower number of workers to avoid putting excess pressure on PyPI’s servers.

Once I had a large number of the package distributions downloaded, I needed to extract them for easier analysis. I put together a pretty simple Python script that recursively iterates through Bandersnatch’s somewhat complicated folder structure, then decompresses and extracts each sdist, egg, or wheel out to a flat directory. Once everything was extracted, I ran a number of string and regex searches using grep, then manually reviewed the results.
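The original extraction script isn't reproduced here, so the following is only a minimal sketch of what such a step might look like. The function names (`extract_archive`, `extract_all`), the suffix lists, and the one-folder-per-archive layout are all assumptions made for illustration. Because the archives are untrusted, the sketch skips any member that would land outside its destination folder, and for tarballs it skips links and special files.

```python
import tarfile
import zipfile
from pathlib import Path

# Wheels and eggs are zip files; sdists are usually tarballs or zips.
ZIP_SUFFIXES = (".whl", ".egg", ".zip")
TAR_SUFFIXES = (".tar.gz", ".tgz", ".tar.bz2")


def _is_within(base: Path, target: Path) -> bool:
    """Return True if target resolves to a location inside base."""
    base = base.resolve()
    target = target.resolve()
    return base == target or base in target.parents


def extract_archive(archive: Path, out_root: Path) -> bool:
    """Extract one distribution into out_root/<archive name>; True on success."""
    dest = out_root / archive.name
    name = archive.name.lower()
    try:
        if name.endswith(ZIP_SUFFIXES):
            with zipfile.ZipFile(archive) as zf:
                # Skip members whose path would escape the destination.
                members = [m for m in zf.namelist() if _is_within(dest, dest / m)]
                dest.mkdir(parents=True, exist_ok=True)
                zf.extractall(dest, members=members)
        elif name.endswith(TAR_SUFFIXES):
            with tarfile.open(archive) as tf:
                # Keep only regular files and directories that stay inside dest.
                members = [
                    m for m in tf.getmembers()
                    if (m.isfile() or m.isdir()) and _is_within(dest, dest / m.name)
                ]
                dest.mkdir(parents=True, exist_ok=True)
                tf.extractall(dest, members=members)
        else:
            return False
    except (zipfile.BadZipFile, tarfile.TarError, OSError):
        # Corrupt or unreadable archives are skipped, not fatal.
        return False
    return True


def extract_all(mirror_root: Path, out_root: Path) -> int:
    """Walk mirror_root recursively and extract every recognised archive."""
    count = 0
    for path in sorted(mirror_root.rglob("*")):
        if path.is_file() and extract_archive(path, out_root):
            count += 1
    return count
```

Keeping `out_root` outside of `mirror_root` avoids walking freshly extracted files. Giving each archive its own folder under one flat directory keeps files from different packages from overwriting each other.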

The outcome of this simple approach was actually pretty impactful. I ended up discovering a minor vulnerability in an open source package offered by a commercial company, along with three packages hosted on PyPI that were malicious in nature, which I’ll outline below.

Trojan Installation Package

The first package that I identified was aws-login0tool, which seems like a typo-squatting attempt (0 being right next to - on most keyboards). I found this package because it was flagged in multiple text searches I ran against setup.py, one of the most common locations for malicious code in Python packages because arbitrary code can be executed there at install time. Specifically, it matched a search for import urllib.request, which is commonly used to exfiltrate data or download malicious files. It also matched from subprocess import Popen, which is somewhat suspicious because most packages don’t need to execute arbitrary command line code.
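The searches themselves were run with grep, but the same idea can be sketched in Python for anyone who wants to script it. The two patterns below come from the strings mentioned above. The function name `scan_setup_files`, the labels, and the exact regexes are assumptions for illustration, not the commands that were actually used.

```python
import re
from pathlib import Path

# Patterns based on the two strings mentioned above; extend as needed.
SUSPICIOUS_PATTERNS = {
    "urllib.request import": re.compile(
        r"^\s*import\s+urllib\.request\b", re.MULTILINE
    ),
    "subprocess Popen import": re.compile(
        r"^\s*from\s+subprocess\s+import\s+.*\bPopen\b", re.MULTILINE
    ),
}


def scan_setup_files(extracted_root: Path) -> dict[str, list[str]]:
    """Map each flagged setup.py path to the labels of the patterns it matched."""
    hits: dict[str, list[str]] = {}
    for setup_py in sorted(extracted_root.rglob("setup.py")):
        if not setup_py.is_file():
            continue
        # Tolerate odd encodings; we only care about plain-text matches.
        text = setup_py.read_text(encoding="utf-8", errors="replace")
        matched = [
            label
            for label, pattern in SUSPICIOUS_PATTERNS.items()
            if pattern.search(text)
        ]
        if matched:
            hits[str(setup_py)] = matched
    return hits
```

A match here is only a lead. Plenty of legitimate packages import urllib.request or Popen in setup.py, so every hit still needs manual review.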

Looking at the file itself, it’s very clearly malicious. It does a standard package install, then fetches an .exe from a nondescript domain before attempting to execute the .exe.

setup.py and VirusTotal results for aws-login0tool

I also analyzed the .exe in question and VirusTotal was able to easily flag it as a known Trojan.

This package was first added to PyPI on December 1st, and I reported it to PyPI admins on December 10th. Unfortunately, PyPI download stats were affected by an incident during this time period, but after a backfill on December 12 it appears that this package was downloaded nearly 600 times.

Data Exfiltration Packages

I was also able to locate two packages uploaded by the same user which have been present on PyPI since February 2021, dpp-client and dpp-client1234. Both attempt to achieve the same outcome, which is to collect environment details during installation and send them to an unknown web service. While VirusTotal didn’t flag the target domain, this kind of behavior is still sketchy AF.

As with the earlier trojan package, I identified these packages via the import urllib.request string. On closer analysis, the install tries to gather environment variables and file listings, and seems to be looking specifically for files related to Apache Mesos.
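String matching found these packages, but one possible refinement is to parse setup.py and flag files that both import a network-capable module and read the environment. The sketch below uses Python's `ast` module and is not the method used in this research. The function names and the `NETWORK_MODULES` list are assumptions chosen for illustration.

```python
import ast

# Hypothetical shortlist of modules that can talk to the network.
NETWORK_MODULES = {"urllib", "urllib.request", "requests", "socket", "http.client"}


def imported_modules(source: str) -> set[str]:
    """Collect module names from every import statement in the source."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module)
    return modules


def reads_environment(source: str) -> bool:
    """True if the code references os.environ or os.getenv."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
            if node.value.id == "os" and node.attr in {"environ", "getenv"}:
                return True
    return False


def looks_like_exfiltration(source: str) -> bool:
    """Heuristic: network-capable import plus environment access."""
    try:
        has_network = bool(imported_modules(source) & NETWORK_MODULES)
        return has_network and reads_environment(source)
    except (SyntaxError, ValueError):
        # Python 2 or otherwise unparseable files are left to text search.
        return False
```

This is a coarse heuristic. It will miss obfuscated code and aliased imports such as `import os as o`, and it will flag some harmless build scripts, so it narrows the review queue but does not replace manual review.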

setup.py for dpp-client 1.0.9

The bad news here is that dpp-client, the more popular and thorough of the two packages, has been downloaded over 10,000 times in the last year, including over 600 downloads in the last month. What’s worse, both of these packages listed an existing popular library as their source code URL, so anyone browsing to the package on PyPI or checking how popular the library was would see a large number of GitHub stars and forks, indicating a good reputation.

Takeaways

This was a very interesting project for me. Even with a less-than-scientific approach, and analyzing only about 2/3 of available packages, I quickly discovered actual vulnerabilities in published code and, more worryingly, found actively used malicious code being hosted and downloaded.

I’ll be continuing to update and refine my analysis of these packages over time and will share out any additional discoveries.

Big thanks to the Python security team for promptly removing the identified packages once notified.

Andrew Scott
Ochrona Security

Maintainer @OchronaSec | PANW, ex Expanse, ex Tenable | Security & Automation | All views are my own... and awesome