Python Arbitrary File Write Prevention: The Tarbomb
What is a “Tarbomb” attack and how can you protect your Python applications?
⚠️ The code in this post is meant for educational purposes ONLY! If you don’t own or have explicit permission to do penetration testing against an application, DO NOT USE THIS CODE ⚠️
What is a Tarbomb?
A tarbomb can be a few different things. One common definition is similar to the XML bomb we looked at previously, which expands from a small file into a very large object in memory. In the tar version, the archive contains many, many files that flood the file system when extracted. However, we’ll be looking at an alternative type of tarbomb that is malicious rather than just annoying.
Our tarbomb will be constructed by adding files to the tarball whose paths point outside of the current directory, using relative paths. There are also variants of this attack that use absolute paths or symlinks to accomplish the same goal: creating or overwriting files in a directory the archive should not have access to.
As an example of how this could work, imagine you’re on your MacBook trying to open a file you just downloaded from your email, accounts_2020_06.tar.gz. From your downloads folder, you would expect the archive to be extracted into a new folder named accounts_2020_06. However, what if the archive contained a file with the path ../.bash_profile, holding a modified version of a bash profile that opens a backdoor on your system? If the path were taken literally, this malicious file would overwrite your valid bash profile and you wouldn’t even know it.
Luckily, the macOS archive utility and many other decompression tools check for these scenarios. However, not all do. Case in point: tarfile, part of the Python standard library, is vulnerable to this type of attack when used out of the box.
While most major zip and tar compression libraries have patched this vulnerability since Snyk did additional research and publicized Zip Slip in 2018, I have to assume that many modern and legacy libraries and products are still vulnerable to this class of attack.
Creating a Tarbomb
Creating a tarbomb isn’t very difficult. See the code example below for a simple tool I put together to quickly create tarbombs for testing.
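A minimal sketch of such a tool might look like the following. It is not the original tool, just an illustration using only the standard library: the function name build_traversal_tar, the member name ../outside.txt, and the harmless payload are all assumptions made for the example. The archive is built in memory, so nothing is written to disk.

```python
import io
import tarfile


def build_traversal_tar(member_name: str, payload: bytes) -> bytes:
    """Return the bytes of a tar archive holding one file named member_name.

    Hypothetical helper for testing your own extraction code.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # TarInfo keeps the name as given, including any "../" components.
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


archive_bytes = build_traversal_tar("../outside.txt", b"harmless test payload\n")

# Read the archive back to confirm the relative path survived intact.
with tarfile.open(fileobj=io.BytesIO(archive_bytes)) as archive:
    names = archive.getnames()
    print(names)
```

The key point is that nothing in the tar format stops a member name from starting with ../, so the safety check has to happen at extraction time.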
Triggering the Tarbomb
As mentioned earlier, Python’s tarfile module is vulnerable to this weakness. To trigger the vulnerability you just need to invoke the extractall method on a malicious tarball.
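To see why this works without actually writing anything outside a directory, the sketch below computes where each member would land if its name were simply joined onto the destination, which is what an unchecked extraction does. The directory name extract_here and the member name ../outside.txt are hypothetical, and the archive is built in memory.

```python
import io
import os
import tarfile

# Build a tiny archive in memory with one traversal-style member.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    payload = b"harmless test payload\n"
    info = tarfile.TarInfo(name="../outside.txt")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
buffer.seek(0)

destination = os.path.abspath("extract_here")  # hypothetical target directory

results = []
with tarfile.open(fileobj=buffer) as archive:
    for member in archive.getmembers():
        # Joining the member name onto the destination and normalizing it
        # shows the real location the file would be written to.
        target = os.path.abspath(os.path.join(destination, member.name))
        escapes = os.path.commonpath([destination, target]) != destination
        results.append((member.name, escapes))
        print(f"{member.name!r} -> {target} (escapes destination: {escapes})")
```

Any member flagged as escaping would be created next to, rather than inside, the directory you thought you were extracting into.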
I couldn’t find any reliable workarounds for safe extraction after some light googling, so I made my own drop-in replacement library for tarfile. My solution, tarsafe, simply subclasses TarFile and adds some safety checks.
Tarsafe can be used in exactly the same way as tarfile, but with added protections in extractall against not just path traversal via relative paths, but also symlinks/links.
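The general idea of subclassing TarFile and validating members before extraction can be sketched as follows. This is not tarsafe’s actual source: the class name CheckedTarFile and its helper methods are hypothetical, and the checks are a simplified illustration that resolves each member path and link target and refuses anything that lands outside the destination.

```python
import io
import os
import tarfile
import tempfile


class CheckedTarFile(tarfile.TarFile):
    """Hypothetical TarFile subclass; an illustration, not tarsafe itself."""

    def extractall(self, path=".", members=None, **kwargs):
        base = os.path.realpath(path)
        members = list(self.getmembers() if members is None else members)
        # Validate every member before anything is written to disk.
        for member in members:
            self._check_member(base, member)
        super().extractall(path, members, **kwargs)

    @staticmethod
    def _require_inside(base, candidate, label):
        resolved = os.path.realpath(candidate)
        if os.path.commonpath([base, resolved]) != base:
            raise ValueError(f"blocked {label}: {candidate!r} escapes {base!r}")

    def _check_member(self, base, member):
        target = os.path.join(base, member.name)
        self._require_inside(base, target, "member path")
        if member.issym():
            # A symlink target is resolved relative to the link's own directory.
            link_target = os.path.join(os.path.dirname(target), member.linkname)
            self._require_inside(base, link_target, "symlink target")
        elif member.islnk():
            # A hard link target is a member name relative to the archive root.
            link_target = os.path.join(base, member.linkname)
            self._require_inside(base, link_target, "hard link target")


# Demo: a traversal-style archive built in memory is rejected.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    payload = b"harmless test payload\n"
    info = tarfile.TarInfo(name="../outside.txt")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
buffer.seek(0)

blocked = False
with tempfile.TemporaryDirectory() as workdir:
    destination = os.path.join(workdir, "extract_here")
    os.mkdir(destination)
    with CheckedTarFile.open(fileobj=buffer) as archive:
        try:
            archive.extractall(destination)
        except ValueError as error:
            blocked = True
            print(error)
    escaped_file_exists = os.path.exists(os.path.join(workdir, "outside.txt"))
```

Because open is a classmethod on TarFile, calling CheckedTarFile.open returns an instance of the subclass, which is what makes the drop-in approach possible. Newer Python releases also accept a filter argument on extractall (for example filter="data"), so it is worth checking the tarfile documentation for your version before relying on a hand-rolled check like this one.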
I hadn’t planned to write this post. In fact, I stumbled on this vulnerability in the wild a few weeks ago and thought I had discovered a new vulnerability in tarfile. While I did discover a vulnerability, it unfortunately wasn’t a new one. My hope is that this post will help raise awareness about tarfile and the dangers of handling suspect files without safety checks. Happy coding.