Zip me baby one more time
Introduction
As part of adversary simulation efforts, I have been keeping a lookout for some of the TTPs (Tactics, Techniques and Procedures) that can be used to deliver malware in a simulated attack. One of the sites that I refer to shows that most of these attacks used macros.
Although macros serve as an effective delivery mechanism for malware, they are a well-known threat and may be detected by virus scanners in email gateways. As such, I wanted to explore novel mechanisms that can bypass email gateways.
This article describes my exploratory journey with a focus on Zip files and how they can be utilised in a simulated malware delivery mechanism.
Looking into Zip File Format
As a start, I looked into Microsoft Outlook, which is part of the Office 365 suite of products. Outlook is one of the most widely used email clients among both personal and enterprise users.
Office 365 blocks numerous file formats in its suite of products (e.g. shortcut files, Python files, PowerShell scripts, WScript scripts). Microsoft publishes the list of blocked file formats on its support site.
While the list demonstrates the excellent job Microsoft has done in blocking commonly abused file types, Microsoft has also inadvertently provided a way to bypass its own file format blacklist.
Microsoft’s support page on blocked attachments in Outlook states: “Using a compression utility, such as WinZip, creates a compressed archive file that has a different file name extension. Outlook doesn’t recognize these file name extensions as potential threats. Therefore, it doesn’t block the new attachment.”
Considering that zip files are one of the most commonly used file formats by employees, organisations would likely allow such file formats to pass through their email gateways.
Previously Known Bypass Techniques using Zip Files
In 2019, Trustwave published an article on a NanoCore RAT malware campaign that used specially crafted zip files to deliver its malware.
The zip archive contained 2 files: order.jpg and PL_INV_pdf.exe. What is amazing about this zip file is that it contained 2 End of Central Directory (EOCD) records. EOCD records are used by zip archives to signify the end of the zip archive; a normal zip file should only have 1 EOCD record.
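To make the EOCD idea concrete, here is a minimal Python sketch that counts EOCD signatures (the bytes `PK\x05\x06`) in an archive. The helper names and the second file name are hypothetical, and gluing two well-formed archives together is only a rough stand-in for the sample described above, not a reconstruction of it. The signature could also appear by coincidence inside stored data, so a count above 1 is a hint rather than proof.

```python
import io
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # End of Central Directory record signature


def make_zip(name: str, content: bytes) -> bytes:
    """Build a small, well-formed zip archive in memory."""
    # A fixed timestamp keeps the generated bytes reproducible.
    info = zipfile.ZipInfo(name, date_time=(2020, 1, 1, 0, 0, 0))
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, content)
    return buf.getvalue()


def count_eocd_records(data: bytes) -> int:
    """Count EOCD signatures in the raw bytes of an archive."""
    return data.count(EOCD_SIGNATURE)


normal = make_zip("order.jpg", b"placeholder bytes, not a real image")
# Two complete archives glued together: a rough stand-in for a file
# that carries two EOCD records.
doubled = normal + make_zip("second.txt", b"more placeholder bytes")

print(count_eocd_records(normal))   # 1
print(count_eocd_records(doubled))  # 2
```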
As seen from the image below, this specially crafted zip file has the first EOCD record appended after the 1st file (order.jpg) and the 2nd EOCD record appended after the 2nd file (PL_INV_pdf.exe).
Because of the extra EOCD record, different zip extractors may extract different files from the same specially crafted zip archive.
Based on the table below, WinRar, PowerArchiver and portable 7Zip extracted the exe file. On the other hand, the 7Zip file extractor extracted order.jpg.
Lastly, the default Windows zip extractor will detect that the zip archive is invalid and not allow the extraction of any file (as shown below).
Messing around with Zip Files
Based on the case study above, I came up with a few hypotheses and questions:
1. It is not easy to code a proper zip extractor. Tools that handle archive formats which are not native to them are even more prone to errors (e.g. using WinRar or 7Zip to unzip a .zip file).
2. 7Zip seems more lenient with errors in the zip archive headers. This means that even when a zip archive is malformed, 7Zip could still extract some files from it.
3. How do zip extractors handle special characters not permitted by Windows file names? (e.g. “?” character is not allowed in a Windows filename)
For this article, I will focus on .zip extraction with 7Zip. The image below, which references the diagram from mql.com, shows the structures of the local file header and the central directory in a zip file named “HelloWorld” containing HelloWorld.txt.
In the zip file structure, I noticed two identical file names in the zip archive, one in the local file header and one in the central directory. I wanted to explore the relationship between these two file names and how the local file header and the central directory would affect the zip archive.
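The two copies of the file name can be read directly from the raw bytes. The sketch below is a minimal illustration that assumes a single-entry archive with its local file header at offset 0 and no ZIP64 extensions; the function name is hypothetical. It uses the standard field layout: in the local file header the name length sits at offset 26 and the name at offset 30, while in a central directory header the name length sits at offset 28 and the name at offset 46.

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"    # local file header
CENTRAL_SIG = b"PK\x01\x02"  # central directory file header
EOCD_SIG = b"PK\x05\x06"     # end of central directory record


def first_entry_names(data: bytes) -> tuple[str, str]:
    """Return (local header name, central directory name) for the first entry."""
    if data[:4] != LOCAL_SIG:
        raise ValueError("no local file header at offset 0")
    (local_len,) = struct.unpack_from("<H", data, 26)
    local_name = data[30:30 + local_len]

    # The EOCD record stores the offset of the central directory at +16.
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("no EOCD record found")
    (cd_offset,) = struct.unpack_from("<I", data, eocd + 16)
    if data[cd_offset:cd_offset + 4] != CENTRAL_SIG:
        raise ValueError("no central directory header at the recorded offset")
    (central_len,) = struct.unpack_from("<H", data, cd_offset + 28)
    central_name = data[cd_offset + 46:cd_offset + 46 + central_len]

    return local_name.decode("cp437"), central_name.decode("cp437")


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("HelloWorld.txt", b"Hello World")

print(first_entry_names(buf.getvalue()))  # ('HelloWorld.txt', 'HelloWorld.txt')
```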
Experiment 1
To test this, I created a zip archive with different file names in the local file header and the central directory. The local file header’s file name was set to max.txt and the central directory’s file name was set to sam.rtf.
When the zip file was extracted, 7Zip reported errors in the headers but still extracted max.txt. From this experiment, we can infer that 7Zip takes the file name from the local file header when extracting. Even if a different file name or extension is used in the central directory, it does not affect the extracted file.
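For comparison, the same mismatch can be reproduced in memory and fed to Python's standard `zipfile` module. This is only a sketch of how one other parser reacts, not a statement about 7Zip: in CPython, `zipfile` lists names from the central directory and then refuses to read an entry whose local header name differs. The `try_read` helper is hypothetical.

```python
import io
import zipfile

# Build a normal archive, then overwrite the name in the central directory only.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("max.txt", b"hello")
data = bytearray(buf.getvalue())

# The last occurrence of the name is the central directory copy.
pos = data.rfind(b"max.txt")
data[pos:pos + len(b"max.txt")] = b"sam.rtf"


def try_read(archive: bytes):
    """Return the listed names and any error raised while reading the first entry."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        listed = zf.namelist()  # names come from the central directory
        try:
            zf.read(listed[0])  # reading consults the local file header
            error = None
        except zipfile.BadZipFile as exc:
            error = str(exc)
    return listed, error


listed, error = try_read(bytes(data))
print(listed)  # ['sam.rtf']
print(error)   # a message saying the directory and header names differ
```

The point of the comparison is that each parser decides for itself which copy of the name to trust and how strictly to treat a disagreement.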
Experiment 2
Next, I wanted to test how 7Zip handles special characters that are not allowed in Windows file names. I modified the file name in the local file header by replacing the character ‘a’ with ‘/’, turning max.txt into m/x.txt. 7Zip interpreted the ‘/’ as a folder separator: after extraction, a new directory called m was created with x.txt in it.
Peeking into the contents of x.txt, the original file content was still present.
To test whether special characters are counted as part of the file name, I added an extra ‘/’ character to the file name. Instead of the full name, x.tx was extracted: the last character was cut off, presumably because the file name length field was left unchanged. This behaviour demonstrated that special characters are counted as part of the file name.
This sparked a thought: what if the file names in the local file header and the central directory had different extensions? Would the email gateway determine the file format from the file name in the local file header or from the one in the central directory? This set the context for Experiment 3.
Experiment 3
To test my idea, I created a zip file with a text file containing PowerShell commands. I then modified the file extension in the local file header from .txt to .ps1.
I sent this specially crafted zip archive through some email gateways and it bypassed their file format blacklists. Interestingly, 7Zip used the file extension from the local file header when extracting, so the extracted file was a .ps1 file and not a .txt file.
This experiment also supported my hypothesis that some email gateways use only the file extension in the central directory to determine the file format.
Experiment 4
Using the technique described in Experiment 3, I was able to bypass some email gateways that blacklist .ps1 files. When I switched to .hta files, however, email gateways such as Gmail blocked the attachment and reported that a zip archive containing an .hta file was not allowed.
To bypass this validation, I had the idea of creating a schizophrenic zip archive, that is, an archive that is interpreted differently by different zip extractors. This idea was inspired by Ange Albertini’s work on schizophrenic files.
To get started, I created a zip archive containing a .txt file with the contents shown in the picture below: an HTA script that runs calculator.exe when executed.
Next, I edited the file name length field in the local file header, located at offset 0x1A. Changing this value from 0xC to 0x0 makes zip extractors treat the file name as having a length of 0.
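The field in question can be inspected with a few lines of Python. This is a minimal sketch that assumes the first local file header starts at offset 0; `testfile.txt` is a hypothetical 12-character name chosen only so that the length reads 0xC, and the edit is simulated on an in-memory copy of a harmless archive.

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"
NAME_LENGTH_OFFSET = 0x1A  # file name length field in the local file header


def local_name_length(data: bytes, header_offset: int = 0) -> int:
    """Read the 2-byte little-endian file name length from a local file header."""
    if data[header_offset:header_offset + 4] != LOCAL_SIG:
        raise ValueError("no local file header at the given offset")
    (length,) = struct.unpack_from("<H", data, header_offset + NAME_LENGTH_OFFSET)
    return length


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("testfile.txt", b"placeholder")  # hypothetical 12-character name
original = buf.getvalue()
print(hex(local_name_length(original)))  # 0xc

# Simulate the edit on a copy so the resulting value can be observed.
patched = bytearray(original)
struct.pack_into("<H", patched, NAME_LENGTH_OFFSET, 0)
print(local_name_length(bytes(patched)))  # 0
```

A local file header with a zero-length name is unusual for an ordinary file entry, so a scanner could treat that value as a sign of tampering.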
An interesting behaviour of 7Zip is that when the file name length is 0, it derives the name of the extracted file from the name of the zip archive. To exploit this behaviour, I inserted an extra .hta extension before the .zip extension in the archive’s file name.
Opening this zip archive with 7Zip would result in a .hta file being extracted.
When previewing this zip archive with the default Windows zip extractor, it showed a .txt instead of a .hta file extension. With this, I have successfully created a schizophrenic zip archive which displays different file formats when using different zip extractors.
As this specially crafted zip archive requires the user to extract it using 7Zip, I modified the .zip file extension to .7z. This would increase the probability of the victim opening this zip archive with 7Zip.
As a proof of concept, I uploaded this specially crafted zip archive as an email attachment on Gmail. This time, Gmail did not report that the zip archive contained an .hta file.
Responsible Disclosure to Google and 7Zip
Before publishing this article, I responsibly disclosed this file format bypass to Google. Google acknowledged the issue but decided not to track it as a security bug, as it is intended behaviour.
Similarly, I disclosed the behaviour to 7Zip and sought permission before releasing this article.
Conclusion
- Using this specially crafted zip archive, I was successful in bypassing email gateways that implemented file format blacklists. This appears to be because most email gateways used the default Windows zip extractor to determine the formats of the files inside a zip archive.
- Ever since Microsoft blacklisted multiple file formats, zip files have become an easier approach to delivering malware.
- Malicious actors are likely to continue developing new tricks to bypass defences such as file format blacklists. As such, defenders should consider employing multiple, layered defences.
- Organisations should drop malformed zip files at the email gateways. It is unlikely for anyone to send malformed zip files for work purposes.
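As an illustration of what dropping malformed zip files could look like, here is a minimal sketch of a consistency check covering the anomalies discussed in this article: more than one EOCD record, an empty file name in a local file header, and a file name that differs between the local file header and the central directory. The function name and messages are hypothetical, ZIP64 and multi-disk archives are ignored, and the EOCD count would also flag legitimate cases such as an uncompressed zip stored inside another zip, so treat it as a starting point rather than a production filter.

```python
import io
import struct
import zipfile

LOCAL_SIG = b"PK\x03\x04"
CENTRAL_SIG = b"PK\x01\x02"
EOCD_SIG = b"PK\x05\x06"


def find_anomalies(data: bytes) -> list[str]:
    """Return a list of human-readable problems found in a zip archive."""
    problems = []
    if data.count(EOCD_SIG) != 1:
        problems.append("expected exactly one EOCD record")
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0 or eocd + 22 > len(data):
        return problems + ["no usable EOCD record"]

    # EOCD: total entries at +10, central directory size at +12, offset at +16.
    total_entries, _cd_size, pos = struct.unpack_from("<HII", data, eocd + 10)
    for _ in range(total_entries):
        if data[pos:pos + 4] != CENTRAL_SIG or pos + 46 > len(data):
            problems.append("central directory entry missing or truncated")
            break
        name_len, extra_len, comment_len = struct.unpack_from("<HHH", data, pos + 28)
        (local_offset,) = struct.unpack_from("<I", data, pos + 42)
        central_name = data[pos + 46:pos + 46 + name_len]

        if data[local_offset:local_offset + 4] != LOCAL_SIG or local_offset + 30 > len(data):
            problems.append(f"no local file header for {central_name!r}")
        else:
            (local_len,) = struct.unpack_from("<H", data, local_offset + 26)
            local_name = data[local_offset + 30:local_offset + 30 + local_len]
            if local_len == 0:
                problems.append(f"empty local file name for {central_name!r}")
            elif local_name != central_name:
                problems.append(
                    f"name mismatch: local {local_name!r} vs central {central_name!r}"
                )
        pos += 46 + name_len + extra_len + comment_len
    return problems


buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("notes.txt", b"placeholder")
clean = buf.getvalue()

# Overwrite the central directory copy of the name to simulate a mismatch.
tampered = bytearray(clean)
pos = tampered.rfind(b"notes.txt")
tampered[pos:pos + len(b"notes.txt")] = b"notes.rtf"

print(find_anomalies(clean))            # []
print(find_anomalies(bytes(tampered)))  # one "name mismatch" entry
```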