Analysis Walkthrough: APT32’s {79828CC5–8979–43C0–9299–8E155B397281}.dll

asuna amawaka
Published in insomniacs · 19 min read · Jun 23, 2022

It has been some time since I did any reverse engineering! Time to refresh? And so I reached my hand into my stash of old malware…

This set of APT32’s binaries fell into my hands a long time ago and the anti-reversing stuff inside set me back for a long while. Don’t get excited yet, these are old binaries. Back when I first saw them, I didn’t think I had it in me to analyze them. Now, let’s try!

This post will be organized into two parts: The first part will be on the unpacking of the payload; the second part will be on the analysis of the payload’s features. Lengthy post ahead, be warned!

Other researchers’ work on the same malware payload ({79828CC5–8979–43C0–9299–8E155B397281}.dll) is linked at the end of this post.

OK let’s go!

Files I started out with:

· xwizard.dll

· xwizard.dtd

Legitimate copies of these files are found in Windows (C:\Windows\System32) by default:

However, the set of malware looks different if you are sharp eyed:

Just seeing the DLL and DTD files reminds me of the file trinity (as seen in PlugX and ShadowPad, just to drop some familiar names).

Then I came across this old article: https://www.hexacorn.com/blog/2017/07/31/the-wizard-of-x-oppa-plugx-style/

This is very likely how the malicious files are being executed.

Alright, let’s start with simple static analysis as usual, then we’ll deep dive.

Scrolling through the DLL file, this caught my eye:

That looks like some magic chant.

There’s also another set of similar nonsense in Unicode:

And this is how these words are being used in the DLL:

Essentially, the “random nonsense” does nothing at all. So why is it there? To waste our time. Quite typical of what APT32 does in their malware.

The meat of the malware lies in the DTD file. The content of this file gets read into memory and executed in a new thread.

Welcome to junkyard.

Up ahead is a whole bunch of anti-reversing techniques purely to make us suffer :)

The DTD file starts with some instructions followed by stuff that resembles the PE header (actually just the DOS stub and RICH header are there, at a glance).

It just feels like the DOS stub and RICH header are deliberately left there as a tease. Why else would they not be removed or obfuscated along with everything else?

Unsurprisingly, the instructions at the beginning of the file are calls:

Which leads to a function that looks like this…

Uh.. ok..

One look at the first few instructions within this “function” already makes me go yikesssss…

What we are seeing is what other researchers [3] have noted about APT32’s usage of junk code and jumps.

Amidst the junk code, we see familiar code blocks. For example, these look like the typical GetProcAddress calls.

And also loops like this that resemble decryption of sorts.

I decided to leave the heavy lifting to the malware itself by executing it. Then we can look for interesting parts to analyse in depth.

I wrote a quick binary loader in Python to load the DTD directly into memory without going through the DLL (since we already know it does nothing interesting).

import sys
import ctypes
import ctypes.wintypes as wintype

# ctypes.windll is an attribute, not a module, so it cannot be imported directly
kernel32 = ctypes.windll.kernel32

MEM_COMMIT = 0x1000
PAGE_EXECUTE_READWRITE = 0x40

VirtualAlloc = kernel32.VirtualAlloc
VirtualAlloc.argtypes = [wintype.LPVOID, ctypes.c_size_t, wintype.DWORD, wintype.DWORD]
VirtualAlloc.restype = wintype.LPVOID

RtlMoveMemory = kernel32.RtlMoveMemory
RtlMoveMemory.argtypes = [wintype.LPVOID, wintype.LPVOID, ctypes.c_size_t]
RtlMoveMemory.restype = wintype.LPVOID

CreateThread = kernel32.CreateThread
CreateThread.argtypes = [wintype.LPVOID, ctypes.c_size_t, wintype.LPVOID, wintype.LPVOID, wintype.DWORD, wintype.LPVOID]
CreateThread.restype = wintype.HANDLE

WaitForSingleObject = kernel32.WaitForSingleObject
WaitForSingleObject.argtypes = [wintype.HANDLE, wintype.DWORD]
WaitForSingleObject.restype = wintype.DWORD

# Read the binary named on the command line
with open(sys.argv[1], "rb") as input_binfile:
    payload = input_binfile.read()

# Copy it into executable memory and run it from offset 0 in a new thread
mem_addr = VirtualAlloc(0, len(payload), MEM_COMMIT, PAGE_EXECUTE_READWRITE)
print("Loaded at address {:08X}".format(mem_addr))
RtlMoveMemory(mem_addr, payload, len(payload))
thread = CreateThread(0, 0, mem_addr, 0, 0, 0)
WaitForSingleObject(thread, 0xFFFFFFFF)

Andddd we run it.

There we go! DNS queries!

Let’s see if we can figure out where the config data is in the memory, using these domain names as clues. The idea is: if we can find the config data, then we are closer to the malware payload.

With hasherezade’s pe-sieve, this step is easy!

We find the domain names within one of the DLL PEs:

It looks like there is a registry key name nearby as well:

Also notice that the DNS queries are done in the order that the domain names appear within the config data.

Because we are reverse engineers, and here I am writing an analysis, we can’t just completely rely on tools’ output, can we? :D Let’s have some more fun.

We can work backwards a little to see how this payload got loaded amidst the junk code.

To get started, we can get a bit more help from an API monitor tool. We see allocations of 2 memory blocks, one for the headers and the other for the code, followed by a CreateThread that starts execution of the code. We can put a breakpoint on VirtualAlloc to find out where the payload was hiding.

In order to hook into the binary with a debugger, we can patch the first two bytes of the DTD file to “EB FE”, which is the instruction for an infinite loop (EB FE means JMP -2, which effectively causes this 2 byte instruction to be executed repeatedly).

Once our debugger is hooked in, we can revert these two bytes to their original in memory.
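The patch-and-restore trick can be scripted. Here is a minimal sketch, assuming execution starts at offset 0 of the file (as it does with my loader). The function names `patch_entry` and `short_jmp_target` are my own and only for illustration.

```python
import struct

INFINITE_LOOP = b"\xEB\xFE"  # short JMP with displacement -2: jumps onto itself


def patch_entry(data: bytes):
    """Return (patched copy, original first two bytes) so they can be restored later."""
    original = data[:2]
    return INFINITE_LOOP + data[2:], original


def short_jmp_target(addr: int, insn: bytes) -> int:
    """Compute where a 2-byte short JMP (EB xx) located at addr lands."""
    if len(insn) != 2 or insn[0] != 0xEB:
        raise ValueError("not a short JMP")
    rel8 = struct.unpack("b", insn[1:2])[0]  # signed 8-bit displacement
    return addr + 2 + rel8                   # relative to the next instruction
```

Write the patched copy to disk, load it, attach the debugger while the thread spins, then write the saved two bytes back at the load address before stepping.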

The first call to VirtualAlloc is here:

Followed by another VirtualAlloc and some memcpy:

Turns out that the content being copied is..

0x40 bytes of DOS header from the stack:

0xF8 bytes of the PE header, also from the stack:

The result of the memcpy looks like a proper PE file forming. And I think there is a good chance that this is the PE file we are looking for.

The next part that we can expect to be copied would be the section headers.

True enough. But there is something different this time. The data is being copied from the mapped DTD file and is encrypted.

Which brings us to this loop that prepares a value in AL that gets XOR-ed with the above data to reveal the section headers. Decryption!

The next strange thing that the malware does is populate the .text section with 0x90 (NOPs) and execute it with a lot of new threads. There are many ways of slowing down analysis and the malware author decided to use this.

After an eternity of waiting (just joking, it didn’t take that long. But we should short-circuit that in the future), the real contents get copied over. Again, the contents came from the mapped DTD file and are encrypted.

From this point onwards, the malware just performs VirtualAllocs, copies encrypted sections from the DTD in memory and decrypts them… and the final outcome is indeed the file that we recovered with pe-sieve!

At this point, I would like to point out that what we have just experienced is an unpacking routine. This packer contains a few anti-analysis mechanisms, including junk code, data assembly using the stack and encryption.

Let’s find out which encryption algorithm is used to protect the code of the payload.

Before we can talk about algorithms, we have to be aware that the malware assembles data on the stack with code that looks like this:

Following the GetProcAddress resolution of

· WaitForSingleObject

· VirtualAlloc

· CloseHandle

· CreateThread

The malware copies 0x100 bytes of data onto the stack like this:

There are even some sneaky edits to this chunk of data, following some junk code.

That is not all. 0x47 bytes found near the beginning of the DTD file are XOR-ed with the first 0x47 bytes of this data chunk.

If we had dumped this chunk of data too early, we would have gotten the wrong values in some places.
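To make that XOR step concrete, here is a minimal sketch. The helper name `apply_xor_mask` is hypothetical, and the position of the 0x47 mask bytes inside the DTD is left to the caller, since I only describe it as being near the beginning of the file.

```python
def apply_xor_mask(chunk: bytes, mask: bytes) -> bytes:
    """XOR the first len(mask) bytes of chunk with mask; leave the rest untouched."""
    if len(mask) > len(chunk):
        raise ValueError("mask is longer than the data chunk")
    head = bytes(c ^ m for c, m in zip(chunk, mask))
    return head + chunk[len(mask):]
```

With the 0x100-byte stack chunk and the 0x47 mask bytes in hand, `apply_xor_mask(stack_chunk, mask)` gives the value the malware ends up with. Because XOR is its own inverse, applying the same mask twice restores the original chunk.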

Immediately after this, the malware proceeds to initialize 0x100 bytes on the stack:

Now this starts to look familiar. Here’s the mystery solved: what is this chunk of data that is so carefully protected? That’s the RC4 key! A very long one — exactly 0x100 bytes.

Pseudo-code of RC4 Key Scheduling:

for i from 0 to 255
    S[i] := i
endfor
j := 0
for i from 0 to 255
    j := (j + S[i] + key[i mod keylength]) mod 256
    swap values of S[i] and S[j]
endfor

This is the exact loop implemented in the malware:

To ensure that the malware did not introduce any sneaky arithmetic into the loop, I wrote a quick Python script to perform the KSA and verified the output against the malware’s result.

S = bytearray(0x100)
with open("key", "rb") as finput:
    key = bytearray(finput.read())
for i in range(0, 0x100):
    S[i] = i
j = 0
for i in range(0, 0x100):
    j = (j + S[i] + key[i]) % 0x100
    t = S[i]
    S[i] = S[j]
    S[j] = t

output from my script
generated by malware

Perfect.

Next we are looking for the PRGA that generates the keystream for decrypting data. But before we find it, we see that the malware prepares another 0x100 bytes of data on the stack.

Looking onwards in the malware code, we can find the RC4 PRGA logic here:

Here’s the RC4 PRGA pseudo-code from Wikipedia:

i := 0
j := 0
while GeneratingOutput:
    i := (i + 1) mod 256
    j := (j + S[i]) mod 256
    swap values of S[i] and S[j]
    K := S[(S[i] + S[j]) mod 256]
    output K
endwhile

Then comes the surprise. Instead of using a byte from the S array as the key to XOR with the data, the malware chose to use another array as the key pool instead! This is the purpose of the second 0x100-byte chunk of data prepared.

Once again, we can reproduce and check this with a quick python script like this:

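# S is the state array produced by the KSA script above.
# S_1 is the second 0x100-byte table (the key pool) dumped from the stack.
with open("xwizard.dtd", "rb") as dtd_input:
    packed_file = dtd_input.read()
ciphertext = bytearray(packed_file[0x5F:0x5F+0x40])
plaintext = bytearray()
i = 0
j = 0
for c in ciphertext:
    i = (i + 1) % 0x100  # wrap the index so inputs longer than 0xFF bytes also work
    j = (j + S[i]) % 0x100
    t = S[i]
    S[i] = S[j]
    S[j] = t
    k = S[(S[i] + S[j]) % 0x100]
    final_K = S_1[k]  # the keystream byte comes from the key pool, not from S
    plaintext.append(final_K ^ c)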

Tada~ Successful decryption, just like what the malware did.

From here onwards, everything clicks into place. Putting together all the observations from debugging the malware, we have to keep in mind the following order of decryption:

1. 0x40 bytes DOS header

2. 0xF8 bytes PE header

3. 0x28 bytes section header

4. 0x?? bytes section (read virtual/raw* size from section header)

5. Repeat 3,4 until all sections are decrypted

There is a little check for the section’s virtual size and raw size as defined in the decrypted section header, and the malware chooses the smaller number to use as the number of bytes to decrypt (look at the instruction cmovbe ecx, [esi] in the screenshot below). Because a stream cipher is used, the size and order of the data going into the decryption routine have to be correct, else the results will be gibberish. It sure does feel like the author is putting little traps to deter analysts trying to defeat this malware’s decryption.
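The decryption order above can be sketched as one stateful pass over the ciphertext. This is only an illustration and not my actual unpacking script: the names `KeyPoolRC4` and `unpack_stream` are made up, and the sketch assumes the encrypted chunks sit back to back in one blob, which is a simplification of where they really live in the DTD.

```python
import struct


class KeyPoolRC4:
    """RC4 variant: the PRGA output indexes a second 256-byte table (the key pool)."""

    def __init__(self, key: bytes, key_pool: bytes):
        if len(key_pool) != 256:
            raise ValueError("key pool must be 256 bytes")
        S = list(range(256))
        j = 0
        for i in range(256):                      # standard KSA
            j = (j + S[i] + key[i % len(key)]) % 256
            S[i], S[j] = S[j], S[i]
        self.S, self.pool = S, bytes(key_pool)
        self.i = self.j = 0                       # state persists across calls

    def crypt(self, data: bytes) -> bytes:
        S, out = self.S, bytearray()
        for c in data:
            self.i = (self.i + 1) % 256
            self.j = (self.j + S[self.i]) % 256
            S[self.i], S[self.j] = S[self.j], S[self.i]
            k = S[(S[self.i] + S[self.j]) % 256]
            out.append(c ^ self.pool[k])          # key byte taken from the pool
        return bytes(out)


def section_size(section_header: bytes) -> int:
    """Smaller of VirtualSize (offset 8) and SizeOfRawData (offset 16)."""
    virtual_size, _vaddr, raw_size = struct.unpack_from("<III", section_header, 8)
    return min(virtual_size, raw_size)


def unpack_stream(blob: bytes, key: bytes, key_pool: bytes):
    """Decrypt DOS header, PE header, then (section header, section) pairs in order."""
    rc4 = KeyPoolRC4(key, key_pool)
    pos = 0

    def take(n: int) -> bytes:
        nonlocal pos
        chunk = rc4.crypt(blob[pos:pos + n])
        pos += n
        return chunk

    dos = take(0x40)
    pe = take(0xF8)
    num_sections = struct.unpack_from("<H", pe, 6)[0]  # FileHeader.NumberOfSections
    sections = []
    for _ in range(num_sections):
        header = take(0x28)
        sections.append((header, take(section_size(header))))
    return dos, pe, sections
```

Keeping `i`, `j` and `S` alive between calls is the whole point: decrypting one chunk with the wrong length shifts the keystream for everything after it.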

I’ve included my unpacking script on GitHub. Unfortunately it cannot be a master solution for all samples of this malware, since there is a need to recover the RC4 key and the key pool (S_1) from memory because of the heavy usage of the stack to re-assemble these data.

Now we can look into the payload’s features.

I’m going to refer to this payload as “{79828CC5–8979–43C0–9299–8E155B397281}.dll” — this is the name of the DLL given by its author.

When we touch a new malware, everything is an unknown — how would we know where to start looking? What I usually do is to look at the imports and pick up interesting API functions to set breakpoints on. This doesn’t always work — for example if the malware performs dynamic resolution (with LoadLibrary and GetProcAddress). Fortunately for this case, the import table looks long enough for us to get started with this approach.

In order for IDA to load the sections correctly, I would usually fix the section tables by copying the virtual values into the raw values:
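I do this fix by hand, but it can be scripted too. Below is a minimal sketch that assumes the dump is laid out the way it was mapped in memory; `fix_section_table` is a hypothetical helper name. It walks the section table and copies VirtualSize and VirtualAddress over SizeOfRawData and PointerToRawData.

```python
import struct


def fix_section_table(image: bytes) -> bytes:
    """Return a copy of a memory-dumped PE whose raw section values mirror the virtual ones."""
    pe = bytearray(image)
    e_lfanew = struct.unpack_from("<I", pe, 0x3C)[0]
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    num_sections = struct.unpack_from("<H", pe, e_lfanew + 6)[0]
    opt_header_size = struct.unpack_from("<H", pe, e_lfanew + 20)[0]
    table = e_lfanew + 24 + opt_header_size       # section table follows the optional header
    for n in range(num_sections):
        off = table + n * 0x28                    # each section header is 0x28 bytes
        virtual_size, virtual_addr = struct.unpack_from("<II", pe, off + 8)
        # SizeOfRawData is at +16 and PointerToRawData at +20
        struct.pack_into("<II", pe, off + 16, virtual_size, virtual_addr)
    return bytes(pe)
```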

The import and export tables would then be parsed correctly:

a happy import table
a happy export table

When these are done, IDA is happy too.

We can start to identify interesting APIs.

Evergreen interesting API that I always put breakpoints on:

- VirtualAlloc

Possibly related to gathering victim machine information:

- GetUserNameW

- GetAdaptersInfo

- GetDriveTypeW

- GetComputerNameW

Possibly for decrypting stuff:

- CryptGetHashParam

- CryptCreateHash

- CryptAcquireContextW

- CryptHashData

- CryptImportKey

Possibly for managing persistency:

- RegQueryInfoKey

- RegOpenKeyExW

- RegCreateKeyExW

- RegSetValueExW

- CreateMutexW

File interactions are always interesting:

- WriteFile

- CreateFileW

- ReadFile

Resource section is often a place for hiding encrypted stuff:

- FindResourceExW

- LoadResource

- SizeofResource

These hint at interesting capabilities of the malware:

- ShellExecuteW

- CreateProcessW

- CreateThread

Also, notice that among the imports there are no network-related APIs (and we know there should be some, since the malware did DNS queries), so there should be another set of APIs being loaded at runtime.

Let’s use API Monitor to see how many of the above API predictions are correct.

We can be certain that there are network activities:

Looks like I did not get hits on the other API functions, yet. Now we can delve a little deeper, starting with VirtualAllocs.

The first VirtualAlloc that gets a hit:

Following that is a call to a function that we are going to find out to be a decryption and decompression routine.

Does this look familiar? Yes!

The encrypted data is found in the rdata section, with the size of the data being pushed onto the stack as one of the parameters for decryption/decompression. The 0x20 bytes RC4 key is also prepared on the stack (no surprise). And finally, the RC4 algorithm used here is the standard one.

The output from the RC4 decryption starts with “5D”, which is the usual first byte of an LZMA stream (it is the properties byte for the default lc=3, lp=0, pb=2 settings).

Fingers-crossed that this is a standard LZMA decompression… and bingo!

from Crypto.Cipher import ARC4
import lzma

with open("encrypted_rdata_1005D2F8", "rb") as enc_datafile:
    ciphertext = enc_datafile.read()

key = bytearray(b"\x23\xb6\xb4\x67\xa3\x03\x26\x02\xb6\x87\x5c\x51\xce\x69\x39\x7d\x58\xf3\xc0\xba\xfb\xbc\x63\xbd\xef\x0a\x4c\x41\xa9\xb8\x98\x71")
rc4 = ARC4.new(key)
plaintext = rc4.decrypt(ciphertext)

# The first 5 bytes hold the LZMA properties; the raw stream follows.
# Note: _decode_filter_properties is a private helper of the lzma module.
props = lzma._decode_filter_properties(lzma.FILTER_LZMA1, plaintext[0:5])
decompressor = lzma.LZMADecompressor(lzma.FORMAT_RAW, filters=[props])
with open("decrypted_rdata_1005D2F8", "wb") as dec_datafile:
    dec_datafile.write(decompressor.decompress(plaintext[5:]))
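On why a leading 0x5D hints at LZMA: the first byte packs three parameters as (pb * 5 + lp) * 9 + lc, and the next four bytes hold the dictionary size in little-endian. A small sketch of that arithmetic, with a helper name of my own choosing:

```python
import struct


def parse_lzma_props(header: bytes):
    """Decode the 5-byte LZMA properties block into (lc, lp, pb, dict_size)."""
    if len(header) < 5:
        raise ValueError("need at least 5 bytes")
    props = header[0]
    if props >= 9 * 5 * 5:
        raise ValueError("not a valid LZMA properties byte")
    lc = props % 9
    lp = (props // 9) % 5
    pb = props // 45
    dict_size = struct.unpack_from("<I", header, 1)[0]
    return lc, lp, pb, dict_size
```

For 0x5D this gives lc=3, lp=0, pb=2, which are the defaults most LZMA encoders use.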

Notice that the PE header of the decrypted file looks like there are some missing values e.g. the section headers.

For ease of following the rest of this post, note that the decrypted rdata PE starts at address 0x42E0000 in my instance. From this point on, whenever I mention a function at an address larger than 0x42E1000, I am referring to a function within this decrypted rdata PE.

Let’s get back to it.

The first thing the malware does after decryption/decompression is to dynamically load libraries that are defined within the part of the decrypted file that is supposed to be the “imports table”.

This is followed by a long series of function calls that load APIs and fix the offsets within memory (because of memory relocation?).

These calls ended abruptly (the entire code block is too big for IDA to display it in graph form) at this part of the code:

Which eventually led to the execution of a function beginning at 0x10056030 that reads 4 pieces of data from the resource section of the malware, namely “100”, “2”, “1” and “200”.

Using Resource Hacker we can check what is in the resource section, to help us make sense of the function during analysis.

We don’t find a resource named “100” from our binary.

Resource “2” contains the domain names and ports as observed earlier in this post.

Resource “1” contains two visible chunks of data. The first looks like some encoded/encrypted data, and the second is a registry keyname.

Resource “200” reminds me of the LZMA header.

Once we dive into the function at 0x10056030, we notice that it calls into code that is in the RC4/LZMA decrypted-decompressed PE in the rdata that we handled earlier on.

As we proceed, we will also find that the code within the rdata is generously padded with junk code and ROP-style jumps, i.e. return addresses are pushed onto the stack and a jump is made by using the “ret” opcode. Such as this:

At this point, perhaps you would ask: how does the malware know what exact value to push onto the stack such that it returns to the correct code? After all, in real ROP attacks, the return values are manipulated with ROP chains specially crafted to read EIP and other addresses to achieve outcomes. But this is not the case here. The return addresses appear to be hardcoded in the instructions.

Remember that we saw a very long sequence of offset-fixing just after decryption/decompression of the rdata PE? It turns out that the “offset” values that were fixed are the operands of the PUSH instructions that set the return addresses!
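To get a feel for how often this push-then-ret trick appears, a crude byte scan helps. The sketch below only matches the tightest form, a PUSH imm32 (opcode 68) immediately followed by RET (opcode C3). That adjacency is an assumption: junk can sit between the two, and raw byte matching without a disassembler will give false positives. `find_push_ret` is a hypothetical helper.

```python
import struct


def find_push_ret(code: bytes, base: int = 0):
    """Yield (address, target) for each 'PUSH imm32; RET' byte pattern: 68 xx xx xx xx C3."""
    pos = code.find(b"\x68")
    while pos != -1 and pos + 6 <= len(code):
        if code[pos + 5] == 0xC3:
            target = struct.unpack_from("<I", code, pos + 1)[0]
            yield base + pos, target           # the pushed value is where RET will land
        pos = code.find(b"\x68", pos + 1)
```

Run it over a dump taken after the offset-fixing has finished, otherwise the pushed operands are still the unfixed values.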

On junk code: some of it forms code blocks that begin and end with complementary jumps e.g. “JNZ and JZ”, “JNS and JS”. Upon further look, we would notice that the execution flow ends up at the same place after the jump, which means that if we are able to identify these code blocks, we can safely ignore them.

Someone at Check Point already wrote a really nice Python plugin for Cutter and radare2 back in 2019 to clean up these junk blocks: https://research.checkpoint.com/2019/deobfuscating-apt32-flow-graphs-with-cutter-and-radare2/

Unfortunately not all of the junk code huddles together in neat blocks, so we would still need to battle against some of it.
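The complementary-jump idea rests on a detail of x86 encoding: short conditional jumps are opcodes 70 to 7F, and each condition and its opposite differ only in the lowest bit (JZ is 74, JNZ is 75, JS is 78, JNS is 79). The sketch below looks for the simplest possible case, two such jumps placed back to back that resolve to the same target. This is a simplification I chose for illustration. In the sample the jumps bracket a block of junk, so a real cleaner needs a proper disassembler like the plugin linked above. The function name is hypothetical.

```python
def find_complementary_jumps(code: bytes, base: int = 0):
    """Return (address, target) for adjacent short Jcc pairs such as 'JZ x; JNZ x'."""
    hits = []
    for pos in range(len(code) - 3):
        op1, rel1, op2, rel2 = code[pos:pos + 4]
        both_jcc = 0x70 <= op1 <= 0x7F and 0x70 <= op2 <= 0x7F
        if not both_jcc or (op1 ^ op2) != 1:   # opposite conditions differ in bit 0 only
            continue
        # rel8 is signed and relative to the end of each 2-byte instruction
        target1 = pos + 2 + (rel1 - 256 if rel1 > 127 else rel1)
        target2 = pos + 4 + (rel2 - 256 if rel2 > 127 else rel2)
        if target1 == target2:                 # one of the two is always taken
            hits.append((base + pos, base + target1))
    return hits
```

Since one of the two jumps is always taken, everything between the pair and the shared target is dead code.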

Let’s get back to understanding what the malware does with the content read from the resource sections.

We do not have any resource named “100” within our binary. All we know is that the malware does not expect any NULL bytes within the data, and that the content read is passed to the function at 0x432AA30. We can try to deduce what this function does from subsequent calls to it, and then come back to predict what content a resource named “100” would hold.

Arguments to call subroutine within decrypted rdata at 0x432AA30:

It looks like the first argument is the name of the configuration, while the second and third arguments are the values and size respectively.

Notice that the configuration “{12C044FA-A4AB-88A2–32C3451476CE}” holds a pointer to a function at 0x10056650. This function looks like this:

That is a very nice shape of a switch-table. Let’s see what that looks like in the decompiler:

This function expects and checks some input value. If the first value is a 2, then the malware exits. If the first value is a 4, then a new copy of the malware process is started. Nothing very fancy here.

After everything is done with the resource section, the malware continues to execute code within rdata, at function 0x04329130. (In fact, the decrypted PE within rdata is the main component of this malware from this point on.)

There is a whole lot of merry-go-round in there, before finally ending up at a decent-looking function at 0x43B4B0A. There is a CreateThread call there too.

It gets a little boring at this point because there is just so much anti-analysis stuff. So once again, I fell back to using API calls to identify interesting functions.

Remember that we are still looking for network communications. So here is a function that seems to load ws2_32’s APIs:

Since we know that the malware is going to do some DNS queries, we can put a breakpoint on ws2_32’s API calls that would perform domain name resolutions, for example getaddrinfo and gethostbyname. The usual APIs for sending and receiving data (send and recv respectively) should also be watched.

Turns out that the malware uses getaddrinfo, and a hit on this breakpoint leads us to the function at 0x43B0120:

Here is another anti-analysis technique. The API is called through a small trampoline function:

trampoline function to call an API

Continuing onwards, we would see a send:

First the malware sends out 0x8 bytes of data, followed quickly by 0x4C bytes and then a variable-length chunk that changes on every execution.

This is how the buffers looked in memory:

From Wireshark’s point of view:

At first look, these looked like encrypted data, which shouldn’t surprise us.

Tracing these turned out to be a very frustrating task, involving hardware breakpoints, a lot of VM reverts and a lot of netcatting with experimental values. So as not to bore you all, I’ll skip directly to the answer!

Standard RC4 is used to encrypt the traffic and we can see the encryption loop right here:

The RC4 key used to encrypt/decrypt the traffic is communicated between the malware and the C2 via the first 8 bytes sent in each communication.

The structure of the data communicated is as follows:

[QWORD RC4key]

[DWORD unknown] [WORD magicbytes] [DWORD unknown] [DWORD size1] [DWORD size2] [DWORD size3] [WORD datatype] [24bytes id_1] [24bytes id_2] [DWORD checksum]

The “magicbytes” field is fixed to 0x69FD. The malware checks for the presence of this value before proceeding.

There are 3 fields reserved for sizes, and they indicate the length of the next chunk of data to be expected.

A 2-byte value indicates the type of data that follows. 0x0 implies data is in plaintext, 0x1 means zlib-deflate compressed.

There are two long fields, each of 24 bytes. The first appears to be filled with a value randomly generated by the malware. The second is received from the C2, presumably random too.

The last 4 bytes are a CRC checksum.
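The header layout above adds up to exactly 0x4C bytes, which matches the second send. Here is a minimal parsing sketch under a few assumptions that I have not spelled out in the post: fields are little-endian, the 8-byte value is used directly as the RC4 key, and the keystream starts at the first header byte. The field names `unknown1` and `unknown2` and the function names are placeholders. The checksum is unpacked but not verified, since which bytes it covers is not described here.

```python
import struct

HEADER = struct.Struct("<IHIIIIH24s24sI")  # 0x4C bytes; little-endian is an assumption
MAGIC = 0x69FD
FIELDS = ("unknown1", "magic", "unknown2", "size1", "size2", "size3",
          "datatype", "id_1", "id_2", "checksum")


def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling followed by keystream generation."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    out = bytearray()
    for c in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(c ^ S[(S[i] + S[j]) % 256])
    return bytes(out)


def parse_packet(packet: bytes) -> dict:
    """Split off the 8-byte key, decrypt the 0x4C-byte header and unpack its fields."""
    key, body = packet[:8], packet[8:8 + HEADER.size]
    if len(body) != HEADER.size:
        raise ValueError("packet too short")
    header = dict(zip(FIELDS, HEADER.unpack(rc4(key, body))))
    if header["magic"] != MAGIC:
        raise ValueError("magic bytes mismatch")
    return header
```

Feeding it the first 0x54 bytes of a captured stream is a quick way to check whether the byte-order and key assumptions hold for a given sample.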

And we have finally come to the end of this post! It has been a very tiring but fulfilling week tearing apart this malware as an exercise for my brains.

Until next time!

===

[1] https://blogs.blackberry.com/en/2019/04/report-oceanlotus-apt-group-leveraging-steganography

[2] https://ti.qianxin.com/blog/articles/oceanlotus-targets-chinese-university/

[3] https://www.welivesecurity.com/wp-content/uploads/2018/03/ESET_OceanLotus.pdf

~~

Asuna | https://twitter.com/AsunaAmawaka
