Can Assemblyline beat HackTheBox Business CTF 2023 Forensic Challenges? (1/3)

Assemblyline Blog Entry #4

gdesmar
7 min read · Oct 23, 2023

⚠️⚠️⚠️ CAUTION ⚠️⚠️⚠️

This document describes malware analysis in Assemblyline. Malware analysis must be performed in an isolated environment.

In the previous Assemblyline blog entry, Static Analysis Showcase, my colleague Kevin dove into a sample from a recent OneNote campaign to illustrate what analyzing a malicious sample in the Assemblyline UI looks like for users!

The following is the first part of a write-up detailing the solutions for forensic challenges that were part of the HackTheBox Business CTF 2023 competition. As with most CTF competitions, time is of the essence, and having the right tools can greatly help your chances of success. I wanted to see how Assemblyline could be leveraged to solve those challenges.

Forensics #1: Red Miners

This first forensic challenge was the easiest one to tackle with Assemblyline. It was simply a ZIP file containing a shell script which, when sent to the Extract service, was extracted like so:

Extract’s results

Analysis by Assemblyline stopped there and we can see the file tree at the bottom of the Submission Details/Report page:

File tree

We can now open the shell script in the File Viewer and start viewing its contents:

Snippet of Bash script content in the File Viewer

Within the file contents are three separate places where we can find base64-encoded strings that will each reveal part of the flag:

1. We can find cGFydDI9Il90aDMxcl93NHkiCg== within the local url variable, which decodes to part2="_th31r_w4y".

Second part of the flag, base64-encoded

2. We can find X3QwX200cnN9Cg== in the value assigned to the dest variable, which decodes to _t0_m4rs}.

Third part of the flag, base64-encoded

3. And finally, cGFydDE9IkhUQnttMW4xbmciCg== seen in this cronjob command will give us part1="HTB{m1n1ng".

First part of the flag, base64-encoded

Reading those parts in the implied order of “part1 + part2 + remaining data” gives the final flag for this challenge: HTB{m1n1ng_th31r_w4y_t0_m4rs}.
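The decoding and reassembly above can be reproduced in a few lines of Python. This is only a sketch of the manual steps, not something Assemblyline ran for us; the variable names are our own, and the ordering rule (named parts first, leftover data last) is the "implied order" assumption described above.

```python
import base64
import re

# The three base64 strings quoted above, in the order they appear in the script.
encoded_strings = [
    "cGFydDI9Il90aDMxcl93NHkiCg==",
    "X3QwX200cnN9Cg==",
    "cGFydDE9IkhUQnttMW4xbmciCg==",
]

named_parts = {}  # decoded partN="..." assignments, keyed by variable name
remainder = ""    # decoded text that is not a partN="..." assignment

for encoded in encoded_strings:
    # Each string decodes to ASCII text followed by a newline, hence strip().
    decoded = base64.b64decode(encoded).decode("ascii").strip()
    match = re.fullmatch(r'(part\d+)="(.*)"', decoded)
    if match:
        named_parts[match.group(1)] = match.group(2)
    else:
        remainder += decoded

# Assumed order: part1, part2, then whatever data was not named.
flag = "".join(named_parts[name] for name in sorted(named_parts)) + remainder
print(flag)
```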

Forensics #2: Scripts and Formulas

The second forensic challenge was more interesting than the first. Instead of being a classic “find the flag” challenge, we had to answer a list of questions.

The first question was:

What program is being copied, renamed, and what is the final name? (Eg: notepad.exe:picture.jpeg)

Let’s start by sending the ZIP file to Assemblyline. Like the earlier challenge “Red Miners”, the Extract service gives us the internal files from the archive. We see a Visual Basic script (VBScript or VBS) named invoice.vbs, a Windows shortcut file named invoice_01.lnk, and another ZIP file named logs.zip, which Extract recursively opened to reveal a very long list of Windows Event Logs. Below is the file tree view found at the bottom of the Submission Details/Report page for the submitted file.

Submission file tree

Above the file tree, we can see the Indicators of Compromise section:

Indicators of Compromise

It is interesting to see the above domain and URI safelisted, but we’ll revisit this later. For now, we can look at the usual entry point: the Windows shortcut file. From the Characterize service results, we can see that it assigned the Windows shortcut file a verdict of ‘Highly Suspicious’ as shown by the orange ‘H’ icon at the top of the result. You can visit our documentation for a more in-depth explanation of result verdicts.

Characterize’s results

We can see that the main reason the Characterize service found the file suspicious was the target of the shortcut: powershell.exe.

The Characterize service also found that this file used a suspicious icon (wordpad.exe in this case), which is usually a way to deceive the users to get them to click on the file and open it.

In the Metadata extracted by ExifTool section next to the Command Line Arguments key, we can see that the PowerShell arguments are -Nop -sta -noni -w hidden -c cp C:\Windows\System32\cscript.exe .\calc.exe;.\calc.exe Invoice.vbs. That gives us the answer to the first question:

What program is being copied, renamed, and what is the final name? (Eg: notepad.exe:picture.jpeg)
> cscript.exe:calc.exe
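For readers who want to pull that answer out programmatically, here is a minimal sketch that parses the command-line arguments reported by ExifTool. The find_copy helper and its regular expression are our own illustration, not part of the Characterize service, and they only handle the simple `cp <source> <destination>` form seen here (cp being PowerShell's alias for Copy-Item).

```python
import ntpath
import re

# Command-line arguments as shown in the ExifTool metadata section.
command_line = (
    r"-Nop -sta -noni -w hidden -c cp C:\Windows\System32\cscript.exe "
    r".\calc.exe;.\calc.exe Invoice.vbs"
)


def find_copy(arguments):
    """Return (source name, destination name) of the first `cp` call, or None."""
    # The destination stops at whitespace or at the `;` that ends the command.
    match = re.search(r"\bcp\s+(\S+)\s+([^\s;]+)", arguments)
    if match is None:
        return None
    source, destination = match.groups()
    # ntpath handles Windows-style paths regardless of the host OS.
    return ntpath.basename(source), ntpath.basename(destination)


copied = find_copy(command_line)
if copied:
    print(":".join(copied))
```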

The next question is:

What is the name of the function that is used for deobfuscating the strings, in the VBS script? (Eg: funcName)

It is now time to look at the VBScript named invoice.vbs. The content of the script is highly obfuscated and mainly composed of three functions that can be easily looked at through the Assemblyline File Viewer:

Snippet of highly obfuscated script
Main entrypoint of the script

We can follow the logic from the Main function calling ZbVxxAHCsiTnKpIJ, which then calls LLdunAaXwVgKfowf multiple times to deobfuscate the strings.

What is the name of the function that is used for deobfuscating the strings, in the VBS script? (Eg: funcName)
> LLdunAaXwVgKfowf

Next question!

What program is used for executing the next stage? (Eg. notepad.exe)

The tool of choice to analyze the VBScript is the ViperMonkey service, based on the tool of the same name by decalage2. That GitHub project has been inactive since 2021, so we use a fork of Kirk Sayre’s fork.

In ViperMonkey’s results, we can see a call to objShell.Run, which is particularly useful and shows the use of powershell.exe:

objShell.Run result section

What program is used for executing the next stage? (Eg. notepad.exe)
> powershell.exe

Sadly, the ViperMonkey service does not extract that script. Ideally, it would be extracted and fed to more specialized tools (like the Overpower service). For now, we’ll keep going and process the script by eyeballing it. What’s next?

What is the Spreadsheet ID the malicious actor downloads the next stage from? (Eg. U3ByZWFkU2hlZXQgSUQK)

Using the base64 utility on Linux, we can get the URL string:

$ echo "aHR0cHM6Ly9zaGVldHMuZ29vZ2xlYXBpcy5jb20vdjQvc3ByZWFkc2hlZXRzLzFIcEI0R3FxWXdJNlg3MXo0cDJFSzg4Rm9KanJzVzJES2JTa3gtcm81bFFRP2tleT1BSXphU3lEVXBqU2Y3UjFsMWRRb2hBNVF2OUVkeVdBM0tCT01jMFUmcmFuZ2VzPVNoZWV0MSFPMzcmaW5jbHVkZUdyaWREYXRhPXRydWU=" | base64 -d;
https://sheets.googleapis.com/v4/spreadsheets/1HpB4GqqYwI6X71z4p2EK88FoJjrsW2DKbSkx-ro5lQQ?key=AIzaSyDUpjSf7R1l1dQohA5Qv9EdyWA3KBOMc0U&ranges=Sheet1!O37&includeGridData=true

If the script had been extracted from ViperMonkey, maybe Overpower would have been able to give us the clear-text link…

What is the Spreadsheet ID the malicious actor downloads the next stage from? (Eg. U3ByZWFkU2hlZXQgSUQK)
> 1HpB4GqqYwI6X71z4p2EK88FoJjrsW2DKbSkx-ro5lQQ

The next question can be answered without any more digging:

What is the Sheet Name and Cell Number that houses the payload? (Eg: Shee1:A1)
> Sheet1:O37
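Both of the last two answers sit in the decoded URL, so they can also be read out with Python's standard URL parser. This is a small sketch: it assumes the spreadsheet ID is the last path segment and that the ranges parameter uses the `Sheet!Cell` notation, which is what the decoded URL above shows.

```python
from urllib.parse import parse_qs, urlsplit

# The URL obtained from the base64-decoded string above.
url = (
    "https://sheets.googleapis.com/v4/spreadsheets/"
    "1HpB4GqqYwI6X71z4p2EK88FoJjrsW2DKbSkx-ro5lQQ"
    "?key=AIzaSyDUpjSf7R1l1dQohA5Qv9EdyWA3KBOMc0U"
    "&ranges=Sheet1!O37&includeGridData=true"
)

parts = urlsplit(url)

# Assumption: the spreadsheet ID is the last segment of the path.
spreadsheet_id = parts.path.rstrip("/").split("/")[-1]

# The ranges parameter holds "<sheet name>!<cell>".
sheet_name, cell = parse_qs(parts.query)["ranges"][0].split("!", 1)

print(spreadsheet_id)
print(f"{sheet_name}:{cell}")
```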

Interestingly, the URL extracted in the Indicators of Compromise section did not include the ‘Cell Number’. On top of that, the URL was flagged as safelisted. From the work done on this CTF so far, we know that this URL should not be safelisted. Our system safelist was treating every URL under the sheets.googleapis.com domain as safelisted, which is incorrect, as this CTF challenge illustrates! We changed the system safelist so that it strictly covers only the domain sheets.googleapis.com itself, not the URLs under it. After this modification, the Indicators of Compromise section looked like this:
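To make the difference between the two safelisting behaviours concrete, here is a toy illustration. It is not Assemblyline's safelist format or code: the SAFELIST dictionary and both functions are hypothetical and only model "any URI on a safelisted domain" versus "exact match per indicator type".

```python
from urllib.parse import urlsplit

# Hypothetical safelist entries, keyed by indicator type (not Assemblyline's format).
SAFELIST = {"domain": {"sheets.googleapis.com"}, "uri": set()}


def is_safelisted_broad(indicator_type, value):
    """Too permissive: a URI counts as safelisted when its host is a safelisted domain."""
    if indicator_type == "uri":
        return urlsplit(value).hostname in SAFELIST["domain"]
    return value in SAFELIST[indicator_type]


def is_safelisted_strict(indicator_type, value):
    """Stricter: an indicator is safelisted only on an exact match for its own type."""
    return value in SAFELIST[indicator_type]


uri = "https://sheets.googleapis.com/v4/spreadsheets/1HpB4GqqYwI6X71z4p2EK88FoJjrsW2DKbSkx-ro5lQQ"

print(is_safelisted_broad("uri", uri))    # the URI is hidden from the analyst
print(is_safelisted_strict("uri", uri))   # the URI stays visible as an indicator
print(is_safelisted_strict("domain", "sheets.googleapis.com"))
```

With the strict variant, the domain itself can remain safelisted while a specific spreadsheet URI still shows up for review.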

Indicators of Compromise without safelisted URI

The next question takes more of an incident response turn, as it relates to what happened in the logs found in the nested ZIP file logs.zip. We could not use Assemblyline to keep digging into that VBS file, as the next stage’s payload cannot be fetched since this Google Spreadsheet is no longer accessible.

What is the Event ID that relates to PowerShell execution? (Eg: 5991)

Looking at the list of files from logs.zip, there are two PowerShell Windows Event logs:

Extracted PowerShell log files

Viewing Windows Event log files in Assemblyline does not surface everything the logs contain. Assemblyline won’t replace your Event Viewer: it shows only the events of interest, not all of them. Nonetheless, the Sigma service does find the PowerShell execution in the Windows/System32/Winevt/Logs/Windows PowerShell.evtx file:

PowerShell event

We can also find more information in the Windows/System32/Winevt/Logs/Microsoft-Windows-PowerShell%4Operational.evtx file, including the Event ID:

PowerShell event with EventID

That gives us our next answer:

What is the Event ID that relates to PowerShell execution? (Eg: 5991)
> 4104

And finally, we get to the last question:

In the final payload, what is the XOR Key used to decrypt the shellcode? (Eg: 1337)

This can also be found in the Windows/System32/Winevt/Logs/Microsoft-Windows-PowerShell%4Operational.evtx file, in the not-so-pretty Sigma result:

PowerShell execution with obfuscation and XOR key

In the final payload, what is the XOR Key used to decrypt the shellcode? (Eg: 1337)
> 35
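For context, a single-byte XOR routine of the kind such payloads typically use looks like the sketch below. We are assuming the key is the decimal value 35 applied to every byte; the sample bytes are placeholders of our own, not the shellcode from the challenge.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same single-byte key."""
    return bytes(byte ^ key for byte in data)


XOR_KEY = 35  # assumption: the answer above, read as a decimal byte value

# Placeholder bytes for illustration only (not the challenge's shellcode).
sample = b"\x90\x90\xcc"

encrypted = xor_bytes(sample, XOR_KEY)
decrypted = xor_bytes(encrypted, XOR_KEY)  # XOR is its own inverse
print(decrypted == sample)
```

Because XOR with a fixed key is symmetric, the same function both encrypts and decrypts.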

After entering this last answer in the challenge’s prompt, the following flag was displayed: HTB{GSH33ts_4nd_str4ng3_f0rmula3_byp4ss1ng_f1r3w4lls!!}

We can confirm Assemblyline is able to beat some simple HackTheBox challenges. If you’re still reading, you’re obviously interested in a more complex example. You can head to my next blog post: Can HackTheBox Business CTF 2023 Forensic Challenges beat Assemblyline? (2/3).

All images unless otherwise noted are by the author.
