Faking your return address through Gadget and ROP
Skip the background if you want to keep your sanity; it is a meme.
Background
A few months ago, I teamed up with a stranger to develop the best private cheat for a certain game. The journey was tough and full of struggles but with enough determination, I thought I could do it all. That is until I received a fat ol’ ban yesterday.
As someone who just spent 7 months on this project, I was not about to abandon my amazing code base without a fight, so let the reversing commence.
But where to start? By gathering information, of course. The game binary is big, and so is its anti-cheat, “stub.dll”. There was no way we could comb through the entire obfuscated library to find the missing link. So we did what we hackers do best: collaborate.


I know this is not the most satisfying answer, but having worked in research, I can say there is amazing potential in collaborating toward a common goal. Someone out there is an expert in something, and lucky for us, we found the man with the info that cracked the case.

Now, as someone with very little Windows Internals knowledge, I can only think of a few ways to tell one function call from another. With the game’s anti-cheat operating in usermode, I cannot imagine it doing anything other than looking at the call stack. Let’s review some assembly.
Assembly 101
This serves as an introduction to some important concepts used in this post. Skip it if you already know your stuff.
CALL
A call instruction is effectively a “macro” consisting of two simpler steps. A CALL SomeFunc does just two things:
push ReturnAddress: push the address of the instruction right after the call
jmp SomeFunc: change EIP/RIP to the address of SomeFunc
RET / RET XX
Similarly, a RET or return is also a “macro”. A plain RET does one thing:
pop eip: pop the top of the stack into the instruction pointer, effectively a “jmp” to that address.
If a size is specified, such as RET 18h, one more step follows the pop:
add esp, 18h: increase the stack pointer, shrinking the stack (the stack “grows” downward, so adding to esp releases space). The size is usually the number of bytes of arguments that actually got pushed onto the stack, and it is used when the callee is responsible for cleaning up the stack.
If no size is specified, the callee is not responsible for cleaning up the stack and only pop eip is performed.
PUSH “EAX”
Similarly, a “macro” that does:
sub esp, 4: subtract 4 bytes (on 32-bit) from the stack pointer, effectively growing the stack.
mov [esp], eax: move the value being pushed to where the stack pointer now points.
POP “EAX”
Very similar to PUSH, but the opposite.
mov eax, [esp]: move the value on top of the stack into the destination of the pop.
add esp, 4: increase esp, shrinking the stack.
That is all you need to know about assembly to continue on with this blog post.
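If it helps to see those four “macros” as code, here is a minimal sketch of a toy 32-bit machine in plain C++. It is only a model I made up for illustration (ToyCpu, the addresses and the 5-byte call length are all assumptions), not how a real CPU is implemented, but the order of operations matches the breakdown above: RET pops the return address first and only then releases the argument bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy 32-bit machine: an instruction pointer, a stack pointer and stack memory.
// Hypothetical model for illustration only.
struct ToyCpu {
    std::uint32_t eip = 0;
    std::uint32_t esp;                 // byte offset of the top of the stack
    std::vector<std::uint32_t> mem;    // one 32-bit slot per 4 bytes

    explicit ToyCpu(std::uint32_t stack_bytes)
        : esp(stack_bytes), mem(stack_bytes / 4, 0) {}

    void push(std::uint32_t value) {   // sub esp, 4 ; mov [esp], value
        assert(esp >= 4);              // the toy stack must not overflow
        esp -= 4;
        mem[esp / 4] = value;
    }
    std::uint32_t pop() {              // mov dst, [esp] ; add esp, 4
        std::uint32_t value = mem[esp / 4];
        esp += 4;
        return value;
    }
    void call(std::uint32_t target, std::uint32_t insn_len) {
        push(eip + insn_len);          // push ReturnAddress
        eip = target;                  // jmp target
    }
    void ret(std::uint32_t arg_bytes = 0) {
        eip = pop();                   // pop eip
        esp += arg_bytes;              // then release callee-cleaned arguments
    }
};

// Push two stack arguments, call a function, and let it return with RET 8.
// Returns true if EIP and ESP end up where the caller expects them.
bool call_and_return_roundtrip() {
    ToyCpu cpu(64);
    const std::uint32_t esp_before = cpu.esp;
    cpu.eip = 0x1000;                  // pretend the call sits at 0x1000
    cpu.push(0xBBBB);                  // arg 2
    cpu.push(0xAAAA);                  // arg 1
    cpu.call(0x2000, 5);               // assume a 5-byte call instruction
    const bool in_callee = (cpu.eip == 0x2000);
    cpu.ret(8);                        // callee cleans both arguments
    return in_callee && cpu.eip == 0x1005 && cpu.esp == esp_before;
}
```

Calling call_and_return_roundtrip() from a main() should return true: after the RET 8, execution is back at the instruction following the call and the stack pointer is where it started.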
Gadget/ROP Chaining
LiveOverflow does a much better job at explaining the concept of ROP.
The Idea
The idea of “faking” the caller comes from how one would check who the caller is in the first place. Many debuggers can show something called the “call stack”. The call stack is a record of which function called which function, which called which function, and so on, to get to the current point. This is an example using x32dbg:

As you can see in the example, the chain started from ntdll into kernel32 into teamviewer, into kernelbase, etc. It goes from the bottom up and includes information such as the specific return address that will be jumped to upon a “RET” instruction. But how does x32dbg know that?
The answer: Look at the stack.
Remember what the CALL instruction does?

Yes, to know where to return to once “Ret” is executed, the call instruction must push the ReturnAddress onto the stack. Therefore, the stack should include all the return addresses of the entire call stack. Let’s take a look.

We can see that the return addresses are indeed on the stack, exactly where x32dbg said they would be (from the “Address” list). This is how many debuggers determine the call stack, and this is also how this particular game checks whether the previous caller is indeed an “authorized” caller, or so we think for now.
Now that we know that, the goal is very clear. We can just modify the address on the stack to point somewhere else that is “authorized”. Right? Well, not really. If we modify that address on the stack, then when the inevitable “Ret” is executed, it will not return to our function. In the worst case (which is also the most likely case), the instructions we return to are incompatible with our current register and stack values, which will inevitably crash the program.
The Attempt
So then, I had this great idea of using ROP to make a valid “authorized” piece of code that already exists in the game’s client make the call for me, abusing instructions such as:

Let’s see if the game’s current binary has the instructions (Gadget) we are looking for.



So now that we found the gadget we are looking for, I want to explain the idea behind this gadget and how I am going to use it.
The Gadget does two things for us. First, it will CALL the function at the address stored in eax, and second, it will return to its caller.
However, as we mentioned previously (and as LiveOverflow’s video explains), CALL and RET are just macros, so let’s break it down further. It will do:
0xb1e4d1: push 0xb1e4d3
0xb1e4d1: jmp eax
0xb1e4d3: pop eip
The first instruction is not very important, but the second and the third interact with inputs that we can influence. We can modify the eax register, and we can definitely push stuff onto the stack for the gadget to return to later (through pop eip). So the idea was to set eax to the function that we want to call so that the gadget can CALL it for us. Then we push the address of where we want execution to return to (our cheat’s function) onto the stack, so that once the target finishes executing, control comes back to us and code execution continues.
Sounds good on paper, let’s do it!
First, let’s see what a normal call looks like. This is the current way we call it in C++ and assembly.


Let’s go into explaining what is going on.
push ebp //Push base pointer
mov ebp, esp //establishing new stack frame - nothing too important
mov eax, dword ptr ds:[2AFB3388] //move "Base" variable into eax
push 1 //Push 7th arg onto the stack
push 0 //Push 6th arg onto the stack
push 0 //Push 5th arg onto the stack
push 0 //Push 4th arg onto the stack
push dword ptr ss:[ebp + C] //Push the 2nd arg of this function (Position pointer) onto the stack as the 3rd arg of the function we are about to call
add eax, 1BE660 //Base + fnIssueOrder
push dword ptr ss:[ebp + 8] //Push the 1st arg of this function (GameObjectOrder) onto the stack as the 2nd arg of the function we are about to call
call eax //call IssueOrder, which is also what pushes our function's return address onto the stack
pop ebp //resetting stack frame, nothing too important
ret 8 //return to caller, clean 8 bytes off of the stack (4 bytes for each argument)
One thing you might’ve noticed is that it did not do anything with the first argument, “this”. That is because the function we are calling is a “__thiscall”, therefore the first argument is actually passed using the ecx register.

The first argument of the function is actually a pointer to the player’s character. However, because this function is a member function of the player class that we created, the local player pointer is already inside the ecx register so we did not need to do anything there. If not, there would be an extra instruction like this.
mov ecx, obj_local_player //move local player's pointer into "this"
Now that we know what our function needs to look like, it is time to reconstruct it. The goal is to not use the call eax from our function but instead make the gadget we found earlier do it for us. In theory, we can just use the jmp instruction to jump to our gadget and then the gadget will execute the call. If the target function checks the caller, it will see our gadget’s return address on the stack, which is an authorized caller because it is the game’s own code. However, the issue mentioned earlier arises when our gadget returns. The gadget will pop an arbitrary address off of the stack and return there, which is a really bad idea. Unless… we push our own return address onto the stack so that the gadget will return back to us!
__asm {
push 1 //arg 7
push 0 //arg 6
push 0 //arg 5
push 0 //arg 4
push dword ptr ss : [ebp + 0xc] //arg3
push dword ptr ss : [ebp + 0x8] //arg2
mov ecx, LocalPlayer_Obj //this aka arg 1
mov eax, pIssueOrder //function for gadget to call
call $0 //call itself (push return address on the stack)
pop ebx //get EIP into ebx, abusing previous call
add ebx, 11 //eip + 11 lands at mov al, 1
push ebx //return address
jmp pGadget //jmp to gadget
mov al, 1 //return TRUE
pop ebp //Stack frame destroy
ret 8 //Clean 8 bytes off of the stack
}
Looks very similar to the previously generated function, just without a call, and it worked... if you consider the game instantly crashing “worked”.

Luckily, the game does a MiniDumpWriteDump for us upon the crash. Let’s open one up in WinDbg to see what caused the issue. You could use the command !analyze -v to analyze the crash dump.

1. Analyze command to start the crash dump analysis
2. The faulting IP, or the instruction that caused the crash
3. The reason why it crashed, in our case an access violation, which means we tried to access an address that is invalid.
4. The invalid address in memory that we tried to read from that caused the crash, in our case "2", which is DEFINITELY not a valid memory address.
So obviously whatever we did caused a big ol’ crash: an access violation from dereferencing the eax register ( [eax] ). Let’s look at the binary to see why that happened.

At first glance, to someone with experience, the movq xmm0 should be a dead giveaway that this is an operation that works with floating point.
mov eax, [esp + 110h + LocationPtr] //move a value pointed to by the stack pointer + some offset into the eax register
movq xmm0, qword ptr [eax] //dereference that value and then move it to the xmm0 register, also responsible for the crash.
This tells us that the instruction is looking for a float pointer of some sort, but instead it got 2, so there must be something wrong. Even ignoring the already-named LocationPtr, you can tell that some misalignment occurred on the stack. Can you guess the problem?
Let me share one more piece of info regarding the other argument we passed in besides the position. That other argument was Order, which for this particular call was Move = 2.

So now, we can make a good assumption that whatever happened to the stack shifted everything by 4 bytes, which moves the arguments over by one slot, turning Order into PositionPtr, which then crashes on an invalid address read. We can compare the stack of the function when called normally and when called using our gadget.

Now we understand what caused this stack misalignment. There is an extra address pushed onto the stack which offsets everything by 4 bytes. Exploring that extra pointer reveals the location of our mov al, 1 //return TRUE code. This means that the extra return address that we pushed onto the stack to get the gadget to return back to us is the core problem of our stack misalignment.

This is where it really got me thinking. We need to push the return address onto the stack so that the gadget will return back to us, but pushing the return address onto the stack is like adding an extra argument. This messes up everything.
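To make the off-by-one-slot problem concrete, here is a small sketch in plain C++ that models what the callee sees on entry. It is a simplification under my own assumptions: the addresses are made up, and the callee is modeled as reading Order at [esp+4] and Position at [esp+8] right at entry, whereas the real function reads them later through a larger offset after setting up its own frame.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical values standing in for real addresses.
constexpr std::uint32_t kOrderMove   = 2;          // Order argument (Move = 2)
constexpr std::uint32_t kPositionPtr = 0x00A0F000; // pointer to the position
constexpr std::uint32_t kOurReturn   = 0x10001234; // "mov al, 1" in our module
constexpr std::uint32_t kGadgetRet   = 0x00B1E4D3; // ret after the gadget's call eax

// What the callee sees on entry: slots[0] is [esp], slots[1] is [esp+4], ...
struct EntryStack {
    std::vector<std::uint32_t> slots;
    std::uint32_t order() const {      // value the callee treats as Order
        assert(slots.size() > 1);
        return slots[1];               // [esp+4]
    }
    std::uint32_t position() const {   // value the callee treats as Position
        assert(slots.size() > 2);
        return slots[2];               // [esp+8]
    }
};

// Build the entry view from pushes listed in execution order.
// The last push ends up on top of the stack, so the order is reversed.
EntryStack after_pushes(const std::vector<std::uint32_t>& pushes) {
    EntryStack s;
    s.slots.assign(pushes.rbegin(), pushes.rend());
    return s;
}

// Normal call: arguments pushed right to left, then CALL pushes the return address.
EntryStack normal_call() {
    return after_pushes({1, 0, 0, 0, kPositionPtr, kOrderMove, kOurReturn});
}

// First attempt: our own return address is pushed after the arguments,
// then the gadget's CALL EAX pushes its return address on top of that.
EntryStack first_attempt() {
    return after_pushes({1, 0, 0, 0, kPositionPtr, kOrderMove,
                         kOurReturn, kGadgetRet});
}
```

In this model, normal_call().position() is the real pointer, while first_attempt().position() comes out as 2, the Order value, which is exactly the bogus address from the crash dump.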
The True Solution
bool IssueOrder(GameObjectOrder Order, Vector3D* Position)
{
__asm {
push retme
push 1
push 0
push 0
push 0
push Position
push Order
push Gadget
jmp pIssueOrder
retme :
mov al, 1
}
}
Okay, let me explain. The mistake we made earlier was trying to emulate the CALL instruction. What we needed to do was push our return address before the arguments and not after them. This is because the function reads its arguments using offsets from the stack pointer (which points to the top of the stack). Therefore, pushing the return address before the arguments does not change the position of the arguments relative to the top.
Another mistake was trying to use a CALL EAX; RET gadget. There was a simpler solution using only a ret instruction gadget, which is abundant in any program. We will go into it a bit later.
The rest are simplifications that the Visual Studio inline assembler provides for us. Instead of working with ebp + offset to grab the argument, we can type the argument name directly into the assembly code and the offset will be calculated automatically. Similarly, we did not need to get the current EIP and then add an offset to calculate the return address. We can declare a label such as retme: and VS will take care of the rest for us. This allowed us to simplify the assembly code and get rid of possible noobie mistakes (thank you Billy Gates).
Lastly, let’s explain the use of a ret gadget. Initially, I attempted to look for a ret 18h gadget, which would clean the stack for us, since we use 7 arguments where 6 are on the stack (6 * 0x4 = 0x18) and 1, the this argument, is in the ecx register. However, after various crashes I realized that I had forgotten that __thiscall is a callee-cleans-up-the-stack calling convention. Therefore, a ret 18h should already exist in the function we are calling.

Therefore, we only need one of the most abundant instructions, a 0xC3, or in other words retn, to complete our code.

Note that we do not care whether the 0xC3 byte just happens to sit in the middle of some other instruction, as long as that 1 byte is in an executable page. I chose the first one, because I was lazy.
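As a rough illustration of how cheap such a gadget is to find, here is a sketch that scans a byte buffer for the first 0xC3. The buffer is a stand-in I made up for an executable section of the game (a real search would walk the module's executable pages, which is not shown here), and find_ret_byte, sample_code and kNotFound are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kNotFound = static_cast<std::size_t>(-1);

// Return the offset of the first 0xC3 (near RET) byte in `code`,
// or kNotFound if there is none. Instruction boundaries are ignored on purpose.
std::size_t find_ret_byte(const std::vector<std::uint8_t>& code) {
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (code[i] == 0xC3) return i;
    }
    return kNotFound;
}

// Hand-assembled sample bytes (hypothetical function body):
//   55              push ebp
//   8B EC           mov  ebp, esp
//   B8 C3 00 00 00  mov  eax, 0C3h   <- 0xC3 hiding inside an immediate
//   5D              pop  ebp
//   C3              ret
std::vector<std::uint8_t> sample_code() {
    return {0x55, 0x8B, 0xEC, 0xB8, 0xC3, 0x00, 0x00, 0x00, 0x5D, 0xC3};
}

// Usage sketch: the first hit is the byte inside the mov, not the real ret.
void check_sample() {
    assert(find_ret_byte(sample_code()) == 4);
}
```

The first hit lands at offset 4, inside the immediate of the mov, and that is fine for our purposes: if execution jumps to that byte, the CPU decodes it as a ret.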
Now, let’s explain.
push retme //Return address, where mov al, 1 is at
push 1 //arg 7
push 0 //arg 6
push 0 //arg 5
push 0 //arg 4
push Position //arg 3
push Order //arg 2
push Gadget //Gadget address, so when IssueOrder is finished, it will pop this off of the stack and return here, where it will then hit another return which will return back to retme.
jmp pIssueOrder //jump to where IssueOrder is
retme : //label for assembler to calculate return address
mov al, 1 //return values are put inside eax; al is just the lowest byte of the eax register. We move 1 here to indicate TRUE on return.
The trick is to push the return address of our function onto the stack before the arguments.
Stack View:
------------------------------
RetMe's Address <-------- ESP
Now we push the arguments onto the stack.
Stack View:
------------------------------
RetMe's Address
Arg 7
Arg 6
Arg 5
Arg 4
Arg 3
Arg 2 <-------- ESP
Then we will push the Gadget address onto the stack.
Stack View:
------------------------------
RetMe's Address
Arg 7
Arg 6
Arg 5
Arg 4
Arg 3
Arg 2
Gadget's Address <------- ESP
Then we jump to the function we want to call. When that function checks the address it is going to return to, it will see the Gadget’s Address, an authorized caller, and so we bypass the check. When the IssueOrder function finishes executing, its ret 18h pops the Gadget’s Address, moves the stack pointer past the 0x18 bytes of arguments, and then jumps to that popped Gadget’s Address.
Stack View:
------------------------------
RetMe's Address <------- ESP
Arg 7
Arg 6
Arg 5
Arg 4
Arg 3
Arg 2
Here at our Gadget, another ret instruction is executed, which pops RetMe’s Address off of the stack and jumps there, moving us back to our function’s retme label and executing the return TRUE statement.
mov al, 1
Note: The instructions that set up and tear down the stack frame are missing here because VS generates them automatically for us. If you want to know what they look like, it is more or less like this.
push ebp
mov ebp, esp
push retme
push 1
push 0
push 0
push 0
push Position
push Order
push Gadget
jmp pIssueOrder
retme :
mov al, 1
pop ebp
ret 8
The code ran perfectly fine and no crash occurred.
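For anyone who wants to replay the stack views above without a debugger, here is a minimal sketch in plain C++ that walks through the same sequence of pushes and returns. Everything in it is a model under my own assumptions: the addresses and the module range are made up, and the callee is reduced to "look at [esp], then ret 18h".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical addresses, chosen only for illustration.
constexpr std::uint32_t kGameBase   = 0x00400000; // start of the game module
constexpr std::uint32_t kGameEnd    = 0x01400000; // end of the game module
constexpr std::uint32_t kGadget     = 0x00B1E4D3; // a lone 0xC3 inside the game
constexpr std::uint32_t kIssueOrder = 0x005BE660; // target function
constexpr std::uint32_t kRetMe      = 0x10001234; // retme label in our module

struct Trace {
    std::uint32_t caller_seen;  // return address the callee observes at [esp]
    std::uint32_t order_seen;   // [esp+4] at callee entry
    std::uint32_t final_eip;    // where execution ends up
    std::size_t   final_depth;  // stack slots left after both returns
};

// A caller counts as authorized if its return address lies inside the game.
bool looks_authorized(std::uint32_t ret_addr) {
    return ret_addr >= kGameBase && ret_addr < kGameEnd;
}

Trace run_spoofed_call(std::uint32_t order, std::uint32_t position) {
    std::vector<std::uint32_t> stack;          // back() is the top of the stack
    auto pop = [&stack] { auto v = stack.back(); stack.pop_back(); return v; };

    stack.push_back(kRetMe);                   // push retme
    for (std::uint32_t arg : {1u, 0u, 0u, 0u}) // push args 7, 6, 5, 4
        stack.push_back(arg);
    stack.push_back(position);                 // push Position (arg 3)
    stack.push_back(order);                    // push Order (arg 2)
    stack.push_back(kGadget);                  // push Gadget
    std::uint32_t eip = kIssueOrder;           // jmp pIssueOrder

    Trace t{};
    t.caller_seen = stack[stack.size() - 1];   // [esp] at callee entry
    t.order_seen  = stack[stack.size() - 2];   // [esp+4] at callee entry

    eip = pop();                               // callee: ret 18h pops eip...
    stack.resize(stack.size() - 0x18 / 4);     // ...then frees the 6 arguments
    assert(eip == kGadget);                    // we are now at the gadget
    eip = pop();                               // gadget: plain ret
    t.final_eip = eip;
    t.final_depth = stack.size();
    return t;
}
```

Running run_spoofed_call(2, somePointer) in this model gives a caller_seen inside the game's range, an order_seen of 2 (so the arguments are not shifted this time), a final_eip equal to the retme address, and an empty stack.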
The Result
Understand that all of this was done with minimal knowledge of how the detection works. However, soon after we finished the code to fake our caller, we found a snippet that confirmed our belief.


From that snippet, we basically know that if the return address is null, lower than the base address of the main module, or higher than a certain address, the game will set PEB + 0x900 to 1. The PEB is usually reached through fs:0x30, for those who are confused about how I arrived at that conclusion (now you guys know why my handle is fs0x30 lol). Let’s write some code to check if we actually bypassed the check.
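As a rough model of the check just described (this is not the game's actual code and not our test code; the bounds, names and the sticky flag are my own stand-ins), the logic can be sketched like this:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bounds; the real values come from the game's own module.
struct ModuleRange {
    std::uint32_t base;   // base address of the main module
    std::uint32_t upper;  // the "certain address" used as the upper limit
};

// True when the return address would trip the flag:
// null, below the module base, or above the upper limit.
bool is_suspicious_caller(std::uint32_t return_address, const ModuleRange& game) {
    assert(game.base <= game.upper);
    return return_address == 0
        || return_address < game.base
        || return_address > game.upper;
}

// Sticky flag standing in for the byte the game writes at PEB + 0x900.
// Once set, it is never cleared in this model.
struct DetectionFlag {
    bool tripped = false;
    void observe(std::uint32_t return_address, const ModuleRange& game) {
        if (is_suspicious_caller(return_address, game)) tripped = true;
    }
};
```

With a made-up range of 0x00400000 to 0x01400000, a return address inside the game such as 0x00B1E4D3 leaves the flag alone, while one from an injected module such as 0x10001234 sets it for good.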


We ran our test twice. The first time, we can see that the flag was not toggled after we used our fake caller method. However, the flag was toggled permanently (RIP account) after the normal call was performed. This proves that our technique to fake the caller did bypass the check and will keep us safe until the next time the developer decides to sneak in something like this again.
Conclusion
This was a good exercise on ROP and assembly before my OSCE exam coming up. I’m glad that the effort paid off with tangible proof. I am sorry for the non-security related post but those are coming up, I promise (Wow64 Gate hooking, VAD Unlinking and whatever other topic I am working on).
I hope it was a good read. I tried to take you guys on the journey with me from beginning to end. Please let me know if I made a mistake anywhere; writing this in 1 sitting is definitely not without its drawbacks. Thank you for your time.
— fs0x30
