Hooking is easy! Right? Right?..

I’ve been watching a presentation about hooking, and it occurred to me that hooking is, at its base, a pretty simple concept. You just replace some code with a trampoline, relocate the destroyed code, execute some payload, execute the destroyed code, then go back to the original hooked place. Sounds simple enough that I can do it myself.


The Plan

I’m going to write a simple program from scratch that hooks its own function. And I’m not going to read or use any existing hooking code. So, the plan is:

  1. Find trampoline code.
  2. Calculate replaced instructions length.
  3. Write payload + original function instructions somewhere in memory.
  4. Add jump to original function.

And I’ll do it in x64, because it’s cool. And because I’m too lazy to change the disassembler library’s settings to build its x86 version.


The Bait

What the fuck?! Only while writing this article did I find out that the trampoline is, in fact, the restored first few instructions of the hooked function, not the code that makes the detour. But my version made sense to me: instead of executing the function, control flow rises to the sky, to your code. Fuck no! The detouring code is called just “code”, or “hooking code”, I guess. I’ll call it “bait”.

What bait code should I use? What a stupid question, I’ll just insert a jump! There’s got to be a jump to an absolute 64-bit address in x64, right? No, there fucking isn’t. “A 9-byte instruction is too long,” AMD thought. Well, it wouldn’t be too long for me! Now I’ve got to look for an actual bait!

Luckily for me, x64 is about 10 years old and people came up with many bait codes long ago. In my mind, the simplest bait to use is

PUSH <low DWORD of target address>
MOV DWORD PTR [RSP+4], <high DWORD of target address>
RET

It doesn’t corrupt any registers and doesn’t require relative jumps/calls. It takes 14 bytes, but that’s no problem for me.

Here is the code:
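The original listing isn’t reproduced here, so what follows is only a minimal sketch of how such a bait could be assembled in C, not the PoC’s actual code. The names write_bait, put_le32 and BAIT_SIZE are made up for illustration; the opcode bytes are the standard encodings of the three instructions above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BAIT_SIZE = 14 };  /* 5 (push) + 8 (mov) + 1 (ret) */

/* Store a 32-bit value in little-endian byte order. */
void put_le32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v);
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Hypothetical helper: writes the 14-byte bait that transfers control
 * to `target` into `out` and returns the number of bytes written. */
size_t write_bait(uint8_t *out, uint64_t target)
{
    size_t n = 0;
    assert(out != NULL);

    out[n++] = 0x68;                         /* push imm32 (low half) */
    put_le32(out + n, (uint32_t)target);
    n += 4;

    out[n++] = 0xC7;                         /* mov dword ptr [rsp+4], imm32 */
    out[n++] = 0x44;
    out[n++] = 0x24;
    out[n++] = 0x04;
    put_le32(out + n, (uint32_t)(target >> 32));  /* high half */
    n += 4;

    out[n++] = 0xC3;                         /* ret pops the full address */
    return n;
}
```

The push sign-extends its 32-bit immediate to 64 bits on the stack, which is why the high half always has to be overwritten by the mov before the ret.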


The Length

Since x64 has variable instruction length, I need to disassemble the first 14 or more bytes of the place I’m hooking to know how many bytes I need to relocate to the trampoline.

I chose Zydis as the disassembler. Because it’s new. And it’s used in x64dbg (probably because Zydis was written by the x64dbg author).

Zydis comes with prebuilt binaries. In ELF format. But I want to write my hooking solution for Windows. As it turns out, CMake can generate Visual Studio projects. But of course, not my version of CMake, which came with goddamn Cygwin! With the right version of CMake, I generated a Visual Studio project and built Zydis almost on the first try.

Zydis turned out to be quite easy to use. All I did was copy-paste the example code and alter it a little (I think people call it “programming”).
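The length calculation itself boils down to a small loop: keep decoding whole instructions until at least 14 bytes are covered. Below is a minimal sketch of that loop. To avoid depending on a particular Zydis version, the decoder is hidden behind a hypothetical callback type, insn_len_fn; toy_length is a deliberately tiny stand-in that knows only a handful of opcodes and is no substitute for a real disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: length of the instruction at `code`,
 * or 0 if it cannot be decoded from the `avail` bytes given. */
typedef size_t (*insn_len_fn)(const uint8_t *code, size_t avail);

/* Toy stand-in for a real disassembler (illustration only). */
size_t toy_length(const uint8_t *code, size_t avail)
{
    if (avail == 0)
        return 0;
    switch (code[0]) {
    case 0x55:              /* push rbp */
    case 0x90:              /* nop */
    case 0xC3:              /* ret */
        return 1;
    case 0x68:              /* push imm32 */
    case 0xE9:              /* jmp rel32 */
        return 5;
    default:
        return 0;           /* unknown to this toy decoder */
    }
}

/* Returns the smallest number of bytes >= bait_size that ends on an
 * instruction boundary, or 0 if decoding fails or bytes run out. */
size_t bytes_to_relocate(const uint8_t *code, size_t avail,
                         size_t bait_size, insn_len_fn length_of)
{
    size_t total = 0;
    assert(code != NULL && length_of != NULL);

    while (total < bait_size) {
        size_t len;
        if (total >= avail)
            return 0;
        len = length_of(code + total, avail - total);
        if (len == 0 || len > avail - total)
            return 0;
        total += len;
    }
    return total;
}
```

With a 1-byte instruction followed by three 5-byte ones, the loop stops at 16 bytes rather than 14, because cutting the last instruction in half would leave garbage behind the bait.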


The HGFEDCBA

As a proof of concept, I call the MessageBox function with some arguments, then hook it and replace the arguments in my hook payload.

First, I have to manually load user32.dll into my process, since I’m developing a console app and it doesn’t link against user32.dll by default. Hint: this is not an optimal way to call MessageBox.

I’ve chosen to replace the message box text argument. I can do that by overwriting the text that rdx points to. My payload will look something like this:

movabs rax, 0x4142434445464748
mov qword ptr [rdx], rax
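For the curious, here is a rough sketch of how those two instructions could be turned into raw bytes, and why the text ends up reading backwards. write_payload and PAYLOAD_SIZE are illustrative names, not something taken from the PoC; the opcode bytes are the standard encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PAYLOAD_SIZE = 13 };  /* 10 (movabs) + 3 (mov) */

/* Hypothetical helper: encodes
 *     movabs rax, text
 *     mov    qword ptr [rdx], rax
 * into `out` and returns the number of bytes written. */
size_t write_payload(uint8_t *out, uint64_t text)
{
    size_t n = 0;
    int i;
    assert(out != NULL);

    out[n++] = 0x48;                 /* REX.W */
    out[n++] = 0xB8;                 /* mov rax, imm64 */
    for (i = 0; i < 8; i++)          /* immediate, little-endian */
        out[n++] = (uint8_t)(text >> (8 * i));

    out[n++] = 0x48;                 /* REX.W */
    out[n++] = 0x89;                 /* mov r/m64, r64 */
    out[n++] = 0x02;                 /* ModRM: [rdx], rax */
    return n;
}
```

x64 is little-endian, so the lowest byte of 0x4142434445464748 (0x48, “H”) lands at the lowest address. The first eight characters of the message text therefore become “HGFEDCBA”.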

Let’s take a look at the hooking PoC code:

What it does is pretty self-explanatory. Yet for some reason I feel a need to explain it!

  1. I allocate a memory page for payload and trampoline.
  2. I change .text section access rights in a really ugly way.
  3. I copy everything we need into the appropriate places.
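The “appropriate places” part can be pictured as one buffer laid out as payload, then the relocated original instructions, then a jump back into the original function. The sketch below builds that layout in an ordinary byte array. It assumes the jump back reuses the same push/mov/ret sequence as the bait, which the text does not actually state, and build_hook_body and emit_jump_back are hypothetical names. Memory allocation and page permissions are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { JUMP_SIZE = 14 };

/* push lo32; mov dword ptr [rsp+4], hi32; ret (assumed jump-back form) */
size_t emit_jump_back(uint8_t *out, uint64_t target)
{
    static const uint8_t mov_rsp4[4] = {0xC7, 0x44, 0x24, 0x04};
    uint32_t half[2];
    size_t n = 0;
    int h, i;
    assert(out != NULL);

    half[0] = (uint32_t)target;
    half[1] = (uint32_t)(target >> 32);
    out[n++] = 0x68;                         /* push imm32 */
    for (h = 0; h < 2; h++) {
        for (i = 0; i < 4; i++)              /* little-endian immediate */
            out[n++] = (uint8_t)(half[h] >> (8 * i));
        if (h == 0) {
            memcpy(out + n, mov_rsp4, sizeof mov_rsp4);
            n += sizeof mov_rsp4;
        }
    }
    out[n++] = 0xC3;                         /* ret */
    return n;
}

/* Lays out [payload][stolen instructions][jump to resume_addr] in `out`.
 * Returns the total size, or 0 if `cap` bytes are not enough. */
size_t build_hook_body(uint8_t *out, size_t cap,
                       const uint8_t *payload, size_t payload_len,
                       const uint8_t *stolen, size_t stolen_len,
                       uint64_t resume_addr)
{
    size_t n = 0;
    assert(out != NULL && payload != NULL && stolen != NULL);

    if (cap < JUMP_SIZE || payload_len > cap - JUMP_SIZE ||
        stolen_len > cap - JUMP_SIZE - payload_len)
        return 0;

    memcpy(out + n, payload, payload_len);
    n += payload_len;
    memcpy(out + n, stolen, stolen_len);
    n += stolen_len;
    n += emit_jump_back(out + n, resume_addr);
    return n;
}
```

Here resume_addr would be the hooked function’s address plus the number of relocated bytes, so execution continues right after the bait.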

Let’s do a quick test of our code.

Damn it! I have a C0000005 exception right here. I forgot that program constants, such as the message box text, are stored in the .rdata section. Guess what, the “r” in “.rdata” is for “read only”.

OK, let’s cheat and change .rdata permissions to PAGE_EXECUTE_READWRITE. Not so read only now, huh? Now the code should be working.

Hey, I have some progress! Two instructions of progress, to be exact. And I have a bunch of zeroes right below the payload. I think I know what’s wrong.

hook_body_offset = (uint64_t*)pHookBody + sizeof(payload);

Remember, kids, when you add something to a pointer, you add not bytes but units of the type it points to. Hmm… I guess that’s the real point of pointer types. Heh, “point of pointers”… Anyway…
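A tiny sketch of the difference, with made-up helper names. The likely fix for the line above is to do the arithmetic on a byte pointer, for example (uint8_t*)pHookBody + sizeof(payload), though the exact types in the PoC are an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* How many bytes does `base + n` move when base is a uint64_t*?
 * The caller must keep base + n inside (or one past) the same array. */
ptrdiff_t step_in_bytes_u64(uint64_t *base, size_t n)
{
    assert(base != NULL);
    return (uint8_t *)(base + n) - (uint8_t *)base;   /* n * 8 */
}

/* Same question for a byte pointer. */
ptrdiff_t step_in_bytes_u8(uint8_t *base, size_t n)
{
    assert(base != NULL);
    return (base + n) - base;                         /* n * 1 */
}
```

If the payload is the 13 bytes of the two instructions shown earlier, the uint64_t* version lands 104 bytes in instead of 13, which would explain the run of zeroes between the payload and whatever was copied after it.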

Now the PoC is ready and I can show you the result.

What the hell?! Another C0000005! And this time while trying to execute instructions from the original function.

Turns out, MessageBoxA has a relative instruction in its first 14 bytes, one that addresses memory relative to the instruction pointer. This means that, once the instruction is copied somewhere else, the offset of the dword being compared is wrong. So I have to either recalculate the offset or unhook the function and redirect execution to its beginning. Or I can find a function which doesn’t have a relative instruction at its beginning! And this function will be… MessageBoxExA! Genius, right?
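For the record, the “recalculate the offset” option would look roughly like this. It’s a sketch under the assumption that the instruction is copied byte for byte, so its length doesn’t change; rebase_rip_disp is a hypothetical name, and finding the displacement field inside the instruction is the disassembler’s job and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The absolute target must stay the same after the move:
 *     old_next + old_disp == new_next + new_disp
 * With an unchanged instruction length, old_next - new_next equals
 * old_insn_addr - new_insn_addr.
 * Returns 1 and stores the new displacement if it fits in 32 bits,
 * otherwise returns 0 and leaves *new_disp untouched. */
int rebase_rip_disp(uint64_t old_insn_addr, uint64_t new_insn_addr,
                    int32_t old_disp, int32_t *new_disp)
{
    int64_t delta, adjusted;
    assert(new_disp != NULL);

    delta = (int64_t)(old_insn_addr - new_insn_addr);
    /* Anything moved 4 GB or more can never fit; this also rules out
     * overflow in the addition below. */
    if (delta > (int64_t)UINT32_MAX || delta < -(int64_t)UINT32_MAX)
        return 0;

    adjusted = (int64_t)old_disp + delta;
    if (adjusted < INT32_MIN || adjusted > INT32_MAX)
        return 0;

    *new_disp = (int32_t)adjusted;
    return 1;
}
```

The catch is the range check: a 32-bit displacement reaches only about 2 GB in either direction. If the copied code ends up farther than that from user32.dll, the displacement can’t simply be patched and the instruction has to be rewritten some other way.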

Now, before hooking:

And after hooking:

Tremendous success!


As it turns out, hooking is hard and I’ve failed terribly at it. The concept is very simple, but in reality you have to solve quite a few problems to develop a versatile hooking engine. Mainly, you have to deal with relative instructions. And there are a lot of those, especially in x64.

Code of this PoC is available at: https://github.com/sl4v/hooking_poc