Introduction to the x86 Architecture (Stack)

Gaurav yadav
Published in RESETHACKER
8 min read · Jun 14, 2020


Starting out in Reverse Engineering

Previous Blog

I hope you are all doing well in these hard times, and I also hope you have read my previous blog. If not, please go and read that one first and then come back here (Previous Blog). In the previous blog we discussed the different general-purpose and special-purpose registers, but we left out 2 main registers, which will be covered in this blog. So buckle up, because the stack is the most important thing to understand if you are just starting out.

Let’s Begin

Photo by Joseph Gonzalez on Unsplash

The registers we are going to focus on in this blog are EBP (extended base pointer) and ESP (extended stack pointer). If you are wondering what the image above is doing here, just bear with me, you will know soon.

Let us first understand what a stack is. If you are familiar with programming you probably already know. A stack is a data structure which allows you to store different things (addresses, instructions, values, etc.), but you can only add or remove elements in a particular order known as LIFO (last in, first out). Let us assume that you want to eat pancakes which are arranged like in the image above. Where will you start eating? Most probably from the top, unless you want to make things a bit messy, and if you want to add more pancakes you will also add them to the top. In the same manner, if you want to store something on the stack it is added on the top, and if you want to remove an element you remove it from the top, like eating pancakes. This order is known as LIFO because the element you can remove first is the one that was added last (the one on the top).

The stack also resides in RAM like any other data. A modern OS manages multiple stacks, and each stack belongs to a currently running program (more precisely, to one of its threads). The stack grows backward as we add elements to it, which means the first elements are placed at higher addresses and later elements are placed at lower addresses. The instructions used to access the stack are PUSH and POP.

With the help of the image above, you can easily see how the PUSH operation puts an element on the stack and how the POP operation removes one. If you look closely, each newly added element is placed at a lower address than the one before it (a backward-growing stack).

Why do programs use the stack?

Like registers, the stack is used for storing the values of local variables, the results of arithmetic operations, arguments passed to a function, and so on. But the stack holds slightly longer-term data than registers do; registers are for more immediate use.

Here Comes the Pointer…

Now let us discuss the registers EBP and ESP. Both are very important if you want to access elements on the stack. ESP always points to the top of the stack (it always stores the address of the top of the stack) so that we can remove (pop) or add (push) elements immediately. This means that whenever anything is pushed onto the stack ESP is decreased, and whenever anything is popped ESP is increased, but EBP does not change so frequently. Because ESP changes so often, it becomes difficult to use it to locate a particular element when needed. It’s like trying to guess the speed of a car moving at a uniform speed (the stored element at a particular address) from your own car (ESP), which is moving at a non-uniform speed. EBP makes this task easy: EBP is not moving and the elements on the stack are not moving either, so EBP acts as a reference point for locating any element. Let us now understand these two registers with the help of examples. We will focus on some instructions which are present in almost every program and find out why they are there.

Program and its compiled assembly

Prologue

The prologue is the set of instructions present at the beginning of almost every function. These instructions do two things: they save the base pointer of the previous function (the one which called the current function), so that the base pointer can be reused by the current function without losing the caller’s value, and they allocate some space on the stack for the current function. Throughout this post we are going to use the program in the image above. On the right-hand side of the image you can see the assembly of the main function; now try to focus on the start.

Prologue

Let us understand what is happening here. Before main was called, some function ‘X’ was executing, so function X had its own EBP, pointing at or above its ESP in memory (remember, the stack grows toward lower addresses). Once main() has been called, push ebp in the first line saves function X’s EBP on the top of the stack, so that when execution returns to function X after main finishes, X can get its EBP back (pop ebp) and carry on. Now that the old EBP is saved, we have to move EBP to a new address so it can be used by main; that’s what the next instruction, mov ebp,esp, does. Now EBP and ESP point to the same address. Now look at the instruction sub esp,0x14, where the arrow is pointing in the image above. This instruction moves ESP 20 bytes (0x14) past its current position, toward lower addresses, so that the free space between ESP and EBP can be used to store local variables and other values used by the current (main) function.

Local Variables and Arguments

In the C program image above you can see that there are 2 local variables in the main function, named localString and localInt. Can you find out where these variables are assigned their values in the assembly? Let me give you a hint: variables are going to be stored on the stack ;).

Local Variables

If you look closely at where the arrow is pointing on the right side (assembly) of the image above, you will notice that the stack is being accessed with the help of EBP. But why is something being subtracted from EBP? The answer is in the prologue: as I already explained, at the start 20 (0x14) bytes were subtracted from ESP to allocate some stack space for the current function, main, so an offset is subtracted from EBP to reach that same space. But how do I know that local variables are being stored at this address? Look carefully at the right-hand operand in the assembly: edx is being stored at the address [ebp-0xc]. Let’s check the value of edx at this point.

If you look at the registers, edx holds the address 0x5655701d. If we check what is stored at that address, you can see some data shown as hexadecimal numbers in little-endian format. In this format the least significant byte is stored first, so the bytes inside each displayed number appear in reverse order. If we reverse the bytes and convert them to ASCII, can you guess what we will get? Try doing that yourself. You will get the string “main function”. Hmm… have you seen this string before? Look at the C code: this is the same string stored in the local variable localString. You can easily find the other local variable too; it is stored by the next instruction, and its value is the same one used in the C code, 0x11223344. Now you know: if something is being subtracted from EBP, there is a good chance that local variables are being accessed.

Now let us talk about arguments. Arguments are the values passed to a function so that the function can operate on them. Before running this program I provided ‘aaaaaaaaaaaa’ as a command-line argument, so let’s find this argument. If you look at the image below, there is a call instruction to the function ‘functionFunction’, and just above it eax is being pushed onto the stack. Let’s check the value of eax.

Value of eax

Now you can see that at the address stored in eax there is a bunch of 0x61 bytes, and if you convert this hexadecimal value to ASCII you will find that 0x61 is the hexadecimal representation of ‘a’. So now you know: if a value is pushed just before a call to a function, there is a good chance that it is an argument to the function being called right after that push.

Epilogue

We are in the endgame now. Just as the prologue is present at the beginning, the epilogue is present at the end of the function; it restores the stack pointer and base pointer of the function which called the current function.

Epilogue

The epilogue mainly contains two instructions, leave and ret. The leave instruction does the opposite of the prologue: it executes mov esp,ebp followed by pop ebp, which resets the stack pointer to where it was pointing just before the prologue ran and gives the caller its EBP back. The ret instruction pops the return address, which was pushed onto the stack when the current function was called, into EIP, so that execution resumes in the calling function where it left off. We will talk about how the ret instruction can be used to execute an instruction of our choice, rather than the one it was going to execute, in some other blog, so keep following. Because this is where the fun begins.

Let’s Finish This

In both blogs I have tried to explain as many of the important things as I can, things which are going to be helpful in the upcoming topics I plan to touch on, like buffer overflows, format string vulnerabilities, etc. If you want me to write blogs like this on topics you are having difficulty with (related to reverse engineering, obviously :) ), please let me know; I will be more than happy to help. If this blog or the previous one helped you in any way, please reach out to me on Linkedin or Instagram and share your learning experience. Constructive criticism is always appreciated. Keep following, and please show your love by clapping for this blog. I will come up with a new blog soon.

Say Hi! to me on Linkedin


Gaurav yadav
RESETHACKER

I like to learn things which challenge me. I am a developer and reverse engineer, and very much addicted to games.