A Beginner’s Guide to Buffer Overflow Vulnerability
The art of memory exploitation...
Buffer
A Buffer is a region of temporary storage, usually in physical memory (RAM), used to hold data.
Consider the most useless program ever made, shown in the image on the left, where a character buffer of length 5 is defined. Out of a big cluster of memory, a small block of 5 bytes is assigned to the buffer, which looks like the image on the right.
Buffer Overflow
A Buffer Overflow occurs when more data is written to a buffer than it can hold, so that the excess overwrites adjacent memory addresses.
DEMO (Controlling Local Variables):
Let’s take an example of a basic authentication app which asks for a password and returns Authenticated! if the password is correct.
Without really knowing how the app works, let’s enter a random password.
It says Authentication declined! since the password wasn’t correct. To test for an overflow, we need to enter a large amount of random data.
You must be wondering why it got authenticated and why there is a Segmentation Fault!. Let’s see a more detailed version of the app.
As you can see, there are three variables: auth, sys_pass and usr_pass. The auth variable determines whether the user is authenticated, depending on its value (initially 0). The usr_pass variable stores the password that the user enters, and the sys_pass variable holds the correct password.
The app works like this: if the usr_pass variable is equal to sys_pass, the auth variable becomes 1. If the auth variable is not 0, the user is authenticated.
You can also see where the variables are stored in memory. The addresses are printed in hexadecimal, and a difference of 1 in the second-to-last hexadecimal digit is a difference of 0x10, which is 16 in decimal; therefore, the usr_pass and sys_pass variables are buffers of length 16.
To test for Buffer Overflow, a long password is entered as shown.
As you can see the password entered in usr_pass variable overflows the sys_pass variable and then the auth variable.
Note: C functions like strcpy(), strcat() and scanf() with a bare %s do not check the size of the destination buffer and can write past its end into later memory addresses, which is precisely what a buffer overflow is.
Refer to the code below for better understanding.
#include <stdio.h>
#include <string.h>

int main(void) {
    int auth = 0;                 /* non-zero means authenticated */
    char sys_pass[16] = "Secret"; /* the correct password */
    char usr_pass[16];            /* the password typed by the user */

    printf("Enter password: ");
    scanf("%s", usr_pass);        /* deliberately unbounded: this is the bug */

    if (strcmp(sys_pass, usr_pass) == 0) {
        auth = 1;
    }

    printf("usr_pass: %s\n", usr_pass);
    printf("sys_pass: %s\n", sys_pass);
    printf("auth: %d\n", auth);
    printf("sys_pass addr: %p\n", (void *)sys_pass);
    printf("auth addr: %p\n", (void *)&auth);

    if (auth) {
        printf("Authenticated!\n");
    } else {
        printf("Authentication declined!\n");
    }
    return 0;
}
Note: This might be the most unrealistic example, and it is only meant for understanding purposes. You may not see such situations in real life.
Let’s dive a little deeper into the concepts now.
Important Concepts
Division of Memory for a Running Process
This is what the memory assigned to a process looks like. There are various sections, such as the stack, the heap and uninitialized data, each used for a different purpose.
You may read more about the memory layout here: Memory Layout of a Process.
This blog focuses on Buffer Overflow in Stack so let’s look at that.
- Stack: A LIFO (last in, first out) data structure that the processor uses to manage function calls, local variables and return addresses.
- There are a number of registers in the CPU, of which we shall only be concerned with EIP, EBP and ESP.
- EBP: The base pointer, which points to the base of the current stack frame.
- ESP: The stack pointer, which points to the top of the stack.
- EIP: The instruction pointer, which contains the address of the next instruction to be executed.
Stack Layout
The image above shows what a stack looks like. It might look intimidating but trust me, it isn’t. Let’s see some important points related to the stack:
- A stack is filled from higher memory to lower memory.
- In a stack, all the variables are accessed relative to the EBP.
- In a program, every function call gets its own stack frame.
- Above the EBP, function parameters are stored.
For example:
void foo(int a, int b, int c){
    //Function body
}
Here a, b and c, being the function parameters, are stored above the EBP.
- All the local variables of a function are stored below the EBP.
- The Old %ebp is the value of the EBP of the previous function. After a function has executed, it has to return to the function that called it; therefore, we need to store the values of both the old EBP and the old EIP.
- The ESP register stores the address of the top of the stack. Because the stack grows towards lower memory, this is the lowest address in use.
For example:
void foo(int a, int b, int c){
    int x;
    int y;
    int z;
}
Here x, y and z, being local variables of the function, are stored below the EBP.
Exploiting Buffer Overflow
It’s time to get into Buffer Overflow exploitation using stack.
Before that, let’s try to understand how a stack is built for any function.
Taking an example below:
The stack on the right belongs to the function foo shown in the image on the left.
- Since a, b and c are parameters passed to the function, they are stored above the EBP. Also, because the stack is filled from higher to lower memory and parameters are read from right to left, c is written to memory first, followed by b and a.
- x, y and z, being the local variables, are stored below the EBP.
- The Old EIP and Old EBP of the function main must also be stored on the stack, so that the program knows where to return after foo finishes executing.
Now, as shown in the previous demo, you could see how Buffer Overflow took place using the local variables.
Imagine a situation where you overflow the variables x, y and z in such a way that Old EIP is modified and stores the address of the memory where the malicious code is placed.
Refer to the image below for better understanding.
Assume a buffer of length 500 defined in a function. Now it is overflowed in such a way that it holds some random data, followed by the shellcode (malicious code) and then the Return address, which points to the shellcode.
So after the function finishes executing, the instruction pointed to by the Return address gets executed, and this is how our shellcode gets executed.
This is pretty much how Buffer Overflow happens.
Watch the video Buffer Overflow Attack - Computerphile to get a more realistic idea of Buffer Overflow. The code used in that video is available here.
Security Measures
- Use programming languages like Python, Java or Ruby, in which Dynamic Memory Allocation takes place and the language itself manages the memory for you.
- In languages like C and C++, perform all the relevant checks and input validation before writing data to a buffer.
- Before using any external libraries, check them for security vulnerabilities.
- Use source code analysis tools for static analysis against vulnerabilities.
- Use a Non-executable Stack: even if machine code is injected into the stack, it cannot be executed, since that particular region of memory is non-executable. This is done by setting the NX bit.
Note: Even after these measures are taken it might be possible to exploit Buffer Overflow. Therefore, these are just layers of security that can help to prevent exploitation of Buffer Overflow.
Let’s connect?
LinkedIn: https://linkedin.com/in/ashwigoel
Website: https://ashwingoel.com
Email: goelashwin36@gmail.com
