Learning Operating Systems for Reverse Engineering

System Calls, Buffer Overflow, Multiprogramming, and more

Madeline Farina
Reverse Engineering for Dummies
8 min read · Jul 2, 2021

--

Usually when the term “operating system” is used, it refers to software like macOS, Windows, or a Linux distribution. While this isn’t incorrect, a more precise definition is that the OS is the software that runs in kernel mode, the mode in which the CPU can use every feature of the available hardware and execute every instruction in its instruction set. The OS manages all hardware and software resources, giving computer programs a clean, abstract set of tools to work with. In other words, its main function is managing the Input/Output devices and other system resources to provide its users with an extended (i.e., virtual) machine.

In the world of computer science and computer engineering education, Operating Systems is the name of a course which focuses on topics like the stack and the heap, buffer overflow, system calls, multiprogramming, parallel programming, scheduling, and more. It’s innately frustrating not just because of its abstract subject material, but also because of how time-consuming and confusing it can be to debug the kernel in a VM as opposed to developing and running a high-level C++ program in a more user-friendly IDE. However, in OS there are still important concepts to learn that are extremely relevant for reverse engineering and malware analysis.

The following is a collection of information I’ve gathered from when I took an OS class in undergrad as well as notes from Dennis Yurichev’s famous book on RE which you can find here.

Stack vs. Heap

https://techdifferences.com/wp-content/uploads/2017/10/Untitled-6.jpg

In my Crash Course for Assembly Language, I covered the Stack in general terms; now let’s go into more detail and compare it to the Heap. This was something I was asked to do in an interview, so it may be something you could be asked as well if you’re looking for a position in RE.

To reiterate:

Each active function call has a frame that stores the values of all its local variables, and the frames of all active functions are maintained on the Stack. The Stack is a very important data structure in memory. In general, it is used to store temporary data needed during the execution of a program: local variables, parameters, function return addresses, and more. Stack allocation is static in the sense that the size and layout of each frame are fixed at compile time. Memory whose size or lifetime is decided at runtime, such as blocks allocated with malloc() in C or the new operator in C++, lives on the Heap.

So the Stack holds local (automatic) variables, which are pushed onto it when you make procedure calls. These include function parameters, loop variables, and pretty much anything outside of the global scope except what is allocated on the Heap with malloc(). The computer knows which instruction to execute when returning from a procedure because the return address is pushed onto the Stack by the call and popped off on return.

The Heap is dynamic memory (as mentioned above): a region used for allocations whose size or lifetime isn’t known at compile time. Any part of the program holding a pointer to a Heap block can access it. The Heap is not managed automatically by the CPU and, unlike the Stack, can become fragmented as blocks of memory are allocated and then freed.

The primary differences between the two include:

  • Structure: the Stack is LIFO, whereas the Heap has no fixed ordering; blocks can be allocated and freed in any order (and, despite the name, it has nothing to do with the heap priority-queue data structure)
  • Memory: the Stack is contiguous and will never become fragmented, whereas Heap memory is allocated in any random order and susceptible to fragmentation
  • Variables: Stack variables are visible only to the function that owns the frame, while Heap blocks are accessible from anywhere in the program through a pointer
  • Sizing: Stack variables cannot be resized whereas Heap variables can
  • De-allocation: the Stack doesn’t require de-allocation of variables whereas the Heap requires it via the programmer’s instructions

System Calls

https://www.linuxbnb.net/wp-content/uploads/2018/06/system-call-overview-1.png

A system call allows a user process to access and execute OS functions inside the kernel. User programs use syscalls to invoke certain OS services, and common UNIX syscalls include:

  • fork: creates a child process
  • open: initializes access to a file
  • kill: sends a signal to a process (commonly used to terminate it)
  • exec: performs file name resolution, overwrites the current process’s memory space, moves the program counter, and starts a new program

The syscall handler knows which syscall is being made by referencing the system call table: the calling program places the syscall number (an index into the table) in a designated register, and the table maps each number to a function pointer for the corresponding kernel routine.

Adding a new system call involves modifying a specific set of kernel files, such as the syscall table, sys.c (for syscall function declarations), and the relevant header files. It is a common task for students in an OS course, but unfortunately it is not well documented and often takes many hours to figure out. I may one day post clear instructions on how to do this to save future students the trouble, but the overview above will give you a conceptual understanding for RE.

Buffer Overflow

https://www.imperva.com/learn/wp-content/uploads/sites/13/2018/01/buffer-overflow.png

A buffer is a contiguous section of RAM which temporarily holds data while it is transferred to or from an input/output device, compensating for the difference in speed between the device and the rest of the system.

Buffer Overflow (or Buffer Overrun) occurs when an application tries to store more data in a buffer than it can hold, so the excess spills into adjacent memory and overwrites whatever was there. This can corrupt data or even crash the system. It is a common programming mistake that developers often commit without noticing, and attackers can exploit it to gain access to sensitive data.

In a buffer overflow attack, the attacker supplies extra data that can include malicious instructions to perform activities like executing shellcode, corrupting files, changing data, etc. There are two types of buffer overflow attacks: those involving Stack-based memory allocation (which are simpler to exploit) and those involving Heap-based memory allocation, which are far less frequent. Languages that use Stack-based memory allocation without automatic bounds checking (like C, C++, Fortran, and Assembly) are the most vulnerable to buffer overflow exploitation.

There are numerous ways to prevent a buffer overflow attack, including:

  • Exception handling: to prevent code execution in the event of a buffer overflow
  • Size Allocation: to ensure the buffer has enough memory to handle large volumes of data
  • Routine testing: to detect vulnerabilities and fix any bugs in code
  • Avoidance of certain library functions: or third-party methods that are not bounds-checked, such as gets(), scanf(), or strcpy() in the C/C++ standard libraries

Threads vs. Processes

https://www.backblaze.com/blog/wp-content/uploads/2017/08/diagram-thread-process-1.png

A process is an instance of a running program, while a thread is a unit of execution within a process. A process can have multiple threads, making it a multithreaded process (discussed more later on).

When a program is loaded into memory, it becomes one or more running processes, and processes are typically independent of each other. Threads, on the other hand, exist as a subset of a process and therefore use the memory of the process they belong to. This sharing of memory is what makes threads cheap to create and communicate with. Parallel programming refers to multiple processes or threads executing simultaneously, which is only possible on a system with multiple processors (or cores). On a single processor, the CPU is shared among running processes using process scheduling; that is concurrency, not parallel programming.

Multiprogramming vs. Multiprocessing vs. Multithreading vs. Multitasking

Multiprogramming is the rapid switching of the CPU between multiple processes in memory. It is commonly used to keep the CPU busy while one or more processes are doing I/O, since only one program at a time can use the CPU for executing its instructions. The main idea of multiprogramming is to maximize the use of CPU time and to allow the user to run multiple programs simultaneously.

Without DMA (Direct Memory Access), the CPU itself would have to copy every byte of I/O data, so it could not usefully switch to another program while a transfer was in progress, and multiprogramming would gain little. A time-sharing system extends multiprogramming: multiple users can access the machine and perform computations simultaneously from their own terminals. All time-sharing systems are multiprogramming systems, but not all multiprogramming systems are time-sharing systems, since a multiprogramming system may run on a PC with only one user.

There is also spooling, which is a combination of buffering and queuing, and a form of multiprogramming for the purpose of copying data between devices. It allows programs to “hand off” work to be done by the peripheral and then proceed to other tasks.

To summarize: even if several programs are loaded in main memory, only one can use the CPU at a time, so multiprogramming maximizes the use of CPU time by rapidly switching between them.

Multiprocessing refers to executing multiple processes at the same time, which sounds quite similar to multiprogramming, but the difference is that multiprocessing refers to the hardware (i.e., the number of CPU units). A system can be both multiprogrammed and multiprocessing, running several programs simultaneously on more than one physical processor.

Multitasking, much like its general definition unrelated to computing, refers to having multiple tasks (programs, processes, threads, etc.) running at the same time. It’s used in modern operating systems when multiple tasks share a common processing resource (like CPU and memory). At any instant the CPU is executing only one task while the other tasks wait their turn. The illusion of parallelism is achieved by reassigning the CPU from one task to another (i.e., process or thread context switching).

Both multiprogramming and multitasking operating systems share the CPU over time. However, in multiprogramming (older operating systems) a program as a whole keeps running until it blocks, while in multitasking (modern operating systems) time-sharing is fully realized because each running process gets only a fair quantum of the CPU time.

Multithreading is a model of execution that allows a single process to have multiple code segments (i.e., threads) run concurrently within that process. Threads are similar to child processes that share the parent process’s resources but execute independently. Multiple threads of a single process can share the CPU in a single-CPU system or run in parallel in a multiprocessing system. Threads usually synchronize with OS primitives like mutexes and semaphores.

Multithreading is the best choice when a server has a number of distinct tasks to be performed concurrently.

Miscellaneous

To finish up, here are a few miscellaneous definitions to be aware of:

  • Paging: divides memory into fixed-size pages, which makes allocation and free-space management easier. A page fault occurs when you try to access a page that is not currently in RAM
  • Trap instruction: an instruction that switches the CPU from user mode to kernel mode, starting execution at a fixed address in the kernel. It transfers control to the OS (which then carries out the syscall) before returning control to the instruction following the trap
  • Pipe: can be used to connect two processes so the output from one becomes the input of the other. Pipes are FIFO and useful for inter-process communication
