BOLO: Reverse Engineering — Part 1 (Basic Programming Concepts)

Reverse Engineering

Throughout the reverse engineering learning process I have found myself wanting a straightforward guide for what to look for when browsing through assembly code. While I’m a big believer in reading source code and manuals for information, I fully understand the desire to have concise, easy to comprehend, information all in one place. This “BOLO: Reverse Engineering” series is exactly that! Throughout this article series I will be showing you things to Be On the Look Out for when reverse engineering code. Ideally, this article series will make it easier for beginner reverse engineers to get a grasp on many different concepts!

Preface

Throughout this article you will see screenshots of C++ code and assembly code along with some explanation as to what you’re seeing and why things look the way they look. Furthermore, This article series will not cover the basics of assembly, it will only present patterns and decompiled code so that you can get a general understanding of what to look for / how to interpret assembly code.

Throughout this article we will cover:

  1. Variable Initiation
  2. Basic Output
  3. Mathematical Operations
  4. Functions
  5. Loops (For loop / While loop)
  6. Conditional Statements (IF Statement / Switch Statement)
  7. User Input

please note: This tutorial was made with visual C++ in Microsoft Visual Studio 2015 (I know, outdated version). Some of the assembly code (i.e. user input with cin) will reflect that. Furthermore, I am using IDA Pro as my disassembler.


Variable Initiation

Variables are extremely important when programming, here we can see a few important variables:

  1. a string
  2. an int
  3. a boolean
  4. a char
  5. a double
  6. a float
  7. a char array
Basic Variables

Please note: In C++, ‘string’ is not a primitive variable but I thought it important to show you anyway.

Now, lets take a look at the assembly:

Initiating Variables

Here we can see how IDA represents space allocation for variables. As you can see, we’re allocating space for each variable before we actually initialize them.

Initializing Variables

Once space is allocated, we move the values that we want to set each variable to into the space we allocated for said variable. Although the majority of the variables are initialized here, below you will see the C++ string initiation.

C++ String Initiation

As you can see, initiating a string requires a call to a built in function for initiation.

Basic Output

preface info: Throughout this section I will be talking about items pushed onto the stack and used as parameters for the printf function. The concept of function parameters will be explained in better detail later in this article.

Although this tutorial was built in visual C++, I opted to use printf rather than cout for output.

Basic Output

Now, let’s take a look at the assembly:

First, the string literal:

String Literal Output

As you can see, the string literal is pushed onto the stack to be called as a parameter for the printf function.

Now, let’s take a look at one of the variable outputs:

Variable Output

As you can see, first the intvar variable is moved into the EAX register, which is then pushed onto the stack along with the “%i” string literal used to indicate integer output. These variables are then taken from the stack and used as parameters when calling the printf function.

Mathematical Functions

In this section, we’ll be going over the following mathematical functions:

  1. Addition
  2. Subtraction
  3. Multiplication
  4. Division
  5. Bitwise AND
  6. Bitwise OR
  7. Bitwise XOR
  8. Bitwise NOT
  9. Bitwise Right-Shift
  10. Bitwise Left-Shift
Mathematical Functions Code

Let’s break each function down into assembly:

First, we set A to hex 0A, which represents decimal 10, and B to hex 0F, which represents decimal 15.

Variable Setting

We add by using the ‘add’ opcode:

Addition

We subtract using the ‘sub’ opcode:

Subtraction

We multiply using the ‘imul’ opcode:

Multiplication

We divide using the ‘idiv’ opcode. In this case, we also use the ‘cdq’ to double the size of EAX so that we can fit the output of the division operation.

Division

We perform the Bitwise AND using the ‘and’ opcode:

Bitwise AND

We perform the Bitwise OR using the ‘or’ opcode:

Bitwise OR

We perform the Bitwise XOR using the ‘xor’ opcode:

Bitwise XOR

We perform the Bitwise NOT using the ‘not’ opcode:

Bitwise NOT

We peform the Bitwise Right-Shift using the ‘sar’ opcode:

Bitwise Right-Shift

We perform the Bitwise Left-Shift using the ‘shl’ opcode:

Bitwise Left-Shift

Function Calls

In this section, we’ll be looking at 3 different types of functions:

  1. a basic void function
  2. a function that returns an integer
  3. a function that takes in parameters
Calling Functions

First, let’s take a look at calling newfunc() and newfuncret() because neither of those actually take in any parameters.

Calling Functions Without Parameters

If we follow the call to the newfunc() function, we can see that all it really does is print out “Hello! I’m a new function!”:

The newfunc() Function Code
The newfunc() Function

As you can see, this function does use the retn opcode but only to return back to the previous location (so that the program can continue after the function completes.) Now, let’s take a look at the newfuncret() function which generates a random integer using the C++ rand() function and then returns said integer.

The newfuncret() Function Code
The newfuncret() function

First, space is allocated for the A variable. Then, the rand() function is called, which returns a value into the EAX register. Next, the EAX variable is moved into the A variable space, effectively setting A to the result of rand(). Finally, the A variable is moved into EAX so that the function can use it as a return value.

Now that we have an understanding of how to call function and what it looks like when a function returns something, let’s talk about calling functions with parameters:

First, let’s take another look at the call statement:

Calling a Function with Parameters in C++
Calling a Function with Parameters

Although strings in C++ require a call to a basic_string function, the concept of calling a function with parameters is the same regardless of data type. First ,you move the variable into a register, then you push the registers on the stack, then you call the function.

Let’s take a look at the function’s code:

The funcparams() Function Code
The funcparams() Function

All this function does is take in a string, an integer, and a character and print them out using printf. As you can see, first the 3 variables are allocated at the top of the function, then these variables are pushed onto the stack as parameters for the printf function. Easy Peasy.

Loops

Now that we have function calling, output, variables, and math down, let’s move on to flow control. First, we’ll start with a for loop:

For Loop Code
A graphical Overview of the For Loop

Before we break down the assembly code into smaller sections, let’s take a look at the general layout. As you can see, when the for loop starts, it has 2 options; It can either go to the box on the right (green arrow) and return, or it can go to the box on the left (red arrow) and loop back to the start of the for loop.

Detailed For Loop

First, we check if we’ve hit the maximum value by comparing the i variable to the max variable. If the i variable is not greater than or equal to the max variable, we continue down to the left and print out the i variable then add 1 to i and continue back to the start of the loop. If the i variable is, in fact, greater than or equal to max, we simply exit the for loop and return.

Now, let’s take a look at a while loop:

While Loop Code
While Loop

In this loop, all we’re doing is generating a random number between 0 and 20. If the number is greater than 10, we exit the loop and print “I’m out!” otherwise, we continue to loop.

In the assembly, the A variable is generated and set to 0 originally, then we initialize the loop by comparing A to the hex number 0A which represents decimal 10. If A is not greater than or equal to 10, we generate a new random number which is then set to A and we continue back to the comparison. If A is greater than or equal to 10, we break out of the loop, print out “I’m out” and then return.

If Statements

Next, we’ll be talking about if statements. First, let’s take a look at the code:

IF Statement Code

This function generates a random number between 0 and 20 and stores said number in the variable A. If A is greater than 15, the program will print out “greater than 15”. If A is less than 15 but greater than 10, the program will print out “less than 15, greater than 10”. This pattern will continue until A is less than 5, in which case the program will print out “less than 5”.

Now, let’s take a look at the assembly graph:

IF Statement Assembly Graph

As you can see, the assembly is structured similarly to the actual code. This is because IF statements are simply “If X Then Y Else Z”. IF we look at the first set of arrows coming out of the top section, we can see a comparison between the A variable and hex 0F, which represents decimal 15. If A is greater than or equal to 15, the program will print out “greater than 15” and then return. Otherwise, the program will compare A to hex 0A which represents decimal 10. This pattern will continue until the program prints and returns.

Switch Statements

Switch statements are a lot like IF statements except in a Switch statement one variable or statement is compared to a number of ‘cases’ (or possible equivalences). Let’s take a look at our code:

Switch Statement Code

In this function, we set the variable A to equal a random number between 0 and 10. Then, we compare A to a number of cases using a Switch statement. If A is equal to any of the possible cases, the case number will be printed, and then the program will break out of the Switch statement and the function will return.

Now, let’s take a look at the assembly graph:

Switch Case Assembly Graph

Unlike IF statements, switch statements do not follow the “If X Then Y Else Z” rule, instead, the program simply compares the conditional statement to the cases and only executes a case if said case is the conditional statement’s equivalent. Le’ts first take a look at the initial 2 boxes:

The First 2 Graph Sections

First, the program generates a random number and sets it to A. Then, the program initializes the switch statement by first setting a temporary variable (var_D0) to equal A, then ensuring that var_D0 meets at least one of the possible cases. If var_D0 needs to default, the program follows the green arrow down to the final return section (see below). Otherwise, the program initiates a switch jump to the equivalent case’s section:

In the case that var_D0 (A) is equal to 5, the code will jump to the above case section, print out “5” and then jump to the return section.

User Input

In this section, we’ll cover user input using the C++ cin function. First, let’s look at the code:

User Input Code

In this function, we simply take in a string to the variable sentence using the C++ cin function and then we print out sentence through a printf statement.

Le’ts break this down into assembly. First, the C++ cin part:

C++ cin

This code simply initializes the string sentence then calls the cin function and sets the input to the sentence variable. Let’s take a look at the cin call a bit closer:

The C++ cin Function Upclose

First, the program sets the contents of the sentence variable to EAX, then pushes EAX onto the stack to be used as a parameter for the cin function which is then called and has it’s output moved into ECX, which is then put on the stack for the printf statement:

User Input printf Statement

Thanks!

Hopefully, this article gave you a decent understanding of how basic programming concepts are represented in assembly. Keep an eye out for the next part of this series, BOLO: Reverse Engineering — Part 2 (Advanced Programming Concepts)!

lea eax, [ebp+Reading] ; “Reading”
push eax
lea ecx, [ebp+For] ; “For”
push ecx
mov edx, [ebp+Thanks] ; “Thanks”
push edx
push offset _Format ; “%s %s %s”
call j_printf