Is Compilation easy?

Checha Giudice
4 min read · Sep 18, 2020

Let’s figure it out together!

Little joke to break the ice.

At school we had various assignments about compilation. Let’s get something straight: gcc (and other programs) already “compile” your files for you. So… why do we need to know WHAT compilation actually does? Understanding the process step by step will help you be a better programmer.

Wikipedia says: “In computing, a compiler is a computer program that translates computer code written in one programming language (the source language) into another language (the target language).” So, it’s like Google Translate for the terminal.

A compiler converts a C program (a “.c” file) into an executable file, meaning a file that can carry out the instructions you wrote in it. A C program goes through four phases to become an executable:

  1. Pre-processing
  2. Compilation
  3. Assembly
  4. Linking

Let’s write a little C program (saved in a file called main.c) in our favorite editor, and start “compiling” step-by-step:

#include <stdio.h>

/**
 * main - Entry point
 *
 * Return: Always 0 (Success)
 */
int main(void)
{
	return (0);
}

1. Pre-processing.

$ gcc -E main.c

This is the first phase the source code passes through. You run the compiler “gcc”, add the option “-E” to tell it that you ONLY want to pre-process the file, and indicate which file you want to pre-process (in this case “main.c”). This phase includes:

  • Removal of Comments (/* This is a comment */)
  • Expansion of Macros
  • Expansion of the included files (#include <stdio.h>)
  • Conditional compilation

The preprocessor reads each preprocessor directive and interprets it. For example, if the directive #include <stdio.h> is present in the program, the preprocessor replaces that directive with the content of the ‘stdio.h’ file.

By default, gcc -E prints the preprocessed output to the terminal. To store it in a new file called main.i (remember our first file was main.c), add “-o main.i” to the command. The file will be filled with lots and lots of info, but at the end our code is preserved.

This is what your main.i should look like in your editor.

2. Compilation.

The next step is to compile main.i and produce an intermediate compiled output file, main.s. The option “-S” tells gcc to stop after this phase.

The code expanded by the preprocessor is passed to the compiler, which converts it into assembly code.

3. Assembly.

In this phase main.s is taken as input and turned into main.o by the assembler. This file contains machine-level instructions. At this phase only the existing code is converted into machine language; function calls like printf() are not resolved yet. The option “-c” tells gcc to stop after this phase. Let’s view main.o using your editor:

Your object file should look like this.

4. Linking.

This is the final phase, in which all function calls are linked with their definitions. The linker knows where all these functions are implemented. The linker also does some extra work: it adds the code our program needs when it starts and ends. Without the option “-o anotherfilename”, the compiler will produce a new file called a.out. This is the compiled file.

SO:

The following steps are taken to execute a program:

  • First, the input file, i.e., main.c, is passed to the preprocessor, which converts the source code into expanded source code. The expanded source code is stored in main.i.
  • The expanded source code is passed to the compiler, which converts it into assembly code. The assembly code is stored in main.s.
  • This assembly code is then sent to the assembler, which converts it into object code (main.o).
  • After the object code is created, the linker creates the executable file. The loader then loads the executable file for execution.

And that’s it! Easy, right?

NOW, start compiling!!
