Stages of the compilation process

Minas Anton
May 10, 2017

What happens when you type gcc main.c

Compilation process flow.

We are about to analyze what happens when we type the command
gcc main.c and learn the four main stages of the compilation process:

  1. Pre-processing
  2. Compilation
  3. Assembly
  4. Linking

Before we start, let’s create our .c file and run the command to see what happens. See example 1.

example 1

So we created a file named main.c and wrote in it a C program that prints “Hello, world”. After we compiled it, an executable file a.out was created. We ran it and it did print “Hello, world”. Now let’s dig deeper and ask what exactly happened and how we ended up with a.out!

1. PRE-PROCESSING

This is the first stage that the source code passes through. In this stage the following tasks are done:

  1. Macro substitution
  2. Comments are stripped off
  3. Expansion of the included files

To understand preprocessing better, we are going to run gcc on the above ‘main.c’ program with the flag -E, which stops after the preprocessing stage and prints the preprocessed output to standard output. See example 2!

example 2

Note: We could have saved the output to a file using the flag -o filename.

To move on and make things clearer, we will use the following command:
$ gcc -save-temps main.c

The flag -save-temps tells gcc to keep the temporary intermediate files it produces and to store them in the current directory. So we will get the files main.i, main.s and main.o along with the executable a.out.
See example 3.

Note: We get a.out as the executable name because we did not specify one in our command. To specify the output filename, use the flag -o followed by the name.

example 3

2. COMPILING

After the preprocessing stage is done, the next step is to take main.i as input, compile it, and produce an intermediate compiled output. The output file for this stage is ‘main.s’, which contains assembly-level instructions. See example 4!

example 4

3. ASSEMBLY

At this stage the main.s file is taken as input and an intermediate file main.o is produced. This file is also known as the object file. It is produced by the assembler, which converts a ‘.s’ file with assembly instructions into a ‘.o’ object file that contains machine-level instructions. At this stage only the existing code is converted into machine language; function calls like printf() are not resolved.

Since the output of this stage is a machine-level (binary) file, it is mostly unreadable in a text editor and looks like example 5.

example 5

By looking at this output we can explain only:
ELF = Executable and Linkable Format
Hello, world = the content to be displayed, which we entered in main.c
and the Ubuntu version that the GCC compiler is running on.

4. LINKING

This is the final stage, in which all function calls are linked with their definitions. As discussed earlier, until this stage gcc doesn’t know about the definition of functions like printf(). Until the location of these functions is known, the object file simply holds a placeholder for each such call. At this stage the definition of printf() is resolved and the actual address of the function is plugged in (with shared libraries, this final step may be deferred until the program is loaded). The linker is the tool that does this task.

The linker also combines our program with some extra code that is required when the program starts and when it ends. For example, there is standard code for setting up the running environment, such as passing command line arguments and environment variables to the program. Similarly, there is standard code that passes the return value of the program back to the system.

After the standard code is combined, the linker converts the .o file into an executable file (a.out by default).

I hope this article helped you understand the general concept of compiling a C program.
