The four stages of the gcc compiler: preprocessor, compiler, assembler, linker.

Fernando Gonzales Pradinett
Feb 8, 2022

1 Preprocessor:

Preprocessing directives start with #, so lines such as #include and #define are handled at this stage. The preprocessor has four main tasks:

  • Header inclusion: when we write #include for a header such as stdio.h (or iostream in C++), it is not just a simple sentence. The header holds declarations for library code that has already been written, and the preprocessor opens every header we include and inserts its contents into our own program.
  • Macro replacement: we often define macros at the beginning of the program with #define, and every use of a macro is replaced by its definition at this stage.
  • Comment removal: comments are for us programmers to read and have no effect on the program, so the preprocessor deletes them. The later stages never see the comments we write.
  • Conditional compilation: with directives such as #ifdef, the parts of the code that do not meet the condition are dropped and never reach the compilation stage.
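As a rough illustration of the four preprocessor tasks described above, here is a small sketch. Every name in it (BUFFER_SIZE, SQUARE, VERBOSE, buffer_area, is_verbose, print_report) is made up for this example and does not come from the program used later in the article.

```c
#include <stdio.h>               /* header inclusion: the contents of stdio.h are inserted here */

#define BUFFER_SIZE 64           /* object-like macro (hypothetical name) */
#define SQUARE(x) ((x) * (x))    /* function-like macro (hypothetical name) */

/* This comment is removed by the preprocessor. */
int buffer_area(void)
{
    /* After macro replacement this reads: ((64 / 8) * (64 / 8)) */
    return SQUARE(BUFFER_SIZE / 8);
}

int is_verbose(void)
{
#ifdef VERBOSE                   /* hypothetical macro, not defined in this file */
    return 1;                    /* kept only when VERBOSE is defined, e.g. with -DVERBOSE */
#else
    return 0;                    /* kept otherwise */
#endif
}

void print_report(void)
{
    printf("area=%d verbose=%d\n", buffer_area(), is_verbose());
}
```

Saving this as, say, demo.c and running gcc -E demo.c should show the header contents followed by these functions, with the macros expanded, the comments gone, and only one branch of the #ifdef left. The file has no main on purpose: it is meant for inspecting the preprocessor output, not for building an executable.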

To start the process, we need to give the compiler a file as input, so we create a source file with our favorite text editor and write a program that prints the message “Hello world”. I’ll use vi.

vi main.c

Now that we have the source file, we can make it an executable by simply calling the gcc command and naming the file:

gcc main.c

However, since we want to see what happens after each stage, we will stop the process early with the following flags (placed between the command and the file name):

  • -E : Stop after the preprocessing stage; do not run the compiler proper.
  • -S : Stop after the stage of compilation proper; do not assemble.
  • -c : Compile or assemble the source files, but do not link.
  • -o <file>: Place output in file <file>. This applies to whatever sort of output is being produced, whether it be an executable file, an object file, an assembler file or preprocessed C code. If -o is not specified, the default is to put an executable file in a.out.
  • Please refer to the manual page for further detail.

Let’s continue with the preprocessor:

gcc -E main.c

This step prints the expanded source code to standard output: the contents of the included headers, followed by our own code.

2 Compiler:

Assembly language sits a few levels below C and one level above machine language; it is essentially a set of mnemonics for machine instructions. In the second stage, the compiler checks the program for syntax errors and, if it finds none, translates the preprocessed code into assembly language, bringing the program one step closer to machine language.

gcc -S main.c

This step generates assembly code. If we open the resulting main.s file, we find a low-level language that humans can still read.

3 Assembler:

The third stage is the assembly stage. It converts the assembly code generated in the second stage into machine code that our machine can execute, and stores it in an object file. Every program must go through this stage, because the machine cannot understand C, assembly, or any other language directly. The object file is not yet an executable: that requires the final linking stage.

gcc -c main.c

-c tells gcc to run the first three stages and stop before linking, producing machine code that the machine can understand. At this point, we can see the generated .o file (main.o) with ls.

If we open it with vim right now, it looks like complete nonsense to our eyes, but it is binary machine code that the machine can understand.

4 Linker:

Up to this point, the functions we have written and the library functions we call live in separate pieces of code. If we do not bind them all together, the program cannot execute correctly, so in the fourth stage the linker connects them into a single executable.

gcc main.c

If gcc is run without the -o option, the executable is named a.out by default.

But if we want to name our executable file, we need to add the -o flag followed by that name, with our source code file at the end:

gcc -o mysystem main.c

Now that the build is complete, we can run the executable by prefixing its name with “./” and get the “Hello world” message (note that we can run either ./a.out or ./mysystem).

These are the four build stages and the flags that stop the build after each one so we can inspect its output.
