How the Compilation Process Works in C

Kenny Reyes
Feb 4, 2021

C is often described as a mid-level language, and it needs a compiler to convert its source code into executable code so that the program can run on our machine. For C, one of the best-known compilers is GCC.

GCC to compile from C to machine language

GCC stands for GNU Compiler Collection. It is a compiler system produced by the GNU Project that supports various programming languages.

C Compilation process:

Preprocessor:

Source code is written in a text editor and saved in a file with the “.c” extension. This file is first passed to the preprocessor, which expands it, and the expanded code is then passed to the compiler. The preprocessor performs the following tasks:

  • Remove comments from the source code.
  • Macro expansion.
  • Expansion of included header files.

If the source file is named “sample.c”, the preprocessor transforms it into expanded source code, which by convention is named “sample.i”.

Compiler:

The code expanded by the preprocessor is passed to the compiler, which translates it into assembly code. The compiler performs the following tasks:

  • Check C program for syntax errors.
  • Translate the file into intermediate code, i.e. assembly language.
  • Optionally optimize the translated code for better performance.

If the expanded source file is named “sample.i”, the assembly file will be named “sample.s”.

Assembler:

With the help of an assembler, the assembly code is translated to object code. The object file has the same base name as the source file, with a different extension: typically ‘.obj’ on Windows and ‘.o’ on UNIX.

If the assembly file name is “sample.s”, the object file name would be “sample.o”.

Linker:

Linking is the final stage of compilation. The linker takes one or more object files or libraries as input and combines them to produce a single (usually executable) file. In doing so, it resolves references to external symbols, assigns final addresses to procedures/functions and variables, and revises code and data to reflect the new addresses (a process called relocation).

If the object file is named “sample.o”, the executable would typically be named “sample.exe” on Windows. On UNIX, the executable is named “a.out” by default unless another name is requested.

How do we compile and run a C program with GCC?

First, we need to create a .c file using an editor:

$ vi main.c

Now we can compile it using the command below:

$ gcc main.c
main.c is now compiled!

Use the command “ls” to list your directory contents:

$ ls
You will see an executable file named a.out.

This is your program. To run it, type ./a.out:

$ ./a.out

We can see the output “Hello World”. This is how it works.
