What Happens When You Type gcc main.c?


The ultimate objective of a programmer is to get a computer to perform a task. Communicating this task is no simple feat, because the language a human brain understands and the language a computer understands are completely different. To translate the instructions we write in code into something the computer will understand, a compiler is needed.

One of the most commonly used compilers is gcc (the GNU Compiler Collection). gcc takes source code written in a number of languages and translates it through a series of formats, each one less legible to humans and closer to what your CPU can execute. How does gcc work? Suppose you had a file called ‘main.c’ containing the following:

#include <stdio.h>

/**
 * main - Entry point
 *
 * Return: Always 0 (Success)
 */
int main(void)
{
    return (0);
}

Typing ‘gcc main.c’ would put this code through four distinct steps that translate it into a functioning program. These steps are preprocessing, compilation, assembly and linking. Examining these four steps sheds light on this complex process.

1. Preprocessing

In this stage the preprocessor directives are read. These are the lines starting with the ‘#’ character. The example code has one of these directives on its first line.

#include <stdio.h>

This is the #include directive in its angle-bracket form. It tells gcc to look in the standard list of system directories for the file named between the angle brackets. On Linux, this list typically includes the /usr/include directory.

In this example the file stdio.h is retrieved and its contents are inserted in place of the #include line, which here sits at the top of the code. ‘stdio.h’ is a header file, a file containing C declarations and macro definitions. A declaration tells the compiler that a function or variable with a given name and type exists, without supplying the code behind it. A macro definition gives a name to a piece of replacement text. Put simply, declarations announce that something exists and what it looks like, while macro definitions set up names that will later be swapped for their replacement text.
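To make the difference concrete, here is a minimal sketch of the kinds of lines a header usually carries next to the code a source file supplies. The names MAX_USERS and clamp_users are made up for this illustration and do not come from stdio.h.

```c
/* What a header typically contains: */
#define MAX_USERS 8               /* macro definition: a name for replacement text */
int clamp_users(int requested);   /* declaration: name and types only, no body */

/* What a source file typically contains: */
int clamp_users(int requested)    /* definition: the actual code */
{
    if (requested > MAX_USERS)
        return (MAX_USERS);
    if (requested < 0)
        return (0);
    return (requested);
}
```

Any file that sees the first two lines can call clamp_users and use MAX_USERS, even though the body of the function lives elsewhere.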

Now that stdio.h is combined with your code, the preprocessor makes some further changes. It runs through the combined text, removing comments and expanding any macros that were used to their full form. Running gcc -E <filename> prints the preprocessed result to standard output; redirecting it to a file with the suffix .i (the conventional suffix for preprocessed C) lets you examine it more closely. The once human-legible code is now less so. This readies the code for the second step of the compilation process.
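Macro expansion is plain text substitution, which is easiest to see when it goes wrong. The sketch below uses two hypothetical macros, NAIVE_SQUARE and SAFE_SQUARE, invented for this example; the comments show what each line should look like once the preprocessor is done with it.

```c
/* Hypothetical macros, named only for this illustration. */
#define NAIVE_SQUARE(x) x * x          /* argument pasted as-is, no parentheses */
#define SAFE_SQUARE(x)  ((x) * (x))    /* parentheses keep the argument together */

int naive_result(void)
{
    /* After preprocessing this line reads: return (1 + 2 * 1 + 2); */
    return (NAIVE_SQUARE(1 + 2));
}

int safe_result(void)
{
    /* After preprocessing this line reads: return (((1 + 2) * (1 + 2))); */
    return (SAFE_SQUARE(1 + 2));
}
```

Because the preprocessor knows nothing about arithmetic, naive_result returns 5 rather than 9, while safe_result returns 9 as intended.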

2. Compiling

After being prepped by the preprocessor stage, the code is translated into assembly code. This step is referred to as compiling. Assembly code may read as an esoteric list of instructions to the average human, but it is a language close to what the CPU of a computer can read and interpret. Assembly language consists of instructions that directly manipulate the memory and processor; the emphasis is no longer on legibility.

It’s much like translating a human language literally into another that has a different syntax. The wording may be less intelligible, but it helps the learner get a glimpse of how the other language works. Now that our code more closely resembles ‘computer-speak’, it’s ready for the next step.
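If you would like to see this translation for yourself, a small function makes the assembly easier to follow than a whole program. The sketch below assumes a file called add.c, a made-up name for this example. The assembly gcc produces depends on your CPU and on the options you pass, so none is reproduced here.

```c
/*
 * add.c (hypothetical file name)
 *
 * Running "gcc -S add.c" stops after the compile stage and
 * writes the assembly code to add.s instead of building a program.
 */
int add(int a, int b)
{
    return (a + b);
}
```

Opening add.s in a text editor should show a short list of instructions for your processor, along with a label for the function add.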

3. Assembling

This step assembles the assembly code into binary machine code, stored in an object file. It is no longer legible to the human user, but quite useful for your computer.

4. Linking

This final step wraps up the compilation process. Any reference in the code to something defined externally, such as a function from the C library, is connected to the code that actually defines it. Finally, the object file and the external code it needs are combined into an executable, allowing the user to run it and enjoy the fruits of their labor.
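One way to see what the linker contributes is to call a library function without including its header. In the sketch below, puts is declared by hand with the same signature stdio.h gives it, and say_hello is a made-up name for this illustration. The compiler only learns that puts exists; the code behind it is found in the C library at link time.

```c
/*
 * Declared by hand instead of pulled in through #include <stdio.h>.
 * This matches the declaration the header would have supplied.
 */
int puts(const char *s);

/* say_hello is a made-up name for this sketch. */
int say_hello(void)
{
    /* The code for puts is not in this file; the linker supplies it. */
    return (puts("Hello from the C library"));
}
```

In real code you would still include the header, since it keeps the declaration in step with the library, but the sketch shows that the header and the library code are two separate things.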
