You Down with GCC?
What happens when you type gcc main.c
When coding in C, it is easy to think of gcc as a single command that feeds your source code into the compiler, with your executable shot out the other side. The real picture has more steps: your source passes through a pre-processor, a compiler, an assembler, and a linker before the executable appears.
This shouldn’t be surprising. Compilers, like other large programs, are usually split into parts, because these smaller parts can be optimized for their specific roles and are easier to design and maintain. The gcc command can take your code through all of the parts, or it can stop at any point along the way and show you what it has done, depending on which option you pass. Let’s take a walk through gcc.
gcc -E file.c
1. Pre-processor: The above command shows you the result of running your source file through the first step, the pre-processor. The pre-processor is a macro processor: it expands any macros you have used in your program, and it also handles directives such as #include and #if. It replaces each comment with a single space and joins continued lines (those ending in a backslash) into a single line. Internally it works on a series of pre-processing tokens, and the expanded result is what the compiler receives. With -E, that result is printed as text to standard output.
gcc -S file.c
2. Compiler: The above command stops after passing the file through the pre-processor and the compiler. The compiler takes your pre-processed file and turns it into assembly instructions, written to file.s by default. These instructions differ depending on the target processor. On x86 targets, gcc emits AT&T syntax by default; add the -masm=intel option, as in “gcc -S -masm=intel file.c”, to get Intel syntax instead.
gcc -c file.c
3. Assembler: The above command stops after taking the source file through the two steps above and then through the assembler. The assembler takes the assembly instructions and turns them into machine code, stored in an object file (file.o by default).
gcc file.c
4. Linker: The above command has no extra options and takes your file through all the stages, including the final one, the linker. It produces a file named a.out, which is your executable. Usually you want to name your executable instead of using the default a.out, so you would write the command as: gcc -o file file.c. Whatever name you put after the “-o” option becomes the name of the executable. The machine code in an object file may refer to functions or variables that are defined elsewhere, and the linker fills in these gaps. If one function calls another, the linker ensures that the call is connected to the right definition. It also adds the pieces of any libraries that your code relies on, and it sets up the program’s entry point so that execution ends up in your main function.
These are the four main steps in getting from your source code to your final executable. There are usually one or more “optimizers” in the process as well, but more on these in another post.