Go: Overview of the Compiler

Vincent Blanchon
Sep 7, 2019 · 5 min read
Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

ℹ️ This article is based on Go 1.13.

The Go compiler is an important tool in the Go ecosystem, since it is an essential step in building our programs into executable binaries. The compiler has come a long way: it was originally written in C before being moved to Go, and many optimizations and cleanups will keep happening in the future. Let’s take a high-level look at how it operates.

Phases

The Go compiler is composed of four phases that can be grouped into two categories:

  • frontend. This phase analyzes the source code and produces an abstract syntax tree (AST), a structured representation of that code.
  • backend. This phase transforms that representation into machine code, applying optimizations along the way.
[Image: compiler documentation]

In order to better understand each phase, let’s use a simple program:
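The listing itself did not survive in this copy of the article. As a stand-in, here is a minimal program consistent with what the following sections describe: a function add, two variables a and b, an if on the constant true, and a call to println. Treat it as an assumption rather than the exact original:

```go
package main

func main() {
	a := 1
	b := 2
	// The condition is a constant, which lets the compiler drop the branch later.
	if true {
		add(a, b)
	}
}

// add prints the sum of its arguments with the builtin println,
// which writes to standard error.
func add(a, b int) {
	println(a + b)
}
```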

Parsing

The first phase is pretty straightforward and well explained in the documentation:

In the first phase of compilation, source code is tokenized (lexical analysis), parsed (syntax analysis), and a syntax tree is constructed for each source file.

The lexer runs first and tokenizes the source code. Here is the tokenized output of the previous example:

[Image: Go source code tokenized]

Once tokenized, the source is parsed and used to build a syntax tree.
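To get a feel for tokenization without touching the compiler internals, we can use the standard library package go/scanner. This is a separate implementation from the lexer inside the compiler, so the sketch below only illustrates the idea; the helper name tokenize is hypothetical:

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenize returns the token kinds found in src, in source order.
// It relies on go/scanner, not on the compiler's internal lexer.
func tokenize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("main.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)

	var kinds []string
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
		kinds = append(kinds, tok.String())
	}
	return kinds
}

func main() {
	for _, kind := range tokenize("a := 1") {
		fmt.Println(kind)
	}
}
```

The scanner also reports the semicolons that Go inserts automatically at line ends, so the list can contain more tokens than are visible in the source.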

AST transformation

The transformation to an Abstract Syntax Tree can be displayed thanks to the command go tool compile with the flag -W:

[Image: sample of the generated AST]
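The tree dumped by the compiler is an internal structure, but the standard library exposes a comparable one through go/parser and go/ast. As a rough illustration only (these packages are not what the compiler uses internally, and calledFunctions is a hypothetical helper), here is how we could walk a syntax tree and list the function calls it contains:

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// calledFunctions parses src and returns the names of the functions it calls,
// in the order the calls appear in the source.
func calledFunctions(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "main.go", src, 0)
	if err != nil {
		return nil, err
	}

	var names []string
	ast.Inspect(file, func(n ast.Node) bool {
		// Only keep calls whose target is a plain identifier, such as add(...).
		if call, ok := n.(*ast.CallExpr); ok {
			if id, ok := call.Fun.(*ast.Ident); ok {
				names = append(names, id.Name)
			}
		}
		return true
	})
	return names, nil
}

func main() {
	src := `package main

func main() {
	add(1, 2)
}

func add(a, b int) {
	println(a + b)
}
`
	names, err := calledFunctions(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names)
}
```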

This phase also includes optimizations like inlining. In our example, the function add has already been inlined, since we do not see any CALLFUNC instruction targeting add. Let’s run the command again with the flag -l, which disables inlining, to see the call to add reappear.

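The -l flag applies to the whole compilation. To keep a single function from being inlined, Go also offers the //go:noinline directive. A minimal sketch, where add returns its result instead of printing it (a change made here only to keep the example short):

```go
package main

import "fmt"

// add is small enough that the compiler would normally consider inlining it.
// The directive below asks the compiler not to inline this function.
//
//go:noinline
func add(a, b int) int {
	return a + b
}

func main() {
	fmt.Println(add(1, 2))
}
```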

Once the AST is generated, the compiler can move to a lower-level intermediate representation: the SSA form.

SSA generation

The Static Single Assignment (SSA) form is where the optimizations happen: dead code elimination, removal of unused branches, replacement of some expressions with constant values, etc.

The SSA code can be dumped with the command GOSSAFUNC=main go tool compile main.go && open ssa.html, which produces an HTML document with all the different passes performed in the SSA package:

[Image: SSA passes]

The generated SSA is shown in the “start” tab:

[Image: SSA code]

The variables a and b are highlighted here, along with the if condition, so that we can later see how those lines change. The code also shows how the compiler handles the println function, which is decomposed into four steps: printlock, printint, printnl, printunlock. The compiler automatically adds a lock for us and, according to the type of the argument, calls the matching function to print it correctly.
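The four steps can be mimicked in user code to make the sequence concrete. The real helpers live inside the runtime and are not exported, so the printer type below and its methods are hypothetical stand-ins that only borrow their names:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
	"sync"
)

// printer is a user-level imitation of the runtime's print helpers.
// It buffers its output so the result is easy to inspect.
type printer struct {
	mu  sync.Mutex
	out strings.Builder
}

func (p *printer) printlock()     { p.mu.Lock() }
func (p *printer) printunlock()   { p.mu.Unlock() }
func (p *printer) printint(v int) { p.out.WriteString(strconv.Itoa(v)) }
func (p *printer) printnl()       { p.out.WriteByte('\n') }

// printlnInt follows the same four-step sequence the article describes
// for println with an integer argument.
func (p *printer) printlnInt(v int) {
	p.printlock()
	p.printint(v)
	p.printnl()
	p.printunlock()
}

func main() {
	var p printer
	p.printlnInt(3)
	// The builtin println writes to standard error, so we do the same here.
	fmt.Fprint(os.Stderr, p.out.String())
}
```

The lock keeps the pieces of one println call together when several goroutines print at the same time.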

In our example, since a and b are known at compile time, the compiler can calculate the final result and mark the variables as no longer necessary. The opt pass optimizes this part:

[Image: SSA code, “opt” pass]

v11 has been replaced here by the result of the addition of v4 and v5, which have been marked as dead code. The opt deadcode pass will then remove that code:

[Image: SSA code, “opt deadcode” pass]
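To make these two passes more tangible, here is a toy model of them. It is a simplified sketch and not the compiler's implementation: the value type, the fold and deadcode functions, and the Print operation are all hypothetical, and only the identifiers v4, v5 and v11 are borrowed from the dump above.

```go
package main

import "fmt"

// value is a toy SSA value: each one is assigned exactly once.
type value struct {
	id   int
	op   string   // "Const64", "Add64" or the hypothetical "Print"
	aux  int64    // constant payload, used by Const64
	args []*value // operands
}

// fold rewrites an Add64 of two constants into a single Const64,
// which leaves the original operands without any user.
func fold(vals []*value) {
	for _, v := range vals {
		if v.op == "Add64" && len(v.args) == 2 &&
			v.args[0].op == "Const64" && v.args[1].op == "Const64" {
			v.aux = v.args[0].aux + v.args[1].aux
			v.op = "Const64"
			v.args = nil
		}
	}
}

// deadcode keeps only the values reachable from a value with side effects.
func deadcode(vals []*value) []*value {
	live := map[*value]bool{}
	var mark func(v *value)
	mark = func(v *value) {
		if live[v] {
			return
		}
		live[v] = true
		for _, a := range v.args {
			mark(a)
		}
	}
	for _, v := range vals {
		if v.op == "Print" {
			mark(v)
		}
	}
	var kept []*value
	for _, v := range vals {
		if live[v] {
			kept = append(kept, v)
		}
	}
	return kept
}

// example builds the toy equivalent of println(1 + 2).
func example() []*value {
	v4 := &value{id: 4, op: "Const64", aux: 1}
	v5 := &value{id: 5, op: "Const64", aux: 2}
	v11 := &value{id: 11, op: "Add64", args: []*value{v4, v5}}
	out := &value{id: 12, op: "Print", args: []*value{v11}} // id 12 is made up
	return []*value{v4, v5, v11, out}
}

func main() {
	vals := example()
	fold(vals)
	vals = deadcode(vals)
	for _, v := range vals {
		fmt.Printf("v%d = %s", v.id, v.op)
		if v.op == "Const64" {
			fmt.Printf(" [%d]", v.aux)
		}
		for _, a := range v.args {
			fmt.Printf(" v%d", a.id)
		}
		fmt.Println()
	}
}
```

After fold, v11 holds the constant directly and no longer refers to v4 and v5, so deadcode has nothing left that keeps them alive.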

Regarding the if condition, the opt pass marks the constant true as dead code, and it is then removed:

[Image: constant boolean is removed]

Then, another pass will simplify the control flow by marking the unnecessary block and condition as invalid. Those blocks will later be removed by another pass dedicated to the dead code:

[Image: unnecessary control flow is removed]

Once all the passes are done, the Go compiler will now generate an intermediate assembly code:

[Image: Go asm code]

The next phase will generate the machine code into the binary file.

Machine code generation

The last step of the compiler is the generation of the object file, main.o in our example. This file can now be disassembled with the objdump tool, which does the reverse process. Here is a nice diagram created by Grant Seltzer Richman:

[Image: go tool compile]
[Image: go tool objdump]

You can find more information about the object file and binaries in “Dissecting Go Binaries.”

Once the object file is generated, it can now be passed directly to the linker with the command go tool link and your binary will finally be ready.
