Every compiler, including Go's, optimizes our code to some extent. This usually gives us a faster and often smaller binary.
However, I believe that sometimes you should take a look under the hood to understand how things work. It will surely step up your programming and debugging skills.
Recently I faced a really strange behavior while benchmarking one of the functions I wrote:
Although I was happy to think my code ran so fast, I suspected that something had gone wrong and that the test was misleading, so I started digging for an explanation.
Let’s start with a simple feature: Function Inlining.
What Does It Mean?
The compiler takes our function's body and substitutes it for each call to that function.
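To make the idea concrete, here is a hand-written illustration of what inlining conceptually does. It is a minimal sketch, not actual compiler output, and the names add, sumWithCall and sumInlined are hypothetical.

```go
package main

import "fmt"

// add is a small leaf function, the kind of candidate a compiler may inline.
func add(a, b int) int {
	return a + b
}

// sumWithCall is the code as we write it: it calls add.
func sumWithCall(x, y int) int {
	return add(x, y)
}

// sumInlined shows, by hand, what the same code looks like once the body
// of add has been pasted in place of the call.
func sumInlined(x, y int) int {
	return x + y
}

func main() {
	fmt.Println(sumWithCall(2, 3), sumInlined(2, 3)) // prints "5 5"
}
```

Both functions compute the same result; the inlined form simply skips the call.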
Why Inline At All?
A function call carries the overhead of creating a new stack frame, which generally includes:
- Return address
- Argument variables passed on the stack
- Local variables
- Modified registers
All of the above burden our program's execution with extra operations.
To make our application even faster, the compiler comes to the rescue and changes the code to include the function's body at each call site.
We need to remember that there are always strings attached: in our case, inlining results in a larger output binary, which has its own downsides.
Hence, most compilers define a threshold (which you can play with) that determines whether or not to inline a function.
From Go Wiki:
Only short and simple functions are inlined. To be inlined a function must contain less than ~40 expressions and does not contain complex things like function calls, loops, labels, closures, panic’s, recover’s, select’s, switch’es, etc.
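As one way of playing with the inlining decision in Go, the //go:noinline compiler directive asks the compiler not to inline a specific function, and building with -gcflags=-l disables inlining altogether. The following is a minimal sketch; double and doubleNoInline are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// double is short and simple, so the compiler may choose to inline it.
func double(x int) int {
	return 2 * x
}

// doubleNoInline does the same work, but the directive below asks the
// compiler never to inline calls to it.
//
//go:noinline
func doubleNoInline(x int) int {
	return 2 * x
}

func main() {
	fmt.Println(double(21), doubleNoInline(21)) // prints "42 42"
}
```

The two functions behave identically; only the generated code at their call sites may differ.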
Talk Is Cheap, Show Me The Code!
First, let's implement a really cool and efficient sine approximation, Bhaskara I's formula:
Passing -gcflags -m to go build may reveal some of the compiler's inner decisions:
Cool, the compiler inlined our function. Now we can dive deeper and inspect the build disassembly using the -S flag to see how our functions are translated, and disable the compiler optimizations using the -N flag to reduce uncertainty.
$ go build -gcflags '-S -N' main.go 2>&1
You can see the full output in your terminal, but for the sake of readability I'll filter it a bit so we can look only at the parts that matter to us:
The command output is not the final machine code. For example, we can see PCDATA directives, which carry information for the garbage collector. For our purposes, we can simply ignore them.
To understand how the optimization passes affect our code, I'll redo the previous step without disabling the compiler optimizations. Let's see what happens:
If you already checked the output, you probably noticed something strange: our main function is empty!
Why Does This Happen?
The compiler sees that the function call has no side effects (e.g., it neither calls any other third-party function nor changes a global variable) and that its result is never used, so it removes the call altogether.
If we change the code and add a global variable to hold the function’s result as seen in the following example:
The output of the main function will not be empty:
Where Can We Face It?
Unwanted compiler optimization commonly shows up in benchmark tests.
To prevent an optimization that could compromise our function's performance measurement, one can disable optimizations when running the tests, which, in my opinion, does not reflect reality.
I would like to show you another possible solution:
We started by playing a little with the Go build tools to understand a simple optimization, function inlining. Then we dived a little deeper to understand the compiled code and the effects of the optimization flags, and finished with a real-world example we may face in our next benchmark tests.
I hope you now have a better understanding of compiler optimization and some of the Go build tools, and, most importantly, the curiosity to explore other internals.