[Linux] Profiling: visualize program bottlenecks with Flamegraph

TechHara
Oct 30, 2023

Let’s say you have a program whose performance you want to improve. Probably the first thing to try is sampling the running program with perf or dtrace. That works, but the raw output is not easy to read. Today, let’s look at how to turn the sampled stack traces into an intuitive visualization using Flamegraph.

Prerequisite

First, I am going to assume we are running a Linux system. If you are running inside a Docker container, make sure the container has the privileges that perf requires. See this article for more details. We will need the perf utility, which you can install on Ubuntu with the following command:

sudo apt install linux-tools-common linux-tools-generic linux-tools-`uname -r`
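Before profiling, it can help to confirm that perf actually runs and to see how restrictive the kernel's perf settings are. The sketch below is one way to check. The function name check_perf_setup is made up for illustration, and the only thing it reads is /proc/sys/kernel/perf_event_paranoid, the kernel setting that controls how much unprivileged users may profile.

```shell
# Hypothetical helper: report whether perf runs and what the kernel allows.
check_perf_setup() {
    # Run perf rather than only looking it up on PATH: on Ubuntu, "perf" can be
    # a wrapper that still fails if the tools for the running kernel are missing.
    if perf --version >/dev/null 2>&1; then
        echo "perf: $(perf --version 2>/dev/null)"
    else
        echo "perf: not runnable (install the linux-tools packages for your kernel)"
    fi

    # Lower values are more permissive; -1 removes the restrictions.
    paranoid_file=/proc/sys/kernel/perf_event_paranoid
    if [ -r "$paranoid_file" ]; then
        echo "perf_event_paranoid: $(cat "$paranoid_file")"
    else
        echo "perf_event_paranoid: unavailable"
    fi
}

check_perf_setup
```

If perf record later fails with a permission error, the usual remedies are running it under sudo or lowering the value with sudo sysctl kernel.perf_event_paranoid=<value>.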

Second, we will discuss profiling a native app, that is, a program compiled to native machine code for your system rather than one running on a virtual machine (VM) of some sort. In technical terms, these are programs that are compiled ahead-of-time (AOT). For example, C/C++/Rust/Go programs use AOT compilers, while Java/C#/Python programs run on a VM, so they need dedicated tools for profiling.

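Finally, the program must include debug info in order to produce a helpful graph. For C/C++ programs, add the -g option when compiling. For Go programs, the default build will do. For Rust, add debug = true under the [profile.release] section of Cargo.toml. Note that perf record -g unwinds stacks using frame pointers by default, which optimized C/C++ and Rust builds often omit. If the resulting stacks look truncated, either build with frame pointers (for example, -fno-omit-frame-pointer with gcc or clang) or record with --call-graph dwarf instead of -g.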
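One way to check whether a binary carries debug info is to look for DWARF sections in it. This is a minimal sketch, assuming readelf (from binutils) is installed; the function name has_debug_info is hypothetical.

```shell
# Hypothetical helper: succeed if the ELF file "$1" contains a DWARF
# .debug_info section (named .zdebug_info by some older toolchains that
# compress debug sections).
has_debug_info() {
    readelf -S "$1" 2>/dev/null | grep -Eq '\.z?debug_info'
}

# Usage, once the binary has been built:
#   has_debug_info ./gunzip && echo "debug info present" || echo "no debug info"
```

The check fails for a missing file or a stripped binary, so a "no debug info" result is a hint to revisit the build flags above.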

Step by Step Guide

Alright. Let’s do an example with a Go program. We will use a simple gunzip program written in Go. Let’s download the source code and build the program.

# clone example source code
git clone https://github.com/TechHara/go_gunzip.git

# go into the source directory
cd go_gunzip

# compile to an executable ./gunzip
go build

This creates a gunzip executable in the current directory. Now, let’s run the program under the Linux perf tool. We will use linux.tar.gz as the example input file.

# download linux source code and compress as linux.tar.gz
wget https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.5.5.tar.xz -O - | xz -d | gzip > linux.tar.gz

# this is how you would run the program
./gunzip < linux.tar.gz > linux.tar

# this time, run while profiling
perf record -g ./gunzip < linux.tar.gz > linux.tar

# convert to trace output
perf script > trace.perf

Note that we must include ./ when running the program so that the shell runs the ./gunzip executable in the current directory rather than the gunzip installed on the system.

If everything ran successfully, we should now see a trace.perf file. If you want to inspect the stack traces in text form, you can do so with perf report, but a better way is to use Flamegraph.

To visualize the stack trace, we need to download Flamegraph from its repo and run a few more commands.

# clone Flamegraph repo
git clone https://github.com/brendangregg/FlameGraph.git

# collapse the stack trace
FlameGraph/stackcollapse-perf.pl trace.perf > trace.folded

# convert to svg format
FlameGraph/flamegraph.pl trace.folded > trace.svg

# open up in firefox
firefox trace.svg

This should open an interactive SVG page in the browser, where you can hover over a frame for details and click it to zoom in.

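Voila! The graph gives an intuitive visualization of the sampled stack traces. The vertical axis shows stack depth, with each box representing a function (stack frame) sitting on top of its caller. The width of a box is proportional to the number of samples in which that function appeared on the stack, that is, its share of the total profiled time. The left-to-right order is alphabetical and does not represent the passage of time. For this particular example program, we can see that main.Decode, runtime.memmove, and syscall.write take up the majority of the program's runtime. If you want to improve the runtime, you should probably start with the main.Decode function.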
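The same ranking can be read out of trace.folded as text. Each line in that file is a semicolon-separated stack followed by a sample count, so summing the counts by the last frame gives the samples spent in each function itself. The sketch below does this with awk; the function name top_self is made up for illustration.

```shell
# Hypothetical helper: rank functions by "self" samples in a folded-stack file.
# $1: folded file (e.g. trace.folded), $2: number of rows to print (default 10)
top_self() {
    awk '
    {
        count = $NF                                   # last field is the sample count
        stack = substr($0, 1, length($0) - length($NF) - 1)
        n = split(stack, frames, ";")                 # frames[n] is the innermost function
        self[frames[n]] += count
        total += count
    }
    END {
        for (f in self)
            printf "%d %.1f%% %s\n", self[f], 100 * self[f] / total, f
    }' "$1" | sort -k1,1nr | head -n "${2:-10}"
}

# Usage:
#   top_self trace.folded 5
```

Each output row is the sample count, its percentage of all samples, and the function name. These are the functions that form the widest plateaus along the top edge of the flame graph.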

For other programs, whether written in C, C++, Rust, or Go, the basic steps are the same. We just need to substitute the new program and its arguments.
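Since the steps do not change from program to program, they can be wrapped in a small shell function. This is only a sketch: the function name profile_to_svg and the FLAMEGRAPH_DIR variable are made up here, and it assumes the FlameGraph repo was cloned into ./FlameGraph as above unless FLAMEGRAPH_DIR says otherwise.

```shell
# Hypothetical wrapper around the steps above.
# Usage: profile_to_svg OUT.svg COMMAND [ARGS...]
profile_to_svg() {
    if [ "$#" -lt 2 ]; then
        echo "usage: profile_to_svg OUT.svg COMMAND [ARGS...]" >&2
        return 2
    fi
    svg_out=$1
    shift
    # Assumption: location of the cloned FlameGraph repo.
    fg_dir=${FLAMEGRAPH_DIR:-FlameGraph}

    # Record, then convert, collapse, and render in one pipeline.
    # -o/-i are explicit so that perf script reads the recorded file
    # rather than whatever is attached to stdin.
    perf record -g -o perf.data -- "$@" &&
        perf script -i perf.data |
        "$fg_dir/stackcollapse-perf.pl" |
        "$fg_dir/flamegraph.pl" > "$svg_out"
}

# Usage, with the same redirections as the gunzip example:
#   profile_to_svg trace.svg ./gunzip < linux.tar.gz > linux.tar
```

The profiled command inherits the function's stdin and stdout, so the redirections in the usage line reach ./gunzip just as they do when it is run directly.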

Even Better

OK, if you think this is too much work, there is an easier way. There is a Rust package that does the job for us. Assuming you have cargo installed on the system, you can run

# install flamegraph package
cargo install flamegraph

to install the package. Now, all you need is a single command to generate the flamegraph

# one liner with flamegraph crate
flamegraph --open --cmd "record -g" -- ./gunzip < linux.tar.gz > linux.tar

This one-liner profiles the program, saves the trace, collapses the stacks, generates the .svg file (flamegraph.svg by default), and opens it.
