Firing on All Engines

Flame graphs for Java Programs

Michael Hunger
Oct 17 · 3 min read

Traditional Java profilers use either bytecode instrumentation or sampling (taking stack traces at short intervals) to determine where time was spent. Both approaches add their own skews and oddities. Understanding the output of those profilers is an art of its own and requires considerable experience.
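To make the sampling approach concrete, here is a minimal sketch of a sampler built on `Thread.getStackTrace()`. The class name `NaiveSampler`, the semicolon-separated "folded" line format, and the worker loop are assumptions chosen for illustration, not how any particular profiler is implemented. A sampler like this only observes threads at JVM safepoints, which is one source of the skew mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a naive sampler, not the implementation of any real profiler.
final class NaiveSampler {

    // Turns one stack trace into a single "folded" line: root frame first, leaf frame last.
    static String fold(StackTraceElement[] trace) {
        StringBuilder sb = new StringBuilder();
        // getStackTrace() lists the most recent (leaf) frame first, so walk backwards.
        for (int i = trace.length - 1; i >= 0; i--) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(trace[i].getClassName()).append('.').append(trace[i].getMethodName());
        }
        return sb.toString();
    }

    // Samples the target thread `samples` times, sleeping `intervalMillis` between samples,
    // and counts how often each distinct folded stack was seen.
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis)
            throws InterruptedException {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] trace = target.getStackTrace();
            if (trace.length > 0) {
                counts.merge(fold(trace), 1, Integer::sum);
            }
            Thread.sleep(intervalMillis);
        }
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical workload: keep appending to a list until interrupted.
        Thread worker = new Thread(() -> {
            List<Integer> list = new ArrayList<>();
            while (!Thread.currentThread().isInterrupted()) {
                list.add(list.size());
                if (list.size() > 1_000_000) {
                    list = new ArrayList<>();
                }
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();

        sample(worker, 200, 5).forEach((stack, n) -> System.out.println(stack + " " + n));
        worker.interrupt();
    }
}
```

Each output line is one distinct stack followed by the number of samples in which it was seen; the exact stacks and counts vary from run to run.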

Fortunately, Brendan Gregg, a performance engineer at Netflix, came up with flame graphs (see http://www.brendangregg.com/flamegraphs.html), an ingenious kind of diagram for stack traces that can be gathered from almost any system.

A flame graph sorts and aggregates the traces at each stack level, so that the sample count at each level represents the share of the total time spent in that part of the code. Rendering those counts as rectangles whose widths are proportional to that share, and stacking the rectangles on top of each other, turned out to be a very useful visualization. (Note that the left-to-right order has no significance; often it is just alphabetical. The same is true for colors. Only the relative widths and stack depths are relevant.)
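The aggregation can be sketched as a small tree in which every node counts the samples that passed through it. The class `FlameTree`, the semicolon-separated input format, and the sample numbers below are made up for illustration; real tools differ in the details.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: aggregates stack samples into a tree of frames.
final class FlameTree {
    final String name;
    long samples;                                              // samples that passed through this frame
    final Map<String, FlameTree> children = new TreeMap<>();   // alphabetical order, no deeper meaning

    FlameTree(String name) {
        this.name = name;
    }

    // Adds one stack ("root;child;leaf") that was observed `count` times.
    void add(String foldedStack, long count) {
        samples += count;
        FlameTree node = this;
        for (String frame : foldedStack.split(";")) {
            node = node.children.computeIfAbsent(frame, FlameTree::new);
            node.samples += count;
        }
    }

    // Width of this frame relative to the whole graph, in percent.
    double widthPercent(FlameTree root) {
        return 100.0 * samples / root.samples;
    }

    // Prints the tree as an indented outline, one frame per line.
    void print(FlameTree root, int depth) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        System.out.printf("%s%s %.1f%%%n", indent, name, widthPercent(root));
        for (FlameTree child : children.values()) {
            child.print(root, depth + 1);
        }
    }

    public static void main(String[] args) {
        FlameTree root = new FlameTree("all");
        // Made-up sample counts, not measurements.
        root.add("main;fill;ArrayList.add;ArrayList.grow", 60);
        root.add("main;fill;ArrayList.add", 30);
        root.add("main;report", 10);
        root.print(root, 0);
    }
}
```

A renderer would draw each node as a rectangle of width `widthPercent` placed on top of its parent; the indented printout here stands in for that drawing step.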

Flame graph of a benchmark that fills a non-preallocated ArrayList

From bottom to top, the “flames” trace the progression from the entry point of the program or thread (main or an event loop) to the leaves of the execution at the tips of the flames.

You can immediately see whether certain parts of the program take an unexpectedly large amount of time. The higher up in the diagram that happens, the worse. In particular, a flame that is very wide at the top indicates a bottleneck: code that spends the time itself rather than delegating work elsewhere. After fixing the issue, measure again; if the overall performance problem persists, revisit the diagram for new indications.
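A wide tip means many samples ended in the same leaf frame. As a rough sketch, assuming stacks are available as semicolon-separated lines with a sample count (the class `WideTips` and the numbers below are hypothetical), the leaf frames can be ranked like this:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: ranks leaf frames by the number of samples that ended in them.
final class WideTips {

    // Sums samples by the leaf (last) frame of each "root;child;leaf" stack.
    static Map<String, Long> selfSamples(Map<String, Long> foldedStacks) {
        Map<String, Long> self = new HashMap<>();
        for (Map.Entry<String, Long> e : foldedStacks.entrySet()) {
            String stack = e.getKey();
            String leaf = stack.substring(stack.lastIndexOf(';') + 1);
            self.merge(leaf, e.getValue(), Long::sum);
        }
        return self;
    }

    // Leaf frames ordered from most to fewest samples.
    static List<Map.Entry<String, Long>> widestFirst(Map<String, Long> self) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(self.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Long> stacks = new HashMap<>();
        // Made-up sample counts, not measurements.
        stacks.put("main;fill;ArrayList.add;ArrayList.grow", 60L);
        stacks.put("main;fill;ArrayList.add", 30L);
        stacks.put("main;report", 10L);
        for (Map.Entry<String, Long> e : widestFirst(selfSamples(stacks))) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Note that this sums a method's samples across all flames in which it is the leaf, so it is a coarser view than the graph itself, where the same method can appear as several separate tips.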

To address the shortcomings of traditional profilers, many modern tools make use of an internal JVM feature (AsyncGetCallTrace) that allows stack traces to be gathered outside of safepoints. Additionally, they combine measurements of JVM operations with native code and operating system calls, so that time spent in network, I/O, or garbage collection can become part of the flame graph as well.

Tools like Honest Profiler, perf-map-agent, async-profiler, and even IntelliJ IDEA make capturing the information and generating flame graphs really easy.

In most cases you just download the tool, provide the PID of your Java process, and tell the tool to run for a certain amount of time and generate the interactive SVG.

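# download and unzip async-profiler for your OS from:
# https://github.com/jvm-profiling-tools/async-profiler
# -d: duration in seconds, -f: output file, -s: simple class names, -o svg: SVG output
./profiler.sh -d <duration> -f flamegraph.svg -s -o svg <pid> && \
open flamegraph.svg -a "Google Chrome"   # "open -a" is macOS-specific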
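To try the command, you need a running Java process and its PID. The following is a hypothetical stand-in workload (the class `FillListDemo` and its parameters are assumptions, not the benchmark behind the figure above): it prints its own PID using `ProcessHandle`, which requires Java 9 or later, and then repeatedly fills an `ArrayList` without preallocating it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a small workload to attach a profiler to.
final class FillListDemo {

    // Fills a list with `size` integers, optionally sizing the backing array up front.
    static List<Integer> fill(int size, boolean preallocate) {
        List<Integer> list;
        if (preallocate) {
            list = new ArrayList<Integer>(size);
        } else {
            list = new ArrayList<Integer>();   // grows and copies its array as needed
        }
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        // Run time in seconds; defaults to 10 if no argument is given.
        long seconds = args.length > 0 ? Long.parseLong(args[0]) : 10;
        System.out.println("PID: " + ProcessHandle.current().pid());

        long checksum = 0;
        long end = System.nanoTime() + seconds * 1_000_000_000L;
        while (System.nanoTime() < end) {
            checksum += fill(1_000_000, false).size();
        }
        // Print the checksum so the work has an observable result.
        System.out.println("checksum: " + checksum);
    }
}
```

Start it with a run time longer than the profiling duration, for example `java FillListDemo 60`, and pass the printed PID to the profiler. Switching the second argument of `fill` to `true` gives a preallocated variant to compare against.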

The SVG that the tools produce is not just colorful but also interactive. You can zoom into sections, search for symbols, and more.

Flame graphs are an impressively powerful tool for quickly getting an overview of the performance characteristics of your programs. You can see hotspots immediately and focus on those. Including non-JVM aspects also helps with the bigger picture.

Written by Michael Hunger

A software developer passionate about teaching and learning. Currently working with Neo4j, GraphQL, Kotlin, ML/AI, Micronaut, Spring, Kafka, and more.

97 Things

