Benchmarking Is Hard — JMH Helps

Michael Hunger
Published in 97 Things
Sep 9, 2019

Benchmarking on the JVM, especially microbenchmarking, is hard. It’s not enough to wrap a nanosecond timer around a call or a loop and be done. You have to take into account warm-up, HotSpot compilation, code optimizations like inlining and dead-code elimination, multithreading, consistency of measurement, and more.
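
For contrast, here is the kind of hand-rolled timing that falls into exactly those traps (illustrative only, not from the article):

import java.util.ArrayList;
import java.util.List;

public class NaiveTiming {
    public static void main(String[] args) {
        long start = System.nanoTime();
        List<Boolean> list = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            list.add(Boolean.TRUE);
        }
        long elapsed = System.nanoTime() - start;
        // One cold, unwarmed run: the JIT may not have compiled the loop yet,
        // and because 'list' is never used afterwards, the work could be
        // eliminated as dead code entirely.
        System.out.println(elapsed + " ns");
    }
}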

Fortunately, Aleksey Shipilëv, the author of many great JVM tools, contributed JMH, the Java Microbenchmark Harness, to the OpenJDK. It consists of a small library and a build system plugin. The library provides annotations and utilities to declare your benchmarks as annotated Java classes and methods, including a Blackhole class that consumes generated values to prevent dead-code elimination. The library also handles state correctly in the presence of multithreading.
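
The Blackhole is handed to a benchmark method as a parameter; a small sketch (class, field, and method names are made up for illustration):

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread) // each benchmark thread gets its own copy of this state
public class BlackholeSketch {

    int seed = 42;

    @Benchmark
    public void compute(Blackhole hole) {
        // Consuming the result keeps the JIT from treating the
        // computation as dead code and optimizing it away.
        hole.consume(seed * 31);
    }
}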

The build system plugin generates a JAR with the relevant infrastructure code for running and measuring the tests correctly. That includes dedicated warm-up phases, proper multithreading, running multiple forks and averaging across them, and much more.
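
Many of those settings can also be declared on the benchmark class itself through annotations. A minimal sketch, assuming JMH's standard annotations (the class name is made up; the values mirror the run shown below):

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.Throughput)        // report operations per unit of time
@OutputTimeUnit(TimeUnit.SECONDS)      // i.e. ops/s
@Warmup(iterations = 5, time = 1)      // 5 warm-up iterations of 1 s each
@Measurement(iterations = 5, time = 1) // 5 measured iterations of 1 s each
@Fork(5)                               // 5 forked JVMs; results averaged across them
public class ConfiguredBenchmark {

    @Benchmark
    public int placeholder() {
        return 1 + 1; // placeholder body; JMH consumes the return value
    }
}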

The tool also prints important advice on how to interpret the gathered data and on its limitations. Here is an example that measures the impact of pre-sizing collections:

package com.example;

import java.util.ArrayList;
import java.util.List;

import org.openjdk.jmh.annotations.Benchmark;

public class MyBenchmark {

    static final int COUNT = 10000;

    // Baseline: the list starts at the default capacity and grows while filling.
    @Benchmark
    public List<Boolean> testFillEmptyList() {
        List<Boolean> list = new ArrayList<>();
        for (int i = 0; i < COUNT; i++) {
            list.add(Boolean.TRUE);
        }
        // Returning the list lets JMH consume it, preventing dead-code elimination.
        return list;
    }

    // Pre-sized: the list is allocated with its final capacity up front.
    @Benchmark
    public List<Boolean> testFillAllocatedList() {
        List<Boolean> list = new ArrayList<>(COUNT);
        for (int i = 0; i < COUNT; i++) {
            list.add(Boolean.TRUE);
        }
        return list;
    }
}
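
As a side note, the fixed COUNT constant could instead become a JMH parameter, so the same benchmark runs across several sizes in one go. A sketch using @Param (class name and sizes are arbitrary):

import java.util.ArrayList;
import java.util.List;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class ParameterizedBenchmark {

    @Param({"100", "10000", "1000000"}) // JMH runs the benchmark once per size
    int count;

    @Benchmark
    public List<Boolean> fillAllocatedList() {
        List<Boolean> list = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            list.add(Boolean.TRUE);
        }
        return list;
    }
}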

To generate the project and run it, you can use the JMH Maven archetype:

mvn archetype:generate \
-DarchetypeGroupId=org.openjdk.jmh \
-DarchetypeArtifactId=jmh-java-benchmark-archetype \
-DinteractiveMode=false -DgroupId=com.example \
-DartifactId=coll-test -Dversion=1.0
cd coll-test
# add com/example/MyBenchmark.java
mvn clean install
java -jar target/benchmarks.jar -w 1 -r 1
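
The -w 1 and -r 1 flags shorten the warm-up and measurement time to one second per iteration. The run then prints something like:

...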
# JMH version: 1.21
...
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.example.MyBenchmark.testFillEmptyList
...Result "com.example.MyBenchmark.testFillEmptyList":
30966.686 ±(99.9%) 2636.125 ops/s [Average]
(min, avg, max) = (18885.422, 30966.686, 35612.643), stdev = 3519.152
CI (99.9%): [28330.561, 33602.811] (assumes normal distribution)
# Run complete. Total time: 00:01:45REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.
Benchmark Mode Cnt Score Error Units
MyBenchmark.testFillAllocatedList thrpt 25 56786.708 ± 1609.633 ops/s
MyBenchmark.testFillEmptyList thrpt 25 30966.686 ± 2636.125 ops/s

So we see that filling the pre-sized collection is almost twice as fast as filling the default one, because its backing array never has to be resized while elements are added.
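
That matches what ArrayList does under the hood: in OpenJDK it grows its backing array by about 50% whenever it fills up, copying all existing elements each time, so reaching 10,000 elements from the default capacity of 10 takes roughly 18 reallocations, all of which the pre-sized list avoids.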

JMH is a powerful tool in your toolbox for writing correct microbenchmarks. Results gathered in the same environment are comparable with one another, and such comparisons should be the main way of interpreting them. Because the results are stable and repeatable, the benchmarks also make a good basis for profiling. Aleksey has much more to say about the topic if you’re interested.
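
For example, running the generated JAR with JMH’s built-in GC profiler adds allocation statistics next to each score, which would directly confirm the resizing explanation above:

java -jar target/benchmarks.jar -prof gc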
