Java 8 vs Scala — Part III Trust no one, bench everything

Published in

Zappos Engineering

6 min read · Nov 21, 2015

This is part 3 of the article. Check out Part 1 and Part 2.

In Part 2, you saw that the Scala approach is generally shorter than the Java approach when you want to manipulate either collections or streams. In this part, I will show you which one performs better. The framework that we are going to use for benchmarking is JMH. JMH is a Java harness for building, running, and analyzing nano/micro/milli/macro benchmarks written in Java. It uses an annotation-based approach to generate the code for a benchmark. According to JEP 230, a JMH-based microbenchmark suite is planned for the JDK itself, and I expect it in Java 9. I don’t know how it will work there just yet. It might be different from the one that we are using here, and it’s supposed to be easier to configure.

Why don’t we just use System.nanoTime() instead of a benchmark framework? We can, but the result will not be accurate because of JVM optimizations. Several factors can distort a hand-rolled measurement, such as the warm-up phase, dead-code elimination, profiling, and …hmmm… I’m sure there are a lot more :).
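To make the problem concrete, here is a minimal sketch of the hand-rolled approach. It is not part of the benchmark project; the class name NaiveTiming and its methods are made up for illustration, and plain integers stand in for pet weights. It shows two things a harness like JMH handles for you: warming the code up before measuring, and keeping the computed result alive so the JIT cannot discard the work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-rolled timer; names are made up for illustration.
final class NaiveTiming {

    // The workload: count the numbers greater than 50.
    static long countAbove50(List<Integer> weights) {
        return weights.stream().filter(w -> w > 50).count();
    }

    // Runs the workload `runs` times and returns the average time per run in nanoseconds.
    // The results are summed into `sink` and checked afterwards so the calls are not dead code.
    static double averageNanos(List<Integer> weights, int runs) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += countAbove50(weights);
        }
        long elapsed = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("never happens; only keeps sink in use");
        }
        return (double) elapsed / runs;
    }

    public static void main(String[] args) {
        List<Integer> weights = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            weights.add(i); // 0..99
        }

        // The first batch mostly measures cold code; the later batch measures warmed-up code.
        System.out.println("cold: " + averageNanos(weights, 1_000) + " ns/op");
        averageNanos(weights, 100_000); // manual warm-up
        System.out.println("warm: " + averageNanos(weights, 1_000) + " ns/op");
    }
}
```

The two printed numbers will usually differ a lot from run to run and from machine to machine, which is exactly why a single nanoTime measurement is hard to trust.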

Even though the recommended way to run JMH benchmarks is with Maven, setting up a Maven project for Scala and JMH is kind of hacky (I could be wrong). Luckily, there is an SBT plugin for that: sbt-jmh. So, we’re going to use SBT, which is a build tool like Maven but works well with both Scala and Java. In addition, SBT’s configuration is much more concise than Maven’s, which is XML-based.

Let’s get started!

Let’s start by setting up an SBT project for the benchmark. The sbt-jmh plugin has an activator template named sbt-jmh-seed that you can use. An activator template is similar to a Maven archetype. You can download the activator tool here.

Create a new project from the sbt-jmh-seed template by running the following command.

activator new PROJECT_NAME sbt-jmh-seed

After you run the above command, the activator tool (a thin wrapper around SBT) will generate a bunch of files. One of them is build.sbt, which is the main configuration file, like pom.xml in Maven.

The content of the build.sbt file will look like the following.

name := """java8-scala-benchmark"""

version := "1.0-SNAPSHOT"

libraryDependencies ++= Seq(
    // "group" % "artifact" % "version"
)

enablePlugins(JmhPlugin)

There is one more file that is as important as build.sbt: project/plugins.sbt, which contains the plugin configuration.

You know what? Forget what I said above. You don’t have to do that from scratch. If you want to run the benchmark yourself, in case you think that I may be making up the numbers :D, just go here and clone the project to your machine. You will get everything you need for running the benchmark. Before running it, make sure you disable any settings in your operating system that may affect the CPU frequency. The benchmark takes quite a long time to finish, and your operating system may adjust the CPU frequency when there is no user activity for a certain period, so you should turn that feature off. In addition, you should close as many applications as possible (especially antivirus software) and shouldn’t use the computer while the benchmark is running.

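import org.openjdk.jmh.annotations._
// Assumes an implicit Java-to-Scala collection conversion is in scope so that
// toList works on the Java list, e.g. scala.collection.JavaConversions._ (Scala 2.11).

object ScalaStates {
    @State(Scope.Benchmark)
    class BenchmarkState {
        var pets: List[Pet] = Pets.DATA.toList
    }
}

class ScalaCollectionBenchmark {
    import ScalaStates._

    @Benchmark
    def runFilterThenCount(state: BenchmarkState): Int = {
        state.pets.count { pet => pet.getWeight > 50 }
    }

    @Benchmark
    def runSortThenCollect(state: BenchmarkState): List[Pet] = {
        state.pets.sortBy { pet => (pet.getType, pet.getName) }
    }
    …
}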

In the Scala code above, we have a class holding the data that we are going to use inside the benchmark methods. You should declare the variable as an instance field and annotate the class with @State. Otherwise, it will be subject to constant folding. Note that each benchmark method should either return a value or consume a value using Blackhole to prevent dead-code elimination. You can inject a Blackhole object as a method parameter and call its consume method manually, like in the following snippet. However, I prefer the return approach (consuming a value implicitly) to using the Blackhole object explicitly because it’s much cleaner.

// Using the Blackhole object explicitly (import org.openjdk.jmh.infra.Blackhole).
@Benchmark
def runFilterThenCount(state: BenchmarkState, bh: Blackhole): Unit = {
    bh.consume(state.pets.count { pet => pet.getWeight > 50 })
}

If you don’t know Scala, you may wonder where the return statement is. In Scala, the value of the last expression in a method body is returned implicitly. The return type is specified after the method parameters, following the colon (it’s optional, though; Scala can infer the type from the method body). Unit is similar to void in Java, but it’s a type, not a keyword.

Now let’s take a look at the Java counterpart.

import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import org.openjdk.jmh.annotations.*;

public class JavaStreamBenchmark {
    @State(Scope.Benchmark)
    public static class BenchmarkState {
        volatile List<Pet> pets = Pets.DATA;
    }

    @Benchmark
    public long runFilterThenCount(BenchmarkState state) {
        return state.pets.stream()
            .filter(pet -> pet.getWeight() > 50)
            .count();
    }

    @Benchmark
    public List<Pet> runSortThenCollect(BenchmarkState state) {
        return state.pets.stream()
            .sorted(Comparator.comparing(Pet::getType)
                .thenComparing(Pet::getName))
            .collect(Collectors.toList());
    }
    ...
}

The Java Streams API doesn’t mutate state, so we don’t have to worry about re-populating the test data for each method invocation. But what should we do if we want to benchmark a method that changes the test data? We can use a fixture to prepare the test data based on the method’s behavior.

The following snippet shows how to set up a fixture using the @Setup annotation. There are 3 levels available for @Setup, and the default level is Trial.
Level.Trial: executes before the entire benchmark run
Level.Iteration: executes before each benchmark iteration
Level.Invocation: executes before each benchmark method invocation (use with care; see the Javadoc for more information)

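import java.util.ArrayList;
import java.util.List;
import org.openjdk.jmh.annotations.*;

public class JavaCollectionBenchmark {
    @State(Scope.Thread)
    public static class ThreadState {
        List<Pet> pets;

        // Rebuild the list before every invocation because removeIf mutates it.
        @Setup(Level.Invocation)
        public void setup() {
            pets = new ArrayList<>(Pets.DATA);
        }
    }

    @Benchmark
    public long runFilterThenCountWithRemoveIf(ThreadState state) {
        // removeIf drops the pets heavier than 50 in place; size() counts the rest.
        state.pets.removeIf(pet -> pet.getWeight() > 50);
        return state.pets.size();
    }
    …
}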

In Part 2, we discussed the difference between the Collections API and the Streams API in Java. The Collections API mutates state while the Streams API doesn’t. So, the benchmark method above requires Level.Invocation because removeIf changes the test data every time it executes.
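The effect is easy to see without JMH. The sketch below is plain Java with made-up names (MutationDemo, DATA) and integers standing in for pet weights, so it is an illustration rather than code from the benchmark project. It shows that a stream can be run repeatedly over the same list, while removeIf leaves nothing to remove on the second call unless the list is rebuilt first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Plain-Java sketch (no JMH); integers stand in for the article's pet weights.
final class MutationDemo {

    static final List<Integer> DATA =
        Collections.unmodifiableList(Arrays.asList(10, 60, 30, 80, 55));

    // Streams leave the source untouched, so the same list can be reused.
    static long countHeavyWithStream(List<Integer> weights) {
        return weights.stream().filter(w -> w > 50).count();
    }

    // removeIf edits the list in place, so we report how many elements are left.
    static int removeHeavyThenSize(List<Integer> weights) {
        weights.removeIf(w -> w > 50);
        return weights.size();
    }

    public static void main(String[] args) {
        List<Integer> shared = new ArrayList<>(DATA);
        System.out.println(countHeavyWithStream(shared)); // 3
        System.out.println(countHeavyWithStream(shared)); // 3 again, list unchanged
        System.out.println(removeHeavyThenSize(shared));  // 2, shared now holds 2 elements
        System.out.println(countHeavyWithStream(shared)); // 0, the heavy elements are gone

        // A fresh copy per call, which is what @Setup(Level.Invocation) arranges.
        System.out.println(removeHeavyThenSize(new ArrayList<>(DATA))); // 2
    }
}
```

Without the fresh copy, every invocation after the first would scan an already-filtered list, and the benchmark would measure a pass that removes nothing.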

Machine spec: CPU Intel Core i7-4700HQ 2.40 GHz, RAM 16.0 GB
Software spec: Windows 10 Pro 64-bit, Java 1.8.0_51, Scala 2.11.2, SBT 0.13.8, sbt-jmh 0.2.4

We are going to run 10 measurement iterations and 10 warm-up iterations, with 2 forks and 2 threads. I know that “the higher the number of iterations, the higher the accuracy,” but it takes a very long time to finish and I’m an impatient person :). Even though there are several measurement modes that you can use, we will only use throughput mode in this article.

Bench Bench Bench

We are going to compare Java Streams API, Java Collections API, Scala Collections API, Scala Streams API, and Scala Views API.

[Image: the fighter on the left is Buakaw, a Muay Thai fighter.]

jmh:run -i 10 -wi 10 -f2 -t2

Filter then count: Java Streams API wins!

Sort then collect: Java Collections API wins!

Group: Java Streams API wins!

Map then collect: Scala Views API wins!

Map then reduce: Java Collections API wins!

Find first: Java Collections API wins!

Filter then sort then map then collect: Scala Views API wins!

The result is quite interesting! Note that there is an overhead in converting collections to either streams or views in Scala. I didn’t convert beforehand because we often do it on the fly in real-world scenarios.

If we benchmark individual operations, the Java API is generally faster than the Scala API. But when we combine several operations together, the Scala API becomes faster than the Java API. Maybe that is because the map function in Scala is much faster than the one in Java. It’s also possible that there is something wrong in this benchmark. Is it possible that the map function on a Scala view is never evaluated?
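That last question is worth taking seriously for any lazy API. The sketch below shows the same effect with Java streams rather than Scala views, using made-up names (LazyMapDemo, CALLS), so it only illustrates the concern and says nothing about what the Scala benchmarks actually do. A call counter inside the mapping function shows that building the pipeline runs nothing; the work happens only when a terminal operation asks for the result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Java analogue of the concern above; this does not use Scala views.
final class LazyMapDemo {

    // Counts how many times the mapping function actually ran.
    static final AtomicInteger CALLS = new AtomicInteger();

    static Stream<Integer> doubled(List<Integer> input) {
        // Only describes the pipeline; no element is mapped here.
        return input.stream().map(x -> {
            CALLS.incrementAndGet();
            return x * 2;
        });
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3);

        Stream<Integer> pending = doubled(input);
        System.out.println(CALLS.get()); // 0: nothing evaluated yet

        List<Integer> result = pending.collect(Collectors.toList());
        System.out.println(CALLS.get()); // 3: the terminal operation forced the work
        System.out.println(result);      // [2, 4, 6]
    }
}
```

A benchmark method that returned only the pending pipeline would time its construction, not the mapping, so a result that looks too fast is a good reason to check that the lazy collection is actually forced.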

DISCLAIMER: This benchmark is made for fun and the result doesn’t indicate which language is better.

If you find any errors, please let me know. If you think that my benchmark has flaws, please also let me know how to fix them. I can re-run the benchmark to make it as fair and accurate as possible.

hussachai/java8-scala-benchmark

java8-scala-benchmark - Java Stream API vs Scala Collection API

github.com

Written by Hussachai Puripunpinyo