The cost of Scala Option

George Leung
Mar 13 · 3 min read

Interested in the cost of the Option type in Scala, I googled “scala option cost” and got this post as the first result.

You can see that using Option incurs on average extra 0.05 nanosecond in execution time. … an extra delay caused by a single ADD is twice as big as this one (0.1 nanosecond).

I was stupefied. WOW that is fast! Then I remembered the quote “Trust no one, bench everything.” from sbt-jmh.

In that post the author guessed that it has something to do with escape analysis¹, and I guess he is correct.

@Benchmark
def testOption(): Any = {
  val value: java.lang.Long = 4L
  Option(value).map(_ * 2).getOrElse(1L)
}

The above is the benchmark code. The option objects Option(value) and Option(value).map(_ * 2) are used only inside the function. No other code will see them. So the JIT compiler can choose not to place them on the heap, which explains the amazing performance reported.
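To make "escaping" concrete, here is a minimal sketch that contrasts the two situations. The object and method names are mine, not from the benchmarked post, and whether the JIT actually eliminates the allocation is up to the JVM; nothing in the code guarantees it.

```scala
object EscapeSketch {
  // The intermediate Option values never leave this method, so escape
  // analysis is free to keep them off the heap (it may still decline).
  def local(value: java.lang.Long): Long =
    Option(value).map(_ * 2).getOrElse(1L)

  // Here the Option is handed back to the caller. Unless the caller gets
  // inlined, the object has to exist on the heap.
  def escaping(value: java.lang.Long): Option[Long] =
    Option(value).map(_ * 2)
}
```

Both methods compute the same thing; only the second one lets the wrapper out of the method.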

But that invites the question: what if the object does escape?


Options in the Heap

To help them escape, I simply return the optional/nullable values. Then to make the cases look more similar I wrap them both in a case class.

@Benchmark
def createWrappedOption(): OptionContainer[String] = {
  OptionContainer(Some("string"))
}

Benchmark              Mode  Cnt  Score    Error  Units
createWrappedNullable  avgt   45  3.044 ± 0.002  ns/op
createWrappedOption    avgt   45  5.446 ± 0.004  ns/op

Rather than the 0.05ns in the post, I saw a 2.4ns difference. In actual code, Option objects mostly live in a complicated graph of references. Escape analysis can hardly help there. That makes the 0.05ns figure very misleading.
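For readers who want to picture the two cases side by side, here is a sketch of what the containers could look like. Only the name OptionContainer and the two benchmark method names appear above; the class bodies and NullableContainer are my assumptions, written without the JMH annotations so the snippet stands alone.

```scala
// Hypothetical definitions: the actual case classes are not shown in this post.
final case class OptionContainer[T](value: Option[T])
final case class NullableContainer[T >: Null](value: T)

object ContainerSketch {
  // Allocates two objects per call: the Some and the container.
  def createWrappedOption(): OptionContainer[String] =
    OptionContainer(Some("string"))

  // Allocates one object per call: just the container.
  def createWrappedNullable(): NullableContainer[String] =
    NullableContainer("string")
}
```

The string literal is a shared constant in both cases, so the only structural difference is the extra Some object.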

The Indirection

What’s more, the cost of Option is not only the object allocation, but also the indirection cost.² To measure it I have this benchmark.

@Benchmark
def testOptionRandomAccess(state: OptionState): Int = {
  val i = ThreadLocalRandom.current().nextInt(state.size)
  val res = state.arr(i)
  res.fold(0)(_.length)
}

Arrays of various sizes, from 64 to 16M, are filled with Option[String]. They are then accessed randomly. When the strings are created along with the Option wrapping, reading the length of a string from an Array[Option[String]] is slower than reading from an Array[String], but not by a lot. Since the String object lives right next to the Option object, it will get pulled along into the cache.³
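The two reads being compared can be sketched as plain methods. This is not the benchmark itself: there is no JMH state here, the names are made up, and I am assuming the Array[String] variant uses null for the missing case.

```scala
import java.util.concurrent.ThreadLocalRandom

object RandomAccessSketch {
  // Two hops: array slot -> Option object -> String object.
  def optionLength(arr: Array[Option[String]]): Int = {
    val i = ThreadLocalRandom.current().nextInt(arr.length)
    arr(i).fold(0)(_.length)
  }

  // One hop: array slot -> String object (or null).
  def nullableLength(arr: Array[String]): Int = {
    val i = ThreadLocalRandom.current().nextInt(arr.length)
    val s = arr(i)
    if (s == null) 0 else s.length
  }
}
```

Both methods expect a non-empty array, since nextInt requires a positive bound.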

For shits and giggles, I wanted to make the numbers worse. If I can separate all the String objects from the Option objects, reading the String will require reading from a different location in the memory.

On the left-hand side (the original layout), reading the first Option will bring the string “1” into the cache, so accessing it is then cheap.⁴
On the right-hand side (the separated layout), reading the first Option will bring the other Option objects into the cache. That’s not useful. Accessing “1” needs another fetch.

And yes, it is slower.
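One way to try for the two layouts is to vary the allocation order, roughly as below. This is a sketch under the assumption that objects allocated back to back tend to sit next to each other; the JVM and its garbage collector are free to place or move them differently, and the names are hypothetical.

```scala
object LayoutSketch {
  // Each String is allocated right before its Option, so the pair is
  // likely (not guaranteed) to be adjacent in memory.
  def together(n: Int): Array[Option[String]] =
    Array.tabulate(n)(i => Option(i.toString))

  // All Strings are allocated first, then all Options, which tends to
  // put each Option far away from the String it points to.
  def separated(n: Int): Array[Option[String]] = {
    val strings = Array.tabulate(n)(_.toString)
    strings.map(s => Option(s))
  }
}
```

The two arrays are equal element by element; only the order in which their objects were created differs.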


I hope my rambling gives you a better idea of the different aspects of the cost of using Scala’s Option type. It may not be as cheap as you wish.

Am I arguing against the use of Option? No.

In the grand scheme of things, these costs do not matter. In 99% of the code⁵ we write, 0.05ns, 2.4ns, 100ns, or even 100μs, is not something to worry about. Option is fine in code that is not performance-sensitive.


Is it possible to have both the safety of Option and the performance of null? Yes, just use Kotlin!

Kotlin supports nullable types: T?, which means T or null.⁶ In Scala 3 union types will be supported and we can have explicit null. Back in Scala 2, there is OptionVal, a “free” wrapper around a nullable reference.
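To give an idea of how such a wrapper can avoid the extra object, here is a minimal sketch built on a Scala value class. It is not the source of OptionVal; the name NullableRef and its methods are invented for illustration.

```scala
// Hypothetical wrapper: a value class around a possibly-null reference.
// In most direct uses the compiler represents it as the bare reference.
final class NullableRef[A <: AnyRef](val unsafeGet: A) extends AnyVal {
  def isEmpty: Boolean = unsafeGet eq null

  // Same shape as Option#fold, but without an Option object in between.
  def fold[B](ifEmpty: => B)(f: A => B): B =
    if (isEmpty) ifEmpty else f(unsafeGet)
}

object NullableRef {
  def apply[A <: AnyRef](a: A): NullableRef[A] = new NullableRef(a)
  def empty[A <: AnyRef]: NullableRef[A] =
    new NullableRef[A](null.asInstanceOf[A])
}
```

With this, NullableRef("abc").fold(0)(_.length) reads like the Option version. The catch is that a value class still gets boxed when it is used as a type argument or stored in an array, so the wrapper is only "free" in direct use.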


  1. https://shipilev.net/jvm/anatomy-quarks/18-scalar-replacement/
  2. Consider this: you look up the definition of “poop”. The dictionary tells you it means the same as “shit”; then you go to the entry for “shit” and get 💩. Compare this to seeing 💩 immediately in the entry for “poop”.
     This indirection cost is also present for boxed primitives in generic data structures on the JVM.
  3. For an explanation of locality, see
     https://shipilev.net/jvm/anatomy-quarks/11-moving-gc-locality/
  4. The extra objects do mean that less useful data can fit into the cache.
  5. Two nines is probably an underestimate.
  6. For a comparison of T | Null and Option[T], see
     https://medium.com/@elizarov/dealing-with-absence-of-value-307b80534903

