Published in Scala 3

Scala 3: A Look at “inline” (and “Programming Scala” is Now Published!)


Update April 3, 2022: I added the code examples below to the book’s code repo and did more extensive testing. See here for details.

Update May 22, 2022: Michel Charpentier correctly pointed out that the arguments don’t need to be by-name if they are inlined. This makes perfect sense if you think about it (which I didn’t 🤓), because we are no longer calling the invariant and fail functions; they are now gone, replaced by their bodies! I’ve updated the gist and the text accordingly. Thanks, Michel!

I haven’t blogged yet about the new metaprogramming system in Scala 3, so let’s start that now. First, let’s look at the new inline keyword, which causes the compiler to “inline” the decorated code.

Programming Scala, Third Edition is now published! It provides a comprehensive introduction to all the new features in Scala 3, while also introducing the Scala 2 features you’ll still need to know for working with an existing code base. Programming Scala, Third Edition is aimed at experienced Scala developers who want to learn what’s new, as well as professional developers getting started with Scala.

“Inlining” means that instead of generating the usual byte code for a construct, like a conditional, val declaration, or method, the compiler inserts byte code that bypasses the overhead of the construct.

For a conditional, instead of doing the usual if (predicate) true_stuff else false_stuff, the compiler just inserts true_stuff if predicate is determined to be true at compile time or it inserts false_stuff if predicate is determined to be false. Hence, inline can’t inline conditionals when the value of predicate can’t be determined at compile time. I’ll show you an example in a moment.
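Here is a minimal sketch of that behavior. The names InlineIfSketch, debug, and describe are made up for illustration; they aren't part of the invariant checker discussed later.

```scala
object InlineIfSketch:
  // Hypothetical flag. Because it is an inline val, its value is known at compile time.
  inline val debug = false

  // The compiler keeps only the branch selected by `debug`;
  // the other branch never reaches the byte code.
  inline def describe: String =
    inline if debug then "debug build" else "release build"

  // This would NOT compile, because `flag` is only known at runtime:
  //   def bad(flag: Boolean): Int = inline if flag then 1 else 0
```

With debug set to false, InlineIfSketch.describe reduces to the string "release build" at the call site.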

For a val, the actual value, which must also be known at compile time, is inserted everywhere a reference to the val is made.
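A small sketch, with hypothetical names (InlineValSketch, maxRetries, greeting), of what that looks like:

```scala
object InlineValSketch:
  inline val maxRetries = 3      // a compile-time constant
  inline val greeting = "hello"  // string constants work too

  // Each reference below is replaced by the constant itself,
  // so no field access happens at runtime.
  def retriesMessage: String = s"$greeting, retrying up to $maxRetries times"
```

Calling InlineValSketch.retriesMessage returns "hello, retrying up to 3 times".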

For a method, the body is inlined instead of calling the method. This could add code bloat for big methods used in many places, so only inline small methods. You won’t gain much runtime performance inlining large methods anyway.

The method arguments (which can also be inlined) don’t have to be compile-time constants, but if there are type parameters (e.g., def foo[T](t:T): Unit), the actual types have to be known at compile time, where the method is “called”.
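To make this concrete, here is a minimal sketch with made-up names (InlineMethodSketch, twice, next). It shows that an inlined argument is an ordinary runtime expression, and that the expression is substituted wherever the parameter is used:

```scala
object InlineMethodSketch:
  // The body replaces each "call". Because x is an inline parameter, the
  // argument expression itself is substituted for every occurrence of x.
  inline def twice(inline x: Int): Int = x + x

  // A runtime counter, clearly not a compile-time constant.
  var calls = 0
  def next(): Int = { calls += 1; calls }
```

Here twice(next()) expands to next() + next(), so it evaluates the argument two times and returns 1 + 2 = 3. Keep that duplication in mind when an inlined argument has side effects.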

As an example, suppose I need an invariant checker, a tool that allows me to specify some invariant that should be true before and after some code executes. Here is a possible implementation using the Scala 3 metaprogramming tools:

Note the inline keywords. I start by importing scala.quoted.*, then define an object to implement the invariant checker.

First, I inline a flag ignore, which specifies whether or not to “ignore” invariant checking. This is analogous to how some of the assert-related Scala library methods worked in Scala 2, where you could disable them at compile time by passing certain flags to scalac. (At this time, this feature hasn’t been implemented in Scala 3.)

It would be convenient for the user to make this value a var, so it can be changed dynamically at runtime, without recompilation. However, this would prevent inlining, so if you want to disable the runtime checks, you have to recompile with the value set to true instead.

If you’re playing along at home, try adding the type annotation :Boolean to the declaration. You get this error:

[error] -- Error: .../InvariantEnabled.scala:5:21
[error] 6 | inline val ignore: Boolean = false
[error] | ^^^^^^^
[error] | inline value must have a literal constant type

The problem is that a Boolean can have two values, but the compiler only accepts a literal constant type here, either true or false, not Boolean. So, you could use false as the type here:

inline val ignore: false = false

Or, just leave off the type annotation. See my post, Scala 3: Dependent Types, Part I for more details on dependent types.
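Here is a short sketch of the literal types involved. The object and value names (LiteralTypeSketch, off, on, and so on) are hypothetical:

```scala
object LiteralTypeSketch:
  inline val off: false = false  // OK: the declared type is the literal type `false`
  inline val on = true           // OK: the inferred type is the literal type `true`
  // inline val bad: Boolean = false
  //   rejected: "inline value must have a literal constant type"

  val alsoFalse: false = off     // a value of type `false` can only ever be false
  val widened: Boolean = off     // literal types are subtypes of Boolean
```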

I won’t show it here, but I also defined a nearly-identical type, InvariantDisabled, where the only difference is to declare ignore to be true. I’ll use both of these types in an example below.

Moving on, apply and all of its parameters are defined inline. This method, along with fail, needs to be inline for the new macro quoting and splicing to work. I’ll discuss those features in a subsequent post. For our purposes now, declaring these methods inline means that the byte code won’t contain calls to methods with these names, but instead it will contain their bodies inserted inline.

Similarly, the parameters predicate, message, and block will be inlined.

Now we come to the conditional, inline if ignore then ... else ... If ignore is false, then the byte code for the following four lines will be inserted:

if !predicate then fail(predicate, message, block, "before")
val result = block
if !predicate then fail(predicate, message, block, "after")
result

fail is also inlined, so the actual byte code will contain the call to failImpl, which constructs and throws an InvariantFailure.

But what if ignore is true at compile time? Then only the byte code for block is inserted. Hence, there will be no runtime overhead for unused invariant checking! We’ll see this in action shortly.
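To see the branch elimination in isolation, here is a simplified stand-in for the checker. It is only a sketch: it leaves out the quoting and splicing, throws directly instead of going through fail and failImpl, and the name invariantSketch is made up. I also define InvariantFailure here as a plain exception, which is an assumption about its shape.

```scala
// Assumed shape of the failure type, for this sketch only.
final case class InvariantFailure(msg: String) extends RuntimeException(msg)

object invariantSketch:
  // Recompile with `true` to erase the checks from every use site.
  inline val ignore = false

  inline def apply[T](inline predicate: Boolean, inline message: String)(inline block: T): T =
    inline if ignore then block
    else
      // `predicate` is an inline parameter, so it is re-evaluated at each use.
      if !predicate then throw InvariantFailure(s"before: $message")
      val result = block
      if !predicate then throw InvariantFailure(s"after: $message")
      result
```

For example, with var i = 0, the expression invariantSketch(i >= 0, "i must stay non-negative") { i += 1; i } checks i >= 0 before and after the block and returns 1. With ignore set to true, the same expression would compile down to just the block.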

I’ll discuss the rest of this example in a subsequent post that explores quoting and splicing.

Finally, here is another variant that removes most of the inlining, except where needed for quoting and splicing, but still performs invariant checking:

Note that now I need to pass the arguments predicate, message, and block as by-name parameters, so they are only evaluated inside the method bodies for apply and fail, not before calling those methods. This is not necessary when these arguments are inlined! However, even though this implementation still does invariant checking, not inlining the arguments means we won’t get the same expressions output as strings in the error messages. For example, instead of FAILURE! predicate “i.>=(0)” you get less useful output like FAILURE! predicate “predicate$proxy6”.
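The role of the by-name parameters can be sketched without any macros at all. This is a hypothetical, stripped-down version (ByNameSketch and checked are made-up names, and it throws a plain IllegalStateException rather than InvariantFailure):

```scala
object ByNameSketch:
  // `predicate` and `block` are by-name, so they are evaluated inside the
  // method body, each time they are referenced, not before the call.
  def checked[T](predicate: => Boolean)(block: => T): T =
    if !predicate then throw new IllegalStateException("invariant failed before block")
    val result = block
    if !predicate then throw new IllegalStateException("invariant failed after block")
    result
```

With var n = 0, calling ByNameSketch.checked(n >= 0) { n -= 1 } passes the first check, runs the block, and then fails the second check, because n >= 0 is evaluated again after the block. If predicate were an ordinary Boolean parameter, it would be evaluated once before the call and the second check could never catch the change.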

So, I claimed that a major advantage of inlining is the ability to remove whole blocks of unneeded code, if that situation can be determined at compile time. The inlined val ignore triggers this situation. Now let’s see what sort of performance impact this has. Consider the following program:

This program accepts zero or more numbers for the number of trials to run. If none is specified, it defaults to 1000. For each n, the program times the execution using invariantEnabled, invariantDisabled, and invariantNoInline, then prints out the times in nanoseconds and the percentages vs. what should be the fastest execution times, those for invariantDisabled.

Running this program with the arguments 10 100 1000 10000 100000, results in the following:

Elapsed times in nanoseconds:

|      N | Enabled | Disabled |    E/D% | NoInline |    N/D% |
|     10 |  329267 |   140683 | 234.05% |   397920 | 282.85% |
|    100 |  117403 |    33356 | 351.97% |    87826 | 263.30% |
|   1000 |  402782 |   152507 | 264.11% |   532264 | 349.01% |
|  10000 |  882765 |   220922 | 399.58% |  1941655 | 878.89% |
| 100000 | 1288324 |  1140562 | 112.96% |  1555927 | 136.42% |

The numbers can vary quite a lot from run to run, especially for larger N where JVM HotSpot optimization kicks in. However, the general trend is clear, at least for smaller N: compiling with checking disabled eliminates significant overhead, and so does extensive inlining.
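The percentage columns are just each elapsed time divided by the Disabled time. As a rough sketch of how such measurements can be taken and reported, here are two hypothetical helpers (TimingSketch, timeNanos, and percentOf are my names for illustration, not necessarily what InlinePerf uses):

```scala
object TimingSketch:
  /** Runs `body` n times and returns the elapsed wall-clock time in nanoseconds. */
  def timeNanos(n: Int)(body: => Unit): Long =
    val start = System.nanoTime()
    var i = 0
    while i < n do
      body
      i += 1
    System.nanoTime() - start

  /** Expresses `t` as a percentage of `baseline`, as in the E/D% and N/D% columns. */
  def percentOf(t: Long, baseline: Long): String =
    "%.2f%%".formatLocal(java.util.Locale.ROOT, t * 100.0 / baseline)
```

For the first row of the table, TimingSketch.percentOf(329267L, 140683L) gives "234.05%". A simple nanoTime loop like this is sensitive to JIT warm-up, which is one reason the numbers move around so much between runs.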

By the way, what happens if a check fails? If you change == to != in line 11 of InlinePerf, you’ll get this error:

[error] ...invariantEnabled$InvariantFailure: FAILURE! predicate "thing1.label.!=("label")" failed before evaluation of block: "thing1.count = i.*(2).%(3)". Message = "".

We see an important benefit of using a macro implementation; we can compose an error message that shows the actual code for both the predicate and the block that triggered the failure. Note, however, that the operator notation is converted to method invocations. Still, this is very handy when debugging.

Also, recall I said that the method parameters don’t have to be compile time constants, even though we inline them and the method. Note that the predicate we inlined is thing1.label == "label" and the block changes the value of thing1.count, neither of which is constant at compile time.

You can get carried away with inlining (like anything else). Outside the context of macros, it’s best to profile your code to determine if a) you really need to improve the performance of some section of code and b) using inline actually makes a significant difference in real-world execution scenarios (recall the behavior for large N above…).

So for example, I gave up the convenience of switching invariant checking on and off at runtime by inlining the ignore value. The performance gains were noticeable, but do they outweigh the convenience of the runtime flexibility?

See Programming Scala, Third Edition for more information about the new metaprogramming facilities and Scala 3, in general.


