In this article we’ll talk about one of the recent GraalVM updates: libgraal. It’s a shared library, produced by
GraalVM Native Image, that contains a pre-compiled version of the GraalVM compiler. When you run Java applications on GraalVM, it is used as the top-tier just-in-time (JIT) compiler.
This has several advantages: libgraal improves startup times and completely avoids interfering with the heap usage and profiling of the application code. That is, the compiler now “codes like Java, runs like C++”. More specifically, in the context of HotSpot, libgraal executes like C2 while preserving most of the advantages of a managed runtime. Libgraal contributes significantly to the improved compilation speed and performance on short and medium-length workloads in the GraalVM 19.1 release. Keep reading to learn more.
The GraalVM compiler and native-image
First, some background for anyone unfamiliar with the relationship between the GraalVM compiler and native-image. The compiler compiles Java bytecode into machine code. When used as a JIT compiler by HotSpot, it only compiles the frequently executed (i.e., hot) bytecode of an application. The native-image tool also uses the GraalVM compiler to compile Java bytecode to machine code, but it compiles all the bytecode of an application, ahead of time. Since the GraalVM compiler is itself written in Java, we can treat it as an application from the perspective of native-image. In this way, we get a version of the compiler that can be run as compiled machine code immediately at run time.
Getting started with libgraal
As of recent GraalVM releases, libgraal is the default mode for the GraalVM compiler when running on the JVM. That is, when you use the
java launcher or any of the language launchers with the
--jvm option, all top-tier compilations are performed by libgraal. To distinguish this mode of execution from the (now) legacy mode, we use the term jargraal for the latter. Furthermore, where the context is clear, we will refer to the GraalVM compiler simply as “the compiler” for brevity.
The primary benefit of libgraal is that compilations are fast from the start. This is because the compiler runs as compiled code from the get-go, bypassing the HotSpot interpreter altogether. Furthermore, it is compiled by itself, whereas jargraal is compiled by C1. The result is that the compiled code of the compiler is more optimized under libgraal than under jargraal.
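The source of the CountUppercase program used below isn’t listed in this article; as a rough sketch (the class name is from the article, but the body here is an assumption), it counts the uppercase characters in its command-line arguments over repeated, timed iterations:

```java
// Sketch of a CountUppercase-style warmup benchmark. The exact source used
// in this article is not shown; this is an assumed shape with the same idea:
// repeatedly count uppercase characters and print per-iteration timings.
public class CountUppercase {
    static final int ITERATIONS = 9;

    // Count the uppercase characters in a string.
    static long countUppercase(String s) {
        return s.chars().filter(Character::isUpperCase).count();
    }

    public static void main(String[] args) {
        String sentence = String.join(" ", args);
        long total = 0;
        for (int iter = 1; iter <= ITERATIONS; iter++) {
            long start = System.currentTimeMillis();
            for (int i = 0; i < 1_000_000; i++) {
                total += countUppercase(sentence);
            }
            System.out.printf("%d (%d ms)%n", iter, System.currentTimeMillis() - start);
        }
        System.out.println("total: " + total);
    }
}
```

The inner loop is hot, so per-iteration times drop as the JIT compiler kicks in, which is what the timings below visualize.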
Let’s use GraalVM Enterprise Edition and the CountUppercase example to see how this all adds up to better startup performance:
java CountUppercase On your marks, Get set, Go...
1 (191 ms)
2 (107 ms)
3 (69 ms)
4 (120 ms)
5 (27 ms)
6 (26 ms)
7 (27 ms)
8 (28 ms)
9 (27 ms)
total: 29999997 (651 ms)
To compare with jargraal, we turn the native library off with the -XX:-UseJVMCINativeLibrary flag:
java -XX:-UseJVMCINativeLibrary CountUppercase On your marks, Get set, Go...
1 (1065 ms)
2 (329 ms)
3 (149 ms)
4 (107 ms)
5 (106 ms)
6 (81 ms)
7 (125 ms)
8 (51 ms)
9 (34 ms)
total: 29999997 (2081 ms)
The advantage of having the GraalVM compiler compiled ahead-of-time (AOT) is clear. In this example, we reach peak performance about 3x faster with libgraal (after 0.5 seconds) compared to jargraal (after 1.5 seconds). Similar warmup times are also obtained with libgraal in GraalVM CE.
A more direct measure of libgraal’s compilation speed is provided by the
-XX:+CITime flag. This shows the bytes of bytecode compiled per second (including bytes of inlined methods), along with other metrics. Running the CountUppercase example with this flag shows that libgraal compiles about 74K bytes/second whereas jargraal only manages about 9K bytes/second. As further points of comparison, C1 compiles about 380K bytes/second and C2 about 80K bytes/second. With the use of profile-guided optimizations (PGO), we expect libgraal to surpass C2’s compilation speed.
In addition to faster warmup, the other benefits of moving the compiler out of the HotSpot heap are described below.
In jargraal mode, the compiler is loaded from class files (deployed in jar files) and is executed just like all other classes in the JVM. Allocations are made on the same garbage collected heap that the application code is using. In addition, the compiler classes also occupy HotSpot’s metaspace, the managed memory area used for metadata such as classes, methods and profiles. This can cause a number of issues:
- It makes computing the heap requirements for an application harder as the heap requirements of the GraalVM compiler need to be taken into account.
- It can perturb object locality by interleaving compiler heap objects with application heap objects. This can have a direct impact on performance.
- It increases the number of garbage collections performed since allocation by the compiler causes the heap to fill up faster.
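You can observe collection counts programmatically, before reaching for a tool like Java Mission Control, via the standard java.lang.management API. A minimal sketch (class and method names here are illustrative, not from the article):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    // Sum the collection counts reported by all garbage collectors.
    // getCollectionCount() may return -1 if the count is undefined,
    // hence the guard before adding.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        // ... run the workload of interest here ...
        long after = totalCollections();
        System.out.println("collections during workload: " + (after - before));
    }
}
```

Running such a probe around the same workload under jargraal and libgraal gives a quick first approximation of the difference the screenshots below show in detail.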
Some of these effects on memory can be seen with Java Mission Control. Here is a screenshot showing the memory usage when running the CountUppercase example with jargraal:
The brown bars show that there were 11 collections performed during the measurement period and the purple lines show the heap usage. Hovering the mouse at the base of one of the collections shows how much memory is in use after the collection.
In contrast, here’s the memory usage profile for libgraal:
This shows that with libgraal, only 4 collections were performed. As there is only 2 MB of live memory after a collection in the libgraal profile, we can deduce that jargraal retains about 7.5 MB of live memory between compilations.
Another side-effect of jargraal is that execution of the compiler can perturb profiles of code also used by the application. Take for example an application that uses
java.util.HashMap and only ever uses keys of type
String. In a call to
HashMap.putAll(Map<? extends K, ? extends V> m) , the type profiles for calls to
Object.hashCode will indicate that the keys of
m are always of type
String. However, the compiler also uses
HashMap and not always with
String keys. The compiler calls to
HashMap.putAll will cause the type profiles to be “polluted” with these other types. This in turn can prevent inlining the
String.hashCode method at these call sites.
While this kind of type pollution can be mitigated by aggressive inlining like that performed by GraalVM Enterprise Edition, the only way to completely eliminate it is to run the compiler in a mode where it does not update the profiles. If you haven’t guessed it by now, libgraal provides exactly this mode of execution. What’s more, since the compiler in libgraal is itself compiled ahead of time, there’s no need to profile the GraalVM compiler at all.
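The scenario above can be sketched as follows (class and method names are illustrative, not from the article). Under jargraal, both the application and the compiler funnel different key types through the same profiled call sites inside HashMap:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the profile-pollution scenario: one shared HashMap.putAll
// implementation means one shared set of type profiles for the
// Object.hashCode call sites inside it.
public class ProfilePollution {
    static <K, V> Map<K, V> copy(Map<K, V> src) {
        Map<K, V> dst = new HashMap<>();
        dst.putAll(src); // profiled call sites inside putAll are shared
        return dst;
    }

    public static void main(String[] args) {
        // Application code: only String keys flow through putAll here,
        // so the profile would say "keys are always String".
        Map<String, Integer> app = new HashMap<>();
        app.put("key", 1);
        Map<String, Integer> appCopy = copy(app);

        // jargraal-style compiler-internal usage: non-String keys flow
        // through the very same HashMap internals, polluting that profile.
        Map<Integer, String> internal = new HashMap<>();
        internal.put(42, "node");
        Map<Integer, String> internalCopy = copy(internal);

        System.out.println(appCopy.size() + " " + internalCopy.size());
    }
}
```

With libgraal, the second kind of caller simply never touches the application’s profiles, so the “keys are always String” observation stays intact.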
Advantages of Java
Being written in Java and compiled to machine code, libgraal preserves most of the benefits of jargraal. These include:
- Compressed references. As explained in an earlier GraalVM article, native-image supports compressed pointers. The TL;DR summary of this feature is that all object pointers in libgraal can be represented in 32-bits instead of 64-bits, saving a significant amount of memory.
- Garbage collection. HotSpot’s native compilers, C1 and C2, allocate memory during compilation and typically only release most of it once the compilation is done. Since libgraal runs in a native image that supports garbage collection, it can be configured with a heap size that caps the memory used by the compiler, preventing a certain class of compiler bugs from blowing up the VM. The top-level entry point to a compilation installs an exception handler to catch an
OutOfMemoryError and take appropriate action (e.g., bail out of the compilation). In contrast, a bug in C1 or C2 that results in excessive allocation will kill the VM process with an uncatchable out-of-memory error.
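A hypothetical entry point with this shape (the real libgraal entry point differs; all names here are illustrative) might look like:

```java
// Sketch of a compilation entry point that turns OutOfMemoryError into a
// bailout instead of a VM exit. Hypothetical; not libgraal's actual code.
public class CompileEntry {
    enum Result { OK, BAILOUT }

    interface Compilation {
        void run();
    }

    static Result compile(Compilation task) {
        try {
            task.run();
            return Result.OK;
        } catch (OutOfMemoryError e) {
            // Allocation exceeded the compiler's heap cap: give up on this
            // compilation and let the method keep running in the interpreter.
            return Result.BAILOUT;
        }
    }

    public static void main(String[] args) {
        System.out.println(compile(() -> {}));
        System.out.println(compile(() -> { throw new OutOfMemoryError(); }));
    }
}
```

The key point is that the error stays contained in the compiler thread; the application threads never see it.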
- Robustness against compiler bugs. Generalizing the previous point, any compiler bug that results in an exception can have its damage mitigated. The exception can be caught and the VM can continue executing, albeit without the compiled code for the method that was being compiled. With the
-Dgraal.CompilationFailureAction=Diagnose option, such failures can even generate useful diagnostic info that can be submitted along with a bug report. You can use the
-Dgraal.CrashAt option to simulate this with the CountUppercase example:
java -Dgraal.CrashAt=equals -Dgraal.CompilationFailureAction=Diagnose CountUppercase On your marks, Get set, Go...
-- iteration 1 --
1 (246 ms)
Thread[System-0,5,main]: Compilation of java.lang.String.equals(Object) failed:
java.lang.RuntimeException: Forced crash after compiling java.lang.String.equals(Object)
To disable compilation failure notifications, set CompilationFailureAction to Silent (e.g., -Dgraal.CompilationFailureAction=Silent).
To print a message for a compilation failure without retrying the compilation, set CompilationFailureAction to Print (e.g., -Dgraal.CompilationFailureAction=Print).
Retrying compilation of java.lang.String.equals(Object)
Dumping IGV graphs in /Users/dnsimon/graal/graal/compiler/graal_dumps/1555858467364/graal_diagnostics_41644/java.lang.String.equals(Object)
2 (103 ms)
3 (70 ms)
4 (135 ms)
5 (26 ms)
6 (27 ms)
7 (26 ms)
8 (26 ms)
9 (26 ms)
total: 29999997 (712 ms)
Graal diagnostic output saved in /Users/dnsimon/graal/graal/compiler/graal_dumps/1555858467364/graal_diagnostics_41644.zip
- While we’ve aimed to avoid excessive recursion in the compiler, it’s still possible for a stack overflow to occur. Since native-image supports stack overflow checking, this again results in a compilation bailout instead of a VM exit.
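In plain Java, StackOverflowError is likewise a catchable error; a hypothetical sketch of turning an overflow into a bailout (illustrative only, not libgraal’s actual mechanism):

```java
// Sketch: an unbounded recursion becomes a catchable StackOverflowError
// and thus a compilation bailout rather than a VM exit. Hypothetical names.
public class OverflowBailout {
    // Deliberately recurse without a base case to trigger the overflow.
    static int recurse(int depth) {
        return recurse(depth + 1);
    }

    // Returns true if the "compilation step" completed, false if it bailed out.
    static boolean tryCompileStep() {
        try {
            recurse(0);
            return true;
        } catch (StackOverflowError e) {
            return false; // bail out of this compilation
        }
    }

    public static void main(String[] args) {
        System.out.println("bailed out: " + !tryCompileStep());
    }
}
```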
We have plans to continue improving and leveraging opportunities presented by libgraal. Some of these are detailed below.
- By default, native-image expands its heap to 80% of available physical memory. This can allow a misbehaving compilation (due to pathological input or compiler bugs) to use up a lot of memory and cause very slow compilation. We are now experimenting with adjusting the size of the young generation and capping the maximum native-image heap size for libgraal to mitigate against such cases. We aim to find values that achieve the best trade-off in terms of minimizing collections (which impact compilation speed) while preventing unbounded memory use when compiling.
- The support for isolates in Native Image allows us to further reduce the memory footprint of libgraal. We can completely discard the libgraal isolate to bring the memory footprint of the GraalVM compiler effectively to 0. A good time to do this would be when the compilation queue is empty. That is, once an application is in a steady state, the compiler can completely remove itself from the memory profile of the VM. If we can get the compiler initialization time down to low single-digit milliseconds, it’s even conceivable to offer a mode where a new isolate is created for each compilation. This would provide an absolute minimal memory footprint for the compiler. As long as each compilation allocates no more than the max heap for libgraal, it would also avoid all garbage collection in libgraal which translates to faster compilation.
- Soon after the changes required for libgraal are merged into OpenJDK master, we will focus on making libgraal work on JDK 13.
- In conjunction with the GraalVM Native Image team, we will work on reducing the static footprint of libgraal. This will mostly be a matter of reducing symbols and pruning out more unused code.
- It’s possible to configure the GraalVM compiler with the set of optimizations it will or won’t perform. We’re working on tuning an economy configuration that prefers compilation speed over generated code quality. This should allow us to generate an economy libgraal that can be used for first tier compilations instead of using C1.
Libgraal is a shared library containing a pre-compiled version of the GraalVM compiler. It improves startup, provides competitive peak performance, and removes all interference with application profiles and the application’s use of the heap.
Download GraalVM and try it yourself: www.graalvm.org/downloads.