Understanding Class Initialization in GraalVM Native Image Generation

Christian Wimmer
Published in graalvm
Sep 6, 2018

This article describes the approach before the GraalVM 19.0 release. A newer version covering GraalVM 19.0 and later is available here: https://medium.com/graalvm/updates-on-class-initialization-in-graalvm-native-image-generation-c61faca461f7

tl;dr: Classes reachable for a GraalVM native image are initialized at image build time. Objects allocated by class initializers are in the image heap that is part of the executable. The new option --delay-class-initialization-to-runtime= delays initialization of listed classes to image run time.

The GraalVM native-image tool enables ahead-of-time (AOT) compilation of Java applications into native executables or shared libraries. While traditionally Java code is just-in-time (JIT) compiled at run time, AOT compilation has two main advantages: First, it improves the start-up time since the code is already pre-compiled into efficient machine code. Second, it reduces the memory footprint of Java applications since it eliminates the need to include infrastructure to load and optimize code at run time.

We call the technology behind the GraalVM native-image tool Substrate Virtual Machine (VM) because, in addition to your application code and its dependencies, the target executable contains traditional VM components such as a garbage collector. To achieve the goals of a self-contained executable, the native-image tool performs a static analysis to find ahead-of-time the code that your application uses. This includes the parts of the JDK that your application uses, third-party library code, and the VM code itself (which is written in Java too). The following figure summarizes the native image build process:

The native executable contains not just code, but also an initial heap that serves as the starting point of the Java heap at run time. We call this initial heap the “image heap”. The image heap allows us to skip class initialization at run time, which is crucial for fast startup. While a traditional Java VM needs to run the class initializers of many core JDK classes before the main method of your application starts running, the native executable calls your main method quite directly.

A class initializer can be written explicitly as a static { ... } block in Java code. But every initialization of a static field (regardless of whether the field is final or not) is implicitly converted to a class initializer. For example, when you declare a static field as static final Date TIME = new Date(); then the assignment TIME = new Date() is in the class initializer. We will use this example later. But let’s first start with the minimal “Hello World” for class initialization:
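A minimal sketch of such a program, reconstructed from the commands and the output shown in this article. The class name and the two printed strings come from the text; the exact layout is an assumption:

```java
class HelloClassInitializer {
    // Explicit class initializer: runs once, when the class is initialized.
    static {
        System.out.println("Hello Class Initializer");
    }

    public static void main(String[] args) {
        System.out.println("Hello Main");
    }
}
```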

To set up your development environment you first need to download the latest GraalVM. Either the Community Edition or the Enterprise Edition works for the purpose of this example. The GraalVM download contains a full JDK plus some other utilities like the native-image tool. Then you need to set your JAVA_HOME to point to the GraalVM directory. Now we can compile the example:

> $JAVA_HOME/bin/javac HelloClassInitializer.java

When we run it on a normal Java VM, it first prints the line from the class initializer and then the line from the main method:

> $JAVA_HOME/bin/java HelloClassInitializer
Hello Class Initializer
Hello Main

In order to get a standalone native executable for our application, we use the native-image tool that is part of the GraalVM download, and then run the helloclassinitializer executable that it creates:

> $JAVA_HOME/bin/native-image HelloClassInitializer
Build on Server(pid: 23652, port: 40485)
[helloclassinitializer:23652] classlist: 388.19 ms
[helloclassinitializer:23652] (cap): 650.59 ms
[helloclassinitializer:23652] setup: 1,008.77 ms
Hello Class Initializer
[helloclassinitializer:23652] (typeflow): 2,945.18 ms
[helloclassinitializer:23652] (objects): 1,666.47 ms
[helloclassinitializer:23652] (features): 37.61 ms
[helloclassinitializer:23652] analysis: 4,714.60 ms
[helloclassinitializer:23652] universe: 129.10 ms
[helloclassinitializer:23652] (parse): 512.04 ms
[helloclassinitializer:23652] (inline): 827.93 ms
[helloclassinitializer:23652] (compile): 4,344.61 ms
[helloclassinitializer:23652] compile: 5,974.13 ms
[helloclassinitializer:23652] image: 488.75 ms
[helloclassinitializer:23652] write: 107.55 ms
[helloclassinitializer:23652] [total]: 12,846.35 ms
> ./helloclassinitializer
Hello Main

Note that the line “Hello Class Initializer” is printed during image generation. Executing the binary prints only the line “Hello Main”, no matter how often you run it. This is the intended behavior: In order to have a fast startup of the executable, the (possibly expensive) class initializer is run only once during image generation.

Many classes in the JDK have class initializers, and they allocate a lot of Java objects, e.g., charset data and localization information. All these objects are part of the image heap and are therefore available at run time without any initialization and allocation overhead. We believe that ahead-of-time class initialization is such an important concept that we offer the Feature API. It allows you to run custom initialization code at defined points during image generation, e.g., before, during, or after the static analysis.

Executing class initializers during image generation is in most cases a good idea. But there are a few use cases where class initializers do things that must not run during image generation, for example

  • start application threads that continue to run in the background of the application,
  • load native libraries using java.lang.Runtime.load(String),
  • open files or sockets, or
  • allocate C memory, e.g., create direct byte buffers using java.nio.ByteBuffer.allocateDirect(int).

All these examples result in native resources that are outside of normal Java objects and the normal Java heap. The image heap can only capture Java objects and not native resources. Therefore, native resources created during image generation are no longer available at image run time, and accessing them would lead to a crash of the application.
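To make the problem concrete, here is a hypothetical class whose initializer creates two of the native resources listed above: a direct byte buffer and a background thread. The class and field names are invented for illustration. On a normal Java VM the program works as expected; if such an initializer ran during image generation, the buffer memory and the thread would belong to the image builder process, not to the executable:

```java
import java.nio.ByteBuffer;

class NativeResourceHolder {
    // Direct buffers are backed by C memory outside the Java heap.
    static final ByteBuffer BUFFER = ByteBuffer.allocateDirect(1024);

    // A background thread started as a side effect of class initialization.
    static final Thread WORKER = startWorker();

    private static Thread startWorker() {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Interrupted: let the thread end.
            }
        }, "background-worker");
        t.setDaemon(true); // do not keep the VM alive in this small example
        t.start();
        return t;
    }

    public static void main(String[] args) {
        System.out.println("direct buffer: " + BUFFER.isDirect());
        System.out.println("worker alive: " + WORKER.isAlive());
    }
}
```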

Some class initializers also depend on state that is not yet available at image build time. For example, a class initializer can depend on system properties that are not yet set at image build time, or that change between image build time and image run time like the username or the current working directory.
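A small sketch of this second category, using a hypothetical class BuildTimeState: its static fields cache the user name and working directory when the class is initialized. If the initializer ran at image build time, the cached values would presumably be those of the build environment rather than those of the user who starts the executable:

```java
class BuildTimeState {
    // Both values are read exactly once, when the class is initialized.
    static final String USER = System.getProperty("user.name");
    static final String WORKING_DIR = System.getProperty("user.dir");

    public static void main(String[] args) {
        System.out.println("cached user: " + USER);
        System.out.println("cached dir: " + WORKING_DIR);
        // Read again at the time of the call, for comparison.
        System.out.println("current dir: " + System.getProperty("user.dir"));
    }
}
```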

To support such class initializers, we introduced a new option in GraalVM 1.0 RC6 that delays the initialization of certain classes from image build time to image run time. Assume that in our example we want to run the class initializer of our example class HelloClassInitializer at run time:

> $JAVA_HOME/bin/native-image --delay-class-initialization-to-runtime=HelloClassInitializer HelloClassInitializer
Build on Server(pid: 23652, port: 40485)
[helloclassinitializer:23652] classlist: 260.53 ms
[helloclassinitializer:23652] (cap): 580.90 ms
[helloclassinitializer:23652] setup: 916.11 ms
[helloclassinitializer:23652] (typeflow): 2,506.24 ms
[helloclassinitializer:23652] (objects): 1,696.15 ms
[helloclassinitializer:23652] (features): 37.02 ms
[helloclassinitializer:23652] analysis: 4,307.63 ms
[helloclassinitializer:23652] universe: 142.57 ms
[helloclassinitializer:23652] (parse): 624.35 ms
[helloclassinitializer:23652] (inline): 963.93 ms
[helloclassinitializer:23652] (compile): 4,669.54 ms
[helloclassinitializer:23652] compile: 6,458.79 ms
[helloclassinitializer:23652] image: 535.08 ms
[helloclassinitializer:23652] write: 104.49 ms
[helloclassinitializer:23652] [total]: 12,753.86 ms
> ./helloclassinitializer
Hello Class Initializer
Hello Main

Note that there are no modifications of the Java source code; the only difference is the new option --delay-class-initialization-to-runtime=HelloClassInitializer on the command line for the native-image tool. The new option takes a comma-separated list of classes that are initialized at run time and not during image building; all of their subclasses are implicitly included. You can also invoke the new API in RuntimeClassInitialization from a Feature.

Let’s look at an example where class initialization at run time is necessary for correct behavior, i.e., the behavior that the programmer intended:
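A sketch of such a program, reconstructed from the class names, the field Startup.TIME, and the output shown in this article; the exact layout is an assumption:

```java
import java.util.Date;

class HelloStartupTime {
    public static void main(String[] args) {
        System.out.println("Startup: " + Startup.TIME);
        System.out.println("Now: " + new Date());
    }
}

class Startup {
    // Implicit class initializer: the Date is created when Startup is initialized.
    static final Date TIME = new Date();
}
```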

Compiling and running this example on a standard Java VM prints two time stamps that are the same or very close together:

> $JAVA_HOME/bin/javac HelloStartupTime.java
> $JAVA_HOME/bin/java HelloStartupTime
Startup: Fri Aug 31 13:17:05 PDT 2018
Now: Fri Aug 31 13:17:05 PDT 2018

But when we build a native executable, the class Startup is initialized during image generation and Startup.TIME is initialized to the image build time:

> $JAVA_HOME/bin/native-image HelloStartupTime
[...]
> ./hellostartuptime
Startup: Fri Aug 31 13:22:12 PDT 2018
Now: Fri Aug 31 13:23:36 PDT 2018

Running the executable about an hour later prints:

> ./hellostartuptime 
Startup: Fri Aug 31 13:22:12 PDT 2018
Now: Fri Aug 31 14:35:42 PDT 2018

The “startup time” of the application is always the same constant. The Date object was created at image build time and is part of the image heap.

In order to get the intended behavior of the application, we need to initialize the class Startup at run time:

> $JAVA_HOME/bin/native-image --delay-class-initialization-to-runtime=Startup HelloStartupTime
[...]
> ./hellostartuptime
Startup: Fri Aug 31 13:27:01 PDT 2018
Now: Fri Aug 31 13:27:01 PDT 2018

When you register a class for initialization at run time, all of its subclasses are registered automatically too. The reason is that the Java specification requires initialization of a class to trigger initialization of all its superclasses first, i.e., the class initializers of all superclasses are executed before the class initializer of a class. A subclass initialized at image build time would therefore force its delayed superclass to be initialized at image build time as well.
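The ordering rule can be observed on a normal Java VM with a small sketch. The classes Base and Derived are hypothetical; each class initializer records its name in a shared log:

```java
import java.util.ArrayList;
import java.util.List;

class InitOrder {
    // Records the order in which the class initializers run.
    static final List<String> LOG = new ArrayList<>();

    static class Base {
        static { LOG.add("Base"); }
    }

    static class Derived extends Base {
        static { LOG.add("Derived"); }
        static void touch() { }
    }

    public static void main(String[] args) {
        // Only Derived is used, but Base is initialized first.
        Derived.touch();
        System.out.println(LOG); // prints [Base, Derived]
    }
}
```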

Class initialization is also triggered when you invoke a static method, access a static field, or allocate an instance of the class. If initialization of a class is delayed to image run time, then this class must not be initialized during image generation for any reason. This leads to some restrictions on how such a class can be used. For example, no instances of such a class can be in the image heap.
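The three triggers can be demonstrated on a normal Java VM with the following sketch; all class names are hypothetical. The fourth class is included as a contrast: reading a compile-time constant does not count as a field access that triggers initialization, because the Java compiler inlines the value:

```java
import java.util.ArrayList;
import java.util.List;

class InitTriggers {
    // Records which class initializers have run, in order.
    static final List<String> LOG = new ArrayList<>();

    static class ViaMethod {
        static { LOG.add("ViaMethod"); }
        static void call() { }
    }

    static class ViaField {
        static { LOG.add("ViaField"); }
        static int counter = 0;       // ordinary (non-constant) static field
    }

    static class ViaNew {
        static { LOG.add("ViaNew"); }
    }

    static class ConstantOnly {
        static { LOG.add("ConstantOnly"); }
        static final int LIMIT = 42;  // compile-time constant, inlined by javac
    }

    public static void main(String[] args) {
        ViaMethod.call();                // static method invocation
        int c = ViaField.counter;        // static field access
        new ViaNew();                    // instance allocation
        int limit = ConstantOnly.LIMIT;  // does not initialize ConstantOnly
        System.out.println(LOG); // prints [ViaMethod, ViaField, ViaNew]
    }
}
```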

Unfortunately, there are also some less obvious restrictions. Let’s look at a small variation of our previous example:
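A sketch of the variation, reconstructed from the details given in the rest of this article: the field CACHED_TIME = Startup.TIME on line 4 of HelloCachedTime.java and the initialization TIME = new Date() on line 13. The remaining layout is an assumption chosen so that those two line numbers match:

```java
import java.util.Date;

class HelloCachedTime {
    static final Date CACHED_TIME = Startup.TIME;  // line 4

    public static void main(String[] args) {
        System.out.println("Startup: " + CACHED_TIME);
        System.out.println("Now: " + new Date());
    }
}

class Startup {
    static final Date TIME = new Date();  // line 13
}
```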

We already know that we need to initialize the class Startup at run time. But if we do that, the image build fails:

> $JAVA_HOME/bin/javac HelloCachedTime.java
> $JAVA_HOME/bin/native-image --delay-class-initialization-to-runtime=Startup HelloCachedTime
Build on Server(pid: 23652, port: 40485)
[hellocachedtime:23652] classlist: 196.45 ms
[hellocachedtime:23652] (cap): 582.00 ms
[hellocachedtime:23652] setup: 916.66 ms
[hellocachedtime:23652] analysis: 3,296.61 ms
error: Class that is marked for delaying initialization to run time got initialized during image building: Startup

The error message tells us that the class Startup was initialized at image build time, even though we explicitly requested initialization at run time. But what triggered the class initialization? We can find that out by debugging the native image builder. Usually you would do that in your favorite IDE, but here we use the command line debugger jdb that comes with the JDK. First, we need to start the native-image tool under a debugger. Finding out the correct command line and class path to launch it directly from an IDE is tedious, so we provide the option --debug-attach. By default, it waits for a debugger to attach to port 8000:

> $JAVA_HOME/bin/native-image --debug-attach --delay-class-initialization-to-runtime=Startup HelloCachedTime
Listening for transport dt_socket at address: 8000

Now we can attach jdb and start debugging. We want to know why the class initializer of class Startup is executed, so we set a breakpoint in it. As mentioned earlier, the static field initialization TIME = new Date() is an implicit class initializer, so the breakpoint goes on line 13:

> $JAVA_HOME/bin/jdb -attach 8000
Set uncaught java.lang.Throwable
Set deferred uncaught java.lang.Throwable
Initializing jdb ...
>
VM Started: No frames on the current call stack
main[1] stop at Startup:13
Deferring breakpoint Startup:13.
It will be set after the class is loaded.
main[1] run
> Set deferred breakpoint Startup:13
Breakpoint hit: "thread=ForkJoinPool-3-worker-2", Startup.<clinit>(), line=13 bci=0
ForkJoinPool-3-worker-2[1] where
[1] Startup.<clinit> (HelloCachedTime.java:13)
[2] HelloCachedTime.<clinit> (HelloCachedTime.java:4)
[3] sun.misc.Unsafe.ensureClassInitialized (native method)
[...]

The three jdb commands entered above are stop at Startup:13, which sets the breakpoint at line 13 in the class Startup; run, which starts the execution; and where, which prints the stack trace after the breakpoint is hit. The stack trace contains about 50 more lines with internal stack frames that you can ignore. The top lines of the stack trace are enough to tell us the cause of the problem: The class initializer of class Startup (the internal method name for class initializers is <clinit>) is called by the class initializer of class HelloCachedTime.
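The same initialization chain can also be observed on a normal Java VM without a debugger. The following is only an illustrative sketch, not part of the native-image tooling: the hypothetical class TraceStartup records the frames that are on the stack while its class initializer runs. The recorded list starts with TraceStartup.<clinit> followed by TraceCachedTime.<clinit>, mirroring the top of the jdb stack trace; any further frames depend on how the program is launched:

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

class TraceCachedTime {
    // Accessing TraceStartup.TIME here triggers initialization of TraceStartup.
    static final Date CACHED_TIME = TraceStartup.TIME;

    public static void main(String[] args) {
        System.out.println("Startup: " + CACHED_TIME);
        for (String frame : TraceStartup.INIT_FRAMES) {
            System.out.println("  at " + frame);
        }
    }
}

class TraceStartup {
    static final Date TIME = new Date();

    // Frames that were on the stack while this class initializer ran.
    static final List<String> INIT_FRAMES = new ArrayList<>();

    static {
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            INIT_FRAMES.add(frame.getClassName() + "." + frame.getMethodName());
        }
    }
}
```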

So the problem is in line 4 of our example: CACHED_TIME = Startup.TIME. Accessing the field Startup.TIME triggers initialization of the class Startup. This means we also need to initialize the class HelloCachedTime at run time:

> $JAVA_HOME/bin/native-image --delay-class-initialization-to-runtime=Startup,HelloCachedTime HelloCachedTime
[...]
> ./hellocachedtime
Startup: Fri Aug 31 14:02:52 PDT 2018
Now: Fri Aug 31 14:02:52 PDT 2018

Finding all transitive dependencies of a class that trigger initialization is a manual and tedious process. We are currently investigating how we can make it more automatic, i.e., print the list of classes that trigger initialization automatically so that you do not need to use a Java debugger. Watch the release notes for future GraalVM releases for announcements.

Delaying class initialization to run time is a powerful new feature for native image generation introduced in GraalVM 1.0 RC6. We believe it will allow more applications to be executed as native images, bringing the benefits of fast startup and low footprint to a wider range of Java applications. If you encounter any problems or have questions, please let us know.

Christian Wimmer
VM and compiler researcher at Oracle Labs. Project lead for GraalVM native image generation (Substrate VM). Opinions are my own.