CLI applications with GraalVM Native Image

Published in graalvm, Nov 13, 2020

GraalVM Native Image technology allows you to compile your applications into native executables. Some of the great benefits you get from it are:

small standalone distribution not requiring a JDK
instant startup
lower memory footprint

Native image is a very exciting way to deploy apps to cloud environments where these benefits determine the actual cost of running your applications.

However, there is another class of applications that benefits from the above characteristics: command line apps!

Indeed, a good command line app is truly standalone, and not standalone in the docker run some-image sense, which still involves downloading the infrastructure and the dependencies of the app, just conveniently packaged. That kind of convenience has been possible for ages: slap some @Grab annotations on the Groovy code, provide a wrapper to download the JDK, and call it a day. Native Image gives you a binary that is functional on its own as soon as you wget it.

A good CLI app starts fast! And if it doesn’t have a lot to do it dies fast too!

GraalVM Native Image precompiles your Java applications into native binaries, which don’t need to perform lots of boot-time initialization, like loading and verifying class files, and have startup times similar to apps written in other native compiled languages: Go, C/C++, etc.

Native executables also don’t need to initialize and allocate memory for the JIT compiler, because there is no JIT compiler! Everything is compiled at build time. No JIT compiler means no code caches and no profile data caches, which results in lower memory requirements than the same application running on the JVM. Your app will still consume memory for its own data, of course, so if there’s a lot of data to process, memory usage will grow as necessary, but it is still configurable with the usual -Xmx and related command line options.

All in all, it checks a lot of the boxes for how a good CLI app should be. But in this article I want to explore one more cool feature of GraalVM native image — specifying the default memory requirements at the image build time.

Normally, for a Java application running on the JVM, you need to specify how much memory it can use at runtime. You run it with a command line like java -Xmx4G -jar myApp.jar and know that this application has its heap size limit at 4G. Without the -Xmx option the heap size will be determined by a heuristic, typically based on the total amount of memory available.
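A quick way to see which limit is actually in effect is to ask the runtime itself. The sketch below is illustrative only: the class name HeapLimit is made up for this example, and it relies solely on the standard Runtime.maxMemory() call, which reports the maximum amount of memory the runtime will attempt to use for the heap.

```java
// Hypothetical helper, not part of the sample project.
final class HeapLimit {

    /** Maximum heap the runtime will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // With -Xmx set this reflects the explicit limit;
        // without it, the value comes from the default heuristic.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it once with an explicit -Xmx value and once without should show the difference between a configured limit and the heuristic one. The exact figures depend on the machine and the garbage collector in use.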

When you use GraalVM Native Image to compile your app, you can pre-set the heap size at build time. Let’s look at an example.

Let’s create a sample CLI application using Micronaut and Picocli and see how you can configure native executable memory.

You can go to launch.micronaut.io and create an app by specifying necessary components or use the command line utility mn. What’s interesting about mn is that it is itself a Micronaut command line app generated by GraalVM Native Image — a perfect example of a Java CLI application!

I’m using Micronaut 2.1.2:

mn create-cli-app primes; cd primes

The app is ready: it’s a template for a CLI app that can respond to basic commands like --help. Let’s add the business logic for a very important task, calculating prime numbers. Create PrimesComputer.java in the primes package using your favourite IDE.

It’s a very simple component, which can pick 2 random numbers up to the upperbound and return the list of prime numbers between them.

The calculation of the prime numbers is done using the Streams API, which is sort of inefficient but also not relevant for this exercise. This is actually remarkably convenient for the sample app since every calculation will generate a bit of data that is not used afterwards — i.e., garbage.
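The original listing is not reproduced in this text, so here is a minimal sketch of what such a component could look like, based only on the description above. The method names (random, primeSequence, isPrime) are assumptions, and the annotation that would make the class an injectable Micronaut bean is left out so that the sketch compiles with the JDK alone.

```java
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

// Sketch only: names and structure are assumptions, not the article's listing.
final class PrimesComputer {
    private final Random random = new Random();

    /** Picks two random numbers below upperbound and returns the primes between them. */
    List<Long> random(int upperbound) {
        int a = random.nextInt(upperbound);
        int b = random.nextInt(upperbound);
        return primeSequence(Math.min(a, b), Math.max(a, b));
    }

    /** All primes in the closed range [min, max]. */
    static List<Long> primeSequence(long min, long max) {
        return LongStream.rangeClosed(min, max)
                .filter(PrimesComputer::isPrime)
                .boxed() // boxing allocates objects that become garbage after each call
                .collect(Collectors.toList());
    }

    /** Trial division up to the square root, expressed as a stream. */
    static boolean isPrime(long n) {
        return n > 1 && LongStream.rangeClosed(2, (long) Math.sqrt(n))
                .noneMatch(d -> n % d == 0);
    }
}
```

Every call builds a fresh stream pipeline and a fresh list of boxed values, which is what makes this shape of code a convenient garbage generator for the memory experiments later on.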

Let’s put it to use in the main PrimesCommand.java file:

It declares two integer options that our CLI utility will accept — how many iterations of the calculation to run and what’s the upper limit for the numbers to consider.

You can now try the app and see it work. For convenience, delete the test class left over from the template (we could of course edit it to match our functionality, but removing it is simpler):

rm src/test/java/primes/PrimesCommandTest.java

Build the project now and run the resulting jar file:

./gradlew build 
...
java -jar build/libs/primes-0.1-all.jar
00:51:30.519 [main] INFO  i.m.context.env.DefaultEnvironment - Established active environments: [oraclecloud, cloud, cli]

You can see how quickly a Micronaut application starts! In less than a second!

We can also run it with some values for the arguments, so we see some output:

java -jar build/libs/primes-0.1-all.jar -n 1 -l 100 
[53, 59, 61, 67, 71, 73]

Remember, in the beginning of the article we said a CLI tool should start fast and use reasonably small amounts of memory. Let’s test the Java version just to have a baseline for those metrics:

/usr/bin/time -v java -jar build/libs/primes-0.1-all.jar -n 1 -l 100
16:34:02.776 [main] INFO  i.m.context.env.DefaultEnvironment - Established active environments: [oraclecloud, cloud, cli]
[53, 59, 61, 67, 71, 73]
    Command being timed: "java -jar build/libs/primes-0.1-all.jar -n 1 -l 100"
    User time (seconds): 2.73
    System time (seconds): 0.43
    Percent of CPU this job got: 309%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.02
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 354996
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 54451
    Voluntary context switches: 9774
    Involuntary context switches: 53
    Swaps: 0
    File system inputs: 0
    File system outputs: 64
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

I’m using the Linux time utility (or gtime on macOS) and in the verbose mode it shows a few interesting metrics:

Max memory used by the app (max RSS): 355M
Total wall clock time: 1.02s
CPU usage: 309%
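If you want to pull the memory figure out of the verbose output programmatically, something like the following minimal sketch could work. The class name TimeReport and its helper methods are hypothetical, and the conversion simply divides the reported kbytes value by 1000, which matches the rounding used in the summary above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading the verbose output of the time utility.
final class TimeReport {
    private static final Pattern MAX_RSS =
            Pattern.compile("Maximum resident set size \\(kbytes\\): (\\d+)");

    /** Extracts the peak RSS value (as reported, in kbytes), or -1 if the line is absent. */
    static long maxRssKb(String timeOutput) {
        Matcher m = MAX_RSS.matcher(timeOutput);
        return m.find() ? Long.parseLong(m.group(1)) : -1;
    }

    /** Converts the reported kbytes figure to a rounded number of megabytes. */
    static long toMegabytes(long kb) {
        return Math.round(kb / 1000.0);
    }

    public static void main(String[] args) {
        String sample = "    Maximum resident set size (kbytes): 354996\n";
        System.out.println(toMegabytes(maxRssKb(sample)) + "M"); // prints "355M"
    }
}
```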

Note that the machine I’m testing this on is a decent cloud VM, with a few CPUs and some RAM, which I often use as my remote developer machine.

This machine has quite a bit of memory available, which is perfect for development, but a bit detrimental to the heuristic-based limit configuration: a good heuristic will see it has access to a ton of resources and decide to consume as much as is convenient. That’s why we shouldn’t yet judge whether the 350M consumed by a simple invocation of our utility is too much.

Remember, these heuristics make a lot of sense for a server environment where your application is the sole consumer of the resources, even if they also make CLI apps look more memory hungry than necessary. I’ll leave 300% of CPU and 1s to compute 6 prime numbers without comment; it’s as good a baseline as any.

Now a much better way to run this application is as a native executable generated by GraalVM Native Image. Fortunately, building a native executable of the Micronaut application is very straightforward as there’s a Gradle plugin that helps with the configuration and execution of the native-image utility.

Run ./gradlew nativeImage and after compiling and building the executable you'll have a standalone binary file ready for deployment.

The executable is written to the build/native-image/ folder. For convenience, move it to the current directory: mv build/native-image/application ./primes-defaults

We can run it now and see that it works the same way as the Java version: ./primes-defaults -n 1 -l 100

Now run the same time command to check whether the native executable behaves more like a CLI app should:

/usr/bin/time -v ./primes-defaults -n 1 -l 100
23:57:47.286 [main] INFO  i.m.context.env.DefaultEnvironment - Established active environments: [oraclecloud, cloud, cli]
[53, 59, 61, 67, 71, 73]
 Command being timed: "./primes-defaults -n 1 -l 100"
 User time (seconds): 0.01
 System time (seconds): 0.01
 Percent of CPU this job got: 113%
 Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.02
 Average shared text size (kbytes): 0
 Average unshared data size (kbytes): 0
 Average stack size (kbytes): 0
 Average total size (kbytes): 0
 Maximum resident set size (kbytes): 49980
 Average resident set size (kbytes): 0
 Major (requiring I/O) page faults: 0
 Minor (reclaiming a frame) page faults: 1226
 Voluntary context switches: 232
 Involuntary context switches: 2
 Swaps: 0
 File system inputs: 0
 File system outputs: 0
 Socket messages sent: 0
 Socket messages received: 0
 Signals delivered: 0
 Page size (bytes): 4096
 Exit status: 0

Here are the highlights:

50M RSS
20 ms total time
113% CPU

This is much, much better of course, but since we’re not configuring the memory limits, it’ll still use the heuristics and grow memory much more aggressively than you might expect if we increase the workload.

Luckily, the number of iterations is one of the parameters, so we can experiment very easily. Here’s the output of running 100K iterations of the same task:

Command being timed: "./primes-defaults -n 100000 -l 100"
    User time (seconds): 0.88
    System time (seconds): 0.63
    Percent of CPU this job got: 72%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:02.07
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 309956
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 65110
    Voluntary context switches: 199
    Involuntary context switches: 5
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

309M RSS
2s
72% CPU

For completeness, the JVM version gives me these results:

517M RSS
3.3s
171% CPU

Now, if you still think 300M is more RAM than a well-behaved CLI utility should take, you can control that with the -Xmx and -Xmn options at runtime, like this:

./primes-defaults -Xmx64m -Xmn16m -n 100000 -l 100

Or you can pre-build the default config values into the native image.

Add the following snippet to the build.gradle file:

nativeImage {   
  args("-R:MaxHeapSize=64m")   
  args("-R:MaxNewSize=16m") 
}

Here we specify the maximum heap size of 64m and the new generation size of 16m for the native image that will be built. These options will become the defaults for the native image executable. Build the native image again with ./gradlew nativeImage, to test it.

Move the binary too, so it won’t be overwritten by the future builds: mv build/native-image/application ./primes-limits

It still works the same:

./primes-limits -n 1 -l 100 
00:08:08.228 [main] INFO  i.m.context.env.DefaultEnvironment - Established active environments: [oraclecloud, cloud, cli] 
[53, 59, 61, 67, 71, 73]

But when we apply the load (the 100K iterations test), we can see the heap configuration in action: the RSS doesn’t grow uncontrollably.

Figure: timing the native executable with the heap size preconfigured during the native image build.

On my test machine I get:

60M RSS
2.02s total time
77% CPU

The native executable built this way still respects -Xmx and -Xmn at runtime, so the pre-baked limits carry no downside: you can raise them if needed.

Conclusion

GraalVM native image technology is great for command line applications. It produces a standalone binary that doesn’t depend on the JVM, starts instantly, doesn’t use a lot of memory, and lets you preconfigure reasonable heap configuration values for the workloads you expect by default. In this article we showed a sample command line application built with Micronaut and Picocli, which is a terrific combination that works really well and is a good fit for native images.

There are a number of projects that use GraalVM native image for their CLI apps:

There are many, many others. It’s probably a good fit for creating Kubernetes operators too, since they are rather similar to CLI apps, but perhaps that’s a topic for a follow-up article.

If you’re using GraalVM Native Image for command line applications, please tell us about it and how it works for you in the comments, or on Twitter, or Slack. We’re always looking to improve GraalVM further!

Written by Oleg Šelajev