Compiling Native Projects via the GraalVM LLVM Toolchain

Josef Eisl · Published in graalvm · Dec 5, 2019 · 17 min read

GraalVM is a high-performance polyglot runtime supporting a variety of different languages, including JavaScript, Ruby, R, Python, and JVM languages such as Java, Scala or Kotlin.

There is one member of the GraalVM language family that is a bit special. While all the aforementioned languages are managed, meaning the language runtime takes care of all memory management, the LLVM runtime executes LLVM bitcode, which is unmanaged. That means there is no garbage collector that frees memory automatically. Instead, users get their hands on raw pointers and need to manage them manually. Also, arrays are not bounds-checked; the programmer is responsible for checking sizes.

Supporting LLVM bitcode brings languages such as C or C++ to the polyglot world of GraalVM. This allows us, for example, to pass a JavaScript object to C code and access it as if it was a C struct, or the other way round, without converting the underlying data to a different representation. (See the LLVM runtime reference for more information.)

Similar to the JVM languages, which are compiled to Java bytecode prior to execution, LLVM bitcode is a binary format that is not written by hand. Instead, frontends compile a source language to bitcode. For example, clang is a frontend that transforms C/C++ into bitcode, which is then compiled to native code using the LLVM code generator.

While for Java projects Java bytecode is the default output format, compiling a native project to LLVM bitcode can be more challenging. LLVM bitcode is mainly used as an intermediate representation during compiling, so most tools focus on that use case. To make this process easier, we started shipping an LLVM toolchain as an experimental feature initially in GraalVM 19.2.0. Since then, we incorporated feedback from our early adopters which further improved the user experience of our toolchain. Thus, with GraalVM 19.3.0, the toolchain is no longer considered experimental. In this blog post, we demonstrate what we can do with our toolchain and how it works in detail.

Prerequisites

The instructions of this post have been tested on Linux and on MacOS. To reproduce this post you will need the following:

  • A GraalVM (version 19.3.0 or higher). You can get it from https://www.graalvm.org/downloads/. For most parts of this blog post, the Community Edition of GraalVM is sufficient. If you want to try out the managed mode of the LLVM runtime, however, you will need the Enterprise Edition.
  • The C standard library headers. On Linux, the package name depends on the distribution you are using: on Debian-based systems the package is likely called libc6-dev, on rpm-based systems the glibc-headers package is a good candidate. On MacOS you need to have Xcode installed.
  • Automake and make for configuring and building software projects. Your package manager can help you install those. (On MacOS, you might want to get it from Homebrew.)
  • A C compiler (optional). We will need a C compiler for building the native reference executable in the beginning. If you want to do that as well, install gcc or clang. Xcode includes clang, so you should already be covered if you are on MacOS.
  • Common utility tools (optional). For inspecting the results we use the tools file, objdump, dwarfdump, nm, ldd (Linux) or otool (MacOS). Most of them should be installed by default.

For the remainder of this post we assume that the GraalVM bin/ folder is on the $PATH. On Linux that folder is located in graalvm-*/bin. On MacOS, however, that would be graalvm-*/Contents/Home/bin. To test the setup, execute lli --help and look for www.graalvm.org at the bottom.

The Running Example

The C version of the pidigits benchmark from The Computer Language Benchmarks Game is a perfect running example for this post. According to the description, the program calculates an arbitrary number of digits of Pi, using an Unbounded Spigot Algorithm. Although the “project” might seem trivial at first sight, it allows us to showcase many interesting features of our toolchain.

Building a Native pidigits

First, let’s build a native executable to see what the procedure looks like without GraalVM. Let’s copy the source code of pidigits.c into a fresh directory which we will reference by ${TMP_DIR}. The benchmark page provides instructions on how to compile the program. Replace gcc with clang if needed:

cd ${TMP_DIR}
gcc -pipe -Wall -O3 -fomit-frame-pointer -march=native pidigits.c -o pidigits.gcc_run -lgmp

We actually do not care about most of the compiler flags like -pipe, -fomit-frame-pointer, or -march=native, so we will leave them out for brevity. Nevertheless, they do not hurt, so feel free to keep them.

Most likely, the above command will fail with something like this:

pidigits.c:9:10: fatal error: gmp.h: No such file or directory

Our pidigits version uses the GNU Multiple Precision Arithmetic Library (libgmp) for working with arbitrary precision numbers. The error above indicates that the libgmp development files are not available. (If the compile command above succeeded, feel free to skip the following section.)

Building a Native libgmp

We could install the library via a package manager, but since compiling is fun, and for the sake of this blog post, we will build it ourselves. You can get the sources from the libgmp website.

cd ${TMP_DIR}
curl -L -O https://gmplib.org/download/gmp/gmp-6.1.2.tar.bz2
tar -xf gmp-6.1.2.tar.bz2

The libgmp project uses a configure script generated by Autoconf, a software configuration system that is popular especially for GNU software. Building Autoconf based packages always follows a similar pattern: the configure script checks for dependencies and sets up the build system, make compiles the project, and make install copies the files to the final destination.

Let’s configure our libgmp build in a fresh directory.

mkdir build-gmp-native
cd build-gmp-native
../gmp-6.1.2/configure --prefix=${TMP_DIR}/native

The --prefix option tells configure where we want to install the results to. We choose a fresh directory ${TMP_DIR}/native. In case configure issues errors, you are probably missing a dependency. The message should tell you how to fix it. If configure succeeded, we are ready to compile and install libgmp.

make
make install

If everything went fine, we find the library in ${TMP_DIR}/native/lib. On Linux, the file is called libgmp.so whereas on MacOS it is libgmp.dylib.

Now we can go back to compiling pidigits. We need to tell the compiler where to find the gmp.h header file as well as libgmp using the -I and -L compiler flags, respectively. On Linux, make install of libgmp suggested to "use the -Wl,-rpath -Wl,LIBDIR linker flag", which will help the dynamic linker to locate the library at run time. On MacOS, the library is located using the absolute file name by default, so we do not need these flags (but again, they do not hurt).

cd ${TMP_DIR}
clang -Wall -O3 pidigits.c -o pidigits.gcc_run -lgmp \
-I${TMP_DIR}/native/include -L${TMP_DIR}/native/lib -Wl,-rpath -Wl,${TMP_DIR}/native/lib

Hurray! We can now calculate arbitrarily many digits of Pi:

./pidigits.gcc_run 1000

Let’s inspect the result more closely. First, we check the file type of the result file.

file pidigits.gcc_run

This will output something like ELF 64-bit LSB shared object, x86-64 on Linux and Mach-O 64-bit executable x86_64 on MacOS, which are the default executable formats on these platforms. The ldd (on Linux) or otool -L (on MacOS) tools allow us to display the dependencies of the executable. Linux will show us something like this (we do not care about the *linux*.so dependencies):

ldd ./pidigits.gcc_run
...
libgmp.so.10 => ${TMP_DIR}/native/lib/libgmp.so.10
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
...

On MacOS it looks a bit different:

otool -L pidigits.gcc_run
pidigits.gcc_run:
${TMP_DIR}/native/lib/libgmp.10.dylib
/usr/lib/libSystem.B.dylib

The result tells us that our pidigits program depends on libgmp as well as on the libc (on MacOS the libc is part of libSystem). If we check for the dependencies of libgmp, we will see that it also depends on libc. All three pieces, pidigits, libgmp, and libc, are native binaries containing machine code for the current architecture (most likely x86_64).

Hello Bitcode, Hello GraalVM LLVM Toolchain!

Instead of running machine code natively, we now want to run pidigits as a bitcode program using the GraalVM LLVM runtime lli. Before we can do that, we need to compile the program to bitcode. Here is where the LLVM toolchain enters the game. The LLVM toolchain is a set of build tools, such as a C compiler and a linker, that enables compiling a native project to bitcode. The LLVM toolchain is not shipped by default with GraalVM, but it can easily be installed using the GraalVM updater:

gu install llvm-toolchain

The GraalVM LLVM runtime launcher lli can show us the location of the toolchain via lli --print-toolchain-path, which will print something like this:

/path/to/graalvm/.../languages/llvm/native/bin

In this directory, we find executables and symbolic links such as clang, gcc, or ld. Many of them point to the same executable; some build systems expect tools like the compiler to have a certain name. The general idea of the toolchain is to get the build system to use the tools from the toolchain directory, so that the result of the compilation can be executed with the GraalVM LLVM runtime.

pidigits does not use any build system so we invoke the compiler directly. To use the toolchain compiler, we simply put the toolchain directory onto the $PATH. To verify that, check whether which clang points to the toolchain.

cd ${TMP_DIR}
export TOOLCHAIN_PATH=`lli --print-toolchain-path`
export PATH=${TOOLCHAIN_PATH}:$PATH
clang -Wall -O3 pidigits.c -o pidigits.bitcode_run -lgmp \
-I${TMP_DIR}/native/include -L${TMP_DIR}/native/lib -Wl,-rpath -Wl,${TMP_DIR}/native/lib

Did you notice the difference to the native compilation we did before? No? Good. :) The goal of the toolchain is to blend in as seamlessly as possible. Switching to the tools from the toolchain directory should be sufficient for most cases. We can now use the GraalVM LLVM runtime to execute the result:

lli pidigits.bitcode_run 1000

What is going on?

Let’s inspect the new executable. file pidigits.bitcode_run yields the same result as before and also ldd/otool does not show anything new. So where does the bitcode live? The answer is, it is embedded in a special section of the executable file. We can list the section headers of the executable with objdump -h pidigits.bitcode_run. Since the file formats differ on Linux and MacOS, the result is also slightly different. On Linux, the bitcode is stored in a section called .llvmbc.

...
9 .llvmbc 00002144 0000000000200860 0000000000200860 00000860 2**4
CONTENTS, ALLOC, LOAD, READONLY, DATA
...

In Mach-O files on MacOS, the bitcode is in the __bundle section.

...
8 __bundle 00002d70 0000000100002000 DATA
...

Running otool -l pidigits.bitcode_run gives us a bit more information about the Mach-O file.

...
Section
sectname __bundle
segname __LLVM
addr 0x0000000100002000
size 0x0000000000002d70
...

So the __bundle section is located in the __LLVM segment.

The section naming was not invented by GraalVM. This is what clang does when compiling with the -fembed-bitcode option. "But we did not supply this option," you say? Right, that is what our toolchain is doing for you. The executables in ${TOOLCHAIN_PATH} are merely wrappers for the actual tool, in this case clang. These wrappers do some option processing and insert additional compiler flags to produce an executable with embedded bitcode.

Why ELF and Mach-O files with Embedded Bitcode?

There are two reasons why the toolchain produces an executable with an embedded bitcode section instead of a plain bitcode file. As we mentioned earlier, we want the toolchain to work out of the box with as many build systems as possible, and most build systems are not happy if the compiler does not produce an executable. Piggybacking the bitcode onto the executable saves us a lot of trouble. In addition, executables offer a feature that plain bitcode files lack: a dependency registry. LLVM bitcode has no way of recording library dependencies (that information is what ldd and otool -L were showing us). The GraalVM LLVM runtime utilizes the information from ELF and Mach-O files to find and load dependencies. And the nice part is that the build system will insert them for free. :)

Under the Hood

If you want to know what is going on under the hood, call the compiler with the -v flag for verbose output.

clang -v -Wall -O3 pidigits.c -c -o pidigits.o -I${TMP_DIR}/native/include

If you do that, you see, along with lots of other arguments, the flag -flto=full. This flag turns on link time optimization (LTO). Normally, clang and most other compilers translate each source file to an object file containing native machine code. These object files are then handed to the linker, which combines them into an executable. So although clang used bitcode while compiling a single source file, at link time, all bitcode is already gone. This conflicts with our goal of building an executable including bitcode. To fix that, we utilize link time optimization. With LTO, clang will not produce native object files when compiling C sources, but bitcode files. We can verify that by running file pidigits.o, which should say something like LLVM bitcode. Since the linker now processes bitcode files it can perform inter-module optimizations on LLVM IR level. The linked bitcode is then transformed into native code. At this stage, we tell the linker to not only include the machine code, but also the LLVM bitcode in the resulting executable.

This last crucial step is currently not supported by LLVM by default. That is why we ship a custom version of LLVM with our toolchain. However, we are currently in the process of contributing the change back to LLVM.

pidigits in Bitcode

Let’s recap what we did up until now. By simply switching from the default compiler to our toolchain compiler, we got an executable with embedded bitcode that can be executed by the GraalVM LLVM runtime. Since the bitcode is embedded in an ELF or Mach-O file, the LLVM runtime also knows which libraries to load. The dependencies look similar to before.

The pidigits program is now executed as bitcode. All the other parts, libgmp and libc, are still in native code. That means, whenever pidigits calls a gmp library function, the LLVM runtime performs a native call. Since native code is a black box for the LLVM runtime, such calls are an optimization barrier.

libgmp in Bitcode

So what can we do to let the LLVM runtime optimize libgmp? We can also compile it to bitcode! With the toolchain at hand, this is again simple. Make sure that the toolchain is still on the path by checking that which clang points to the GraalVM toolchain directory. Now let's compile and install libgmp into a fresh directory.

cd ${TMP_DIR}
mkdir build-gmp-bitcode
cd build-gmp-bitcode
../gmp-6.1.2/configure --prefix=${TMP_DIR}/bitcode
make
make install

We also need to build a new version of pidigits which links against the bitcode libgmp.

cd ${TMP_DIR}
clang -Wall -O3 pidigits.c -o pidigits.bitcode_gmp_run -lgmp \
-I${TMP_DIR}/bitcode/include -L${TMP_DIR}/bitcode/lib -Wl,-rpath -Wl,${TMP_DIR}/bitcode/lib

Let’s see whether it runs …

lli pidigits.bitcode_gmp_run 1000

… and it does not.

External function __gmpn_mul_1 cannot be found.

Bummer! What went wrong? Let’s investigate. The error tells us that the GraalVM LLVM runtime cannot find the symbol __gmpn_mul_1 in libgmp. We can use nm to list all symbols in the library (on MacOS the library is called libgmp.dylib).

nm ${TMP_DIR}/bitcode/lib/libgmp.so | grep __gmpn_mul_1

The command will print something like the following:

000000000003eb50 T ___gmpn_mul_1

The T in the output tells us that the symbol is in the text section of the library, which is where the native machine code is stored. Note that nm only prints native symbols, not bitcode symbols. So we verified that __gmpn_mul_1 exists in native code, but the LLVM runtime says it is missing from the embedded bitcode. We need to dig deeper. To get an idea of where the function is coming from, let’s have a look at the debug information. Both platforms the GraalVM LLVM runtime supports, Linux and MacOS, use the DWARF format, which we can inspect with the dwarfdump tool. On Linux, the debug information is contained in the library.

dwarfdump ${TMP_DIR}/bitcode/lib/libgmp.so

On MacOS, it is usually stored in a separate directory.

dwarfdump ${TMP_DIR}/build-gmp-bitcode/.libs/libgmp.10.dylib.dSYM/Contents/Resources/DWARF/libgmp.10.dylib

Let’s search for the missing function _gmpn_mul_1.

...
DW_AT_name _gmpn_mul_1
DW_AT_decl_file 0x00000001 .../build-gmp-bitcode/mpn/tmp-mul_1.s
...

So the missing function is defined in a file called tmp-mul_1.s. The .s extension indicates that this file is an assembly file. During compilation, that file is not compiled from C code to bitcode by clang; instead, the assembler translates the assembly code directly into a native object file. However, to be able to execute the function with the LLVM runtime, we need a bitcode version of that function.

Unfortunately, there is no general solution to this problem. In many cases, however, there is something we can do. What is the reason for having parts of a project written in assembler? Usually it is a performance optimization. A processor might support special machine instructions that are not easy to target from plain C code. Writing small performance-critical parts of an application in assembler can give a performance boost with acceptable maintenance overhead.

Assembler parts are always specific to one architecture. In order to be cross-platform compatible, many projects include a generic C implementation used on platforms without a specialized assembler version. Depending on the build system, we might be able to utilize this to get a bitcode implementation.

For libgmp we can check whether configure has something to offer.

cd ${TMP_DIR}
gmp-6.1.2/configure --help

We are lucky. Somewhere in the long list of options we find an entry regarding assembly.

...
--enable-assembly enable the use of assembly loops [default=yes]
...

So let’s reconfigure and rebuild libgmp with the --disable-assembly flag.

cd ${TMP_DIR}
mkdir build-gmp-bitcode-no-asm
cd build-gmp-bitcode-no-asm
../gmp-6.1.2/configure --prefix=${TMP_DIR}/bitcode --disable-assembly
make
make install

We don’t need to recompile pidigits since we installed the new version of libgmp into the same directory. So we can simply rerun the program.

cd ${TMP_DIR}
lli pidigits.bitcode_gmp_run 1000

This time it succeeds.

Now both pidigits and libgmp are executed by the GraalVM LLVM runtime, and calls between them can potentially be optimized. The libc, however, is still native machine code.

In the previous examples, we mixed the execution of bitcode and native code at application run time. That means data is passed between parts that are executed via the LLVM runtime and parts that are executed natively. While the LLVM runtime could organize the data in the parts it executes arbitrarily, the native libc can only deal with raw memory. Therefore, the GraalVM LLVM runtime also uses native memory in the mode where it interacts with native libraries. A memory allocation in bitcode will trigger a native heap allocation by the LLVM runtime, and memory operations like load and store will simply operate on the raw pointer into the heap. We call this the native mode of the GraalVM LLVM runtime. While this allows full compatibility with native libraries, it is subject to the same issues as native code: use-after-free bugs, buffer overflows, and segfaults. Want to see that in action? Here you go.

lli pidigits.bitcode_gmp_run

If you forget to add the parameter for pidigits you will be confronted with a segmentation fault. (On MacOS the program might just hang, which is just a different symptom of the same issue. Try executing lli with the --jvm flag if you want to see the segmentation fault as well.) The message tells you that you are trying to access memory that does not belong to you. By looking at the main of pidigits.c, the problem becomes obvious.

int main(int argc, char **argv) {
ui d, k, i;
int n = atoi(argv[1]);
...

The argument argv[1] is accessed without checking the number of arguments (argc) passed to the program. Just in case you are wondering: Yes, the very same happens if you call the native version ./pidigits.gcc_run without arguments. It would be nice if the runtime would tell us that we are accessing the argv array out of bounds, but in an unmanaged language, the information about the length of the array is not available at run time. The segmentation fault in our example is actually not so bad, since we are in the comfortable situation that we notice the failure. Things are worse if no error is reported and the program, for example, accesses data that should be private.

Managed Execution of Native Code

The LLVM runtime of the GraalVM Enterprise Edition (EE) supports a managed mode, which can help in situations like those outlined above. The idea is simple. Instead of allocating on the native heap, all memory is managed by the runtime and every access is checked. This avoids the problems outlined above, such as illegal pointer accesses, out-of-bounds array accesses, and more. As a bonus, we also get garbage collection for free. We will only cover the very basic properties of managed mode here. There is an entire blog post on that feature, if you want to know more.

Since in managed mode all memory is abstracted, we cannot simply pass it to native functions, as that would weaken our safety guarantees and would prevent memory layout optimizations. Therefore, all parts of the application need to be available in bitcode, including the libc. The LLVM runtime of GraalVM EE already comes with a version of the musl libc compiled to bitcode. Since the musl libc is not binary compatible with other libc implementations, we need to recompile our programs if we want to use them in managed mode. Also, the C library uses syscalls to communicate with the operating system. Since syscalls could be used to undermine our security layer, we virtualize them in managed mode. Syscalls are different on every platform, so to keep the approach scalable, managed mode only supports Linux x86_64 syscalls. On other platforms, for example MacOS, we need to cross-compile. Again, the toolchain tries to shield off this complexity as much as possible.

Managed pidigits

Let’s rebuild our running example for managed mode. First, we need to get the managed toolchain. The command is the same as before, but we execute lli with --llvm.managed to turn on managed mode.

export MANAGED_TOOLCHAIN_PATH=`lli --llvm.managed --print-toolchain-path`
export PATH=${MANAGED_TOOLCHAIN_PATH}:$PATH

You can check whether that worked by looking for managed in the path printed by which clang. We are now ready to build a managed libgmp.

cd ${TMP_DIR}
mkdir build-gmp-bitcode-managed
cd build-gmp-bitcode-managed
../gmp-6.1.2/configure --prefix=${TMP_DIR}/bitcode-managed --disable-assembly --host=x86_64-unknown-linux
make
make install

The --host=x86_64-unknown-linux flag tells configure that we want to build for Linux x86_64. Strictly speaking, it is only needed on non-Linux, non-x86_64 platforms, but it does not hurt to always add it. Note that this flag only tells configure what is going on, for example that it should build a libgmp.so instead of a libgmp.dylib on MacOS. The managed toolchain compiler will always compile for Linux x86_64. Thus, building pidigits looks similar to before.

cd ${TMP_DIR}
clang -Wall -O3 pidigits.c -o pidigits.bitcode-managed_gmp_run -lgmp \
-I${TMP_DIR}/bitcode-managed/include -L${TMP_DIR}/bitcode-managed/lib -Wl,-rpath -Wl,${TMP_DIR}/bitcode-managed/lib

We are ready to run a managed pidigits.

lli --llvm.managed pidigits.bitcode-managed_gmp_run 1000

We should see the same results as before.

All the pieces of our application are now managed bitcode, executed by the GraalVM LLVM runtime.

Now let’s check what happens to the segfault example.

lli --llvm.managed pidigits.bitcode-managed_gmp_run

And indeed, instead of a segmentation fault, we get an Illegal null pointer access error message. An embedder could catch this error and continue gracefully. (If you are wondering why it is a null pointer access and not an index-out-of-bounds error: the argv array is always terminated by a null pointer, according to the C standard, section 5.1.2.2.1/2.)

Conclusion

In this post, we showed how our LLVM toolchain makes it easy to compile a native project to bitcode in order to run it on the GraalVM LLVM runtime. Although we needed a tweak or two to get the right result, compiling to bitcode was not fundamentally different from compiling to a native executable. The toolchain is a best-effort approach, however: we tried hard to support as many use cases out of the box as possible, and we will continue to improve it.

In this post, we only discussed the command line interface of the toolchain. However, there is also a Java API for accessing it from other languages within GraalVM. Python, Ruby, and R already use this API to compile native extensions to bitcode. The user of a language simply installs a package, for example using gem in Ruby, and the toolchain takes care of compiling the native code to bitcode.

Currently, the toolchain supports C and C++ in native mode and C in managed mode (edit 8 Jun 2020: C++ support for managed mode was added in GraalVM 20.1). We might expand the toolchain to cover other languages in the future, but we will focus on those cases which are common in practice.

We encourage you to give the toolchain a try and tell us about your experience. What worked out, what did not? Which configuration flags were needed in order to compile a certain project for the GraalVM LLVM runtime? If you have feedback or feature requests, please create an issue in the GitHub repository or talk to us on Twitter: @graalvm.

Working at @OracleLabs on @GraalVM. Opinions are my own.