Supercharge Your Java Apps with Python

Published in graalvm · 8 min read · Jun 3, 2021

The GraalVM ecosystem consists of a very interesting collection of languages: JavaScript, Ruby, Python, WebAssembly, Java, LLVM bitcode, and more. All of them bring unique advantages. Python on GraalVM opens up the rich ecosystem of Python data science libraries to Java developers. While the Python support on GraalVM is still experimental, you can use it today to extend your Java applications with Python code and libraries.

In this article we look at an example application developed by students at the Hasso Plattner Institute (HPI) in Potsdam which can be used as a template for using Python libraries from Java on GraalVM. The full source code is available at the GitHub repository of the HPI Software Architecture Group.

For more context on the application described in this article, the GraalPy implementation, its performance, and more, watch "Using Python from Java with GraalVM".

Using Maven to Do Python

The template is an example Java AWT application that can render binomial functions using the PyGal Python library. Let’s walk through the template’s code. Note that we simplified it a little for the sake of clarity.

The first question is how to get started. This being a Java application and a Maven project, it is natural to start with the pom.xml. To make the Python integration work seamlessly, we consider the following aspects when writing the POM:

The project must run on GraalVM with Python installed.
All Python dependencies are declared in the POM and installed when running Maven, just like the Java dependencies.
Required Python packages and files should be included in our application’s resources.

GraalVM Makes It Possible

The only supported way to run Python on GraalVM is to use GraalVM builds and have the Python component installed. Ensuring we run on GraalVM in a Maven project is quite simple. We use maven-enforcer-plugin to check that the JAVA_HOME environment variable points to a GraalVM distribution. This will also come in handy when we want to interact with the Python language tools in the GraalVM tree.

The enforcer configuration checks for the `graalpy` executable in the `JAVA_HOME` directory and, if it is not found, fails the build with a helpful error message.

Adding the Python Package Universe

Installing Python packages with Maven is a bit more involved and requires a bit of understanding of how Python applications are usually packaged and distributed. Python libraries and packages can be installed system-wide or per user, but that is rarely desirable if we want to distribute self-contained applications and avoid conflicts with the rest of the system. For that reason the Python community recommends using the venv Python module to create a virtual environment for your projects. The venv module is part of the Python 3 standard library. Users of older Python 2 versions may remember a virtualenv package that served the same purpose, but which was external to the Python standard library.

Virtual environments are created simply by running the venv module with a folder name into which the environment should be created. A Python virtual environment is initially just a collection of scripts and symbolic links that tell the Python runtime where to install and load packages. Once created, a virtual environment has a pip launcher in its bin directory that can be used to install Python packages from PyPI into the environment.
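To make the folder structure concrete, here is a minimal sketch that drives the standard-library venv module from Python code. It assumes a stock Python 3 interpreter rather than GraalPy; the directory name demo-venv is made up for this example, and with_pip=False skips bootstrapping pip so the sketch stays fast.

```python
import os
import tempfile
import venv
from pathlib import Path

# Create a throwaway virtual environment; "demo-venv" is an arbitrary name.
target = Path(tempfile.mkdtemp()) / "demo-venv"
# Use symbolic links everywhere except Windows, like the command-line tool does.
venv.create(target, with_pip=False, symlinks=(os.name != "nt"))

# pyvenv.cfg marks the folder as a virtual environment.
config = (target / "pyvenv.cfg").read_text()
print(config)

# Launchers live in bin/ on POSIX systems and in Scripts/ on Windows.
launcher_dir = target / ("Scripts" if (target / "Scripts").exists() else "bin")
print(sorted(p.name for p in launcher_dir.iterdir()))
```

The printed listing shows the interpreter launchers and activation scripts. With pip enabled, a pip launcher would sit in the same directory.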

To prepare the virtual environment, we use the exec-maven-plugin. We call the venv module that was shipped with our GraalVM Python (found in JAVA_HOME as ensured above), and then use the pip launcher to install PyGal, thus ensuring that a simple mvn generate-resources will install the required Python packages.
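The two steps boil down to two command lines. The sketch below only builds and prints them. It assumes that the graalpy launcher lives in the bin directory under JAVA_HOME and that the environment is created in a folder called target/classes/venv. Both paths and the helper name venv_setup_commands are assumptions of this sketch, not taken from the template's POM.

```python
import os
from pathlib import Path


def venv_setup_commands(java_home, venv_dir, packages=("pygal",)):
    """Return the two command lines: create the venv, then pip-install into it."""
    graalpy = Path(java_home) / "bin" / "graalpy"  # assumed launcher location
    pip = Path(venv_dir) / "bin" / "pip"           # pip launcher inside the venv
    return [
        [str(graalpy), "-m", "venv", str(venv_dir)],
        [str(pip), "install", *packages],
    ]


# "/path/to/graalvm" and "target/classes/venv" are placeholders.
java_home = os.environ.get("JAVA_HOME", "/path/to/graalvm")
for command in venv_setup_commands(java_home, "target/classes/venv"):
    print(" ".join(command))
    # To actually run a step: subprocess.run(command, check=True)
```

In the Maven build, each of these command lines would correspond to one execution of the exec-maven-plugin.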

Package Them Up

Resource bundling is built into Maven as a core concern, so we just need to understand what those resources are in the case of Python. When we created a virtual environment, a folder structure was created that includes launchers and the packages we installed. You might be tempted to cut down on size by only bundling the packages. This is ill-advised, however.

Python and its tooling are built around the command line and so are virtual environments. In fact, virtual environments use the executable files themselves as markers for where to look for packages. Since we don’t want to rely on the precise implementation of how this works, we just bundle everything:

Connecting Java to Python

The main function creates a simple AWT frame with an SVG canvas and an input field where the user can type a formula. We use the GraalVM embedding API to create a Python context and load our library code. In the callback of the input field we call a Python function to generate new SVG data that we push to the SVG canvas for display.

Creating a Python Context

The first interesting bit is how to use the embedding API correctly with the Python virtual environment we have created. To do so we must set a few options before creating the GraalVM Context:

Let’s go through this. First, we create a Context.Builder and request the "python" language to be available. (More languages are implicitly enabled. For example, Python depends on the "llvm" language for its C extension support.) Next, we set the flag to allow all access to native code, the file system, etc. It's ok for now to start with all permissions to get things going and whittle down to what we need later.

The two option calls go hand in hand and need a bit more Python background. The Python executable on your machine always, as part of its startup code, executes the equivalent of import site. The site module is responsible for setting up the package paths for user and system packages, as well as discovering if the executable is inside a virtual environment and then setting the package paths accordingly.
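The effect of that startup step can be observed from Python itself. This small sketch assumes a regular Python 3 interpreter; the helper name in_virtual_environment is made up for illustration.

```python
import sys


def in_virtual_environment():
    """True when the interpreter was started from inside a virtual environment."""
    # In a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("executable: ", sys.executable)
print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in venv:    ", in_virtual_environment())
# Package directories that ended up on the import path.
print([p for p in sys.path if "site-packages" in p])
```

Run from inside and outside a virtual environment, the script prints different prefixes and package paths. Either way, the detection hinges on import site running at startup.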

This may not always be desirable for a Python embedding and thus we need the first option, ForceImportSite, to enable it. Here we also run into a problem: the site module uses the launch executable path to determine package paths — but we are launching a Java application! This is what the second option is for: we tell the Python runtime it should act as if it was launched from an executable inside the virtual environment. Where do we get VENV_EXECUTABLE? Simple:

Preparing the Java Code

We could use the embedding API directly to get Value instances from the Python space and interact with them. However, to decouple the Java and Python code, it makes sense to use a Java interface. This way, we could more easily have other rendering backends not based on Python in the future.

We do have to evaluate some Python code, but we will set it up so that we only need to import one Python class that implements the GraphRenderer interface. After we get a handle to the Python class PygalRenderer we instantiate an instance of it. We can then wrap the object and expose it as an implementation of the GraphRenderer interface and benefit from a bit more static typing during development:

Preparing the Python Code

Now that we have set up what we want the Java code to look like, let’s go and write a Python file to load. The PyGal API is a bit too low-level for our purposes. To match the GraphRenderer interface we have defined in Java, we create a Python class with a render function that takes two parameters in addition to the (in Java implicit) reference to the object instance:

What should the code do? Well, it should calculate the values for each step at some zoom level:
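A minimal sketch of that calculation follows. The function name compute_values, its parameters, and the reading of "zoom" as the half-width of the x range are assumptions of this sketch, not taken from the repository.

```python
import math


def compute_values(expression, steps, zoom=1.0):
    """Evaluate `expression` at `steps` evenly spaced x values in [-zoom, zoom]."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    # Names the expression may use: the public contents of math, plus x.
    namespace = {name: getattr(math, name) for name in dir(math) if not name.startswith("_")}
    width = 2.0 * zoom
    points = []
    for i in range(steps):
        x = -zoom + width * i / (steps - 1)
        # eval runs arbitrary Python code; emptying __builtins__ is not a sandbox.
        y = eval(expression, {"__builtins__": {}}, {**namespace, "x": x})
        points.append((x, y))
    return points


print(compute_values("x**2", steps=5, zoom=2.0))
# [(-2.0, 4.0), (-1.0, 1.0), (0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]
```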

As you can see, we just evaluate the string the user put into the input field. This code doesn’t do any error checking, but the user could really be running any Python expression here so in a real application we should be more careful validating the input, and we should consider using GraalVM’s sandboxing features to minimize the capabilities of the Python code.

Once we have the values, we just call PyGal to render an SVG. The SVG data is a Python bytes object with UTF-8 encoding:

Since the Java interface expects an InputStream return value, we subclass java.io.InputStream in Python and use that subclass as our return value:

Here, SVGInputStream is a proper subclass of java.io.InputStream, only with some methods implemented in Python. Since it is not a Python object but a proper Java object, it is "sealed," meaning that we cannot dynamically add members like we can for Python objects. Python on GraalVM thus offers a special this member only visible from Python, on which we can define additional members dynamically. These, however, are only visible from the Python code; there is no way to reference them from Java.

So how do we implement SVGInputStream? Using the normal Python syntax:

A pitfall is that the InputStream abstract class has an abstract int read() method that we must define in Python. However, since Python does not support function overloading like Java does, the Python read method will implement not only read(), but also override the default InputStream implementations for read(byte[]) and read(byte[], int, int). So our Python implementation must handle all three variants in one method. Since we are dealing with bytes, value ranges must also be considered — Python bytes are unsigned whereas Java bytes are signed. For details of the SVGInputStream implementation, you can refer to the repository.
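To illustrate the single-method dispatch and the sign conversion, here is a pure-Python stand-in. It is a sketch, not the repository's implementation: the class ByteSource and the helper to_signed are hypothetical names, the class does not subclass java.io.InputStream, and a plain Python list stands in for the Java byte[].

```python
def to_signed(byte):
    """Map a Python byte value (0..255) to Java's signed byte range (-128..127)."""
    return byte - 256 if byte > 127 else byte


class ByteSource:
    """Stand-in stream with one read method covering all three Java variants."""

    def __init__(self, data):
        self._data = bytes(data)
        self._pos = 0

    def read(self, buffer=None, offset=0, length=None):
        # Variant 1, read(): return the next byte as 0..255, or -1 at end of stream.
        if buffer is None:
            if self._pos >= len(self._data):
                return -1
            value = self._data[self._pos]
            self._pos += 1
            return value
        # Variants 2 and 3, read(byte[]) and read(byte[], int, int).
        if length is None:
            length = len(buffer) - offset
        if length == 0:
            return 0
        chunk = self._data[self._pos:self._pos + length]
        if not chunk:
            return -1
        for i, byte in enumerate(chunk):
            buffer[offset + i] = to_signed(byte)  # a Java byte[] holds signed values
        self._pos += len(chunk)
        return len(chunk)


source = ByteSource(b"\x41\xff\x80")
print(source.read())            # 65
buf = [0] * 4
print(source.read(buf))         # 2
print(buf)                      # [-1, -128, 0, 0]
print(source.read(buf, 1, 2))   # -1
```

Note the asymmetry: the single-byte variant returns an unsigned value as an int, while bytes written into the buffer must be converted to the signed range.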

Finally, we need to export the PygalRenderer to the Java code. Python has no implicit "global" namespace; everything is inside a module (even the REPL you get when you run the Python executable is simply inside the __main__ module). To export the Python class into the global GraalVM Polyglot namespace, we use the following code:

Plugging in

Now that we have sorted out how the components interact and have written the code around it, let’s run our application with the mvn exec:exec command:

The finished AWT application rendering a PyGal graph

Simple, yet beautiful. You can type x**2 or sin(x) into the function field and press Return. The first render takes several seconds on my machine and compilation threads start up to optimize this unique combination of Java and Python code. Subsequent renders get faster and after around 5 or 6 renders (on my machine), each new render request takes less than one second and the machine load goes down since all code has been compiled by the GraalVM JIT.

With GraalVM Native Image, we have a technology to compile the Java code ahead of time. We are actively working on allowing warmed-up Python code to be persisted as well, so that the application can already be fast in the first render. We have a prototype of this already and will hopefully light up that feature for Python in the not too distant future.

Conclusions

Embedding Python in Java applications running on GraalVM is easy, but using Python packages properly comes with some pitfalls we need to be aware of. With the small template repository discussed in this article, anyone can get up and running with it quickly and avoid some snags and startup hurdles. Give it a try! We always welcome feedback either on GitHub, Slack, or even here in the comments. From feature requests to issues to helping us prioritize the packages in the ecosystem, it’s all valuable input that helps make GraalVM a great runtime for Java and Python.