How to use Python with Quarkus and GraalVM | Quarkify

Dmytro Chaban
Published in Quarkify, Apr 21, 2020

Quarkus is not just a backend framework. Rather, it’s a platform or ecosystem. With Quarkus you can write Java EE-style as well as Spring-style applications, and even mix them, without thinking about the underlying details.
We also must not forget GraalVM. GraalVM is a polyglot virtual machine: you can call Java, JavaScript, Python, R, and C++ code from one another without much overhead. Calling code across languages can save a lot of time, because you don’t need to rewrite working code from one language into another, a process that tends to introduce bugs.
With all that in mind, let’s see how you can re-use some of your Python code in Java.

For better command copy-paste experience, please open this article on quarkify.

Setting up the environment

We’ll assume that you have already installed GraalVM; if not, follow this link. We’ll also need to install the GraalVM Python interpreter:

gu install python

This will take some time. Once it’s done, you’ll have GraalVM’s Python interpreter, which you can call via the graalpython command. Note that only a limited set of packages can be installed. Still, this should be more than enough, and the GraalVM team is actively adding new packages to cover most developers’ needs. Run the following command to see the available packages:

graalpython -m ginstall install --help 

Among the available packages you’ll find numpy. Let’s install it! Keep in mind that this will take a good amount of time:

graalpython -m ginstall install numpy

Running Python code in Quarkus

Once it’s done, you’ll be able to use these packages with the graalpython interpreter, which means that you can use them from Java or Quarkus. So let’s do this.

First, let’s create some useful Python code. We’ll calculate the effect size, also known as Cohen’s d. Here’s the full Python code:

import site
from numpy import std, mean, sqrt

x = [2,4,7,3,7,35,8,9]
y = [i*2 for i in x]
x.append(10)
(mean(x) - mean(y)) / sqrt((std(x, ddof=1) ** 2 + std(y, ddof=1) ** 2) / 2.0)

Notice that we import the site package; this is a requirement for numpy to work. If you don’t want to keep this import in your code, you can create a script that copies the file, prepends the import to the copy, and uses the copy instead. Either way, by the time the code reaches the interpreter, it should contain import site.
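
The copy-and-prepend script mentioned above could look roughly like the sketch below. The function name prepend_site_import and the file names you pass to it are hypothetical, not part of the original project, and the sketch assumes the script is plain UTF-8 text.

```python
from pathlib import Path

SITE_IMPORT = "import site\n"


def prepend_site_import(source: Path, target: Path) -> Path:
    """Write a copy of `source` to `target` that starts with `import site`.

    Hypothetical helper for illustration; the original file is left untouched.
    """
    code = source.read_text(encoding="utf-8")
    # Avoid adding the import twice if the script already starts with it.
    if not code.startswith(SITE_IMPORT):
        code = SITE_IMPORT + code
    target.write_text(code, encoding="utf-8")
    return target
```

You would then point the interpreter at the generated copy rather than at the original file.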

Create Quarkus project

Now, let’s create a Quarkus project that will execute our Python code. If you’re lazy and just want to inspect the working solution, clone this project on master and follow the next steps. If you want to do everything from scratch, you need to initialize a simple project with Maven:

mvn io.quarkus:quarkus-maven-plugin:1.3.2.Final:create \
-DprojectGroupId=tech.donau.quarkify \
-DprojectArtifactId=quarkus-python-effect \
-DclassName="tech.donau.quarkify.EffectResource" \
-Dpath="/effect"

Once the project is generated, create a Python file at src/main/resource/effect.py containing the Python code provided above.

Open src/main/java/tech/donau/quarkify/EffectResource.java and modify it so that it looks like the following code sample:

Start and test

That’s all you need to do. Now we only need to start our server and call http://localhost:8080/effect.

./mvnw quarkus:dev 
#> ...Listening on:http://0.0.0.0:8080...

Let’s call our service from either a browser or a terminal.

curl http://localhost:8080/effect

You should see -0.5596621094715237 as the output. You will find that the first execution takes around 30 seconds. That’s because GraalVM needs to set up the context and, during the first run, parse and cache some data. Subsequent executions will be fast enough.
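
To sanity-check that number without GraalVM, the same formula can be reproduced in regular CPython using only the standard library. This is a verification sketch, not the project’s code: it swaps numpy’s mean and std(..., ddof=1) for statistics.mean and statistics.stdev (the sample standard deviation), so the last few digits may differ slightly from the numpy result.

```python
from math import sqrt
from statistics import mean, stdev  # stdev is the sample std, like numpy's std(ddof=1)

x = [2, 4, 7, 3, 7, 35, 8, 9]
y = [i * 2 for i in x]  # y is built before 10 is appended to x
x.append(10)

# Same expression as in effect.py, with stdlib functions instead of numpy.
d = (mean(x) - mean(y)) / sqrt((stdev(x) ** 2 + stdev(y) ** 2) / 2.0)
print(d)  # roughly -0.5597
```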

Passing input into python

So far we have only executed static code: we ran a predefined file and showed its output. But what if we want to pass some parameters? We’ll need to modify effect.py so that it contains only a function, removing everything except the function itself.

If you’re using our cloned repo, execute git checkout feature/input

import site
from numpy import std, mean, sqrt

def cohen_d(x, y):
    nx = len(x)
    ny = len(y)
    dof = nx + ny - 2
    return (mean(x) - mean(y)) / sqrt(((nx-1)*std(x, ddof=1) ** 2 + (ny-1)*std(y, ddof=1) ** 2) / dof)
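
A side note on the formula: this version weights each variance by its sample size (a pooled standard deviation), whereas the static script simply averaged the two variances. For samples of equal length the two agree; for unequal lengths they do not. The sketch below illustrates this in plain CPython, with the statistics module standing in for numpy; cohen_d_simple is a name introduced here for the earlier formula.

```python
from math import sqrt
from statistics import mean, stdev


def cohen_d(x, y):
    # Pooled version, as in effect.py above (stdev == std with ddof=1).
    nx, ny = len(x), len(y)
    dof = nx + ny - 2
    return (mean(x) - mean(y)) / sqrt(
        ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / dof
    )


def cohen_d_simple(x, y):
    # Plain average of the two variances, as in the static script.
    return (mean(x) - mean(y)) / sqrt((stdev(x) ** 2 + stdev(y) ** 2) / 2.0)


a = [2, 4, 7, 3, 7, 35, 8, 9]
b = [i * 2 for i in a]

# Equal sample sizes: both formulas give the same value.
print(abs(cohen_d(a, b) - cohen_d_simple(a, b)) < 1e-12)  # True

# Unequal sample sizes: the results diverge.
print(cohen_d(a + [10], b), cohen_d_simple(a + [10], b))
```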

With only the function left in the file, we can run effect.py once, load it into the context, and re-use it each time we call the method. Meanwhile, our Java class got some upgrades that help optimize performance a little.

Be aware that this code is not session safe. While it can be used concurrently, the context object in our case is shared by all users. This means that one user can pass sensitive data into the context, and another user may get this data back. You need to either design your code so that you pass data into separate fields or don’t store it inside Python at all, or store sensitive data in dictionaries that store and retrieve only the values stored by the same user.
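
One way to picture the dictionary approach, as a minimal sketch rather than a hardened solution: keep all state in a mapping keyed by a user identifier, so that a lookup can only reach values stored under the same key. The names _store, remember, and recall are hypothetical, and how the Java side obtains a trustworthy user id is left out.

```python
# Hypothetical module-level state living inside the shared Python context.
_store = {}


def remember(user_id, key, value):
    """Save a value in the bucket that belongs to this user only."""
    _store.setdefault(user_id, {})[key] = value


def recall(user_id, key, default=None):
    """Return a value from this user's own bucket, never from another user's."""
    return _store.get(user_id, {}).get(key, default)


remember("alice", "samples", [2, 4, 7])
print(recall("alice", "samples"))  # [2, 4, 7]
print(recall("bob", "samples"))    # None
```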

With this in mind, let’s modify our Java class to the following state:

That’s it. If you have exited Quarkus, start it again via ./mvnw quarkus:dev. Now we can curl with the POST method and submit our own data. For simplicity, we used string input, but you can try modifying it to accept a JSON object with the array.

curl --location --request POST 'http://localhost:8080/effect' \
--header 'Content-Type: text/plain' \
--data-raw '2,4,7,3,7,35,8,9'

You’ll again get the same result, -0.5596621094715237. Try modifying some values.

curl --location --request POST 'http://localhost:8080/effect' \
--header 'Content-Type: text/plain' \
--data-raw '100,4,15,3,7,35,8,3'

You’ll see a different result. That means you now have a dynamic Python function that you execute on demand via Quarkus Java code.

In conclusion

As you’ve seen, it’s really easy to call Python code from GraalVM or Quarkus; however, you need to follow some rules to make it safe. This feature is also still at an early stage, which is most visible in the number of supported packages. Moreover, a large amount of Python code can be slow and may even cause a StackOverflowError.

Should you try it? Absolutely! Once the context is initialized, execution speed is close to normal Python speed. I personally use it when I need to try out some feature implemented in Python, but I’m too lazy to rewrite it in Java and setting up a microservice isn’t worth the time.

Should you rely on it? Absolutely not. If you haven’t installed NumPy or pandas yet, keep in mind that it can take around 30 minutes for each. On my GCP n1-standard-4 (4 vCPUs, 15 GB memory) instance it took around 20 minutes. Moreover, the lack of some popular libraries such as scikit-learn is currently a big drawback for using it actively.

Do you think it’s something to keep an eye on? Would you switch to such a feature if its performance were at a tolerable level? If not, why do you think microservices (or another solution) would be better? Please let me know in the comments, I’m really interested in any discussion on this topic.

Originally published at https://quarkify.net on April 21, 2020.
