Developing production-ready serverless applications with Kotlin, Micronaut and GraalVM

Vaclav Dedik · Published in tech{hunters} · Jun 10, 2020
Animation by Guillaume Kurkdjian

The introduction of serverless computing into the world of software development has brought advantages that are very hard to ignore, not only for engineers but also for business leadership. The reason is simple: utilizing computing power only when you need it allows a more efficient allocation of hardware and thus reduces the overall running cost of the software. Furthermore, it incentivizes software developers to engineer smaller services that communicate across well-established boundaries, making such applications more scalable and easier to understand.

Google Cloud Run

However, the new execution model also has some rather far-reaching consequences and disruptions. Some programming languages and tools, like the Java Virtual Machine, were designed with a trade-off: a fast, portable runtime at the cost of a slow cold start. This has been fine for programs that run continuously, such as the server side of client/server applications that handles communication with users, but it is often no longer acceptable for such services in a serverless environment, because users expect a quick and snappy experience, and slow cold starts get in the way of that.

The need for fast cold starts created a vacuum in the JVM world that has since been filled with many solutions and frameworks, like Micronaut, Quarkus etc., and also OpenJDK alternatives like GraalVM. In this article, we talk about our experience with serverless computing provided by Google Cloud Run.

The on-demand flavour of Facebook Insights

In my previous article, we talked about Facebook Insights in the context of how to process and store them. This time, we are creating a service that processes Facebook Insights data on demand, so there is no need to store it. This new on-demand service is meant for data that is huge in volume and thus very costly to store, but not requested by users often enough to justify that cost. Such a case fits the serverless narrative very well, which is why we decided to try it.

Our typical tech stack up to this point had been Spring Boot, Kotlin and jOOQ, obviously on top of OpenJDK. That works very well for applications that run continuously, e.g. in Kubernetes, because you only care about their performance once they start. Unfortunately, the serverless environment does not work very well with these technologies, because they take a long time to start up, which can increase both the time the user waits for an answer from the service and the amount of money your cloud provider charges you.

After some careful research, we decided to build our new on-demand Facebook Insights service with the Micronaut framework. The main reason was its compatibility with GraalVM, which we wanted to use as the underlying JVM because of the immensely improved start-up time. At first, we were worried Kotlin would not work with GraalVM, but fortunately, after some native-image tweaks, we were able to use it fully.

Most of our services run in Google Cloud Platform, so we were looking for a serverless solution provided by Google. There are currently two serverless flavours in GCP — Cloud Functions and Cloud Run. Cloud Functions did not support the JVM at the time, so we opted for Cloud Run, which turned out to be our only choice. In the following sections, I will walk you through the development of this service, what problems we encountered, and what the result (mainly its performance and cost) turned out to be like.

Setting up Micronaut and deploying it into Google Cloud Run

Setting up Micronaut with GraalVM and Kotlin is very easy: you just need to install the Micronaut CLI tool, e.g. with SDKMAN, and run the following command:

$ mn create-app com.roihunter.fbinsights \
--build=maven --lang=kotlin --features graal-native-image

This command will generate the following files:

$ tree
.
├── docker-build.sh
├── Dockerfile
├── micronaut-cli.yml
├── mvnw
├── mvnw.cmd
├── pom.xml
└── src
    ├── main
    │   ├── kotlin
    │   │   └── com
    │   │       └── roihunter
    │   │           └── Application.kt
    │   └── resources
    │       ├── application.yml
    │       ├── logback.xml
    │       └── META-INF
    │           └── native-image
    │               └── com.roihunter
    │                   └── fbinsights-application
    │                       └── native-image.properties
    └── test
        └── kotlin
            ├── com
            │   └── roihunter
            └── io
                └── kotlintest
                    └── provided
                        └── ProjectConfig.kt

17 directories, 11 files

Unfortunately, the generated Dockerfile does not work out-of-the-box, because it uses a scratch base image to run the generated GraalVM native image in. Even with Alpine-based images, we were getting a segmentation fault when trying to start the application. Furthermore, it builds a statically linked executable, which did not work for us, so we had to remove the --static native-image argument. In the end, we changed the Dockerfile to look like this:

FROM maven:3.6.3-jdk-11 as maven
COPY . /home/app
WORKDIR /home/app
RUN mvn -B package
FROM oracle/graalvm-ce:19.3.1-java11 as graalvm
COPY --from=maven /home/app/target/fb-insights*.jar /home/app/
WORKDIR /home/app
RUN gu install native-image
RUN native-image --no-server --initialize-at-build-time=kotlin.jvm.internal.Intrinsics -cp fb-insights*.jar
FROM debian:stretch
EXPOSE 8080
COPY --from=graalvm /home/app/fb-insights .
ENTRYPOINT ["./fb-insights", "-XX:MaximumHeapSizePercent=80"]

Notice the parameter --initialize-at-build-time=kotlin.jvm.internal.Intrinsics; this fixes a native-image build issue that GraalVM seems to have with Kotlin when using Features.

To deploy the application into Google Cloud Run using Google Cloud Build, you have to create a cloudbuild.yml file that looks like this:

steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'eu.gcr.io/rh/rh/fb-insights', '.']
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'eu.gcr.io/rh/rh/fb-insights']
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['run', 'deploy', 'fb-insights',
         '--image', 'eu.gcr.io/rh/rh/fb-insights',
         '--region', 'europe-west1', '--platform', 'managed',
         '--allow-unauthenticated', '--memory', '512M']
images:
- eu.gcr.io/rh/rh/fb-insights
options:
  machineType: 'N1_HIGHCPU_32'

Notice the 32-core machine type. Sadly, building a native image requires so much memory that only this machine type provides enough. As a complete build takes about 5 minutes, a single cloud build unfortunately costs an estimated $0.32.
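As a sanity check on that estimate, here is the arithmetic; the ~$0.064 per build-minute rate for N1_HIGHCPU_32 is an assumption based on Cloud Build pricing at the time, so check the current pricing page before relying on it:

```kotlin
// Back-of-the-envelope Cloud Build cost estimate.
// The rate per minute is an assumed figure, not an official constant.
fun buildCostUsd(buildMinutes: Double, ratePerMinute: Double): Double =
    buildMinutes * ratePerMinute

fun main() {
    // ~5 minutes per build at ~$0.064/min on N1_HIGHCPU_32
    println(buildCostUsd(5.0, 0.064))  // ≈ 0.32
}
```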

If you use a different CI/CD tool, or you want to deploy the application from the command line, you just need to run the following gcloud command:

$ gcloud run deploy fb-insights \
--image eu.gcr.io/rh/rh/fb-insights --region europe-west1 \
--platform managed --allow-unauthenticated --memory 512M

Developing Micronaut application with GraalVM

Developing a Micronaut application with GraalVM can be a bit tricky: you want to work in an environment that is as similar to production as possible, but at the same time you do not want to wait for the complete native-image build, which takes about 5 minutes at best, every time you make a small change in your code editor. To make matters worse, the native-image build can make your computer (even a high-performance PC) quite unusable in the meantime.

There really is no simple solution to this problem, so our development process ended up being divided into the following steps:

  1. Implement some part of a functionality
  2. Test the code with OpenJDK
  3. If an issue is found, fix it and return to 2, otherwise, continue
  4. Build the code into a native image and test it
  5. Repeat

At some point, as you gain experience with GraalVM, you get a feel for what might break the native image and you might adjust the development process a bit. In general, anything related to reflection will need to be tested with the native image because reflection does not fully work with GraalVM.
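To illustrate what typically breaks, here is the kind of reflective lookup that works on OpenJDK but fails at runtime in a native image unless the class is registered; the class and names here are made up for the example:

```kotlin
// A DTO that is only ever referenced by name, e.g. during jackson
// deserialization. (Both the class and the lookup are illustrative.)
class GreetingDto {
    var message: String = "hello"
}

// Works fine on OpenJDK; in a GraalVM native image it throws
// ClassNotFoundException at runtime unless GreetingDto is registered
// in the reflection config (or, with Micronaut, annotated @Introspected).
fun reflectiveInstance(className: String): Any =
    Class.forName(className).getDeclaredConstructor().newInstance()

fun main() {
    val dto = reflectiveInstance("GreetingDto") as GreetingDto
    println(dto.message)  // prints "hello" on OpenJDK
}
```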

There are several ways to solve reflection problems. The manual way is to create a reflection JSON config that declares the program elements that will be accessed reflectively at runtime, and then feed the file to the native-image tool via the -H:ReflectionConfigurationFiles argument. The file can look like this:

[
  {
    "name" : "com.roihunter.insights.api.InsightsFilter",
    "allDeclaredFields" : true,
    "allPublicMethods" : true,
    "allDeclaredConstructors" : true
  }
]

Creating such entries for each class that needs to be reflectively accessed (e.g. DTOs that are to be serialized/deserialized by jackson) can be error-prone and hard to maintain. Fortunately, when you use Micronaut, these entries are generated for you automatically whenever you annotate a class with the @Introspected annotation; you just need to bear that in mind.
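For example, a DTO annotated like the following needs no manual JSON entry, because Micronaut emits the reflection metadata at compile time (the fields shown here are hypothetical):

```kotlin
import io.micronaut.core.annotation.Introspected

// @Introspected makes Micronaut generate the reflection metadata for
// this class at compile time, replacing a manual native-image config
// entry. The field names are illustrative, not from the real service.
@Introspected
data class InsightsFilter(
    val accountId: String,
    val metrics: List<String> = emptyList()
)
```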

Sometimes a library like jackson might need to overwrite a final field at runtime. To enable that, you need to declare allowWrite on such a field in the reflection config. That cannot be achieved with Micronaut, though, so you either need to define such entries manually, or you can use an @AutomaticFeature like this:
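A minimal sketch of such a feature, assuming the GraalVM 19.3-era Feature API and reusing the InsightsFilter class from the reflection config above, might look roughly like this:

```kotlin
import com.oracle.svm.core.annotate.AutomaticFeature
import org.graalvm.nativeimage.hosted.Feature
import org.graalvm.nativeimage.hosted.RuntimeReflection

// Picked up automatically by native-image when it is on the classpath;
// no extra command-line argument is needed.
@AutomaticFeature
class JacksonReflectionFeature : Feature {
    override fun beforeAnalysis(access: Feature.BeforeAnalysisAccess) {
        val clazz = access.findClassByName("com.roihunter.insights.api.InsightsFilter")
        RuntimeReflection.register(clazz)
        RuntimeReflection.registerForReflectiveInstantiation(clazz)
        // The boolean overload marks final fields as writable, which is
        // what jackson needs when it overwrites them at runtime.
        RuntimeReflection.register(true, *clazz.declaredFields)
    }
}
```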

Unfortunately, we were not able to solve all reflection problems that we encountered, mainly:

  • Default Kotlin parameters do not work with jackson serialization/deserialization
  • Custom jackson deserializers must be declared at class level, rather than on fields
  • @JsonNaming class annotation seems to be unreliable and sometimes does not work

Development with Micronaut itself is overall quite similar to Spring Boot. Bean management, DI and configuration are pretty much the same, except that instead of Micronaut annotations, you can sometimes use javax annotations (@Inject, @Singleton etc.), which in my opinion is the better way to write business logic.

Reactive programming is also supported, but be careful when returning a Flowable / Flow from your controllers, as the Micronaut HTTP server returns a JSON stream for such endpoints. That would be fine as long as you do not deploy your application into a sandboxed environment like Google Cloud Run and try to access the endpoint from a web browser. The problem seems to be the accept-encoding: gzip request header that browsers send, which crashes the underlying Netty back-end in Cloud Run. Returning a JSON stream can also incur a higher price charged by the serverless cloud provider.

Performance and cost

The main performance concern is of course how fast the finished application starts. I am happy to say that our new service starts within about 2 or 3 seconds, which is an excellent result in my opinion. One small caveat is that we eventually had to increase the memory limit to 2 GB, because our application sometimes crashed otherwise. While the container consumes only about 50 MB of memory when it starts, memory usage grows very quickly into gigabyte territory with subsequent usage of its endpoints. The memory increase also results in a faster start-up (even though one would expect the opposite), but the difference is quite small (1–2 seconds).

The cost is where serverless computing really shows its strength. You can find how Google Cloud Run pricing works on this page. In short, you are billed for the time a request takes from start to finish. This means you are not billed for the time your application runs, but only for the time your service is actually doing some work. The serverless environment might decide at any time to stop your service if there are no incoming requests and start it again when a request is received. You might expect the platform to let you set some minimum idle time that your service keeps running after its last request, but that is not possible, and it makes sense that Google does not wish to grant you such control.

Pricing by Google Cloud Run

At the moment, our service receives only about 100 requests per day, and they finish in about 800 milliseconds on average. At worst, that is about 2,480 vCPU-seconds per month and about twice as many GiB-seconds, so we are well below the monthly free quota of 180,000 vCPU-seconds / 360,000 GiB-seconds. For that reason, we have not paid a single dollar yet (except some fixed cost for Cloud Build).
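That estimate is easy to verify, assuming one vCPU per instance and that each request occupies it for its full duration:

```kotlin
// Billable vCPU-seconds per month under Cloud Run's request-based billing.
fun monthlyVcpuSeconds(requestsPerDay: Int, avgRequestSeconds: Double, days: Int): Double =
    requestsPerDay * avgRequestSeconds * days

fun main() {
    val used = monthlyVcpuSeconds(100, 0.8, 31)
    println(used)              // 2480.0
    // With a 2 GB instance, GiB-seconds are about twice the vCPU-seconds;
    // both sit far below the free tier (180,000 / 360,000 per month).
    println(used < 180_000)    // true
}
```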

Conclusion

Trying to run the JVM in a serverless environment has its issues and drawbacks. They are mainly related to Java reflection, but the split development workflow and the way the Cloud Run sandbox can backfire on you are slightly concerning too. These problems show that the JVM toolchain has some maturing to do with regard to serverless computing. On the other hand, I really like the direction it is going. The performance has been really good for us so far, even if it has only been tested with a relatively simple application.

The low cost, however, is something that is really hard to ignore. Regardless of how you feel about the direction that software development is taking, you should at the very least be a really big fan of the “A+++” efficiency of using hardware only when it is really, truly needed. After all, this tech has the potential to help at least partially curb problems in some other areas of the world ♻️
