The bumpy road of Java apps to the cloud

Ewa Wojtach
Google Cloud - Community
5 min read · Dec 15, 2023

How to approach Java application modernization in the context of migration to the cloud? What to look at? Where are bumps and bottlenecks?

A little bit of history…

Java has been one of the most popular programming languages for years, with over 20 years of history. Traditionally, Java applications were optimized for data center deployments. They were long-running processes with a warm-up stage at the beginning. Neither programmers nor admins had to care much about the startup time and resource consumption during this initial phase. Applications were just started and then ran for a long period of time.

Bumpy road to the cloud

Currently, more and more applications are being moved to the cloud. There are plenty of cloud providers and plenty of platform choices. For now, we will focus on GCP: Cloud Run as a serverless platform, and GKE when more infrastructure configuration is needed.

With migration to the cloud comes great flexibility, scalability and cost savings, but also new challenges.

Bumpy road of JVM to the cloud

While deployed in the cloud, applications can scale from zero instances up to many of them. With that comes the need to start fast in order to serve incoming requests.

What can we do to reduce the risk of timeouts and long responses?

A first, quite naive solution could be keeping at least one instance running all the time, by setting the minimum number of instances in the autoscaling options to 1. This is, of course, neither optimal nor cost-saving. What is more, a long startup process is still an issue when scaling beyond one instance.
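On Cloud Run, this naive approach is a single flag (a sketch; the service name `calculate` is hypothetical):

```shell
# Keep at least one warm instance around to avoid cold starts.
# Note: you pay for the idle instance even when it serves no traffic.
gcloud run services update calculate --min-instances=1
```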

Are there any better options? What is causing the “cold start” issue in Java? At startup time, a Java application performs plenty of tasks: it loads all jar files, uncompresses and verifies all classes, executes code in the interpreter, profiles it, and finally compiles the hot paths to machine code with the JIT compiler. All of this takes time and processing power.

Building native images

One approach can be building a native image from the Java sources, e.g. with GraalVM. This comes with a very significant start-up time improvement, but also with some limitations. GraalVM native image building follows the “closed world” principle: all executable code needs to be known at build time. For all dynamic components, like reflection, JNI or Dynamic Proxy objects, metadata or configuration files need to be provided at build time. Metadata can be collected during an application run with the Tracing Agent, and for some libraries metadata files can be taken from the GraalVM reachability metadata repository. Still, it is a significant extra effort. It is very difficult for complex applications and often requires significant application modernization.
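As a sketch, the workflow looks roughly like this (the jar name `calculate.jar` is hypothetical; exact flags depend on your GraalVM version):

```shell
# Run the app with the Tracing Agent while exercising its code paths,
# so that reflection/JNI/proxy metadata gets recorded for the build:
java -agentlib:native-image-agent=config-output-dir=src/main/resources/META-INF/native-image \
     -jar calculate.jar

# Then build the native executable from the jar:
native-image -jar calculate.jar calculate
```

Note that the Tracing Agent only records what actually executed during the run, so the recorded metadata is only as complete as your test coverage.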

How to approach moving to the cloud without significant code changes?

CPU boost

Several experiments have shown that the startup time can be reduced by allocating more CPU to the app.

So, another naive solution could be deploying applications on more powerful machines. This will help with startup time. But after the startup, the application, now optimized and compiled to machine code, reduces its CPU consumption. This solution is still not cost-effective and leads to resource waste over a longer period of time.

Could we do any better?

For Cloud Run there is the “startup CPU boost” option, which allocates more CPU during startup time. According to the official documentation, this leads to up to 50% faster startup times for a sample Spring application.
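Enabling the boost is a one-line change (again with a hypothetical `calculate` service):

```shell
# Temporarily allocate extra CPU during container startup, then
# drop back to the configured CPU once startup completes:
gcloud run services update calculate --cpu-boost
```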

For GKE, on the other hand, there is no out-of-the-box support for this use case. Pods with defined requests and limits operate within the provided CPU values, and the Vertical Pod Autoscaler requires a Pod to be recreated in order to change its resource requests. So it does not help with the JVM cold-start issue.

Kubernetes 1.27 comes with a new alpha feature: in-place resource resize for Kubernetes Pods. It opens the door for a feature very similar to the Cloud Run startup CPU boost. There is an open source tool, Kube Startup CPU Boost, that does exactly this: it increases the container resources until the container reaches the Ready status, and then updates the resources back to their initial values.
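A minimal sketch of what the alpha feature looks like at the Pod level, assuming the `InPlacePodVerticalScaling` feature gate is enabled on the cluster (Pod name and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: calculate
spec:
  containers:
  - name: app
    image: example.com/calculate:latest   # hypothetical image
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired          # CPU can be resized in place, no restart
    resources:
      requests:
        cpu: "500m"
      limits:
        cpu: "1"
```

With such a resize policy in place, a controller like Kube Startup CPU Boost can raise the CPU values while the container starts and lower them again once it is Ready.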

Give me the numbers…

To better illustrate the order of magnitude, I performed several startup tests on a sample Java application called Calculate, built with Java 17 and based on Spring Boot.

The following table shows startup times for this application, when run on a classic JVM and when built to a native image. Both options were tested with and without the CPU boost option.

Java application on Cloud Run startup tests

The same application was tested on GKE (v1.27.3, e2-standard-4 nodes) with and without Kube Startup CPU Boost.

Java application on GKE startup tests

Summary

To sum up, Java has a long tradition. Java applications are often complex and with plenty of dependencies. Moving Java applications to the cloud is not an easy and straightforward task. It requires a good plan, and use of proper tools and technologies.

For relatively simple applications and microservices, with little or no reflection and dynamic dependencies, native images are probably worth considering (being about 10 times faster!). But those amazing capabilities of GraalVM usually come at the significant cost of adjusting your app to build as a native image.

For complex, enterprise-level systems, CPU boosting gives quite a good advantage. For many use cases it seems to be a no-brainer choice, considering that it does not require any code changes.

Keep in mind that this article shows only some of the approaches and considerations that might be useful in planning your own journey.

The views expressed are those of the author and don’t necessarily reflect those of Google.
