Cloud-friendly migration through refactoring is often the best solution

Stefan Billet
9 min read · Jun 1, 2023

Read this article in German

Rehosting, refactoring, or rebuilding? When migrating applications or entire system landscapes to the cloud, it is important to consider the essential questions. As a Software Architect at QAware, I have had the opportunity to gather some experiences and I would like to share what has proven successful in practice with our clients such as Allianz, Ericsson, and BMW.

Before we dive into the typical migration strategies, I would like to clarify what it actually means to operate an application in the cloud. Primarily, it means that the software runs on remote servers, usually managed by an external cloud provider, rather than in the company’s own server room or data center. However, cloud software comes in several variants!

What kind of service level would you like?

One goal of a cloud strategy is to focus on value-added activities during development and operation. The aim is to channel energy into valuable features rather than activities that ensure stable and secure operations. While these aspects remain important, we want to outsource them to the cloud provider as much as possible.

The following graphic illustrates what I consider to be the currently most important variations of cloud computing:

From left to right, the cloud provider’s service level increases, while flexibility decreases. For example, virtual machines (VMs) running in the cloud can support almost everything you could do with a physical server under your own control, but you also have to take care of everything yourself, from the operating system to network setup.

In addition to virtual machines (VMs), containers have gained popularity in recent years. They are more lightweight compared to VMs because they share the host operating system, but they are still strictly isolated from one another. Around the container technology, a vast cloud-native ecosystem has emerged. The cross-vendor de facto standard for running container-based distributed applications is Kubernetes (K8s). As a container orchestrator, K8s provides a unified way to manage and scale containers. The major cloud providers now offer various K8s-based products: in the traditional approach, you configure a K8s cluster and decide the number and specifications of the nodes (servers) yourself. Other products hide the K8s layer, allowing you to focus solely on your own services (serverless containers).

Taking it a step further, there are serverless functions (also known as FaaS — Functions as a Service). Instead of defining a container image, you deploy individual code functions. The cloud platform hides the entire execution environment. In my experience, serverless functions are particularly suitable as a complement to containers for platform-specific tasks. For example, consider an application that allows image uploads. Each image needs to be resized to multiple sizes to save data on mobile devices. In this case, a serverless function worked wonderfully: it was triggered whenever a new file appeared in the corresponding directory and then generated and stored scaled versions of the image. It is also possible to build larger applications in an event-driven style using serverless functions. However, I have no personal experience with this approach from our projects, and I assume that the architecture can quickly become complex and difficult to manage.
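
The image-resizing scenario above can be sketched as a storage-triggered handler. This is a minimal illustration, not any specific provider's API: the event shape, `handle_upload`, `TARGET_WIDTHS`, and the `save_variant` callback are all assumptions standing in for the real trigger wiring, object storage, and image library.

```python
# Sketch of a storage-triggered serverless function that creates
# scaled variants of an uploaded image. The event fields, the
# TARGET_WIDTHS, and the save_variant callback are illustrative
# assumptions, not a specific cloud provider's API.

TARGET_WIDTHS = [320, 640, 1280]  # variants to generate

def scaled_size(width, height, target_width):
    """Compute a variant's dimensions, preserving the aspect ratio."""
    factor = target_width / width
    return target_width, round(height * factor)

def handle_upload(event, save_variant):
    """Invoked when a new file appears in the upload directory.

    `event` carries the object's key and original dimensions;
    `save_variant` stands in for the resize-and-store step.
    """
    key = event["key"]                      # e.g. "uploads/cat.jpg"
    width, height = event["width"], event["height"]
    for target in TARGET_WIDTHS:
        if target >= width:
            continue  # never upscale the original
        w, h = scaled_size(width, height, target)
        save_variant(f"{key}.{target}w", w, h)

# Example trigger invocation with a stub store:
variants = {}
handle_upload(
    {"key": "uploads/cat.jpg", "width": 4000, "height": 3000},
    lambda k, w, h: variants.update({k: (w, h)}),
)
print(variants)  # three scaled variants, aspect ratio preserved
```

The point of the pattern is that the function holds no state and only reacts to events, which is exactly what makes it a good fit for such platform-adjacent side tasks.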

Last but not least, cloud-based services that can be easily integrated by users via a browser or application through APIs should not be overlooked. These services are typically offered through a subscription model and run on the provider’s (cloud) infrastructure, separate from your own application. If a suitable SaaS (Software as a Service) offering exists, I would always prefer it over a self-developed or self-operated alternative.

Among the presented options, containers are a good default choice due to their flexibility combined with a standardized external interface. Virtually any workload can be effectively operated within containers, and by opting for a managed service, we can avoid the pains of infrastructure management. For larger applications comprising multiple (micro)services, we often rely on Kubernetes, which we configure according to the application’s needs and extend with open-source tools from the cloud-native ecosystem (e.g., for observability). For small applications with a few containers, serverless container services are highly attractive. This serves as a convenient starting point, especially for startups, as it eliminates the need to initially deal with Kubernetes.

When an application is built in a cloud-native manner, it can fully leverage cloud benefits such as fault tolerance, availability, automated deployments, and dynamic scaling. However, what should be done with the many older applications that are not yet running in the cloud?

Ready for the Cloud?

How well does an application integrate into the cloud environment? This is important because applications cannot automatically leverage many benefits of running in the cloud unless they provide certain interfaces and possess specific characteristics that the cloud platform expects.

We roughly assess the cloud readiness of an application in three stages:

  • Cloud-Alien: An application deployed on traditional servers or VMs in a monolithic fashion.
  • Cloud-Friendly: A containerized application that supports unified deployment and management through a cloud environment like Kubernetes. It adheres to principles such as the 12-factor app, enabling easier integration with the cloud.
  • Cloud-Native: Similar to Cloud-Friendly, but with additional characteristics to fully leverage the advantages of the cloud. This includes a microservices architecture, DevOps practices like automated deployments, high test automation, infrastructure-as-code, and a particular focus on (horizontal) scalability.
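
One of the 12-factor principles mentioned above, reading configuration from the environment (factor III), is among the cheapest steps from Cloud-Alien toward Cloud-Friendly. A minimal sketch, assuming illustrative variable names rather than any particular project's configuration:

```python
import os

# 12-factor, factor III: read configuration from environment
# variables instead of baked-in files, so the same container
# image runs unchanged in every environment. The variable names
# and defaults below are illustrative assumptions.

def load_config(env=os.environ):
    return {
        "db_url": env.get("DATABASE_URL", "postgres://localhost/dev"),
        "port": int(env.get("PORT", "8080")),
        "log_json": env.get("LOG_FORMAT", "text") == "json",
    }

# Locally the defaults apply; in Kubernetes the deployment injects
# the real values via `env:` entries without rebuilding the image.
cfg = load_config({"PORT": "9090", "LOG_FORMAT": "json"})
print(cfg)
```

Because the image itself carries no environment-specific state, the platform can deploy, move, and scale it uniformly, which is precisely what the Cloud-Friendly stage requires.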

Selecting a suitable migration strategy

There are various strategies for cloud migration that differ in effort and achieved cloud readiness.

The simplest strategy is rehosting (lift & shift). Here, the existing application is moved to the cloud infrastructure with minimal adjustments. Virtual machines (VMs) are often used, which closely resemble the original server in terms of the operating system and sizing. However, some adaptation is still required to make the application run in the new (network) environment and connect with interface systems.

Rehosting (Lift & Shift): Moving an old house as a whole to a new location without significant modifications.

Rehosting is a suitable approach when the primary goal is to achieve “operation in the cloud” with minimal effort, for example, to decommission one’s own data center.

Firstly, it needs to be determined whether rehosting is feasible at all. It is often not reasonable to do so because assumptions about the infrastructure have been built into the application. For example, a mainframe application relies on a fixed local database and file system, which cannot be easily replicated in a standard VM.

Furthermore, the rehosting strategy has several disadvantages. It is often not suitable for achieving the actual goals, such as reducing costs, providing visible improvements to customers, or generating new business value. Additionally, it often leads to significant challenges in maintenance and operation.

In general, most of the benefits of the cloud are not realized, as the application remains a foreign component that was dropped into the new environment essentially unchanged.

Refactoring (lift & extend) towards a cloud-friendly application is a pragmatic middle ground, which, based on our experience, often represents the best solution. In this approach, we containerize the application while preserving its original architecture and most of its code. We then add the necessary features to ensure smooth integration into the new (Kubernetes) platform.

Refactoring (Lift & Extend): Renovating an older, well-preserved house to seamlessly integrate it into the changed environment and infrastructure.

Last year, I had the opportunity to participate in a project that falls into this category. We migrated a cluster of several older Java applications that were running on VMs to Kubernetes. We worked on this migration for about six months with a team of three developers. This effort is significantly less than starting from scratch, and we believe it provided a favorable cost-benefit tradeoff.

The biggest challenge in this case was getting rid of the widely used, venerable RMI (Remote Method Invocation), a communication protocol for remote procedure calls in Java. This became necessary because RMI establishes connections that require a dynamically selected port from a range, while Kubernetes relies on a statically assigned open port. Our solution was to replace all RMI usages with gRPC, a modern, language-agnostic protocol. However, since RMI supports the full power of Java while gRPC intentionally keeps things simpler, there were many special cases. For example, RMI allows for the transmission of a cyclic object graph, which is not possible with gRPC. In the end, we replaced all RMI usages with a combination of custom code transformers and manual adjustments for special cases.
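
The cyclic-graph limitation mentioned above is not specific to Java. The usual workaround, sketched here under assumptions (the `Node` structure is illustrative, not from the migrated code), is to flatten the graph into a table of records that reference each other by index, which a protobuf-style message can carry, and to rebuild the cycle on the receiving side.

```python
# RMI can serialize a cyclic object graph directly; protobuf-style
# messages cannot. Workaround: flatten the graph into flat records
# with index references, rebuild the cycle after transfer.
# The Node chain below is an illustrative stand-in.

class Node:
    def __init__(self, name):
        self.name = name
        self.next = None  # may point back into the graph (a cycle)

def flatten(root):
    """Walk a (possibly cyclic) chain into flat, index-linked records."""
    ids, records = {}, []
    node = root
    while node is not None and id(node) not in ids:
        ids[id(node)] = len(records)
        records.append({"name": node.name, "next": None})
        node = node.next
    # patch in the index references now that every node has one
    node, i = root, 0
    while i < len(records):
        records[i]["next"] = ids.get(id(node.next)) if node.next else None
        node = node.next
        i += 1
    return records

def rebuild(records):
    """Reconstruct the chain, cycles included, from flat records."""
    nodes = [Node(r["name"]) for r in records]
    for node, r in zip(nodes, records):
        if r["next"] is not None:
            node.next = nodes[r["next"]]
    return nodes[0] if nodes else None

# Round-trip a two-node cycle: a -> b -> a
a, b = Node("a"), Node("b")
a.next, b.next = b, a
out = rebuild(flatten(a))
print(out.name, out.next.name, out.next.next is out)  # a b True
```

This kind of encode/decode shim is representative of the "special cases" mentioned above: mechanical in principle, but each one needs explicit code where RMI previously handled it implicitly.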

This example is typical of the challenges faced during a cloud migration. The reason is that the old applications were built under certain assumptions about their operating system, network setup, and available resources. A Kubernetes platform brings its own set of conditions, making conflicts very likely. The good news is that these conflicts can usually be resolved with good engineering and reasonable effort without having to rebuild the entire application from scratch. Additionally, interface issues between multiple applications are always a significant challenge, as partners also need to make adjustments, requiring coordination.

If the full potential of the cloud is to be realized or if other strategies are not feasible, a rebuild remains an option. This involves partially or even completely reconstructing the application in a cloud-native manner. We break down the existing monolithic application into microservices that can be developed, deployed, and scaled independently as much as possible. We replace the infrastructure components used, such as databases, with components running on the cloud platform, and we automate the provisioning of the environment and the deployment of the application.

Rebuilding: Demolishing an old house and constructing a new house from scratch according to modern standards.

Rebuilding makes sense, for example, when the original application is far from the target state or when additional business value needs to be unlocked. A recurring pattern that we often observe with our clients is rebuilding old mainframe applications on a modern stack with a cloud-native architecture. Besides the cloud, there are other drivers for such projects, such as a shortage of developers with expertise in old technologies. Often, these projects are not just migrations but also involve the development of major new features.

For example, we replaced a central mainframe application used for bill of materials calculation with a Java-based cloud-native application for BMW. By leveraging the horizontal scalability of the cloud and open-source components like Apache Solr, we were able to greatly improve performance, moving from nightly batches to interactive calculation. Together with replacing the outdated host screens with a modern web UI featuring powerful search and aggregation capabilities, this created entirely new possibilities for the users. Details about this project are covered in an article in the Handelsblatt (in German, p. 15). Rebuilding such a complex application, which had evolved over decades, from scratch is a significant, multi-year undertaking. Nevertheless, we delivered high business value on time and within budget, at costs often lower than what a lift-and-shift migration incurs for systems of similar complexity.

Summary

We can broadly classify cloud migration into three strategies: rehost, refactor, and rebuild.

From my perspective, a cloud-friendly migration through refactoring often represents the best solution as a pragmatic middle ground. It leverages the advantages of the cloud with relatively low effort and thus creates tangible benefits for the organization.
