How we migrated a legacy banking application to Amazon EKS step-by-step

SoftServe
Inside the Tech by SoftServe
7 min read · Nov 25, 2020

A legacy banking app hosted in an on-premises data center, with high maintenance costs, a long time to market, a poor 20-year-old UX, and no agility. The client wanted an upgraded, Amazon EKS-based solution. Here's a step-by-step guide to how we delivered it.

I am Oleksandr Vorobiov, a DevOps Engineer at the SoftServe Development Center in Kharkiv. For the last three years I have mostly been working with the Amazon cloud (app migration, optimization, and re-architecture). The project I'm writing about is one of the most challenging and interesting so far.

Resolution plan

Along with re-platforming, we took four additional approaches:

  • Infrastructure as Code (IaC) to provision all the needed resources with ease.
  • Helmfile to orchestrate deployment of the components to Kubernetes.
  • CI/CD to automate the whole process of the migration.
  • GitFlow to develop microservices.

Preparation

Prior to the migration, the development team:

  1. Broke the legacy code down into small chunks by converting the source code to Java and C++, and added new features.
  2. Wrapped the app into Docker containers.
  3. Designed a brand new architecture for our app as shown in the diagram.
In-cloud app architecture

In AWS, we deployed a VPC with three subnets. Each subnet contains pre-deployed services:

  • Amazon EKS.
  • Amazon Aurora databases; their number varies depending on the environment.
  • Kafka, provided as a managed service by Amazon.
  • Elasticsearch with a basic set of components.

4. Used nested stacks to make it easy to update and change the product. Each nested stack is represented by a .yaml template that manages a separate set of AWS services. This approach lets us change different parts of the infrastructure without any disruption to the whole product.
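To give an idea of the pattern, here is a minimal sketch of a parent template that references nested stacks. The stack names, S3 URLs, and parameters are illustrative, not our actual templates.

```yaml
# Parent template wiring two nested stacks together
AWSTemplateFormatVersion: "2010-09-09"
Resources:
  NetworkStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/example-bucket/network.yaml
      Parameters:
        VpcCidr: 10.0.0.0/16
  EksStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/example-bucket/eks.yaml
      Parameters:
        # An output of one nested stack feeds a parameter of another,
        # so each stack can be updated independently
        VpcId: !GetAtt NetworkStack.Outputs.VpcId
```

Updating, say, the EKS stack then only touches eks.yaml, while the network stack stays untouched.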

5. Used CloudFormation Custom Resources, which are essentially Python scripts wrapped into AWS Lambda. Custom Resources apply custom configuration to services in cases CloudFormation does not support out of the box, such as Kafka.
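From the template side, invoking such a resource might look like the sketch below; the resource type, Lambda function name, and topic properties are made-up examples. CloudFormation passes the Properties to the Lambda on every create, update, and delete, and the Python script performs the actual configuration.

```yaml
Resources:
  TransactionsTopic:
    Type: Custom::KafkaTopic
    Properties:
      # ServiceToken points at the Lambda wrapping the Python script
      ServiceToken: !GetAtt KafkaConfigFunction.Arn
      TopicName: transactions
      Partitions: 6
      ReplicationFactor: 3
```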

Migration overview

The most obvious solution, when it comes to leveraging microservices in Kubernetes, would be using Helm charts. Indeed, we can use Jenkins to compile Helm charts and pack Docker images with executable code into them. Additionally, Helm charts allow customizing parameters for different environments, such as Production, Staging, or Test.

But… the issue is that Helm charts don't allow you to define deployment dependencies between the microservices.

An umbrella Helm chart is an option, but it still doesn't provide the required level of configuration and dependency management.

And the solution was…

Helmfile

Think of it as Helm for your Helm. You essentially wrap your Helm charts into a Helmfile. With it, you can easily manage environments using parameters in values files and, most importantly, seamlessly define dependencies between your Helm charts.

The structure of Helmfile is as easy as pie.

Snapshot of Helmfile

From top to bottom you:

  1. Specify your environments’ info.
  2. Register the internal Helm chart repository, hosted in Artifactory.
  3. Describe releases.
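A minimal sketch of that structure could look like this; the environment names, Artifactory URL, charts, and versions are illustrative:

```yaml
# helmfile.yaml
environments:                       # 1. environments' info
  staging:
    values:
      - environments/staging.yaml
  production:
    values:
      - environments/production.yaml

repositories:                       # 2. internal chart repository
  - name: internal
    url: https://artifactory.example.com/artifactory/api/helm/charts

releases:                           # 3. releases and their dependencies
  - name: payments
    namespace: app
    chart: internal/payments
    version: 1.4.2
    values:
      - values/payments/{{ .Environment.Name }}.yaml
  - name: ledger
    namespace: app
    chart: internal/ledger
    version: 2.0.1
    needs:
      - app/payments                # deploy payments before ledger
```

The `needs` directive is what plain Helm was missing: it makes Helmfile install the releases in dependency order.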

Overall, Helmfile is a great fit when it comes to deploying (micro)services to different environments with different parameter sets.

Helmfile features

Separately, I would like to highlight the major features of Helmfile:

  • 12-factor style configurations.
  • Inline values.yaml.
  • CI/CD integration.
  • Environment synchronization.
  • Go templating.

CI/CD pipeline

For the microservices development, we followed the GitFlow approach.

GitFlow approach

In this case, we have at least four branches:

  • Master, with release tags of the app.
  • Release (not in the picture), for preparing releases.
  • Develop, where changes to the code were made.
  • Feature, branched off Develop. For each new product feature, we created a separate branch.

To let Helmfile drive deployments automatically, we followed the GitOps approach: the infrastructure state described in the infrastructure Git repo should match the actual state of the running environment.

Sync workflow between cloud and Git

Briefly, the synchronization works as follows:

  1. A developer makes a change to a microservice and tests it.
  2. The microservice undergoes testing at the CI stage.
  3. Jenkins compiles the Helm chart and Docker image for this microservice and tags a new version.
  4. Jenkins updates the version of the corresponding microservice in the infrastructure Git repo, for the specified environment.
  5. An automatically triggered CD job applies the change to the corresponding environment.

As you can see, no manual effort is needed to make changes to an environment. Also, to find out which versions of the services are running in a specific environment, you can simply check your Git repo.
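For illustration, the entry that Jenkins bumps in the infrastructure repo could look something like this; the file layout and key names are assumptions, not our actual repo:

```yaml
# environments/staging/values.yaml in the infrastructure repo
# Jenkins updates the tag after a successful CI build;
# the CD job then runs `helmfile apply` against this environment
payments:
  image:
    tag: 1.4.3
  chartVersion: 1.4.3
```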

Local pipeline

For microservices packaging, we used Minikube to simulate the whole real Kubernetes environment, or at least part of it. To orchestrate the process locally, we used Gradle: we created a Gradle build file and described the necessary logic in it.

The Gradle build runs automatically when a change is tested. The script builds a Docker image from the Git repo, installs Minikube, and starts a Kubernetes cluster if one is missing. Once the cluster is up and running, the new version of the updated microservice is simply installed into it. Minikube then validates whether the local changes work as expected.

Additionally, the repo contains a Helmfile that describes only the part of the infrastructure needed to run tests for that specific microservice.
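Such a test-scoped Helmfile might look like the sketch below. The release names are hypothetical, and the use of Bitnami charts as local stand-ins for the managed Kafka and Aurora services is an assumption for illustration:

```yaml
# helmfile.yaml inside the microservice repo: only what the tests need
repositories:
  - name: internal
    url: https://artifactory.example.com/artifactory/api/helm/charts
  - name: bitnami
    url: https://charts.bitnami.com/bitnami

releases:
  - name: payments              # the microservice under test
    chart: internal/payments
    values:
      - test-values.yaml
  - name: kafka                 # local stand-in for managed Kafka
    chart: bitnami/kafka
  - name: postgres              # local stand-in for Aurora
    chart: bitnami/postgresql
```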

Secrets management

To connect the app with its secrets, we used the Kubernetes External Secrets controller. This solution is also installed from Helmfile.

Inside a Kubernetes cluster, the Kubernetes External Secrets controller creates a Custom Resource. This Custom Resource provides an API for fetching secrets from AWS secrets management (the SSM component) into the needed namespaces.
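Assuming the kubernetes-external-secrets controller, such a Custom Resource could look roughly like this; the names, namespace, and parameter path are made up:

```yaml
apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: app
spec:
  backendType: systemManager        # fetch from SSM Parameter Store
  data:
    - key: /app/staging/db-password # path of the parameter in AWS
      name: password                # key in the resulting Kubernetes Secret
```

The controller watches these resources and materializes a regular Kubernetes Secret in the target namespace, keeping it in sync with AWS.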

To summarize, the Kubernetes External Secrets controller automatically delivers the secrets to new namespaces upon their creation. As easy as that…

Service mesh for secure communication

Luckily, the days when Istio was a novelty have passed. This allows me to focus on the peculiarities of adapting this solution to our needs. The major areas a service mesh solved in our case were:

  • Secure communication between microservices.
  • gRPC-based communication between microservices.
  • gRPC load balancing.

Istio serves our case best due to the following features:

  • Provision of endpoints for users' communication with the services via the product's UI. Istio routes requests made in the UI to the app's components.
  • The possibility to set up mutual TLS between the services, so traffic between them is encrypted.
  • Load balancing of the gRPC protocol between the services.
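The last two points map to a pair of small Istio manifests. The sketch below is illustrative (service names and namespaces are made up), but it shows the idea: a mesh-wide PeerAuthentication enforces mutual TLS, and a DestinationRule balances gRPC at the request level.

```yaml
# Enforce mutual TLS for the whole mesh
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# Request-level balancing for a gRPC service
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payments
  namespace: app
spec:
  host: payments.app.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      simple: ROUND_ROBIN
```

The request-level part matters because gRPC multiplexes calls over long-lived HTTP/2 connections; plain kube-proxy balances connections rather than requests, so without a sidecar all traffic can end up pinned to a single pod.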

Logging and monitoring

Of course, we set up logging and monitoring.

Infrastructure logging

To collect logs from the managed services (Amazon Aurora, Kafka, and others) that already run in the cloud, we employed Kibana. For convenient collection of the logs, we used AWS Lambda, with a bit of configuration so that each component's logs end up in the right Kibana index.

Infrastructure logging workflow

Infrastructure monitoring

All the metrics are reported to Amazon CloudWatch. Grafana then fetches the data from it and presents the metrics in a format convenient for the user.
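On the Grafana side, wiring up CloudWatch can be as small as one provisioning file. This is a generic sketch (the region is a placeholder), not our exact configuration:

```yaml
# /etc/grafana/provisioning/datasources/cloudwatch.yaml
apiVersion: 1
datasources:
  - name: CloudWatch
    type: cloudwatch
    access: proxy
    jsonData:
      authType: default       # use the instance or IRSA role credentials
      defaultRegion: eu-west-1
```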

Infrastructure monitoring workflow

Microservices logging

To collect logs from the microservices, we deployed Fluentd to every Amazon EKS worker node. Inside every node, microservices write logs to a Fluentd daemon. To properly relay logs from the microservices, which are written in Java and C++, every daemon has a pre-configured parser. Eventually, Fluentd writes the logs to the appropriate Elasticsearch indexes, and the data is visualized in Kibana.
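Running Fluentd on every worker node is what a Kubernetes DaemonSet is for. An abridged sketch, with the image tag and ConfigMap name as placeholders:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - name: fluentd
          # image tag is illustrative
          image: fluent/fluentd-kubernetes-daemonset:v1.11-debian-elasticsearch7-1
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: parsers            # per-language parser configs
              mountPath: /fluentd/etc/conf.d
      volumes:
        - name: varlog
          hostPath:
            path: /var/log             # node logs, including container stdout
        - name: parsers
          configMap:
            name: fluentd-parsers
```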

Microservices logging workflow

Microservices monitoring

We implemented microservices monitoring using Prometheus. It collects all the metrics from the microservices, and Grafana visualizes that data on dashboards.
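A common way to let Prometheus discover microservices in a cluster is the annotation-based scrape config below; whether the project used exactly this pattern is an assumption:

```yaml
# prometheus.yml fragment
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod                     # discover every pod via the API server
    relabel_configs:
      # keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```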

Microservices monitoring workflow

Improvement areas

Our current top priority is setting up autoscaling, which we plan to implement using two solutions:

  • Horizontal Pod Autoscaler for pod scaling in the Kubernetes cluster; it scales pods according to their load, based on custom metrics of our services (see the sketch after this list).
  • Cluster Autoscaler, provided by Amazon, for worker node scaling.
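A custom-metrics HPA could look like the sketch below, using the autoscaling/v2beta2 API current at the time of writing. The deployment name, metric, and target value are hypothetical, and the custom metric is assumed to be exposed through a metrics adapter:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: payments
  namespace: app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payments
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: requests_per_second   # custom metric from the service
        target:
          type: AverageValue
          averageValue: "100"         # scale out above 100 rps per pod
```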

To sum up, migration to the cloud is an extremely beneficial idea when it comes to effortless app updating and maintenance. Even for a big enterprise, the market offers an abundance of solutions to make this process easy and low-risk. In our case, Helmfile was what provided flexibility to the team and automation for the product.

Feel free to check out the open positions on our team if complex projects like this for leading enterprises are something you're interested in.
