Moving From AWS Managed Elasticsearch To Self-Managed

Aromal Benny · Published in Ula Engineering
Jul 6, 2022 · 5 min read

Let us start the story with a brief introduction to Elasticsearch.

Elasticsearch is a distributed, free and open search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. It is built on top of Apache Lucene and is highly praised for its simple REST APIs, distributed nature, speed, and scalability. Because of these features, Elasticsearch is widely used in applications such as full-text search, logging, and log analysis.
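To make the "simple REST APIs" point concrete, here is a minimal full-text search against a local Elasticsearch node. The products index and the name field are hypothetical placeholders, not our actual data model.

```python
import requests

# Match query against a single field; Elasticsearch exposes this
# directly over HTTP, no client library required.
resp = requests.get(
    "http://localhost:9200/products/_search",
    json={"query": {"match": {"name": "blue running shoes"}}},
    timeout=5,
)
resp.raise_for_status()

for hit in resp.json()["hits"]["hits"]:
    print(hit["_id"], hit["_source"])
```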

At ULA, our major use cases for Elasticsearch include retrieving product details and fetching order information efficiently to improve the customer experience. To bring these capabilities into our systems, we were using the managed Elasticsearch service provided by Amazon, known as AWS Elasticsearch Service (AWS-ES) or AWS OpenSearch Service. AWS-ES gave us some clear benefits:

  • Easy to set up.
  • Built-in managed Kibana for better visualization.
  • Native integration with other AWS services.

Even though we had all these pros with AWS-ES, there were cons which, in our case, outweighed the pros and led us to leave AWS-ES behind for a self-managed Elasticsearch. 🥹

Reasons To Move Away From AWS-ES

Here are some of the primary reasons why we decided to move towards a self-managed version of Elasticsearch.

  • While performing bulk updates or heavy indexing, we faced high CPU utilization and JVM memory pressure, which in turn blocked read/write operations on the cluster.
  • Even though some configuration settings were available (e.g., version, custom domain name, storage per node, and the number of nodes in the cluster), we couldn’t fully customize the cluster for our use case, such as adopting a newer version of Elasticsearch or pinning the most-used indexes to specific nodes (see the sketch after this list), which limited the throughput we could get from AWS-ES.
  • Since the cluster nodes were managed internally by AWS, we had no option to SSH into them for quick debugging or resolution of an issue.
  • Last but not least, the cost factor. The premium we had to pay was too high compared to an Elasticsearch cluster hosted on EC2 instances. A brief cost comparison is given below.
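As an example of the customization we were missing: on a self-managed cluster, you can pin heavily used indexes to specific nodes with Elasticsearch's shard allocation filtering. This is a minimal sketch, assuming each target node's elasticsearch.yml declares a custom attribute; the box_type name, its hot value, and the index name are all illustrative. A managed service does not give you this level of node control.

```python
import requests

# Require all shards of the "products" index to live on nodes whose
# elasticsearch.yml contains:  node.attr.box_type: hot
resp = requests.put(
    "http://localhost:9200/products/_settings",
    json={"index.routing.allocation.require.box_type": "hot"},
    timeout=5,
)
resp.raise_for_status()
print(resp.json())  # {"acknowledged": true} once the setting is applied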

Cost Comparison

  • To understand the cost comparison in detail, let us take an example.
    Consider a three-node Elasticsearch cluster, which in most cases is enough to handle the daily read/write traffic (a three-node cluster also provides enough resilience even if one of the nodes goes down). Now, assume that each of these nodes is an m3.large instance.
  • For AWS-ES, the cost of one instance = $0.178 per hour.
  • For AWS EC2, the cost of one instance = $0.12 per hour.
  • Taking the total hours per month as 720 and a three-node cluster:
    Total cost of AWS-ES = 3 * 720 * 0.178 = $385 (approx.)
    Total cost of self-managed ES = 3 * 720 * 0.12 = $260 (approx.)
    In other words, AWS-ES costs roughly 48% more than the self-managed cluster, a whopping premium we could save by running Elasticsearch ourselves. 🤑 The snippet below reproduces this arithmetic.
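A small sketch that reproduces the arithmetic above; the hourly prices are the on-demand figures quoted in this post and vary by region and over time.

```python
HOURS_PER_MONTH = 720
NODES = 3

aws_es_hourly = 0.178  # m3.large on AWS-ES, on-demand (as quoted above)
ec2_hourly = 0.12      # m3.large on plain EC2 (as quoted above)

aws_es_monthly = NODES * HOURS_PER_MONTH * aws_es_hourly  # ~= $384
ec2_monthly = NODES * HOURS_PER_MONTH * ec2_hourly        # ~= $259

print(f"AWS-ES:         ${aws_es_monthly:,.0f}/month")
print(f"Self-managed:   ${ec2_monthly:,.0f}/month")
print(f"AWS-ES premium: {aws_es_monthly / ec2_monthly - 1:.0%}")  # ~48% more
```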

The cost factor, along with the other limitations above, led us to work towards building a self-managed Elasticsearch.

The Transition From AWS-ES To Self-Managed Elasticsearch

Before switching completely to self-managed ES, we ran a POC with configurations similar to those of our production environment. The POC went well and gave us satisfying results, which encouraged us to move ahead with the transition.

Pre-requisites that were taken care of before making the transition:

  • We had to make sure the transition happened without disturbing the existing AWS-ES configuration.
  • The transition plan involved first setting up a parallel ES cluster alongside the AWS-ES cluster we already had. This was important to ensure that data remained consistent between the existing AWS-ES and the self-managed ES.
  • Once the parallel setup and migration were done, we had to make sure that every update/edit request handled by the original Elasticsearch service was routed to the new Elasticsearch setup as well.

Parallel architecture with both AWS-ES and self-managed Elasticsearch.

The detailed transition steps are as follows:

  1. First, the self-managed Elasticsearch cluster was containerized and deployed to run in parallel with the existing AWS-ES setup.
  2. Since clients send write requests directly to the existing Elasticsearch service, that service was modified to also publish each write as a message to a queue (a sketch of this dual-write flow follows this list).
  3. After that, we created a new branch of the older Elastic service that talks to the self-managed Elasticsearch and polls the queue mentioned above. This is our Clone Elasticsearch service, which was deployed separately and takes care of communicating with the self-managed ES.
  4. For read requests, we used a flag in the existing Elastic service. This flag controls the routing of requests between AWS-ES and self-managed Elasticsearch: when the flag is turned on, reads are routed directly to the self-managed cluster.
  5. This architecture was then monitored for about a month to make sure it worked well and remained consistent during heavy indexing scenarios.
  6. Read-request performance was also monitored by switching the flag on and off.
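Below is a minimal sketch of the dual-write flow from steps 2 and 3. The post does not name the queue technology, so SQS is assumed here purely for illustration; the queue URL, the ES endpoint, and the message shape are all hypothetical.

```python
import json

import boto3
import requests

# Hypothetical endpoints, for illustration only.
QUEUE_URL = "https://sqs.ap-southeast-1.amazonaws.com/123456789012/es-writes"
SELF_MANAGED_ES = "http://self-managed-es.internal:9200"

sqs = boto3.client("sqs")

# Step 2 -- in the existing Elasticsearch service: besides indexing into
# AWS-ES as before, publish every write to the queue.
def publish_write(index: str, doc_id: str, doc: dict) -> None:
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"index": index, "id": doc_id, "doc": doc}),
    )

# Step 3 -- in the Clone Elasticsearch service: poll the queue and replay
# each write against the self-managed cluster.
def consume_writes() -> None:
    while True:
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
        )
        for msg in resp.get("Messages", []):
            body = json.loads(msg["Body"])
            r = requests.put(
                f"{SELF_MANAGED_ES}/{body['index']}/_doc/{body['id']}",
                json=body["doc"],
                timeout=5,
            )
            r.raise_for_status()
            # Delete only after a successful index, so failures are retried.
            sqs.delete_message(
                QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"]
            )
```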

Architecture after removing AWS-ES.

As a final procedure, the connection to AWS-ES and the queue serving the Clone Elasticsearch service were cut off, and requests from the original Elasticsearch service were made to hit the self-managed ES cluster directly, as sketched below.
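A minimal sketch of the flag-based read routing from step 4; the cutover above amounts to leaving this flag permanently on and retiring the queue. The endpoints and the environment-variable flag name are assumptions for illustration.

```python
import os

import requests

# Hypothetical cluster endpoints.
AWS_ES = "https://aws-es.internal"
SELF_MANAGED_ES = "http://self-managed-es.internal:9200"

def search(index: str, query: dict) -> dict:
    """Route a read to AWS-ES or self-managed ES based on a flag."""
    use_self_managed = os.getenv("USE_SELF_MANAGED_ES", "false") == "true"
    base = SELF_MANAGED_ES if use_self_managed else AWS_ES
    resp = requests.get(f"{base}/{index}/_search", json=query, timeout=5)
    resp.raise_for_status()
    return resp.json()
```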

And that is how we made a successful transition from a managed version of Elasticsearch to a self-managed one.

It's not the end yet 😉. We will be releasing part 2 of this blog, where we will go through the process of setting up a self-managed Elasticsearch cluster on AWS EC2 instances. Stay tuned!

Aromal Benny
Ula Engineering

Engineering @ Visa, previously built services @ula.app