4 Points to Consider While Designing an Elasticsearch Cluster

Mukul Chaware
Jan 17 · 4 min read

Elasticsearch is an open-source, RESTful, distributed search and analytics engine built on Apache Lucene. Since its release in 2010, Elasticsearch has become one of the most popular search engines and is commonly used for log analytics, full-text search, security intelligence, business analytics, and operational intelligence use cases. A few months back, I got a chance to try Elasticsearch when we were implementing search functionality in our platform. These are the insights I gained while working on it.

1. Difference between Elasticsearch and Relational Databases

Elasticsearch is a powerful open-source, distributed, RESTful search engine, which implies that it is not intended to be, and should not be, used as a primary database. Given a search query, it is expected to return results in roughly under a second. To work that way, it is designed and functions very differently from relational databases like PostgreSQL and MySQL. That is why the thought process behind designing an Elasticsearch structure should differ from the one behind a relational database structure.

A few key differences in the thought process:

  • Relational databases like PostgreSQL and MySQL favor ‘Normalization’, while Elasticsearch favors ‘Denormalization’.
  • This is related to the point above. Relational databases offer inner and outer joins to keep data consistent and tables less cluttered. Elasticsearch provides ‘join’-like features through nested fields and parent/child relationships, but they can make search queries slow, which defeats the purpose of using Elasticsearch in the first place.
  • Elasticsearch supports One-to-One and One-to-Many relationships, but it does not support Many-to-Many.
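To make the normalization versus denormalization point concrete, here is a minimal sketch in plain Python. The `authors` and `books` rows, their field names, and the `denormalize` helper are all made up for illustration; the idea is only that the document sent to the search index already carries the fields a relational query would otherwise fetch through a join.

```python
# Hypothetical normalized rows, as they might sit in a relational database.
authors = {1: {"id": 1, "name": "Jane Doe"}}
books = [
    {"id": 10, "title": "Search Basics", "author_id": 1},
    {"id": 11, "title": "Scaling Search", "author_id": 1},
]


def denormalize(book, authors_by_id):
    """Embed the author's fields in the book so a search needs no join."""
    author = authors_by_id[book["author_id"]]
    return {
        "book_id": book["id"],
        "title": book["title"],
        "author_name": author["name"],  # copied into every book document
    }


# One flat, self-contained document per book.
documents = [denormalize(book, authors) for book in books]
print(documents[0])
```

The trade-off is visible here: the author's name is stored once per book, so reads are simple but a name change has to touch every copy.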

2. Add only ‘Searchable’ Fields to Elasticsearch

If the primary database has 10 fields and search functionality is needed on only 5 of them, then add only those 5 fields to Elasticsearch. This helps keep the index size small.
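A small sketch of that filtering step, assuming a hypothetical product row with 10 fields of which 5 are searchable. The field names and the `to_search_document` helper are illustrative only and do not come from any library.

```python
# Hypothetical: the 5 fields that the search feature actually queries.
SEARCHABLE_FIELDS = ("title", "description", "author_name", "category", "tags")


def to_search_document(row, fields=SEARCHABLE_FIELDS):
    """Keep only the searchable fields of a primary-database row."""
    return {name: row[name] for name in fields if name in row}


# Hypothetical row with 10 fields from the primary database.
row = {
    "id": 42,
    "title": "Search Basics",
    "description": "An introduction to search engines.",
    "author_name": "Jane Doe",
    "category": "technology",
    "tags": ["search", "indexing"],
    "price": 19.99,
    "stock": 7,
    "created_at": "2021-01-17",
    "updated_at": "2021-01-18",
}

doc = to_search_document(row)
print(sorted(doc))
```

In practice the primary key would typically be used as the document ID rather than stored as a searchable field, which is why `id` is not in the list here.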

3. Many-to-Many Relationships in Elasticsearch

Elasticsearch does not support Many-to-Many relationships. There is an easy way to implement them, but it can become cumbersome if not designed correctly. To handle Many-to-Many, we need to convert it to the One-to-Many form, which Elasticsearch supports. The solution is Duplication! Duplication! Duplication!

Many-to-Many relationship

This can be transformed into one-to-many as below:

One-to-Many Relationship

Users are duplicated to transform the problem into one-to-many. This implies that if there is an update for ‘User 1’, then all the duplicated rows related to ‘user-book #’ need to be updated with the latest info too.
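The cost of that duplication can be sketched in plain Python. This assumes one document per user-book pair, which is only one possible layout; the `user_books` link table, the document IDs, and both helper functions are hypothetical.

```python
# Hypothetical source data with a many-to-many link table.
users = {1: {"name": "User 1"}, 2: {"name": "User 2"}}
books = {1: {"title": "Book 1"}, 2: {"title": "Book 2"}}
user_books = [(1, 1), (1, 2), (2, 1)]  # (user_id, book_id) pairs


def build_documents(users, books, links):
    """Create one flat document per link, copying the user into each."""
    docs = {}
    for user_id, book_id in links:
        doc_id = f"user-book-{user_id}-{book_id}"
        docs[doc_id] = {
            "user_id": user_id,
            "user_name": users[user_id]["name"],  # duplicated per link
            "book_id": book_id,
            "book_title": books[book_id]["title"],
        }
    return docs


def rename_user(docs, user_id, new_name):
    """Apply a user update to every duplicated copy; return how many changed."""
    touched = 0
    for doc in docs.values():
        if doc["user_id"] == user_id:
            doc["user_name"] = new_name
            touched += 1
    return touched


docs = build_documents(users, books, user_books)
updated = rename_user(docs, 1, "User One")
print(len(docs), updated)
```

A single change to ‘User 1’ rewrites every document that carries a copy of that user, so the number of writes grows with the number of links.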

A few tricks to minimize the updates:

  • Duplicate the resource that is less likely to be updated. For example, on most platforms, ‘User’ info is updated less often than the dependent records (in this case, books).
  • This is where #2 from above becomes very important too. If we restrict the set of fields used for search, the number of updates in Elasticsearch can be reduced significantly.

4. Keep Elasticsearch as the Last Resort

Before moving to Elasticsearch, check the scale of your data to verify that it is actually needed. Setting up an Elasticsearch cluster is expensive, and it is an ‘additional overhead’ to maintain. If the data is not considerably large and the search can be handled by a simpler but still fast implementation, consider that first. For example, relational databases like PostgreSQL provide full-text search that is pretty fast, and you keep the advantages of relational database features like ‘joins’. Elasticsearch has a way to provide ‘joins’ functionality, but it should be avoided because it is a costly operation that might affect search time. Check out this awesome blog for more details regarding Postgres Full-Text Search.
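As a rough sketch of what the PostgreSQL route could look like, the helper below builds a parameterized full-text query that also uses a join. The `books` and `users` tables, their columns, and the `build_book_search` function are hypothetical; `to_tsvector`, `plainto_tsquery`, and the `@@` match operator are PostgreSQL's built-in full-text search primitives, and the `%s` placeholder assumes a psycopg-style driver. The function only returns the SQL and its parameters, so it runs without a database.

```python
def build_book_search(term):
    """Return (sql, params) for a full-text search that also joins users."""
    sql = (
        "SELECT b.id, b.title, u.name "
        "FROM books AS b "
        "JOIN users AS u ON u.id = b.user_id "
        # Match the title against the search term using the English config.
        "WHERE to_tsvector('english', b.title) @@ plainto_tsquery('english', %s)"
    )
    return sql, (term,)


sql, params = build_book_search("elasticsearch cluster")
print(sql)
print(params)
```

With a real connection, the pair would be passed to the driver's `execute` call. On larger tables, an index over the `to_tsvector` expression is usually needed to keep this fast.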


Thanks for reading. I hope this helps. Don’t hesitate to correct any mistakes in the comments or provide suggestions for future posts!

The Startup

Medium's largest active publication, followed by +707K people. Follow to join our community.

Mukul Chaware

Written by

Building changelogg.io — Automatic, Effortless, No-code Changelog for Every Release
