In todays world, data is generated in high volumes and in order to make something out of it, extracted data is need to be transformed, stored, maintained, governed and analyzed. These processes are only possible with distributed architecture and parallel processing mechanisms which Big Data tools are based on. One of the top trending open-source data storage that responds to most of the use cases is Elasticsearch.
Elasticsearch is a distributed data storage and search engine with fault-tolerance and high availability capabilities. In order to make the best use of its search feature, Elasticsearch is need to be configured properly. Because one simple configuration will not be suitable for every use case, you will need to extract your requirements first and then configure your cluster based on your own use case. This article will focus on the search intensive initial and dynamic configurations of the Elasticsearch.
By default, an Elasticsearch index has 5 primary shards and 1 replica for each. Such configuration is not suitable for every use case. Shard configuration needs to be computed properly in order to maintain a stable and efficient index.
Shard size is quite critical for search queries. If there would be too much shards that are assigned to an index, Lucene segments would be small which causes an increase in overhead. Lots of small shards would also reduce the query throughput when multiple queries are made simultaneously. On the other hand, too large shards causes a decrease in search performance and longer recovery time from failure. Therefore, it is suggested by Elasticsearch that one shard’s size should be around 20 to 40 GB.
For instance, if you calculated that your index would store 300 GBs of data, you would assign 9 to 15 primary shards to that index. Depending on the cluster size, let’s say you have 10 nodes in your cluster, you would choose to have 10 primary shards for this index in order to distribute shards evenly between the nodes of your cluster.
If there is a data stream that continuously ingested to the Elasticsearch cluster, time based indices should be used in order to maintain indices more easily. If throughput of the stream changes over time, just changing the next index’s configuration appropriately simplifies the adaptation and allows to scale quite easily.
So, how to query for all of the documents residing in separate time based indices? The answer is aliases. Multiple indices could be put into an alias and searching on that alias makes queries as if they were made on a single index. Of course, a balance needs to be kept on how many indices should be put into an alias since too many small indices on an alias affects performance negatively. For example, a decision might need to be made between keeping monthly or weekly indices. If the cluster allows monthly indices in terms of size with the best configuration, there is no need to keep weekly indices as it will affect the performance negatively since there would be too many indices and the results coming from each of them need to be sorted out.
One use case is that just focusing only on the recent events. Elasticsearch has the mechanism of lazy evolution for such use case. Top documents of each segment are already sorted in the index and if total number of documents is not interested, Elasticsearch will only compare top documents per segment by setting track_total_hits to false. This will help speeding up the conjunctions efficiently which works with low-cardinality fields if index sorting mechanism is used.
Sharding allows to make operations in parallel with the distributed architecture and so, it allows horizontal scaling. There are two types of shards. One of them being the primary shard responsible for reading and writing operations like index, reindex, delete etc… The other one is replica shard responsible for high availability and read throughput.
Size of the shards, size of each segment in a shard, how many active ones in a node are the major considerations while optimizing shards.
Replica shards are important to scale out the search throughput and the number of replica shards can be increased carefully if the hardware is suitable for such case. A good launch point for capacity planning is to allocate shards with a factor of 1.5 to 3 times the number of nodes. Usually, the setup that has fewer shards per node will perform better since filesystem cache will be more efficiently distributed across nodes.
Also, the formula for which allocation of the number of replica shards:
max(max_failures, ceil(num_nodes / num_primaries) — 1)
Expected maximum number of nodes to fail needs to be computed first since no one would want data loss in their database. Then, according to the number of primary shards for index and the number of nodes, efficient distribution of replica shards across cluster for high throughput is extracted. The maximum of each gives the true value for the number of replica shards.
One of the most primary considerations when configuring Elasticsearch cluster is that making sure that at least half the available memory goes to the filesystem cache so that Elasticsearch can keep hot regions of the index in physical memory.
Physical available heap space should also be considered while designing the cluster. Elasticsearch recommends shard allocation based on available heap space should be 20 shards/GB at maximum as a good rule-of-thumb. For instance, a node with a 30 GBs of heap should have 600 shards maximum in order to keep the cluster in good health. storage on a one node could be formulated as below:
Disk space a node can support = 20 * (Heap Size per GB) * (Size of Shard in GB)
Since it is common to see shards between 20 and 40 GBs in size in an efficient cluster, maximum storage a node with 16 GBs of available heap space can support is up to 12 TBs of disk space. Awareness of boundaries helps to prepare for better design and future scaling operations. But of course, in order to make the cluster efficient, one shard per index, per node rule also applies on every ideal scenario.
There are many configuration settings could be made on run-time as well as at the initial phase. Being aware of what can be changed during run-time and what can not be is crucial when constructing the Elasticsearch index and the cluster itself for better search performance.
- Using time-based indices for managing data and also for better organization. If there is no write operation on past indices, one can set passed monthly indices to read-only mode in order to improve search performance made on those indices.
- When an index is set to be read-only, force merge operation could be made to reduce the number of segments by merging them. Hence, optimized segments will result in better search performance since overhead per shard depends on the segment count and size.. Do not apply this to read-write indices since it will cause very large segments to be produced (>5Gb per segment). Also, this operation should be done during off-peak times since it is an expensive operation.
- Cache can be utilized for use-cases of the end user. The preference setting can be used to optimize the usage of the caches since it will allow to analyze a narrower subset of the index.
- Swapping can be disabled on each node for stability, and should be avoided at all costs. It can cause garbage collections to last for minutes instead of milliseconds and can cause nodes to respond slowly or even to disconnect from the cluster. In a resilient distributed system, it’s more effective to let the operating system kill the node. It can be disabled by setting bootstrap.memory_lock to True.
- On active indices refresh interval could be increased. Active indices means that data indexing on these indices are still being made. The default refresh interval is 1 second. This forces Elasticsearch to create segment in every second. Increasing this value depending on your use case, (to say, 30s) will allow larger segments to flush and decreases future merge pressure. Therefore, the fact that it decreases merge pressure on active indices makes search queries more stable.
- index.merge.scheduler.max_thread_count is set to Math.max(1, Math.min(4, Runtime.getRuntime().availableProcessors() / 2)) by default. But this works well with SSD configuration. In the case of HDD, it should be set to 1.
- Sometimes, Elasticsearch will rebalance the shards across cluster. This operation might lead to decrease the performance of the search queries. In the production mode, when needed, rebalancing can be set to none through cluster.routing.rebalance.enable setting.
- The searches based on the dates should not include now parameter since now is not a cacheable parameter. Instead, specifically define the timestamp of now in your queries that includes dates in order for it to be cacheable.
- Make use of the copy-to feature of the Elasticsearch for some fields that are been queried more often. For example, brand name, engine version, model name and color fields of the car could be merged with copy-to directive. It will improve the search query performance made on those fields.
- Of course, having a homogeneous cluster is always desirable. But, in case of heterogeneous cluster, it would be good to assign weight on shard allocation for the node that has better hardware. In order to assign weight, cluster.routing.allocation.balance.shard value need to be set while the default value is 0.45f.
- The query itself has also a major impact on the latency of the response. In order not to break the circuit while querying and cause Elasticsearch cluster to be in an unstable condition, indices.breaker.total.limit could be set appropriately to your JVM heap size with respect to your queries’ complexity. The default of this setting is 70% of JVM heap.
- Elasticsearch, by default, assumes primary use case as searching. In case of a need to increase the concurrency, threadpool for search setting could be increased and threadpool for indexing could be decreased with respect to the number of cores in the CPU on a node.
- Adaptive replica selection should be turned on. The request will be redirected to the most responsive nodes instead of round-robin approach based on these:
- Response time of past requests between the coordinating node and the node containing the copy of the data
- Time past search requests took to execute on the node containing the data
- The queue size of the search threadpool on the node containing the data
There are lots of configuration settings for Elasticsearch cluster to improve the latency of queries. If you are looking to design an Elasticsearch cluster that contains a heavy search use case, it would be my pleasure to help you through every single aspect of the design with this blog.
Thanks for reading! If you find this helpful, click on “♥︎”. You can reach out to me via my e-mail if you have any questions.