Does Elasticsearch lie? How does Elasticsearch work?

Maciej Szymczyk
Published in The Startup
5 min read · Jun 23, 2020

Elasticsearch impresses us with its capabilities and speed, but does it return correct results? In this post, you’ll learn how Elasticsearch works under the hood and why the aggregations it returns can be approximations.

Elasticsearch under the hood

Indices, shards and replicas

Let’s start with how Elasticsearch organizes data. You already know the concept of an index: it is roughly comparable to a table in a relational database. Each index consists of at least one primary shard and any number of replicas.

Shards divide the data in the index. The shards of one index can be allocated to different nodes in the cluster, which lets us distribute data processing. As a simplification, routing spreads incoming documents across the shards: one document goes to the first shard, the next to the second, and so on.
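The way routing spreads documents over shards can be sketched as a hash of a routing value taken modulo the number of primary shards. This is only an illustration under stated assumptions: `pick_shard` is a hypothetical helper, the document IDs are made up, and CRC32 is a stand-in for the hash function Elasticsearch uses internally.

```python
import zlib


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (assumed here to be the document ID) to a shard number."""
    # Stand-in hash: CRC32 is used only because it is deterministic and in the
    # standard library. It is NOT the hash function Elasticsearch uses.
    digest = zlib.crc32(routing_value.encode("utf-8"))
    return digest % num_primary_shards


# Hypothetical document IDs routed to an index with 3 primary shards.
doc_ids = [f"doc-{i}" for i in range(9)]
placement = {doc_id: pick_shard(doc_id, 3) for doc_id in doc_ids}

for doc_id, shard in placement.items():
    print(f"{doc_id} -> shard {shard}")
```

Because the shard is derived from a hash rather than a counter, the same document always maps to the same shard, and documents are spread over shards only approximately evenly.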

Replicas, on the other hand, protect against failure and help serve reads (you can write only to primary shards, but you can read from both primaries and replicas). Suppose you have an index configured with 3 primary shards and 2 replicas. How many shards (primaries + replicas) will the index have? 9 (1, 1′, 1″, 2, 2′, 2″, 3, 3′, 3″).
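The count of 9 follows from primaries × (1 + replicas). A minimal sketch of that arithmetic, assuming the configuration corresponds to the index settings `number_of_shards: 3` and `number_of_replicas: 2`; `shard_copies` is a hypothetical helper, and ASCII apostrophes stand in for the prime marks used above.

```python
from typing import List


def shard_copies(num_primaries: int, num_replicas: int) -> List[str]:
    """List every shard copy: each primary followed by its replicas."""
    labels = []
    for shard in range(1, num_primaries + 1):
        # copy == 0 is the primary; copy >= 1 are its replicas.
        for copy in range(num_replicas + 1):
            labels.append(str(shard) + "'" * copy)
    return labels


copies = shard_copies(num_primaries=3, num_replicas=2)
print(copies)
print(len(copies))  # 9 = 3 * (1 + 2)
```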

A primary shard and its replicas, which hold the same data, form a replication group. When a new record goes to…

Maciej Szymczyk

Software Developer, Big Data Engineer, Blogger (https://wiadrodanych.pl), Amateur Cyclists & Triathlete, @maciej_szymczyk