Rebuild Elasticsearch index without downtime

Syed Andaleeb Roomy
Craftsmen — Software Maestros
5 min read · Oct 26, 2019

Elasticsearch allows us to add fast full-text search and autocomplete features to our projects very easily. We create an “index” in Elasticsearch, store the items we want to search as JSON documents in that index, and then run queries against different fields of those documents.

Do we ever really need to rebuild an index?

Yes. There are situations when this becomes unavoidable, for example when we need to change existing field mappings or analyzers. In general, ES does not allow such changes on an existing index, so we need to build a new index separate from the existing one.

Can we keep the same index name?

No. In order to have a smooth transition in a live system, we need to keep the old index alive while we are building the new one. So we cannot use the existing name for the new index while the old one still lives.

What about code that refers to the old index?

Updating code to use the new index name every time we rebuild would be a problem. So our searching and indexing code should not refer to the index name directly.

Not use the index name? Then what?

Alias! We should make an alias to the index, and use that in our searching/indexing code. Then we can update the alias in ES to point to the new index whenever we rebuild. This way, our code that accesses the index does not need to change every time we rebuild the index.
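As a rough sketch of this setup, the Python below builds the request for creating a timestamped index that has the alias attached from the start. The base name `products`, the alias `alias1`, and both helper names are assumptions for illustration. The mapping is passed through untouched because its exact shape depends on the Elasticsearch version, and sending the request is left to whatever HTTP client the project already uses.

```python
from datetime import datetime


def versioned_index_name(base: str, now: datetime) -> str:
    """Return a unique, lowercase index name such as 'products_20191026120000'."""
    return f"{base}_{now.strftime('%Y%m%d%H%M%S')}"


def create_index_request(base: str, alias: str, mappings: dict, now: datetime):
    """Build (method, path, body) for the create-index call.

    The 'aliases' section attaches the alias at creation time, so application
    code can use the alias from day one and never needs the real index name.
    """
    index = versioned_index_name(base, now)
    body = {"mappings": mappings, "aliases": {alias: {}}}
    return "PUT", f"/{index}", body
```

This fits the very first index. For a rebuild, the new index would be created without the alias, which is moved over only once the new index is ready.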

How to keep an index up to date with changes to the source data?

Usually, the ES index consists of data coming from a more persistent storage, e.g., a table in DynamoDB. We can have a trigger set up on the source database table, so that on any modifications to any row, we get a chance to run our indexing code (e.g., with AWS Lambda) for that modified row.
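To make the trigger idea concrete, here is a minimal sketch of the translation step inside such a Lambda function. It assumes the event follows the DynamoDB Streams record layout (`Records`, `eventName`, `dynamodb.Keys`, `dynamodb.NewImage`) with a stream view that includes new images, a hypothetical string key attribute named `id`, and an alias called `alias1`. Turning `REMOVE` events into deletes is an extra assumption of this sketch. The function only builds the action list for the `_bulk` API and does not send it.

```python
def plain_value(attr: dict):
    """Unwrap one DynamoDB-typed attribute, e.g. {"S": "abc"} -> "abc".

    Only string, number and boolean attributes are handled in this sketch.
    """
    if "S" in attr:
        return attr["S"]
    if "N" in attr:
        try:
            return int(attr["N"])
        except ValueError:
            return float(attr["N"])
    if "BOOL" in attr:
        return attr["BOOL"]
    raise ValueError(f"unsupported attribute type: {attr}")


def bulk_actions_for(event: dict, alias: str = "alias1", key_name: str = "id") -> list:
    """Translate stream records into _bulk actions that target the alias."""
    actions = []
    for record in event.get("Records", []):
        data = record["dynamodb"]
        doc_id = str(plain_value(data["Keys"][key_name]))
        if record["eventName"] == "REMOVE":
            actions.append({"delete": {"_index": alias, "_id": doc_id}})
        else:  # INSERT or MODIFY: re-index the whole row
            doc = {name: plain_value(value) for name, value in data["NewImage"].items()}
            actions.append({"index": {"_index": alias, "_id": doc_id}})
            actions.append(doc)
    return actions
```

Using the row key as the document `_id` means that re-indexing a modified row overwrites the earlier document instead of creating a duplicate.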

Trigger code keeps the index up to date with DB

How do we actually (re)build an index?

When we need to build an index, we first create the index with new mappings/analyzers and other required settings, then we go through all the rows of the table, and index each row as a document in the new ES index.
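The rebuild loop can be sketched as a generator that turns table rows into request bodies for the `_bulk` API. The function name, the `id` field, and the batch size are assumptions for illustration. Reading the rows (for example by scanning the table) and POSTing each payload to `/_bulk` are left out.

```python
import json
from typing import Iterable, Iterator


def bulk_payloads(rows: Iterable[dict], index: str, id_field: str = "id",
                  batch_size: int = 500) -> Iterator[str]:
    """Yield newline-delimited JSON bodies for POST /_bulk, batch_size rows each."""
    lines = []
    count = 0
    for row in rows:
        # Each document takes two lines: the action line, then the document itself.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(row[id_field])}}))
        lines.append(json.dumps(row))
        count += 1
        if count == batch_size:
            yield "\n".join(lines) + "\n"  # _bulk bodies must end with a newline
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"
```

Batching keeps each request small enough to stay within request size limits while still avoiding one HTTP round trip per row.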

Each DB record is indexed while rebuilding

Can’t we just use the _reindex API?

The _reindex API takes a snapshot of the source index and copies its documents to the destination index. It is better to have our own code for building an index from the source data instead of relying on the _reindex API, because _reindex can only copy what the old index stored in _source, and those documents may not contain all the fields required for the new index.

So when do we switch over?

When we are done building the new index, we can update the alias to point to the new index instead of the old one. ES can do this atomically. Then we can delete the old index.

POST /_aliases
{
  "actions": [
    {
      "remove": { "index": "current_index", "alias": "alias1" }
    },
    {
      "add": { "index": "new_index", "alias": "alias1" }
    }
  ]
}
Alias points to the new index after rebuilding is done

Wait, I know a case when this will not work…

Indeed! In a live system, the source data can keep changing while the new index is being built. Suppose we have just indexed row1 in the new index. Before the whole rebuild completes and we switch the alias over, a user modifies row1, so the DB trigger fires and indexes the change into the old index. After the switch, queries against the new index return stale data for row1, because the modification never reached the new index.
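The sequence of events can be traced with a small in-memory simulation. Plain dictionaries stand in for the database table and the two indices, and the values `v1` and `v2` are made up. Nothing here talks to Elasticsearch.

```python
def simulate_single_alias_rebuild() -> dict:
    source = {"row1": "v1", "row2": "v1"}  # the database table
    indices = {"current_index": dict(source), "new_index": {}}
    alias = "current_index"  # the single alias still points at the old index

    def trigger(row_id: str, value: str) -> None:
        """DB trigger: the row changes, then gets indexed through the alias."""
        source[row_id] = value
        indices[alias][row_id] = value

    # The rebuild loop copies row1 into the new index first...
    indices["new_index"]["row1"] = source["row1"]
    # ...then a user edits row1 before the rebuild has finished.
    trigger("row1", "v2")
    # The loop carries on with the remaining rows.
    indices["new_index"]["row2"] = source["row2"]

    alias = "new_index"  # switch-over
    return {"source": source["row1"], "search": indices[alias]["row1"]}


print(simulate_single_alias_rebuild())  # {'source': 'v2', 'search': 'v1'}
```

The database holds `v2` for row1, but a search after the switch still sees `v1`, because the trigger wrote the update only to the old index.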

Can we solve this?

We can, by using two aliases instead of one. One for queries (let’s call it read_alias), and one for indexing (write_alias). We can write our code so that all indexing happens through the write_alias and all queries go through the read_alias. Let’s consider three periods of time:

Before rebuild
read_alias: points to current_index
write_alias: points to current_index

Before rebuild: both aliases point to the same current_index

All queries return current data.
All modifications go into current_index.

During rebuild
read_alias: points to current_index
write_alias: points to new_index

During rebuild: read and write aliases point to different indices

All queries keep getting data as it existed before the rebuild, since searching code uses read_alias.
All rows, including modified ones, get indexed into the new_index, since both the rebuilding loop and the DB trigger use the write_alias.

After rebuild
read_alias: points to new_index
write_alias: points to new_index

After rebuild: both aliases point to the same new_index

All queries return new data, including the modifications made during rebuild.
All modifications go into new_index.
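The three periods above differ by only two alias moves: the write alias moves when the rebuild starts, and the read alias moves when it finishes. The sketch below builds the `POST /_aliases` body for each move and traces the result with a tiny in-memory stand-in for Elasticsearch. The helper names `move_alias` and `apply_actions` are hypothetical.

```python
def move_alias(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST /_aliases that repoints one alias in a single atomic call."""
    return {"actions": [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]}


def apply_actions(aliases: dict, body: dict) -> dict:
    """In-memory stand-in for Elasticsearch, used only to trace the phases."""
    result = {name: set(targets) for name, targets in aliases.items()}
    for action in body["actions"]:
        (kind, spec), = action.items()  # each action has exactly one key
        targets = result.setdefault(spec["alias"], set())
        if kind == "remove":
            targets.discard(spec["index"])
        else:
            targets.add(spec["index"])
    return result


before = {"read_alias": {"current_index"}, "write_alias": {"current_index"}}
# Start of rebuild: only the write alias moves.
during = apply_actions(before, move_alias("write_alias", "current_index", "new_index"))
# End of rebuild: the read alias follows.
after = apply_actions(during, move_alias("read_alias", "current_index", "new_index"))
```

The new index has to exist, with its mappings, before the write alias is moved, and the old index can be deleted only after the read alias has moved.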

Can we get the modified data even while rebuilding?

It should be possible to do even that, if we make the DB trigger code index modified rows into both indices while the rebuild is going on (i.e., while the two aliases point to different indices).
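One possible shape for that trigger logic is sketched below. It assumes the trigger first looks up each alias with `GET /_alias/<name>`, whose response is keyed by index name, and then writes the row to every index it finds. The function names are hypothetical, and the sketch also assumes the same document is acceptable to both the old and the new mappings.

```python
def indices_behind(alias_response: dict) -> set:
    """Index names from a GET /_alias/<name> response (its top-level keys)."""
    return set(alias_response)


def dual_write_actions(doc_id: str, doc: dict,
                       read_alias_response: dict,
                       write_alias_response: dict) -> list:
    """Build _bulk actions that index the row into every index behind either alias."""
    targets = sorted(indices_behind(read_alias_response)
                     | indices_behind(write_alias_response))
    actions = []
    for index in targets:
        actions.append({"index": {"_index": index, "_id": doc_id}})
        actions.append(doc)
    return actions
```

Outside a rebuild both aliases resolve to the same index, so the set union collapses to one target and the row is written only once.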

DB trigger updates both indices during rebuild

At Craftsmen, we have been using Amazon Elasticsearch Service for a long time, and this is how we have been rebuilding indices without any downtime or data loss.
