How to Restore Elasticsearch from a Snapshot

Cemal Ünal
Picus Security Engineering
4 min read · Mar 14, 2022

We use managed Elasticsearch through AWS OpenSearch Service to reduce the cost of cluster and backup management. Although AWS makes a great effort to keep cluster outages as infrequent as possible, hardware failures can and do occur.

Recently, the underlying hardware of our development cluster (which runs as a single node) failed, the cluster became degraded, and all of its data disappeared. Fortunately, OpenSearch Service takes automated snapshots regularly, and we were able to restore the cluster to the desired state from them.

In this article, we will see how to restore from an automated snapshot when a hardware failure or a similar incident occurs.

AWS OpenSearch Service offers the following types of snapshots:

  • Automated snapshots: These snapshots allow us to restore our domain in the event of red cluster status or data loss. OpenSearch Service stores automated snapshots in a preconfigured Amazon S3 bucket at no additional charge.
  • Manual snapshots: These snapshots are also used for cluster recovery, or for moving data from one cluster to another. They are stored in our own Amazon S3 bucket, and standard S3 charges apply.

Considerations before recovering from a snapshot:

  • The restore operation must be performed on a functioning cluster.
  • An existing index can only be restored if it’s closed and has the same number of shards as the index in the snapshot. AWS recommends ceasing write requests to a cluster before restoring from a snapshot.

Let’s assume that we want to restore some indices and that our cluster is reachable at example.com. The steps below show how to do this:

1) Close the index:

$ curl -XPOST "https://example.com/test/_close?pretty"
{
"acknowledged": true,
"shards_acknowledged": true,
"indices": {
"test": {
"closed": true
}
}
}

Please note that if more than one existing index in your cluster is going to be restored from the snapshot, you should close each of them as well.
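When several indices have to be closed, the request from step 1 can be repeated in a loop. The following is only a minimal sketch: the close_indices function and the ES_URL and DRY_RUN variables are hypothetical names introduced for illustration, and the dry-run mode simply prints the requests instead of sending them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: issue one _close request per index name.
# ES_URL and DRY_RUN are assumptions of this sketch, not OpenSearch settings.
ES_URL="${ES_URL:-https://example.com}"

close_indices() {
  local index
  for index in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Dry run (the default): only show the request that would be sent.
      echo "POST ${ES_URL}/${index}/_close"
    else
      curl -sS -XPOST "${ES_URL}/${index}/_close?pretty"
    fi
  done
}

# Example: preview the requests first, then send them.
# close_indices test filebeat-7.6.1-container-2022.02.18
# DRY_RUN=0 close_indices test filebeat-7.6.1-container-2022.02.18
```

Defaulting to a dry run is a deliberate choice here, since closing an index makes it unavailable for reads and writes until it is opened again.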

2) List all repositories that contain snapshots:

$ curl -XGET "https://example.com/_snapshot?pretty"
{
"cs-automated": {
"type": "s3"
}
}

AWS documentation states that most automated snapshots are stored in the cs-automated repository. If a domain encrypts data at rest, they're stored in the cs-automated-enc repository. Therefore, we will use the cs-automated repository since we only have automated snapshots and we are not using encryption at rest.

3) List all snapshots and decide which one to restore:

We need to pick the snapshot name by checking the snapshot field in the JSON response.

$ curl -XGET "https://example.com/_snapshot/cs-automated/_all?pretty"
{
"snapshots" : [ {
"snapshot" : "2022-02-21t13-34-32.15a0d6h1-1a44-3453-b088-453vec95fd88",
"uuid" : "...",
"version_id" : ...,
"version" : "...",
"indices" : [ "filebeat-7.6.1-container-2022.02.18", ...],
"data_streams" : [ ],
"include_global_state" : true,
"state" : "SUCCESS",
"start_time" : "2022-02-21T13:34:32.587Z",
"start_time_in_millis" : 1645450472000,
"end_time" : "2022-02-21T13:35:29.639Z",
"end_time_in_millis" : 1645450529000,
"duration_in_millis" : 49052,
"failures" : [ ],
"shards" : {
"total" : 153,
"failed" : 0,
"successful" : 153
}
},
....
....
....
....
]
}

In this case, let’s say we decided to use the snapshot with the name 2022-02-21t13-34-32.15a0d6h1-1a44-3453-b088-453vec95fd88.
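If we simply want the most recent successful snapshot, the choice can be scripted instead of read from the JSON by eye. The sketch below assumes that the domain exposes the _cat/snapshots API and that we request the id, status and end_epoch columns through the h parameter; the latest_successful_snapshot function is a hypothetical helper that only parses that tabular output.

```shell
#!/usr/bin/env bash
# Reads _cat/snapshots output (with a header row) on stdin and prints the id
# of the SUCCESS snapshot that has the largest end_epoch.
# Assumed column order, requested via h=id,status,end_epoch: id status end_epoch
latest_successful_snapshot() {
  awk 'NR > 1 && $2 == "SUCCESS" && ($3 + 0) > (max + 0) { max = $3; id = $1 }
       END { if (id != "") print id }'
}

# Example (not executed here):
# curl -sS "https://example.com/_cat/snapshots/cs-automated?v&h=id,status,end_epoch" \
#   | latest_successful_snapshot
```

Snapshots in any state other than SUCCESS are skipped, so a partial or failed snapshot is never selected even if it is the newest one.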

4) Restore the cluster from the selected snapshot:

$ curl -XPOST "https://example.com/_snapshot/cs-automated/2022-02-21t13-34-32.15a0d6h1-1a44-3453-b088-453vec95fd88/_restore"

5) Monitor the recovery process by executing the following:

$ curl -XGET "https://example.com/_cat/recovery?v"
index shard  stage  files_percent bytes_percent ...
test 0 done 100.0% 100.0%
test 1 done 100.0% 100.0%
test 2 done 100.0% 100.0%
test 3 done 100.0% 100.0%
test 4 done 100.0% 100.0%
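Rather than re-running the command by hand, we could poll until every shard reports the done stage. This is a rough sketch under the assumption that the output is limited to the index, shard and stage columns with the h parameter; recovery_finished is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Reads _cat/recovery output (with a header row) on stdin.
# Assumed column order, requested via h=index,shard,stage: index shard stage
# Returns 0 only if there is at least one shard row and all rows are "done".
recovery_finished() {
  awk 'NR > 1 { rows++; if ($3 != "done") pending++ }
       END { if (rows > 0 && pending == 0) exit 0; exit 1 }'
}

# Example polling loop (not executed here):
# until curl -sS "https://example.com/_cat/recovery?v&h=index,shard,stage" \
#     | recovery_finished; do
#   sleep 10
# done
```

An empty listing is treated as "not finished", so the loop does not exit early if the restore has not started yet.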

6) When the restore operation is completed successfully, open the closed index/indices:

$ curl -XPOST "https://example.com/test/_open?pretty"
{
"acknowledged" : true,
"shards_acknowledged" : true
}

As we have seen, we are able to restore indices from automated snapshots, which is great. Still, to reduce the probability of data loss from future hardware failures, AWS recommends first of all increasing the availability of your OpenSearch domain by increasing the number of nodes and using at least one replica for each shard.
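As an illustration of the replica recommendation, the replica count of an index can be changed through the index settings API. The sketch below only builds the request body; replica_settings_body is a hypothetical helper, and the index name test is reused from the steps above.

```shell
#!/usr/bin/env bash
# Prints the JSON body for an index settings update that sets the replica count.
# Defaults to 1 replica; rejects anything that is not a non-negative integer.
replica_settings_body() {
  local replicas="${1:-1}"
  case "$replicas" in
    ''|*[!0-9]*)
      echo "replica count must be a non-negative integer" >&2
      return 1
      ;;
  esac
  printf '{"index": {"number_of_replicas": %s}}\n' "$replicas"
}

# Example (not executed here):
# curl -sS -XPUT "https://example.com/test/_settings" \
#   -H 'Content-Type: application/json' -d "$(replica_settings_body 1)"
```

Keep in mind that a replica is never allocated on the same node as its primary shard, so on a single-node cluster like our development one the replica stays unassigned until more nodes are added.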

Bonus: What if we want to restore only a specific index instead of all indices?

Suppose that we accidentally deleted an index and we want to restore it from the latest snapshot. In this case, again we need to retrieve the snapshot name just like in step 3 above.

After retrieving the snapshot name, we can use the indices parameter as follows:

$ curl -XPOST "https://example.com/_snapshot/cs-automated/2022-02-21t13-34-32.15a0d6h1-1a44-3453-b088-453vec95fd88/_restore" -H 'Content-Type: application/json' -d '{"indices": "test"}'
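The indices parameter also accepts a comma-separated list, so a few indices can be restored in one request. Here is a small sketch of building that body from a list of names; restore_body is a hypothetical helper, and the second index name is only an example.

```shell
#!/usr/bin/env bash
# Prints a restore request body whose "indices" value is the comma-separated
# list of the index names passed as arguments.
restore_body() {
  local IFS=,
  # "$*" joins the arguments with the first character of IFS (a comma).
  printf '{"indices": "%s"}\n' "$*"
}

# Example (not executed here), with SNAPSHOT set to the name chosen in step 3:
# curl -sS -XPOST "https://example.com/_snapshot/cs-automated/${SNAPSHOT}/_restore" \
#   -H 'Content-Type: application/json' -d "$(restore_body test logs)"
```

The helper does no escaping, so it is only suitable for plain index names without quotes or backslashes.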

Thanks for reading. If you have questions or comments regarding this article, please feel free to leave a comment below.

Cemal Ünal
Picus Security Engineering

Cloud Software Engineer @ Picus Security | AWS Certified DevOps Engineer Professional