Elasticsearch Cluster Administration

Road to Elastic Certified Engineer IV — Elasticsearch 7.2

Carlos Cilleruelo
Apr 1

After learning how to perform CRUD operations in Elasticsearch, we should learn how to administer our cluster. Backups and shard allocation are fundamental tasks that we should be able to perform.

Shard allocation filtering

As mentioned in previous posts, Elasticsearch allocates each index across one or more shards, and we can place those shards on specific cluster nodes. For example, imagine that you have several data nodes, two of them with SSD storage. If we are looking for fast responses from one of our indices, we can configure its shards to go only to the SSD data nodes. This concept is called shard allocation filtering: we can allocate the shards of an index to specific nodes based on a given set of requirements.

In order to perform shard allocation filtering, we first need to assign attributes to each node. We can do this each time we launch a node:
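For example (the attribute names storage and size are just illustrative labels; any custom attribute works):

```shell
./bin/elasticsearch -Enode.attr.storage=ssd -Enode.attr.size=medium
```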

Or specify those attributes in the elasticsearch.yml config file:
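A sketch of the equivalent elasticsearch.yml entries (same illustrative attribute names):

```yaml
node.attr.storage: ssd
node.attr.size: medium
```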

Then we can configure our chosen index, twitter, to be stored only on the data nodes with the SSD attribute and size medium.
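A possible settings update for this, assuming the nodes were tagged with the storage and size attributes shown above:

```
PUT /twitter/_settings
{
  "index.routing.allocation.include.storage": "ssd",
  "index.routing.allocation.include.size": "medium"
}
```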

Another possibility is shard allocation awareness: we can make Elasticsearch take our physical hardware configuration into account when allocating shards.

For example, we can specify the rack where each node of the cluster is running:
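For instance (rack_one is an arbitrary label):

```shell
./bin/elasticsearch -Enode.attr.rack_id=rack_one
```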

Or using elasticsearch.yml:
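The yml equivalent would be:

```yaml
node.attr.rack_id: rack_one
```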

Then we need to specify in our master node that we are going to use rack_id as an attribute:
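This can be set dynamically through the cluster settings API (or statically in elasticsearch.yml):

```
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id"
  }
}
```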

Later on, if we add new nodes with a different rack_id, like rack_two because they sit in a different rack, Elasticsearch will move shards to the new nodes, ensuring (if possible) that two copies of the same shard are never stored in the same rack. This way we can prevent multiple copies of a particular shard from being allocated in the same location.

It could be the case that one rack fails, and then you might not have sufficient resources to host all your primary and replica shards in the remaining rack. To prevent overloading a single location in case of failure, we can use forced awareness.

We can specify that replicas will be allocated only if nodes in both racks are available. To do that, we tell the cluster to use rack_id as an awareness attribute and force the set of rack_id values.
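A sketch of the forced-awareness settings, assuming the two racks are labelled rack_one and rack_two:

```
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id",
    "cluster.routing.allocation.awareness.force.rack_id.values": "rack_one,rack_two"
  }
}
```

With this in place, if rack_two goes down, the replicas that would otherwise be crammed into rack_one simply stay unassigned until rack_two comes back.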

Cluster health

Our cluster has three possible health statuses: green, yellow or red. These colours change based on shard allocation: red indicates that at least one primary shard is not allocated, yellow means that all primary shards are allocated but some replicas are not, and green means that all shards are allocated.

Cluster health can be retrieved using the Cluster Health API:
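The simplest call looks like this:

```
GET /_cluster/health
```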

Moreover, we can check the health of an individual index and their shards:
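For example, for the twitter index, drilling down to shard level:

```
GET /_cluster/health/twitter?level=shards
```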

If we are dealing with a cluster in yellow or red status, we can check the reasons using the cluster allocation explain API:
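Called with an empty body, the API explains the first unassigned shard it finds:

```
GET /_cluster/allocation/explain
```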

As mentioned before, statuses are associated with shard allocation. For example, we can check the status of a replica shard ("primary": false) of an index:
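For example, for shard 0 of the twitter index:

```
GET /_cluster/allocation/explain
{
  "index": "twitter",
  "shard": 0,
  "primary": false
}
```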

To solve health issues we may need to add new data nodes or change the allocation rules of an index. In this example, I am configuring the twitter index so that its shards cannot remain on the node named "data-node1"; Elasticsearch will move them to other nodes.
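A sketch using index-level allocation filtering (the node name data-node1 is from the example above; note that this kind of filter applies to primary and replica copies alike):

```
PUT /twitter/_settings
{
  "index.routing.allocation.exclude._name": "data-node1"
}
```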

Backup and restore

One might think that a backup is as simple as copying the data directories of all the nodes. You should not make backups this way, because Elasticsearch may be modifying its data while it is running. Instead, Elasticsearch offers a snapshot API. You can take snapshots of a running index or of the entire cluster, and store that information somewhere else.

Furthermore, snapshots are incremental. Each time we create a snapshot of an index, Elasticsearch avoids copying any data that is already stored as part of an earlier snapshot. Because of that, it is recommended to take snapshots of your cluster quite frequently.

To save snapshots we need to create a repository. First, we need to whitelist the possible repository paths inside the elasticsearch.yml config file of each master and data node.
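For example (the path is an assumption; use whatever shared mount point your repositories live on):

```yaml
path.repo: ["/mount/backups"]
```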

After that we can register a repository with a name:
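A minimal shared file system repository, here named my_backup (both the name and the location are examples):

```
PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_backup"
  }
}
```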

This can also be configured from Kibana going to Management and then Snapshot and Restore.

Independently of the repository and future snapshots, we should always back up the Elasticsearch config folder, which includes elasticsearch.yml. In this case, we can simply copy the folder with our regular backup tool. But Elasticsearch security features (users, roles, etc.) are stored inside a dedicated index, so it is also necessary to back up the .security index.

In order to perform these operations with security features enabled, we need to have the snapshot_user role assigned to a user, or create a new user with that role.
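A sketch of creating such a user through the security API (the username and password are placeholders):

```
POST /_security/user/snapshot_operator
{
  "password": "a-strong-password",
  "roles": ["snapshot_user"]
}
```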

And then take a snapshot of the .security index:
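For example, into the my_backup repository registered earlier (the snapshot name is arbitrary):

```
PUT /_snapshot/my_backup/security_backup?wait_for_completion=true
{
  "indices": ".security*"
}
```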

After successfully backing up our security configuration, we can always restore it. First, create a new user with the superuser role:
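Since deleting the .security index also deletes native-realm users, this superuser should live in the file realm; a sketch using the elasticsearch-users CLI tool (username and password are placeholders):

```shell
# creates a file-realm user that survives the deletion of the .security index
./bin/elasticsearch-users useradd restore_admin -p a-strong-password -r superuser
```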

Delete the previous security data:
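Authenticated as the new superuser (in 7.x the concrete index is typically .security-7, which this wildcard covers):

```
DELETE /.security-*
```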

And restore the security index with the new user:
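Again sending the request as the new superuser:

```
POST /_snapshot/my_backup/security_backup/_restore
{
  "indices": ".security*"
}
```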

The easiest way to back up our data is to take a snapshot. By default, a snapshot copies all open and started indices in the cluster:
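For example (repository and snapshot names as before, both arbitrary):

```
PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true
```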

But we can always back up just some indices:
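For instance, only the twitter index:

```
PUT /_snapshot/my_backup/snapshot_2
{
  "indices": "twitter",
  "ignore_unavailable": true
}
```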

And after performing a snapshot we can check its status:
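For example:

```
GET /_snapshot/my_backup/snapshot_1/_status
```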

Finally, after successfully taking a snapshot, we can restore it using _restore and specifying the snapshot name. Note that indices being restored must not already be open in the cluster.
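In its simplest form:

```
POST /_snapshot/my_backup/snapshot_1/_restore
```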

Alternatively, we can restore only some indices:
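For example, just twitter:

```
POST /_snapshot/my_backup/snapshot_1/_restore
{
  "indices": "twitter"
}
```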

Or restore an index and change its configuration:
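For instance, restoring twitter under a new name and dropping its replicas (the new name and settings are illustrative):

```
POST /_snapshot/my_backup/snapshot_1/_restore
{
  "indices": "twitter",
  "rename_pattern": "twitter",
  "rename_replacement": "twitter_restored",
  "index_settings": {
    "index.number_of_replicas": 0
  }
}
```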

Similar to indices, deleting a snapshot just requires specifying its name and repository in a DELETE request against the _snapshot API:
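For example:

```
DELETE /_snapshot/my_backup/snapshot_1
```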

Cross-cluster search

An additional useful feature is cross-cluster search. To enable it, we just need to register each remote cluster, specifying the address of at least one of its nodes:
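In 7.x this is done by registering remote clusters under cluster.remote (the cluster names and seed addresses below are examples):

```
PUT /_cluster/settings
{
  "persistent": {
    "cluster": {
      "remote": {
        "cluster_one": { "seeds": ["10.0.0.1:9300"] },
        "cluster_two": { "seeds": ["10.0.0.2:9300"] }
      }
    }
  }
}
```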

After performing this configuration, we can search a remote cluster by prefixing the index name with the cluster name in our requests:
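For example, searching the twitter index on the remote cluster_one:

```
GET /cluster_one:twitter/_search
{
  "query": { "match_all": {} }
}
```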

Or search across two clusters at the same time:
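For example:

```
GET /cluster_one:twitter,cluster_two:twitter/_search
{
  "query": { "match_all": {} }
}
```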

Geek Culture
A new tech publication by Start it up (https://medium.com/swlh).

Written by Carlos Cilleruelo
Bachelor of Computer Science and MSc in Cyber Security. Currently working as a cybersecurity researcher at the University of Alcalá.