Elasticsearch Node Roles

Mar 8, 2022

An Elasticsearch cluster is software that runs across one or more nodes (the underlying hardware or hosts). Each node’s roles are set in that node’s Elasticsearch settings and can then be confirmed via CAT Nodes.

Node role architecture centers around the following questions:

Does this need to be a production / highly-available cluster?
Will you be ingesting time-series data? (Implies using ILM.)
Do you plan to use any of the following features initially: Stack Monitoring, Ingest Pipelines, Machine Learning, Cross-cluster search/replication, Transforms, Fleet, Elastic Security?
Do you plan to ever use Searchable Snapshots?

Setting Config

You set node.roles in each node’s elasticsearch.yml. For example, a data node might be configured as:

node.roles: [data_hot, data_content]

where data_content allows storage of non-time-series data and data_hot allows storage of time-series data. At least one data-storage node in the cluster should set both, though as we’ll see below it may set other roles as well.

If you don’t set node.roles, a default set of roles is assigned on your behalf. In CAT Nodes, this default setup appears as:

# GET _cat/nodes?v=true
heap.percent ram.percent cpu node.role   master name
          25          76  56 cdfhilmrstw *      elk

This example one-node cluster is great for my testing or mini Python projects, but it is neither highly available nor intended for production use.
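To read a node.role string like the one above, it can help to expand the letters into role names. The following is a minimal Python sketch, assuming the letter mapping listed in Elastic's CAT Nodes documentation; the names ROLE_LETTERS and decode_node_roles are hypothetical helpers, not part of any Elastic client.

```python
# Letter-to-role mapping, assumed from Elastic's CAT Nodes documentation.
ROLE_LETTERS = {
    "c": "data_cold",
    "d": "data",
    "f": "data_frozen",
    "h": "data_hot",
    "i": "ingest",
    "l": "ml",
    "m": "master",
    "r": "remote_cluster_client",
    "s": "data_content",
    "t": "transform",
    "v": "voting_only",
    "w": "data_warm",
}


def decode_node_roles(letters: str) -> list[str]:
    """Expand a CAT Nodes node.role string into role names."""
    if letters == "-":
        # A lone dash marks a coordinating-only node (no roles assigned).
        return []
    # An unknown letter raises KeyError rather than being silently skipped.
    return [ROLE_LETTERS[letter] for letter in letters]


# The default role string from the one-node example above.
for role in decode_node_roles("cdfhilmrstw"):
    print(role)
```

Note that the default string has no v: voting_only is not among the roles assigned by default.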

First Draft — Master Nodes

Elastic’s Node doc outlines the following basic master node scenarios to start your architecture draft. Node role letters are interpreted via the CAT Nodes doc and are also included further down. (For this image, the ones we care about are: [m=master, s=data_content, h=data_hot, v=voting_only].)

If your cluster needs to be highly available (HA), you’ll want to note the three-node master-eligible configuration. For more information, see Elastic’s Plan for Production and Set up a Cluster for High Availability. You can also review Elastic’s Designing for Resilience to match your master-eligible node structure to your use case.
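The reasoning behind three master-eligible nodes can be sketched with simple majority arithmetic. This is a simplified model, assuming every master-eligible node votes and that an election needs more than half of them; Elasticsearch manages its actual voting configuration automatically, and failures_tolerated is a hypothetical helper name.

```python
def failures_tolerated(master_eligible: int) -> int:
    """Master-eligible nodes that can be lost while a majority still remains."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    # A majority is strictly more than half of the voting nodes.
    majority = master_eligible // 2 + 1
    return master_eligible - majority


for count in (1, 2, 3):
    print(f"{count} master-eligible node(s): tolerates {failures_tolerated(count)} failure(s)")
```

Under this model, three is the smallest count that survives the loss of one master-eligible node, which is why it is the usual HA starting point.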

Feature Usage — Time-series

If you expect to ingest time-series data, Elastic recommends using Index Lifecycle Management (ILM), which moves data to more economical nodes as it ages and eventually deletes it on an automated schedule.
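As a rough illustration of what such a lifecycle looks like, here is a Python sketch that builds an ILM policy body of the kind sent to the _ilm/policy API. All thresholds (7d, 50gb, 30d, 90d) are hypothetical placeholders, not recommendations, and the phase selection is only one possible layout.

```python
import json

# Hypothetical thresholds; pick values that match your own retention needs.
ilm_policy = {
    "policy": {
        "phases": {
            # New data lands on hot nodes and rolls over to a fresh index
            # once it is old enough or its primary shards grow large enough.
            "hot": {
                "actions": {
                    "rollover": {"max_age": "7d", "max_primary_shard_size": "50gb"}
                }
            },
            # After 30 days the index enters the warm phase.
            "warm": {"min_age": "30d", "actions": {"set_priority": {"priority": 50}}},
            # After 90 days the index is deleted.
            "delete": {"min_age": "90d", "actions": {"delete": {}}},
        }
    }
}

# Print the JSON body; sending it to a cluster is left out of this sketch.
print(json.dumps(ilm_policy, indent=2))
```

For the warm phase to have somewhere to go, at least one node would need the data_warm role.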

Elasticsearch used to require manual node “temperature” setup via Node Attributes but later moved to Data Tiers to automate the data lifecycle. (Technically, you can still do both, with some overlap.)

However, if you want to use Searchable Snapshots, you are required to use a Frozen Data Tier (which usually goes with handling all node temperatures via Data Tiers). I can tell you from experience that migrating from node attributes to node roles is unpleasant and best avoided.

Feature Usage — Other

The other node roles backing the features listed above can be added during initial setup or appended later on as needed. Node roles correlate to this feature list:

Correlation of CAT Node roles to which Elastic features they support

There aren’t many restrictions on whether these feature-specific node roles can overlap. According to the Elastic docs, it mainly comes down to hardware: a node can host multiple features as long as its hardware satisfies the needs of each one. The only caveat I’m aware of is that a feature under intense use is usually separated onto dedicated nodes (e.g. Machine Learning, Transform, Ingest Pipeline, Cross-cluster client).
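To make the overlap concrete, here is a small Python sketch that turns a list of planned features into a combined node.roles line. The mapping covers only the features whose backing role is unambiguous (ingest, ml, remote_cluster_client, transform); it is an assumption-based subset, not a full reproduction of the correlation above, and FEATURE_ROLES, roles_for and node_roles_line are hypothetical names.

```python
# Partial, assumed mapping of features to the node role that backs them.
FEATURE_ROLES = {
    "Ingest Pipelines": "ingest",
    "Machine Learning": "ml",
    "Cross-cluster search/replication": "remote_cluster_client",
    "Transforms": "transform",
}


def roles_for(features: list[str]) -> list[str]:
    """Return the sorted, de-duplicated node roles backing the given features."""
    return sorted({FEATURE_ROLES[feature] for feature in features})


def node_roles_line(roles: list[str]) -> str:
    """Format roles as a node.roles line for elasticsearch.yml."""
    return f"node.roles: [{', '.join(roles)}]"


# One node hosting two lightly used features.
print(node_roles_line(roles_for(["Transforms", "Ingest Pipelines"])))
```

A heavily used feature would instead get its own node with a single-role line, for example roles_for(["Machine Learning"]).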

Sizing

So now we know which node roles we want and our basic master-eligible configuration, but what about determining size and quantity of the other node roles?

Elastic’s famous answer is rightly “It Depends”, but they offer pretty good guidance in the following resources:

Sizing the Elastic Stack for Security Use Cases (discusses ingest rate x lifecycle equations)
How to Design your Elasticsearch Data Storage Architecture for Scale
Benchmarking and Sizing your Elasticsearch Cluster for Logs and Metrics
Elasticsearch Architecture Best Practices
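For a first back-of-the-envelope number before reading those guides, the "ingest rate x lifecycle" idea can be sketched as below. This is a deliberate simplification and not Elastic's own formula: it assumes on-disk size equals raw ingest size and ignores compression, indexing overhead and disk headroom. The function names and all input figures are hypothetical.

```python
import math


def estimate_storage_gb(daily_ingest_gb: float, retention_days: int, replicas: int = 1) -> float:
    """Raw storage needed: daily volume x days kept x number of copies."""
    return daily_ingest_gb * retention_days * (replicas + 1)


def estimate_data_nodes(total_gb: float, usable_disk_per_node_gb: float) -> int:
    """Smallest node count whose combined usable disk holds the total."""
    return math.ceil(total_gb / usable_disk_per_node_gb)


# Hypothetical inputs: 50 GB/day, kept 30 days, one replica, 2000 GB usable per node.
total = estimate_storage_gb(50, 30, replicas=1)
print(total, estimate_data_nodes(total, 2000))
```

Treat the result as a starting point for benchmarking rather than a final answer.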