Etcd — the what, why, and how 😎

Murtaza Vasi
5 min read · Oct 15, 2023


Introduction

If you have just started learning Kubernetes, you have definitely heard about ‘etcd’. It is a core component of Kubernetes, and without it the cluster cannot function properly. In this article, let me try to explain all about etcd: what it is, why we use it, and how to use it.

What is etcd?

Let’s start with the technical definition of etcd:

“It is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines”.

Okay, unless you are some kind of robot, you probably didn’t get all of that on the first read. So let’s break it down.

Etcd is designed to run on multiple nodes (forming a cluster, commonly referred to as an etcd cluster), and the data is stored in the form of key-value pairs. For example, in name=murtaza, “name” is the key and “murtaza” is the value. Now, since etcd is designed to run on multiple nodes, it is essential that the data stored on all nodes is the same, i.e. consistent. How does etcd ensure consistency across different nodes? Read on to find out.
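To make the key-value idea concrete, here is what storing and reading that pair looks like with `etcdctl`, etcd’s command-line client (this assumes a reachable etcd v3 endpoint; the output comments show the usual shape, not a capture from a live cluster):

```shell
# Write the key-value pair, then read it back
etcdctl put name murtaza
# OK
etcdctl get name
# name
# murtaza
```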

Why do we use etcd in Kubernetes?

Most of the time when we hear about etcd, it is in conjunction with Kubernetes, so the question might arise in your mind: why etcd with Kubernetes? Why not some other database? To answer this, we need to know a bit about the architecture of Kubernetes and about the API server.

The API server is the Kubernetes component that, by itself, doesn’t do much, but it is responsible for the communication between the different components in the cluster. That communication can only happen effectively if the API server is backed by a database that ensures high consistency, high availability, and high performance.

Generally, we have databases that store data in the form of tables (SQL), like Postgres, MySQL, etc., as well as databases that store non-relational data (NoSQL), like MongoDB, CouchDB, etc. These databases have their place, but they also have shortcomings due to which they (at least for now) cannot replace etcd: most of them were not designed from the ground up as strongly consistent, distributed stores, unlike etcd. The databases that can actually compete with etcd are Apache ZooKeeper and HashiCorp’s Consul, which are compared in this table.

What happens when an etcd node goes down?

As with any other service, etcd is not impervious to failure. In fact, running etcd on multiple nodes is one of the ways we can prevent our cluster from going down when a node fails, since etcd provides built-in fault tolerance in that case.

As you can guess, if you are running etcd as a single-node cluster then you don’t have any fault tolerance: if this node goes down, there is nothing that can recover your data. So for more important clusters, stick to running etcd on multiple nodes (in odd numbers like 3, 5, 7, etc.).

Based on the number of etcd nodes you have running in the cluster, the table below shows the fault tolerance.

| Cluster size | Majority (quorum) | Fault tolerance |
|---|---|---|
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | 2 |
| 6 | 4 | 2 |
| 7 | 4 | 3 |

As you can see, 1- and 2-node etcd clusters provide no fault tolerance, meaning if your etcd goes down you are done for 🤣. But with any node count higher than 2 you get at least some tolerance.
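These fault-tolerance numbers follow directly from Raft’s majority rule, which can be sketched in a few lines of Python (illustrative only, not etcd internals):

```python
def fault_tolerance(cluster_size: int) -> int:
    """Nodes that can fail while the cluster still has a write quorum.

    Raft needs a majority (floor(n/2) + 1) of members to agree on
    every write, so the cluster survives n - quorum failures.
    """
    quorum = cluster_size // 2 + 1
    return cluster_size - quorum

for n in range(1, 8):
    print(f"{n} node(s): tolerates {fault_tolerance(n)} failure(s)")
```

Notice that 4 nodes tolerate no more failures than 3, and 6 no more than 5, which is exactly why odd cluster sizes are recommended.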

Under the hood of etcd

Etcd is a distributed database that works based on the Raft consensus algorithm. This algorithm ensures strong data consistency across all nodes in the cluster 💪.

All the nodes in a running etcd cluster can be in one of 3 states:

  1. Follower (Default mode)
  2. Leader
  3. Candidate

Etcd maintains consistency among the nodes in the cluster by electing one of them as the leader node, which takes care of data replication and consistency among the remaining nodes, known as follower nodes. The leader receives all write requests and signals the follower nodes to make an entry in their respective local logs. Once the leader ascertains that a majority of followers have done so, it applies the change to its own local state machine. If the leader fails to confirm that a majority of the followers have recorded the change, it retries until it is confirmed.
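The write path just described can be modelled with a toy simulation (purely illustrative: the node names, `drop_prob`, and the retry loop are assumptions for the sketch, not etcd internals):

```python
import random

def replicate(entry: str, followers: list,
              drop_prob: float = 0.3, max_retries: int = 10) -> bool:
    """Toy model of Raft log replication.

    The leader keeps resending the entry until a majority of the
    whole cluster (leader included) has appended it, then commits.
    """
    cluster_size = len(followers) + 1        # the leader is a member too
    quorum = cluster_size // 2 + 1
    acked = {"leader"}                       # leader's own log already has it
    for _ in range(max_retries):
        for follower in followers:
            # a message may be lost; the leader simply retries next round
            if follower not in acked and random.random() >= drop_prob:
                acked.add(follower)
        if len(acked) >= quorum:
            return True                      # majority reached: commit
    return False                             # could not confirm a majority

# With two followers (a 3-node cluster) the occasional lost message
# only delays the commit; it does not prevent it.
print(replicate("name=murtaza", ["node-1", "node-2"]))
```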

If a follower node does not receive messages from the leader within a specific time interval (known as the election timeout), the node declares itself a candidate, starts an election procedure, and the entire process begins all over again.
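The follower-to-candidate transition can be sketched the same way (a hedged toy model; the `ELECTION_TIMEOUT` value and the `Node` class are made up for illustration, and real etcd randomises the timeout per node):

```python
import time

ELECTION_TIMEOUT = 0.15  # seconds; illustrative value only

class Node:
    """Minimal sketch of a follower watching for leader heartbeats."""

    def __init__(self):
        self.state = "follower"              # the default state
        self.last_heartbeat = time.monotonic()

    def on_heartbeat(self):
        # Any message from the leader resets the election timer
        self.last_heartbeat = time.monotonic()

    def tick(self):
        # Silence longer than the election timeout triggers candidacy
        if time.monotonic() - self.last_heartbeat > ELECTION_TIMEOUT:
            self.state = "candidate"         # it will now request votes

node = Node()
node.tick()                                  # leader is still fresh
print(node.state)                            # "follower"
time.sleep(ELECTION_TIMEOUT + 0.05)
node.tick()                                  # timeout elapsed, no heartbeat
print(node.state)                            # "candidate"
```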

If you would like to see a visual representation of this process (the Raft algorithm), which I recommend you do 😄, go to this link.

Configuring etcd

You can configure etcd in two ways:

  1. Single-node etcd cluster: You can run etcd on a single node within the Kubernetes cluster, but this is generally a setup you would only want for a sandbox or testing environment.
  2. Multi-node etcd cluster: Running etcd in this format improves the reliability of the system and is recommended for production environments.

A multi-node etcd cluster can itself be configured in a few ways, listed below.

  1. Setting up etcd as static pods on a highly available Kubernetes cluster (default).
  2. Setting up etcd as an external cluster, generally referred to as an external etcd setup.
  3. Setting up etcd as a systemd service on the master nodes of the Kubernetes cluster.
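Whichever of these options you pick, bootstrapping a member looks roughly the same. Here is a sketch of starting one member of a 3-node cluster (the host names and cluster token are placeholders, and only the most common flags are shown):

```shell
# Start the first member; run the analogous command on the other two
# nodes, each with its own --name and peer URLs.
etcd --name infra0 \
  --initial-advertise-peer-urls http://etcd0.example.com:2380 \
  --listen-peer-urls http://0.0.0.0:2380 \
  --listen-client-urls http://0.0.0.0:2379 \
  --advertise-client-urls http://etcd0.example.com:2379 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-cluster infra0=http://etcd0.example.com:2380,infra1=http://etcd1.example.com:2380,infra2=http://etcd2.example.com:2380 \
  --initial-cluster-state new
```

Port 2380 carries peer (Raft) traffic between members, while clients such as the API server talk to port 2379.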

Conclusion

In this article, we went over some of the basic questions that might cross your mind when dealing with etcd, as well as some of the options you might need to consider while setting up an etcd cluster. In the next article, we will discuss how to set up an etcd cluster as a systemd service 😄.
