ETCD - the Easy Way

Vaibhav Rajput
Published in Nerd For Tech · May 2, 2021

This guide will help you get started with etcd and understand how it is used in a Kubernetes setup.

To put it in one sentence, etcd is a distributed, reliable key-value store for the most critical data of a distributed system.

Every necessary detail of every resource in a cluster is stored in the form of key-value pairs, divided into different directories as per the type of resource: namespaces, pods, apiservices, clusterroles, configmaps, deployments, and so on. This key-value store helps a Kubernetes cluster maintain its intended state.
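For example, on a kubeadm-provisioned cluster you can peek at these keys with etcdctl. This is only a sketch: the endpoint and the certificate paths below are the typical kubeadm defaults and may differ in your setup.

# list every key under the /registry prefix, grouped by resource type, without printing values
ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/apiserver-etcd-client.crt \
  --key=/etc/kubernetes/pki/apiserver-etcd-client.key \
  get /registry --prefix --keys-only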

Brain of the cluster

Also called the brain of a Kubernetes cluster, etcd has complete knowledge of every resource in the cluster. Let me show you how.

When creating a resource…

Let’s take an example where we create a pod using a command like kubectl run some-pod --image=some-image. The following steps would occur:

  1. The user uses the kubectl command-line tool to send the request to the kube-apiserver. The request is then authenticated, validated and executed. The pod is created (with no node assigned) and this data is written to etcd.
  2. A confirmation is sent back to the user stating that the pod is created.
  3. The kube-scheduler monitors the apiserver for new requests and, upon receiving one, looks for an available node to schedule the pod on. Once a node is found, it passes the information to the apiserver, which writes it into etcd.
  4. The apiserver sends the pod data to the kubelet on the chosen worker node, which runs the pod on the respective container runtime engine and sends back a confirmation.
  5. Pod status is stored back in etcd.

A similar process is followed when a resource is updated.
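To make this concrete, here is a minimal sketch of the flow; some-pod and the nginx image are just examples, any name and image will do:

kubectl run some-pod --image=nginx        # request hits the kube-apiserver, pod object is written to etcd
kubectl get pod some-pod -o wide          # the NODE column shows where kube-scheduler placed it
kubectl get events --field-selector involvedObject.name=some-pod   # Scheduled, Pulled, Created, Started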

When fetching a resource…

Now that the pod is created, we can use the kubectl get pods command to fetch the pod details.

Since etcd is updated after every stage, the apiserver can fetch the data directly from etcd.

  1. The user sends a request to the kube-apiserver, which is then authenticated and validated.
  2. The apiserver checks etcd for the pod data.
  3. Data is retrieved and sent back to the user.

Here you can see that every operation updates etcd with the latest status. There are also a few more things to note here:

  • All kubectl get commands fetch data from etcd (via the kube-apiserver)
  • Only the kube-apiserver interacts with etcd directly (you can verify this with the sketch below)
  • A change is considered complete only once it has been written to etcd
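You can see the second point for yourself by raising kubectl's log verbosity; at level 7 and above kubectl prints the HTTP requests it makes, all of which target the kube-apiserver and never etcd (the exact output format varies by version):

kubectl get pods -v=7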

Get started with etcd

If you have installed a cluster using a tool like kubeadm, it is likely that you will find etcd already installed in your cluster. In this case, you can view the etcd pod as follows

controlplane $ kubectl get pods --all-namespaces
NAMESPACE     NAME                  READY   STATUS    RESTARTS   AGE
kube-system   coredns-66bff467...   1/1     Running   0          87s
kube-system   coredns-66bff467...   1/1     Running   0          87s
kube-system   etcd-controlplane     1/1     Running   0          88s
kube-system   kube-apiserver-c...   1/1     Running   0          87s
kube-system   kube-controller-...   1/1     Running   0          87s
kube-system   kube-flannel-ds-...   1/1     Running   0          71s
kube-system   kube-flannel-ds-...   1/1     Running   0          85s
kube-system   kube-keepalived-...   1/1     Running   0          39s
kube-system   kube-proxy-2dfck      1/1     Running   0          86s
kube-system   kube-proxy-kqj2z      1/1     Running   0          86s

And if you’re setting up a cluster from scratch, you can simply download a release or build etcd yourself by following the official instructions.
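If you go that route, a rough install sketch looks like this; the version below is just an example (pick a current one from the etcd releases page), and the commands assume a Linux amd64 host:

ETCD_VER=v3.5.0   # example version, substitute a current release
curl -L https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o etcd.tar.gz
tar xzvf etcd.tar.gz
sudo mv etcd-${ETCD_VER}-linux-amd64/etcd etcd-${ETCD_VER}-linux-amd64/etcdctl /usr/local/bin/
etcd --version   # verify the install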

Once the etcd service is up, it will listen on port 2379 by default. You can now attach clients to the etcd service to start putting and retrieving data. Note that one client comes pre-installed: the etcd control client, etcdctl. You can use this client to interact with the data in etcd. To do so:
1. Set the ETCDCTL_API environment variable to set which version of commands you will be using. By default this value will be 2.

export ETCDCTL_API=3

2. Set the endpoints for different etcd nodes (multi-node/high availability will be discussed later in this article)

HOST_1=X.X.X.X
HOST_2=Y.Y.Y.Y
HOST_3=Z.Z.Z.Z
ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379

3. Connect to etcd

etcdctl --endpoints=$ENDPOINTS <some-command>
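As a quick smoke test, one simple command you can run at this point is listing the members of the cluster you just connected to:

etcdctl --endpoints=$ENDPOINTS member list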

Some handy commands

For starters, these commands can help you work with your data in etcd:

  • Check endpoint health:
etcdctl --endpoints=$ENDPOINTS endpoint health
10.240.0.17:2379 is healthy: successfully committed proposal: took = 3.345431ms
10.240.0.19:2379 is healthy: successfully committed proposal: took = 3.767967ms
10.240.0.18:2379 is healthy: successfully committed proposal: took = 4.025451ms
  • To add some data: etcdctl --endpoints=$ENDPOINTS put key1 value1
  • To retrieve that data: etcdctl --endpoints=$ENDPOINTS get key1
  • Delete data: etcdctl --endpoints=$ENDPOINTS del key1
  • Monitor a key: etcdctl --endpoints=$ENDPOINTS watch key1
    (Any further update on key1 will be displayed on the console)
  • Conditional/Transactional operations:
etcdctl --endpoints=$ENDPOINTS txn --interactive
compares:
value("key1") = "value1"
success requests (get, put, delete):
del key1
failure requests (get, put, delete):
put key1 value2

In the above case, if the value of key1 equals value1, then key1 is deleted; otherwise its value is set to value2. The compares/success requests/failure requests lines in the snippet above are prompts printed by etcdctl because we are using the --interactive flag.

  • Temporarily set a value (using a lease):
etcdctl --endpoints=$ENDPOINTS lease grant 60
# Output: lease 2be7547fbc6a5afa granted with TTL(60s)
etcdctl --endpoints=$ENDPOINTS put key1 value1 --lease=2be7547fbc6a5afa
# key1’s value is valid for the next 60s
etcdctl --endpoints=$ENDPOINTS lease keep-alive 2be7547fbc6a5afa
# keeps renewing the lease until the command is interrupted
etcdctl --endpoints=$ENDPOINTS lease revoke 2be7547fbc6a5afa
# revokes the lease immediately, expiring key1
  • Take snapshot for disaster recovery: etcdctl --endpoints $ENDPOINT snapshot save snapshot.db
    You can restore your data from this snapshot using the etcdctl snapshot restore command, as sketched below.
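A minimal restore sketch, assuming you want the restored data in a fresh data directory (the path is just an example):

etcdctl snapshot restore snapshot.db --data-dir=/var/lib/etcd-restore
# then start etcd against the restored directory
etcd --data-dir=/var/lib/etcd-restore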

Running etcd with high availability

The simplest method of implementing high availability is through distribution and replication. Along the same lines, etcd can be deployed as a cluster of nodes to achieve high availability and resilience.

In such a configuration, all the nodes are available for retrieving data but only one is used for writing. A writer node, or leader, is elected from among the nodes using the Raft consensus algorithm. To put it simply:

A random election timer is given to each of the nodes participating in the election.
The node whose timer runs out first sends a request to the other nodes to become their leader.
The other nodes send back acknowledgements, and the node with the maximum votes wins the election.

Still don’t get it? Check here

The leader also sends regular notifications (heartbeats) to the followers to signal that it is continuing in the leader role. If the followers don’t receive such a notification from the leader within the expected time period, they re-elect a leader using Raft.

Now that we have a leader in place, any write operation sent to etcd is forwarded to the leader node, which writes the data and sends copies to the followers. The write operation is considered complete only when the followers confirm back to the leader that they received the update.

A follower might be down, and in such a case the write is considered complete as soon as a quorum of the total number of nodes has written it successfully.

For N nodes in a cluster
Quorum = floor(N/2 + 1)

For a single- or double-node cluster, the quorum is the same as N, so such a cluster tolerates no failures. Also, the quorum of every even number M is the same as the quorum of M+1, so fault tolerance is better if we go for M+1 instead. Hence it is always advisable to keep the number of nodes odd and ≥ 3.
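Working a few values of N through the formula makes this clear:

N = 3 → quorum = 2 → tolerates 1 node failure
N = 4 → quorum = 3 → tolerates 1 node failure
N = 5 → quorum = 3 → tolerates 2 node failures

Going from 3 to 4 nodes buys no extra fault tolerance, while going from 4 to 5 does.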

Parting note

I hope you got a bit of an idea about etcd, why it is such an important component in a Kubernetes cluster, and how to get started with it. Now it’s time to kick-start your journey with etcd and keep on creating.
