A Guide to Kubernetes DaemonSets

Kirill Goltsman
Supergiant.io
Mar 5, 2019

As you remember, Deployments and StatefulSets ensure that a specified number of application replicas (the desired state) is always running. DaemonSets take the same approach but apply it to nodes. In a nutshell, a DaemonSet makes sure that all (or some) nodes in your Kubernetes cluster run a copy of a pod.

Why might this be useful? Deployments and ReplicaSets are good at maintaining the desired state: they always keep a certain number of pods running and can distribute them across nodes as resources allow. The resource constraints of each node, however, affect how many pods from a ReplicaSet can be scheduled on it, so Deployments cannot guarantee that a pod lands on every node. This is the first reason why you may need DaemonSets.

The second reason is this: if you have spent time as a systems administrator, you know that many programs need to run on a system without intervention from the user. These programs are often referred to as background processes or daemons, and DaemonSets are a good fit for running them.

So, a DaemonSet takes a pod specification and ensures that a pod is scheduled and running on every available node. Of course, if a node is unavailable or out of resources, the pod will not be scheduled there. This is why many DaemonSets run with resource requests set to “0,” even though they do consume resources.
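For illustration only, a container spec with zero resource requests might look like the sketch below (the container name and image are placeholders, not part of the example used later in this tutorial):

# Sketch: zero resource requests so the scheduler never rejects the pod for lack of capacity
containers:
- name: node-agent            # hypothetical daemon container
  image: example/node-agent   # placeholder image
  resources:
    requests:
      cpu: "0"
      memory: "0"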

Once the DaemonSet is created, it will dynamically add pods to nodes. For example, if a new node is added to the cluster, the DaemonSet controller will automatically add a pod to that node. Conversely, if a node is removed from the cluster, the DaemonSet ensures that the pods that lived on that node are garbage collected.

DaemonSet Use Cases

DaemonSets can be used to deploy a wide variety of daemons and background processes cluster-wide. For example, at Qbox we use DaemonSets in the following ways:

  • for a node monitoring daemon that provides node information to Prometheus. Other monitoring options with DaemonSets include collectd, Ganglia gmond, or the Instana agent.
  • for a daemon that rotates and cleans up log files.
  • for troubleshooting using the node-problem-detector, a daemon that runs on each node, detects node problems, and reports them to the apiserver.
  • for running a cluster storage daemon, such as glusterd or ceph, on each node.
  • for running a log collection daemon, such as fluentd or logstash, on each node.

Tutorial

To complete the examples used below, you’ll need the following prerequisites:

  • a running Kubernetes cluster. See Supergiant documentation for more information about deploying a Kubernetes cluster with Supergiant. As an alternative, you can install a single-node Kubernetes cluster on a local system using Minikube. However, because Minikube creates a single-node cluster, it’s not a production option for deploying DaemonSets.
  • A kubectl command-line tool installed and configured to communicate with the cluster. See the official Kubernetes documentation for how to install kubectl.

Defining a DaemonSet

A DaemonSet spec is similar to a Deployment spec in that you define a pod template and a selector that matches the set of pods created from that template. In the example below, we define a DaemonSet for the Fluentd log collector that collects logs from each node in the Kubernetes cluster and sends them to Elasticsearch.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:elasticsearch
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "f505e785.qb0x.com"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "30216"
        - name: FLUENT_ELASTICSEARCH_SCHEME
          value: "https"
        - name: FLUENT_UID
          value: "0"
        - name: FLUENT_ELASTICSEARCH_USER
          value: "kksdf992388923"
        - name: FLUENT_ELASTICSEARCH_PASSWORD
          value: "75c8sdfkk92"
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

DaemonSets have a number of specific requirements and fields you should be aware of. Let’s discuss those in detail.

.spec.template — this is a required field that specifies a pod template for the DaemonSet to use. Along with the required fields for containers, this template requires appropriate labels (.spec.template.metadata.labels). Also remember that the pod template of your DaemonSet must have a restartPolicy equal to Always, which is the default when the field is not specified.

.spec.selector — this is a required field that allows the DaemonSet to select a group of pods based on the specified labels. The field is important because it ensures that only specific pods are managed by the DaemonSet. As of Kubernetes 1.8, the pod selector must match the labels of the pod template (.spec.template.metadata.labels) discussed above. In our case, the pod selector matches all pods that have the label name: fluentd-elasticsearch. Note that once the DaemonSet is created, you cannot change its selectors and labels; attempting to do so may lead to unpredictable behavior and bugs.

As an alternative to matchLabels, you can also use matchExpressions inside .spec.selector. This field allows you to build more sophisticated selectors by specifying a key, a list of values, and an operator that relates the key and values. If you specify both matchLabels and matchExpressions, the results are ANDed, as shown in the sketch below.
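For illustration, here is a minimal sketch of a selector combining both forms (the environment key and its values are hypothetical and not part of our Fluentd example):

selector:
  matchLabels:
    name: fluentd-elasticsearch
  matchExpressions:
  - key: environment        # hypothetical label key
    operator: In
    values:
    - production
    - staging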

Defining Tolerations for Pods

Taints and tolerations are a big topic that we’ve already covered in another tutorial. However, knowing their basics is important for configuring DaemonSets properly. To make a long story short, taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. Taints prevent certain nodes from accepting pods, whereas tolerations allow (but do not require) certain pods to be scheduled on nodes with matching taints.

Before tolerations can be applied in a DaemonSet, you should have an appropriate taint added to a node. A taint can be added to a node with the kubectl taint command. For example:

kubectl taint nodes node1 taintKey=taintValue:NoSchedule

adds a taint to node1. The taint has the key taintKey, the value taintValue, and the taint effect NoSchedule. This means that no pod will be able to schedule onto node1 unless it has a matching toleration.
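For reference, a toleration matching this (hypothetical) taint would look roughly like this inside a pod spec:

tolerations:
- key: "taintKey"         # must match the taint's key
  operator: "Equal"
  value: "taintValue"     # must match the taint's value
  effect: "NoSchedule"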

Once the taint is applied, you can use tolerations in the pod template of the DaemonSet in the .spec.template.spec.tolerations field. In our case, we use a toleration with the key node-role.kubernetes.io/master and the effect NoSchedule to allow Daemon pods to be scheduled on the master node of our Kubernetes cluster.

Tolerations are useful when you want to run DaemonSet pods on nodes that prohibit scheduling. However, what about running DaemonSet pods only on specific nodes matching certain criteria? In this scenario, we can use the .spec.template.spec.nodeSelector field, which tells the DaemonSet controller to create pods only on those nodes that match the node selector (e.g., disktype: ssd). For a given pod to run on a node, the node must have each of the indicated key-value pairs as labels. In this example, the node must have a disktype: ssd label, so this label should be applied to the node(s) on which you want the DaemonSet to schedule pods. A sketch of such a selector follows below.
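To illustrate, adding a node selector to the pod template would look roughly like this (note that our Fluentd manifest above does not include it; disktype: ssd is just an example label):

spec:
  template:
    spec:
      nodeSelector:
        disktype: ssd   # pods are scheduled only on nodes carrying this label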

To label a node, you first need to get the node names. This can be done with the kubectl get nodes command. Then, pick the node that you want to label and run kubectl label nodes <node-name> <label-key>=<label-value> to add a label to it.

For example, if your node name is kubernetes-first-node-1.b.e and your desired label is disktype=ssd, you can run kubectl label nodes kubernetes-first-node-1.b.e disktype=ssd to add this label. You can verify that it worked by running kubectl get nodes --show-labels and checking that the node now has the new label.

You can also achieve more fine-grained control over how nodes are selected using the .spec.template.spec.affinity field. In that case, the DaemonSet controller will create pods only on nodes that match the specified node affinity. This is part of the bigger topic of assigning pods to nodes, which will be discussed in future tutorials; a short sketch is shown below.
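As a rough sketch (reusing the hypothetical disktype: ssd label from above), a node affinity rule equivalent to the node selector could look like this:

spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: disktype
                operator: In
                values:
                - ssd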

Now that you understand the basics, we can create the DaemonSet by saving the spec above in a file (e.g., fluentd-es.yml) and running:

kubectl create -f fluentd-es.yml

Please note that for this example to work you need a working Elasticsearch cluster with the corresponding credentials, and the Fluentd RBAC resources (the fluentd ServiceAccount and its permissions) created beforehand, as sketched below.
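The exact RBAC setup depends on your cluster, but a minimal sketch of the resources the manifest above assumes (a fluentd ServiceAccount in kube-system with read access to pods and namespaces) might look like this:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system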

Updating DaemonSets

Some actions trigger automatic DaemonSet updates. For example, if you change a node’s labels (say, from disktype: ssd to disktype: hdd), the DaemonSet will promptly add pods to newly matching nodes and delete pods from nodes that no longer match.
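Continuing the hypothetical disktype example, such a label change could be made with:

kubectl label nodes <node-name> disktype=hdd --overwrite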

Sometimes, you may need to modify the pods that a DaemonSet creates. If this is the case, remember that pods do not allow all fields to be updated (e.g., pod labels). Also, the DaemonSet controller will apply the original pod template the next time a node (even one with the same name) is created, which means all such modifications will be lost.

Performing a Rolling Update on a DaemonSet

Since Kubernetes version 1.6, you can perform a rolling update on a DaemonSet, and RollingUpdate is the default update strategy. With this strategy, after you update the DaemonSet template, old DaemonSet pods will be removed and new DaemonSet pods will be created automatically.

Let’s use a simple DaemonSet for Apache HTTP server to illustrate this.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logging
spec:
  selector:
    matchLabels:
      app: httpd-logging
  template:
    metadata:
      labels:
        app: httpd-logging
    spec:
      containers:
      - name: webserver
        image: httpd
        ports:
        - containerPort: 80

As you see, we did not explicitly specify the RollingUpdate strategy because it is the default. However, you could set .spec.updateStrategy.type to RollingUpdate explicitly to achieve the same result. You may also want to set .spec.updateStrategy.rollingUpdate.maxUnavailable (defaults to 1) and .spec.minReadySeconds (defaults to 0). As you might remember from our tutorial about Deployments, the first field specifies the maximum number of pods that can be unavailable during the update process, whereas the second specifies the minimum number of seconds for which a newly created pod should be ready, without any of its containers crashing, for it to be considered available.
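For reference, making those settings explicit in the spec would look roughly like this (the values shown are the defaults):

spec:
  minReadySeconds: 0
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1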

For the sake of simplicity, and because we are using a single-node Minikube cluster, we omitted those settings. You can find more options for rolling updates in our article about Kubernetes Deployments.

Let’s save the spec above in httpd-dset.yml and run the following command:

kubectl create -f httpd-dset.yml
daemonset.apps "logging" created

After the DaemonSet is created, any update to the .spec.template of a RollingUpdate DaemonSet will trigger a rolling update. For example, we can trigger one by changing the container image used by the DaemonSet (here, switching to the httpd:2.4 image tag):

kubectl set image ds/logging webserver=httpd:2.4

After setting a new image, you can watch the rollout status of the latest DaemonSet rolling update:

kubectl rollout status ds/logging

You should see the following output:

daemon set "logging" successfully rolled out

Deleting DaemonSets

When you delete a DaemonSet, all pods managed by it are automatically deleted too. However, if you want to keep those pods, you can specify --cascade=false with kubectl. Then, if you create a new DaemonSet with a different template but the same selector, it will recognize all the existing pods as having matching labels and will not modify or delete them despite the mismatch in the pod template. In that case, you will need to force new pod creation by deleting the pods or deleting the node.
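Using our earlier example, deleting the DaemonSet while leaving its pods running would look like this:

kubectl delete ds/logging --cascade=false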

Conclusion

As you’ve seen, DaemonSets are very useful for running daemons and background processes on specific nodes of your Kubernetes cluster. A wide variety of applications that collect logs, manage cluster-wide storage, or monitor nodes use this pattern. DaemonSets are very mature in the latest Kubernetes versions. In particular, you can configure them to create pods on certain nodes using tolerations and node selectors and perform rolling updates on them in a controlled manner.

Originally published at supergiant.io.


Kirill Goltsman
Supergiant.io

I am a tech writer with an interest in cloud-native technologies and AI/ML.