What is Kubernetes?
Kubernetes is a container-orchestration system which was open-sourced by Google in 2014. In simple terms, it makes it easier for us to manage containers by automating various tasks.
If you are not familiar with containers, check out: https://medium.com/coding-blocks/docker-made-easy-901b792bec7c
Why use Kubernetes?
A container-orchestration engine automates deploying, scaling and managing containerized applications on a group of servers. As mentioned above, Kubernetes makes it easier to manage containers and helps keep downtime to a minimum. For example, if one of your containers goes down, restarting it manually takes little effort. But if a large number of containers go down, it is far easier to let the system handle the issue automatically, and Kubernetes can do this for us. Its features include scheduling, scaling, load balancing, fault tolerance, deployment, and automated rollouts and rollbacks.
The Kubernetes architecture consists of a master node that manages the worker nodes. Worker nodes are virtual machines or physical servers running within a data center. They expose the underlying network and storage resources to the application.
All these nodes join together to form a cluster that provides fault tolerance and replication. Worker nodes were previously called minions.
The master node is responsible for managing the whole cluster. It monitors the health of the worker nodes and exposes information about the members of the cluster as well as their configuration.
For example, if a worker node fails, the master node moves the load to another healthy worker node. The Kubernetes master is responsible for scheduling, provisioning, controlling and exposing the API to clients.
It coordinates activities inside the cluster and communicates with worker nodes to keep Kubernetes and applications running.
Components of the Master Node
- API server
The API server is the gatekeeper for the entire cluster. All CRUD operations on cluster objects go through it. The API server validates and configures API objects such as Pods, Services, ReplicationControllers and Deployments, and it exposes an API for almost every operation.
How to interact with this API?
Using a tool called kubectl (also read as "kube control"). It talks to the API server to perform any operation that we issue from the command line. In most cases, the master node does not run application containers. It just manages the worker nodes and makes sure that the cluster of worker nodes is healthy and running.
- Scheduler
The scheduler is responsible for placing Pods across multiple nodes. It schedules Pods according to the constraints mentioned in the configuration file.
For example, you might specify that a Pod needs 1 CPU core, 10 GB of memory and an SSD disk type. Once this manifest is passed to the API server, the scheduler looks for nodes that meet these criteria and schedules the Pods accordingly.
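The constraints from the example above could be written in a Pod manifest roughly as follows. This is only a sketch: the Pod name, container name, image and the `disktype` node label are assumptions chosen for illustration, and the node selector only matches nodes that have actually been given that label.

```yaml
# Hypothetical Pod showing the kind of constraints the scheduler evaluates.
apiVersion: v1
kind: Pod
metadata:
  name: constrained-pod          # hypothetical name
spec:
  containers:
    - name: app                  # hypothetical container name
      image: nginx               # any image works for this illustration
      resources:
        requests:
          cpu: "1"               # one CPU core
          memory: 10Gi           # roughly the 10 GB from the example
  nodeSelector:
    disktype: ssd                # assumes some nodes are labeled disktype=ssd
```

If no node satisfies both the resource requests and the label, the Pod simply stays unscheduled until one does.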
- Controller Manager
There are 4 controllers behind the controller manager.
- Node Controller
- Replication Controller
- Endpoint Controller
- Service Account and Token Controller
These controllers are responsible for the overall health of the entire cluster. They ensure that nodes are up and running all the time and that the correct number of Pods is running, as specified in the spec file.
- etcd
etcd is a lightweight, distributed key-value database. It is the central database that stores the current cluster state at any point in time. Any component of Kubernetes can query etcd to understand the state of the cluster, so it is the single source of truth for all the nodes, components and masters that form the Kubernetes cluster.
A worker node is any VM or physical server where containers are deployed. Every node in a Kubernetes cluster must run a container runtime such as Docker or rkt (Rocket).
Components of the Worker Node
Kubelet is the primary node agent that runs on each worker node inside the cluster. Its main job is to look at the Pod spec that was submitted to the API server on the Kubernetes master and ensure that the containers described in that Pod spec are running and healthy. In case Kubelet notices any issues with the Pods running on its worker node, it tries to restart them on the same node. If the fault is with the worker node itself, the Kubernetes master detects the node failure and recreates the Pod on another healthy node. This only happens if the Pod is controlled by a ReplicaSet or Replication Controller (which ensure that the specified number of Pods is running at any time). If no controller is behind the Pod, it dies and is not recreated anywhere. It is therefore advisable to run Pods through a Deployment or ReplicaSet.
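As a rough illustration of that advice, a Deployment wraps a Pod template and keeps a fixed number of copies running. The sketch below is not taken from this post: the Deployment name, the `app: nginx` label, the image and the replica count are all assumptions.

```yaml
# Hypothetical Deployment that keeps three nginx Pods running.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment         # hypothetical name
spec:
  replicas: 3                    # desired number of Pods (assumed value)
  selector:
    matchLabels:
      app: nginx                 # must match the labels in the template below
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx-container
          image: nginx
          ports:
            - containerPort: 80
```

If one of these Pods or its node dies, the controller behind the Deployment creates a replacement so that three Pods are running again.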
Kube-proxy is responsible for maintaining the network configuration on each node. It maintains the distributed network across all the nodes, Pods and containers, and also exposes services to the outside world. It is the core networking component inside Kubernetes. Kube-proxy feeds information about the Pods on its node to iptables. iptables is the Linux firewall and can also route traffic. So when a new Pod is launched, kube-proxy changes the iptables rules to make sure that this Pod is routable within the cluster.
A Pod is the scheduling unit in Kubernetes. It plays the role that a virtual machine plays in the virtualization world. Each Pod consists of one or more containers. There are scenarios where you need to run two or more dependent containers together, with one container helping another. Pods let us deploy such dependent containers together. A Pod acts as a wrapper around these containers, and we interact with and manage containers through Pods.
Containers are runtime environments for containerized applications. They reside inside Pods and are designed to run microservices. For more detailed information, check out my blog on Docker.
Imagine that you want to quickly test something on a Kubernetes cluster, but no cluster is readily available and you don't want to set one up.
- Play-with-k8s
It provides a Kubernetes playground similar to play-with-docker. A GitHub or Docker account is required.
- Minikube
Minikube is useful when you want to install Kubernetes on your own system but have limited system resources. It is an all-in-one setup, i.e. there are no separate master and worker nodes. The same system acts as both the master and the worker node. It can be used for testing purposes.
- Kubeadm
This is a real multi-node setup. Using the Kubeadm tool, we can set up a multi-node Kubernetes cluster. It is very popular: you can run multiple VMs on your machine and configure the Kubernetes master and node components on them. If you have limited resources but want to use Kubeadm, you can use cloud-based VMs instead.
- Cloud Platforms
Various cloud services can run and manage Kubernetes for you. You define the number of nodes in the cluster, the CPU and RAM configuration, etc., and the cloud manages those resources. Examples include GCE, AWS, Azure and CloudStack.
For more information, check: https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/
More about Pods
Every node inside a Kubernetes cluster has its own unique IP address, known as the Node IP address. In Kubernetes, there is an additional IP address called the Pod IP address. Once we deploy a Pod on a worker node, it gets its own IP address.
Containers in a Pod communicate with the outside world through the Pod's network namespace. All the containers inside a Pod operate within that same network namespace, which means they all share the same IP address, namely that of their Pod. Individual containers are therefore told apart by the ports they use.
Note: Containers within the same Pod not only share the same IP address, but also share access to the same volumes, cgroup limits, and even the same IPC namespace.
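To make the shared-volume point concrete, here is a minimal sketch of a Pod in which two containers mount the same volume. All names, the busybox image and the file path are assumptions for illustration only.

```yaml
# Hypothetical Pod: two containers sharing one emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod        # hypothetical name
spec:
  volumes:
    - name: shared-data
      emptyDir: {}               # scratch volume that lives as long as the Pod
  containers:
    - name: writer
      image: busybox
      # Writes the current date into the shared volume every 5 seconds.
      command: ["sh", "-c", "while true; do date > /data/now.txt; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
    - name: reader
      image: busybox
      # Reads the file that the other container writes.
      command: ["sh", "-c", "while true; do cat /data/now.txt; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
```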
How do Pods communicate with one another?
- Inter-Pod communication: All the Pod IP addresses are fully routable on the Pod Network inside the Kubernetes cluster.
How do containers communicate in the same pod?
- Intra-Pod communication: Containers share the localhost interface. Each container can reach the other containers' ports on localhost.
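The intra-Pod case can be sketched with a two-container Pod in which a helper container calls the web server over the loopback address. The Pod and container names and the busybox helper are assumptions, not part of this post's demo.

```yaml
# Hypothetical Pod: the "probe" container reaches "web" through localhost.
apiVersion: v1
kind: Pod
metadata:
  name: localhost-demo           # hypothetical name
spec:
  containers:
    - name: web
      image: nginx
      ports:
        - containerPort: 80
    - name: probe
      image: busybox
      # Both containers share one network namespace, so 127.0.0.1:80 is nginx.
      command: ["sh", "-c", "while true; do wget -q -O- http://127.0.0.1:80 > /dev/null && echo reachable; sleep 10; done"]
```

Because the two containers share a single IP address, they could not both listen on port 80. The port is what tells them apart.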
- Define the Pod configuration inside a manifest file (explained ahead) in YAML or JSON. Submit the manifest file to the API server of the Kubernetes master.
- The Pod then gets scheduled on a worker node inside the cluster and goes into the Pending state. During this state, the node downloads all container images and starts the containers. The Pod stays in the Pending state until all containers are up and running.
- Now it goes into the Running state. When its purpose is achieved, it shuts down and its state changes to Succeeded.
- Failed state: The Pod enters this state when it fails for some reason, for example when one of its containers exits with an error. If a Pod dies, you cannot bring it back. You can only replace it with a new Pod.
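One way to watch these states is a Pod that runs a short task and is never restarted. This is a minimal sketch with assumed names and an assumed busybox image.

```yaml
# Hypothetical one-shot Pod used to observe the lifecycle states.
apiVersion: v1
kind: Pod
metadata:
  name: one-shot-pod             # hypothetical name
spec:
  restartPolicy: Never           # do not restart the container when it exits
  containers:
    - name: task
      image: busybox
      # Exits with code 0 after a few seconds, so the Pod ends as Succeeded.
      command: ["sh", "-c", "echo working; sleep 5; exit 0"]
```

With this manifest the Pod should move from Pending to Running and finish as Succeeded. Changing `exit 0` to `exit 1` should leave it in the Failed state instead. The current phase can be read with `kubectl get pod one-shot-pod -o jsonpath='{.status.phase}'`.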
Pod Manifest file
- name: nginx-container
We can define Kubernetes objects in two formats: YAML and JSON.
Most Kubernetes objects have four required top-level fields: apiVersion, kind, metadata and spec.
apiVersion defines the version of the Kubernetes API you're using to create this object.
- v1: It means that the Kubernetes object is part of first stable release of the Kubernetes API. So it consists of core objects such as Pods, ReplicationController and Service.
- apps/v1: Includes functionality related to running apps in Kubernetes.
- batch/v1: Consists of objects related to batch processing and job-like tasks.
kind defines the type of object being created.
metadata is data that helps uniquely identify the object, including a name string, UID, and optional namespace.
spec describes the desired state of the object. Its precise format is different for every Kubernetes object, and it contains nested fields specific to that object.
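Putting the four fields together, a minimal manifest consistent with the names used in this post (nginx-pod, nginx-container, port 80) could look like the sketch below. The image and the `app: nginx` label are assumptions. A label of some kind is needed later, because `kubectl expose pod` builds the Service selector from the Pod's labels.

```yaml
# nginx-pod.yaml (sketch; image and label are assumptions)
apiVersion: v1                   # core API group, first stable version
kind: Pod                        # type of object to create
metadata:
  name: nginx-pod
  labels:
    app: nginx                   # assumed label
spec:
  containers:
    - name: nginx-container
      image: nginx               # assumed image, no tag pinned
      ports:
        - containerPort: 80
```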
For more information, check out: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.15/
The manifest above creates one instance of an nginx container inside your Kubernetes cluster.
In this demo we will use the manifest file nginx-pod.yaml described above.
Deploy the pod from nginx-pod.yaml
$ kubectl create -f nginx-pod.yaml
To list all the pods
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-pod 1/1 Running 0 9m25s
Every pod has a unique IP address
$ kubectl get pod nginx-pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-pod 1/1 Running 0 11m 172.17.0.7 minikube <none> <none>
Get pod configuration in YAML format
$ kubectl get pod nginx-pod -o yaml
Get pod configuration in JSON format
$ kubectl get pod nginx-pod -o json
Display the details of the pod, including the list of all events from the time the pod was sent to the node until its current status
$ kubectl describe pod nginx-pod
Check if the pods are accessible: Verify if the connectivity from the master node to the pod is working by using the pod’s IP address
$ ping 172.17.0.7
Expose the pod using NodePort service
$ kubectl expose pod nginx-pod --type=NodePort --port=80
Here you can see the NodePort
$ kubectl describe svc nginx-pod
Port: <unset> 80/TCP
NodePort: <unset> 30843/TCP
Session Affinity: None
External Traffic Policy: Cluster
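The `kubectl expose` command above creates a Service object behind the scenes. A roughly equivalent manifest might look like the following sketch. It assumes the Pod carries the label `app: nginx`, which is an assumption and not something shown in this post, and it leaves the node port unset so that the cluster picks one.

```yaml
# Hypothetical declarative counterpart of the NodePort Service.
apiVersion: v1
kind: Service
metadata:
  name: nginx-pod                # same name that kubectl expose derives from the Pod
spec:
  type: NodePort
  selector:
    app: nginx                   # assumed Pod label
  ports:
    - port: 80                   # port of the Service inside the cluster
      targetPort: 80             # container port traffic is forwarded to
      protocol: TCP
```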
Now let's get inside the pod and execute some commands
$ kubectl exec -it nginx-pod -- /bin/sh
bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var
Delete the Pod
$ kubectl delete pod nginx-pod
pod "nginx-pod" deleted