Kubernetes, Local to Production with Django: 6 - Add Prometheus & Grafana Monitoring With Helm

This tutorial introduces monitoring. To this end, the following points are covered:

  • Integrate with Prometheus and Grafana for Kubernetes resource monitoring as well as application monitoring.
  • The monitoring tools are added using the helm package manager which we will touch on briefly.
  • The monitoring applications will be deployed against a minikube environment.

Requirements

Some basic knowledge of Kubernetes is assumed, and the code for this tutorial will be deployed into a Kubernetes cluster running in minikube.

The code for this part of the series can be found on GitHub in the part_6-monitoring branch. The Kubernetes manifest files can be found in the following file path within the repo:

$ kubernetes_django/deploy/..

The rest of the tutorial will assume the above is the current working directory when applying the Kubernetes configuration files.

1. Background

Monitoring is a crucial part of any production-level application and can be fairly simple to configure in trivial systems. However, with Kubernetes, not only do we need to monitor the infrastructure (nodes, networking etc.), we also need to know the status of the pods, containers and other native Kubernetes resources, as well as the applications running the business-specific code within the cluster. Thus the level of complexity becomes non-trivial even for a simple application like the one this tutorial series covers.

Fortunately, Kubernetes has a good ecosystem of monitoring options. The first is the rudimentary Kubernetes dashboard that we have been using so far. We have already investigated how to install it into our cluster as well as the different types of metrics it provides. But it only offers a limited view of our cluster, so better monitoring tools are required.

There are several monitoring tools to choose from. These include self-hosted solutions such as Heapster, Prometheus, Grafana etc. Another option is to use third-party hosted services such as Datadog, Google Stackdriver, Sysdig etc. As the focus of this tutorial is to better understand Kubernetes itself, we will steer away from third-party hosted solutions and focus on self-hosted ones. Specifically, the focus will be on Prometheus and Grafana, which are quite popular.

In addition to the monitoring tools, we will be looking at a new way to deploy resources to the Kubernetes cluster. For this we will be investigating Helm.

Helm

As we move closer to a production environment, we need to find a way to reduce deployment complexity and minimize the time spent writing boilerplate code for applications that support our infrastructure, e.g. logging and monitoring tools. There are a couple of ways to do this: one is to copy and paste code found in the wild (looking at you, Stack Overflow); another is to use a prepackaged solution written in line with best practices, i.e. a package manager.

Helm is one of the most popular package managers currently. Helm is to Kubernetes as apt/yum/brew is to operating systems or npm/pip is to JavaScript/Python. This means it’s a way to deploy packages called Charts from the Kubernetes Helm repository into our application with minimal setup, avoiding the learning curve one would face when tailoring a solution from scratch. It also functions as a template system that helps deploy custom resources in unison.

We won’t go into too much depth concerning the internals of Helm, as we will only cover what is required to set up our monitoring tools. Depending on the feedback, I will dedicate a tutorial to helm.

Helm Setup

Helm can be installed by running the following command on macOS (the Windows and Linux equivalents can be found on the helm documentation page):

$ brew install kubernetes-helm

Once helm has been installed, it needs to be initialized by running the following command:

$ helm init

This process performs two crucial tasks: it initializes the local helm client and installs the tiller server within the Kubernetes cluster, which handles all the helm packages to be deployed.

Helm works by taking a set of values (usually written in the values.yaml file) and applying them to predefined templates packaged as Charts. Think of a Chart as the collection of files that describe the Kubernetes resources (Services, Pods, Persistent storage volumes, Configmaps etc.) that the application requires to function optimally.
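To make the idea of applying values to a template concrete, here is a minimal Python sketch. It is only an analogy: Helm itself renders Go templates, and the template text, the render_manifest helper and the service_name and port keys are hypothetical names chosen for illustration.

```python
from string import Template

# A hypothetical, heavily simplified "chart template" for a Service.
# Real Helm charts use Go templating, not Python's $name placeholders.
SERVICE_TEMPLATE = Template(
    "apiVersion: v1\n"
    "kind: Service\n"
    "metadata:\n"
    "  name: $service_name\n"
    "spec:\n"
    "  ports:\n"
    "  - port: $port\n"
)


def render_manifest(values):
    """Fill the template placeholders with the supplied values."""
    # substitute() raises KeyError if a placeholder has no value.
    return SERVICE_TEMPLATE.substitute(values)


if __name__ == "__main__":
    # Illustrative values, standing in for the contents of a values.yaml file.
    print(render_manifest({"service_name": "prometheus-server", "port": 9090}))
```

The same template can be rendered with a different set of values to produce a different manifest, which is the core idea behind reusing one Chart across many deployments.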

For our use case, we will be using the stable Charts from the Kubernetes project repository. We will focus on two approaches. In the first approach, we will download the Charts to the local repo, edit the values.yaml file and deploy this local Chart. In the second approach, we will edit the values.yaml file and apply it to a stable Chart directly from the Kubernetes repo. Both have their advantages and disadvantages; you can choose whichever approach addresses your use case.

To have a better grasp of the manifest files that will be deployed and the options the values file takes, we will add the stable Charts into our codebase before making the necessary edits. To search for available helm Charts in the Kubernetes stable repo, the command is:

$ helm search

This will return a list of Charts where each chart has the format stable/<chart_name>.

To download the necessary chart to our repo, the command to run is:

$ helm fetch stable/<chart_name> --untar

By default, helm fetch downloads the chart as a tar archive; the --untar convenience argument unpacks it into a regular folder ready for use.

Before deploying to the cluster, it might be necessary to edit the values.yaml file in the chart according to the requirements. The edits may include changing the storage size, configuring which pods to install, updating credentials etc. (more on this later). Finally, the chart can be added to the cluster by running:

$ helm install ./<chart_name>

Prometheus

Prometheus is an open-source monitoring and alerting metric collector with a time series database. It was taken in by the Cloud Native Computing Foundation (CNCF) as the second hosted project after Kubernetes.

Prometheus with Helm

To fetch the helm chart, the following command needs to be run:

$ helm fetch stable/prometheus --untar

Once the chart is downloaded, open the values file found in location prometheus/values.yaml. Though it contains a number of options, there are only a few options we need to worry about. The first is:

rbac:
  create: false

This option controls whether the role-based access control (RBAC) resources are created, a topic that is beyond the scope of this tutorial. However, I found that leaving it as true led to deployment issues in my minikube environment, so it’s turned off to ensure the deployment will still work.

The other option you might want to consider is the size of the persistent volume. This can be set individually for the alertmanager and the prometheus server using the size variable:

size: 2Gi

There are several options which are set to true but are not necessary, including the alertmanager, pushgateway etc.; setting them to true or false doesn’t adversely affect our deployment. You can read the documentation to figure out whether they cover your use case. Once we have made the necessary edits to the values file, the prometheus chart can be added to our deployment by running:

$ helm install ./prometheus

This returns a slew of information which has the following structure:

NAME:   messy-manta
LAST DEPLOYED: Tue Apr 17 00:41:16 2018
NAMESPACE: default
STATUS: DEPLOYED
.
.
RESOURCES:
.
.
NOTES:
.
.

The NAME variable is the release name of the helm deployment for prometheus. This allows us to update, delete and inspect the deployment based on the release name e.g. helm delete <RELEASE_NAME>. To see a list of running helm applications according to their release names, the command is:

$ helm ls
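Because later commands such as helm delete need the release name, it can be handy to pull it out of the install output in a script. The following is a minimal sketch, assuming the output keeps the "KEY: value" header structure shown above; parse_release_header is a hypothetical helper and not part of Helm.

```python
def parse_release_header(output):
    """Collect the 'KEY: value' header lines from `helm install` output."""
    header = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        # The header ends where the RESOURCES or NOTES section begins.
        if key in ("RESOURCES", "NOTES"):
            break
        if sep and value.strip():
            header[key] = value.strip()
    return header


# Sample text copied from the structure of the output shown earlier.
SAMPLE_OUTPUT = """NAME:   messy-manta
LAST DEPLOYED: Tue Apr 17 00:41:16 2018
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
"""

if __name__ == "__main__":
    release = parse_release_header(SAMPLE_OUTPUT)
    print(release["NAME"])
```

Only the first colon on each line is treated as the separator, so values that contain colons themselves (such as the deployment time) are kept whole.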

The next crucial bit of information is RESOURCES. As the name indicates, this section contains a list of the Kubernetes resources that were deployed (e.g. Services, Daemonsets, Deployments, Pods etc.) and that are required for an optimally functioning prometheus application. Finally, the NOTES section contains the developers’ notes on how to perform some convenience functions, including how to access the prometheus application through a port forward, which can be done by running:

$ export POD_NAME=$(kubectl get pods --namespace default -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
$ kubectl --namespace default port-forward $POD_NAME 9090

This provides access to the prometheus dashboard on http://127.0.0.1:9090/. This is an expression browser for performing ad hoc querying and debugging. However, the dashboard is not intuitive and can take a lot of work to properly configure.
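The expression browser is backed by the Prometheus HTTP API, so the same ad hoc queries can also be scripted. The sketch below assumes the port-forward above is running on 127.0.0.1:9090 and that the query returns an instant vector; build_query_url, extract_samples and run_query are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, expression):
    """Build the instant-query URL of the Prometheus HTTP API."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": expression})


def extract_samples(payload):
    """Return (labels, value) pairs from an instant-vector response."""
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error"))
    # Each series carries its labels and a [timestamp, "value"] pair.
    return [(series["metric"], float(series["value"][1]))
            for series in payload["data"]["result"]]


def run_query(base_url, expression):
    """Send the query; requires the port-forward to be active."""
    with urlopen(build_query_url(base_url, expression)) as response:
        return extract_samples(json.load(response))
```

For example, run_query("http://127.0.0.1:9090", "up") would list the scrape targets Prometheus knows about and whether each one is currently reachable.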

To have a better understanding of the metrics returned, we need to find a more user friendly dashboard, and for this we will be using Grafana.

Grafana

Grafana is an open source analytics and monitoring dashboard tool with polished visualizations that supports querying prometheus.

Grafana with Helm

In order to add Grafana to our cluster, we need to fetch the stable Grafana Chart maintained by the Kubernetes project, the command for this is:

$ helm fetch stable/grafana --untar

Next, we need to edit the values file found in grafana/values.yaml. Update the credentials and the storage configuration according to your requirements: choose more secure credentials and set the persistence size:

adminUser: admin
adminPassword: strongpassword
persistence:
  enabled: true
  size: 1Gi

The next configuration value is the data source which is commented out by default. First uncomment it as follows:

datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      url: <PROMETHEUS_URL_HERE>
      access: proxy
      isDefault: true

Then in the url field, add the prometheus url, which is the output of the following command:

$ echo http://$(kubectl get service --namespace default -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}").default.svc.cluster.local

Skipping the datasource step here will mean an extra step in setting up the data source in the UI.
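The URL printed by the command above follows the cluster-internal DNS pattern for services, <service>.<namespace>.svc.cluster.local. As a small sketch of that pattern, assuming the default cluster.local domain, the hypothetical service_url helper below builds the same string:

```python
def service_url(service_name, namespace="default", scheme="http"):
    """Build the cluster-internal DNS URL for a Kubernetes Service."""
    return "%s://%s.%s.svc.cluster.local" % (scheme, service_name, namespace)


if __name__ == "__main__":
    # 'messy-manta-prometheus-server' is a hypothetical service name; use the
    # name that kubectl reports for your own release.
    print(service_url("messy-manta-prometheus-server"))
```

This address only resolves from inside the cluster, which is why it works for Grafana's proxy access mode but not from a local browser.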

The chart can be deployed in our cluster using the stable repository version and not the one we downloaded. This can be done by:

$ helm install -f grafana/values.yaml stable/grafana

Instead of deploying the local Chart as we did with prometheus, this takes our values file and applies it against the grafana Chart in the stable repository. Both approaches are okay; which one you choose depends on your business requirements and operational best practices.
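The effect of passing a values file with -f can be pictured as layering our values on top of the defaults that ship with the chart. The sketch below illustrates that idea in Python; it is not Helm's actual code, and the merge_values helper and the sample default values are assumptions made for the example.

```python
import copy


def merge_values(defaults, overrides):
    """Return a new dict where overrides are layered on top of defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections are merged key by key rather than replaced.
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # Hypothetical chart defaults, loosely modelled on the grafana values above.
    chart_defaults = {
        "adminUser": "admin",
        "persistence": {"enabled": False, "size": "10Gi"},
    }
    our_values = {"persistence": {"enabled": True, "size": "1Gi"}}
    print(merge_values(chart_defaults, our_values))
```

Keys that our values file does not mention keep the chart's defaults, which is why a values file only needs to contain the settings we want to change.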

On deploying we get a similar output as with Prometheus i.e.:

NAME:   lopsided-jaguar
LAST DEPLOYED: Tue Apr 17 18:35:07 2018
NAMESPACE: default
STATUS: DEPLOYED
RESOURCES:
.
.
NOTES:
.
.

Here NAME is the release name. This is followed by a list of resources and notes concerning key features of the deployment. The notes include the command used to perform port forwarding in order to view the deployment from our localhost browser window.

There seems to be an error with the port forwarding command as described in the NOTES. If the port forwarding command fails, run the one below, replacing <RELEASE_NAME> with your release name:

$ export POD_NAME=$(kubectl get pods --namespace default -l "app=grafana,release=<RELEASE_NAME>" -o jsonpath="{.items[0].metadata.name}")
$ kubectl --namespace default port-forward $POD_NAME 3000

Navigate to 127.0.0.1:3000 in the local browser, where the grafana login screen is shown. Enter the credentials that were set in the grafana/values.yaml file. Once this is done you will end up in the following page:

As you can see, the datasource is configured based on the values.yaml file, so there is no need to set it up again. The next thing we need to do is import a Grafana dashboard. To do so, click the add (i.e. +) icon at the top left, which will show the drop down (within the red rectangle in the above image), and then click import. Alternatively, go to the following link http://127.0.0.1:3000/dashboard/import, where in the first input box:

Enter the value 3131, which is the unique identifier for the Kubernetes All Nodes prebuilt dashboard from the Grafana community site. Clicking anywhere outside the input field will lead to the following view displaying the dashboard options.

Set the prometheus data source to prometheus (shown in the red rectangle) and then click the import button. This should take you to the new dashboard with the basic metrics running similar to the one as shown below (give it a few minutes to start populating the data):

Kubernetes Dashboard

Application Level Monitoring

Now that we have set up cluster level monitoring, we can begin looking at how to set up application level monitoring. For this we will be using the django-prometheus package. Following the documentation, there are a few changes we need to make to our application. These include:

  • Update the requirements.txt file and add django-prometheus==1.0.13 to it.
  • Next we need to update the settings.py file as documented on the django-prometheus github page i.e:
INSTALLED_APPS = (
    ...
    'django_prometheus',
    ...
)

MIDDLEWARE_CLASSES = (
    'django_prometheus.middleware.PrometheusBeforeMiddleware',
    # All your other middlewares go here, including the default
    # middlewares like SessionMiddleware, CommonMiddleware,
    # CsrfViewMiddleware, SecurityMiddleware, etc.
    'django_prometheus.middleware.PrometheusAfterMiddleware',
)
  • Afterwards we need to update the urls.py file as follows:
urlpatterns = [
    ...
    url('', include('django_prometheus.urls')),
]
  • Next we need to rebuild our docker image. If you have been following the tutorials until now, make sure the TAG is different from what was previously used, otherwise the change will not be picked up when you redeploy. From previous tutorials, the command is:
$ docker build -t <IMAGE_NAME>:<TAG> .
  • We can then update our deployment with the new <IMAGE_NAME>:<TAG> by changing the image field on django/deployment.yaml:
image: <IMAGE_NAME>:<TAG>
  • The next thing we need to do is update the ./django/service.yaml file according to:

The change is the addition of prometheus.io/scrape: "true" under annotations, which allows prometheus to monitor our service (we won’t go into too much detail here).

  • The changes we made to the deployment and service need to be redeployed to the cluster:
$ kubectl apply -f django/
  • To confirm this is working we can use minikube to show us a version of our webpage:
$ minikube service django-service

This should open our django application in a new browser window. To see the data that is being monitored, navigate to the /metrics endpoint, where you will see the metrics that will be scraped by the prometheus server.
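The /metrics page is plain text, with one sample per line and comment lines starting with #. To pick out only the Django-related samples, a small parser is enough. The sketch below is a simplification: it assumes each sample line ends with the value and has no trailing timestamp, and the metric names in the sample text are hypothetical stand-ins rather than the exact names django-prometheus exports.

```python
def parse_metrics(text, prefix="django_"):
    """Return {series: value} for sample lines whose name starts with prefix."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines starting with '#' are HELP/TYPE comments, not samples.
        if not line or line.startswith("#"):
            continue
        # Assumption: the value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        if series.startswith(prefix):
            samples[series] = float(value)
    return samples


# Hypothetical excerpt of a /metrics response, for illustration only.
SAMPLE_METRICS = """# HELP django_http_requests_total Hypothetical request counter
# TYPE django_http_requests_total counter
django_http_requests_total{method="GET"} 12.0
process_cpu_seconds_total 0.42
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE_METRICS).items():
        print(series, value)
```

Passing a different prefix, such as "process_", selects the non-Django metrics that are exposed on the same endpoint.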

Unfortunately I wasn’t able to find a working Django Grafana dashboard, and building one is out of the scope of this tutorial. However, this is a good time to show how to view metrics in the prometheus expression browser. To view the metrics, execute a port-forward according to the notes provided on deploying prometheus with helm and navigate to the url provided; in my case it’s http://127.0.0.1:9090/. The following view should be shown:

From the dropdown enclosed in the red rectangle, select any django metric and click execute to see that the django metrics are indeed being scraped by prometheus.

Conclusion

In this tutorial we have seen how to add Prometheus and Grafana to our project using Helm to conduct basic monitoring. However, this is a small sliver of what they can do; to find out more, read the Prometheus, Grafana and Helm documentation.

Now that monitoring is in place, the next tutorial will look at collecting logging information.

Issues

At some point in writing this tutorial, Grafana wasn’t detecting the prometheus data. Completely deleting all helm Charts and Kubernetes resources, then restarting minikube and performing all the installation steps again caused it to work.

Tutorial Series

  • Part 2: Deployed a basic Django application into a local Kubernetes cluster.
  • Part 3: Integrated a Postgres database and ran migrations.
  • Part 4: Added Celery for asynchronous task processing with Redis as a message broker as well as a cache backend.
  • Part 5: Deployed to AWS using Kops with RDS Postgres