Monitor Kubernetes Cluster using New Relic
Hi Reader, if you have landed here, you are probably looking to understand the strengths of New Relic's monitoring and alerting capabilities, or you are interested in learning something new.
Excellent! What better way to start than monitoring a Kubernetes cluster and getting alerted every time something goes wrong?
Let us get started. But first, what the heck is New Relic?
New Relic is an observability platform that helps you build better software. You can bring in data from any digital source so that you can fully understand your system and know how to improve it.
With New Relic, you can:
- Bring all your data together: Instrument everything and import data from across your technology stack using New Relic agents, integrations, and APIs, and access it from a single UI.
- Analyze your data: Get all your data at your fingertips to find the root causes of problems and optimize your systems. Build dashboards and charts or use the powerful query language NRQL.
- Respond to incidents quickly: New Relic’s machine learning solution proactively detects and explains anomalies and warns you before they become problems.
Let us put this into practice and monitor a Kubernetes cluster today.
Assuming you are new to New Relic, let us start by creating a New Relic account. If you already have one, feel free to skip this part :)
Get started with New Relic
STEP 1: Follow the link and create a new account. It is free, forever!!
Enter your name, email and click on Start Now.
STEP 2: You will receive an email asking you to verify your address. Click on verify email and set your password.
Select one of the two regions where you want your data stored and click Save.
STEP 3: You will now land on the installation plan page. Select Kubernetes.
Now click on Begin Installation.
STEP 4: Assuming you have a Kubernetes cluster available, enter your cluster name in the placeholder and click continue. You may change the namespace if you want to install the New Relic agents in a different namespace.
Select the data you want to gather from the Kubernetes cluster according to your use case and click continue.
New Relic offers different ways to install its agents on the k8s cluster: either with Helm or directly from manifest files. I chose Helm, but you may deploy the manifest files directly if you wish.
Copy the command and log in to your Kubernetes cluster.
$ helm repo add newrelic https://helm-charts.newrelic.com && helm repo update && \
kubectl create namespace newrelic ; helm upgrade --install newrelic-bundle newrelic/nri-bundle \
--set global.licenseKey=<your license key> \
--set global.cluster=my-cluster \
--namespace=newrelic \
--set newrelic-infrastructure.privileged=true \
--set global.lowDataMode=true \
--set ksm.enabled=true \
--set kubeEvents.enabled=true
Once this has executed successfully, run the command below to check that all the agents are installed.
$ kubectl get pods -n newrelic
Wait until all the pods are up and running.
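If you would rather script this check than eyeball the pod list, here is a minimal Python sketch. It assumes you feed it the JSON printed by `kubectl get pods -n newrelic -o json`. The helper name `pods_not_running` and the sample pod names are made up for illustration. Note that a pod in the Running phase is not necessarily Ready, so treat this as a rough first check only.

```python
import json


def pods_not_running(pod_list_json):
    """Return names of pods whose phase is neither Running nor Succeeded.

    Expects the JSON text printed by `kubectl get pods -n newrelic -o json`.
    """
    pod_list = json.loads(pod_list_json)
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") not in ("Running", "Succeeded")
    ]


# Hand-written sample standing in for real kubectl output (pod names are hypothetical).
sample = json.dumps({
    "items": [
        {"metadata": {"name": "example-ksm-0"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "example-infra-abc"}, "status": {"phase": "Pending"}},
    ]
})

print(pods_not_running(sample))  # ['example-infra-abc']
```

An empty list means every pod has at least reached the Running (or Succeeded) phase.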
Now go back to your New Relic UI and click continue. Wait for 2–3 minutes and you should see “We are successfully receiving data from your cluster. 🎉”
If you see this, congratulations! You have integrated your Kubernetes cluster with your New Relic account. Now click on Kubernetes cluster explorer.
All the information about your cluster is visible as below:
Click on Control plane to see the core components being monitored, and click on Events to see everything that is happening and recorded as infrastructure events. Jobs and CronJobs are difficult to monitor with conventional monitoring approaches. For example, Prometheus requires an additional Pushgateway setup to scrape those metrics. With the Kubernetes integration in New Relic this can be achieved, because Job activity is recorded as events from the Kubernetes cluster.
NRQL is New Relic's SQL-like query language. You can use NRQL to retrieve detailed New Relic data and get insight into your applications, hosts, and business-important activity.
Click on Explorer -> browse data -> Events
Then switch to Query builder to query the data using New Relic Query Language.
Let's work through an example: getting all the pods that are not in the Running state. Use the query below to get the desired results.
SELECT podName FROM K8sPodSample WHERE clusterName = 'my-cluster' AND status != 'Running'
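The same NRQL can also be run outside the UI. The sketch below is one way to do it from Python through NerdGraph, New Relic's GraphQL API. It assumes the US-region endpoint, a user API key sent in the `API-Key` header, and a made-up account ID. The helper names `build_nrql_request` and `run_nrql` are mine, not part of any New Relic SDK. Only the request body is built and printed here; the network call is defined but not executed.

```python
import json
import urllib.request

# Assumption: US-region NerdGraph endpoint; EU accounts use a different host.
NERDGRAPH_URL = "https://api.newrelic.com/graphql"


def build_nrql_request(account_id, nrql):
    """Build the JSON body that asks NerdGraph to run one NRQL query."""
    # json.dumps produces a double-quoted, escaped string literal for the query.
    graphql = "{ actor { account(id: %d) { nrql(query: %s) { results } } } }" % (
        account_id,
        json.dumps(nrql),
    )
    return {"query": graphql}


def run_nrql(api_key, account_id, nrql):
    """Send the query and return the list of result rows (needs network access)."""
    request = urllib.request.Request(
        NERDGRAPH_URL,
        data=json.dumps(build_nrql_request(account_id, nrql)).encode("utf-8"),
        headers={"Content-Type": "application/json", "API-Key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["data"]["actor"]["account"]["nrql"]["results"]


query = (
    "SELECT podName FROM K8sPodSample "
    "WHERE clusterName = 'my-cluster' AND status != 'Running'"
)
body = build_nrql_request(1234567, query)  # 1234567 is a placeholder account ID
print(json.dumps(body, indent=2))
```

To actually fetch results you would call `run_nrql("<your user key>", <your account id>, query)` with your own credentials.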
You can now add this to a dashboard by clicking on the Add to Dashboard button at the bottom.
You can also create an alert and get notified to act upon the pod failure.
Alerts and Notification channel
We need a notification channel to send alerts.
STEP 1: Click on the create alert option below the query. Enter a condition name and scroll down. You will notice an error as below:
The error appears because an alert condition expects the query to return a numeric result; otherwise the query is considered invalid. So we will modify our query to return a number.
SELECT count(*) FROM K8sPodSample WHERE clusterName = 'my-cluster' AND status != 'Running'
Notice that I have replaced podName with count(*).
The threshold states that a violation should open if the query returns a value above 1 for at least 5 minutes.
Note that a violation does not mean that an alert is triggered.
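To make the threshold rule concrete, here is a small Python sketch of the "above 1 for at least 5 minutes" logic. It is a simplified model that assumes one query result per minute and is not how New Relic evaluates conditions internally. The function name `violation_opens` is hypothetical.

```python
def violation_opens(values_per_minute, threshold=1, duration_minutes=5):
    """Return the index of the minute at which a violation would open, or None.

    A violation opens once the value has stayed above `threshold` for
    `duration_minutes` consecutive minutes.
    """
    streak = 0
    for minute, value in enumerate(values_per_minute):
        # Any reading at or below the threshold resets the streak.
        streak = streak + 1 if value > threshold else 0
        if streak >= duration_minutes:
            return minute
    return None


# Two non-running pods for five minutes in a row (minutes 4 to 8): opens at minute 8.
print(violation_opens([0, 2, 3, 0, 2, 2, 2, 2, 2]))  # 8

# Only four minutes above the threshold before recovering: no violation.
print(violation_opens([2, 2, 2, 2, 0, 2]))  # None
```

A short blip above the threshold resets nothing permanently; only an unbroken five-minute run opens a violation in this model.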
Now scroll down to “Connect your condition to a policy” and choose the Kubernetes default alert policy from the existing policies. That's it. Click on Save condition.
You will see a popup message stating that your condition is saved.
STEP 2: Now create a new channel to receive alerts. Click on Explorer -> Alerts & AI -> Alerts(classic) -> Channels
Click on create new notification channel at the top right and select the channel on which you want to receive notifications. I will choose Email.
Add details and click on Create channel.
Once you click on Create channel, you will see a Send a test notification option. Click on it and you will receive a notification at the email address you entered.
STEP 3: Switch to Alert policies at the top and add the Kubernetes default alert policy to this notification channel. The policy is now linked to the notification channel, so every alert condition in this policy has a channel to send notifications to.
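If the relationship between conditions, policies and channels feels abstract, the toy Python model below may help. It is only an illustration of the linkage described above, not New Relic's data model. The class, function, condition and channel names are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """A policy groups alert conditions and the channels that get notified."""
    name: str
    conditions: list = field(default_factory=list)
    channels: list = field(default_factory=list)


def recipients_for(policy, condition):
    """Channels notified when `condition` violates, if it belongs to `policy`."""
    return list(policy.channels) if condition in policy.conditions else []


policy = Policy("Kubernetes default alert policy")
policy.conditions.append("Pods not running")        # hypothetical condition name

# No channel linked yet: the condition has nowhere to send a notification.
print(recipients_for(policy, "Pods not running"))   # []

policy.channels.append("email: ops@example.com")    # hypothetical channel
print(recipients_for(policy, "Pods not running"))   # ['email: ops@example.com']
```

Any condition added to the policy later would reach the same channel, which is why the link is made at the policy level rather than per condition.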
If you have made it this far, thank you for reading, and congratulations! We have just configured New Relic to monitor a Kubernetes cluster and set up alerts and notifications so that problems can be acknowledged and acted on as soon as possible.
Hope you liked it. Happy reading!