Multi-site monitoring with HA and dynamic scale using VictoriaMetrics: A practical guide

Amit Karni
Israeli Tech Radar
Jul 30, 2022 · 14 min read

Why VictoriaMetrics?

In my previous article, I explained why I believe VictoriaMetrics can and should take over from Prometheus, along with useful details about the VictoriaMetrics components.

VictoriaMetrics' microservice architecture is more effective than Prometheus' monolithic design.
Compared to the Prometheus stack, VictoriaMetrics offers better performance and data compression.
Scaling is also simple thanks to the separate components; most of them are stateless, so the cluster can be designed to run on Spot nodes.

Main challenges with Prometheus that VictoriaMetrics solves

  • Monitoring systems with Prometheus requires significant engineering time and attention across many resources to maintain each workload.
  • Prometheus is designed to scale vertically only. As a result, compute costs increase.
  • It can be difficult to maintain stability at large scales.
  • Since Prometheus requires deploying a full, heavy, stateful application for each workload, the overall costs are high.
  • Since Prometheus is not highly available by design, there is always a single point of failure.

Article goal & Overview

I decided to write this article after I got several “how to” questions about my previous post.

The goal of this article is to show how to design and deploy a multi-site VictoriaMetrics cluster architecture on Kubernetes that runs on Spot and On-demand nodes and achieves high availability, dynamic scalability, high performance, and cost savings.
Solving the Prometheus scalability and operational challenges listed above lets us build a reliable, dynamic, and cost-effective monitoring platform.

High-level architecture

Description of the architecture

  • vmagent scrapes the workloads (k8s clusters in our case), adds relevant labels if configured (to tell apart the same metrics coming from different sources/k8s clusters, I added a "cluster" name label), and sends the data to vminsert using the remote_write protocol.
  • vminsert accepts the ingested data and spreads it among the vmstorage pods.
    Since it is stateless, we can run vminsert on Spot nodes and configure it with a Horizontal Pod Autoscaler.
  • vmstorage stores the raw data and returns the queried data for the given time range and label filters. It is the only stateful component in the cluster, so it must run on on-demand nodes; it cannot use HPA, only the Vertical Pod Autoscaler.
  • vmselect queries the data from all the configured vmstorage pods.
    Since it is stateless, we can run vmselect on Spot nodes and configure it with a Horizontal Pod Autoscaler.
  • vmalert runs on both clusters for redundancy in alert configurations.
  • Alertmanager is deployed in cluster mode between both zones for high availability.
  • With Grafana, we can view all the workloads' metrics in one place using a single datasource: vmselect.
    To filter metrics based on k8s-cluster names, I added the "cluster" label.

*Note*
I am currently running this design in my company's production environment with six workloads, and it's working great!

Practical guide

Helm charts

For both the VictoriaMetrics cluster deployments and the VMagent/workload-connection deployments, I used the official "victoria-metrics-k8s-stack" chart (with different values).

It allows us to deploy the VictoriaMetrics Kubernetes stack all at once, or just parts of it.
It includes VMcluster, VMagent, VMRules, the VictoriaMetrics Operator, ServiceScrapes, exporters, Alertmanager, and Grafana with relevant dashboards.

VictoriaMetrics documentation provides all the information you need to understand the k8s-stack chart.

How I customized the values

For this post, I created this Github repo: https://github.com/Amitk3293/VictoriaMetrics-MultiSite-HA-Cluster

The goal of this repository is to show how I edited the Helm chart values for VictoriaMetrics-k8s-stack in order to create the design above using Helm only.

The folder "VictoriaMetrics-cluster-k8s-stack" in my repo contains the entire chart files (for convenience only) and two separate values files:

  1. values-main-cluster.yaml for the main VictoriaMetrics cluster
  2. values-2nd-cluster.yaml for the additional VictoriaMetrics cluster

The main differences between the two are in the URLs (Ingresses, remoteWrite.url, etc.), and the 2nd cluster will NOT deploy Grafana.
For example, the ingress of vminsert will be:
for the main cluster -
http://vminsert.domain.com
for the 2nd cluster -
http://2nd-vminsert.domain.com

For a better understanding of the changes, I suggest you clone the original values file and compare it with my customized values file.

Below are the main examples of the customizations I made to the values:

Choose VMcluster and not VMsingle

  • Disable VMsingle and enable VMcluster:
    We want to deploy the VMcluster version and not the VMsingle version.
Disable VMsingle on both values files
Enable VMcluster on both values files
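
A minimal sketch of that toggle, assuming the standard victoria-metrics-k8s-stack values layout (verify the key names against your chart version):

vmsingle:
  enabled: false
vmcluster:
  enabled: true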

VMcluster spec:

  • Each cluster component (VMstorage, VMselect, VMinsert) has three replicas configured with Ingress and a Horizontal Pod Autoscaler (except VMstorage, which doesn't use HPA).

Spot nodes (optional):

  • It is possible to run VMselect & VMinsert on Spot nodes and save a lot of money.
    To ensure that ONLY the VMinsert & VMselect Kubernetes deployments are scheduled on Spot nodes, we need to add "taints" to the Spot node pool, along with matching "tolerations" and a "nodeSelector" on the VMinsert & VMselect deployments.
    Using this method, we can manage pod scheduling easily, and we won't need to explicitly pin other Kubernetes deployments to on-demand nodes.
  • Make sure your Spot node-pool is configured with “taints”
    For example:
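A taint like the following works; the "dedicated=spot" key/value pair here is just an illustration, so use whatever matches your node pool (cloud providers usually let you set taints on the node pool itself):

kubectl taint nodes <spot-node-name> dedicated=spot:NoSchedule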
VMstorage Spec on both clusters values files:
VMselect & VMinsert Spec on both clusters values files: HPA + NodeSelector & tolerations
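
A rough sketch of what those specs can look like (field names follow the VMCluster CRD exposed through the chart's vmcluster.spec; the replica counts, storage size, node label, and taint key are placeholder assumptions):

vmcluster:
  enabled: true
  spec:
    retentionPeriod: "12"            # retention, in months; adjust to your needs
    replicationFactor: 2
    vmstorage:
      replicaCount: 3                # stateful: on-demand nodes only, no HPA
      storage:
        volumeClaimTemplate:
          spec:
            resources:
              requests:
                storage: 50Gi        # placeholder size
    vmselect:
      replicaCount: 3
      nodeSelector:
        node-type: spot              # placeholder label on the Spot pool
      tolerations:
        - key: dedicated             # must match the Spot-pool taint
          operator: Equal
          value: spot
          effect: NoSchedule
      hpa:
        minReplicas: 3
        maxReplicas: 10
    vminsert:
      replicaCount: 3                # same nodeSelector/tolerations/hpa pattern as vmselect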

Ingress enabled:

  • To be able to get metrics from workloads in other sites, you must have the vminsert Ingress enabled and working.
    For the other components, you can decide which ones need an Ingress; I enabled them all.
VMinsert Ingress enabled. Each values file will have a unique Ingress hostname
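
A hedged sketch of the relevant values (the block layout follows the chart's vmcluster.ingress section; check your chart version):

vmcluster:
  ingress:
    insert:
      enabled: true
      hosts:
        - vminsert.domain.com        # 2nd-vminsert.domain.com in values-2nd-cluster.yaml
    select:
      enabled: true
      hosts:
        - vmselect.domain.com
    storage:
      enabled: true                  # optional; I enabled all of them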

Alertmanager cluster-mode

  • As part of deploying Alertmanager in cluster mode, we have to expose Alertmanager on port 9094 (the Ingress defaults to port 9093, the Alertmanager web interface).
    For this purpose, I configured the Alertmanager Ingress host with the extra path "/cluster" that points at port 9094 on the Alertmanager service.
  • As a second step, we add the "additionalPeers" URL, which is the Ingress extra path we exposed that points to the Alertmanager service on port 9094 (if you don't want to use cluster mode in Alertmanager, just comment it out).
  • externalURL should point to the Alertmanager web-UI URL so that the links and buttons in the Alertmanager Slack config work correctly.
Alertmanager Ingress + extra path for cluster-mode
Alertmanager cluster-mode + externalURL
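
A hedged sketch of those values; the extraPaths backend syntax depends on your Ingress apiVersion, and the service name and peer address are placeholders for my setup:

alertmanager:
  enabled: true
  ingress:
    enabled: true
    hosts:
      - alertmanager.domain.com
    extraPaths:
      - path: /cluster
        backend:
          serviceName: vmalertmanager-vmetrics         # placeholder Alertmanager service name
          servicePort: 9094                            # cluster/gossip port
  spec:
    externalURL: http://alertmanager.domain.com        # web-UI URL used for Slack buttons
    additionalPeers:
      - 2nd-alertmanager.domain.com:9094               # placeholder: the peer exposed by the other cluster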

Alertmanager config

  • Alert configuration based on the cluster label:
    We need to be able to tell which metric value or which alert belongs to which workload, since we'll get the same metrics with different values across workloads.
    For that I created:
    1. Slack alerts grouped by the cluster label.
    2. Alert routing to Slack based on the cluster label.
    3. An extra template to display the cluster label in all Slack alerts.
Alertmanager config: alert routing based on the cluster label
Extra template to include the "cluster" label in Slack-webhook alerts
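
As an illustration (the receiver names, workload name, and webhook URLs are placeholders; the config block follows standard Alertmanager syntax, with "matchers" requiring Alertmanager v0.22+):

alertmanager:
  config:
    route:
      group_by: ["alertname", "cluster"]
      receiver: slack-default
      routes:
        - matchers:
            - cluster = "devapps-ml"                   # placeholder workload name
          receiver: slack-devapps-ml
    receivers:
      - name: slack-default
        slack_configs:
          - api_url: https://hooks.slack.com/services/XXXX   # placeholder webhook
            title: '{{ .CommonLabels.alertname }} [{{ .CommonLabels.cluster }}]'
            text: 'cluster: {{ .CommonLabels.cluster }}'     # surfaces the cluster label in every alert
      - name: slack-devapps-ml
        slack_configs:
          - api_url: https://hooks.slack.com/services/YYYY   # placeholder webhook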

VMagent:

  • For syncing and sending data between the two clusters, I added additionalRemoteWrites, which points to the other cluster's vminsert Ingress URL.
    Note that since remoteWrite.url ingests data via the Prometheus remote write API, we append insert/0/prometheus/api/v1/write
    to the vminsert URL:
    http://2nd-vminsert.domain.com/insert/0/prometheus/api/v1/write
  • VMagent relabeling is helpful since we have a number of workloads and only one datasource; it lets us tell which metrics or alerts belong to which workload.
    In each VMagent we add externalLabels with the label
    cluster=<cluster-name>
VMagent additionalRemoteWrites + externalLabels
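
A hedged sketch of the relevant vmagent values (the cluster name is a placeholder; additionalRemoteWrites is a chart-level value appended to the generated remoteWrite list, so verify against your chart version):

vmagent:
  enabled: true
  spec:
    externalLabels:
      cluster: main-cluster          # placeholder; set a unique name per cluster
  additionalRemoteWrites:
    - url: http://2nd-vminsert.domain.com/insert/0/prometheus/api/v1/write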

Grafana

  • As a dependent chart, Grafana can be customized according to the official Grafana helm chart values.
  • I upgraded the values to deploy the latest Grafana + sidecar images.
  • I enabled multicluster to have the Grafana dashboards filtered by the cluster label.
  • I added PVC to make Grafana persistent.
Grafana latest images-tags + multicluster enabled + PVC
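
A hedged sketch of those Grafana values (the image tags and PVC size are placeholders; the multicluster flag is the chart key as I recall it, so verify against your version):

grafana:
  enabled: true
  image:
    tag: "9.0.5"                     # placeholder "latest" Grafana tag at the time
  sidecar:
    image:
      tag: "1.19.2"                  # placeholder sidecar tag
    dashboards:
      enabled: true
      multicluster: true             # makes the bundled dashboards filterable by cluster
  persistence:
    enabled: true
    size: 10Gi                       # placeholder PVC size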

Some other values:

  • As a dependent chart, Prometheus Node-Exporter can be customized according to the official Prometheus Node-exporter Helm chart values.
    Consider disabling it in the chart if you plan to deploy to a cluster that already runs Prometheus Node-exporters.
  • There are many values that can be customized based on your Kubernetes cluster or your preferences; for example, I disabled K8S-coreDNS monitoring and enabled K8S-kubeDNS monitoring.
    Go over the chart and define it as you wish.
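
For illustration, those toggles look roughly like this (the key names follow the chart's values layout as I recall it; verify against your version):

prometheus-node-exporter:
  enabled: false                     # disable if the target cluster already runs node-exporter
coreDns:
  enabled: false                     # I disabled coreDNS monitoring
kubeDns:
  enabled: true                      # ...and enabled kubeDNS monitoring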

Important Note

  • I have shown examples from the values-main-cluster.yaml file, which is for the 1st VictoriaMetrics cluster. For the second cluster, we have to modify values-2nd-cluster.yaml and change the IPs and URLs used for syncing between the two clusters.

Let's start with the installation

Prerequisites

Optional:

  • 2 node pools in each Kubernetes cluster: 1 for Spot nodes (with the relevant taints) and 1 for on-demand nodes.

Installation

  • Add the Helm repositories, including dependencies:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update

1st Kubernetes cluster

  • Connect to the 1st Kubernetes cluster
  • Create a new namespace
k create ns vm
  • Go to ./VictoriaMetrics-MultiSite-HA-Cluster/VictoriaMetrics-cluster-k8s-stack and run Helm install using values-main-cluster.yaml as below.
helm install vmetrics vm/victoria-metrics-k8s-stack -f values-main-cluster.yaml -n vm
  • Run kubectl get pods to verify that all the pods are running successfully
amitk@pop-os VictoriaMetrics-cluster-k8s-stack git:(master) ✗ k get pods -n vm         
NAME READY STATUS RESTARTS AGE
vmagent-vmetrics-victoria-metrics-k8s-stack-74d4594ffd-txmnb 2/2 Running 0 3d4h
vmalert-vmetrics-victoria-metrics-k8s-stack-7bdb59cf58-9snsr 2/2 Running 0 3d4h
vmalertmanager-vmetrics-victoria-metrics-k8s-stack-0 2/2 Running 0 2d23h
vmetrics-grafana-576ddc6c96-xmhrv 3/3 Running 0 5h24m
vmetrics-kube-state-metrics-69b8795b77-d2zbk 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-gfkdz 1/1 Running 0 17d
vmetrics-prometheus-node-exporter-j8css 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-v4jgv 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-wbxkb 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-zjclk 1/1 Running 0 21d
vmetrics-victoria-metrics-operator-7d5984fb9b-rvqgp 1/1 Running 3 (11d ago) 21d
vminsert-vmetrics-victoria-metrics-k8s-stack-6f75754b76-sj5sq 1/1 Running 0 3d4h
vminsert-vmetrics-victoria-metrics-k8s-stack-6f75754b76-wzr87 1/1 Running 0 3d4h
vminsert-vmetrics-victoria-metrics-k8s-stack-6f75754b76-xjndp 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-0 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-1 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-2 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-3 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-4 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-0 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-1 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-2 1/1 Running 0 3d4h

2nd Kubernetes cluster

  • Connect to the 2nd Kubernetes cluster
  • Create a new namespace
k create ns vm
  • Go to ./VictoriaMetrics-MultiSite-HA-Cluster/VictoriaMetrics-cluster-k8s-stack and run Helm install using values-2nd-cluster.yaml as below.
helm install vmetrics vm/victoria-metrics-k8s-stack -f values-2nd-cluster.yaml -n vm 
  • Run kubectl get pods to verify that all the pods are running successfully
amitk@pop-os VictoriaMetrics-cluster-k8s-stack git:(master) ✗ k get pods -n vm         
NAME READY STATUS RESTARTS AGE
vmagent-vmetrics-victoria-metrics-k8s-stack-74d4594ffd-zcgas 2/2 Running 0 3d4h
vmalert-vmetrics-victoria-metrics-k8s-stack-7bdb59cf58-weras 2/2 Running 0 3d4h
vmalertmanager-vmetrics-victoria-metrics-k8s-stack-0 2/2 Running 0 2d23h
vmetrics-kube-state-metrics-69b8795b77-d2zbk 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-qtsuv 1/1 Running 0 17d
vmetrics-prometheus-node-exporter-p8u6a 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-95qer 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-az245 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-sdaa1 1/1 Running 0 21d
vmetrics-victoria-metrics-operator-1a22d5a89-asdqew 1/1 Running 3 (11d ago) 21d
vminsert-vmetrics-victoria-metrics-k8s-stack-87s564d89a-sj5sq 1/1 Running 0 3d4h
vminsert-vmetrics-victoria-metrics-k8s-stack-87s564d89a-zca57 1/1 Running 0 3d4h
vminsert-vmetrics-victoria-metrics-k8s-stack-87s564d89a-gsadw 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-0 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-1 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-2 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-3 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-4 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-0 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-1 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-2 1/1 Running 0 3d4h

Results

Grafana Explore + Dashboards + VictoriaMetrics cluster stats

  • After installing the VictoriaMetrics cluster, we can log in to our Grafana with our admin password ( k get secret -n vm vmetrics-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo )
  • Using the VM-k8s-stack Helm charts, we have already deployed some valuable dashboards.
    One of them is VictoriaMetrics-cluster.
    It contains useful VictoriaMetrics cluster metrics, which is helpful if you need to follow or debug your cluster.
    The Stats panel also displays all the VM instances from both k8s clusters that are part of the VM cluster, along with other useful data.
  • Keep in mind that since we are working with only one datasource, using the cluster label to filter k8s-cluster and workload metrics is necessary; sometimes it is also necessary to adjust other dashboards' queries.
  • Here are 2 excellent Kubernetes dashboards that I adapted from the Grafana website and modified to filter by the cluster label:
    1. K8S Fully overview by Exporter and cluster label
    2. K8S Fully overview by kubelet and cluster label
Filter by cluster label
K8S overall monitor by cluster label
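
For reference, the pattern I used is a dashboard variable populated from the cluster label, plus panel queries filtered by that variable; the metric in the panel query below is just an example:

Dashboard variable "cluster": label_values(up, cluster)

Panel query filtered by the variable:
sum(rate(container_cpu_usage_seconds_total{cluster=~"$cluster"}[5m])) by (namespace)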

Validate Alertmanager cluster-mode

  • Check the Alertmanager web-UI Status page to see if the system is in cluster mode.
Alertmanager cluster-mode
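
Alternatively, you can query the Alertmanager API (the hostname is the example from above); a healthy mesh reports status "ready" with peers from both sites:

curl -s http://alertmanager.domain.com/api/v2/status | jq '.cluster'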

Connect workloads using VMagent

  • To send metrics from any workload to our VictoriaMetrics cluster, we can deploy the VMoperator and VMagent on the workload.
  • To do this, I use the same VictoriaMetrics-k8s-stack Helm charts I used for the cluster deployments, BUT I disabled all components other than the VMoperator and VMagent.
  • In the Helm chart values, I edited the remoteWrite.url in VMagent to point to both k8s clusters' VMinsert Ingress URLs, and set the cluster label according to the cluster name.

How I customized the values

The folder "VMagent-only-k8s-stack" in my repo contains the entire chart files (for convenience only) and the values-vmagent-only.yaml file.
In the values-vmagent-only.yaml file we can see that:

  • All the components are disabled except the VMoperator, VMagent & kubelet/Node-exporters (optional).
  • VMagent is configured with "additionalRemoteWrites" to send metrics to both VMinsert URLs/Ingresses of the VictoriaMetrics clusters.
  • The cluster label is configured with the workload name, as shown in the sketch after this list.
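
A condensed, hedged sketch of values-vmagent-only.yaml under those assumptions (the toggle key names follow the chart's values layout; the workload name and URLs are the examples used throughout):

vmsingle:
  enabled: false
vmcluster:
  enabled: false
vmalert:
  enabled: false
alertmanager:
  enabled: false
grafana:
  enabled: false
vmagent:
  enabled: true
  spec:
    externalLabels:
      cluster: devapps-ml            # the workload name
  additionalRemoteWrites:
    - url: http://vminsert.domain.com/insert/0/prometheus/api/v1/write
    - url: http://2nd-vminsert.domain.com/insert/0/prometheus/api/v1/write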

Installation

Connect Kubernetes workload cluster

  • Connect to the required Kubernetes cluster
  • Create a new namespace
k create ns monitoring
  • Go to ./VictoriaMetrics-MultiSite-HA-Cluster/VMagent-only-k8s-stack and run Helm install using values-vmagent-only.yaml as below.
helm install vm-devapps-ml vm/victoria-metrics-k8s-stack -f values-vmagent-only.yaml -n monitoring
  • Run kubectl get pods to verify that all the pods are running successfully
amitk@pop-os ~ k get pods -n monitoring 
NAME READY STATUS RESTARTS AGE
vm-devapps-ml-kube-state-metrics-8656556777-v24tp 1/1 Running 0 16d
vm-devapps-ml-prometheus-node-exporter-cxrzs 1/1 Running 0 16d
vm-devapps-ml-prometheus-node-exporter-pnmj6 1/1 Running 0 23d
vm-devapps-ml-prometheus-node-exporter-zxt8j 1/1 Running 0 3h6m
vm-devapps-ml-victoria-metrics-operator-b68bb4cd8-fg7sz 1/1 Running 0 16d
vmagent-vm-devapps-ml-victoria-metrics-k8s-stack-59f57b79db-jpgsf 2/2 Running 0 8d

VMagent dashboard

  • Using the VM-k8s-stack Helm charts, we have already deployed some valuable Grafana dashboards; one of them is VMagent.
    The VMagent dashboard exposes useful metrics that help when you need to follow or debug your workload connections.
    As we can see below, in my case I'm getting metrics from 6 different workloads using VMagent, and I can filter based on the workload (the job variable).
VMagent dashboard + 6 workloads

Summary & Achievements

To sum up what we've done, we created a multi-site VictoriaMetrics deployment with the following benefits:

  1. High Availability & Fault tolerance: The VictoriaMetrics cluster is multi-site, so even if one cluster goes down, the other cluster will continue to operate.
    Moreover, VictoriaMetrics has built-in redundancy and auto-healing features for each component.
  2. High performance & Dynamic scaling: Most of the VMcluster components, like VMinsert and VMselect, scale horizontally based on resource consumption through Horizontal Pod Autoscaling.
    Moreover, VictoriaMetrics by design delivers better performance than Prometheus.
  3. Cost-effective: We can run most of the VMcluster components, such as VMselect and VMinsert, on Spot nodes, which is very cost-effective.
    Moreover, compared to Prometheus, VictoriaMetrics consumes fewer resources and compresses data better, which saves us money in both compute and storage.
  4. All-in-one data & management: It gives us the option to store all the metrics/data from many workloads in one time-series DB.
    In addition, we manage the whole monitoring setup from a single platform/datasource.

It is then possible to connect any workload using only a tiny component such as VMagent:

  1. VMagent is fast and light on compute resources.
  2. With the remote_write protocol, it can collect metrics from various sources and send them to the VictoriaMetrics cluster.
  3. Configuration, connecting workloads, and ongoing management are all easy.

Amit Karni
Israeli Tech Radar

Senior DevOps Engineer with a passion for keeping up with new technologies. https://www.linkedin.com/in/amit-karni