Multi-site monitoring with HA and dynamic scaling using VictoriaMetrics: a practical guide
Why VictoriaMetrics?
In my previous article, I explained my reasoning for believing VictoriaMetrics will and should take over Prometheus, as well as useful details about VictoriaMetrics components.
The microservice architecture in VictoriaMetrics appears to be more effective than that in Prometheus.
In comparison to the Prometheus stack, VictoriaMetrics offers superior performance and data compression.
Scaling is also simple thanks to the separate components: most of them are stateless, so they can be designed to run on Spot nodes.
Main challenges with Prometheus that VictoriaMetrics solves
- Monitoring systems with Prometheus requires significant engineering time and attention to many resources to maintain each workload.
- Prometheus is designed to scale vertically only. As a result, compute costs increase.
- It can be difficult to maintain stability at large scales.
- Since Prometheus requires deploying a full and heavy stateful application for each workload, the overall costs are too high.
- Since Prometheus is not highly available by design, there is always a single point of failure.
Article goal & Overview
I decided to write this article after I got several “how to” questions about my previous post.
The goal of this article is to guide how to design and deploy a multi-site VictoriaMetrics cluster architecture on Kubernetes that runs on Spot and On-demand nodes, and achieves high availability, dynamic scalability, high performance, and cost savings.
In addition, solving the scalability and operational challenges of Prometheus above will allow users to create a reliable, dynamic, and cost-effective monitoring platform.
High-level architecture
Description of the architecture
- `vmagent` is going to scrape the workloads (k8s clusters in our case), add relevant labels if configured, and send the data to `vminsert` using the remote_write protocol.
- `vminsert` accepts the ingested data and spreads it among the `vmstorage` pods. Since it is stateless, we might run `vminsert` on Spot nodes and configure it to use the Horizontal Pod Autoscaler.
- `vmstorage` stores the raw data and returns the queried data for the given time range and label filters. Unlike the other components in the cluster, this is the only one that must run on on-demand nodes, which means it cannot run with HPA, only with the Vertical Pod Autoscaler.
- `vmselect` queries the data from all the configured `vmstorage` pods. Since it is stateless, we might run `vmselect` on Spot nodes and configure it to use the Horizontal Pod Autoscaler.
- `vmalert` will run on both clusters for redundancy in alert configurations.
- `Alertmanager` is deployed in cluster mode between both zones for high availability.
- With `Grafana`, we can view all the workloads' metrics in one place using one data source: `vmselect`.
To filter metrics based on k8s-cluster names, I added the "cluster" label.
*Note*: I am currently running this design in my company's production environment with 6 workloads, and it's working great!
Practical guide
Helm charts
For both the VictoriaMetrics cluster deployments and the `VMagent`/workload connection deployments, I used the official `victoria-metrics-k8s-stack` chart (with different values).
It allows us to deploy the VictoriaMetrics Kubernetes stack all at once or just parts of it.
It includes `VMcluster`, `VMagent`, `VMRules`, the VictoriaMetrics Operator, `ServiceScrapes`, exporters, `Alertmanager`, and `Grafana` plus relevant dashboards.
VictoriaMetrics documentation provides all the information you need to understand the k8s-stack chart.
How I customized the values
For this post, I created this Github repo: https://github.com/Amitk3293/VictoriaMetrics-MultiSite-HA-Cluster
The goal of this repository is to show how I edited the Helm chart values for `victoria-metrics-k8s-stack` in order to create the design above, using Helm only.
The folder `VictoriaMetrics-cluster-k8s-stack` in my repo contains the entire chart files (for convenience only) and two separate values files:
- `values-main-cluster.yaml` for the main VictoriaMetrics cluster
- `values-2nd-cluster.yaml` for the additional VictoriaMetrics cluster
The main differences between the two are the URLs (ingresses, `remoteWrite.url`, etc.), and the 2nd cluster does NOT deploy Grafana.
For example, the ingress of `vminsert` will be:
- for the main cluster: http://vminsert.domain.com
- for the 2nd cluster: http://2nd-vminsert.domain.com
For a better understanding of the changes, I suggest you clone the original values file and compare it with my customized values file.
Below are the main examples of the customization I have made to the values:
Choose VMcluster and not VMsingle
- Disable `VMsingle` and enable `VMcluster`: we want to deploy the `VMcluster` version, not the `VMsingle` version.
VMcluster spec:
- Each cluster component (`VMstorage`, `VMselect`, `VMinsert`) has three replicas, configured with ingress and a Horizontal Pod Autoscaler (except `VMstorage`, which doesn't use HPA).
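As a sketch, the two choices above translate into values roughly like the following (the key paths follow the `victoria-metrics-k8s-stack` layout, but they can move between chart releases, so verify against your version):

```yaml
# Deploy the cluster version instead of the single-node version,
# with three replicas per cluster component.
vmsingle:
  enabled: false
vmcluster:
  enabled: true
  spec:
    vmstorage:
      replicaCount: 3
    vmselect:
      replicaCount: 3
    vminsert:
      replicaCount: 3
```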
Spot nodes (optional):
- It is possible to define `VMselect` & `VMinsert` to run on Spot nodes and save a lot of money. In order to ensure ONLY the `VMinsert` & `VMselect` Kubernetes deployments are scheduled on Spot nodes, we need to add "taints" to the Spot node pool, along with matching "tolerations" and a "nodeSelector" on the `VMinsert` & `VMselect` deployments. Using this method, we can manage pod scheduling easily, without needing to explicitly pin every other Kubernetes deployment to on-demand nodes.
- Make sure your Spot node pool is configured with "taints".
For example:
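For illustration, a Spot node-pool taint and the matching tolerations/nodeSelector might look like this (the taint key/value and the node label are hypothetical; adapt them to your cloud provider's node-pool settings and your chart version's key paths):

```yaml
# Spot node pool (cloud-provider side): tainted with spot=true:NoSchedule.
# Helm values: schedule only vminsert/vmselect onto those nodes.
vmcluster:
  spec:
    vminsert:
      nodeSelector:
        node-pool: spot            # hypothetical node label
      tolerations:
        - key: "spot"              # must match the node pool's taint
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
    vmselect:
      nodeSelector:
        node-pool: spot
      tolerations:
        - key: "spot"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
```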
Ingress enabled:
- To be able to get metrics from workloads in other sites, you must have the `vminsert` ingress enabled and working. For the other components, you can decide which are appropriate for ingress; I enabled them all.
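A sketch of the ingress values, using the hostnames from the example above (the exact key layout differs between chart versions, so compare with the values file you cloned):

```yaml
vmcluster:
  ingress:
    insert:
      enabled: true
      hosts:
        - vminsert.domain.com      # 2nd cluster: 2nd-vminsert.domain.com
    select:
      enabled: true
      hosts:
        - vmselect.domain.com
```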
Alertmanager cluster-mode
- As part of deploying Alertmanager in cluster mode, we have to expose Alertmanager on port `9094`. (The ingress comes with a default port of 9093 for the Alertmanager web interface.) For this purpose, I configured the Alertmanager ingress host with the extra path `/cluster`, pointing at port `9094` of the Alertmanager service.
- As a second step, we add the `additionalPeers` URL, which is the ingress extra path we exposed that points to the Alertmanager service on port `9094` (if you don't want to use cluster mode in Alertmanager, just comment it out).
- `externalURL` should point to the Alertmanager web-UI URL so that Alertmanager Slack alerts are more useful, e.g. their buttons link to the right place.
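Putting those steps together, the Alertmanager values might look roughly like this (the hostnames, the peer URL, and the `/cluster` path mapping are illustrative; how the extra ingress path is routed to service port 9094 depends on the chart version, so check the repo's values file for the exact shape):

```yaml
alertmanager:
  spec:
    externalURL: http://alertmanager.domain.com    # web UI (port 9093)
    additionalPeers:
      # the other site's "/cluster" ingress path, routed to port 9094
      - 2nd-alertmanager.domain.com/cluster
  ingress:
    enabled: true
    hosts:
      - alertmanager.domain.com
```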
Alertmanager config
- Alert configs based on the cluster label: we need to be able to tell which metric value or which alert belongs to which workload, since we'll get the same metrics with different values across workloads. For that I created:
1. Slack alerts grouped by the `cluster` label.
2. Routing of Slack alerts using the `cluster` label.
3. An extra template to display the `cluster` label in all Slack alerts.
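A minimal sketch of such an Alertmanager configuration (the cluster name, Slack channels, and receiver names are made up for illustration):

```yaml
route:
  group_by: ["alertname", "cluster"]   # group Slack alerts by cluster label
  receiver: "slack-default"
  routes:
    - match:
        cluster: "prod-cluster"        # route one workload's alerts separately
      receiver: "slack-prod"
receivers:
  - name: "slack-default"
    slack_configs:
      - channel: "#alerts"
        # show the cluster label in every alert title
        title: '[{{ .CommonLabels.cluster }}] {{ .CommonLabels.alertname }}'
  - name: "slack-prod"
    slack_configs:
      - channel: "#alerts-prod"
```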
VMagent:
- For syncing and sending data between the two clusters, I added `additionalRemoteWrites`, which points to the other cluster's `vminsert` ingress URL. Note that since `remoteWrite.url` ingests data using the Prometheus remote-write API, we append `insert/0/prometheus/api/v1/write` to the `vminsert` URL: `http://2nd-vminsert.domain.com/insert/0/prometheus/api/v1/write`
- `VMagent` relabeling is helpful since we have a number of workloads and only one datasource, so we can tell which metrics or alerts belong to which workload. In each `VMagent` we are going to add `externalLabels` with the label `cluster=<cluster-name>`.
additionalRemoteWrites + ExternalLabels
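Together, the two settings look roughly like this in the values file (the cluster name is illustrative; `additionalRemoteWrites` and `spec.externalLabels` follow the chart's layout, but double-check your chart version):

```yaml
vmagent:
  spec:
    externalLabels:
      cluster: main-cluster            # illustrative cluster name
  additionalRemoteWrites:
    # sync data to the other site's vminsert through its ingress
    - url: http://2nd-vminsert.domain.com/insert/0/prometheus/api/v1/write
```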
Grafana
- As a dependent chart, Grafana can be customized according to the official Grafana Helm chart values.
- I upgraded the values to deploy the latest Grafana + sidecar images.
- I enabled `multicluster` to have Grafana dashboards filtered by the cluster label.
- I added a PVC to make Grafana persistent.
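These Grafana tweaks might look like the following (the exact key path of the `multicluster` flag and the PVC size are assumptions to verify against your chart):

```yaml
grafana:
  enabled: true
  sidecar:
    dashboards:
      enabled: true
      multicluster: true     # dashboards filterable by cluster label
  persistence:
    enabled: true            # PVC so dashboards/settings survive restarts
    size: 10Gi               # illustrative size
```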
Some other values:
- As a dependent chart, Prometheus Node Exporter can be customized according to the official Prometheus Node Exporter Helm chart values. Consider disabling it in the chart if you plan to deploy to a cluster already equipped with Node Exporter.
- There are many values that can be customized based on your Kubernetes cluster or your preferences; for example, I disabled `K8S-coreDNS` monitoring and enabled `K8S-kubeDNS` monitoring. Go over the chart and define it as you wish.
Important Note
- I have shown examples from the `values-main-cluster.yaml` file, which is for the 1st VictoriaMetrics cluster. For the second cluster, we have to modify `values-2nd-cluster.yaml` and change the IPs and URLs used for syncing between the two clusters.
Let's start with the installation
Prerequisites:
- Helm 3, following the prerequisites in the official Helm repo
- 2 different Kubernetes clusters with Ingress installed (I used Nginx with internal LBs only) to run the VictoriaMetrics clusters and the other additional tools that come with the chart
- At least 1 Kubernetes workload to run and be scraped by `vmagent` & `VMoperator`
Optional:
- 2 node pools in each Kubernetes cluster: 1 for Spot nodes (with relevant taints) and 1 for on-demand nodes
Installation
- Add the Helm repositories, including dependencies:
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add vm https://victoriametrics.github.io/helm-charts/
helm repo update
1st Kubernetes cluster
- Connect to the 1st Kubernetes cluster
- Create a new namespace: `k create ns vm`
- Go to `./VictoriaMetrics-MultiSite-HA-Cluster/VictoriaMetrics-cluster-k8s-stack` and run Helm install using `values-main-cluster.yaml` as below.
helm install vmetrics vm/victoria-metrics-k8s-stack -f values-main-cluster.yaml -n vm
- Run `kubectl get pods` to see that all the pods are running successfully:
amitk@pop-os VictoriaMetrics-cluster-k8s-stack git:(master) ✗ k get pods -n vm
NAME READY STATUS RESTARTS AGE
vmagent-vmetrics-victoria-metrics-k8s-stack-74d4594ffd-txmnb 2/2 Running 0 3d4h
vmalert-vmetrics-victoria-metrics-k8s-stack-7bdb59cf58-9snsr 2/2 Running 0 3d4h
vmalertmanager-vmetrics-victoria-metrics-k8s-stack-0 2/2 Running 0 2d23h
vmetrics-grafana-576ddc6c96-xmhrv 3/3 Running 0 5h24m
vmetrics-kube-state-metrics-69b8795b77-d2zbk 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-gfkdz 1/1 Running 0 17d
vmetrics-prometheus-node-exporter-j8css 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-v4jgv 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-wbxkb 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-zjclk 1/1 Running 0 21d
vmetrics-victoria-metrics-operator-7d5984fb9b-rvqgp 1/1 Running 3 (11d ago) 21d
vminsert-vmetrics-victoria-metrics-k8s-stack-6f75754b76-sj5sq 1/1 Running 0 3d4h
vminsert-vmetrics-victoria-metrics-k8s-stack-6f75754b76-wzr87 1/1 Running 0 3d4h
vminsert-vmetrics-victoria-metrics-k8s-stack-6f75754b76-xjndp 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-0 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-1 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-2 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-3 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-4 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-0 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-1 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-2 1/1 Running 0 3d4h
2nd Kubernetes cluster
- Connect to the 2nd Kubernetes cluster
- Create a new namespace: `k create ns vm`
- Go to `./VictoriaMetrics-MultiSite-HA-Cluster/VictoriaMetrics-cluster-k8s-stack` and run Helm install using `values-2nd-cluster.yaml` as below.
helm install vmetrics vm/victoria-metrics-k8s-stack -f values-2nd-cluster.yaml -n vm
- Run `kubectl get pods` to see that all the pods are running successfully:
amitk@pop-os VictoriaMetrics-cluster-k8s-stack git:(master) ✗ k get pods -n vm
NAME READY STATUS RESTARTS AGE
vmagent-vmetrics-victoria-metrics-k8s-stack-74d4594ffd-zcgas 2/2 Running 0 3d4h
vmalert-vmetrics-victoria-metrics-k8s-stack-7bdb59cf58-weras 2/2 Running 0 3d4h
vmalertmanager-vmetrics-victoria-metrics-k8s-stack-0 2/2 Running 0 2d23h
vmetrics-kube-state-metrics-69b8795b77-d2zbk 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-qtsuv 1/1 Running 0 17d
vmetrics-prometheus-node-exporter-p8u6a 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-95qer 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-az245 1/1 Running 0 21d
vmetrics-prometheus-node-exporter-sdaa1 1/1 Running 0 21d
vmetrics-victoria-metrics-operator-1a22d5a89-asdqew 1/1 Running 3 (11d ago) 21d
vminsert-vmetrics-victoria-metrics-k8s-stack-87s564d89a-sj5sq 1/1 Running 0 3d4h
vminsert-vmetrics-victoria-metrics-k8s-stack-87s564d89a-zca57 1/1 Running 0 3d4h
vminsert-vmetrics-victoria-metrics-k8s-stack-87s564d89a-gsadw 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-0 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-1 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-2 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-3 1/1 Running 0 3d4h
vmselect-vmetrics-victoria-metrics-k8s-stack-4 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-0 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-1 1/1 Running 0 3d4h
vmstorage-vmetrics-victoria-metrics-k8s-stack-2 1/1 Running 0 3d4h
Results
Grafana Explore + Dashboards + VictoriaMetrics cluster stats
- After installing the VictoriaMetrics cluster, we can log in to our Grafana with our admin password (`kubectl get secret -n vm vmetrics-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo`).
- Using the `VM-k8s-stack` Helm charts, we have already deployed some valuable dashboards. One of them is `VictoriaMetrics-cluster`. It contains useful VictoriaMetrics cluster metrics, which helps if you need to follow or debug your cluster. The Stats panel also displays all the VM instances from both k8s clusters that are part of the VM cluster, plus a lot of other useful data.
- Keep in mind that since we are working with only one datasource, using the cluster label to filter k8s-cluster and workload metrics is necessary; sometimes you will also need to change other dashboards' queries.
- Here are 2 excellent Kubernetes dashboards that I took from the Grafana website and modified to filter by cluster labels:
1. K8S Fully overview by Exporter and cluster label
2. K8S Fully overview by kubelet and cluster label
Validate Alertmanager cluster-mode
- Check the Alertmanager web UI (Status page) to see if the system is in cluster mode.
Connect workloads using VMagent
- To send metrics from any workload to our VictoriaMetrics cluster, `VMoperator` and `VMagent` can be deployed on the workload.
- To do this, I use the same `VictoriaMetrics-k8s-stack` Helm charts I used for the cluster deployments, BUT with all other components disabled except `VMoperator` and `VMagent`.
- In the Helm chart values, I edit `remoteWrite.url` in `VMagent` to point to both k8s clusters' `VMinsert` ingresses/URLs, and set the cluster label key according to the cluster name.
How I customized the values
The folder `VMagent-only-k8s-stack` in my repo contains the entire chart files (for convenience only) and a `values-vmagent-only.yaml` file.
In the `values-vmagent-only.yaml` file we can see that:
- All the components are disabled except `VMoperator`, `VMagent` & kubelet/Node Exporter (optional).
- `VMagent` is configured with `additionalRemoteWrites` to send metrics to both `VMinsert` URLs/ingresses of the VictoriaMetrics clusters.
- The cluster label key is configured with the workload name.
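A condensed sketch of `values-vmagent-only.yaml` (the workload name is illustrative; see the repo for the complete file):

```yaml
# Disable everything except the operator and vmagent.
vmsingle:
  enabled: false
vmcluster:
  enabled: false
vmalert:
  enabled: false
alertmanager:
  enabled: false
grafana:
  enabled: false
vmagent:
  enabled: true
  spec:
    externalLabels:
      cluster: devapps-ml              # illustrative workload name
  additionalRemoteWrites:
    # send metrics to both sites' vminsert ingresses
    - url: http://vminsert.domain.com/insert/0/prometheus/api/v1/write
    - url: http://2nd-vminsert.domain.com/insert/0/prometheus/api/v1/write
```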
Installation
Connect Kubernetes workload cluster
- Connect to the required Kubernetes cluster
- Create a new namespace: `k create ns monitoring`
- Go to `./VictoriaMetrics-MultiSite-HA-Cluster/VMagent-only-k8s-stack` and run Helm install using `values-vmagent-only.yaml` as below.
helm install vm-devapps-ml vm/victoria-metrics-k8s-stack -f values-vmagent-only.yaml -n monitoring
- Run `kubectl get pods` to see that all the pods are running successfully:
amitk@pop-os ~ k get pods -n monitoring
NAME READY STATUS RESTARTS AGE
vm-devapps-ml-kube-state-metrics-8656556777-v24tp 1/1 Running 0 16d
vm-devapps-ml-prometheus-node-exporter-cxrzs 1/1 Running 0 16d
vm-devapps-ml-prometheus-node-exporter-pnmj6 1/1 Running 0 23d
vm-devapps-ml-prometheus-node-exporter-zxt8j 1/1 Running 0 3h6m
vm-devapps-ml-victoria-metrics-operator-b68bb4cd8-fg7sz 1/1 Running 0 16d
vmagent-vm-devapps-ml-victoria-metrics-k8s-stack-59f57b79db-jpgsf 2/2 Running 0 8d
VMagent dashboard
- Using the `VM-k8s-stack` Helm charts, we have already deployed some valuable Grafana dashboards; one of them is `VMagent`. The `VMagent` dashboard shows useful metrics which may help if you need to follow or debug your workload connections. As we can see below, in my case I'm getting metrics from 6 different workloads using `VMagent`, and I can filter based on the workload (job variable).
Summary & Achievements
To sum up what we've done, we created a multi-site VictoriaMetrics platform with the following benefits:
- High availability & fault tolerance: the VictoriaMetrics cluster is multi-site, so even if one cluster goes down, the other cluster will continue to operate. Moreover, VictoriaMetrics has built-in redundancy and auto-healing features for each component.
- High performance & dynamic scaling: most of the `VMcluster` components, like `VMinsert` and `VMselect`, scale out based on resource consumption through Horizontal Pod Autoscaling. Moreover, VictoriaMetrics by design delivers better performance than Prometheus.
- Cost-effective: we can run most of the `VMcluster` components, such as `VMselect` and `VMinsert`, on Spot nodes, which is very cost-effective. Moreover, compared to Prometheus, VictoriaMetrics consumes fewer resources and compresses data better, which saves us money on both compute and storage.
- All-in-one data & management: we can store all the metrics/data from many workloads in one time-series DB. In addition, we manage the whole monitoring stack from a single platform/datasource.

Any workload can then be connected using only a tiny component, `VMagent`:
- `VMagent` is fast and light on compute resources.
- With the `remote_write` protocol, it can collect metrics from various sources and send them to the VictoriaMetrics cluster.
- Configuration, connecting workloads, and management are easy.

Now that you've taken a deeper dive into your monitoring platform, you know how to make it more dynamic, powerful, and efficient… Enjoy!
Credits:
A big thanks to Yoni Amikam who has tested, reviewed, improved, and contributed to this deployment.