VictoriaMetrics? Replace Prometheus?

darrengomez · Published in 99.co · 6 min read · Dec 31, 2022

The issue with Prometheus?

You have your Kubernetes cluster set up, and now you're looking for some visibility into what's happening in your cluster. You come across Prometheus and Node Exporter, and there you go: a whole new world of monitoring opens up through Grafana. Everything seems perfect.

With time you feel you need more visibility into the applications running on your cluster. CPU, memory, and disk metrics alone are not enough; teams need to dig deeper into the applications themselves. Voilà, Prometheus has client libraries for almost every language, so dev teams integrate Prometheus into their applications and expose application-level metrics. Now you have dashboards for the application metrics, and everything looks amazing because you have visibility you've never had before.

Say, for example, you have around 20 microservices ingesting data into Prometheus. One day requests start increasing and you want to see how your services are performing, but you notice gaps in your dashboards. You check whether your Prometheus pod has restarted and start debugging Prometheus.

Grafana dashboards when Prometheus restarts

Then you start to notice that memory usage on the Prometheus deployment keeps growing over time. Memory usage in Prometheus is directly proportional to the number of time series stored, so as your time series grow, you get more OOM kills. The next step is to increase resource quotas, but you can only do that until Prometheus fills up your entire node. A Prometheus instance ingesting millions of data points can easily grow to around 100 GB of memory, which is a problem when running on Kubernetes.

The real problem arises when system load increases and Prometheus goes down due to a resource crunch. Now you don't even have metrics to check why your services are seeing a huge surge of requests, which leaves dev teams in the dark. And once Prometheus has consumed your node's resources, other pods cannot be scheduled either. Yes, you can dedicate nodes to Prometheus, but that doesn't change the fact that its resource usage keeps increasing daily.

Why does Prometheus behave this way?

The main reason Prometheus behaves this way is its design. A single instance scrapes metrics from all the targets specified in its configuration, stores the data, and serves the queries coming from Grafana or any other client. Prometheus is not designed to be horizontally scaled, so once you hit the ceiling of vertical scaling, you're done. This becomes a very serious problem once developers and stakeholders understand the power of custom metrics and business data.
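To make the scaling ceiling concrete, here is a rough sketch of what that single instance typically looks like on Kubernetes (resource figures are illustrative, not recommendations). Adding replicas doesn't shard the load, since each replica would scrape and store everything, so in practice the only knob is bigger requests and limits on one pod.

# Sketch of a typical single-instance Prometheus deployment fragment
spec:
  replicas: 1                        # more replicas duplicate, not shard, the data
  template:
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:v2.40.0
          resources:
            requests:
              memory: "16Gi"         # grows with the number of active time series
              cpu: "4"
            limits:
              memory: "32Gi"         # raised again and again as cardinality grows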

What's Next?

We need a solution that keeps the amazing functionality Prometheus offers, scales horizontally, and can be adopted without making any application changes.

VictoriaMetrics

Here’s one solution for getting Prometheus-like functionality at scale: VictoriaMetrics (VM). In my experience, migrating from Prometheus to VM is easy because no other components change, and the same PromQL syntax can be used to query data from VM. Simply deploy it on your infrastructure and configure it as a data source in Grafana; existing dashboards will continue to work as before because nothing changes from the query side.
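For example, a minimal Grafana provisioning file only needs to register VM as a Prometheus-type data source. This is a sketch that assumes a single-node VM reachable at the same in-cluster service name used later in this article; point the URL at your own endpoint.

# Grafana datasource provisioning sketch (URL is illustrative)
apiVersion: 1
datasources:
  - name: VictoriaMetrics
    type: prometheus        # VM speaks the Prometheus query API, so dashboards keep working
    access: proxy
    url: http://vmsingle-victoria-metrics-single-server.monitoring.svc.cluster.local:8428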

What's different with VM?

In Prometheus, data insertion, selection/querying, and storage are all handled by a single instance, which is why all monitoring fails when that instance fails. That is a dangerous setup for a mission-critical product that relies heavily on its monitoring and alerting stack. VM separates these three main functions into dedicated components, known as vminsert, vmstorage, and vmselect. Allow me to elaborate.

  • vmstorage: indexes and stores the raw data, and returns data matching requests, applying any label filters if needed
  • vminsert: accepts incoming data and routes it, with all its labels, to the corresponding vmstorage nodes
  • vmselect: accepts incoming queries to VM and fetches the required data from the configured vmstorage nodes
VictoriaMetrics Cluster Architecture

The three components above work together to provide an uninterrupted monitoring stack. Even if insertion fails, storage and querying continue to function normally, giving the team time to debug and fix the issue.
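As a rough sketch of how the pieces are wired together (hostnames are made up and this is not a full manifest; 8400 and 8401 are the default vmstorage ports for inserts and selects), each layer only needs the list of vmstorage addresses, and each layer can be scaled out independently:

# Illustrative flags per component, not a complete deployment
vmstorage:
  args:
    - -retentionPeriod=3              # keep three months of data
vminsert:
  args:
    - -storageNode=vmstorage-0:8400   # fan incoming samples out across storage nodes
    - -storageNode=vmstorage-1:8400
vmselect:
  args:
    - -storageNode=vmstorage-0:8401   # query the same storage nodes
    - -storageNode=vmstorage-1:8401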

Performance comparison: Prometheus vs VM

A benchmark test article was written for comparison purposes, with the following ingestion and storage stats:

  • Ingestion rate: 280K samples/sec
  • Active time series: 2.8 million
  • Samples scraped and stored: 24.5 billion

To summarize, here are the benchmark results obtained from that test:

Disk space usage:

Disk space usage: VictoriaMetrics vs Prometheus

Disk IO usage:

Disk IO: bytes written per second: VictoriaMetrics vs Prometheus
Disk IO: bytes read per second: VictoriaMetrics vs Prometheus

CPU usage:

CPU usage, vCPU cores: VictoriaMetrics vs Prometheus

Memory usage:

RSS Memory usage: VictoriaMetrics vs Prometheus

For more in-depth details on the specs used for the above test check out the article mentioned.

VM for small-scale monitoring architectures

VM comes in two flavours: the cluster version described above and a single-node version. Even if you only need small-scale monitoring, you can use vmsingle for stacks that ingest less than 1 million data points per second. Compared to Prometheus, vmsingle can still be scaled and performs well.
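As a sketch, assuming the victoria-metrics-single Helm chart (the values and sizes below are illustrative, not recommendations), a minimal vmsingle deployment only needs a retention period and a persistent volume:

# Minimal single-node VM values (illustrative)
server:
  retentionPeriod: 3          # months of data to keep
  persistentVolume:
    enabled: true
    size: 100Gi
  resources:
    requests:
      memory: 4Gi
      cpu: "1"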

The load on vmsingle can be reduced by using the vmagent service, which scrapes the targets and remote-writes the data to vmsingle, so that vmsingle only has to serve storage and query requests. With this approach you can ingest data into vmagent from any infrastructure you deploy it on.

Example scrape configs for vmagent

global:
  scrape_interval: 20s
# scrape self by default
scrape_configs:
  - job_name: vmagent
    static_configs:
      - targets: ["localhost:8429"]
  - job_name: kubernetes-apiservers
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - action: keep
        regex: default;kubernetes;https
        source_labels:
          - __meta_kubernetes_namespace
          - __meta_kubernetes_service_name
          - __meta_kubernetes_endpoint_port_name
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true

The above configs are similar to Prometheus scrape configs, which makes it easier to migrate (no syntax changes).

Example remote write configs:

remoteWriteUrls: ["http://vmsingle-victoria-metrics-single-server.monitoring.svc.cluster.local:8428/api/v1/write"]

This simply points to the in-cluster DNS name of the deployed vmsingle instance.

Note:

You can also use the Prometheus remote write API to send data to vmsingle, but remote write leans heavily on the Prometheus WAL, which brings the resource crunch problem right back, so it is easy to just replace Prometheus with vmagent.
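If you do want to keep Prometheus as the scraper for a transition period, the Prometheus side is a single remote_write block, reusing the same vmsingle URL shown above:

# prometheus.yml fragment: ship scraped samples to vmsingle
remote_write:
  - url: http://vmsingle-victoria-metrics-single-server.monitoring.svc.cluster.local:8428/api/v1/write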

VM ships with a few default limits that can be tweaked; the parameters can be passed as extra arguments to the Kubernetes deployment (a sketch follows after the list), for example:

  • maxLabelsPerTimeseries: the maximum number of labels stored per time series at ingestion; the default value is 30
  • search.maxConcurrentRequests: the maximum number of concurrent search requests the vmsingle node will serve
  • search.maxQueueDuration: VM queues incoming search requests above the concurrency limit; this parameter decides how long a request may wait in that queue, the default value is 10s
  • search.maxQueryDuration: the maximum time a single search query is allowed to run

These are some of the parameters I adjusted to keep the functionality consistent with Prometheus and have the dashboards visualize the same data. All warnings and errors are logged on the associated node and can be debugged; the more common errors can be found here.
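As a sketch of how they are passed (again assuming the victoria-metrics-single Helm chart; the values are illustrative, not tuned recommendations), the flags go through the chart's extraArgs, which turns them into container arguments:

# Illustrative overrides -- tune to your own workload
server:
  extraArgs:
    maxLabelsPerTimeseries: "40"
    search.maxConcurrentRequests: "16"
    search.maxQueueDuration: "30s"
    search.maxQueryDuration: "60s"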

Conclusion

If you use Prometheus and rely on it for mission-critical alerts, it is highly recommended that you look into alternatives, because it is not designed to scale horizontally. Developers will find themselves increasing resources and maxing out their nodes over time, which in turn restarts Prometheus, creating a never-ending cycle.

VM is both resource and performance efficient. Migration is simple because the same scraping configurations can be used; no syntax changes are required. It is advised to run VM alongside Prometheus and observe the differences before fine-tuning it to match the same functionality and performance.
