Benchmarking Liqo: Kubernetes Multi-Cluster Performance
Liqo is an open-source project enabling dynamic and seamless Kubernetes multi-cluster topologies. You may be wondering whether this comes with significant performance overheads compared to vanilla Kubernetes. Below, we benchmark liqo (v0.4.0), focusing on the most relevant verticals concerning the offloading process. Spoiler alert: you will hardly feel any difference.
Application Offloading
The first benchmark analyzes the capability to start a huge burst of pods, which may be impacted by the offloading process performed by the custom liqo Virtual Kubelet (VK) implementation, as well as by the 2-level scheduling process.
The testbed
Our testbed was made of two Kubernetes clusters (k3s v1.21), one playing the role of the resource provider and the other that of the resource consumer. The resource provider includes 100 worker nodes that, for scalability reasons, are implemented as kubemark hollow nodes. Each one is backed by a component, the hollow kubelet, that pretends to be an ordinary kubelet but does not start any of the containers assigned to it: it merely reports them as running. This makes it possible to start a massive number of (fake) containers, even tens of thousands, while consuming few resources in our cluster, given that the containers are not actually running.
We performed all measurements through a custom tool [1], which creates the appropriate deployments and waits for the corresponding pods to be generated, possibly offloaded, and become ready. We initially executed it directly on the resource provider cluster to determine the vanilla Kubernetes baseline, and then on the consumer, peered with the provider through liqo, to measure the offloading performance.
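The measurement loop described above can be sketched as follows. This is a minimal illustration and not the code of the tool in [1]: the function name `time_until_ready` and the `create` and `count_ready` callbacks are hypothetical placeholders for the calls that create the deployment and count the ready pods through the Kubernetes API, and the sketch polls at a fixed interval, whereas the actual tool may rely on a different mechanism (e.g., watches).

```python
import time
from typing import Callable


def time_until_ready(
    create: Callable[[], None],
    count_ready: Callable[[], int],
    expected: int,
    poll_interval: float = 0.05,
    timeout: float = 600.0,
) -> float:
    """Return the seconds elapsed from create() until `expected` pods are ready.

    `create` and `count_ready` are hypothetical callbacks standing in for the
    Kubernetes API calls; they are not part of any real library.
    """
    start = time.monotonic()  # monotonic clock: unaffected by wall-clock jumps
    create()
    while count_ready() < expected:
        if time.monotonic() - start > timeout:
            raise TimeoutError(f"fewer than {expected} pods ready after {timeout} s")
        time.sleep(poll_interval)
    return time.monotonic() - start
```

With polling, the resolution of the result is bounded by `poll_interval`, so the interval should be small compared to the durations being measured.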
The results
The chart below presents the outcome of the evaluation, showing the time elapsed from the creation of a deployment up to the instant all generated pods are ready, for a number of pods varying between 10 and 10000. All measurements are repeated ten times, and the error bars represent the resulting standard deviation.
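As a small illustration of how each point and error bar in such a chart can be derived from repeated runs, the sketch below computes the mean and the standard deviation of a list of durations. It assumes the sample standard deviation (n − 1 denominator); the text does not state which estimator was used, and `summarize_runs` is a hypothetical helper name.

```python
import statistics
from typing import Sequence, Tuple


def summarize_runs(durations: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of repeated measurements."""
    if len(durations) < 2:
        raise ValueError("at least two runs are needed for a standard deviation")
    # statistics.stdev uses the n - 1 denominator (an assumption of this sketch).
    return statistics.mean(durations), statistics.stdev(durations)
```

Calling `summarize_runs` on the ten durations collected for a given number of pods yields the value to plot and the half-height of its error bar.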
At a glance, what stands out is the performance alignment between the vanilla Kubernetes logic (i.e., all pods scheduled on local worker nodes) and the liqo multi-cluster scenario, with a limited overhead visible only when starting very few pods in parallel (≈200 ms). We also evaluated the performance when consumer and provider are interconnected by a high-latency WAN (100 ms round trip time, simulated on the hub cluster through netem): no significant difference emerged compared to the on-LAN scenario.
Service Exposition
This test evaluates the time required by the liqo reflection logic to replicate a service and all the associated endpointslices to a remote cluster, hence making them available for consumption by remote applications. This shows the time required to propagate a new service (or a new running endpoint) across the control plane of the virtual cluster, and the scalability of the solution.
The testbed
Our testbed was similar to the previous one, with two Kubernetes clusters and a set of hollow nodes to host the fake containers. We considered the opposite scenario, with a varying number of pods started locally (i.e., on the provider) and, once ready, exposed through a single Kubernetes service, making them accessible from the consumer thanks to liqo.
A custom tool measures the time required to fill all endpointslice entries, both on the local cluster (i.e., by the vanilla Kubernetes logic) and on the remote one (i.e., through the liqo reflection logic).
The results
Results are in the chart below, which shows the ten-run average of the time elapsed from the creation of a service targeting the given number of pods to the complete creation of the corresponding endpointslices (i.e., until they are marked as Ready).
The graph confirms the limited performance overhead introduced by the liqo reflection logic compared to vanilla Kubernetes, accounting for a few milliseconds only even in the most demanding scenario. Given the overall short times required to complete the process, the effect of the underlying network latency becomes relevant. However, in absolute terms, the overhead is close to the network latency itself, which is unavoidable.
Digging deeper: assessing service reachability
To further characterize the service propagation overhead, we also measured the time elapsed from the creation of a service to the instant it is fully reachable, which is what matters for a typical user. This number includes the reflection logic, the configuration of iptables rules by vanilla kube-proxy, and the contribution of the network fabric data plane (e.g., packets traversing the VPN tunnel).
In this scenario, we used a single nginx pod as the service endpoint, with a custom tool executed in both clusters that continuously probes the service through TCP SYN segments until the corresponding acknowledgement is received, confirming that the service is fully reachable.
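A reachability probe of this kind can be approximated with the sketch below. It is not the tool used for the benchmark: instead of crafting raw SYN segments, it repeatedly attempts a full TCP connection with the standard `socket.create_connection` call, which succeeds only once the handshake (SYN, SYN-ACK, ACK) completes. The function name, the retry interval and the timeouts are assumptions chosen for illustration.

```python
import socket
import time


def wait_until_reachable(
    host: str,
    port: int,
    interval: float = 0.01,
    connect_timeout: float = 1.0,
    deadline: float = 60.0,
) -> float:
    """Return the seconds elapsed until a TCP connection to host:port succeeds."""
    start = time.monotonic()
    while time.monotonic() - start < deadline:
        try:
            # The connection is closed right away: only the handshake matters.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return time.monotonic() - start
        except OSError:
            # Refused, unreachable or timed out: retry after a short pause.
            time.sleep(interval)
    raise TimeoutError(f"{host}:{port} not reachable within {deadline} s")
```

Started right after the service is created, the returned value corresponds to the time to reachability as seen by a client, with a resolution bounded by `interval` and `connect_timeout`.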
Across ten runs, the local service (i.e., where the back-end pod is running) and the liqo-reflected one (on-LAN) became reachable almost simultaneously, while the WAN scenario shows a slightly worse value due to the higher underlying network latency. Overall, these results confirm once more the extremely limited overhead of liqo compared to vanilla Kubernetes, despite its additional multi-cluster features.
Focusing on network throughput, the data plane handling the actual communication between any two pods hosted by different clusters relies on standard VPN technology (i.e., WireGuard) and inherits its performance, as well as that of the underlying network. Hence, its throughput can be expected to match that of WireGuard, as reported in the benchmarks on its official website.
Peering Establishment
Unlike other technologies, liqo supports highly dynamic topologies, with a potentially large number of short-lived peers (e.g., edge clusters). This test measures the scalability of the peering establishment process, evaluating the time elapsed from the discovery of a new candidate to the creation of the associated virtual node, while varying the number of target clusters.
The testbed
The testbed used in this benchmark consists of n Kubernetes clusters (k3s v1.21), with a central entity (hub cluster) establishing unidirectional peerings towards all peripheral clusters (i.e., the hub can schedule workloads on the peripheral clusters, but not vice versa).
A custom tool executed on the hub cluster was responsible for detecting the peripheral clusters and starting the peering process (i.e., creating the corresponding ForeignCluster resource), while monitoring the total time.
The results
The chart below shows the time required to complete the peering process (i.e., until all the virtual nodes are ready for application offloading) for a number of peripheral clusters ranging from 1 to 128, with all the results averaged across ten runs. To evaluate the impact of the distance between the hub and the peering candidates, we also considered the case in which they are interconnected by a high-latency WAN (100 ms round trip time, simulated on the hub cluster through netem).
Results show that the total time increases mostly linearly with the number of parallel peering candidates, while being characterized by a constant lower bound when dealing with fewer than ten clusters (due to the startup time of the liqo VK pods). The overall trend is consistent regardless of the underlying network latency, although the WAN scenario shows a relatively higher burden (especially when establishing few peerings in parallel).
These results confirm the scalability of the liqo peering process, even when handling a significant number of peerings in parallel: it requires well under one second per target cluster in the most demanding scenario.
Liqo Resources Characterization
The last test characterizes the liqo resource demands, in terms of CPU and RAM required for the control plane execution, as well as the network traffic generated by liqo towards the remote Kubernetes API servers during the different operational phases (e.g., peering, resource offloading, etc.). This could be useful to determine the additional resource consumption of liqo in your home cluster.
The testbed
The testbed is similar to the one used for the previous evaluation, and it is composed of eleven single-node k3s clusters, one playing the role of the hub and ten behaving as peering candidates. CPU and RAM consumption is retrieved every second on each cluster through the APIs exposed by the containerd container runtime, while network traffic is measured on the hub by a custom libpcap-based program.
The results
The chart below presents the outcome of the measurements, subdivided into (1) the liqo control plane of the hub cluster (i.e., all the liqo components excluding the VKs), (2) the sum of the ten VKs hosted by the hub cluster, and (3) the liqo control plane of the peripheral clusters (no VKs are present in this case, since peerings are unidirectional). For the last group, CPU and RAM are averaged across all peripheral clusters (although the differences are negligible), hence showing the average requirements for a single leaf cluster.
We considered five different usage phases (highlighted with different colors):
- At rest and with no active peering (0–60 s): liqo requires ≈150 MB of RAM on each cluster, with almost no CPU usage and zero network traffic.
- Peering with ten peripheral clusters in parallel (60–70 s): the local control plane shows a short CPU spike during this process and a few MB increase in memory consumption, while the ten VKs, started in parallel in the hub cluster, account for ≈250 MB of additional RAM in total. The exchanged network traffic is negligible.
- At rest and with the virtual nodes ready (70–120 s): no CPU or network resources are required by liqo to keep the peerings active, while the memory occupancy remains stable compared to the previous phase.
- Offloading 1K pods to simulate high churn rates (e.g., a large number of pods being started or changing state) (120–180 s): the ten VKs (as a whole) required about 0.3 CPU cores on the hub cluster, while the additional demand of both the local and remote control planes was almost negligible. The VKs' RAM usage increased as well, since memory consumption is directly related to the number of pods, in line with the standard Kubernetes operator implementation (i.e., watched resources are cached by informers). At the same time, synchronizing information between the different clusters generated some additional network traffic, although it never exceeded 4 Mbps. It follows that, in these conditions, each peripheral cluster (hosting ≈100 pods) contributed about 400 kbps of additional network traffic.
- At rest, with the active peerings and the offloaded pods running on the remote clusters (180–250 s): none of the metrics displayed variability, confirming the negligible demands in the absence of transient periods.
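The per-cluster figures quoted for the offloading phase follow from simple arithmetic, spelled out in the sketch below. It assumes that pods and traffic are spread evenly across the ten peripheral clusters and that 1 Mbps equals 1000 kbps; the constant and function names are illustrative only.

```python
def per_cluster_share(total: float, clusters: int) -> float:
    """Split a total evenly across clusters (even distribution is an assumption)."""
    if clusters <= 0:
        raise ValueError("the number of clusters must be positive")
    return total / clusters


PEAK_TRAFFIC_MBPS = 4.0    # upper bound observed during the offloading phase
OFFLOADED_PODS = 1000      # pods offloaded in this phase
PERIPHERAL_CLUSTERS = 10   # peering candidates in the testbed

# 4 Mbps = 4000 kbps, shared by ten clusters.
traffic_per_cluster_kbps = per_cluster_share(PEAK_TRAFFIC_MBPS * 1000, PERIPHERAL_CLUSTERS)
pods_per_cluster = per_cluster_share(OFFLOADED_PODS, PERIPHERAL_CLUSTERS)
```

Since 4 Mbps is an upper bound on the measured traffic, the resulting per-cluster value is an upper bound as well.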
Concerning the offloading phase, we repeated the evaluation for different numbers of pods, while keeping the other parameters constant. Overall, we observed similar CPU usage and network traffic, although the duration of the transient period is proportional to the number of pods (e.g., ≈30 s with 500 pods), according to the Kubernetes deployment pacing. As for the VKs' RAM usage, the theoretical linear correlation was not completely reflected in the actual measurements, due to the default behavior of the Go garbage collector. Similarly, experiments on different infrastructures (e.g., with different node characteristics) showed very similar patterns, with approximately the same total CPU consumed and traffic exchanged, although compressed into shorter intervals (with more powerful nodes) or stretched over longer ones (with less powerful nodes).
Memory usage (≈160 MB on each peripheral cluster) is the only metric that might require attention, as it may be critical for micro-clusters made of resource-constrained nodes (e.g., embedded devices running at the edge of the network). However, it is worth mentioning that even lightweight Kubernetes distributions do require non-negligible amounts of RAM (e.g., k3s recommends at least 1 GB), and, at the time of writing, liqo could be further optimized in this direction.
Concluding remarks
Our tests confirm the limited overhead introduced by liqo in terms of additional resource demands with respect to vanilla Kubernetes. This supports the feasibility of the transparent Kubernetes multi-cluster approach enabled by liqo, allowing users to seamlessly leverage the resources and services made available by remote infrastructures.
If you want to know more about the liqo architecture, the long-term liquid computing vision it fosters, the experimental setup used for the different benchmarks, and additional in-depth results, you can read our manuscript, publicly available on arXiv (https://arxiv.org/abs/2204.05710).
[1] The artifacts required to replicate the experimental setup and perform the measurements are open-source and available on GitHub.