2xPOLITO + TOP-IX: Multi-Cluster Sharing is now Real with Liqo!

Fulvio Risso
The Liqo Blog
Dec 17, 2021

For the first time ever, three independent clusters started to share cloud resources over the Internet.


On December 13th, for the first time ever, Liqo was set up to connect and share resources across three Kubernetes clusters: two located at the Polytechnic of Turin (POLITO), Italy, and the third at the Turin/Piedmont Internet Exchange (TOP-IX).

Each cluster is independently configured and operated, and each is under the control of a different organization (the Computer Engineering Department @POLITO, the Computer Networks research group @POLITO, and TOP-IX). Each administrator made their own choices for cluster setup and administration (e.g., CNI plug-in, external network connectivity, high-availability properties, etc.).

Each cluster was configured with a fixed percentage of resources reserved for local use, while the rest was made available for sharing with other clusters. Each cluster independently configured its peerings, and hence its sharing policies, with the other clusters. For instance, CrownLabs established a sharing agreement with both of the other clusters, while NetLab and TOP-IX decided to peer only with CrownLabs. With this setup, a job started in the CrownLabs cluster can be scheduled on any cluster, while a job started in TOP-IX or NetLab can be executed either locally or on CrownLabs.

Topology of the Liqo pilot, including different sites and their characteristics.
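In practice, this scheduling freedom maps onto standard Kubernetes primitives: each peered cluster appears as a virtual node, and a workload can be allowed (or not) to land on it. The minimal sketch below shows a Deployment that prefers local nodes but tolerates being offloaded; the label key (liqo.io/type) and taint key (virtual-node.liqo.io/not-allowed) are assumptions based on our setup and may differ across Liqo releases.

```yaml
# Illustrative only: a Deployment that may run either locally or on the
# virtual node that Liqo creates for a peered cluster. Label and taint keys
# are assumptions and may differ in other Liqo versions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-job
  namespace: pilot                 # a namespace enabled for Liqo offloading
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-job
  template:
    metadata:
      labels:
        app: demo-job
    spec:
      # Tolerate the taint that Liqo puts on virtual nodes, so the scheduler
      # is also free to place replicas on the peered (remote) cluster.
      tolerations:
        - key: virtual-node.liqo.io/not-allowed   # assumed taint key
          operator: Exists
          effect: NoExecute
      # Prefer local nodes, but do not forbid remote ones.
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 50
              preference:
                matchExpressions:
                  - key: liqo.io/type              # assumed virtual-node label
                    operator: DoesNotExist
      containers:
        - name: demo
          image: nginx:1.21
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
```

Note that only the toleration is strictly needed to make offloading possible; the affinity merely biases the scheduler toward local execution.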

To verify that the infrastructure behaves as expected, a pilot application was started in each virtual cluster. We selected the Google Online Boutique, which is composed of 10 closely interconnected microservices; in addition, an instance of the Locust traffic generator was set up to artificially create a continuous load on the application. Proper monitoring probes were configured to collect and store run-time data in a local Prometheus/Grafana service, in order to keep the run-time behavior of the system under control over long time spans. A snapshot of the Grafana dashboard running in CrownLabs is shown below: application latency is very good (27 ms), inter-cluster traffic (Liqo Gateway Traffic) is quite stable over time, and 9 out of 16 pods are offloaded.
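As an example of such a probe, the sketch below shows how one of the pilot services could be scraped through a Prometheus Operator ServiceMonitor; the operator itself, as well as the namespace, labels, and port name, are assumptions made for this illustration and must be adapted to the actual deployment.

```yaml
# Illustrative only: a ServiceMonitor (Prometheus Operator) scraping the
# metrics endpoint of one of the pilot services every 30 seconds.
# Namespace, labels, and port name are placeholders.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: boutique-frontend
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: frontend             # label of the Service to scrape (placeholder)
  namespaceSelector:
    matchNames:
      - boutique                # namespace hosting the pilot application (placeholder)
  endpoints:
    - port: http                # named port exposing the metrics endpoint (placeholder)
      interval: 30s
```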

Security

Sharing resources with another organization raises strong security concerns. Liqo, even in this first implementation, takes this point seriously and enforces the following properties:

  • The amount of resources assigned to each cluster is capped, to prevent “foreign” clusters from consuming more than what they have been granted (see the sketch after this list);
  • “Foreign” pods have no visibility of any other resource/pod running in the hosting datacenter.
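Liqo enforces the first property automatically, based on the resources each cluster decides to share with its peers. Conceptually, the effect on the hosting cluster is similar to a standard Kubernetes ResourceQuota applied to the namespace that hosts the foreign pods, as in the illustrative sketch below (names and amounts are placeholders, not the actual pilot values).

```yaml
# Illustrative only: capping what a "foreign" peer can consume is conceptually
# similar to a ResourceQuota on the namespace hosting its offloaded pods.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: foreign-peer-cap
  namespace: liqo-tenant-peer      # placeholder tenant namespace
spec:
  hard:
    requests.cpu: "8"              # at most 8 CPUs requested by foreign pods
    requests.memory: 16Gi          # at most 16 GiB of memory requested
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "50"                     # upper bound on the number of foreign pods
```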

What’s next

At Polytechnic of Turin, due to the COVID-19 pandemic, all exams are performed online. However, in some cases, the University would like to offer a remote desktop to each student to carry out the exam with the required applications (e.g., programming tools, simulation tools, etc.). Yet, the number of students joining each exam session is so large that the POLITO cluster hosting the necessary services (i.e., CrownLabs) cannot sustain the requested workload. In fact, the CrownLabs cluster is a small facility that was designed to provide experimental services to students and was not dimensioned to serve thousands of users at the same time.

Liqo provides a solution to the above problem: its capability to virtually extend a cluster over “foreign” resources introduces the required elasticity in the CrownLabs infrastructure. A cluster can borrow resources from another cluster without any preliminary configuration (e.g., setting up the foreign cluster with the set of services required to run the exams). Cloud bursting is made simpler with Liqo.
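A minimal sketch of how little configuration this requires, assuming a Liqo release where offloading is enabled by labelling a namespace: pods created in such a namespace can transparently burst to the peered clusters. The label below is an assumption; newer Liqo versions manage this through a dedicated NamespaceOffloading resource, so the exact mechanism depends on the installed release.

```yaml
# Illustrative only: enabling a namespace for Liqo offloading. The label shown
# here is an assumption tied to a specific Liqo release; newer releases use a
# dedicated NamespaceOffloading resource instead.
apiVersion: v1
kind: Namespace
metadata:
  name: exams
  labels:
    liqo.io/enabled: "true"   # assumed: marks the namespace as offloadable
```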

We will report on this live experience after the exam period, next February. We will keep our fingers crossed until then, but we are confident that students will experience a smooth and relaxed exam session.

Looking forward

The Liqo project balances two orthogonal (but intertwined) objectives. On one side, solid code and engineering practices, targeting real deployments rather than just proofs of concept. On the other side, forward-looking ideas, fostered by our research-oriented background.

With respect to the latter, we can observe that, so far, we have activated direct peerings between parties. But… what about a different scenario, in which peering is enabled through a service composition point, similar to a broker, that facilitates the connection between different parties? This looks like an intriguing idea to us, which may enable multi-stakeholder cloud provisioning, with multi-cloud federator/orchestrator capabilities and possibly a marketplace engine. And, last but not least, there is the additional possibility to set up dedicated/SLA-aware network connections, e.g., through an Internet Exchange (and possibly more than one), as part of the Liqo peering process. By the way, this would make our good friends at the TOP-IX Internet Exchange very happy!

Somebody could say that this is Gaia-X. Indeed, the overall vision is very similar. Ideas are welcome!

Want to join?

This pilot is open to anybody interested in getting hands-on experience with cloud sharing, and we welcome new organizations wanting to set up a new peering with (some of) the current members. New experiences, additional requirements, and forward-looking discussions are always welcome!

For additional information, feel free to contact the Liqo team.


Fulvio Risso

Professor at Politecnico di Torino (Italy), passionate about network and cloud infrastructure.