KubeCon + CloudNativeCon Europe’24 notes

KubeCon + CloudNativeCon Europe 2024 took place in Paris from March 19 to March 22. This year, the event attracted over 12,000 participants from around the world.

Criteo R&D
Criteo Tech Blog
12 min read · Apr 16, 2024


Authors: Harold, Geoffrey, Mathieu, Sennen, Flavien, Jonathan, Xavier, Thomas, Amine, Bruno, Dmitry and Michaël.

Some of our Criteos at the booth

The conference featured keynote speakers, technical sessions, workshops, and networking opportunities designed specifically for developers, operators, and other professionals interested in cloud-native technologies. Participants gained valuable insights into the latest advancements in Kubernetes and various open-source projects while also learning best practices for deploying and managing cloud-native applications.

As a sponsor of the conference for another year, we had the opportunity to engage with attendees at our booth and enjoy hosting live quizzes on Kahoot. We extend our gratitude to everyone who stopped by and took the time to interact with our Criteos.

People enjoying our live quiz

Criteos’ takeaways

As the conference took place in Paris this year, we dispatched numerous Criteos from our office to attend. Now, let’s review their insights and notes about the conference.

Harold Dost

For many years, I have been interested in observability as a topic. I look forward to seeing people’s usages and developments in and around the OpenTelemetry (OTel) community. I found From RUM to Front-End Observability with OpenTelemetry most interesting; it covered how the browser side can be instrumented using OTel to gather RUM/web vitals metrics.

Purvi brought up many examples and mentioned something I’ve been feeling for a while. Instrumentation can be temporary. You don’t always need a heart rate monitor, and you don’t need all telemetry all the time. I would encourage anyone to spend some time and take a look at her presentation.

Tracing from the Browser in the Server Backend.

Geoffrey Beausire

The scale of KubeCon was once again quite impressive, with more than 12 thousand attendees and around 18 talks in parallel. It is quite exciting to see such a big community getting together. This year, we can definitely see the impact of AI, as many talks revolved around MLOps and the orchestration of machine learning pipelines in Kubernetes. My main regret this year was not seeing many talks sharing real-life experience reports.

As for the talks, I was particularly intrigued by Beyond Default: Harnessing CPU Affinity for Enhanced Performance Across Your Workload Portfolio. This talk showed the capabilities of the NRI (Node Resource Interface) balloons plugin. This plugin assigns containers to pools of CPUs (called balloons). These balloons can be resized dynamically to change the CPU allocation.

At Criteo, performance and low latency are paramount, so controlling which CPUs our workloads run on is a must. Servers with more than 100 CPUs are becoming increasingly common. While Kubernetes provides the Topology Manager to help align Pods to physical cores and NUMA nodes, it lacks flexibility and control. Some workloads fare better when leveraging NUMA affinity or when their processes run on different physical cores. The NRI balloons plugin allows a high degree of configuration, such as:

  • preferCloseToDevices, which allows scheduling close to a device (e.g., a network card)
  • preferSpreadOnPhysicalCores, to spread over physical cores (rather than packing onto hyper-threads of the same core).
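To make the spread-versus-pack trade-off concrete, here is a small Python sketch. It is only an illustration of the idea, not the plugin's actual allocation algorithm: the `pick_cpus` helper and the topology map (logical CPU id to physical core id) are hypothetical, and the 4-core, 2-thread layout is an assumption chosen for readability.

```python
from collections import defaultdict
from typing import Dict, List


def pick_cpus(core_of: Dict[int, int], count: int, spread: bool) -> List[int]:
    """Pick `count` logical CPUs from a hypothetical topology map.

    core_of maps a logical CPU id to its physical core id.
    spread=True takes one hyper-thread per core before reusing a core;
    spread=False fills all hyper-threads of a core before moving on.
    """
    if count > len(core_of):
        raise ValueError("not enough logical CPUs")
    siblings = defaultdict(list)
    for cpu in sorted(core_of):
        siblings[core_of[cpu]].append(cpu)
    cores = [siblings[core] for core in sorted(siblings)]
    if spread:
        # Round-robin: first thread of every core, then the second threads, ...
        depth = max((len(threads) for threads in cores), default=0)
        order = [threads[i] for i in range(depth) for threads in cores if i < len(threads)]
    else:
        # Pack: all threads of the first core, then the next core, ...
        order = [cpu for threads in cores for cpu in threads]
    return order[:count]


# Assumed layout: 4 physical cores, 2 hyper-threads each (CPU n and n+4 share a core).
topology = {cpu: cpu % 4 for cpu in range(8)}
spread_choice = pick_cpus(topology, 4, spread=True)
packed_choice = pick_cpus(topology, 4, spread=False)
print("spread:", spread_choice)
print("packed:", packed_choice)
```

With spreading, the four CPUs land on four different physical cores, so the workload does not compete with itself for a core's execution units. With packing, it occupies only two cores and leaves the others free for neighbours.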

Mathieu Bodjikian

KubeCon Paris was an invigorating immersion into the world of Kubernetes and its expanding ecosystem. The conference venue buzzed with enthusiasm as attendees, including myself, delved into the latest trends and innovations shaping the container orchestration landscape.

A few talks resonated deeply with me. One standout session showcased the next wave of advancements in Kubernetes, unveiling a host of new features poised to revolutionize container orchestration. From enhancements in scalability to improvements in workload management, these developments promise to elevate the capabilities of Kubernetes to new heights. I have in mind in-place resource scaling, which lets you edit a Pod’s resources (CPU and RAM) without having to restart the Pod.
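As a rough illustration of what such an in-place resize request can look like, the Python sketch below builds a patch body that changes the CPU and memory of one container. It assumes a cluster where the InPlacePodVerticalScaling feature is enabled; the `resize_patch` helper, the container name `app`, and the values are hypothetical, and how the patch is submitted (a regular Pod patch or a dedicated subresource) depends on the Kubernetes version.

```python
import json
from typing import Dict


def resize_patch(container: str, cpu: str, memory: str) -> Dict:
    """Build a patch body that updates one container's resources.

    The body is shaped for a strategic merge patch, where entries of
    spec.containers are matched by their name.
    """
    resources = {"cpu": cpu, "memory": memory}
    return {
        "spec": {
            "containers": [
                {
                    "name": container,
                    # Requests and limits are set to the same values here
                    # only to keep the example short.
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ]
        }
    }


patch = resize_patch("app", "800m", "512Mi")
print(json.dumps(patch, indent=2))
```

The point is that only the `resources` stanza changes; the Pod keeps its identity instead of being deleted and recreated.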

Another highlight was the focus on observability, a topic near and dear to my heart as an SRE on the observability team. Several sessions explored the burgeoning landscape of observability tools tailored specifically for Kubernetes environments. From distributed tracing to metric collection, these tools offer unprecedented insights into the performance and health of Kubernetes clusters, empowering SREs like myself to ensure the reliability and efficiency of our infrastructure.

As I reflect on the wealth of knowledge gained at KubeCon Paris, I’m excited to bring these insights back to our observability team. The new features unveiled at the conference will undoubtedly shape our approach to upcoming projects, guiding our efforts to enhance the observability of our systems and applications.

Sennen Ekouaghe

I really enjoyed KubeCon, especially since it was my first-ever conference experience. It provided a fantastic opportunity to engage with vendors and learn about their work, which could be relevant to our team. Of course, the talks were a highlight as well. During the event, I had a productive discussion at one of the booths with an individual who introduced me to a new configuration language called Nickel. However, the quality of the talks varied, as some lacked technical depth, failed to deliver on their promised content, or lacked actionable takeaways for other companies’ contexts.

In my interactions with vendors, two topics stood out as potential areas for internal exploration:

  1. Facilitating Troubleshooting for Failing Workloads (esp. pods) in Kubernetes Clusters: This is an important aspect, and finding effective solutions could greatly enhance our operations.
  2. Continuous Profiling of Applications: This could prove valuable for popular languages in Criteo, which lack the robust support available for .NET.

In addition to some of the talks mentioned by my teammates, I enjoyed the following:

From CNI Zero to CNI Hero: A Kubernetes Networking Tutorial Using CNI (slides + video)
This talk provided an excellent introduction to the role of Container Network Interface (CNI). It demonstrated how straightforward it can be to create a CNI plugin and how to troubleshoot common issues. The hands-on approach made it a great resource for understanding CNI.

Building a Large-Scale Multi-Cloud Multi-Region SaaS Platform with Kubernetes Controllers (slides + video)

I appreciated how they decoupled actual clusters from the global management layer. By doing so, they made it easy to recreate clusters based on configurations handled by the management layer.

The discussion covered their choices for the global layer, which drew inspiration from Kubernetes’ controller design. Their solution, backed by a relational store, showcased thoughtful architecture.

Lastly, they addressed the challenge of handling reconciliation data (essentially managing a large number of events) when dealing with cluster restarts.

Overall, KubeCon was an enriching experience, and I look forward to applying the insights gained from these talks in our work.

Flavien Quesnel

This KubeCon was an occasion to gather a lot of raw material on many topics of interest for Criteo.

Regarding sustainability, we want to reduce the footprint of our workloads. This can be achieved by correctly sizing each pod/container, e.g., leveraging the Vertical Pod Autoscaler (either in enforce mode or in recommendation mode), or by dynamically adapting the number of pod replicas to load variations, e.g., relying on Kubernetes Event-driven Autoscaling (KEDA). On the specific topic of reducing the footprint of Kubernetes control planes, it is possible to nest Kubernetes clusters inside Kubernetes clusters via the kcp, vcluster, or Open Cluster Management projects to decrease the number of dedicated nodes.
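To give an idea of how replica counts can follow load, here is a simplified Python sketch of the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler, which event-driven autoscalers such as KEDA typically feed with metrics. The `desired_replicas` helper, its default bounds, and the queue-length numbers are illustrative assumptions; the real controller also applies tolerances and stabilization windows that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Simplified scaling rule: ceil(current * metric / target), then clamp."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical example: 4 replicas, target of 60 queued messages per replica.
scale_up = desired_replicas(4, current_metric=90, target_metric=60)
scale_down = desired_replicas(4, current_metric=20, target_metric=60)
capped = desired_replicas(4, current_metric=300, target_metric=60)
print(scale_up, scale_down, capped)
```

When the observed metric is above the target, the replica count grows proportionally; when load drops, replicas are released, which is where the footprint reduction comes from.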

On hybrid & multi-cloud management, there are multiple general-purpose service mesh implementations like Cilium, but also more specialized frameworks like k8gb for load balancing over geographically dispersed Kubernetes clusters.

For privacy aspects, the Confidential Containers project looks promising for protecting data while it is being processed.

Finally, there were many talks on raising the level of security by defining and enforcing governance rules (e.g., with Kyverno), hardening configurations, deploying threat detection stacks, or going even further with honeypots (e.g., KWOK, Kubernetes WithOut Kubelet) to observe attackers’ activity.

Now it is time to digest all of this, discuss it with stakeholders, and plan & prioritize future work.

Jonathan Amiez

I mostly focused on Network and Operations tracks as they relate to my day-to-day work.

I found “No ‘Soup’ for You! Enforcing Network Policies for Host Processes via eBPF” particularly interesting. The speaker discussed how modern container networking solutions like Cilium enable efficient manipulation and control of the data path in Kubernetes clusters. However, these features do not apply to containers running on the host network or plain host processes, as they are all grouped under a single identity. The talk focused on demonstrating how to assign Cilium identities to host network namespaces, allowing for individual control over their traffic.

Two other talks that caught my attention were “Connecting Millions of Containers Spanning Dozens of Clusters” by Datadog and “CRD Vs Dedicated etcd as Storage Backend: Lessons from Taming High Churn Clusters” by Isovalent. The first talk delved into strategies for designing inter-cluster traffic with minimal layers in scenarios where multiple codependent clusters are running. The second talk explored the challenges of scaling Cilium on large Kubernetes clusters. These topics align with our current focus at Criteo, making these insights particularly valuable.

Aside from attending talks, I also had the opportunity to engage in discussions with representatives from various projects and companies. These interactions have been invaluable for feeding into our ongoing efforts to enhance our infrastructure automation and reliability.

Xavier Milliard

You can meet three types of people at KubeCon:

Hardcore geeks (or at least technical people)

How to identify them
They wear last year’s conference t-shirt.

How to engage in conversation
Go straight to them and ask, with a pensive look: “How can we tune the scheduler to optimize startup time in the context of an overloaded cluster?”

There was a wide range of technical talks. The first difficulty was prioritizing the talks you wanted to attend, as so much was happening at the same time. Two of the most interesting talks were:

I was a bit skeptical about WASM, but listening to the talks and chatting with the audience changed my understanding and raised my interest. Now, it’s time to consider whether we can apply some of those approaches to our use cases.

End users

How to identify them
They generally carry a bag full of goodies.

How to engage in conversation
It is easier while waiting in the queue for coffee or the restroom. You can ask what company they work for, which country they come from, or if they know when food will be served.

Another highlight for me was talking with end users, either by randomly bumping into someone or at the Criteo booth. We met several companies with architectures different from ours but with the same kinds of questions and problems when trying to serve their customers.

Salespeople

How to identify them
They have an “I love AI” tattoo on their forearm, surreptitiously scan your badge, and make loud claims like “A recent customer has reduced (time to market/resource cost) by more than 78%, just by using our product!”

How to engage in conversation
Don’t 😉

Navigating through over-enthusiastic salespeople pitching their products was a funny exercise.

More seriously, I was afraid of hearing too much talk along the lines of “Kubernetes will solve all your problems,” but in fact, the message was more “Think carefully before starting something.”

It is nice to see that Kubernetes has generated a large ecosystem to propose various solutions to almost all use cases users can imagine.

Thomas Langé

This was my second year at KubeCon, and I was still pleasantly surprised by the diversity of talks and the number of people gathered in one place to openly discuss this stack.

I enjoyed exploring the various booths, discovering new software and open-source projects, engaging in discussions with maintainers, and exchanging ideas on problem-solving approaches on Kubernetes.

I obviously also attended several talks, as many of them resonated with the challenges faced by my team. Criteo is moving to adopt Cilium as the CNI on its Kubernetes clusters, which raises many questions about the overall architecture decisions needed to ensure the proper scaling of our stacks. In this context, CRD Vs Dedicated Etcd As Storage Backend [for Cilium] and Connecting Millions Of Containers Spanning Dozens Of Clusters were particularly insightful presentations.

As always, I left the CloudNative event brimming with countless ideas and am excited to continue my journey in this ecosystem and build it at Criteo.

Amine Falek

It was the first KubeCon I have ever attended, and the scale of the event was impressive. I enjoyed that the presentations were organized into tracks, which made it easier to select the talks I wanted to attend, mainly focusing on performance optimizations, security, and overall troubleshooting.

Some of the presentations I ended up attending were:

  • Tutorial chaos unleashed workshop: interesting interactive troubleshooting session with the audience, but the host had multiple unexpected live issues with running their experiments.
  • Beyond Default: Harnessing CPU Affinity for Enhanced Performance Across Your Workload Portfolio: great talk with in-depth details and results on NRI balloons policy configuration.
  • Choose your own adventure: The struggle for security: the best interactive presentation I attended, very engaging content about configuring a production-ready use-case to achieve a secure deployment.
  • How to save millions over years: interesting feedback from BlackRock’s experience of managing a cloud deployment to reduce wasted resources. The talk was, however, a bit clickbait, as the measured cost reduction was only theoretical and the computations estimating the cost were very basic.
  • Tutorial CTF (Capture The Flag): great workshop, we had a live cluster setup and the goal was to infiltrate the cluster taking advantage of its configuration weaknesses to retrieve a secret payload.

The overall event was good, but I took note of the following concerns:

  1. Most of the content was heavily marketing-oriented instead of technical.
  2. All the presentations I attended seemed to leverage cloud deployments (EKS usually). I could not find any company/presentation with an on-premise use case similar to Criteo’s.

As a takeaway, I learned quite a lot about Kubernetes, even though I would have enjoyed more in-depth technical presentations. I learned what other companies are using it for and their current challenges/solutions, which helped me steer my own learning/research on the topic.

Bruno Bianchi

For my first experience at KubeCon, I was pleasantly surprised at how huge this event is and how many sponsors are involved in the realm of Kubernetes.

I strolled among the booths, discovering the impressive number of tools developed around Kubernetes, and I attended several talks. A common thread emerged: AI is and will be everywhere, helping with cluster management as well as detecting and preventing maintenance issues. AI even impacts Moore’s law by changing its multiplying factor to 10 every 18 months, according to the speaker. It also comes with the challenge of integrating and managing GPUs across Kubernetes clusters.

In the domain of security and authentication, solutions are being developed integrating passkeys, making authentication faster than ever.

The Rust language is becoming more and more popular, solving the pain points of many current languages with a strong type system and memory safety enforced at compile time.

Dmitry Koshelev

It was my first KubeCon event, and I would like to highlight two talks I found interesting and well-presented:

Michaël Sanchez

This was my first experience attending a KubeCon event, and I must confess, I was taken aback by its sheer scale! Thousands of individuals congregated to exchange ideas, share experiences, and showcase projects within the CloudNative community. During the event, I observed several emerging trends within the ecosystem:

⚙️ There’s a noticeable convergence between Kubernetes and ML/Big Data workflows. Many discussions revolved around Spark running on Kubernetes as the preferred platform for ML workflows. This was particularly interesting as it aligns with Criteo’s strategy towards greater disaggregation between our data storage layer (HDFS), compute engine (Spark), and resource manager (YARN) in our datacenters.

🤖 The influence of Gen-AI is pervasive, and KubeCon was no exception. Numerous presentations explored efficient GPU resource management within Kubernetes, probably reflecting companies’ concerns about managing monthly cloud expenses 😃

🔍 The combination of Kubernetes and WebAssembly (Wasm) holds significant promise, offering benefits such as smaller image sizes and faster startup times. This is a development worth monitoring closely in the near future. Is it possible to run a low-latency Real-Time Bidding service as a Wasm application?

To wrap up, KubeCon + CloudNativeCon Europe 2024 provided an incredible experience and valuable opportunity for our teams to engage with numerous experts and gain extensive knowledge through the attended conferences and workshops.

Some of our Criteos at the booth

We extend our gratitude to the organizers for their outstanding work in orchestrating this expansive event. The seamless experience provided for both participants and sponsors is truly commendable. It has been an honor to be a part of this event, and we eagerly anticipate future editions.
