How we are managing a container platform: a tale about the past

Published in adidoescode · 4 min read · Jan 26, 2024

On May 10, 2022, we began the migration of our platform’s configuration to a GitOps-based setup. This shift stands as one of the most significant strategic moves made within adidas’ container platform (up to the date of writing this post). While there’s plenty of room to improve, this step is a crucial moment in our technological progress.

Audience

These articles don’t require an in-depth technical background. Whether you’re well-versed in technical aspects or have a business-oriented profile, you’ll find valuable insights within these pages.

We are aware

We recognize that GitOps has been a significant trend in recent years. However, given our extensive operational scale and the unique constraints we navigate, we think it is worth sharing our insights after more than a year of running this in production.

Source: https://trends.google.es/trends/explore?date=2015-03-12%202024-01-01&q=gitops

Context & Scale

adidas has been creating and enhancing its container platform for over five years (since late 2018). adidas isn’t a startup, nor is it even a tech company, but it does leverage technology to enable other teams to work more efficiently. From physical stores to the .com site and even product design teams, everyone uses container-based technology in one way or another. Ensuring the proper functioning of this container platform is critical for adidas to maintain its position as a leader in the fashion industry.

As you might imagine, adidas’ container platform has to be global. Our infrastructure spans from China, through Singapore, across Europe, to North and South America. We operate thousands of (ephemeral) servers in the cloud running containers 24/7. We enable hundreds of teams and thousands of developers worldwide.

Configuration Before GitOps

With dozens of clusters spread globally and the premise of maintaining all configuration as code, the team developed a framework to configure each cluster in more or less granular detail.

This involved having a configuration repository for each cluster and, at the same time, a repository with shared configurations for all clusters.

The ideal world doesn’t exist; each geographical zone has its peculiarities, so we needed to customize the configuration for each cluster and override certain values. To achieve this, we used branches in the configuration repositories.
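To make the override idea concrete, here is a minimal sketch of layering cluster-specific values on top of shared defaults. It is not the framework described above (which relied on branches in the configuration repositories); the function name `merge_config` and all keys and values are hypothetical and chosen only for illustration.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_config(shared: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values in `override` win over `shared`.

    Nested dicts are merged recursively; any other value is replaced.
    """
    merged = deepcopy(shared)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical shared defaults and a hypothetical per-zone override.
shared_defaults = {
    "ingress": {"replicas": 2, "tls": True},
    "logging": {"level": "info"},
}
zone_override = {"ingress": {"replicas": 4}}

cluster_config = merge_config(shared_defaults, zone_override)
```

In this sketch only the overridden value changes for the cluster; everything else is inherited from the shared defaults, and the shared defaults themselves are left untouched.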

Additionally, we have our own developments to integrate different adidas systems into the container platform, versioned in different code repositories.

The continuous delivery system was responsible for pushing these changes to the different clusters, taking into account maintenance windows for critical systems (again, each geographical zone has its own restrictions).
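As an illustration of what respecting maintenance windows can look like, the sketch below checks whether a deployment is allowed for a zone at a given moment. The zone names, window times, and the `can_deploy` function are all assumptions made for this example; they do not describe the actual delivery system.

```python
from datetime import datetime, time, timezone
from typing import Dict, Tuple

# Hypothetical maintenance windows per zone, expressed in UTC as (start, end).
MAINTENANCE_WINDOWS: Dict[str, Tuple[time, time]] = {
    "emea": (time(22, 0), time(4, 0)),   # window crosses midnight
    "apac": (time(14, 0), time(18, 0)),
}


def can_deploy(zone: str, now: datetime) -> bool:
    """Return True if `now` falls inside the zone's maintenance window."""
    start, end = MAINTENANCE_WINDOWS[zone]
    current = now.astimezone(timezone.utc).time()
    if start <= end:
        return start <= current < end
    # Window wraps past midnight, e.g. 22:00 to 04:00.
    return current >= start or current < end
```

A pipeline step could call `can_deploy("emea", datetime.now(timezone.utc))` and postpone the rollout when it returns `False`.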

In summary, we had:

One repository for each cluster.
Several branches in the cluster repositories.
Each branch had its own pipeline to apply the configuration.
A shared configuration repository.
Several branches in this shared configuration repository to override different configurations depending on the environment and/or geographical zone.
Code repositories implementing integrations with internal adidas systems.
A central repository creating deployable packages of these internal developments.

This meant that, when updating a component, we had to modify between 4 and N repositories (where N could reach ~50) and open a separate change request for each, every one with its own review and approval cycle (following the ‘four eyes’ principle).

All of this involved running several (~50) validation and deployment pipelines, which took hours to apply new configurations to all clusters, not to mention maintenance windows. There were times when we couldn’t deploy changes on time in critical environments because the system was too slow, leading to different configurations across environments/clusters.

*Close-up of STRUNG textile. Source: news.adidas*

We had to change. Our objectives were:

Reduce the time spent on operations.
Manage platform clusters more efficiently.
Improve process resilience.
Enhance visibility of configuration status globally and locally.

Preparing for the Next Stage!

Get ready for the next part of our series! In Part Two, we’ll explore details like the GitOps project structure, monorepo setup, validation/diff pipelines, PR comments, dashboards, alerts, and more. Like preparing for the next phase of a game, we’re diving into the core mechanics that drive our technology at adidas. It’s time to explore further!

The views, thoughts, and opinions expressed in the text belong solely to the author, and do not represent the opinion, strategy or goals of the author’s employer, organization, committee or any other group or individual.