Conquering the Summit: Navigating the Parallels Between Scaling the Matterhorn and Building a Cloud-Native Infrastructure

Thibault Dulon
Published in MANGOPAY
11 min read · Oct 4, 2023

🧬 Life has an uncanny way of presenting challenges that mirror our most profound experiences.

🗻 Standing at the base of the formidable Matterhorn, a towering peak that has captivated mountaineers for generations, I couldn’t help but draw parallels between the grueling ascent before me and the monumental task that awaited me back in the corporate realm.

🧑‍🎓 The ascent of the Matterhorn taught me invaluable lessons about perseverance, strategic planning, and the art of conquering seemingly insurmountable obstacles — lessons that I found surprisingly relevant to the demanding endeavor of building a state-of-the-art Cloud Native Infrastructure for our company’s multi-region architecture.

Since you already know my taste for comparing the incomparable, strap on your climbing gear and get ready to uncover the astonishing parallels between conquering the Matterhorn and building an inspiring Cloud Native Infrastructure for our company’s multi-region architecture 💪

Looking at my incoming challenge, Monte Cervino (Matterhorn) in Italy

0. 📝 Goal Definition: Planning and Preparation

You don’t climb a mountain like the Matterhorn without preparation. And you don’t build a whole multi-region infrastructure in the cloud without preparation either. 🤷

🎯 So once the objective is clear, you need to answer a bunch of questions like:

  • 🚩 What are the key milestones that separate you from the objective?
  • 💪 What are the skills you require to achieve your objective?
  • 🧑‍🏫 Can you be accompanied by experts?
  • ⏰ Is there any timeline in play? Deadline? Specific favorable period?

For example, here are the answers for my Matterhorn expedition:

  • 🏃 I needed to climb several mountains beforehand to acquire mountaineering techniques and knowledge. I also needed to train and set several physical milestones, like running 10km in under a certain time, completing runs with more and more elevation gain, etc.
  • 🧗 I needed to know a lot of things: how to climb on rock, how to climb on ice, how to deal with altitude, how to walk with crampons, how to endure fatigue, but also a lot of rope techniques and manipulations, etc.
  • 🦾 Not only did I want to climb the Matterhorn, I also wanted to climb it the old alpine way (from the town in the valley to the top in one go) and in the shortest time possible. So I decided to hire the best guide I know, Matheo Jacquemoud (I’ll let you google him 😉)
  • ☀️ Matterhorn season runs from July to August

Do you see the point? Such big objectives can be overwhelming, but answering these questions gives you a clearer picture and a first high-level plan. Don’t hesitate to brainstorm, and take the time to organize all the ideas properly before jumping in.

When I started leading the Cloud transition and tech modernization at Mangopay, I needed a plan, so I first answered these same questions:

  • 🗓️ I needed to study new technologies deeply: how Kubernetes works, how AWS works, what the best practices are, etc. I couldn’t start without at least 20% of the answers (as Mike Horn says, I’d learn the other 80% along the way 😉). Then I needed to develop a first proof of concept to ensure the viability of the plan, build a strong team, create the infrastructure, migrate the first app, validate load testing, etc. These milestones need to mean something without being too granular at first.
  • 🧑‍🎓 When dealing with the cloud, you start dealing with the DevOps world and platform engineering, so the skill requirements are endless: Kubernetes, AWS, Cloud architecture, Software architecture, container management, observability, testing, networking, chaos engineering, resiliency, redundancy, etc. Training and self-learning are crucial. For example, Amazon helped us build a strong learning plan for the teams during the first months of the journey, and we also opened Udemy access for the teams.
  • 🎇 You need experts! Don’t be ashamed of that. I hired a strong team to help us with this enormous challenge, and I set up regular meetings with several Solution Architects from Amazon. They have the experience to save you time and help you build fast. They will help you avoid most of the mistakes you would have made as a beginner.
  • 🥵 1 year… Yes! Just one year. With no internal knowledge, we needed to build everything from scratch and upskill the whole company in one year.

1. 🚠 The Climb Begins: The Matterhorn’s First Steps and Project Inception

Much like embarking on the Matterhorn expedition, the journey of launching a new Cloud Native Infrastructure began with careful preparation and a clear vision. Just as climbers analyze routes, equipment, and weather conditions, our team meticulously mapped out our architectural blueprint, selecting the optimal technologies, tools, and deployment strategies.

Every step on the mountain mirrored a corresponding decision in the project: determining which cloud services to leverage, configuring high availability, and anticipating potential bottlenecks. Just as the first steps on the Matterhorn set the tone for the climb, our project’s initial phases laid the foundation for a successful ascent toward our cloud-native summit.

After hiring the best team I could imagine and finalizing a first PoC, we started this incredible journey.

🏃 As I took my first steps toward the Matterhorn, the air was filled with excitement and anticipation. The initial hours of the climb felt easy, much like the early stages of embarking on our Cloud Native Infrastructure project. It seemed as if we were making swift progress, just as those first hours of trekking brought us closer to the mountain.

🦸 At the outset, both journeys appeared to be adventures we could easily conquer. Scaling a mountain or launching a new project can fill you with a sense of invincibility, like the world is at your feet, and success is just around the corner.

😳 However, much like the mountain revealed its true challenge with each passing step, our project unveiled its complexity as we delved deeper. The path to the summit grew steeper, just as our project’s intricacies became more apparent. What initially felt like a straightforward ascent now demanded a profound level of commitment and expertise.

🏔️ As the mountain loomed larger before us, we couldn’t help but draw parallels to the way our project was evolving. The thrill of starting a new endeavor was met with the realization that this was no easy feat. It was a reminder that the journey, whether up a mountain or into the realm of cutting-edge technology, would always test our mettle and push us beyond our comfort zones.

At the company, everything got harder step by step, until at some point everything got complicated very fast:

  • 💥 Other squads started using the new platform to test it and discovered problems
  • 🤯 We suddenly faced a massive demand for access to cloud services across the whole company, without having had enough time to think about security or even to choose all our tools
  • ⛑️ With the first achievements, we understood how much work remained before we had a production-ready platform and applications mature enough to run on it
  • 📈 Not only did we have to work harder on the project, but we also started receiving a growing number of requests from people wanting to understand this new platform and deploy their apps on it

We decided to reorganize a bit: fewer meetings, more focus time, more help from other teams, and a process for handling the “run requests” (questions from other colleagues). Aaand… Obviously, we absorbed the workload.

I remember having a big moment of doubt during the Matterhorn expedition, arriving at Tyndall Peak. The climb is very exposed: 3000 meters of nothing under my feet. My vision is getting blurry, I’m losing focus, and I can feel every tired muscle. I look up, see the path yet to be climbed, and wonder: “Can I make it?”

2. ⛰️ Scaling Heights: Conquering Challenges, Crafting Success

On the Matterhorn, as doubts and exhaustion set in, I pushed forward. Each step became a testament to my resolve, and each obstacle surmounted was a victory. In these moments of adversity, the true essence of the climb revealed itself — the ability to conquer not only the physical challenges but also the mental hurdles.

Similarly, in our Cloud Native journey, as we faced technical roadblocks and questioned the feasibility of our ambitious project, we found our second wind. It was a turning point when we realized that challenges were not roadblocks but stepping stones to innovation and growth.

For example, we knew for a while that we needed a strong API Gateway solution.

We also knew what our functional requirements were, so we tried several solutions. Three, to be transparent.

🤔 But after the third one, we were questioning ourselves. Why does no tool fully meet our needs? Are we doing things properly?

It required perseverance and research to finally find one last technology that fit our needs best, and then we were back on track 💪
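
One way to keep this kind of tool evaluation honest is to write the functional requirements down and check every candidate against them. Here is a minimal sketch in Python. The requirement names and the four candidates (gateway_a to gateway_d) are purely hypothetical placeholders for illustration, not the actual tools or criteria from this project:

```python
from __future__ import annotations

# Hypothetical functional requirements, for illustration only.
REQUIREMENTS = {"rate_limiting", "jwt_auth", "multi_region", "canary_routing"}

# Hypothetical candidates and the requirements each one covers.
CANDIDATES = {
    "gateway_a": {"rate_limiting", "jwt_auth"},
    "gateway_b": {"rate_limiting", "multi_region", "canary_routing"},
    "gateway_c": {"jwt_auth", "multi_region", "canary_routing"},
    "gateway_d": {"rate_limiting", "jwt_auth", "multi_region", "canary_routing"},
}


def missing_requirements(supported: set[str]) -> set[str]:
    """Return the requirements that a candidate does not cover."""
    return REQUIREMENTS - supported


def rank_candidates(candidates: dict[str, set[str]]) -> list[tuple[str, set[str]]]:
    """Sort candidates by number of unmet requirements, fewest first.

    Ties are broken alphabetically by name so the order is deterministic.
    """
    gaps = [(name, missing_requirements(features)) for name, features in candidates.items()]
    return sorted(gaps, key=lambda item: (len(item[1]), item[0]))


if __name__ == "__main__":
    for name, gaps in rank_candidates(CANDIDATES):
        status = "covers everything" if not gaps else "missing: " + ", ".join(sorted(gaps))
        print(f"{name}: {status}")
```

Seeing the gaps side by side makes it easier to tell whether a tool is really missing something or whether a requirement itself deserves to be challenged.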

The deadlines were also putting a lot of pressure on us, and we sometimes had to accept taking shortcuts and accumulating technical debt that we would have to pay back later. So we created an epic in Jira to collect all our debt and pay it back after the launch.

One of the nights we worked together switching traffic to the first cloud infrastructure pieces

We had two parallel strategies at Mangopay. The first one was a quick relocation to move our Virtual Machines from on-premise to AWS. It allowed us to stop relying on our old infrastructure.

The second and longer strategy was to modernize the whole technical stack and rely more on the native services that AWS offers.

Here are a couple of milestones that gave us strength:

  • We managed to create, in good time, the whole infrastructure for our environments using Terraform (network, queuing system, database cluster, caching system, Kubernetes cluster, logging system, etc.)
  • We set up quick and safe communication between accounts and regions thanks to, among other things, AWS Transit Gateway
  • We successfully deployed several new microservices on this infrastructure
  • We also had to set up a whole new CI and CD pipeline. We took the opportunity to introduce CI as code in the company
  • We finally managed to relocate our whole infrastructure to the cloud (as the first step of our journey) after 3 complicated nights of hard and collaborative work

3. 🚀 The Summit Push: Reaching the Pinnacle in the Clouds

🏔️ Reaching the summit of the Matterhorn was a moment of unparalleled triumph: the culmination of preparation, effort, and determination. After so much time preparing for that adventure, so many nights dreaming of that moment, and so much personal investment, finally getting there was a moment of strong emotion and pride.

At Matterhorn summit, 4478m

🚀 At Mangopay we’re a few weeks away from being fully operational on our whole tech stack modernization. The missing parts seem tiny compared to what we have already achieved, but we can’t lose focus and unwind yet. One more month to go, more or less, but it is already starting to feel great.

Here are some of the things we have already achieved. One year ago:

  • We had almost no microservices. Now we have plenty of them
  • Apps were running on IIS servers. Now they’re running on Kubernetes
  • We had only 1 app containerized. Now every single app is containerized
  • Our Continuous Integration pipeline was manual through a UI. Now it is automated and defined as code
  • Everything was on-premise. Now everything is on AWS
  • All apps were using an expensive and difficult-to-scale DBMS. Now, except for 1 app, everything has been migrated to a cheaper and better DBMS
  • Our monolith was the one routing requests to our other microservices. Now we have an API Gateway
  • We were using a deprecated way of authenticating our customers. Now we’re using a high-standard and agnostic solution
  • Our infrastructure and software architecture were single-tenant. Now we’re able to be multi-tenant

We also changed our VPN solution, our monitoring solution, and our way of collecting logs, and we implemented OpenTelemetry to collect metrics and traces as well…

Well, you get my point. This one-year journey has been full of changes and… So far, so good!

4. 🏞️ The Descent: Learning from the Journey and Ensuring Long-Term Sustainability

Descending from the Matterhorn was a time for reflection — a chance to learn from the journey and share insights with my guide.

What did I do well? What are the strong foundations I can build on?

What did I do wrong? What should I work on to be better in the future?

Most people think that once you reach the top, the journey is over. But the truth is, it’s not finished until you’re back home.

It’s exactly the same for our Cloud journey. After the official launch, we’ll probably have months of work to adjust, scale, fix, and improve.

The path is long to reach a multi-regional architecture that is fault-tolerant, redundant, scalable, reliable, and available. But the north star is shining bright ahead of us, showing us the direction of Mangopay’s future tech.

A couple of things that worked for us during this adventure:

🎏 The Cloud Migration was divided into streams (API gateway setup and implementation, new high-standard authorization system, big framework update of all our apps, etc.), so we naturally split our team by stream. Each stream had an owner in the team, and each owner was responsible for moving it forward

  • 👍 It gave everyone more responsibility
  • 👍 It made everyone feel ownership of a serious project
  • 👍 It gave me more mental time and space to organize everything
  • 👎 We lost a bit of the collaboration feeling
  • 👎 It made everyone lose track of the other streams a bit

🧙‍♂️ Our way of working has been revisited several times. We were using the Agile methodology with all its ceremonies: Daily, Retro, Planning, Refinement, Tri-amigos, etc. We also had 1 weekly meeting per stream. It took up too much precious time. So we switched to Kanban, removed the refinement and planning ceremonies, kept some weeklies to sync every stream, and removed the rest

  • 👍 It gave us back precious time every week
  • 👍 It accelerated the projects
  • 👍 It didn’t affect the involvement or the time to market
  • 👎 It made everyone lose track of the other streams a bit

📝 We documented everything, and built and maintained roadmaps

  • 👍 It made reporting to upper management easier
  • 👍 It made the remaining path to each milestone clearer
  • 👍 It helped everyone in the team AND outside of the team to upskill
  • 👎 It took us a lot of time

♻️ We stuck to an iterative process for each stream (PoC → MVP → iteration 1, etc.), always with the long-term idea of a multi-tenant architecture in mind

  • 👍 It gave us the possibility to show progress quickly
  • 👍 It gave us a fast feedback loop to improve each stream
  • 👍 It ensured we would not have to revisit the architecture later
  • 👎 Nope, no bad point here really 🤷

What worked for us won’t necessarily work for you, and it wasn’t necessarily the best solution either. But since the descent is always a good moment to take a step back and see what worked and what didn’t, I felt it was useful to talk about it here 🙂

If you’re reading these lines, thanks for reading this long story! I hope it was entertaining and gave you some ideas!

See you in the next story 😉
