IT project delivery: How we made a two-year cloud migration at idealo successful

Honey Feelisch
Published in idealo Tech Blog
9 min read · Jan 24, 2024
The five phases: More of a rollercoaster than a linear process. Source: project-management.com

If you’re dealing with IT projects that need to be rolled out efficiently towards a fixed goal, this article is for you.

I was Project Manager for idealo’s AWS Cloud Migration in 2022–23. Many of the things my colleagues and I did worked extremely well, and we closed the project on budget and on time. What we dealt with can be abstracted to the general challenges that occur when over 300 people are asked to reach an ambitious goal within two years. In this article, I’m going to share some of the lessons learned from a project perspective.

To help you understand our decisions better, I’ll start with background information about idealo. If you’re not interested, you can skip directly to the “what worked well” part.

More on idealo’s cloud history

  • In 2018, idealo made its first steps with a Data Lake on AWS.
  • During 2018–2021, some cloud-based solutions were tried out, and more knowledge about the cloud was gained.
  • In 2021, idealo’s leadership decided that the whole product infrastructure should be cloud-based by the end of 2023.
  • During 2022–23, idealo’s running offer-backend, Website and App for users and merchants in six countries were migrated into the AWS cloud by 36 product teams.

idealo’s mission as one of Europe’s largest price comparison platforms is to always show the best offer with the best price to its users. Our offer infrastructure alone processes around 98 MB of offers per second, which puts idealo in the top category of German companies running data processing on AWS.

How idealo’s product teams are set up

In our Product & Tech department, we work in partly cross-functional product teams (Product Owner, Engineers, Team Lead, optional: Analyst, optional: Designer). “You build it you run it” is a lived reality in these teams, up to on-call for everything they own. For this article, I will differentiate between platform teams and application teams:

Customer-facing applications at idealo are, e.g., our Price Alert or our Wishlist. Internal applications are, e.g., our Offer Recognition Service or our Offer Analytics Service.

Our platform teams provide a platform layer with partially managed platform products. While the application teams migrated around 100 applications into the cloud, the platform teams built up a new platform layer in the cloud and maintained the old one in parallel.

Here we go — What worked extremely well for our cloud migration project

Planning Phase:

1. Clarify with upper management what the vision/priority is:

That gave me an understanding of what to spend my energy on as Project Manager and which goals to set for the project: timing, optimization, structures, etc. The questions were: What do we prioritize? Speed over costs? Speed over optimized infrastructure? To dig one level deeper, I asked questions with opposing options as much as possible, e.g.: In a worst-case scenario, would we rather hire externals to hold the timeline or extend the timeline? What impact would either of them have on our business?
For us, the general priority was to meet the timeline. The faster we would be in the cloud, the better our complete picture of the forecasted budget would be, and the faster we would experiment and learn in the real environment instead of drafting concepts.

2. Have a clear sponsor for the project:

It was crucial for me to have an official go-to-person at the top management level who could make personnel or budget decisions. In my case, it was easy because it’s the person who initiated the cloud migration and co-owns the Product & Tech department, our CTO (Hello Andreas :-)).

3. Form a project team:

The CTO and I formed a project team together with a Platform Domain Lead and the PO of the Cloud Shuttle Team (more on that below). We met biweekly to discuss open questions and distribute work. This worked out extremely well because, in addition to involving other stakeholders (e.g. Finance, Security, Legal, leadership round), we combined different skills and hierarchy levels in the same team. We worked on progress monitoring, personnel questions, budget questions, central technical issues, and large blockers and their solutions. We were able to solve a lot of problems within the project team. This setup was mission-critical, and I would use it again anytime.

4. Form a shuttle team of experts:

Based on suggestions from AWS, idealo formed a so-called Cloud Shuttle Team before the project started. The team consisted of very senior Engineers from our application teams and an Architect. The mission was to enable idealo’s move into the cloud by acting as internal tech consultants. What was crucial for us was the commitment from the members of the Shuttle Team and their management to have them work as consultants full-time. Together with other platform teams, they explored new cloud solutions, dug into the nitty-gritty details when something didn’t work for the tenth time, established new formats like a weekly Demo between Engineers, and defined guidelines for how to develop software in the cloud. To summarize: The Cloud Shuttle Team was also mission-critical for the success of the project.

5. Be pragmatic about defining the vision/long-term result:

Long timelines are hard to plan and often lead to a long road of theoretical discussions about what the team, department, or company envisions as the end result and how to get there. We learned that we can be more pragmatic about that. The Cloud Shuttle Team had already established best practices and principles together with other application teams, e.g., about the IT security design or our account structures. Within a few weeks, right before the project started, we had a first draft of a Cloud Operating Model (how platform and application teams will work together) plus a rough vision statement that explained the positive impact we wanted to have by migrating into the cloud. This was enough to start with and to have basic agreements between the participants throughout the project.

Execution Phase:

1. Set very concrete, quantifiable goals that you can track the hell out of:

Next to speed, we valued budget and stability highly. Our goals were structured as follows:

  • Objective: 2023 is going to be the first Black Friday season in which idealo completely runs in the cloud while offering a stable experience for our customers and partners.
  • KR1: We migrate 100% of all applications into AWS by October 1, 2023.
  • KR2: The average time to recover (TTR) for AWS-related incidents of levels 1 & 2 stays below 4h until December 1, 2023.
  • KR3: idealo is staying within the allocated yearly AWS budget.

KR1 and KR2 were tracked via Jira tickets. We tracked KR3 via a monthly cost forecast (FC) per team. The cloud costs were visualized in Tableau by our Finance department, with which we and our Product & Tech leadership collaborated closely.

This tracking combination helped us to always know where the project was compared to our target.
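As an illustration of how KR1 and KR2 can be computed from ticket data, here is a minimal Python sketch. It is not our actual Jira setup: the `MigrationTicket` and `Incident` records and their fields are hypothetical stand-ins for whatever your ticket export provides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class MigrationTicket:
    # Hypothetical record: one ticket per application to migrate.
    application: str
    migrated: bool


@dataclass
class Incident:
    # Hypothetical record: one ticket per AWS-related incident.
    level: int
    opened: datetime
    resolved: datetime


def migration_progress(tickets):
    """KR1: share of applications already migrated, in percent."""
    if not tickets:
        return 0.0
    done = sum(1 for t in tickets if t.migrated)
    return 100.0 * done / len(tickets)


def average_ttr(incidents, levels=(1, 2)):
    """KR2: mean time to recover for incidents of the given levels."""
    relevant = [i for i in incidents if i.level in levels]
    if not relevant:
        return timedelta(0)
    total = sum((i.resolved - i.opened for i in relevant), timedelta(0))
    return total / len(relevant)


def kr2_on_track(incidents, limit=timedelta(hours=4)):
    """True while the average TTR stays below the agreed limit."""
    return average_ttr(incidents) < limit
```

Running this on a regular export gives a single progress number and a yes/no answer per key result, which is all a status update needs.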

2. Find ways to constantly lower time/costs:

When the project is running, there are only a few windows to make central decisions with high impact. I believe that every IT project has such possibilities; you only need to find them (fast enough).

On the technical side, we made such a decision after the Cloud Shuttle Team came up with a concept to mirror data into the cloud first (our Kafka data, which makes up around 80%) before starting to migrate our applications. With the mirrored data already available to all teams in the new cloud environment, they didn’t have to wait for their data producers to migrate first. On paper, we paid IT costs for the mirroring, but by decoupling most of our teams’ migration activities from each other, we saved a huge amount of transaction costs and reduced the blast radius of costly mistakes.

On the financial side, our Finance department used the perks that AWS offers its partners as fully as possible (e.g., upfront commitments like Savings Plans or their migration tagging program). There are always onboarding or retention perks that providers offer, and we did well by going for them early and with a good enough understanding of our future needs. For year-long service commitments, good enough in some cases meant that even a deviation of up to 30% from our forecast budget would still save us costs through the lowered price.
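To make the “good enough” reasoning concrete, here is a simplified break-even sketch in Python. It assumes a flat discount on a commitment sized to the forecast and on-demand pricing for anything above it. Real commitment products are priced in more detail, and the 35% discount in the usage note is a made-up number, not our actual rate.

```python
def on_demand_cost(forecast, usage_ratio):
    """Cost without commitment: you pay only for what you actually use."""
    return forecast * usage_ratio


def committed_cost(forecast, usage_ratio, discount):
    """Cost with a commitment sized to the forecast.

    The discounted commitment is paid in full even if usage falls short;
    usage above the forecast is billed at the on-demand price.
    """
    commitment = forecast * (1.0 - discount)
    overage = forecast * max(0.0, usage_ratio - 1.0)
    return commitment + overage


def commitment_saves_money(forecast, usage_ratio, discount):
    """True if committing is cheaper than staying on demand."""
    committed = committed_cost(forecast, usage_ratio, discount)
    return committed < on_demand_cost(forecast, usage_ratio)


def break_even_usage_ratio(discount):
    """Lowest usage (as a share of forecast) at which committing breaks even."""
    return 1.0 - discount
```

With a hypothetical discount of 35%, `commitment_saves_money(100.0, 0.7, 0.35)` is true: using only 70% of the forecast still costs less than paying on demand. In this simplified model, the tolerable shortfall equals the discount.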

End Phase:

1. Set a point of no return at around ¾ into the project:

The last 25% is, in my experience, the make-or-break phase of an IT project. There is nothing major that can be changed at that point. You base everything on the groundwork that you have laid. If you make it that far and have a slight feeling that everything can work out, that’s all you need.

It was important for us to create an atmosphere of: This is it; that’s the last sprint. We scheduled slots in All-Hands, increased email communication, and sought close collaboration with individual teams to resolve issues faster. We called it the point-of-no-return stream and set a stream goal of 60% migration progress that each team had to reach before ¾ into the timeline.
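A check like this is easy to automate. The following Python sketch computes the point-of-no-return date and lists the teams still below the stream goal. The function names, team names, and dates are hypothetical; only the 60% goal and the ¾ mark come from our project.

```python
from datetime import date, timedelta


def point_of_no_return(start, end, fraction=0.75):
    """Date at which the given fraction of the timeline has passed."""
    elapsed_days = round((end - start).days * fraction)
    return start + timedelta(days=elapsed_days)


def teams_below_goal(progress_by_team, goal=60.0):
    """Teams whose migration progress (in percent) is below the goal."""
    return sorted(
        team for team, percent in progress_by_team.items() if percent < goal
    )


# Hypothetical project dates and per-team progress values.
start = date(2022, 1, 1)
end = start + timedelta(days=400)
print(point_of_no_return(start, end))
print(teams_below_goal({"team-a": 80.0, "team-b": 45.0, "team-c": 60.0}))
```

The list of teams below the goal is where the close collaboration should go first.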

Overall:

1. Build on processes that already exist — don’t reinvent the wheel:

One example for us was idealo’s well-established incident process. The incident process works with standardized Jira tickets, automated notifications, guidelines for teams, incident managers, and post-mortem events. It came into play during the go-live phases of our application teams. It was unclear at the beginning how much uncertainty the teams were willing to accept for their go-lives, especially for those happening late in the project. Since speed was one of our main drivers, we moved forward with medium uncertainty at the end, knowing that the incident process would ensure instant problem-solving when needed. Our KR2 was an additional commitment to ensure fast reaction times during incidents.

2. Work very closely with Account Partners:

Members of our project team met on a biweekly basis with our AWS Account Partners. Sometimes we pulled their IT Architects into our work meetings to have them challenge us with an outsider’s perspective. Our teams also used a dedicated MS Teams support channel with them, in addition to the support tickets that AWS uses to organize issue handling. We did the same with our other third-party providers that have large stakes in our infrastructure.

3. Have one-directional feedback loops with the main project participants:

We used anonymous end-of-year surveys, accompanied by a Retro after the first year and occasional interviews with colleagues from all levels, to get a feel for the “atmosphere” of the project. The insights were a game changer for us. A lot of the setup described in this article came to life after the first year, thanks to all the feedback.

What could have worked, but we didn’t do it

Don’t reinvent the wheel, part 2 — tool decisions:

For some reason, every IT project that I’ve been part of includes at least one new tool purchase. The discussions that I’ve been part of mostly revolved around the technical and organizational implications of introducing new tools. This time, as with any other company that has migrated into the cloud, we were confronted with a series of platform tool decisions (for computing, observability, data storage, and data processing). We didn’t know how much standardization made sense for our teams. For some of the categories, we decided too late and had to let teams try out different tools to move forward with their migration. A few months in, we faced the truth: There are companies on the market that are comparable to us in technology and size, and all of them had made similar decisions on where to unify their tool landscape and where not. So, we would have been more efficient if we had spent time understanding the decisions of platform departments at other companies.

Have more conversations about the vision:

Although we had a storyline for a future state, we could have gone deeper with a second draft during the project, as soon as every team had experienced the new environment for a couple of weeks. With that, we would have strengthened the general direction for the years after the migration. This is something to be done after the project.

There’s more, but I’m valuing your time here ;-)
Thanks for reading!

If you want to share your best practices for project delivery, comment below, and let’s discuss!

Do you love agile product development? Have a look at our vacancies.
