Skyscanner and AWS: cloudy, with a chance of lessons learnt

Ashley Sole knows a thing or two about moving to the cloud. In this piece he shares his hard-won knowledge from Skyscanner’s recent project: the migration of 300 services, from an estate comprising five data centres and 7,000 VMs, to AWS.

Skyscanner Engineering
Jul 11
Empty racks and happy faces — Skyscanner team members celebrating the completion of a mammoth AWS migration project

Here at Skyscanner, we completed our migration to the cloud (AWS) at the end of 2018. This concluded a three-year journey of moving Skyscanner from on-premises data centres to being fully hosted in the cloud. For the people working on the migrations, it has at times felt like a never-ending journey. We are immensely proud of our achievement: cloud migrations are not simple, and we’ve learned a lot along the way. Many other companies have been on a journey to the cloud for far longer than us and have made less progress.

At the end of 2018, the hardware hosting Skyscanner in the data centres was approaching seven years old, so at the start of last year we had to make a decision: would we reinvest in data centre hardware at a huge cost, or execute an aggressive “all-in” cloud strategy? We chose to go all in, so we rolled up our sleeves and got stuck in.

What was involved?

How did we approach it?

You build it, you run it
- Skyscanner engineering principle

Skyscanner operates a “you build it, you run it” approach, so each team was responsible for formulating and executing a plan to migrate their own services. There was no central “migration team” responsible for migrating software; teams know their services better than anyone else, so they were responsible for deciding on and implementing the most appropriate migration for them. For this approach to succeed, communication between teams and clear expectations about deadlines were critical.

Throughout this process, we regularly referenced the 6 Strategies for Migrating Applications to the Cloud. Some of our migrations were rehost and some were replatform efforts, but in large part our migrations were cloud-native rewrites. The most dramatic piece of software modernisation we carried out for the migration was rewriting the core Flights stack, turning a .NET, SQL Server-backed monolith into stateless Java microservices running in Kubernetes on AWS spot instances. This was a huge undertaking, but the gains in software modernisation, resiliency, ability to scale and cost optimisation are substantial.

Every team was in charge of their own migration, and most teams’ Plan A was a cloud-native rewrite. But we soon learned that for some migrations this was either not feasible or not appropriate. We used project roadmaps to define milestones and deadlines, along with a tonne of communication between teams to make sure everyone was clear about what was expected. This made conversations about executing a rehost instead of a rewrite much easier.

What did we learn?

Lesson 1 — Rewriting software for cloud takes a very, very long time

“It always takes longer than you expect, even when you take into account Hofstadter’s Law.” — Hofstadter’s Law

Lesson 2 — Tech debt is necessary

The biggest piece of tech debt was our SQL Server estate, which we rehosted to AWS. There’s nothing inherently wrong with SQL Server; it is still the core of Skyscanner services. It is, however, a monolithic database that, in order to be modernised, will need to be carved up into different technologies appropriate for their usage. For example, certain tables in the SQL DB rarely change, so would fit well with a static file store like S3; other parts change more frequently, so a caching solution may be appropriate. It will take many more years to migrate completely, but this was a compromise we had to make to hit our targets.
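To make the “carving up” idea a little more concrete, here is a minimal sketch of sorting tables by how often they change. Everything in it is an assumption for illustration only: the table names, the write counts and the one-write-per-day cut-off are made up, and this is not how the actual analysis was done.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableProfile:
    """Observed usage of one table (hypothetical input, not real data)."""
    name: str
    writes_per_day: float


# Assumed cut-off for "rarely changes"; chosen purely for illustration.
RARELY_CHANGES_WRITES_PER_DAY = 1.0


def suggest_target(table: TableProfile) -> str:
    """Return a rough storage suggestion for a single table."""
    if table.writes_per_day <= RARELY_CHANGES_WRITES_PER_DAY:
        # Rarely changing data could be exported to a static file store such as S3.
        return "static file store"
    # Data that changes more often may be a better fit for a caching solution.
    return "caching solution"


# Made-up example profiles.
tables = [
    TableProfile("table_a", writes_per_day=0.1),
    TableProfile("table_b", writes_per_day=250_000),
]
suggestions = {t.name: suggest_target(t) for t in tables}
```

A real assessment would need more signals than write frequency (read patterns, consistency requirements, data size), so treat this as a starting point for the conversation rather than a decision rule.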

Lesson 3 — Constant planning and excellent communication are key to success

“I have always found that plans are useless, but planning is indispensable.” — DWIGHT D. EISENHOWER

The migration project was run in a truly agile way: there was no “500 page cloud migration strategy”, just lean principles. We could constantly ask ourselves:

  • What’s the current status?
  • Are we on track?
  • What are the blockers?
  • How could we go faster?

Status emails would turn into meetings, which would turn into actions, which would turn into update reports — in the space of hours. This enabled us to move fast and deliver successfully.

What now?

Join Skyscanner, see the world

We’re hiring!

About the Author

If you liked this post, here is another one I wrote about my daughter arriving early and my company actually caring.

Ash Sole, Skyscanner
