Scaling our infrastructure at TransferWise: how we make it work

Feidhlim O'Neill
Sep 16 · 6 min read

At TransferWise, we’re proud of our autonomous team culture and the way we work. Over the past few years we’ve been growing rapidly. We now have over 60 independent, autonomous teams building and improving our product for our 8 million customers. As our product has scaled, our infrastructure and the team behind it have had to grow and evolve accordingly. This has meant moving from a focus on physical infrastructure to introducing specialised teams to accommodate new requirements of the business.

This blog is about the rise of specialisation at TransferWise and how we make it work. We’ll deep-dive into the upsides and challenges of specialisation, as seen by our Infrastructure team, and how our ways of working, customer focus and mission help us balance the trade-offs.

The last few years have been quite a ride for TransferWise. We’ve grown into a global team of 2200+ employees with over 400 Engineers and now have over 300 microservices running in production. Our autonomous team structure has allowed us to scale our teams, product and infrastructure quickly. But this structure also requires constant support and decision-making, balancing reflection on our foundations with strategic thinking about the future. We need to think through how and what we can scale, what’s culturally important, and which behaviours should be shared and which shouldn’t. And, of course, we always need to keep in mind how our decisions benefit (or hurt) our customers.

Scaling fast = more complexity

As we grow, the way our autonomous teams work together and ship new features is becoming more complex.

We work in a heavily regulated environment so many requirements aren’t optional. We also support a lot more customers, who have different types of needs from our product. We’re operating and expanding into a lot more countries, while building lots of new features. It’s just more complex!

Our technical platform is a key part of how we operate and scale. It helps us ship products quickly, while reducing the burden of managing complex requirements.

This technical platform covers a broad range of shared requirements like messaging, service layer and observability. To design and operate these shared technologies, we’ve built teams around these domains. The teams work out where good enough basic capabilities will do and where we can invest to give us an advantage. They aim to abstract some of the complexity away from our Product teams so those teams can work efficiently.

So, how do we evolve our platform?

Our Product teams own the full life-cycle and decision-making for their areas. This comes with lots of questions and decisions to make about how a feature is designed, built and operated. These decisions often bring new requirements for our Platform teams.

For example:

  • Does our Database team onboard a new technology to support a new TransferWise product?

The answer is, of course, that it depends. But decisions like this are always guided by how worthwhile the trade-offs are and by the long-term benefit to our customers. Teams make these decisions in collaboration.

Looking back, the genesis of the specialist platform teams that support these shared requirements is in one small ‘infra’ focused team. Back in the day, when we ran in our own data centre, our Infrastructure team was very much focused on physical infrastructure. From this beginning we’ve built out the foundational technical platform that our products run on (and migrated to AWS). What used to be a single Infra team has grown into over 12 Platform component teams with over 60 Engineers.

To understand what works for us and where teams can accept a shared approach, we’ve developed an opinionated platform that provides a standard pattern. In short, an opinionated platform is one where we have coded a preferred way of working. It makes that preferred way easy to adopt, but is flexible enough to allow teams to deviate from the standard when needed.
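To make the idea concrete, here is a minimal sketch of "preferred defaults with explicit deviations". It is an illustration only, not our actual platform code: the `ServiceConfig` class, its fields, their values and the `service_config` helper are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical platform defaults; all fields and values are illustrative."""
    metrics_enabled: bool = True
    log_format: str = "json"
    messaging: str = "platform-default"
    replicas: int = 2


# The "opinion": what a team gets if it makes no choices at all.
PLATFORM_DEFAULTS = ServiceConfig()


def service_config(**overrides) -> ServiceConfig:
    """Return the platform defaults with a team's explicit deviations applied.

    Unknown option names raise TypeError, so a deviation has to be a
    deliberate choice about a setting the platform actually knows about.
    """
    return replace(PLATFORM_DEFAULTS, **overrides)


# Most teams take the standard pattern unchanged.
standard = service_config()

# A team with a special requirement deviates on one setting and keeps the rest.
custom = service_config(replicas=5)
```

The design choice in this sketch is that adopting the standard takes no effort, while a deviation is visible in one place and leaves every other default intact.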

Creating teams around these shared components means we can manage more scale and business complexity and provide more polished solutions. Specialisation is essential. Observability, monitoring and alerting is a good example. We’ve developed libraries that integrate into our observability stack and provide standard ways of exposing metrics, as well as monitoring and alerting capabilities that build on these metrics. This means the ‘out of the box’ experience is good for teams for all the basic things they need to get their new product live. But if a Product team has a specific requirement for monitoring or tooling, we discuss together how to accommodate it, and may decide the best approach is for them to build their own capabilities.
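As a rough sketch of what a shared observability helper can look like, the example below shows a standard way of recording metrics and an alert rule built on top of them. Everything in it is an assumption made for illustration: the `Metrics` class, the metric names, the service-prefix naming convention and the 5% threshold are hypothetical and do not describe our real libraries.

```python
class Metrics:
    """Hypothetical shared metrics helper; not an actual TransferWise library."""

    def __init__(self, service: str):
        self.service = service
        self._counters: dict[str, int] = {}

    def _key(self, name: str) -> str:
        # Assumed convention: every metric is prefixed with the owning service.
        return f"{self.service}.{name}"

    def increment(self, name: str, amount: int = 1) -> None:
        key = self._key(name)
        self._counters[key] = self._counters.get(key, 0) + amount

    def value(self, name: str) -> int:
        return self._counters.get(self._key(name), 0)


def error_rate_alert(metrics: Metrics, threshold: float = 0.05) -> bool:
    """Return True when the share of failed requests exceeds the threshold."""
    total = metrics.value("requests_total")
    failed = metrics.value("requests_failed")
    return total > 0 and failed / total > threshold


# A product team records metrics through the shared helper...
metrics = Metrics("payments")
metrics.increment("requests_total", 100)
metrics.increment("requests_failed", 10)

# ...and gets the standard alert rule without writing its own.
should_alert = error_rate_alert(metrics)
```

Because every service records the same metric names in the same way, a single alert rule can be reused across teams, which is the kind of ‘out of the box’ experience described above.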

Fun fact: we collect over 5 billion metric readings an hour from our platform!

If you want to know more about our stack read Yurii’s excellent post here.

How we overcome some of the challenges of specialisation

While our collaborative way of working helps us scale fast, there are some risks to it. Platform teams are specialists, which brings the risk of siloed decision-making. The complexity of their domain can make it harder for people outside the team to engage in the decisions they make. This is a problem, because we know it’s exactly that external perspective and collaboration that’s needed to ensure the team is on the right track.

So, we need specialist teams, but want to avoid silos. How do we do it? Here are a few ways we make it work:

Staying aligned and starting with the why

First off, we make sure our Platform teams build systems the same way other product-focused Engineering teams do. We can avoid some of these pitfalls by focusing on why this work benefits all our customers. By understanding the why behind what we’re building before thinking about the how, we stay on the same track and collaborate on decisions.

We ensure transparency and gain engagement by sharing plans and approaches as early as possible. We do this through open communication channels and quarterly planning sessions, which let us be transparent about the work being planned and the alternatives we could use.

Ensuring feedback between Platform and Product teams

Sharing openly about our plans and work brings me to feedback. Looking for feedback has always been part of how Engineering teams work at TransferWise. Evolving our forums to make it easy to discuss and contribute across Engineering is part of this.

As an example, we host open ‘show and tell’ type sessions where teams present ideas and proposed solutions. The emphasis in these forums is on sharing ideas early, getting feedback from others and discussing solutions together. Platform teams also share their quarterly plans with the entire company and get lots of feedback on how to improve their impact.

Of course, this is a two-way process: Platform Engineers are expected to give feedback on Product teams’ plans too. This way, solutions naturally evolve from early stage feedback.

Finding a balance between enabling teams to work and building up new capabilities

With our opinionated platform we’re aiming to avoid a restrictive environment and provide just enough guard-rails so that teams can move fast while maintaining high non-functional quality. When thinking about work we can do, we always refer back to the impact on our customers rather than doing something because it’s the easiest thing to do.

In the future we expect this model to mature. It’s all about balancing the good enough capabilities that teams simply need to operate against the things that help us achieve our mission quicker. On a company level, we use our mission to guide our work, whether it’s creating new teams, making product decisions or prioritising new features. It helps us to stay aligned, share learnings and prioritise what’s best for our customers.

Specifically in the Platform team, a key focus area for us is to better understand the use cases and teams that our platform doesn’t support well today. This means understanding when to actively unblock teams so they can work efficiently, while not getting involved with teams who don’t need support at the moment. Our plan is to iterate on the current model, consciously incorporating new requirements in support of our teams’ shared mission and measuring our impact more quantitatively.

I hope this blog gives you a good overview of how we look at specialisation at TransferWise along with insights into how we make it work for our Platform team. If you have any questions, I’m happy to answer them in the comments section.

P.S. Interested in working with us? We’re hiring! Check out our open Engineering roles here.

TransferWise Engineering

Posts from @TransferWise’s Engineering Team
