Evolution of the Data team at Motorway

Published in Motorway Engineering · 8 min read · Dec 19, 2021

In the last two years we’ve grown our data team from one to six people. This may not seem like anything to write home about, but we’ve been laying the foundation for the next phase of our growth. In this post, I’d like to talk about the journey so far, the rationale and inspiration for continuously iterating on our structure, and how we’re confident that the current iteration will scale well.

Photo by Cenk Batuhan Özaltun on Unsplash

The individual contributor

When I joined Motorway back in early 2020, weeks before the UK went into its first Covid lockdown, there was one analyst supporting two departments. It will come as no surprise that we had Tableau hooked up to a read replica of our monolith production databases. For that particular analyst, life was pretty good, for a time. Unfortunately, doubts about data quality began to creep in, as is often the case when dealing with production data directly. We could write a whole post about the issues faced when working with production data, but to name a few: duplicate data, incomplete data, invalid data caused by software bugs, synchronisation issues, and information obfuscation by users.

The first problem I encountered when I joined was how slow ad hoc analysis was. Postgres is not optimised for analytical workloads, so lots of time was being lost writing queries and waiting for them to run, with fingers and toes crossed that they wouldn’t time out.

The second issue I found, mentioned briefly above, and which became more apparent after my first few weeks in the role, was that there was a significant lack of trust in the data. Our ETL (Extract -> Transform -> Load) data pipeline was literally a set of a few hugely complex materialized views. Maintaining them was grinding our data team’s productivity to a halt as we spent days and weeks performing painstaking data archaeology, digging through layer upon layer of subqueries and transformation logic.

Last but not least, Tableau did not have solid tools for collaborative development or source control. We couldn’t scale the team with these processes.

Working in a team

The perfect opportunity to scrap and rebuild our data architecture arrived with the first Covid-19 lockdown here in the UK. Motorway, along with many businesses, effectively closed its doors. This gave us the chance to draw a line in the sand and start again.

I’d been reading about dbt and decided to investigate it further. It appeared that moving to an ELT workflow (Extract -> Load -> Transform), and consolidating our transformations into a single process (dbt running on BigQuery), would allow us to cut a huge amount of maintenance overhead and significantly shorten our turnaround time.

In an ELT workflow, with separate data extraction and transformation steps, we could use a low-cost off-the-shelf data pipeline solution like Fivetran or Stitch to send source data to our data warehouse (the “EL”), and then transform it any way we liked (the “T”). This would give our team much more flexibility when building data transformation logic. We’d always have the raw source data available and could iterate on how we transformed it after the fact. We could plug in new sources, use different tools (no vendor lock-in), and upgrade our pipeline as and when we saw fit.
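To make the split between “EL” and “T” concrete, here is a minimal sketch rather than our actual pipeline. It assumes an in-memory SQLite database standing in for the warehouse (we use BigQuery, with dbt managing the transformations), and the `raw_vehicles` table, the `vehicles` view and the sample rows are all hypothetical names and data invented for illustration.

```python
import sqlite3

# Hypothetical raw rows as an extraction tool might land them: untouched,
# including a duplicate and an incomplete record.
RAW_ROWS = [
    (1, "AB12CDE", 5000),
    (1, "AB12CDE", 5000),  # duplicate row
    (2, "XY34ZZZ", None),  # incomplete row: no price
    (3, "LM56NOP", 7200),
]


def load_raw(conn, rows):
    """The "EL" step: copy source rows into the warehouse without changing them."""
    conn.execute("CREATE TABLE raw_vehicles (id INTEGER, plate TEXT, price INTEGER)")
    conn.executemany("INSERT INTO raw_vehicles VALUES (?, ?, ?)", rows)


def build_model(conn, select_sql):
    """The "T" step: (re)define a transformation on top of the raw table."""
    conn.execute("DROP VIEW IF EXISTS vehicles")
    conn.execute(f"CREATE VIEW vehicles AS {select_sql}")


def count_rows(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
load_raw(conn, RAW_ROWS)

# First version of the model: only remove exact duplicates.
build_model(conn, "SELECT DISTINCT id, plate, price FROM raw_vehicles")
v1_count = count_rows(conn, "vehicles")

# Later iteration: also drop incomplete rows. Nothing is re-extracted,
# because the raw table is still there to transform again.
build_model(
    conn,
    "SELECT DISTINCT id, plate, price FROM raw_vehicles WHERE price IS NOT NULL",
)
v2_count = count_rows(conn, "vehicles")

raw_count = count_rows(conn, "raw_vehicles")
print(v1_count, v2_count, raw_count)
```

The point of the sketch is the ordering: the raw table is loaded once and never modified, so changing the transformation logic only means redefining the view on top of it.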

I’ve read blog posts about how long it has taken teams to make the switch and stand up a dbt project in production (six months to a year in some cases). But after some long days and nights during lockdown, we were able to rebuild much of our pre-existing platform in four weeks. When the business opened its doors again in May 2020, we were up and running. Were it not for the fact we ditched Tableau in favour of Google’s Data Studio, I doubt many of the stakeholders would have noticed any change.

While we were at it we added support for two more departments.

One team, many domains

The natural next step for our team structure emerged in part after reading this super article laying out the iterations of the data team at Snaptravel. It essentially provided a shortcut to what we ended up with. The structure designates one senior member of the team as the ‘domain lead’ of a given area of the business (product, finance, marketing, etc.). That person, together with their manager, ‘owns’ the domain. Depending on the size of the domain, or the priority for that quarter, other members of the data team can volunteer as contributors to that domain, meaning they work as individual contributors supporting the domain lead. This felt like a great fit for our team as we added two more to the headcount.

Life was good. Everyone was comfortable, knew their remit and felt supported when they needed it. But before long some cracks began to emerge.

The rest of the company was growing quickly. Our customer experience team had nearly tripled in size and our product team had doubled. Where there were three squads we now had five, with plans to add more. Requests were coming in thick and fast as direct messages to the domain leads, leaving the manager (me) with no visibility of the workload. We were also being pushed to do hacky things to get analysis and insights released quickly (everyone needs the data “yesterday”). Over the hedge in the engineering team, we could see some green, green grass: the same folks demanding rapid work from us wouldn’t dream of asking them for a feature release tomorrow. How did they live in this dream world? What could we do to join them?

One of our mantras in the team is that success is decisions enabled, not data delivered. We don’t want to lose sight of that but equally, in order for us to make progress on bigger needle-moving projects, we needed a way to make time (and get universal acceptance and understanding) for those larger tasks.

We were already becoming more engineering orientated. dbt allowed us to adopt software engineering best practices like modularity, testing and version control. The next step was clear: we needed to run our team like a product team. We had conversations with product managers to understand how they spend their time, to find out what works for them and what might also work for us. Agile practices have become the standard way of managing work in modern software development teams. Were these techniques applicable to analytics and data science teams? Hmm… well, there was only one way to find out! I could write an entire post about working with Agile as a data team; watch this space…

We also took steps to centralise all requests into a single Slack channel and were strict about redirecting direct messages there, for both visibility and triage. Following a team brainstorming session, we implemented a team duty rota: each week the member on duty triages and manages the channel to make sure every request is handled end to end. We call this person the Batman/Batwoman. In addition, we introduced the Data team survey to get regular feedback from the organisation. It helps us prioritise and focus on what will have the biggest impact.
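A weekly rota like this is easy to automate. The following is a minimal sketch under stated assumptions, not a description of how we actually schedule it: the `TEAM` list and the `on_duty` function are hypothetical, and the rotation is a simple round robin keyed on Monday-to-Sunday weeks.

```python
from datetime import date

# Hypothetical team list; the names are placeholders.
TEAM = ["analyst_a", "analyst_b", "analyst_c", "analyst_d"]


def on_duty(day, team=TEAM):
    """Return who triages the requests channel in the week containing `day`.

    Weeks run Monday to Sunday. Ordinal day 1 (0001-01-01) is a Monday in
    Python's proleptic Gregorian calendar, so integer-dividing by 7 gives a
    week counter that keeps increasing by one across year boundaries.
    """
    week_index = (day.toordinal() - 1) // 7
    return team[week_index % len(team)]


if __name__ == "__main__":
    print(on_duty(date.today()))
```

Counting weeks from a fixed origin, rather than using the week number within the year, keeps the rotation fair when a year rolls over. Holidays and swaps would need handling on top of this.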

Hub and Spoke

This takes us up to today and our current structure. One of our domains was growing more quickly and had a greater thirst for data than any of the others: the product and engineering teams needed better data support and were starting to roll out machine learning features, which the data team is building or contributing to.

Tasked with a wide range of responsibilities, we brainstormed ways to keep the data team relatively flexible and nimble while still enabling the entire organisation to make data-driven decisions.

We decided to move to a hub and spoke model as publicised by a couple of great articles by Monzo and Postman. In this model, the central team becomes ‘the hub’. A couple of our data scientists with specific domain experience were reallocated into spokes, embedded in the business domains they support.

With this new approach, we are redefining what it means to be a centralised data team — instead of providing insights, we will provide accessible and easily digestible data. The hub’s responsibilities revolve around guaranteeing a sturdy infrastructure, maintaining the dimension and metrics layers, data cataloging for easy discoverability, and monitoring access to critical data. We will continue to recruit analytics engineers to build up the hub team, working alongside analysts and data scientists to deploy into the spokes. Everyone will be centrally onboarded and can then be reallocated into a spoke on a specific project or a six-month rotation.

One thing that has emerged immediately after moving to this structure is that we need to be careful not to alienate those members of the data team now operating in spokes. With only one data professional in each spoke (for the time being anyway), there is a lack of understanding and empathy from the wider group on what the data team is actually working on. Having them join a hub standup and spoke standup each morning is clearly not feasible — too many meetings. So how do we make sure that the data crew continues to be aligned?

Enter the Community of Practice (CoP) or team guild.

Communities of practice are groups of people who share a concern or a passion for something they do and learn how to do it better. Members of Motorway’s data team come from diverse career paths and backgrounds, and as a result possess extensive knowledge, skill, and expertise across a wide range of technical and cultural domains. Our data guild aims to promote the exchange of these ideas and best practices for the betterment of all.

In our first session, the team submitted ideas for what we wanted our CoP to be defined by (see below).

We hold a meeting every week. We try our best not to talk about specific projects but more generally about good practices, tech, career development, ways of working, and other cool stuff we’ve learned or want to learn.

I don’t expect v4 will be the last iteration of our team structure. As the space continues to evolve at the pace it is, for sure there will be some bumps in the road; but we have a great foundation, we’re working with great people, and for a great company. The best bit? We’re only just getting started.

If you’re interested in joining our team, please register your interest by submitting your CV to motorway@jobs.workablemail.com (add ‘Data team’ in the subject line for good measure), and be sure to check out our careers page.

Thanks for reading.


Written by W McCoull