Scaling Data Teams: 5 Learnings from BlaBlaCar

Emmanuel Martin Chave

Published in BlaBlaCar, Jun 30, 2023

Start simple, be pragmatic, mirror the business, create Chapters and invest in your platform.

Now and then: today’s organization and where we come from

As of June 2023, BlaBlaCar’s Data department sits within the Engineering organization. Its 45 people are divided into 6 teams: 5 multidisciplinary squads, and one platform-oriented team. The squads are composed of Data Analysts, Data Scientists, Software Engineers and Data Engineers. That makes them autonomous on all data projects.

Multidisciplinary squads: autonomy, together!

In 2021, the Data department was organized by technological layers: a Data Engineering team, a Data Science team, a Data Analytics one, etc. This organization optimized for skill-based expertise. It also created a layered technical stack; interoperability between layers was low. This showed whenever one project required work across two or three layers.

The stack is the consequence of the organization (Conway’s law). An organization by centers of expertise produces a technologically-layered stack.

This model stopped scaling in 2021: Data teams became a bottleneck. Delivery of large projects slowed, and prioritizing work across technological layers was difficult.

Transverse topics, like data quality, had ambiguous ownership. Integrating data from M&As took years. Our data architecture and organization were rigid; this made it costly to address new use cases, e.g. streaming data for ML applications.

The need for organizational change was obvious. Data teams needed multidisciplinary squads tied to business domains. That way, they could operate effectively and autonomously, each scaling locally with business needs.

Getting inspiration from Engineering

The principles of the Data Mesh inspired me to change our paradigm.

This architecture decouples technological layers from teams’ organization. Each squad becomes responsible for managing operational and analytical data within its business domain(s). To enable this ownership, squads must be autonomous; that’s why ours are multidisciplinary. They can scale at different speeds and in different directions, according to business inputs.

The paradigm was enticing. But I didn’t know if it would work in practice.

I needed concrete examples. Unfortunately, I couldn’t find many Data teams that had implemented a Mesh: Zhamak Dehghani’s foundational article was just two years old. So I took best practices from our Product and Engineering colleagues, who had experienced similar challenges. Multidisciplinary feature teams are common in Software Engineering. Learning from Product and Engineering teams at BlaBlaCar, I validated the paradigm and got tips on its implementation.

5 Learnings from our experience

Changing a 40+ person organization doesn’t happen overnight. We made incremental changes, incorporating what we learned at each step:

Q3 2021: Diagnosis. Understand the problem that the department faces. With a clear problem statement, build a vision towards the solution.
Q4 2021: Onboarding managers. Align on the problem statement and the target vision. If they do not own change, they will not lead it effectively.
Q1 2022: Creation of the first 2 squads, out of the 6 teams in the target vision.
Q2 2022: Integrating feedback and creating the remaining 4 teams.
H2 2022: Test and learn. Share operational insights between squads.
2023: Focus on the technical migrations to complement the organizational changes.

Below are our five learnings from this experience.

1. Where to start?

We started with the areas that were easiest to make independent. We factored in known internal moves, which accelerated some changes. We created the hard-to-isolate domains last, leveraging experience from our previous moves.

2. Should we work on all aspects of the Data Mesh at the same time?

No. Aim for something pragmatic, not perfect. This change impacts our stack, skills, management, processes, and mindset. It is not necessary to excel in every aspect to achieve a functional implementation.

At BlaBlaCar, we invested the least in the Data As A Product aspect. For instance, there is no Data Product Owner for each domain at BlaBlaCar. Likewise, squads do not operate 100% of their ingestion pipelines. Some are still centralized. Decentralizing them added non-urgent work for squads, while they already had substantial changes to absorb.

3. Are there any changes to be made outside of Data, especially on the Business side?

Not for us. That’s because we designed Data teams to mirror the business organization. The Data department is an independent entity reporting to the CTO. By modeling our teams on business areas, we maintained close links with stakeholders.

Mirroring business: organize your teams along business lines. Or spend time solving misalignment…

4. What about keeping the expertise for each job?

BlaBlaCar wanted to maintain common practices among experts in squads. So we created Chapters. They are communities by expertise, where peers meet, challenge their technical choices and learn. Chapters embody the notion of federated governance. They guarantee the convergence of practices despite skill distribution among squads.

Autonomy and boundaries between squads need a counterweight: federated governance and common practices.

Having Chapter Leads also empowers individual contributors. It prevents the concentration of responsibilities on managers.

5. How do we keep the Data machinery well oiled?

One transverse team, Data Ops, provides the common infrastructure and services to other squads. For instance, Data Ops builds ingestion patterns consumed by the Data Engineers in the squads. At our current size, this setup works without making Data Ops a bottleneck. As the team grows and matures, some parts are starting to be distributed to squads.

To avoid creating silos, members of the Data Ops team often join squads temporarily. This benefits everyone by: (i) guaranteeing that centrally designed tools and patterns meet squads’ needs, and (ii) allowing team members to explore and learn new topics.

My advice: clarify which problem you’re trying to solve

We succeeded because we knew what problem we wanted to solve: we were no longer efficient at our scale. This problem statement was our beacon when making difficult decisions. My best advice is to invest time in aligning on the problem. Don’t implement a solution if you aren’t clear on what problem it’s solving!