Platform Engineering

Building A DX Team: Lessons Learned

A reflection on two years as part of a growing Developer Experience team: the successes and the challenges.

Lou Bichard
DAZN Engineering

--

For the past two years, I’ve been lucky enough to be part of the Developer Experience (DX) team at DAZN. I’ve watched our DX team grow from just an idea into a full-blown, dedicated team of 12 engineers, distributed across 4 countries (UK, Netherlands, Spain and Italy) and working on our 70 repositories and many projects.

A few days ago, our head of DX, Cirpo, put out a blog post talking about how our DX team works at DAZN, and how the team is structured. That post nicely lays the foundation for this follow-up post, where I wanted to get into some finer details about DX, some of our successes, and challenges over the last two years, and to answer some common questions that we get asked.

To save you re-reading the entire article about our team structure, what’s the tl;dr? The DX team is an internal-facing team that is part of our wider platform engineering department (alongside Cloud, SRE, etc). The DX team works on internal tooling and processes, with the goal of increasing the productivity of all engineers working at DAZN.

Today, I’ll cover a few important topics, such as how we manage the challenging topic of ownership, the relationship between strong technical opinions and DX, and how the DX team prioritises its work and measures its outcomes.

Let’s jump straight in with one of our biggest challenges: ownership.

Scaling The Ownership Of Tooling

Ownership is an important topic within software engineering because getting it right directly affects the impact that our teams can have. When we get it wrong, the ownership structure can become a source of frustration and angst.

One of the greatest challenges we have faced is striking a balance between helping out our very busy engineers by building tools within the DX team, and not turning the DX team into gatekeepers or knowledge silos.

Ultimately, we want to avoid teams needing to come to DX and wait for a fix, but instead to empower teams to fix issues for themselves, where possible.

To quote Ellen Chisa’s talk about DX: “We don’t want magic, we want to be magicians”.

So how do we manage the challenges surrounding structure? One of the ways we solve the issue of ownership is through encouraging a culture of inner-source.

Within DAZN, we take a liberal stance on who can create new tooling. Engineers within DAZN are free to create their own repositories. For instance, there is no “review” before repositories are created. Engineers can declare their own new tooling and have that tooling adopted by other teams immediately. This approach means we’re trusting engineers to properly manage any internal open-source tools they create.

The DAZN Inner-source Marketplace

Building an inner-source culture isn’t easy.

There are many related challenges that we can’t (yet) confidently say we’ve truly solved. For instance, it’s continually difficult to manage the offboarding of inner source maintainers. Whilst we do collect data on contributors and maintainers for our inner source tools, we inevitably have projects that are left without an adequate number of maintainers or contributors.

In addition to an inner source culture, another way we scale our ownership is by making it as easy as possible to contribute to any DX project. Within DX projects, we have a conscious focus on high-quality, consistent documentation. Are all the DX projects perfect? No. Is it important to us that we provide high-quality documentation? Absolutely.

The DX Repo Template Project

This is why we have a common template for our repository documentation. This template covers the different aspects of a project that we’ve seen to be useful when it comes to contributing, from the more obvious, like usage instructions and contribution steps (such as how to test), to architecture, runbooks, deployment steps, and service level objectives (if necessary).

Developer Experience Is Built Around Strong Opinions

Something that I’ve observed over time is that Developer Experience is built around strong (technical) opinions. When we create a new tool or start a new project, the tool is almost always built on top of either an implicit or explicit set of opinions. This point is relevant because the impact that a DX team can have is intrinsically tied to its ability to make and communicate technical standards and decisions.

Strong technical opinions usually stand for something, but they also typically stand against something else. For instance, Honeycomb (the observability tool) is built around the opinion that events should be favoured over logs or metrics. The programming language Go is also built around strong opinions, notably its conservative view on adding new language features, which differs heavily from most other programming languages that prioritise more features, not fewer.

In a DX context, if all our teams operate with their own strong opinions that are not common or shared, we cannot have common or shared tooling, and a DX team can have minimal impact. But, I know what you might be thinking: it’s not exactly a straightforward task to achieve consensus on technical opinions across a large organisation. So, what are the practical steps that we can take?

One of the many ways that DAZN makes technical decisions is via our RFC (request for comments) process. RFCs live in a GitHub repository, and the resulting technical standards are then gradually adopted within our tools and by our engineers across the company. An RFC can be raised by any engineer and voted on by all engineers. If upvotes exceed downvotes, the RFC is adopted. It’s worth noting that the RFC voting process hasn’t yet been “stress-tested” by any contentious decisions (at least so far!).
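As a rough illustration of the adoption rule, here is a minimal sketch in Python. The `RfcVotes` structure and `is_adopted` function are hypothetical names invented for this example and are not part of our actual RFC tooling; the only thing taken from the process described above is that an RFC is adopted when upvotes exceed downvotes.

```python
from dataclasses import dataclass


@dataclass
class RfcVotes:
    """Vote counts for a single RFC (hypothetical structure for illustration)."""
    title: str
    upvotes: int
    downvotes: int


def is_adopted(votes: RfcVotes) -> bool:
    # Adopted only when upvotes strictly exceed downvotes, so a tie is not adopted.
    return votes.upvotes > votes.downvotes


if __name__ == "__main__":
    rfc = RfcVotes(title="Log format", upvotes=14, downvotes=3)
    print(f"{rfc.title}: adopted={is_adopted(rfc)}")
```

Note that under this reading a tied vote leaves the RFC unadopted, which is one of the edge cases a contentious decision would test.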

Because RFCs usually only apply to engineering-wide processes (and not team processes), the adoption of an RFC typically results in the RFC being integrated directly into our platform tooling. For example, our RFC on log formats is applied in our log ingestion pipeline, and our infrastructure tagging RFC is enforced by notifying engineers of any incorrectly tagged resources.

As another example, we have an RFC for our .dazn-manifest.yml file, which is a metadata file that describes the contents of a repo. By having a common metadata file, we’ve been able to enable things like search and discovery through our Backstage instance. Through this file we can also configure tooling, like our service level objectives (SLOs), so that it “just works” with only a few lines of configuration. These features wouldn’t be possible if we had many different approaches or opinions.
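To make the tagging example more concrete, here is a minimal sketch of how a tagging standard could be checked and turned into notifications. The tag names in `REQUIRED_TAGS`, the function names, and the shape of the input are all assumptions made for illustration; they are not taken from our actual RFC or tooling.

```python
# Hypothetical required tag keys; the real tagging RFC is not reproduced here.
REQUIRED_TAGS = {"team", "service", "environment"}


def missing_tags(resource_tags: dict[str, str]) -> set[str]:
    """Return the required tag keys that are absent or blank on a resource."""
    return {key for key in REQUIRED_TAGS if not resource_tags.get(key, "").strip()}


def build_notifications(resources: dict[str, dict[str, str]]) -> list[str]:
    """Produce one human-readable message per incorrectly tagged resource."""
    messages = []
    for resource_id, tags in sorted(resources.items()):
        missing = missing_tags(tags)
        if missing:
            messages.append(f"{resource_id}: missing tags {', '.join(sorted(missing))}")
    return messages


if __name__ == "__main__":
    example = {
        "bucket-a": {"team": "dx", "service": "cli", "environment": "prod"},
        "queue-b": {"team": "dx"},
    }
    for message in build_notifications(example):
        print(message)
```

The sketch notifies rather than blocks, which matches the "nudge" style of enforcement described above.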

An example RFC for our .dazn-manifest.yml file
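To show why a shared metadata file makes tooling simpler, here is a minimal sketch of validating a manifest once it has been parsed. The field names (`name`, `owner`, `slo`, `availability_target`) are purely hypothetical and chosen for illustration; the real .dazn-manifest.yml schema is not described in this article.

```python
# Hypothetical manifest contents, shown as the dict a YAML parser would return.
# These field names are assumptions, not the real .dazn-manifest.yml schema.
EXAMPLE_MANIFEST = {
    "name": "example-service",
    "owner": "example-team",
    "slo": {"availability_target": 99.9},
}

REQUIRED_FIELDS = ("name", "owner")


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the manifest is valid."""
    errors = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in manifest
    ]
    slo = manifest.get("slo")
    if slo is not None:
        target = slo.get("availability_target")
        if not isinstance(target, (int, float)) or not 0 < target <= 100:
            errors.append("slo.availability_target must be a percentage in (0, 100]")
    return errors


if __name__ == "__main__":
    print(validate_manifest(EXAMPLE_MANIFEST))
```

Because every repo would follow the same shape, a single check like this can run for all repositories, which is the property that search, discovery, and SLO tooling rely on.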

Of course, a DX team can only have a certain amount of influence over how technical decisions are made. But we’ve seen first-hand how actively participating in, and engaging with, our technical decision-making processes has allowed us to build a better, more integrated tooling ecosystem.

How Does DX Choose What To Work On?

A very frequent question we get asked is about how we prioritise our work and choose the most important projects to work on. This topic covers a few different areas: firstly, how we generate, gather, and make sense of feedback, and then how we measure the effectiveness of what we have done or are choosing to do.

Firstly, it’s important to note that all our DX projects are treated as independent products. The DX products have one or two owners from DX, who help to define the product “roadmap”. During each round of our 6-week iterations, our product owners define “bet” documents for what they propose to work on next. These bets are then discussed, with their potential impact reviewed.

When talking to other engineers about our prioritisation process, I often get the impression that the prioritisation process is seen as being highly scientific, in working out which features to work on next. But, what we’ve found out over time is that the next most important priority is often made very clear to us from our engineers, through their feedback on our chats, survey results, or informal discussions.

Results from the survey question “Our CI pipelines are slow”

It’s not always “big” projects that make the biggest impacts. Another approach we’ve used is pushing ourselves to deliver on projects in very short timelines, quite like the idea of a “Hackathon”. We’ve found self-imposed limitations on time can lead to some outsized results. For instance, our alternative CI interface (discussed more in our GitHub Actions migration article) was built and deployed in just 2 weeks, by utilising existing open-source code, ruthless prioritisation, and simplification.

A common question we’ve been asked in the past is whether the DX process is proactive or reactive, and the answer is that it’s both. We try to keep our delivery schedule flexible so we can react to comments and support requests, but in between those requests, we’re also building a strategic foundation for our tools (we’ll discuss this more later, in the section about how our tools interact with the different points on the software development lifecycle).

There’s no comprehensive recipe we can give for how we choose what to work on, as it’s a careful balance between supporting business goals, work that aligns with our quarterly team objectives, the effort of the work, its strategic value, and other more “practical” aspects like the people we have available at the time.

How Do You Measure A Developer Experience Team’s Effectiveness?

It’s important to us that we can quantify the impacts we’re having. Success metrics help us to justify further investment in our projects and help us to manage our priorities. One of our DX team values is even “prefer data-driven decisions”.

One example of how we have used data to drive our decisions is our recent GitHub Actions migration. Initially, we had no visibility into which of our teams had migrated their repositories to the new GitHub Actions tooling. We wanted to answer questions like: which teams are struggling with the migration? How many of our repositories are potentially orphaned and unlikely to be migrated in time?

Dashboard showing metrics on GitHub Actions migration.
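To give a feel for the kind of question such a dashboard answers, here is a minimal sketch that computes migration progress per team. The record shape (`name`, `team`, `migrated`) and the `"unowned"` bucket are assumptions for illustration only, not the structure of our real data.

```python
from collections import defaultdict


def migration_progress(repos: list[dict]) -> dict[str, float]:
    """Return the percentage of migrated repositories per team.

    Repositories without a team are grouped under "unowned", since they
    are the ones most likely to be orphaned.
    """
    totals = defaultdict(int)
    migrated = defaultdict(int)
    for repo in repos:
        team = repo.get("team") or "unowned"
        totals[team] += 1
        if repo["migrated"]:
            migrated[team] += 1
    return {team: round(100 * migrated[team] / totals[team], 1) for team in totals}


if __name__ == "__main__":
    example = [
        {"name": "a", "team": "payments", "migrated": True},
        {"name": "b", "team": "payments", "migrated": False},
        {"name": "c", "team": None, "migrated": False},
    ]
    print(migration_progress(example))
```

Teams with a low percentage are candidates for extra support, while the "unowned" group approximates the potentially orphaned repositories.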

This data is collected via an extract, transform, load (ETL) architecture which periodically ingests data, primarily from GitHub, and stores it in a PostgreSQL relational database. But the raw data alone is not very useful without some way to access or visualise it. This is why the database is integrated with a self-hosted Metabase instance, allowing us to perform ad-hoc querying and analysis.

Exported data tables in our database
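The "transform" step of a pipeline like this can be sketched as follows. This is only an illustration under stated assumptions: it takes repository objects in the shape returned by the GitHub REST API (using its `full_name`, `archived`, and `pushed_at` fields) and flattens them into tuples ready for insertion. The function names and the choice of columns are hypothetical and are not our actual pipeline code.

```python
from datetime import datetime, timezone


def to_row(repo: dict) -> tuple:
    """Flatten one GitHub repository payload into a (name, archived, pushed_at) row."""
    # GitHub timestamps are ISO 8601 in UTC, for example "2021-03-01T12:00:00Z".
    pushed_at = datetime.strptime(repo["pushed_at"], "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return (repo["full_name"], repo["archived"], pushed_at)


def transform(repos: list[dict]) -> list[tuple]:
    """Transform a batch of repository payloads into rows for loading."""
    return [to_row(repo) for repo in repos]


if __name__ == "__main__":
    payload = [
        {"full_name": "org/repo", "archived": False, "pushed_at": "2021-03-01T12:00:00Z"}
    ]
    print(transform(payload))
```

Keeping the transform as a pure function makes it easy to test separately from the extract (API calls) and load (database writes) steps.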

This combination of the ETL ingestion pipeline with the Metabase visualisation tooling on top allows us to interrogate our data and answer questions like “What is the most common Node package we use?” or “Which CI pipelines use X plugin?”. This data then gets used either when reviewing project success or when deciding on success criteria before a new project begins.
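A question like "What is the most common Node package we use?" boils down to a simple aggregate query. The sketch below uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs anywhere, and the `repo_dependencies` table and its columns are hypothetical names invented for this example rather than our real schema.

```python
import sqlite3

# Count how many distinct repositories depend on each package.
QUERY = """
SELECT package, COUNT(DISTINCT repo) AS repo_count
FROM repo_dependencies
GROUP BY package
ORDER BY repo_count DESC, package ASC
LIMIT ?
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few made-up dependency rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repo_dependencies (repo TEXT, package TEXT)")
    conn.executemany(
        "INSERT INTO repo_dependencies VALUES (?, ?)",
        [
            ("repo-a", "lodash"),
            ("repo-b", "lodash"),
            ("repo-c", "lodash"),
            ("repo-a", "express"),
            ("repo-b", "express"),
            ("repo-c", "jest"),
        ],
    )
    return conn


def most_common_packages(conn: sqlite3.Connection, limit: int = 3) -> list[tuple]:
    """Return (package, repo_count) pairs, most widely used first."""
    return conn.execute(QUERY, (limit,)).fetchall()


if __name__ == "__main__":
    print(most_common_packages(build_demo_db(), limit=2))
```

In practice a query like this would be written directly in the visualisation tool; the Python wrapper is only here to make the example self-contained.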

It’s important to mention at this point that we mostly measure our success at a project level, rather than at a team, or company level. We don’t (at least yet) attempt to measure the general productivity of our engineers. That said, with this tooling platform, we can begin to define more general performance metrics and start to track higher-level productivity, not just at the project level.

Should A DX Team Be All Engineers?

Another factor that we think has contributed to the successes we have been lucky to have is being an engineering-led team. We made the decision in the early days to try and avoid diluting our team with additional roles like product, opting instead to try and manage with a team made up purely of engineers. This comes from a belief that the more knowledge of, and empathy towards, fellow engineers you can embody in your DX team, the better.

But, it’s a double-edged sword. Because hiring these types of engineers is not easy. Hiring any type of engineer is not easy! Let alone the type of engineer that is not only technical, but also deeply empathetic about their customer, their fellow engineer, and cares deeply about the outcome (improving the life of their customer). Writing a job description for this type of engineer is hard, which is possibly why the majority of the DX team comes from internal hires, or organically from within the company. This correlation is likely not a coincidence.

One way that we’ve discussed taking this concept of a very engineering-led team further is to have engineers work with DX, or our Platform team, as part of their induction to the company. This would give them exposure to Platform and DX, and our broad suite of tools. We’ve also discussed rotating engineers into DX from other teams, to ensure that the knowledge within DX is always relevant to the real work that our engineers are doing.

Building A Tooling Foundation

When the DX team is not responding to engineer feedback or requests, we are actively working on building on and extending our tooling foundation. We work to fill any gaps and make sure that our engineers have tools at every step of the way through the software development lifecycle.

At this point, we now have tools that interact with engineers right from day 0 of a new project to when it’s eventually deprecated. In terms of DX products: our inner source marketplace helps engineers find and assess internally built tools; when code is written, our GitHub PR linter nudges engineers towards best practices; and when code is deployed, the DAZN CLI allows engineers to log into and debug their cloud accounts.

The help page of DAZN CLI (showing just some of the commands).

When it comes to prioritising our work in DX, sometimes our efforts go towards quick-wins, and responding to requests, but also our efforts go towards building potential new opportunities to “nudge” engineers in the right direction. By building these foundations across the whole development lifecycle, our engineers can now look to contribute to our tooling to suit their needs, like adding new linter rules to the GitHub PR linter or extending the DAZN CLI with new commands.
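To illustrate what "adding a new linter rule" could look like, here is a minimal sketch of a rule registry. Everything in it is hypothetical (the `rule` decorator, the two example rules, and the pull request dict shape) and it does not reflect how our GitHub PR linter is actually implemented; it only shows how a small extension point can make contributions cheap.

```python
from typing import Callable, Optional

# A rule takes a pull request dict and returns a message, or None if it passes.
Rule = Callable[[dict], Optional[str]]
RULES: list[Rule] = []


def rule(func: Rule) -> Rule:
    """Register a function as a linter rule."""
    RULES.append(func)
    return func


@rule
def title_is_descriptive(pr: dict) -> Optional[str]:
    if len(pr["title"].strip()) < 10:
        return "PR title is too short to be descriptive"
    return None


@rule
def has_description(pr: dict) -> Optional[str]:
    if not (pr.get("body") or "").strip():
        return "PR description is empty"
    return None


def lint(pr: dict) -> list[str]:
    """Run every registered rule and collect the messages from failing ones."""
    return [message for check in RULES if (message := check(pr)) is not None]


if __name__ == "__main__":
    print(lint({"title": "fix", "body": ""}))
```

With a registry like this, a contributor from another team only needs to write one small decorated function to add a rule, without touching the rest of the linter.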

What Was The Biggest Lesson Learned?

Original Twitter thread asking for input for this article.

When writing this article, I asked Cirpo, our head of DX, what he thought was the “single most effective decision we made in the entire 2 years”, and this is what he had to say:

“You can have the best ideas, the most talented engineers, etc. But if you don’t take care of the people side of DX, it can’t work and it will not scale. It’s always about people.”

So, Are You Building Your Own DX Team?

So there you have it! 2 years of learning and experience from growing our Developer Experience team. We’re not finished yet, but we hope that you can learn from some of our experiences. Here’s a short summary of some of our main takeaways from over these last two years for others who might want to start a DX team:

  • Drive community contributions to scale DX efforts
  • Engage actively in any technical decision-making processes
  • Rely heavily on GitHub (or source-control) data for success metrics
  • At first, use project-based (instead of company-wide) success metrics
  • Find or recruit engineers who care about “product” as well as code
  • Build tools that interact across the whole development cycle
  • It’s always about people

If you’re looking to start your own DX team, or build a culture of Developer Experience within your company, we’d love to hear from you! The best place to find us is on Twitter @dazneng.

A big thank you to @TastefulElk, @NullishCoalesce, @cctechwiz, @paulienuh, @mikenikles, @lucamezzalira, @areaweb, @cirpo, and @crivetechie for reviewing parts of this article, everyone who chatted about and asked questions about Developer Experience on Twitter, and of course, the DX team, platform, and the rest of DAZN engineering! 🚀

--

Lou Bichard
DAZN Engineering

Teaching the next generation of Cloud Native Software Engineers @ thedevcoach.co.uk.