Who, where, when: a components system for allotting team member responsibilities

Eugene Tupikov
Bumble Tech

--

Hi everyone, I’m Evgeny Tupikov, lead PHP developer at Bumble Inc., the parent company operating Badoo and Bumble, two of the world’s largest dating and connection apps.

We currently have more than 200 backend developers working on hundreds of modules and individual services in our applications. But back in 2006, it was just one project that a small team worked on. Back then, each developer had a good understanding of how it all worked: they could easily navigate the code, knew what services were available and how they all interacted with each other. However, as the project grew, it took longer and longer to identify the “knowledge keepers”: the people responsible for a particular piece of functionality, to whom you could address a question or make a suggestion.

In this article, I describe how we used the component approach to solve the problem of division of responsibility and make the process of updating information quick and easy.

Background

Take this example: a developer on one team is fixing a bug and, while debugging, ends up in code that another team is responsible for. How can they find out who owns that functionality and whom to ask?

In other words, we realised that we needed an easy way to identify the person responsible for this or that part of the system. To do this, we started using two special tags in the file’s DocBlock:

  • @team: the team responsible for a particular part of the system;
  • @maintainer: the person who develops its functionality (there may be several such employees).

/**
 * @team Team name <team@example.com>
 * @maintainer John Smith <john.smith@example.com>
 * @maintainer ….
 */

This approach is extremely easy to implement and use. For example, you can set up a template in PhpStorm — and the necessary tags will be automatically inserted when creating a new file. We also made a separate Git hook to ensure all the files have the right tags in the required format.
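
A check of this kind can be sketched in a few lines. This is only an illustration, not our actual hook: the function name parseOwnershipTags, the error messages and the returned array shape are all assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Extracts the @team and @maintainer tags from a file's DocBlock.
 * Hypothetical helper: the name, messages and return shape are assumptions.
 *
 * @return array{team: ?string, maintainers: list<string>, errors: list<string>}
 */
function parseOwnershipTags(string $source): array
{
    $result = ['team' => null, 'maintainers' => [], 'errors' => []];

    // Each tag is expected in the form "@tag Name <email@example.com>".
    $pattern = '/^\s*\*\s*@(team|maintainer)\s+(.+?)\s*<([^<>\s]+@[^<>\s]+)>\s*$/m';
    preg_match_all($pattern, $source, $matches, PREG_SET_ORDER);

    foreach ($matches as [, $tag, $name, $email]) {
        if ($tag === 'team') {
            $result['team'] = "$name <$email>";
        } else {
            $result['maintainers'][] = "$name <$email>";
        }
    }

    // Both tags are required; collect every problem instead of stopping at the first.
    if ($result['team'] === null) {
        $result['errors'][] = 'Missing @team tag';
    }
    if ($result['maintainers'] === []) {
        $result['errors'][] = 'Missing @maintainer tag';
    }

    return $result;
}

// Usage: a hook would run this for every file in the pushed commits.
$source = <<<'PHP'
/**
 * @team Team name <team@example.com>
 * @maintainer John Smith <john.smith@example.com>
 */
PHP;

$tags = parseOwnershipTags($source);
```

A hook built on this would reject the push whenever the errors list is non-empty.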

However, it wasn’t long before this approach showed its lack of flexibility: whenever new people joined the company or someone changed teams, the information on maintainers had to be updated. Moreover, this information was duplicated in our Zabbix monitoring configuration. While it is pretty simple to fix the code, updating the list of people who are notified about the result of a particular check takes time: you have to assign a task to the monitoring department and wait for them to complete it as part of their own workflow.

We wanted to be able to update the list of maintainers in one place and have all the other systems pick up the changes automatically.

This is what led us to the component approach.

What is a component?

Let’s start with the definition: a component is an abstraction representing a certain part of the system. It is important to note that in our case, components are not so much a code-structuring tool as an administrative or organisational tool for allocating areas of responsibility.

A component is not necessarily a piece of functionality directly related to the code. A component can also be an organisational process, such as the release of a new version of an application.

In transitioning to the component approach, we formulated five rules or restrictions:

  • Component structure to always be linear
  • Each team to have its own set of components (the same component cannot belong to more than one team)
  • For each component, a list of maintainers to be specified, with the optimal number of maintainers for one component being two, three or four
  • Adding or removing components to be limited to the manager or team leader
  • Each component to have a unique identifier (alias).
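
To make some of these rules concrete, here is a minimal sketch of a registry that enforces them. It is an illustration only: the ComponentRegistry class and its methods are hypothetical, reading “linear” as a flat alias-to-component map is an assumption, and the rule about who may add or remove components is not modelled.

```php
<?php
declare(strict_types=1);

/** Hypothetical in-memory registry; class and method names are assumptions. */
final class ComponentRegistry
{
    /** @var array<string, array{team: string, maintainers: list<string>}> */
    private array $components = [];

    /**
     * Registers a component under a unique alias owned by exactly one team.
     *
     * @param list<string> $maintainers maintainer email addresses
     */
    public function add(string $alias, string $team, array $maintainers): void
    {
        if (isset($this->components[$alias])) {
            throw new InvalidArgumentException("Component alias already exists: $alias");
        }
        if ($maintainers === []) {
            throw new InvalidArgumentException("Component $alias needs at least one maintainer");
        }
        // Flat structure (assumed meaning of "linear"): one alias, one team, no nesting.
        $this->components[$alias] = [
            'team' => $team,
            'maintainers' => array_values($maintainers),
        ];
    }

    /** Returns the single team that owns the component. */
    public function teamOf(string $alias): string
    {
        return $this->get($alias)['team'];
    }

    /** True when the maintainer count is within the recommended two to four. */
    public function hasRecommendedMaintainerCount(string $alias): bool
    {
        $count = count($this->get($alias)['maintainers']);
        return $count >= 2 && $count <= 4;
    }

    /** @return array{team: string, maintainers: list<string>} */
    private function get(string $alias): array
    {
        if (!isset($this->components[$alias])) {
            throw new OutOfBoundsException("Unknown component: $alias");
        }
        return $this->components[$alias];
    }
}

// Usage with made-up data.
$registry = new ComponentRegistry();
$registry->add('billing', 'Payments', ['a@example.com', 'b@example.com']);
```

The maintainer count is exposed as a check rather than enforced, because two to four is described as optimal, not mandatory.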

We have a special interface for managing components on our intranet. The component page looks like this:

An example of a component page on our intranet

We see that the component has:

  • a unique identifier
  • an email address
  • the name of the team that is responsible for it
  • the name of the project in Jira to which the component belongs
  • a brief description of what the component is for and its area of responsibility
  • a list of responsible people: each component has an owner who can edit it (perhaps the manager or team leader, or the person explicitly specified as the owner when the component was added).

How we use components in code

File DocBlock

/**
 * @component component_alias
 */

We have updated our Git hook that checks the DocBlock. It now makes sure that every modified file contains the @component tag and that the specified component exists.

remote: ERROR in SomeClass.php:
remote: * Unknown @component: UnknownComponent. You have to create component before using it in the code
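
The core of such a check might look like the sketch below. It is a simplified illustration rather than our hook: the function checkComponentTag, its parameters and the “Missing @component tag” message are assumptions, and the list of known aliases is passed in instead of being fetched from the intranet.

```php
<?php
declare(strict_types=1);

/**
 * Checks one file's source for a valid @component tag.
 * Hypothetical helper: name, parameters and the "missing" message are assumptions.
 *
 * @param list<string> $knownAliases aliases of components that already exist
 * @return list<string> error messages, empty when the file passes
 */
function checkComponentTag(string $source, array $knownAliases): array
{
    // The tag is expected on its own DocBlock line: " * @component some_alias".
    if (!preg_match('/^\s*\*\s*@component\s+(\S+)\s*$/m', $source, $match)) {
        return ['Missing @component tag'];
    }

    $alias = $match[1];
    if (!in_array($alias, $knownAliases, true)) {
        return ["Unknown @component: $alias. You have to create component before using it in the code"];
    }

    return [];
}

// Usage with made-up data: the alias below is not in the known list.
$errors = checkComponentTag(
    "<?php\n/**\n * @component UnknownComponent\n */\nclass SomeClass {}\n",
    ['component_alias']
);
```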

We also have a hook that notifies maintainers about all the changes in their files made by other developers. This is useful when developers from other teams make changes to your code but you are not sent the task for review.
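
The recipient logic for such a notification can be sketched as follows. The function name and both input maps are hypothetical, and treating a change as “made by another developer” whenever the author is not one of the component's maintainers is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Works out which maintainers should hear about a push.
 * Hypothetical helper; the input shapes are assumptions.
 *
 * @param array<string, string>       $changedFiles file path => component alias
 * @param array<string, list<string>> $maintainers  component alias => maintainer emails
 * @return array<string, list<string>> maintainer email => changed files to report
 */
function buildChangeNotifications(string $authorEmail, array $changedFiles, array $maintainers): array
{
    $notifications = [];

    foreach ($changedFiles as $file => $alias) {
        $owners = $maintainers[$alias] ?? [];

        // A change made by one of the component's own maintainers needs no notification.
        if (in_array($authorEmail, $owners, true)) {
            continue;
        }

        foreach ($owners as $email) {
            $notifications[$email][] = $file;
        }
    }

    return $notifications;
}

// Usage with made-up data: Alice maintains "chat", Bob maintains "billing".
$notifications = buildChangeNotifications(
    'alice@example.com',
    ['src/Chat/Room.php' => 'chat', 'src/Billing/Invoice.php' => 'billing'],
    ['chat' => ['alice@example.com'], 'billing' => ['bob@example.com']]
);
```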

API for working with components in code

To work with components in the code, we have a separate class that enables you to get a list of all components and find any specific one either by its full name or by its identifier. The same information is available in the code as in the interface on the intranet.

$componentManager = new \Components\ComponentManager();
$component = $componentManager->getComponent('component_alias');

// Collect the email address of every maintainer of the component.
$recipients = [];
foreach ($component->getMaintainers() as $maintainer) {
    $recipients[] = $maintainer->getEmail();
}

Or find a component duty officer:

$componentManager = new \Components\ComponentManager();
$component = $componentManager->getComponent('component_alias');

// Stop at the first maintainer who is currently on duty.
$dutyOfficer = null;
foreach ($component->getMaintainers() as $maintainer) {
    if ($maintainer->isDuty()) {
        $dutyOfficer = $maintainer;
        break;
    }
}

Component duty officer

Many of you will have encountered the problem of context switching. In my opinion, this is something you should try to avoid. Everyone works at a different pace — and it takes varying lengths of time to get back into the task (context). Some people only need a few minutes, while others need much longer.

To minimise the effects of context switching, we decided to introduce on-duty shifts. Every component now has a duty officer, who is the first port of call for all issues related to it. Whenever someone comes on duty, a note confirming this appears on all the relevant components and is posted in a public group in our messenger.

This does not mean that the team leaves the duty officer to fend for themselves. Should they encounter an unfamiliar problem, they can always consult their colleagues. However, if the person on duty is able to answer even 80% of the questions themselves, it means that the rest of the team members can get on with their tasks 80% of the time undisturbed and without fear of being taken out of context.

In addition, duty assignments allow the sharing of knowledge within the team, thereby increasing the bus factor.

Integration with internal systems

In addition to using components directly in code, support for them has been added to most of our internal systems. Here follow some examples.

PHP error tracking system

Historically, to collect and analyse PHP errors, we used a system we created in-house that is similar in functionality to the popular Sentry and Splunk but adapted to our internal processes. We added component support to it first.

One of the stages in error collection is to enrich the event with additional information. We have added a new step to the pipeline, in which the system collects a list of affected components based on the list of files from the error stack trace.
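
A step like this can be sketched in a few lines. The event shape, the function name enrichWithComponents and the ready-made file-to-component map are all assumptions made for the example; in practice the map would come from the @component tags.

```php
<?php
declare(strict_types=1);

/**
 * Adds a "components" key to an error event based on its stack trace.
 * Hypothetical pipeline step: the event shape and lookup map are assumptions.
 *
 * @param array{message: string, trace: list<array{file: string, line: int}>} $event
 * @param array<string, string> $fileToComponent file path => component alias
 * @return array<string, mixed> the event with the extra "components" list
 */
function enrichWithComponents(array $event, array $fileToComponent): array
{
    $components = [];

    foreach ($event['trace'] as $frame) {
        $alias = $fileToComponent[$frame['file']] ?? null;
        if ($alias !== null) {
            // Use the alias as the key to deduplicate while keeping first-seen order.
            $components[$alias] = true;
        }
    }

    $event['components'] = array_keys($components);

    return $event;
}

// Usage with made-up data: two frames belong to "chat", one file has no component.
$event = enrichWithComponents(
    [
        'message' => 'Undefined index: user_id',
        'trace' => [
            ['file' => 'src/Chat/Room.php', 'line' => 42],
            ['file' => 'src/Billing/Invoice.php', 'line' => 7],
            ['file' => 'src/Chat/Message.php', 'line' => 13],
            ['file' => 'vendor/lib/Util.php', 'line' => 99],
        ],
    ],
    [
        'src/Chat/Room.php' => 'chat',
        'src/Chat/Message.php' => 'chat',
        'src/Billing/Invoice.php' => 'billing',
    ]
);
```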

This information can be used:

  • to search for errors on a particular component;
  • to create reports and graphs per component.

In addition, having information about components makes it easier to find the person responsible for the part of the system where the error occurred: just go to the page with detailed information about the error and look at the stack trace.

Database registry

The backend of our Badoo and Bumble apps consists of hundreds of different modules, systems, and services. Most of them use MySQL to store data.

Now let’s imagine this situation: a developer begins working on a new feature and, in the process, needs to look at a table schema. To do this they need to:

  1. find in the code which host the database lives on
  2. connect to the host through any convenient tool (console utility, phpMyAdmin, Sequel Pro, IDE, etc.)
  3. find the correct database and table
  4. study the table information.

What if you want to know the size of a table in production?

First, you have to request access to the database, wait for it to be granted, and only then can you get the information you need. Even though the process of gaining access is automated, it takes quite a long time to get a response to what is a rather trivial request.

And let’s look at another example: say that you need to find a list of tables that have not been used for a long time and are just taking up server space. Here, you need to write a script that goes through all the necessary servers and collects statistics.

So, to make life easier for developers, we created a system called DBRegistry. It stores information accessible via INFORMATION_SCHEMA for all databases.
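
To give an idea of the kind of data involved, here is a minimal sketch of collecting table metadata from one MySQL host. It is not the DBRegistry code: the constant and both function names are hypothetical. The columns are standard ones from INFORMATION_SCHEMA.TABLES, and TABLE_ROWS is only an estimate for InnoDB tables.

```php
<?php
declare(strict_types=1);

// Metadata query against MySQL's INFORMATION_SCHEMA.TABLES, skipping system schemas.
const TABLE_STATS_QUERY = <<<'SQL'
SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_ROWS, DATA_LENGTH, INDEX_LENGTH
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA NOT IN ('mysql', 'sys', 'information_schema', 'performance_schema')
SQL;

/**
 * Collects table metadata from one host.
 * Hypothetical helper; a collector would call it once per database server.
 *
 * @return list<array<string, mixed>>
 */
function collectTableStats(PDO $connection): array
{
    return $connection->query(TABLE_STATS_QUERY)->fetchAll(PDO::FETCH_ASSOC);
}

/**
 * Approximate table size in bytes: data plus indexes.
 *
 * @param array<string, mixed> $row one row returned by collectTableStats()
 */
function tableSizeBytes(array $row): int
{
    // PDO may return numbers as strings, and the lengths are NULL for views.
    return (int) $row['DATA_LENGTH'] + (int) $row['INDEX_LENGTH'];
}

// Usage with a made-up row, so no database connection is needed here.
$size = tableSizeBytes([
    'TABLE_SCHEMA' => 'app',
    'TABLE_NAME' => 'users',
    'TABLE_ROWS' => '1000',
    'DATA_LENGTH' => '16384',
    'INDEX_LENGTH' => '32768',
]);
```

Storing the result centrally means a developer can look up a table's schema or size without requesting access to the production host.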

When we migrated to the component approach, we also added the ability to specify which component a particular database or table belongs to.

Information about the component is useful if there is ever a problem with a server (e.g. high CPU usage) and the database administrator has found a problematic query and wants to report it to the developer. They simply find the relevant table in DBRegistry, see which component it is assigned to, and describe the problem to the person on duty.

Conclusion

More than three years have passed since we switched to the component approach, and during this time component support has been implemented in all our internal systems. We have managed to make processes altogether faster and clearer. No longer does anyone need to ask, “Who is responsible for this code?” All they have to do is go onto the intranet, find the component they want and it will contain all the information they need.

Updating the list of those responsible is now reduced to a few clicks in the interface while, prior to the introduction of components, updating this information in all the systems could take days, if not weeks.

That’s all for now and thanks for reading!
