How many people does it take to change a light bulb?

aka API-Centric Development in a Non-API Company

Sergio Dal Bello
Published in Bumble Tech, May 19, 2020

(Based on a talk by Konstantin Yakushev @ Nordics APIs)

The answer is 24, one to hold the light bulb and …

Okay, it’s a bad joke, but the number is right and I’ll explain why later.

Today’s topic is how, and especially why, we ended up with an API-Centric Development process at Bumble (the parent company operating the Badoo and Bumble apps), a non-API company. Our numbers are quite impressive: over 800 feature flags (we have a complex way of handling feature toggling in our systems) that serve several applications with millions of users! As you can imagine, this can be difficult to handle and requires a very complex infrastructure to remain extensible and maintainable. These numbers continue to grow each week as our client teams release new versions with new features and improvements, so it’s important to keep up.

When I say API-Centric Development, I mean that we have a dedicated API design team which designs, defines, and documents client-server communication and technical solutions relating to feature development. And we do this before developing the actual feature; as you will see, the API team plays a central role in the feature development.

This is despite us not offering public APIs. It might sound like overkill at first, but I’ll try to explain this and answer some additional questions in the next few sections.

To provide some context, I first want to describe a typical situation in which our API team is involved: the Feature Cycle.

The Feature Cycle

As with many product development teams, the first step is to define the description and specification for what to build. Product owners create a document, a Product Requirements Document (PRD). This contains a description of the feature they want to be implemented, business logic, design, marketing metrics, and everything needed to determine how and why the feature should be developed. PRDs are published in our internal wiki, and they are constantly revisited to keep them up-to-date.

The best way to show their usage is through an example. Imagine we have a really successful application downloaded by millions of users, the “Bumble Calculator” (it supports +, -, / and * !!). Let’s say we want to introduce a new feature because our calculator is cool and we want to keep up with Instagram: Calculator Stories.

To introduce this new feature, the Product Owner writes a PRD which contains the feature description, the use cases, screenshots and other relevant information. It might look a bit like this:

When the specification is complete and the PRD is ready, the business logic is described and we can start defining the technical implementation of the feature. An API ticket is raised and our team enters the process.

API development is split into two phases: defining messages and writing documentation, followed by a 2-stage review process.

Messages and documentation

Firstly, we define and maintain the messages exchanged between server and client. To do so we use Google Protobuf, but you can use whatever you prefer, e.g. Swagger. We define the requests from clients and the server responses, including which fields must be included, their data types, any constants (defined in protobuf enums), and even which push notifications are sent and how clients should handle them. Once defined, we document everything using a markup language, namely reStructuredText.

This documentation, which is built into HTML with a tool called Sphinx, contains the protobuf messages, the client/server logic, designs and everything else that might be useful to developers.

So let’s return to our definitions: we have our PRD and now we need to define our messages. What do we need? What are the use cases? Let’s see our Calculator Stories feature!

First of all, we need to load all the stories, and we choose to do this on startup.

We define a startup message (the Client/Server prefix names the recipient: Server messages go to the server, Client messages go to the client), and we need to tell the server that we support Bumble Calculator Stories.

Additionally, we must describe a StoryPreview message, representing a story preview.

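// Sent to the server on startup.
message ServerStartupApplication {
  // True if this client supports Calculator Stories.
  optional bool stories_supported = 1;
}

// Sent back to the client in response.
message ClientStartupApplication {
  repeated StoryPreview story_previews = 1;
  // Lets the client work out how many placeholders to show.
  optional int32 max_number_of_stories = 2;
}

// A single story preview.
message StoryPreview {
  optional string story_id = 1;
  optional string image_url = 2;
}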

What about the zero case? Clients can show a placeholder when there are not enough stories. We also describe the maximum number of stories in the startup so clients know how many placeholders to show.

Now we want users to be able to add a story, so we need a way to send new stories to the server.

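// Sent to the server to save a new story.
message ServerSaveStory {
  optional string story_operation = 1;
}

// Server response to a save request.
message ClientSaveStory {
  optional bool success = 1;
  // Set when saving fails, e.g. on invalid characters.
  optional string error_message = 2;
}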

We have a message to save a story, and we have the server’s response. It can contain an error (for example, if you try to insert invalid characters such as letters).

Lastly, our feature wouldn’t be complete without a way to view a story.

We create a message to request a story and the related response, which of course needs a message representing our story model and containing the required data.
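Earlier we mentioned that constants are defined in protobuf enums, but the messages so far don't use one. As a minimal sketch, and purely as an assumption for illustration, the save response could also carry a machine-readable error category next to the human-readable message. The enum `SaveStoryErrorType`, its values, and the `error_type` field are hypothetical and not part of the example protocol above.

```protobuf
syntax = "proto2";

// Hypothetical error categories; names and values are illustrative only.
enum SaveStoryErrorType {
  SAVE_STORY_ERROR_TYPE_UNKNOWN = 0;
  // The story contains characters a calculator cannot handle, e.g. letters.
  SAVE_STORY_ERROR_TYPE_INVALID_CHARACTERS = 1;
}

message ClientSaveStory {
  optional bool success = 1;
  optional string error_message = 2;
  // Hypothetical addition: lets clients branch on the kind of error
  // instead of parsing the text in error_message.
  optional SaveStoryErrorType error_type = 3;
}
```

With a shape like this, clients could decide how to react (for example, highlighting the input field) without depending on the wording of the error text.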

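// Sent to the server to request a single story.
message ServerGetStory {
  optional string story_id = 1;
}

// Server response carrying the requested story.
message ClientStory {
  optional StoryModel story = 1;
}

// The full story, as opposed to the StoryPreview sent on startup.
message StoryModel {
  optional string story_id = 1;
  optional string title = 2;
  optional string calculation = 3;
  optional string text = 4;
}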

As you can see from the figures, the documentation doesn’t only cover the feature from a technical point of view. If you read it, you will see it’s very narrative in nature; there are omitted parts, there are images, and there are of course code blocks too. The documentation is a reference for every aspect of the feature, including technical details, design, business logic, and anything else deemed relevant to developers.

We’ve defined our messages and covered all the use cases. We’ve determined which team has to implement which part of the feature: for example, the placeholder logic is client-side, but the server team can control it using a flag. We’ve even defined an error message that the server returns in case something goes wrong.

In a real scenario, we would need to think about backward compatibility and which features each client version supports, suggest fallback solutions for clients that do not support them, and handle feature toggles, A/B tests, and statistics gathering. But for now, let’s move on to the second phase of our API development: the 2-stage review.
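To make the backward-compatibility point a little more concrete, here is one possible shape for it. This is a sketch under stated assumptions, not our actual protocol: the `FeatureType` enum and the `supported_features` field are hypothetical names. It relies on a standard protobuf property: a field added under a new field number is simply ignored by older parsers, so existing clients and servers keep working.

```protobuf
syntax = "proto2";

// Hypothetical list of optional features a client build may support.
enum FeatureType {
  FEATURE_TYPE_UNKNOWN = 0;
  FEATURE_TYPE_STORIES = 1;
}

message ServerStartupApplication {
  // Unchanged from the example above, so existing clients keep working.
  optional bool stories_supported = 1;
  // Hypothetical addition under a new field number: every feature this
  // client version supports. A server that does not know this field
  // ignores it; a server that does can omit data the client cannot show.
  repeated FeatureType supported_features = 2;
}
```

The design choice here is that the client declares what it supports and the server decides on the fallback, which keeps that logic in one place.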

2-stage review

Following the development process, every ticket is reviewed by at least 2 members of the API team. We are a team of 4 people, so 3 out of 4 (the author plus two reviewers) will have seen what is going to be implemented, and how. In practice, everyone in the API team should know most of the features that have been implemented. That’s very important if we want to be able to develop new features in the future, because the team knows what exists, what can conflict, and what can be reused.

After review by the API team, other teams which should support the feature review the ticket as well, for example: one reviewer from the Android team, one from the Server team, one from the iOS team, and so on.

During the review process, the ticket is still considered in a development phase, as there are always a lot of changes to be made; it’s a matter of compromise in order to find a good solution for all the developers. Development should only start once all teams have agreed, after which we move on to the final steps.

Once the ticket has been reviewed, it’s ready to be implemented. We create tickets for all the platforms and define whatever other teams need in order to work on the feature (for example, strings to be translated).

The protobuf messages are then compiled automatically into client code for every platform, and finally it’s time for developers to start implementing the feature. Often, it’s possible for different platforms to work simultaneously, but in our experience that is not always the case, because server teams usually implement the feature first, and only after that can all the clients work on it.

As you can see, it’s a long process, and at times it can feel bureaucratic as there’s a lot of back-and-forth between participating parties. Product managers write their documentation, you write your own, you change it, you rephrase a sentence, you explain it, you choose a different approach, and so on.

Reasoning

You may be wondering why we have created such a complex process and why we are sticking to it.

Let’s go back in time a bit and look at what we had many years ago.

In June 2007, the first iPhone was released, and in March 2008 Steve Jobs introduced the very first iOS SDK, which allowed developers to create their own applications for iPhones. It was the beginning of the mobile applications era. We definitely didn’t want to be left behind by the competitors, so the first Badoo mobile applications team was born: several backend developers and a couple of iOS developers. It was a start-up inside a start-up: no documentation, no dedicated API team, and most of the code was written from scratch.

Most mobile applications have to exchange data with the server, and Badoo was no exception. We needed a protocol for this, and Google Protocol Buffers looked very promising, so we chose them. It’s a binary protocol, but all messages and enums are described in plain-text proto files, which suited us well. These files lived in our repository and all developers from the mobile team were able to contribute to the protocol definition. Protocol definition files contained comments, and we used these comments as documentation. It worked well, for a while.

At some point, we discovered that developers were sometimes adding new fields and messages instead of reusing the existing ones, or reusing existing messages in ways they weren’t meant to be used. We understood that we needed a single person to be the owner of the protocol, so that they could at least review other developers’ changes. Then the number of such people increased, and comments in the protocol files no longer met our documentation needs, so we started to write documentation in separate files and made it available to everyone.

The more documentation and protocol messages we had, the more we felt the need for a dedicated team, and so our first API team was finally born.

After several years of using this process, we realised that our documentation covers all of the features which have been implemented since the introduction of the team, acting as a timeline where you can see into the past, observe the present, and plan for the future.

And the fact that feature development might take as long as a few weeks actually helps the process. This is because if somebody wants to change something quickly in the app or needs to fix a bug, they have to be deep in the relevant context and know exactly how the app is supposed to work. Instead of asking around and being redirected to someone else, they can simply consult the documentation and see “OK, it says to do this and not that.”

There are no blind spots; everything is there.

After many years of fine-tuning, our team knows what to do in every situation (well, let’s say almost). We don’t have to remember how every feature is implemented: it’s there in our documentation and it’s easily accessible.

We often help developers implement a particular feature. We need to be ready to put the question in context and provide a fast answer, and only through the documentation can we give a definitive one.

Our documentation usage numbers are pretty impressive: with a team of only 4 people, we support more than 150 users and serve over 17,000 requests to the documentation per month!

And there’s another advantage. As the protobuf messages are compiled into client code, we have full control over which messages client and server exchange. There’s no need for manual synchronisation between server and client, and no need for hardcoded parameters. Is a field no longer used? We deprecate it, and it gets marked as deprecated for all the clients. Developers don’t have to waste their (precious) time finding out which parameters to send or creating models for them: everything is ready, and they just have to use it as described in the documentation. Everyone knows the state of the application: what is used, what has been done, and what to do.
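As a minimal sketch of what deprecating a field can look like, protobuf has a standard `deprecated` field option. Which field gets deprecated here is purely an assumption for illustration: suppose the `text` field of the `StoryModel` example were no longer used.

```protobuf
syntax = "proto2";

message StoryModel {
  optional string story_id = 1;
  optional string title = 2;
  optional string calculation = 3;
  // Hypothetical: assume `text` is no longer used. The field and its
  // number stay in place so that older clients can still parse the message.
  optional string text = 4 [deprecated = true];
}
```

How the deprecation surfaces depends on the target language and code generator. In Java, for example, the generated accessors are annotated with `@Deprecated`, so developers see a warning wherever the field is still used.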

As I said before, it’s a big process, but as you can see it’s worth every minute it takes.

Of course, there are shortcuts too. Not all features are big and not all of them are complex, so in many cases we can simply skip some steps: we often don’t need so many reviewers, or we can update the documentation directly.

The more we get used to the process, the smoother it becomes: a normal feature can easily be delivered in a few hours.

Conclusions

To sum up, we can honestly say that moving to an API-Centric process has proved a huge success for us. In return for the cost of a heavier process in the earlier phases of the feature, we have gained in reliability, development time, product logic clarity and in many other ways.

Let’s go back to the question asked at the beginning of this article. Let’s say we have a big feature to implement, that is, a very big light bulb to change. So, how many developers does it take to change it? Yes, you already know the answer. 1 Product manager, 1 API developer, 2 API reviewers, platform reviewers; and then you need to implement the feature: 1 developer for each platform, and some code reviewers, QAs and their reviewers … How many in total? 24!
