To Kill a Microservice

From Microservice to Monolith

David Wobrock
Botify Labs
May 14, 2020 · 9 min read


One of the core values of our Engineering team at Botify is ownership. We strive to give our Engineering and Product teams the autonomy and leeway to own their projects and bring them to completion. However, as we grew into a larger team working on a bigger stack, we started to run into some issues with how we shared our work.

Towards the end of 2016, we wanted to give more local ownership to Engineers and Product Managers in order to bring their products and stacks to life as fast and painlessly as possible. To that end, we chose to split our monolithic Django application into microservices.

This is the story of how we failed, and why today we are moving these services back to a monolithic application. We’ll also take some time to share our hard-learned lessons in the process.

Before we dive in, we want to stress that this article is not meant to be a condemnation of microservices in general. At Botify we strongly believe in using the best tool for the situation at hand; we are simply saying that microservices were not the right tool to solve our problems at that point in time.

Starting the journey of microservices

First, some history. Botify was founded in 2012, on a Python/Django stack. At the beginning of 2016, the entire Botify platform was served by a load-balanced cluster running our Django application. We had, at the time, a Product & Engineering team of about 15 people.

Why Microservices?

Two major goals drove us to evaluate the possibility of using microservices in our stack:

  • Increase velocity: we wanted to reduce dependencies between frontend and backend and increase local ownership, so that a small team could drive a subject all the way to production.
  • Unify technologies: we were having more and more trouble hiring Python/Django Engineers in Paris, and thought that unifying on JavaScript for both frontend and backend would make recruitment easier, as we would converge towards a single “FullStack” JavaScript role.

Dream big, but start small

We decided to implement our first microservice in the form of our authentication and authorization stack. We thought we’d start small, by taking a limited functional scope currently handled by our Django application and moving it out to its own microservice. We created a small JavaScript team to implement a NodeJS backend that would handle our users, their accounts and permissions. We proudly called it Customer Success.

Summary of the API architecture at Botify using a microservice

Pain points

Organization

When you make technical decisions driven by human factors, as we did, you quickly run into issues. The newly formed team’s organization and processes were hard to put in place alongside the existing team. As features were developed in isolation, compartmentalization of knowledge quickly emerged. The inner workings of the microservice were not shared, and the lack of cross-stack code reviews made us lose the feeling of shared ownership.

Isolation

As one might notice in the architecture above, the Customer Success microservice is not exactly isolated. There is backend-to-backend communication, either from the monolith to the microservice or vice versa. This isn’t inherently bad (it even seems rather common), but it still hurts performance and creates a dependency between the two services. It also calls for more sophisticated mechanisms in the long run, such as circuit breakers and graceful service degradation, to ensure uptime and availability.

Another observation from the schema is that our main relational database was shared. Inside the database, some tables were mapped to models in both Django and Express/Sequelize. This means that schema modifications on the shared tables required synchronizing changes between the microservice and the monolith, which is the result of a badly separated domain.
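To make that coupling concrete, here is a minimal, hypothetical sketch (none of these model or table names come from our actual codebase) of what mapping such a shared table looks like from the Django side, while the Express service defines an equivalent Sequelize model over the very same table:

# Hypothetical sketch of the shared-table coupling: the monolith maps a table
# that the Node/Sequelize service also maps. Any schema change on this table
# has to land in both codebases, and both backends must be deployed in sync.
from django.db import models

class Account(models.Model):
    email = models.EmailField(unique=True)
    plan = models.CharField(max_length=50)
    # Adding or renaming a column here also requires updating the Sequelize
    # model (and its migrations) in the Customer Success microservice.

    class Meta:
        db_table = "accounts"  # same physical table the microservice reads and writes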

Tooling

With time, we had learned how to robustly build and deploy our Django application, but this had to be mastered all over again for each new stack. In a microservice environment, where independent deployments are a core advantage, we remained more confident deploying our monolith than our microservice: we had spent less time making the microservice’s automated deployments robust and smooth.

On a day-to-day basis, we were running into more and more issues, twisting and turning our solutions to make the architecture work better. During this time we kept working on our Django stack, improving its code, coverage, testability, dependencies and performance. We hosted a few Django and Python events, and recruited some amazing Django developers to work on our monolith.

We inverted the initial paradigm that drove us to microservices in the first place.

The stack for the microservices had been chosen from the frameworks that were trendy at the time and those best known by the development team. However, at Botify we firmly believe in choosing the right tool for the right job: the right tool is not necessarily the most famous one, nor the one we already know.

We decided it was time to seriously (re-)consider our microservice architecture and our Customer Success backend.

There and back again: back to our (monolithic) future

With all of our daily grunt work going back and forth between our JavaScript authentication backend and our Django monolith, we sat our Engineers around the table and weighed the pros and cons of our architecture, as well as the potential cost of a migration. We boldly chose to fold our authentication backend back into our Django monolith, reproducing the API endpoints we had transferred from Django to Node in 2016, this time from Node back to Django. Here are some of the main reasons why we chose to go backwards like this.

Reasons

The right solution for the right problems: it is important to remember our use cases and who Botify is serving. Our platform is an enterprise B2B service helping the largest websites on the Web improve their SEO. Our scale challenges reside in the analysis of our customers’ data, not in our traffic: we handle petabytes of data every month, on long-running tasks such as crawling or processing data. We do not, however, face huge user-facing traffic, as, at the time of writing, our users are mainly the SEO managers of the biggest web properties on the Web. A microservice architecture handling user-related data is designed to serve a high-traffic B2C platform, while Botify’s challenge lies in dynamically aggregating gigabytes of SEO data and making it available in seconds. Responding in milliseconds with metadata about some ten thousand customers is not a task that requires a highly scalable microservice architecture. Quite the opposite: our backend-to-backend communications were slowing down these lightweight retrieval processes and making these requests take longer.

Shared resources mean synchronized deployments. An illustrative example: a recent project was to remove an unmaintained dependency that handled JSON database columns as text, in order to leverage the built-in jsonb column type from PostgreSQL. However, these text columns were shared between two codebases, and a progressive migration was cumbersome. Deploying multiple backends synchronously is error-prone, especially when scaled and load-balanced, and goes against the microservices’ initial benefits. We’ve always enjoyed this cautionary tale at Botify to illustrate our dislike of synchronized deployments; it’s a fun read if you have a minute.
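As a rough illustration (assuming Django 3.1+, where models.JSONField is backed by jsonb on PostgreSQL; the app, model and field names below are placeholders), the kind of change involved is a single migration converting the text column to a native jsonb column, which is only safe once every codebase mapping the table, monolith and microservice alike, has stopped treating it as raw text:

# Hypothetical sketch: converting a text column that held serialized JSON
# into a native PostgreSQL jsonb column via Django's AlterField.
from django.db import migrations, models

class Migration(migrations.Migration):
    dependencies = [("accounts", "0042_previous_migration")]  # placeholder

    operations = [
        migrations.AlterField(
            model_name="account",
            name="settings",
            field=models.JSONField(default=dict),  # backed by jsonb on PostgreSQL
        ),
    ]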

Separate codebases dilute knowledge, ownership, and involvement: the microservices were implemented by a few people in the first place, and in an attempt to shield newcomers from an unmaintained codebase later on, the same people tackled every subsequent change. This went against our code review philosophy: anyone should be able to review and understand what is going on. We like to work in the open at Botify, so anyone in the team can look at all the code and make comments or suggestions. With separate stacks, teams, and languages, we lost a lot of the benefits that teamwork and strict reviews bring to a development squad.

These were the main reasons, amongst others, that pushed us to kill our authentication microservice and migrate its endpoints back to our Django monolith.

To Kill A Microservice

There aren’t 100 ways to go about it. Deprecating and killing a microservice is as straightforward as it gets.

We wanted to kill the Customer Success backend with the smallest possible set of changes, which meant mimicking the behavior of the microservice, re-writing the API endpoints one-by-one in the monolith and switching the routing progressively. That way, changes only impacted the API, leaving the frontends untouched, and consequently avoiding dreaded synchronized deployments or API versioning.

These are the steps we took to kill our microservice:

  • Identify routes

List all the API routes served by the microservice, identifying their usage and purpose. Any low-hanging fruit is worth taking: some routes might turn out to be unused or unnecessary.

  • Create routes in destination backend

The most bothersome task is to rewrite the same logic in the destination codebase: same input, same output, with the aim of minimizing the risk of switching.

  • Y-branch the calls for a given time to compare results

Set up a simple branching system that uses both backends and compares their responses to catch mismatches (see the sketch after this list).

  • QA results, make fixes

Have a large and broad QA pass on the logic to catch any remaining bugs.

  • Progressively enlarge the Y-branch funnel

Switch the Y-branching more and more to use the newly migrated logic.

  • Once 100% on the monolith, remove microservice and clean up

Remove the instances and machines that were running.
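To make the Y-branching step more concrete, here is a minimal, hypothetical sketch of the idea (the URLs and function names are placeholders, not our actual code): serve a configurable share of traffic directly from the new Django implementation, and for the rest call both backends, compare their responses, and log any mismatch while still serving the known-good legacy response.

# Hypothetical sketch of a Y-branch: call both backends, compare their
# responses, serve the legacy one, and log mismatches. The rollout percentage
# decides how much traffic is served directly by the new implementation.
import logging
import random

import requests  # assuming plain HTTP calls to both backends

logger = logging.getLogger("y_branch")

LEGACY_BASE_URL = "http://customer-success.internal"  # placeholder URLs
NEW_BASE_URL = "http://monolith.internal"

def y_branch_get(path, params=None, rollout_percentage=0):
    """Return the response to serve for `path`, comparing both backends."""
    new_resp = requests.get(f"{NEW_BASE_URL}{path}", params=params)

    # Once confident, a growing share of traffic skips the legacy backend.
    if random.uniform(0, 100) < rollout_percentage:
        return new_resp

    legacy_resp = requests.get(f"{LEGACY_BASE_URL}{path}", params=params)
    if legacy_resp.json() != new_resp.json():
        logger.warning("Y-branch mismatch on %s", path)
    return legacy_resp  # keep serving the known-good backend while comparing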

We are proud to say we successfully killed our Customer Success NodeJS authentication backend at 10:34 on February 4th, without any synchronized deployments or API versioning.

Line chart showing the number of incoming API calls suddenly dropping to 0
Time of death: 10:34

Pain points

As previously said, killing a backend is pretty straightforward, and with our approach we did not run into major blocking issues.

The most frustrating issue we ran into is actually one we introduced all by ourselves: when building our Customer Success backend, we chose the strategy of always responding with a 200 status code, whatever the outcome of the REST call, along with a success boolean and a statusCode response key for richer errors:

{
  "errors": [
    {
      "message": "Organization was not found",
      "code": 404,
      "name": "NotFoundError"
    }
  ],
  "success": false,
  "statusCode": 404
}

While this is a common REST practice, it’s not one that sits too well with Django REST Framework. Migrating these endpoints from Express to Django without changing the contract meant heavily overriding Django REST Framework’s behavior to match what we had implemented in Express. Instead of leveraging the HTTP status code for errors, we overrode it so we could return 200 status codes with the same response keys. Frustrating, but not the end of the world. This discrepancy still itches, and we’ll fix it as part of our continuous “Kitchen” (read: Technical Debt) work if we deem it necessary.
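As a rough, hypothetical sketch of what such an override can look like (using Django REST Framework’s custom exception handler hook; the module and function names are placeholders, not our actual code), errors can be reshaped to keep the always-200 contract:

# Hypothetical sketch: wrap DRF's default exception handler so errors keep the
# legacy Express contract: HTTP 200 with success/statusCode keys in the body.
from rest_framework.views import exception_handler

def legacy_contract_exception_handler(exc, context):
    response = exception_handler(exc, context)  # DRF's default handling
    if response is None:
        return None  # let unhandled exceptions bubble up as 500s

    real_status = response.status_code
    response.data = {
        "errors": [
            {
                "message": str(exc),
                "code": real_status,
                "name": exc.__class__.__name__,
            }
        ],
        "success": False,
        "statusCode": real_status,
    }
    response.status_code = 200  # the frontend only looks at the body
    return response

# settings.py (assuming the handler lives in myapp/exceptions.py)
# REST_FRAMEWORK = {
#     "EXCEPTION_HANDLER": "myapp.exceptions.legacy_contract_exception_handler",
# }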

On the bright side, our microservice defined a smart(er) Cross-Origin Resource Sharing strategy, which had to be reproduced on our monolith. Even though it was tedious to get right at first, it allowed us to better secure our application against malicious cross-origin requests.
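For illustration only (assuming a recent version of the django-cors-headers package; the origins and settings below are placeholders, not our actual policy), a stricter CORS setup on the Django side can look like this:

# settings.py: hypothetical sketch of a stricter CORS policy with django-cors-headers.
INSTALLED_APPS = [
    # ...
    "corsheaders",
]

MIDDLEWARE = [
    "corsheaders.middleware.CorsMiddleware",  # must come before CommonMiddleware
    "django.middleware.common.CommonMiddleware",
    # ...
]

CORS_ALLOWED_ORIGINS = [
    "https://app.example.com",  # only the frontends that should call the API
]
CORS_ALLOW_CREDENTIALS = True     # the frontend authenticates with cookies
CORS_URLS_REGEX = r"^/api/.*$"    # restrict CORS headers to API routes only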

Conclusion

This is not a condemnation of microservices, nor is it a comment on Django versus Express REST backends. Each of these has its place, but we believe in choosing the right solution at the right time to build our products. Today, we’ve clearly gained from no longer switching back and forth between, and maintaining, two backends.

As a side note, when we built our microservice backend we actually built another one alongside it, which still lives happily (ever after?) today. Its function is less tied to users and to our main monolith, so it works well and there are fewer dependencies and pain points. If we had to implement a similar feature again, we would probably go with our monolith, but we don’t need to fix what isn’t broken for now.

For some use-cases and situations, we still believe in microservices. We are not excluding the possibility of creating new microservices in the future. We have, however, learned from our mistakes.

We also evolved the organization of our teams, which are now split into squads: independent entities that work on product features and technical challenges with frontend and backend Engineers, a Product Manager and a QA Engineer. Having learned from the past, each squad designs and implements its features in near-isolation, but the codebase is shared across squads, with code reviews open to all of the Engineers at Botify. This lets us share the ownership, knowledge and maintenance of the stack and reach the level of technical excellence we want to achieve together as a team.

Interested in joining us? We’re hiring! Don’t hesitate to send us a resume even if there are no open positions that match your skills; we are always on the lookout for passionate people.
