Making the switch from monolith to event-driven architecture

Data consistency across microservices

Ivan Milanov
Just Eat Takeaway-tech
5 min read · Jul 14, 2021


When we started developing a service in one of our teams, we faced a dilemma: should we keep contributing to the monolith application, or take the microservice approach? We took the microservice approach.

The shared data(base) problem

The first part was simple: create another repository, new project, new deployment, easy-peasy. Next we had to create our data storage, which consisted of several tables. Unfortunately, at that point there were no separate databases, so we had to put our data in one big database with a prefix before the names of our tables (not great, not terrible). Everything worked fine, people were happy, the business was growing.

Time passed and the infrastructure was ready to handle the event-driven microservice approach at scale. Hooray! So we started moving our data into our own database. When you think about it, it’s really just moving data from one database into another. There would of course be some downtime while migrating the data, but overall it looked like a matter of changing the environment, copying the data and redeploying.

And that’s where you’re wrong, kiddo!

In reality, during the whole time the service was running, several other teams had gotten access to our data and were relying on it. So we postponed the migration of the database until everyone had access to the new one. At that point, questions like “What if we change column names?”, “What if we change the type of a column?” and “What if the business requirements change?” started to play a big role in our decision process. And after migrating, we started working on our next big move.

Taking ownership of the data

Imagine the following: you have divided your monolith and now have two new services, Customers and Orders. Both are separate services with their own databases, but up until that moment they were sharing data in the monolith database and querying each other’s tables: Customers to get all orders for a specific customer, and Orders to get the name of the customer for a specific order. Now that you have separated the data, neither service has access to it anymore.

Something similar happened with our service as well. We could have given the other service direct access to our database, but that would have created really tight coupling between the services. It would have resulted in a lot of communication and collaboration between teams for simple changes, eventually limiting their autonomy. That’s why we decided to expose endpoints to provide that data instead (a minimal sketch follows the list below). There are some great benefits to that:

  • control who has access to the data
  • make database changes independent of other teams
  • aggregate data for specific needs with optimised queries and caches
  • increase autonomy of two or more independent teams
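To make that concrete, here is a minimal sketch of what such an endpoint could look like. It is only an illustration in Python/Flask; the names (OrderRepository, the route path) are hypothetical and not our actual implementation. The point is that the contract other teams depend on is the endpoint, not the table layout behind it.

```python
# Illustrative sketch only: the Orders service owns its database and exposes
# the data other teams need through an HTTP endpoint instead of shared tables.
from flask import Flask, jsonify

app = Flask(__name__)


class OrderRepository:
    """Stand-in for the Orders service's own data access layer."""

    def get_orders_for_customer(self, customer_id: str) -> list:
        # In reality this would run an optimised query or hit a cache.
        return [{"order_id": "o-123", "customer_id": customer_id, "total": 19.95}]


repository = OrderRepository()


@app.get("/customers/<customer_id>/orders")
def orders_for_customer(customer_id: str):
    # Other teams call this endpoint instead of querying our tables,
    # so we stay free to rename columns or change types behind it.
    return jsonify(repository.get_orders_for_customer(customer_id))
```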

Of course, the balance in the Universe must be preserved, and that’s why there were some trade-offs we had to think about:

  • exposing an API means more traffic to our service, which we have to handle carefully
  • using more resources, as we have to run a whole service in order to provide results
  • you have “one more thing” to worry about when developing and maintaining the service
  • you can no longer use transactions (at least not in the simple way you used to)
  • to support transactions across services we have to implement more complex mechanisms, e.g. the Saga pattern (a rough sketch follows this list)
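The Saga pattern deserves a post of its own, but as a rough, hypothetical illustration of the idea: every step gets a compensating action that undoes it when a later step fails. The step names below are invented for the example and are not our implementation.

```python
# Illustrative sketch of the Saga idea: there is no shared transaction, so each
# step carries a compensating action that runs if a later step fails.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]
    compensation: Callable[[], None]


def run_saga(steps: List[SagaStep]) -> bool:
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            # Undo the steps that already succeeded, in reverse order.
            for done in reversed(completed):
                done.compensation()
            return False
    return True


# Hypothetical cross-service "place order" flow:
run_saga([
    SagaStep("reserve_stock", lambda: print("stock reserved"),
             lambda: print("stock released")),
    SagaStep("charge_payment", lambda: print("payment charged"),
             lambda: print("payment refunded")),
])
```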

The approach and its trade-offs should be carefully weighed, and that involves decisions on multiple levels (team, group of teams and even the whole department). And as we are constantly expanding, we just couldn’t settle for that.

Taking one step further and going event-driven

Assuming you got this far by reading everything above, you know which challenges we face with the “standard” microservice architecture, most notably the overall increase in complexity. Fortunately, some of the trade-offs can be resolved with the event-driven approach.

  • We’re not going to have constant traffic from internal services, querying our data
  • We won’t have to scale the application in order to process internal service communication

In order to take that one final step, we asked ourselves “When is the data changing?”, “What data is being changed?” and “Who is interested in that data?”. Answering the first two questions tells us when to emit messages and which part of the data we should expose in them. Is it a completed registration? Is it an item being added to the cart? Is it an order being placed? The last question tells us whether we really need an event at all: if we are the only ones interested in a given change, why bother creating another event or topic? A minimal sketch of emitting such an event is shown below.
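This is only an illustration: publish() stands in for a producer on the event streaming platform (for example a Kafka topic), and the topic name and payload fields are invented for the example rather than taken from our actual service.

```python
# Illustrative sketch: emit an event when our data changes, carrying only the
# fields other services told us they are interested in.
import json
from datetime import datetime, timezone


def publish(topic: str, payload: dict) -> None:
    # Stand-in for a real producer on the event streaming platform.
    print(f"{topic}: {json.dumps(payload)}")


def place_order(order_id: str, customer_id: str, items: list) -> None:
    # 1. Change our own data first (write the order to our database).
    # 2. "What changed?" becomes the payload; "when?" is right after the write.
    publish("order-placed", {
        "order_id": order_id,
        "customer_id": customer_id,
        "items": items,
        "placed_at": datetime.now(timezone.utc).isoformat(),
    })


place_order("o-123", "c-42", ["margherita"])
```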

The transactions problem

After we moved to the API/event-driven approach, we still had other services relying on ours in order to complete a user’s request. Imagine the following: John wants to update his avatar. We receive the image, but we rely on a response from a second service (AvatarService) that’s going to process the image, crop it, resize it and upload it to public storage. That service may take time to process all of this, and we don’t want John hanging in our system, staring at loading messages. The solution is really elegant (both sides are sketched in code after the steps below):

  • Receive the request from John
  • Save the requested action in the database with a “pending” status
  • Emit a message on the event streaming platform, notifying AvatarService about the action
  • Return a successful response to John with the pending status
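Here is a hypothetical sketch of those four steps in Python. The store, statuses, topic name and publish() stand-in are all invented for the illustration; in production the store would be our database and publish() a real producer on the event streaming platform.

```python
# Illustrative sketch of the four steps: accept the request, record it as
# pending, hand the heavy work to AvatarService via an event, reply at once.
import uuid

PENDING, SUCCESS, FAILED = "pending", "success", "failed"


class AvatarRequestStore:
    """Stand-in for our service's table of avatar-change requests."""

    def __init__(self) -> None:
        self.requests: dict = {}

    def save(self, request_id: str, user_id: str, status: str) -> None:
        self.requests[request_id] = {"user_id": user_id, "status": status}


store = AvatarRequestStore()


def publish(topic: str, payload: dict) -> None:
    print(f"{topic}: {payload}")  # stand-in for the event streaming platform


def handle_avatar_upload(user_id: str, image_bytes: bytes) -> dict:
    request_id = str(uuid.uuid4())
    store.save(request_id, user_id, PENDING)  # steps 1-2: save as "pending"
    # Step 3: in reality the image would go to temporary storage and the event
    # would carry a reference to it rather than the raw bytes.
    publish("avatar-requested", {"request_id": request_id, "user_id": user_id})
    return {"request_id": request_id, "status": PENDING}  # step 4: reply now
```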

Up until this point, John’s intended action has been accepted and he can continue to use our system. In the background, AvatarService picks up the emitted message, does whatever it has to do with the data and then emits a message on another stream (either a success message with data, or an error message with additional information). Our service listens for these events and checks whether the data in the event matches a pending request in our database. If a match is found, our database is updated with either success or failure. And voilà, we have data consistency across multiple systems in an event-driven microservice architecture. The listener side is sketched below, continuing the earlier code.
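Continuing the sketch above (same store, statuses and stand-ins), this is roughly what the listener could look like: match the incoming result against a pending request and settle its status. The event shape and field names are again hypothetical.

```python
# Illustrative sketch, continuing the previous one: settle the pending request
# when AvatarService reports back on its result stream.
def on_avatar_processed(event: dict) -> None:
    request = store.requests.get(event["request_id"])
    if request is None or request["status"] != PENDING:
        return  # no matching pending request: ignore (or log) the event
    if event.get("error"):
        request["status"] = FAILED
    else:
        request["status"] = SUCCESS
        request["avatar_url"] = event["avatar_url"]


# Simulated round trip:
result = handle_avatar_upload("john", b"raw image bytes")
on_avatar_processed({"request_id": result["request_id"],
                     "avatar_url": "https://cdn.example/john.png"})
print(store.requests[result["request_id"]]["status"])  # -> "success"
```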

In conclusion:

  • Plan for scale at early stages (as early as possible).
  • Take ownership of your overall solution, including software service, the data, and so on.
  • Provide access to your data via APIs and/or event stream(s).

Did we take that final approach and get rid of the APIs?

No.

It is not a question of either/or, but of and/and: the combination of the two approaches is actually what is most beneficial for us, as they have different applications and behaviour.

Just Eat Takeaway.com is hiring. Apply today!
