Understanding Gousto Architecture: The Theory

Chloe Helen
Oct 29, 2021 · 5 min read

Flashback to 10 months ago and my interviews with Gousto Tech:

Me: So, Gousto has various microservices across a distributed system… I currently work on a domain with a similar architecture; it’s made up of 8 different microservices. How many are there at Gousto?

Interviewer: Oh, somewhere in the hundreds

Me: …Woah.

Fast forward to today and, while it’s true that working on a large-scale distributed system definitely has its challenges, I’ve learnt that Gousto’s architecture is governed by a set of key principles and design patterns. Understanding those has been essential to seeing the system at a macro level and to managing the complexity that often follows.

So, what does architecture look like at Gousto Tech?

Domain Encapsulation

Having a microservice architecture means that Gousto’s business functionality is split into loosely coupled services, each with its own clearly defined domain and associated responsibilities.

A service can be as small as a single function or be composed of a few different elements such as an API, a database and maybe a function or two. While there’s no single rule for what a microservice should look like, every service must adhere to the Domain Encapsulation Principle:

  • Each service is independent and loosely coupled
  • A good test for this is whether or not it can be deployed independently, without requiring redeploys of other services
  • Each service is the master of its own data
  • Its data store is the Source of Truth and no other service can edit it directly
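
As a rough illustration of the last two points, here is a minimal in-memory sketch. The class, method names and address are hypothetical and are not Gousto's actual code: the only write path goes through the owning service, and readers receive copies rather than references to the source of truth.

```python
from copy import deepcopy


class CustomerService:
    """Hypothetical service that is the sole editor of its customer data."""

    def __init__(self):
        # Private store: the source of truth for customer records.
        self._customers = {}

    def set_address(self, customer_id, address):
        # The only write path to the data goes through the owning service.
        self._customers[customer_id] = {"address": address}

    def get_customer(self, customer_id):
        # Hand out a copy so callers cannot mutate the source of truth.
        return deepcopy(self._customers[customer_id])


customers = CustomerService()
customers.set_address("c1", "1 Example Street")

snapshot = customers.get_customer("c1")
snapshot["address"] = "tampered"  # changes only the caller's copy
```

After the last line, the record held by the service is unchanged; only the caller's snapshot differs.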

Deciding how to split the business functionality into well-defined, independent domains is a complex task, but suppose we simplify Gousto to one key customer journey:

A customer selects some recipes, creates an order and receives a delivery

We could decide to split the business functionality into four such domains (Recipe, Order, Delivery and Customer) and create a microservice for each.

Asynchronous Communication

Now that the functionality has been split, we need to introduce a communication channel to enable our user journey. At Gousto, communication is asynchronous by default.

This means that, rather than making synchronous requests to an API on the service, each service will publish a message when an action of interest occurs. Other services can listen out for relevant messages and, when they receive one, carry out their part. This is known as the Publish/Subscribe model, or pub/sub.

In our example, a customer might select some recipes, triggering the Recipe Service to send a message that the Order Service picks up. The Order Service turns those recipes into an order and then sends a message that the Delivery Service picks up so that it can organise a delivery.
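
The flow in this example can be sketched with a toy in-memory message bus. Everything here is an assumption made for illustration: the `MessageBus` class, the topic names and the message fields are hypothetical, and the bus calls handlers directly in-process, whereas a real broker would deliver messages asynchronously.

```python
from collections import defaultdict


class MessageBus:
    """Toy publish/subscribe bus (hypothetical, in-process only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Every subscriber to the topic receives the message.
        for handler in self._subscribers[topic]:
            handler(message)


bus = MessageBus()
deliveries = []


def order_service(message):
    # Turn the selected recipes into an order, then announce it.
    order = {
        "order_id": 1,
        "customer_id": message["customer_id"],
        "recipes": message["recipes"],
    }
    bus.publish("order-created", order)


def delivery_service(order):
    # React to the new order by scheduling a delivery.
    deliveries.append({"order_id": order["order_id"], "status": "scheduled"})


bus.subscribe("recipes-selected", order_service)
bus.subscribe("order-created", delivery_service)

# The Recipe Service publishes; it knows nothing about who is listening.
bus.publish("recipes-selected", {"customer_id": "c1", "recipes": ["curry", "risotto"]})
```

Note that the publisher never calls the Order or Delivery Service by name; the only thing the services share is the topic and the shape of the message.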

Asynchronous communication via messages provides many benefits: it helps to keep services decoupled, and the message format is a clear contract for downstream services. Further, it makes the system more robust. For example, if the Order Service were down for a short time, messages could join a queue and be processed later. Likewise, if there were a huge number of orders in a short period of time, messages could join a queue for the Order Service to process when it has capacity.
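
To make the queueing behaviour concrete, here is a small sketch in which a `deque` stands in for a durable message queue. The function names and message fields are hypothetical: publishers keep adding messages while the Order Service is unavailable, and the service works through the backlog in order once it is back.

```python
from collections import deque

order_queue = deque()  # stands in for a durable message queue
processed = []


def publish(message):
    # Publishing succeeds whether or not the consumer is running.
    order_queue.append(message)


def order_service_poll():
    # Called only while the Order Service is up: drain the backlog in order.
    while order_queue:
        processed.append(order_queue.popleft())


# The Order Service is down, but upstream services carry on publishing.
publish({"customer_id": "c1", "recipes": ["curry"]})
publish({"customer_id": "c2", "recipes": ["risotto"]})
backlog_while_down = len(order_queue)

# The Order Service comes back and catches up.
order_service_poll()
```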

Data Bulkheading

Going back to our example, there are a couple of gaps left to fill. If we’re relying on asynchronous communication, how does the Delivery Service know where to send the delivery, given that address details are owned by the Customer Service?

At Gousto, we use Data Bulkheading to solve this problem. Data Bulkheading is a pattern by which each service maintains its own data store, containing just the data it needs to do its job. We use our pub/sub communication channels to create and update those data stores.

So, when someone signs up to Gousto and adds an address to their account, the Customer Service will publish a message to which the Delivery Service is subscribed. The Delivery Service will store a copy of that address data so that, when the customer makes an order, it can organise the delivery.
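
A minimal sketch of that idea follows, with hypothetical handler names and message fields. The Delivery Service keeps only the address data it needs and reads from its own copy when an order arrives, without calling the Customer Service.

```python
class DeliveryService:
    """Hypothetical service holding its own copy of the address data."""

    def __init__(self):
        self.addresses = {}   # local copy, fed by Customer Service messages
        self.deliveries = []

    def on_address_updated(self, message):
        # Keep just the data this service needs to do its job.
        self.addresses[message["customer_id"]] = message["address"]

    def on_order_created(self, message):
        # No call to the Customer Service: the address is read locally.
        address = self.addresses[message["customer_id"]]
        self.deliveries.append({"order_id": message["order_id"], "address": address})


delivery = DeliveryService()
delivery.on_address_updated({"customer_id": "c1", "address": "1 Example Street"})
delivery.on_order_created({"order_id": 42, "customer_id": "c1"})
```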

Data Bulkheading helps keep our services decoupled: if there’s an issue with one service, the other services shouldn’t be affected. Also, our services are not all reliant on one centralised data store, which could cause performance issues as the number of reads/writes grows. However, given that our communication is asynchronous, downstream data stores will inevitably be out of sync with the source of truth for a short period of time. For example, if a customer were to update their address, the Delivery Service would still have the old address until it’s able to receive and process the address-updated message. This is known as eventual consistency and is a trade-off we make against the benefits of high availability of data. In almost all cases, a short time delay has no impact and, where it does, we can make appropriate mitigations.
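
The window of inconsistency can be shown with a short sketch. The stores, function names and addresses below are made up for illustration: the source of truth changes immediately, while the downstream copy stays stale until the pending message has been processed.

```python
from collections import deque

customer_store = {"c1": "1 Old Road"}   # source of truth (Customer Service)
delivery_store = {"c1": "1 Old Road"}   # downstream copy (Delivery Service)
pending = deque()                        # address-updated messages in flight


def update_address(customer_id, address):
    # The owning service writes first, then publishes the change.
    customer_store[customer_id] = address
    pending.append({"customer_id": customer_id, "address": address})


def delivery_process_pending():
    # The downstream copy only changes once messages are processed.
    while pending:
        message = pending.popleft()
        delivery_store[message["customer_id"]] = message["address"]


update_address("c1", "2 New Lane")
stale = delivery_store["c1"]   # still the old address at this point

delivery_process_pending()
fresh = delivery_store["c1"]   # now matches the source of truth
```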

Command Query Responsibility Segregation (CQRS)

CQRS creates a separation between executing commands on our data (create/update/delete) and querying our data (read). Whilst the implementation of this can take many different forms, at Gousto it means that each service is the sole editor of its own data and other services need their own copy in order to query it.

Our earlier principles of Domain Encapsulation and Data Bulkheading come together to help us achieve this. Data Bulkheading ensures that services have their own data store. Each store can be queried by its own domain-specific service so that the service can perform its individual business function. This gives us our separated query functionality.

There are many benefits to CQRS on the query side: we ensure a high availability of data on small databases that are easy to query, data models and access patterns can be tailored to each individual service’s requirements and services can scale independently.

When it comes to creating and updating data, only the owning service can edit the data at the source of truth. Going back to our example, only the Customer Service can edit the customer data and only the Recipe Service can edit the recipe data. This gives us our separated command functionality. All other services find out about these updates via messages.

In this way, we can think of a service’s database as consisting of a combination of its own data and relevant data from upstream services. This enables each service to query the data it needs, while protecting the source of truth behind a separated command service.
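
One way to picture the split is the sketch below, which assumes a simple in-memory outbox in place of a real message channel. The class names, event shape and postcodes are hypothetical. Commands go through the owning service, while a downstream read model applies the resulting events and shapes the data for its own queries.

```python
class CustomerService:
    """Command side: the only place customer data is edited (hypothetical)."""

    def __init__(self):
        self._customers = {}
        self.outbox = []  # events waiting to be published

    def change_address(self, customer_id, street, postcode):
        self._customers[customer_id] = {"street": street, "postcode": postcode}
        self.outbox.append({
            "type": "address-updated",
            "customer_id": customer_id,
            "street": street,
            "postcode": postcode,
        })


class DeliveryReadModel:
    """Query side: a copy shaped for the Delivery Service's own lookups."""

    def __init__(self):
        self._postcode_by_customer = {}

    def apply(self, event):
        # Only the fields this service needs are kept.
        if event["type"] == "address-updated":
            self._postcode_by_customer[event["customer_id"]] = event["postcode"]

    def customers_in_postcode(self, postcode):
        return sorted(
            customer_id
            for customer_id, stored in self._postcode_by_customer.items()
            if stored == postcode
        )


customers = CustomerService()
read_model = DeliveryReadModel()

customers.change_address("c1", "1 Example Street", "AB1")
customers.change_address("c2", "9 Sample Road", "AB1")
customers.change_address("c3", "5 Test Avenue", "CD2")

for event in customers.outbox:
    read_model.apply(event)

same_area = read_model.customers_in_postcode("AB1")
```

The read model never writes back to the Customer Service; if it needs different data, the change has to come from the command side as a new event.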

This adds complexity in keeping all of our databases in sync but, by understanding eventual consistency and knowing when to make exceptions to this rule, we can ensure all services have the correct data they need to perform their actions.

Adhering to our target architecture principles at Gousto Tech helps us to innovate and grow while reducing complexity (as, with more services than I can count, things can get pretty complicated!)

Gousto Engineering & Data
