Designing scalable backend infrastructures from scratch
Designing a future-ready backend platform from the ground up is in high demand these days, but it’s not easy to make sense of the overwhelming amount of information about it online. So, in this multi-part series, we’ll build a fully featured, scalable backend step by step.
I’ve created a YouTube series out of this blog post since I got so many requests. Please follow my YouTube channel, alphacode, for a series of lectures on microservices architecture.
Link to the first crash course series: https://www.youtube.com/playlist?list=PLZBNtT95PIW3BPNYF5pYOi4MJjg_boXCG
When developing the first version of an application, you often do not have any scalability issues. Moreover, using a distributed architecture slows down development. This can be a major problem for startups, whose biggest challenge is to rapidly evolve the business model and reduce time to market. But since you are here, I’m assuming you already know that. Let’s jump straight into it while keeping the following goals in mind:
- Distribute API development: The system should be designed so that multiple teams can work on it simultaneously. No single team should become a bottleneck, nor should it need expertise in the entire application to create optimised endpoints.
- Support multiple languages: To take advantage of emerging technologies, every functional part of the system should be able to use the language best suited to that functionality.
- Minimize latency: Any architecture that we propose should always try to minimize the client’s response time.
- Minimize deployment risks: Different functional components of the system should be deployable separately with minimal coordination.
- Minimize hardware footprint: The system should try to optimize the amount of hardware used and should be horizontally scalable.
Building Monolithic Applications
Let’s imagine that you are starting to build a brand new eCommerce application intended to compete with Amazon. You would start by creating a new project on your preferred platform, such as Rails, Spring Boot or Play. It would typically have a modular architecture, something like this:
The top layer generally handles client requests and, after doing some validation, forwards each request to the service layer, where all the business logic is implemented. A service makes use of various adapters, such as database access components in the DAO layer, messaging components, external APIs or other services in the same layer, to prepare the result and return it to the controller, which in turn returns it to the client.
This kind of application is generally packaged and deployed as a monolith, meaning one big file. For example, it’ll be a jar in the case of Spring Boot and a zip file in the case of a Rails or Node.js app. Applications like these are pretty common and have many advantages: they are easy to comprehend, manage, develop, test and deploy. You can also scale them by running multiple copies behind a load balancer, and this works quite well up to a certain level.
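To make the layering concrete, here is a minimal Python sketch of the controller, service and DAO layers described above. Every name in it (`Product`, `ProductDao`, `ProductService`, `ProductController`) is hypothetical, and the in-memory dictionary merely stands in for a real database; it is an illustration under those assumptions, not a prescribed structure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Product:
    product_id: int
    name: str
    price_cents: int


class ProductDao:
    """DAO layer: hides storage behind a small interface (in-memory here)."""

    def __init__(self) -> None:
        self._rows = {1: Product(1, "Keyboard", 4999)}

    def find_by_id(self, product_id: int) -> Optional[Product]:
        return self._rows.get(product_id)


class ProductService:
    """Service layer: business logic built on adapters such as the DAO."""

    def __init__(self, dao: ProductDao) -> None:
        self._dao = dao

    def get_display_price(self, product_id: int) -> str:
        product = self._dao.find_by_id(product_id)
        if product is None:
            raise KeyError(f"unknown product {product_id}")
        return f"{product.name}: ${product.price_cents / 100:.2f}"


class ProductController:
    """Top layer: validates the request, delegates, and shapes the response."""

    def __init__(self, service: ProductService) -> None:
        self._service = service

    def handle_get(self, raw_id: str) -> Tuple[int, str]:
        if not raw_id.isdigit():
            return 400, "product id must be a number"
        try:
            return 200, self._service.get_display_price(int(raw_id))
        except KeyError:
            return 404, "product not found"


# All three layers are wired together and shipped as one unit: the monolith.
controller = ProductController(ProductService(ProductDao()))
print(controller.handle_get("1"))    # (200, 'Keyboard: $49.99')
print(controller.handle_get("abc"))  # (400, 'product id must be a number')
```

Each layer only talks to the one directly below it, which is what keeps a small monolith easy to follow.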
Unfortunately, this simple approach has huge limitations like:
- Language/Framework lock: Since the entire application is written in a single tech stack, you can’t experiment with emerging technologies.
- Difficult to Digest: Once the app becomes large it becomes difficult for a developer to understand such a large codebase.
- Difficult to distribute API development: It becomes extremely difficult to do agile development and a large part of the developer’s time is wasted in resolving conflicts.
- Deployment as a single unit: Cannot independently deploy a single change to a single component. Changes are “held hostage” by other changes.
- Development slows down: I’ve worked on a codebase which had more than 50,000 classes. The sheer size of the codebase was enough to slow down the IDE and startup times due to which productivity used to suffer.
- Resources are not optimized: One module might implement CPU-intensive image processing logic requiring compute-optimized instances, while another might be an in-memory database best suited to memory-optimized instances. With a monolith, we have to compromise on our choice of hardware. It might also happen that only one module needs scaling, but we have to run another full instance of the application because we can’t scale a module individually.
Wouldn’t it be awesome if we could break the application down into smaller parts and manage them in such a way that they behave as a single application when we run it? Yes, it would, and that’s exactly what we’ll do next!
Many organizations, such as Amazon, Facebook, Twitter, eBay and Netflix, have solved this problem by adopting what is now known as the Microservices Architecture pattern. It tackles the problem by dividing it into smaller sub-problems, or divide and conquer in developer terms. Look at figure 1 carefully: we’ll cut vertical slices out of it and create smaller interconnected services. Each slice will implement a distinct functionality such as cart management, user management or order management. Each service can be written in any language or framework and can use whichever persistence store suits its use case (polyglot persistence). Easy-peasy, right?
But wait! 🤔 We also wanted it to behave like a single application to the client. Otherwise, the client will have to deal with all the complexity that comes with this architecture: aggregating data from various services, maintaining many endpoints, increased chattiness between client and server, and separate authentication for each service. Having clients depend directly on the microservices also makes it difficult to refactor the services. An intuitive way to solve this is to hide the services behind a new service layer that provides APIs tailored to each client. This aggregator service layer is known as an API Gateway and is a common way to tackle this problem.
All requests from clients first go through the API Gateway, which routes them to the appropriate microservice. The API Gateway will often handle a request by invoking multiple microservices and aggregating the results. It might have other responsibilities such as authentication, monitoring, load balancing, caching and static response handling. Since the gateway provides client-specific APIs, it reduces the number of round trips between the client and the application, which reduces network latency and simplifies the client code.
The functional decomposition of the monolith will vary according to the use case. Amazon uses more than 100 microservices to display a single product page, whereas Netflix has more than 600 microservices managing its backend. The microservices listed in the above diagram give you an idea of how a scalable eCommerce application could be decomposed, but more careful analysis is needed before implementing it for production.
There ain’t no such thing as a free lunch. Microservices bring some complex challenges with them, like:
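The routing and aggregation role described above can be sketched in a few lines of Python. This is only an illustration: `catalog_service`, `review_service` and `inventory_service` are hypothetical in-process stand-ins for remote microservices, the `/product-page` route is made up, and calling the services concurrently with `asyncio.gather` is one possible choice rather than a requirement.

```python
import asyncio
from typing import Any, Dict


# Hypothetical stand-ins for remote microservices; real ones would be
# network calls (REST, Thrift, ...). The sleep simulates network latency.
async def catalog_service(product_id: int) -> Dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"name": "Keyboard", "price_cents": 4999}


async def review_service(product_id: int) -> Dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"rating": 4.5, "count": 12}


async def inventory_service(product_id: int) -> Dict[str, Any]:
    await asyncio.sleep(0.01)
    return {"in_stock": True}


class ApiGateway:
    def __init__(self) -> None:
        # Routing table: client-facing path -> handler that calls the services.
        self._routes = {"/product-page": self._product_page}

    async def handle(self, path: str, product_id: int) -> Dict[str, Any]:
        handler = self._routes.get(path)
        if handler is None:
            return {"status": 404, "body": None}
        return {"status": 200, "body": await handler(product_id)}

    async def _product_page(self, product_id: int) -> Dict[str, Any]:
        # The three calls are independent, so they are issued concurrently
        # and merged into the single response the client asked for.
        catalog, reviews, inventory = await asyncio.gather(
            catalog_service(product_id),
            review_service(product_id),
            inventory_service(product_id),
        )
        return {"id": product_id, **catalog, "reviews": reviews, **inventory}


gateway = ApiGateway()
response = asyncio.run(gateway.handle("/product-page", 1))
print(response)
```

The client makes one request and receives one merged document, instead of calling three services itself.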
- Distributed Computing Challenges: Since the microservices will run in a distributed environment, we’ll need to take care of the Fallacies of Distributed Computing. In short, we have to assume that the behavior and the locations of the components of our system will constantly change.
- Remote calls are expensive: Developers need to choose and implement an efficient inter-process communication mechanism.
- Distributed Transactions: Business transactions that update multiple business entities need to rely on eventual consistency over ACID.
- Handling Service Unavailability: We’ll need to design our system to handle unavailability or slowness of services. Everything fails all the time.
- Implementing features that span multiple services.
- Integration testing and change management become difficult.
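One common way to cope with the service unavailability mentioned in the list above is a circuit breaker: after a few consecutive failures, callers stop contacting the failing service for a while and fail fast instead. The article does not prescribe this pattern, so treat the following as a minimal sketch under that assumption; the `CircuitBreaker` class, its parameters and `flaky_inventory_call` are all hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected without contacting the service."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_failures = max_failures
        self._reset_after_s = reset_after_s
        self._clock = clock  # injectable so the cool-down can be tested
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, func: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after_s:
                raise CircuitOpenError("service marked unavailable")
            # Cool-down elapsed: allow one trial call through (half-open).
            # A single failure now reopens the circuit immediately.
            self._opened_at = None
            self._failures = self._max_failures - 1
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result


def flaky_inventory_call() -> str:
    # Hypothetical downstream call that is currently failing.
    raise ConnectionError("inventory service unreachable")


breaker = CircuitBreaker(max_failures=2, reset_after_s=60.0)
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky_inventory_call)
    except CircuitOpenError:
        outcomes.append("rejected fast")
    except ConnectionError:
        outcomes.append("call failed")
print(outcomes)  # ['call failed', 'call failed', 'rejected fast']
```

The third call never reaches the failing service, which keeps a slow or dead dependency from tying up the caller.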
Of course, managing the complexities of microservices manually will soon get out of hand. In order to build an automated and self-healing distributed system, we’ll need to have the following features in our architecture.
- Central Configuration: A centralized, versioned configuration system, something like ZooKeeper, changes to which are dynamically applied to running services.
- Service discovery: Every running service should register itself with a service discovery server, and the server tells everyone who is online, just like a typical chat app. We don’t want to hard-code service endpoint addresses into one another.
- Load balancing: Client side load balancing, so that you can apply complex balancing strategies and do caching, batching, fault tolerance, service discovery and handle multiple protocols.
- Inter-process communication: We’ll need to implement an efficient inter-process communication strategy. It can be anything like REST or Thrift or asynchronous, message-based communication mechanisms such as AMQP or STOMP. We can also use efficient message formats such as Avro or Protocol Buffers since this won’t be used to communicate with outside world.
- Authentication and security: We need to have a system for identifying authentication requirements for each resource and rejecting requests that do not satisfy them.
- Non-blocking IO: API Gateway handles requests by invoking multiple backend services and aggregating the results. With some requests, such as a product details request, the requests to backend services are independent of one another. In order to minimize response time, the API Gateway should perform independent requests concurrently.
- Eventual Consistency: We need to have a system in place to handle business transactions that span multiple services. When a service updates its database, it should publish an event and there should be a Message Broker that guarantees that events are delivered at least once to the subscribing services.
- Fault Tolerance: We must avoid the situation whereby a single fault cascades into a system failure. API Gateway should never block indefinitely waiting for a downstream service. It should handle failures gracefully and return partial responses whenever possible.
- Distributed Sessions: Ideally we should not keep any state on the server. Application state should be saved on the client side; that’s one of the important principles of a RESTful service. But if you have an exception you can’t avoid, always use distributed sessions. Since the client only communicates with the API Gateway, we’ll need to run multiple copies of it behind a load balancer, because we don’t want the API Gateway to become a bottleneck. This means the client’s subsequent requests can land on any running instance of the API Gateway, so we need a way to share authentication info among the instances. We don’t want the client to re-authenticate every time its request lands on a different instance.
- Distributed caching: We should have caching mechanisms at multiple levels to reduce client latency. Multiple levels simply means client, API Gateway and microservices should each have a reliable caching mechanism.
- Detailed monitoring: We should be able to track meaningful data and statistics of each functional component in order to give us an accurate view of production. Proper alarms must be triggered in case of exceptions or high response times.
- Dynamic Routing: API Gateway should be able to intelligently route the requests to microservices if it does not have a specific mapping to the requested resource. In other words, changes should not be required in API Gateway every time a microservice adds a new endpoint on its side.
- Auto Scaling: Each component of our architecture including API Gateway should be horizontally scalable and should scale automatically when required even if it is deployed inside a container.
- Polyglot Support: Since different microservices might be written in different languages or frameworks, the system should provide smooth service invocations and above mentioned features regardless of the language it is written in.
- Smooth Deployment: Deployment of our microservices should be fast, independent and automated if possible.
- Platform independent: To make efficient use of hardware and to keep our services independent of the platform on which they are deployed, we should deploy our web services inside a container such as Docker.
- Log Aggregation: We should have a system in place which automatically keeps aggregating logs from all the microservices onto a file system. These logs might be used for various analytics later on.
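Two items from the list above, service discovery and client-side load balancing, fit together naturally, so here is a small Python sketch of both. It is an in-memory toy, not the API of any real discovery server: `ServiceRegistry`, `RoundRobinClient`, the heartbeat TTL and the addresses are all assumptions made for illustration.

```python
import itertools
import time
from typing import Callable, Dict, List


class ServiceRegistry:
    """In-memory stand-in for a discovery server (hypothetical API)."""

    def __init__(self, ttl_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_s = ttl_s
        self._clock = clock
        # service name -> {address: time of last heartbeat}
        self._instances: Dict[str, Dict[str, float]] = {}

    def heartbeat(self, service: str, address: str) -> None:
        """Register an instance, or refresh it if already registered."""
        self._instances.setdefault(service, {})[address] = self._clock()

    def lookup(self, service: str) -> List[str]:
        """Return addresses whose last heartbeat is within the TTL."""
        now = self._clock()
        seen = self._instances.get(service, {})
        return sorted(a for a, t in seen.items() if now - t <= self._ttl_s)


class RoundRobinClient:
    """Client-side load balancer: rotates over the live instances."""

    def __init__(self, registry: ServiceRegistry, service: str) -> None:
        self._registry = registry
        self._service = service
        self._counter = itertools.count()

    def next_address(self) -> str:
        # Look up on every call so no endpoint address is ever hard-coded.
        addresses = self._registry.lookup(self._service)
        if not addresses:
            raise LookupError(f"no live instances of {self._service}")
        return addresses[next(self._counter) % len(addresses)]


registry = ServiceRegistry(ttl_s=30.0)
registry.heartbeat("cart", "10.0.0.1:8080")
registry.heartbeat("cart", "10.0.0.2:8080")
client = RoundRobinClient(registry, "cart")
picks = [client.next_address() for _ in range(3)]
print(picks)  # ['10.0.0.1:8080', '10.0.0.2:8080', '10.0.0.1:8080']
```

Instances that stop sending heartbeats simply drop out of `lookup` once the TTL passes, so callers stop routing to them without any configuration change.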
😮 Whoa! That’s a lot of features to implement just to take care of an architecture. Is it really worth it? The answer is “Yes”. The microservices architecture is battle-tested by companies like Netflix, which by some reports has accounted for more than a third of peak downstream internet traffic in North America.