Design to Scale (Part 2 — Modern Architecture)

Vikas Thareja
Xebia Engineering Blog
8 min read · Dec 12, 2019

In part 1 of this series, we looked at a sample application that serves its features but does not do well on scalability and cost-effectiveness.

Below is the architecture of the original app. In this final part, we will look at each of the issues and solve them as we proceed.

High coupling & low extensibility

We observed two issues:

a) The performance of the accounts service shouldn't block the performance of the payments service.

b) Development teams should be able to work on and deploy the services independently, improving productivity and reducing inter-dependent production issues.

We will start by breaking this dependency using message queues. A message queue decouples the interaction between two services and changes the communication pattern from synchronous to asynchronous. Many message queues are available, and you can read about a few important ones:

1. Azure Event Hubs

2. MSMQ

3. Kafka

Based on the features desired, an enterprise can choose a message queue or build its own. Let's look at the new architecture diagram:

What has changed now, and how does it solve the underlying problem?

To begin with, the payments service no longer has to know about accounts or any other service. After the interaction with the SWIFT gateway, the payments workflow is complete. There is no coupling between the services now.

But what if the accounts service is down or relatively slow? This is exactly what the message queue solves. In this asynchronous communication pattern, the message queue holds the messages and guarantees reliable, ordered delivery. If the accounts service goes down, it can resume reading from the queue once it is back up.

If the accounts service is slower than the payments service, you can spawn two instances of the accounts service consuming from the message queue while the payments service publishes to it, as sketched below.
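To make the pattern concrete, here is a minimal sketch in Python. It uses an in-memory queue.Queue as a stand-in for a real broker such as Kafka, and the service names and message shape are invented for illustration:

```python
# Minimal sketch: payments publishes, two accounts instances consume.
# queue.Queue stands in for a real broker (Kafka, Event Hubs, etc.).
import queue
import threading

broker = queue.Queue()

def payments_service():
    # Publish and move on: payments never waits for accounts.
    for i in range(5):
        broker.put({"payment_id": i, "amount": 100 + i})

def accounts_service(instance):
    # Competing consumers: each message is handled by exactly one instance.
    while True:
        msg = broker.get()
        if msg is None:  # shutdown signal
            break
        print(f"accounts-{instance} posted payment {msg['payment_id']}")
        broker.task_done()

consumers = [threading.Thread(target=accounts_service, args=(n,)) for n in (1, 2)]
for c in consumers:
    c.start()

payments_service()
broker.join()        # block until every published message is processed
for _ in consumers:
    broker.put(None)  # tell both consumers to stop
for c in consumers:
    c.join()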

In terms of development and productivity, we can have separate teams on the payments and accounts services, with no knowledge of each other's area. Developers do need to comply with the messaging format, though, which can itself be versioned.

But remember, there are no free lunches. The asynchronous communication pattern comes with some complexity, especially when more than one service instance publishes to or consumes from the queue. At a minimum, consuming instances must manage the order in which they read from the queue.
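As a small illustration of versioning and sequencing, here is a hypothetical message envelope in Python; the field names and handler are invented for this sketch:

```python
import json

# Hypothetical versioned envelope. "version" lets consumers support old
# and new formats side by side; "sequence" lets consumer instances
# detect gaps or out-of-order delivery.
def process_v1(payload):
    print(f"posting payment {payload['payment_id']} for {payload['amount']}")

def handle(raw):
    msg = json.loads(raw)
    if msg["version"].startswith("1."):
        process_v1(msg["payload"])
    else:
        raise ValueError(f"unsupported message version {msg['version']}")

handle(json.dumps({
    "version": "1.1",
    "sequence": 42,
    "type": "payment.completed",
    "payload": {"payment_id": 7, "amount": 250},
}))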

I hope this gives you some idea of the pattern; you can browse further to learn more about it, and we will try to publish a separate series on this topic as well.

So, what we observed is that by introducing the asynchronous communication pattern, the service dependencies are resolved. Why, then, do people talk about microservices? In which cases do they apply?

Let's move to issue 2, scalability, and try to answer that. We found three main problems under this topic:

Server Capacity

As we observed in part 1, to run 100 applications the organization had to procure at least 100 servers to keep them segregated.

Instead of procuring only physical servers, there is a better approach that has been in use for years: virtual machines (VMs), which can run the 100 segregated applications on far fewer physical servers.

But there is a big overhead with VMs: each VM needs its own operating system (OS), and each OS consumes processing power on the server. So if we create four VMs on a physical server, a sizeable share of the server's capacity is consumed by the operating systems themselves, and only the rest is available for applications.

Moreover, depending on the OS, there can be additional licensing costs as well.

So, this is a good time to introduce containers. A container is a lightweight process that isolates an application and provides a virtual platform for it to run on. Multiple containers can be deployed on a single server running a single OS.

Much better in performance than VMs!
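As a rough illustration, here is a sketch using the Docker SDK for Python (pip install docker); the image names are hypothetical and assume the service images have already been built:

```python
# Minimal sketch with the Docker SDK for Python (pip install docker).
# The image names are hypothetical. Both containers share the host OS
# kernel instead of each booting a full guest OS as VMs would.
import docker

client = docker.from_env()

payments = client.containers.run("payments-service:1.0", detach=True)
accounts = client.containers.run("accounts-service:1.0", detach=True)

# List the containers running side by side on this single host.
for c in client.containers.list():
    print(c.name, c.status)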

You can read these links to gain insight into containers:

https://www.electronicdesign.com/dev-tools/what-s-difference-between-containers-and-virtual-machines

https://docs.docker.com/

We have yet to solve the organization's scaling problems, but containerizing the applications is a good starting point: it lets us scale cost-effectively on far fewer servers, and containers are much lighter and faster than VMs, which each need their own OS to run.

Service Capacity

The architectural changes made for the "high coupling and low extensibility" problem, together with the use of containers, implicitly solve the service capacity issue.

Think of a model where the payments service is deployed in its own container, two instances of the accounts service run active/active in two containers, and a message queue brokers the communication between them. Now we can add more nodes to whichever service needs more power, and the rest of the system keeps working as-is.

Great, so our services are decoupled and containerized. But there are still problems to solve:

It sounds great that in the containerized world more nodes can be added to the service that needs more power. But is it all manual? Do we need a human monitoring the load and assigning containers to different servers?

During short peak-load periods (festival season, say), how can the applications scale without increasing long-term OPEX (operational cost)?

Also, what if everything else runs super fast but the DB can't sustain the volumes?

Let’s address these problems.

Scalability and solving the DB as a bottleneck

This is the right time to introduce another concept: the orchestrator. Kubernetes is quite popular, and a few others are available as well, such as Azure Service Fabric.

https://kubernetes.io/docs/tutorials/kubernetes-basics/

https://azure.microsoft.com/en-in/services/service-fabric/

Docker Swarm is a lightweight orchestrator, but not as feature-rich as the ones above.

The purpose of an orchestrator is to ensure that containers are automatically scaled up and down based on the rules you configure. Orchestrators have a steep learning curve, but I am sure you will take it up to become a champion of modern architecture.

So, the orchestrator solves the problem of managing the containers running your enterprise applications, whether they number in the tens or the thousands!
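As a hedged illustration, here is a sketch that changes a deployment's replica count with the official Kubernetes Python client (pip install kubernetes); the deployment name is hypothetical, and in practice a HorizontalPodAutoscaler would adjust replicas automatically based on load:

```python
# Sketch: scaling a (hypothetical) accounts-service deployment with the
# official Kubernetes Python client. In production, a
# HorizontalPodAutoscaler would adjust replicas automatically.
from kubernetes import client, config

config.load_kube_config()  # reads credentials from ~/.kube/config
apps = client.AppsV1Api()

apps.patch_namespaced_deployment_scale(
    name="accounts-service",
    namespace="default",
    body={"spec": {"replicas": 4}},  # scale out to four instances
)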

Now let's focus our attention on the peak-load scenario. This is where the cloud comes in handy: you can use any of the existing cloud providers, such as Azure, Google Cloud, or AWS, to host the applications and configure the orchestrator to scale the containers dynamically on cloud resources.

Remember that the container and the orchestrator (e.g. Docker + Kubernetes) can run on-premises as well, so depending on the level of scalability required, your organization can decide whether to deploy to the cloud. Used correctly, this paradigm gives you a cloud-native but cloud-agnostic deployment strategy, which is quite useful for managing cost in the long term.

Having crossed these obstacles, we are left with one more challenge: the DB as a bottleneck. And yes, modern architecture answers it with the concept of microservices.

By definition, a microservice focuses end to end on a single feature of your application and can be developed and deployed independently. If you have been following along, it's a no-brainer that in our example payments can be one microservice and accounts another.

Wait, isn't it already like that? What's new here? The answer is that each microservice has its own DB, which removes the shared-database dependency. But won't that affect the atomicity of transactions we have all read about since our college days? The answer is yes and no: there are patterns, such as the saga pattern, for handling scenarios that would traditionally need a two-phase commit.
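As one illustration of such a pattern, here is a minimal saga-style sketch in Python: each local transaction has a compensating action that undoes it if a later step fails. All function names and steps are invented for this example:

```python
# Minimal saga sketch: local transactions with compensating actions
# replace a distributed two-phase commit. All names are illustrative.
def debit_payments_db(payment):
    print(f"payments DB: debit {payment['amount']}")

def refund_payments_db(payment):
    print(f"payments DB: refund {payment['amount']} (compensation)")

def credit_accounts_db(payment):
    print(f"accounts DB: credit {payment['amount']}")

def run_saga(payment):
    completed = []  # compensations for steps that already succeeded
    steps = [
        (debit_payments_db, refund_payments_db),
        (credit_accounts_db, None),
    ]
    try:
        for step, compensation in steps:
            step(payment)
            completed.append(compensation)
    except Exception:
        # Roll back by running compensations in reverse order.
        for compensation in reversed(completed):
            if compensation:
                compensation(payment)
        raise

run_saga({"payment_id": 7, "amount": 250})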

You can read further to get started on the microservices concept. Here is the updated architecture, with two instances of the accounts service:

Remember, this whole architecture can be deployed on-premises as well; it's just that the number of servers will be constrained by your data centre, whereas the cloud is almost endless.

Having covered this, there are a few more concepts that are essential to understand. They are not in the scope of this post, but you should read about them:

1. REST API

2. Stateless service

3. Serverless architecture

So, finally, we have moved from a monolithic to a microservice-based design, giving us potentially unlimited scalability in the cloud at optimal cost.

We can quickly cover the other two issues:

Release pipeline and CI/CD

This is a great benefit of microservice-based design. Since the systems are decoupled, they can be independently developed and continuously deployed. The cloud gives you many out-of-the-box features to set up pipelines and deployment slots.

Similar functionality can be achieved on-premises with the right tools. The important point is that our architecture supports independent development and deployment of services.

Feel free to explore this aspect further, especially if you are into DevOps or release management. The industry is moving towards automated pipelines and CI/CD, so it is useful to up-skill sooner rather than later. You can go through this link to get started: https://azure.microsoft.com/en-in/services/devops/pipelines/

Limitations

Availability on mobile devices

This is where API-first design is important. Our services need to expose their features as REST APIs that can be consumed by different UI frameworks, which gives immense flexibility on a single code base. As with the other topics, there is some learning curve in understanding and, more importantly, practicing API-based design, and you will find plenty of good posts on the topic.
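As a minimal sketch of what such an endpoint might look like, assuming Flask (pip install flask) and an invented /payments route; web, mobile, and desktop UIs could all consume this same JSON API:

```python
# Minimal sketch of a REST endpoint with Flask; the route and data are
# invented for illustration. Any UI framework can consume the JSON.
from flask import Flask, jsonify

app = Flask(__name__)

PAYMENTS = {7: {"payment_id": 7, "amount": 250, "status": "completed"}}

@app.route("/payments/<int:payment_id>")
def get_payment(payment_id):
    payment = PAYMENTS.get(payment_id)
    if payment is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(payment)

if __name__ == "__main__":
    app.run(port=8080)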

Cognitive / other services

Well, the application design is now cloud-native and cloud-agnostic. You can leverage whichever services you want, deploy on any cloud, or keep the application on-premises and consume cloud offerings through their APIs.

The possibilities are endless.

Conclusion

Modern architecture and the cloud have evolved from the solutions engineers have built over the years. For example, Kubernetes was developed in-house at Google, building on its experience running workloads at scale, and eventually became open source.

There is surely some learning curve, but it's inevitable: applications can no longer be developed and deployed in legacy ways. To stay relevant, engineers must learn these practices and organizations must adopt them.


Vikas Thareja
Xebia Engineering Blog

Enterprise Architect. Passionate about the effective use of the right technology practices to generate optimal value for businesses.