Microservice - Deep dive
Prerequisite: In the previous microservice blogs, we learned about monolithic and microservice architecture, the advantages and disadvantages of both, and when to go for which architecture.
https://medium.com/system-design-concepts/monolithic-architecture-51272d0b3393
https://medium.com/system-design-concepts/microservice-architecture-47e9581f8be9
Microservice is an architectural style in which an application is structured as a group of individual, loosely coupled, fine-grained services. Each service is single-purpose (it does one thing only) and talks with other microservices using a lightweight protocol (e.g. REST/HTTP, RPC). These services are also deployed separately and do not depend on any other service.
Let’s take the example of an e-commerce microservice application. As we can see in the picture above, it essentially comprises 4 loosely coupled, fine-grained services that are deployed individually.
Monolith is a way of developing an application where all the modules are built into one big application and deployed together. The advantage of this architecture is that it is simple to develop, deploy, and scale, but it gets complex as the customer base increases. Drawbacks include technology dependency, diluted engineering focus, difficulty scaling the data layer, overloaded VMs/containers, a hard-to-understand codebase, and many more.
FUNCTIONAL DECOMPOSITION:
A technique where we break down the application into smaller modular pieces. How do we do this? We decompose the application based on functional areas like search, payment, product, shipping, authentication, and so on, turning each functional area into its own microservice. These are now granular services deployed independently, and each has its own codebase. They can be scaled horizontally based on traffic/load.
How do users/actors/applications interact with these services?
In the above picture, the left side of the dotted line depicts monolith architecture where all the modules, i.e. search+product, shipping, and notification, are bundled together, have one common code base, and are deployed together on one server. When the shipping module wants to interact with the notification module, it can make a simple function call. Also, these modules share one common DB, which can be RDBMS or NoSQL.
The right side of the dotted line depicts microservice architecture where all the services have their own codebase (they can be written in different languages, say search+product in Java, shipping in Python, and notification in Golang) and each of these microservices has its own DB. The advantage of this is that it gives us the freedom to choose the DB as per our use case, i.e. for search we can use Elasticsearch, for shipping we can have any NoSQL store, say MongoDB, and for notification we can have an RDBMS (Oracle). If the shipping service wants to interact with the notification service, it makes a REST/RPC call to the notification service.
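To make this concrete, here is a minimal Python sketch of such an inter-service REST call, assuming a hypothetical notification service reachable at notification-service:8081 (the service name, URL, and payload fields are illustrative, not from a real system):

```python
import json
import urllib.request

# Illustrative assumption: the notification service's address would normally
# come from service discovery or DNS, not be hard-coded like this.
NOTIFICATION_URL = "http://notification-service:8081/notifications"

def build_notification_request(order_id: str, message: str) -> urllib.request.Request:
    """Build the HTTP POST the shipping service would send to notification."""
    payload = json.dumps({"order_id": order_id, "message": message}).encode()
    return urllib.request.Request(
        NOTIFICATION_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def notify_customer(order_id: str, message: str) -> dict:
    """Send the request and decode the notification service's JSON reply."""
    req = build_notification_request(order_id, message)
    with urllib.request.urlopen(req, timeout=2) as resp:
        return json.loads(resp.read())
```

Compare this with the monolith, where the same interaction is just a local function call: the REST call adds network latency and a failure mode, which is part of the interprocess-communication cost discussed below.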
As we can see in the second scenario, scaling is easy, i.e. if more and more users are performing only the search operation in our application, we can scale “search+product” without touching/affecting other services. Deployment is easy, there is no dependency on the technology stack among services, services are loosely coupled (changing one part of the code won’t affect others), and the system is easier to understand (one can focus on the service he/she is developing). Disadvantages include interprocess communication (calls over REST APIs), distributed transactions (will discuss this later in more detail), more resources, and debugging issues.
THE SCALING CUBE:
In the previous blog, we have already gone through the scaling cube concept. I am attaching the link herewith. Happy learning!!!
API GATEWAY:
In a typical e-commerce application, we can see product details, reviews, ratings, price, frequently-bought-together items, merchant details, and shipping information.
Let’s say all these services are functionally decomposed. When an eCommerce page loads up we see all this info at once. How does this happen?
Direct calls: the front end/client calls all these services one by one or in parallel, i.e. 7 different calls to get this info. It’s not the best way of doing this, as it impacts load and response time.
API GATEWAY: Another way is using an API gateway, i.e. between the client and these microservices sits one more service, the “API GATEWAY”, in the same network as these services (hence reducing the calling time). This service can internally call the other 7 services in any order and compose the response. It is much faster than the above approach.
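The gateway’s fan-out step can be sketched like this in Python; the three fetcher functions are stand-ins for real HTTP calls to the backend microservices (all names and fields here are illustrative assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

# Stubs standing in for HTTP calls to the product, review, and price services.
def fetch_product(pid):
    return {"name": f"Product {pid}"}

def fetch_reviews(pid):
    return {"reviews": ["great", "ok"]}

def fetch_price(pid):
    return {"price": 19.99}

def product_page(pid: str) -> dict:
    """Call the backend services in parallel and compose one response."""
    fetchers = {"product": fetch_product, "reviews": fetch_reviews, "price": fetch_price}
    page = {}
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = {key: pool.submit(fn, pid) for key, fn in fetchers.items()}
        for key, fut in futures.items():
            page[key] = fut.result()  # waits for each backend call to finish
    return page
```

Because the calls run in parallel inside the gateway’s network, the client pays for one round trip instead of seven.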
Advantage:
- Authentication: Instead of every microservice checking for authentication, now it can happen at the API gateway level itself (JSON web token).
- SSL termination: HTTPS is recommended for better security on the client side, while calls from the API gateway to the microservices can use plain HTTP. Clients use only HTTPS; from the API gateway onwards it can be WebSocket/HTTP/HTTPS/RPC, anything.
- Acts as a load balancer.
- Insulation: No client from outside can directly access the microservice.
Disadvantage:
- The extra hop may result in increased latency.
- Results in a complex system as we now need to maintain API gateway.
Another pattern is BFF (Back-end For Front-end), which is similar to the API GATEWAY pattern.
Here we have three different API gateways for three different purposes. If the request is coming from the web application, all calls get redirected to the API gateway for web; in the case of the mobile application and third-party users, all calls get redirected to the respective API gateway, as shown in the above image. This way we can compose different responses for different types of clients using the same microservices. Also, we can track and rate-limit the 3rd-party API usage. One API gateway can be for Android and one for iOS as well.
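The BFF idea can be sketched as two entry points composing different responses from the same (stubbed) product service. The field choices below are illustrative assumptions, not a prescribed contract:

```python
# Stub standing in for the shared product microservice.
def fetch_product(pid):
    return {
        "name": "Phone",
        "description": "A long marketing description...",
        "price": 499,
        "internal_sku": "SKU-1",
    }

def web_bff(pid):
    """Web BFF: full details for a desktop product page."""
    p = fetch_product(pid)
    return {"name": p["name"], "description": p["description"], "price": p["price"]}

def mobile_bff(pid):
    """Mobile BFF: trimmed payload for small screens and limited bandwidth."""
    p = fetch_product(pid)
    return {"name": p["name"], "price": p["price"]}
```

Note that neither BFF leaks the internal SKU field: each front end gets exactly the shape it needs from the same underlying microservice.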
SERVICE REGISTRY:
Let us consider a situation where we have horizontally scaled the search+product, inventory, merchant, and finance microservices. Let's say we have 3 instances of merchant running and one goes down. The question here arises: whose responsibility is it to inform the API gateway about the network address of the 3rd instance, which is now down, and the network address of the new 4th instance, which came up as part of auto-scaling?
There can be a situation where the “shipping” service wants to interact with the “merchant” service. How will the shipping service get to know about the IP address of the merchant service?
These are the problems which we will solve using “service discovery”.
Service discovery is a pattern to identify the network addresses of other services using a service registry, which is a kind of DB containing the list of instances and their IP addresses. So if a client wants to interact with, say, the shipping service directly, it can query the service registry and get the list of network addresses for that particular service.
The question here arises: how will the service registry get to know the latest network address of any microservice?
- Self-registration: Let's say one instance gets added to the “shipping” microservice. That instance then interacts with the service registry and adds/updates its network address, port, and service name. Every x minutes/seconds, each running instance keeps updating its location (a heartbeat); if an instance goes down and fails to update, the service registry deletes that information.
- Third-party registration: in this case, the service registry asks microservices for their address and port. It keeps checking the microservices periodically and keeps updating the service registry/DB. The advantage of this is that the registry knows exactly how many instances of a service are running and the state of each and every one.
Let’s talk about discovery now. It is the counterpart of registration: we basically access the information stored in the registry.
This answers the earlier question: “There can be a situation where the shipping service wants to interact with the merchant service. How will the shipping service get to know the IP address of the merchant service?”
The client can now query the service registry and get the list of network addresses, let’s say ip1:&lt;port1&gt;, ip2:&lt;port2&gt;, and ip3:&lt;port3&gt; in the case of merchant, and also load-balance the requests among these machines. The client can’t keep talking with the service registry every time it needs to interact with a microservice, so the cached list has a designated TTL. But what happens if the service registry updates itself while the information at the client end is still stale? This is one disadvantage of clients talking with the registry directly, and it is not advisable. Also, traffic on the service registry will be high as clients keep asking for the latest data continuously. This is what client-side discovery is. There’s something known as server-side discovery as well, where instead of the client talking with the service registry, the API gateway requests the service registry for the updated list of network addresses of all the microservices and interacts with the microservices itself.
That’s all for this blog. Much more to come.
Stay tuned!!!