Supercharge your microservices by splitting load to read-only database replicas.

Amol Ghotankar
Engineering @Varo
5 min read · Aug 24, 2021

It’s 2021 and most likely you have already moved to a state-of-the-art, cloud-based, Kubernetes-backed infrastructure supporting tons of microservices and an awesome service mesh. These microservices are most likely connected to a cloud-scalable, fully managed, multi-zone, highly available database cluster. You are pouring tons of money into the pockets of cloud providers for the scale you need, but you still worry: is all the infrastructure you procured enough, or have you over-provisioned?

OK, let’s run a quick audit of the infrastructure to understand if it is being leveraged to the full potential.

  1. Firewall/Edge Protection — Yes, we can validate all bad guys are being kept out.
  2. Networking — Yes, seeing good traffic, good throughput, and all networking gear in use, so services are able to talk to each other.
  3. Queues — Yes, seeing messages coming in and out, and microservices are getting their data feeds.
  4. Containers — Yes, seeing pods processing requests and serving our customers.
  5. Databases — Main cluster utilization is a solid 50–60% CPU. But wait, what are these “read replicas” doing? CPU is just 2%, and they are not really being used? We need them for failover, since we can’t lose any data or stand any downtime for our customers. These are the most underutilized beasts, just burning cash.

Yes folks, studies across the industry suggest that databases are typically oversized relative to their average usage in order to cover abrupt spikes, and most microservices architectures are backed by typical RDBMS systems that aren’t intrinsically built to autoscale.

Hold on. Do terms like “abrupt spike,” “autoscaling,” “scaling the DB,” or “optimization” resonate with you?

Great. Let’s dive deeper and spend some time understanding the problem itself. An abrupt spike is an unexpected, sudden, and short-term load your system faces. It might be due to user habits, seasonal behavior, targeted marketing campaigns, an external reason beyond your control, or the best scenario: a viral campaign that attracted a huge number of new and existing customers.

Whatever the cause, as an engineer you want your systems to scale, absorb the spike, and keep serving your users. Downtime definitely hurts your brand big time.

Let’s analyze the spike a little more before we come up with a solution. What were users doing when the spike occurred? Were they logging in, signing up, landing on the default screen?

After analysis, the activities can typically be grouped into two major categories:

  • User gets some information from the server
  • User sends some information to the server

From a database point of view, this means users performed operations that caused a spike in reads or writes to the database.

So how do we handle this? Add more infrastructure? Pre-scale the database? Or can we have our microservices split the load between reads and writes? What if, when spikes hit, we use those underutilized read replicas to handle reads? Or even better, we always serve certain business-critical reads from the read replicas. Sounds like a win-win, but how?

Let’s say we want to split read and write calls. We can make it happen with one of the approaches below:

  • Client Aware — Let the client decide which calls are read vs write.
  • Network Aware — Let the network or orchestration decide which calls are read vs write.
  • Service Aware — Let the service decide when to make read vs. write calls to the DB.

Client Aware

Figure: Client-aware database connections to read-only replicas

In this pattern, typically known as CQRS (Command Query Responsibility Segregation), the server provides distinct endpoints for reads and writes, so the client knows when it needs to read vs. write and calls the appropriate endpoint. The pattern has several implementation sub-patterns: the server may host the read and write endpoints on completely separate services, encapsulate caching logic, or even connect to separate database hosts; for example, write to an RDBMS like Postgres or MySQL but read from Elasticsearch or MongoDB.
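To make the split concrete, here is a minimal CQRS sketch in Python. It is illustrative only: in-memory dicts stand in for the primary database and the read store, and the `CommandHandler`/`QueryHandler` names and the synchronous sync step are assumptions for the sake of a runnable example (in production, replication or change data capture would feed the read side asynchronously).

```python
# Minimal CQRS sketch: writes go through a command handler to the
# primary store; reads go through a separate query handler that owns
# its own read model (standing in for a replica, cache, or search index).

class QueryHandler:
    """Read side: serves queries from its own store, never touching the primary."""
    def __init__(self):
        self.store = {}  # stands in for a read replica or Elasticsearch index

    def apply(self, user_id, record):
        # Called when a change propagates from the write side.
        self.store[user_id] = record

    def get_user(self, user_id):
        return self.store.get(user_id)


class CommandHandler:
    """Write side: owns the primary store and pushes updates to the read model."""
    def __init__(self, read_model):
        self.primary = {}  # stands in for the primary RDBMS
        self.read_model = read_model

    def create_user(self, user_id, name):
        self.primary[user_id] = {"id": user_id, "name": name}
        # Synchronous for the sketch; real systems replicate asynchronously.
        self.read_model.apply(user_id, self.primary[user_id])


queries = QueryHandler()
commands = CommandHandler(queries)
commands.create_user(1, "ada")
print(queries.get_user(1))  # served entirely from the read side
```

The point of the separation is that the read side can be scaled, cached, or re-hosted independently of the write side, which is exactly what lets reads land on the replicas.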

Pros

  • Clear separation between reads and writes.
  • More mature and less friction.
  • Fits well when there is a domain-driven RESTful API making a clear distinction, where GET means read and POST/PUT means write.

Cons

  • Works well if you start with this pattern from day 1; otherwise you’ll need to invest in re-architecture.
  • Clients need to know when to read vs write.
  • May not fit well where clients talk to a secondary orchestration layer, or in complex multi-domain operations where internally a GET may perform a write and a POST may perform several reads.

Network Aware

Figure: Network-aware database connections to read-only replicas

In this pattern, the client and server are typically not changed much; instead, routing, orchestration, or network configuration directs traffic to copies of the service that are either read-write or read-only. So if you have Service-A, another copy of the same service, Service-A-RO, is deployed. Depending on whether the caller needs to read or write, traffic is directed to the respective service instance. Clients can leverage headers to override or force a switch, which hands some control back to the client and extends the pattern into a network-aware plus client-aware capability.
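The routing decision itself is simple enough to sketch. The function below shows one plausible rule a gateway or mesh might apply: read-only HTTP methods go to the RO copy, everything else to the RW copy, with a header override for clients that know better. The upstream names come from the article; the `X-Force-Primary` header is an assumed, illustrative convention.

```python
# Sketch of method-based routing with a client override header,
# as a gateway or service mesh might implement it.

READ_ONLY_METHODS = {"GET", "HEAD", "OPTIONS"}

def pick_upstream(method, headers=None):
    """Route read-only methods to Service-A-RO; allow clients to force RW."""
    headers = headers or {}
    # Client-aware escape hatch: e.g. for a GET that is known to
    # write internally, or when read-your-own-writes consistency matters.
    if headers.get("X-Force-Primary") == "true":
        return "service-a"
    if method.upper() in READ_ONLY_METHODS:
        return "service-a-ro"
    return "service-a"

print(pick_upstream("GET"))                               # service-a-ro
print(pick_upstream("POST"))                              # service-a
print(pick_upstream("GET", {"X-Force-Primary": "true"}))  # service-a
```

In practice the same rule would live in an nginx map, an Envoy route, or mesh configuration rather than application code; the sketch only makes the decision table explicit.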

Pros

  • Not much development needed to get started.
  • Easy to switch between read-write and read-only.

Cons

  • Needs at least two service instances, and if the load fluctuates between RO and RW, you’ll need to keep both of them scaled.
  • Because both the RW and RO instances have read and write functionality, there is a risk of the network or orchestration layer routing a call to the wrong instance, e.g. a write to the read-only copy.

Service Aware

Figure: Service-aware database connections to read-only replicas

In this pattern, we encapsulate inside the service the logic of whether an API call should hit the database for a read or a write. We keep this encapsulation as close to the database connection as possible to get enough granularity, keep clients from worrying about internal details, and give the server more control.
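One way this can be sketched is a routing connection holder with a context manager marking read-only units of work, similar in spirit to Spring’s `AbstractRoutingDataSource` in the Java world. Everything here is illustrative: two local SQLite databases stand in for the primary and its replica, and the manual second insert simulates replication having caught up.

```python
import sqlite3
from contextlib import contextmanager

# Service-side connection router: code paths marked read-only get the
# replica connection; everything else gets the primary.

class RoutingDB:
    def __init__(self, primary_dsn, replica_dsn):
        self.primary = sqlite3.connect(primary_dsn)
        self.replica = sqlite3.connect(replica_dsn)
        self._read_only = False

    @contextmanager
    def read_only(self):
        """Mark a unit of work as read-only so it is served by the replica."""
        prev, self._read_only = self._read_only, True
        try:
            yield
        finally:
            self._read_only = prev  # restore, so nesting works

    @property
    def conn(self):
        # Failover hook: if the replica were down, fall back to primary here.
        return self.replica if self._read_only else self.primary


db = RoutingDB(":memory:", ":memory:")
db.primary.execute("CREATE TABLE users (id INTEGER, name TEXT)")
db.replica.execute("CREATE TABLE users (id INTEGER, name TEXT)")

db.conn.execute("INSERT INTO users VALUES (1, 'ada')")     # routed to primary
db.replica.execute("INSERT INTO users VALUES (1, 'ada')")  # simulate replication

with db.read_only():
    row = db.conn.execute("SELECT name FROM users WHERE id = 1").fetchone()
print(row)  # ('ada',)
```

Because the switch lives next to the connection, the service can later add per-query overrides, header-based opt-outs, or replica-lag checks without any client changes, which is exactly the control this pattern promises.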

Pros

  • Abstraction & Encapsulation of the implementation details inside a service.
  • More control inside the service and keeps clients free from any of these implementation details.
  • Makes it easier to switch over from readonly to readwrite and handle failover.

Cons

  • Requires development effort, and a decision about which API or unit of work should be read-only.
  • Clients inherently do not have much control unless we build a header or some other explicit way to override.

Conclusion: Each approach, like any technical solution, has its own advantages and disadvantages. The choice depends on your use case and your short-term, mid-term, and long-term goals. Whichever approach you choose, having one is certainly helpful for gaining the right competitive advantage. Share your thoughts on which approach you would use and why. Happy coding.



Techie who is always excited to solve business problems with engineering solutions that empathize with the users.