The case when HTTP threatens the Database

Intro

Almost two years ago I decided to start a journey of learning AWS Lambda and the Serverless Framework. For a Ruby/Rails developer that is not a common thing to do, as many of you might think, but if you know me well you will understand: I'm always open to learning new things, constantly trying to strengthen my problem-solving skills by expanding my horizons. I still have a passion for Ruby, but this experience has been amazing and complementary; I've learned a lot about these tools and their ecosystem and, from a more general perspective, about architectural resources to use when dealing with massive data volumes and real-time applications. Today I want to take some ideas related to architecture, not to any particular technology, and put them here as a reference for the future me and for those who might be interested. In particular, I will describe a solution I tend to use for massive data ingestion.

Capacity

Dealing effectively with big systems requires a lot of effort in many areas, but this time I'm going to focus on capacity management. Clearly, if we want stable systems we need to pursue predictability by doing some capacity planning. Why, you ask? First, we don't want traffic spikes to overwhelm any service in our system. Second, who wants to spend lots of money on infrastructure just to remain operational under occasional bursts? Third, no one considers a service outage an option.

HTTP Bursts

HTTP is a push-based protocol: clients push requests to the server, and it's the server's responsibility to return responses, otherwise the client gets an error. There is no choice but to accept all the requests coming from the external world, period. This is not a simple problem to solve, but the tooling has gotten great these days: we have Docker, Kubernetes and many cloud providers supporting auto-scaling and load balancing out of the box, which is amazing. However, heads up! Accepting the requests is just the tip of the iceberg.

Push systems can overwhelm the capacity of the receiver when the producer is faster than the consumer, creating bottlenecks.

Accepting the request is just the beginning of the request-response cycle. What happens to the rest of the system under a request burst? How is it affected by accepting a massive number of requests?

Well, each request triggers a number of actions unleashed by the request-processing logic, and these actions go through every part of the system at a very high rate, hitting every component on the road. Normally you would see this as the regular flow of a back-end, but it's a matter of bandwidth and capacity. Take the database as an example: when the number of ingested requests grows significantly for a period of time, writes propagate at very high speed, the database has to work hard to handle the load, and if the load is intensive enough it could eventually cause a service outage. Since this situation is unpredictable (users connect at different times, from different countries, and so on), you will be forced to give more power to the database just to deal with bursts, leaving it overprovisioned most of the time for no reason. This is not maintainable for DBAs nor for the business; its unpredictable nature prevents you from planning resource allocation, and it's not cost-effective at all. (Even under predictable high load, it's not feasible to base a database resource allocation policy on the HTTP request load; a linear matching of the two is still not cost-effective.)

All the requests that entered the system will hit the Database at a high rate, eventually causing an outage.
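To make the problem concrete, here is a minimal sketch of the coupled pattern, assuming Postgres through the pg gem and a hypothetical events table (names are illustrative, not from any particular system):

require "pg"

db = PG.connect(dbname: "app")

# The coupled pattern: every accepted request turns directly into an
# INSERT, so the database write rate follows the (unbounded) HTTP
# request rate one-to-one. Under a burst, nothing regulates the flow.
def accept_request(db, payload)
  db.exec_params("INSERT INTO events (payload) VALUES ($1)", [payload])
end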

How do we prevent this from happening without losing the ability to accept a high number of requests?

The way I see it, we can do better. The solution comes from creating more predictable write patterns, where you write fast enough but never go over capacity, by regulating the rates.

Enter pull systems

Pull systems have a steady flow that provides predictability. The consumer processes things at its own pace.

Call it a contention mechanism, back-pressure or reactiveness; pick the buzzword of your choice :) The idea is to introduce an intermediary service just after your HTTP endpoints, where you store "events" containing the information of what each request intended to do. Sidekiq, message queues/brokers such as RabbitMQ or SQS, or more sophisticated platforms such as Kafka, Kinesis or Redis Streams are all valid alternatives for implementing this pattern. How are they better than the database? These services are optimized to behave reliably under huge load; they will accept all your data at a really high rate. Notice that we still need a database: the intermediary service is meant to reduce the volatility of the data flow inside the system, so that you can programmatically pull data from it and consume it at a rate that suits the database or any other downstream service. Basically, you get to regulate the speed at which data is sent to the next step in the processing chain, putting you in control of moving data, and making it way easier to operate at full capacity, not over it.

The intermediary service allows events to be propagated at a regulated speed, preventing downstream services from going over capacity.
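Here is a rough sketch of the ingestion side, assuming Redis Streams as the intermediary and a hypothetical stream named "requests" (any of the brokers mentioned above would play the same role). It is the same accept_request as before, now appending to the stream instead of the database:

require "redis"
require "json"

redis = Redis.new

# Instead of writing to the database, the HTTP handler only appends an
# event describing what the request intended to do. XADD is a cheap
# append, so the stream absorbs bursts at a far higher rate than the
# database could sustain.
def accept_request(redis, payload)
  redis.xadd("requests", { data: JSON.generate(payload) })
end

accept_request(redis, { user_id: 42, action: "signup" })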

A back-pressure mechanism is required whenever the publisher is faster than the consumer, which is basically the case for most back-ends out there. A friend suggested connection pooling as a solution, but notice that this strategy operates at the instance level: a pool caps the connections of a single process, so with many instances the aggregate write rate is still unbounded, which makes it feasible only in the early stages when you are not even considering service duplication. What is discussed here, just to clarify again, is an option mostly for distributed systems; imagine multiple instances of Rails spread across a cluster, or a number of auto-scaling HTTP AWS Lambda functions.
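And a sketch of the pulling side, under the same Redis Streams assumption; persist_to_database is a placeholder for your actual write logic, and the batch size and pause are the knobs you would tune against the database's measured capacity:

require "redis"
require "json"

# Placeholder for the real write logic; in a real system this would be
# a batched INSERT or an ActiveRecord call.
def persist_to_database(event)
  puts "writing #{event.inspect}"
end

redis = Redis.new
last_id = "0" # start from the beginning of the stream

loop do
  # Pull at most 100 events per iteration, waiting up to 5 seconds
  # (block is in milliseconds) when the stream is empty. The consumer,
  # not the HTTP traffic, dictates the pace at which writes reach the
  # database.
  entries = redis.xread("requests", last_id, count: 100, block: 5_000)
  (entries["requests"] || []).each do |id, fields|
    persist_to_database(JSON.parse(fields["data"]))
    last_id = id
  end
  sleep 0.1 # crude rate regulation; tune to the database's capacity
end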

Conclusion

What I have just covered is only the beginning; actually, I only tackled a way to optimize writes. There are many other things involved in this area, starting with read optimization, and I plan to write further about my experiences. If you are eager to learn more, here are some terms you can dig into: Domain-Driven Design (DDD), Event Sourcing (ES), Command Query Responsibility Segregation (CQRS) and Eventual Consistency.

Last words

Even though this is only a piece of a bigger picture, the results still feel enormous. It would please me if you now start thinking about capacity planning and contention mechanisms to prevent service outages or poor predictability in your systems; I guess that would make my day :)

Happy coding!