Who limits the rate-limiter?
Queueing API requests to use Cloudant more efficiently
APIs make our lives easier. As developers, we are all consumers of APIs built and maintained by service providers. It’s important to remember that this relationship is a two-way street.
To ensure a good experience for all applications hitting their APIs, providers need to limit the rates at which they are consumed. Whether your app experiences a sudden surge in popularity, or buggy code is unintentionally flooding a service with requests, you can make API providers’ lives easier by accounting for usage spikes in your design.
In this article, I’ll explore a simple technique for queueing API requests, using the Cloudant database service as an example.
Get in line to miss the 429
With most APIs, if you exceed the rate limits, your request will get an “HTTP 429 Too Many Requests” response. The Cloudant Node.js library has a ‘retry’ plugin that will resend such requests. This approach is handy when you only occasionally hit a limit, perhaps at times when your site or app is unusually busy.
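To make the retry idea concrete, here is a minimal sketch of retrying on a 429 with exponential backoff. It is not the Cloudant library's plugin: the names `withRetry` and `backoffDelay` are hypothetical, and I am assuming the failed request rejects with an error that carries a `statusCode` property, which may differ in your HTTP client.

```javascript
// Delay before retry number `attempt` (0-based): doubles each time, capped at maxMs.
function backoffDelay(attempt, baseMs, maxMs) {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls `request` (a function returning a Promise) and retries only when it
// rejects with a 429. ASSUMPTION: the error exposes `statusCode`.
async function withRetry(request, options = {}) {
  const { retries = 3, baseMs = 500, maxMs = 8000 } = options;
  for (let attempt = 0; ; attempt++) {
    try {
      return await request();
    } catch (err) {
      // Give up on non-429 errors, or once the retry budget is spent.
      if (err.statusCode !== 429 || attempt >= retries) throw err;
      await sleep(backoffDelay(attempt, baseMs, maxMs));
    }
  }
}
```

You would wrap a single call, for example `withRetry(() => db.insert(doc))`. Each retry still counts as a request, so this only smooths over short bursts.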
If you are routinely exceeding the quota, then no amount of retrying will help because your app will be systematically retrying a swathe of requests. In that case you need to look at upgrading your API access (if possible) or adding a layer of abstraction to handle your request rate.
I’ll use Cloudant’s most basic Lite plan as an example. Here are the data and API rate limits (at time of writing):
- < 1GB data size
- < 1MB request size
- < 20 lookups (hits on the primary index) per second
- < 10 writes per second
- < 5 queries per second
To use this plan as efficiently as possible and keep your write requests to 10 per second, you have two options:
- make bulk requests: instead of writing 50 documents individually, write all 50 in one call to POST /db/_bulk_docs
- queue your requests and only allow the queue to be consumed at a rate that is less than the permitted level
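As a rough sketch of the first option, the snippet below splits a pile of documents into batches and builds the `{ "docs": [...] }` JSON body that `_bulk_docs` accepts. The helper names `chunk` and `bulkBody` are my own, and the batch size of 50 is only illustrative; pick one that keeps each request under the plan's request size limit.

```javascript
// Split an array of documents into batches of at most `size` documents.
function chunk(docs, size) {
  const batches = [];
  for (let i = 0; i < docs.length; i += size) {
    batches.push(docs.slice(i, i + size));
  }
  return batches;
}

// Build the JSON body for POST /db/_bulk_docs: { "docs": [...] }
function bulkBody(batch) {
  return JSON.stringify({ docs: batch });
}

const docs = [];
for (let i = 0; i < 120; i++) {
  docs.push({ i: i, name: 'hello world' });
}

const batches = chunk(docs, 50); // 50 per batch is an illustrative choice
console.log(batches.map((b) => b.length)); // [ 50, 50, 20 ]
```

Here 120 documents become three write requests instead of 120.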
Implement a rate-limited queue with qrate
I wrote a Node.js module to help with the latter option. The qrate library lets you create queues and specify:
- concurrency — the number of jobs that will be worked on in parallel
- rateLimit — the number of jobs per second that are allowed to be consumed from the queue
Here’s how it works. First, bring in the silverlining library to access a Cloudant database and the qrate package too:
const silverlining = require('silverlining');
const qrate = require('qrate');
const db = silverlining('https://reader:password@reader.cloudant.com/queue');
Then, define a “worker” function that deals with a single queue item. In this case, you want to write a document to Cloudant. The worker function receives the ‘document’ and calls ‘done’ when it’s finished:
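// the worker function:
// writes the document to Cloudant
const worker = function(document, done) {
  console.log('worker', document);
  db.insert(document)
    .then(() => done()) // signal success without passing the response as an error
    .catch(done);       // hand any failure to the queue as the error
};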
You can then create a rate-limited queue using the qrate module:
var concurrency = 3; // three workers at a time
var rateLimit = 10; // maximum 10 items per second
var q = qrate(worker, concurrency, rateLimit);
Then, feed the documents you want to add to the database to the queue (q) with q.push:
for(var i = 0; i < 100; i++) {
q.push( { i: i, name: 'hello world' } );
}
Even though there are 100 items in the queue and up to three workers running at once, the queue rate never exceeds 10 per second. So no 429 responses are received from Cloudant and no retry logic is required.
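If you are curious how a rate cap like this can work, here is a deliberately tiny sketch. It is not how qrate is implemented, and it ignores concurrency entirely: the hypothetical `TickQueue` class simply releases at most `rateLimit` items each time `tick()` is called, and you would call `tick()` once per second.

```javascript
// Minimal rate-limited queue: at most `rateLimit` items are released per tick.
// ILLUSTRATIVE ONLY: not qrate's implementation, and concurrency is ignored.
class TickQueue {
  constructor(worker, rateLimit) {
    this.worker = worker;
    this.rateLimit = rateLimit;
    this.items = [];
  }

  // Add an item to the back of the queue.
  push(item) {
    this.items.push(item);
  }

  // Number of items still waiting.
  get length() {
    return this.items.length;
  }

  // Hand up to `rateLimit` queued items to the worker; returns how many ran.
  tick() {
    const batch = this.items.splice(0, this.rateLimit);
    batch.forEach((item) => this.worker(item));
    return batch.length;
  }
}

const processed = [];
const queue = new TickQueue((doc) => processed.push(doc), 10);
for (let i = 0; i < 25; i++) {
  queue.push({ i: i });
}

// In a real program you would drive it with: setInterval(() => queue.tick(), 1000);
console.log(queue.tick(), queue.tick(), queue.tick(), queue.tick()); // 10 10 5 0
```

Twenty-five items drain over three ticks, and anything not yet released simply waits in the array.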
In a real application, you would add documents you want to save to the queue instead of writing them directly to Cloudant. The queue would ensure that the writes happen no faster than the prescribed rate, with the excess building up in memory.
This approach is useful for processing calls to an API service that has a rate limit!
If the qrate API looks familiar to you, then that is because it is based on the excellent async library, which provides a range of tools for writing asynchronous code for JavaScript. The qrate library is essentially the same as async.queue with an extra, optional rateLimit parameter. Without it, it behaves as a normal async.queue.
Happy API queueing!