Who limits the rate-limiter?

Queueing API requests to use Cloudant more efficiently

--

APIs make our lives easier. As developers, we are all consumers of APIs built and maintained by service providers. It’s important to remember that this relationship is a two-way street.

To ensure a good experience for all applications hitting their APIs, providers need to limit the rates at which they are consumed. Whether your app experiences a sudden surge in popularity, or buggy code is unintentionally flooding a service with requests, you can make API providers’ lives easier by accounting for usage spikes in your design.

In this article, I’ll explore a simple technique for queueing API requests, using the Cloudant database service as an example.

“Who watches the Watchmen?” Image scan by Corey Pung. Watchmen property of DC Comics.

Get in line to miss the 429

With any API, if you exceed its rate limits, your requests will receive an “HTTP 429 Too Many Requests” response. The Cloudant Node.js library has a ‘retry’ plugin that resends such requests. This approach is handy when you only occasionally hit a limit, perhaps when your site or app is unusually busy.

If you are routinely exceeding the quota, then no amount of retrying will help, because your app will be systematically retrying a swathe of requests. In that case, you need to upgrade your API access (if possible) or add a layer of abstraction to manage your request rate.

I’ll use Cloudant’s most basic Lite plan as an example. Here are the data and API rate limits (at time of writing):

  • < 1GB data size
  • < 1MB request size
  • < 20 lookups (hits on the primary index) per second
  • < 10 writes per second
  • < 5 queries per second

To use this plan as efficiently as possible and keep your write requests below 10 per second, you have two options:

  1. make bulk requests — instead of writing 50 documents individually, write all 50 in one call to POST /db/_bulk_docs
  2. queue your requests and only allow the queue to be consumed at a rate below the permitted level
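For the first option, a bulk write is a single POST to Cloudant’s _bulk_docs endpoint. A minimal sketch with curl — the credentials, host, and database name here are placeholders you would replace with your own:

```shell
# Write three documents in one API call instead of three separate writes.
# USERNAME, PASSWORD, HOST and "db" are placeholders for your own account.
curl -X POST 'https://USERNAME:PASSWORD@HOST.cloudant.com/db/_bulk_docs' \
     -H 'Content-Type: application/json' \
     -d '{"docs": [{"name": "a"}, {"name": "b"}, {"name": "c"}]}'
```

This counts as one write request against your quota, rather than three.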

Implement a rate-limited queue with qrate

I wrote a Node.js module to help with the latter option. The qrate library lets you create queues and specify:

  • concurrency — the number of jobs that will be worked on in parallel
  • rateLimit — the number of jobs per second that are allowed to be consumed from the queue

Here’s how it works. First, bring in the silverlining library to access a Cloudant database, together with the qrate package:
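A sketch of the setup — the Cloudant URL and the database name ‘mydb’ are placeholders for your own service credentials:

```javascript
// Load the silverlining Cloudant library and the qrate queue library.
const silverlining = require('silverlining');
const qrate = require('qrate');

// Connect to a Cloudant database.
// Replace USERNAME, PASSWORD and HOST with your own credentials.
const url = 'https://USERNAME:PASSWORD@HOST.cloudant.com';
const db = silverlining(url, 'mydb');
```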

Then, define a “worker” function that deals with a single queue item. In this case, you want to write a document to Cloudant. The worker function receives the ‘document’ and calls ‘done’ when it’s finished:
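Something like this, assuming silverlining’s promise-based insert call and the `db` handle from the setup step:

```javascript
// Worker function: receives one document from the queue, writes it to
// Cloudant, and calls done() when the write has completed.
const worker = function (doc, done) {
  db.insert(doc)
    .then(function () {
      // the write succeeded
      done();
    })
    .catch(function (err) {
      // log and move on — a real app may want retry or dead-letter handling
      console.error(err);
      done();
    });
};
```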

You can then create a rate-limited queue using the qrate module:
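With the worker defined, creating the queue is one call — here with a concurrency of 3 and a rate limit of 10 jobs per second, matching the Lite plan’s write limit:

```javascript
// queue: run up to 3 workers in parallel,
// but consume no more than 10 jobs per second overall
const q = qrate(worker, 3, 10);
```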

Then, feed the documents you want to add to the database to the queue (q) with q.push:
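A sketch of feeding the queue — the document contents here are made up for illustration, and `drain` follows the async.queue convention for a “queue is empty” callback:

```javascript
// Push 100 documents onto the queue. They will be written to Cloudant
// by up to three workers, but at no more than ten writes per second.
for (let i = 0; i < 100; i++) {
  q.push({ doc: i, ts: new Date().getTime() });
}

// optional: called when the last queued item has been processed
q.drain = function () {
  console.log('all documents written');
};
```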

Even though there are 100 items in the queue and up to three workers running at once, the queue rate never exceeds 10 per second. So no 429 responses are received from Cloudant and no retry logic is required.

In a real application, you would add documents you want to save to the queue instead of writing them directly to Cloudant. The queue would ensure that the writes happen no faster than the prescribed rate, with any excess building up in memory.

This approach is useful for processing calls to an API service that has a rate limit!

If the qrate API looks familiar to you, that is because it is based on the excellent async library, which provides a range of tools for writing asynchronous JavaScript code. The qrate library is essentially the same as async.queue with an extra, optional rateLimit parameter. Without it, it behaves as a normal async.queue.

Happy API queueing!

