Welcome to the real world. How to integrate your application with “lazy” APIs

Alex Lahtarin
Billie Engineering Crew
5 min read · Oct 24, 2017

“Lazy” API as a specimen

Imagine you are celebrating the first anniversary of your marriage. You go to a bakery to order a gorgeous cake. The baker promises to start working on it and asks you to come back later. You come back in an hour: the cake is not ready. You come back again: the bakery is closed until tomorrow. After the tenth attempt, you finally get your cake.

Now let’s say we have a service, /api/v1/bakery.json, which we should integrate with our system to get the cake. The process could last unpredictably long, but we have to deal with it. Let’s do it!

“Lazy” API and Billie

Here at Billie, we integrate with different 3rd party data providers, some of which can be classified as “lazy”. Let’s take one of them as an example and see how we solve the problem of communication with it.

Let’s say we have a new user on the website. At some point during his application process, we want to ask a 3rd party API whether we can trust him: a so-called /api/v1/check-user.json service. The process of checking the user is not automated; special people sit in the office from 9 to 18 and approve every user request manually. Because of that, we never know how long it will take to get the response.

Preconditions.

The /api/v1/check-user.json service (further: service) receives a request and returns a process id. Further requests to this service with that process id will result in either an in_progress status or done. The process takes an unpredictable amount of time.
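Before wiring up any queues, it helps to pin down that contract in code. Below is a minimal in-memory stand-in for the service, written in Python; the function names and the simulated reviewer step are illustrative assumptions, not the real API:

```python
import itertools

# Hypothetical in-memory stand-in for /api/v1/check-user.json.
_ids = itertools.count(1)
_processes = {}

def start_check(user_id):
    """First request: opens a check process and returns its id."""
    process_id = next(_ids)
    _processes[process_id] = "in_progress"
    return {"process_id": process_id}

def poll_check(process_id):
    """Further requests: report the current status of the process."""
    return {"status": _processes[process_id]}

def reviewer_approves(process_id):
    """Simulates the manual review finishing at some unknown moment."""
    _processes[process_id] = "done"
```

The only thing the client can rely on is the process id from the first call and the in_progress/done status from every later one.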

A queue is the answer!

Unknown engineer

V1 — The simple queue.

The easiest solution to the problem is to use a message queue to track the progress of the service. Let’s specify the message format:

{
process_id: <int>
}

… and implement the flow:

In this approach, we send a message to the queue once we get the process id from the service. The consumer will then receive the message and make a new request to the service to get the actual status of the process. In case the process is not yet finished, the consumer will either:

  1. Acknowledge the current message and create a new one with the same process id;
  2. Reject the message (by throwing an exception, for example) so it is returned to the queue.

Simple? Reliable? Fast? Maybe, if the bakery doesn’t ban us for coming back and asking about the cake every 5 seconds.

Pros

  • Fastest implementation

Cons

  • Queue overload (a new message is generated instantly if the process is not finished)
  • Service overload (a request to the service on every message)

V2 — The smarter queue.

Let’s tweak our queue a bit and think: do we really need to perform a request to the service on every message? If we have a fast queue with, let’s say, one-second delivery time, how big is the chance that a process that was still running a second ago is finished now? Actually, not so big. So let’s add a new property to the message and call it execute_after:

{
process_id: <int>,
execute_after: <datetime>
}

Now, when the consumer receives the message, it will compare the current DateTime with the execute_after property of the message and reject the message if the time hasn’t come yet. The new diagram will look like:
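That consumer step could look roughly like this in Python, assuming the broker hands us a requeue callback; the names and the one-hour default interval are illustrative assumptions:

```python
from datetime import datetime, timedelta

def handle(message, now, get_status, requeue,
           poll_interval=timedelta(hours=1)):
    """V2 consumer step: skip messages whose time hasn't come yet,
    otherwise poll the service and reschedule unfinished processes."""
    if now < message["execute_after"]:
        requeue(message)  # reject: too early, put it back unchanged
        return "too_early"
    if get_status(message["process_id"]) != "done":
        # Still running: requeue with the next polling slot.
        requeue({"process_id": message["process_id"],
                 "execute_after": now + poll_interval})
        return "rescheduled"
    return "done"
```

Business-hours-only polling then becomes a matter of computing the next execute_after accordingly, e.g. snapping it forward to the next morning.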

With this approach we achieved two great pros:

Pros

  • Fewer requests to the service
  • The polling of the service can be scheduled, e.g. poll once an hour during business hours and don’t poll at night or on weekends

Cons

  • The queue is still overloaded

V3 — The delayed queue.

Let’s make one last effort and solve the remaining problem: spamming the queue with a new message every second (we said our system is quite fast :) ). The solution could be the use of an exchange, an extra layer that stores messages from publishers and routes them to the respective queue according to certain conditions. In our case, we need one condition: a delay. The new message will look like:

{
process_id: <int>,
delay: <int>
}

Depending on the message broker implementation, the delay parameter could be specified in the message itself or in the message headers. Either way, the exchange will receive the message, hold it for delay seconds and only then push it to the respective queue.

The new flow:

In this flow, the consumer will receive the message after delay seconds. If the process is not finished, the delay parameter can be adjusted (for example, it may be increased every time, or we may want to postpone the polling until the next day). This approach is the best from a performance point of view; yet, there is one downside: not every message broker implementation supports delayed exchanges (or exchanges at all) out of the box.
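With RabbitMQ, for instance, delayed exchanges come from the rabbitmq_delayed_message_exchange plugin, which reads the delay from an x-delay message header. The "increase it every time" idea boils down to a tiny backoff helper that computes the next delay before republishing; the doubling factor and the one-hour cap below are arbitrary illustrative defaults:

```python
def next_delay(previous_delay, factor=2, max_delay=3600):
    """Exponential backoff for the `delay` field (in seconds):
    multiply it on every unfinished poll, capped at one hour."""
    return min(previous_delay * factor, max_delay)

# Starting from 5 seconds, the delays grow as
# 5, 10, 20, 40, ... until they flatten out at 3600.
```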

Pros

  • No excessive messages
  • No excessive service requests

Cons

  • Dependence on the message broker implementation

Message acknowledged

We did a great job! Finally, our cake arrived from the bakery. And while it’s being eaten, let’s recap the main points:

  • API queries may result in a long processing time without any notification;
  • Message queues can be used to track the status of such API processes;
  • There is no need to poll the API every second; scheduled polling may save a lot of resources on both sides;
  • Message execution may be delayed in order to save queue resources.

And in the end, some links to the resources that helped this article become an article:
