Four years ago B2W, the largest e-commerce operation in Latin America, started a microservice migration. It was a project that took 3 months to generate its first results and 3 years to be completed.
From a technical perspective, the benefits to the teams responsible for the microservices were clear from the first deploys. With the new architecture, even though they had more applications to handle, each one was smaller, which meant easier maintenance, freedom of technology choice and lower deployment risk.
Nonetheless, for the microservices consumers the transition was not so smooth. The front layer developers (website and apps) started to raise issues related to application performance and code complexity.
Previously they had to deal with only a few services to get the information required to render a screen. A common approach was to have a back-end service that would gather all or most of the information needed. In this new scenario the information was spread across different microservices and had to be orchestrated.
To give an example, let’s consider a payment screen from an e-commerce website:
On a monolithic website, a common approach is to make a route available which returns the whole page HTML or all the data required for the initial render.
However, in a microservice architecture there are multiple services, each one of them handling a single concern. In the above example, there’s one service for the basket, another for customer address, another for shipping and payment options. This architecture ends up generating the following request scenarios.
Multiple parallel calls
A complex application using a microservice architecture may need dozens of parallel requests to gather the render information. Add to the previous example: gift packaging, recommendations, upsell, etc. This is a problem for applications running over HTTP/1.x, where browsers limit the number of parallel requests made to the same domain, thus impacting the application performance.
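To illustrate the effect of that per-domain limit, here is a minimal sketch that simulates network calls with asyncio.sleep instead of real HTTP requests (the connection limit of 6 and the 100 ms latency are assumptions for illustration):

```python
import asyncio

MAX_CONNECTIONS = 6   # typical HTTP/1.x per-domain browser limit (assumed)
LATENCY = 0.1         # assumed ~100 ms latency per request

async def fetch(resource: str, sem: asyncio.Semaphore) -> str:
    # The semaphore models the browser queueing requests beyond
    # the per-domain connection limit.
    async with sem:
        await asyncio.sleep(LATENCY)  # simulated network round-trip
        return resource

async def render_page() -> float:
    resources = [f"service-{i}" for i in range(12)]  # a dozen microservices
    sem = asyncio.Semaphore(MAX_CONNECTIONS)
    loop = asyncio.get_running_loop()
    start = loop.time()
    await asyncio.gather(*(fetch(r, sem) for r in resources))
    return loop.time() - start

elapsed = asyncio.run(render_page())
# 12 requests over 6 connections need two "rounds" (~0.2 s),
# twice what unlimited parallelism would cost
```

The page only finishes rendering when the slowest round completes, so the connection limit directly translates into user-visible latency.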
Chained calls (1+n request problem)
Often a microservice requires data from another microservice to serve its request. For example, to calculate the shipping fee the service needs the products, returned from the cart service, and the zip code, returned from the address service.
If one considers the average latency to a datacenter being 100ms, in a chained invocation of 3 calls a 300ms performance penalty is being paid, only because of network round-trips. This scenario is made worse in mobile networks where there’s a big variance in the network latency.
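The arithmetic of the chained scenario can be seen in a small simulation (hypothetical service names, latency simulated with asyncio.sleep):

```python
import asyncio

LATENCY = 0.1  # assumed ~100 ms round-trip to the datacenter

async def call(service: str, payload: str) -> str:
    await asyncio.sleep(LATENCY)  # simulated network round-trip
    return f"{service}({payload})"

async def checkout() -> float:
    loop = asyncio.get_running_loop()
    start = loop.time()
    # Each call depends on the previous response, so the
    # round-trips add up instead of overlapping.
    cart = await call("cart", "customer-42")    # 1st round-trip
    address = await call("address", cart)       # 2nd, needs the cart
    await call("shipping", address)             # 3rd, needs the address
    return loop.time() - start

elapsed = asyncio.run(checkout())
# three chained calls pay ~3 x 100 ms = ~300 ms in round-trips alone
```

On a mobile network, where each round-trip can be several times slower and far more variable, the penalty grows accordingly.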
At that time Falcor, from Netflix, had already been announced but not yet open sourced. Its goal was to make it easier to fetch information scattered in different places, thereby solving the code complexity problem. Falcor was also able to optimize the requests to some extent (deduplicating and batching them), lessening the round-trip problem.
GraphQL offered a mature language to query data from services and a growing ecosystem.
However, unlike REST, GraphQL didn’t leverage the many HTTP-based features developed over the years. Two things we considered important:
GraphQL was initially developed by Facebook, where every request/response is highly customized. Every user sees a different mix of information, which means identical requests/responses are rare.
Because of that, caching, especially network caching (CDNs, proxy caches like Varnish or Squid), is not crucial.
On the other hand, a public website, like an e-commerce site, will have many users visiting the same information (e.g. a product page). This makes caching, especially network caching, very efficient.
Because GraphQL usually works over the POST method and always returns HTTP success (200), it’s not possible to take advantage of network caching, and native browser caching won’t work either.
HTTP return codes
HTTP status codes provide a very standardized way to communicate different error and success scenarios, which makes it easier for the client to figure out what’s happening in both a human- and machine-readable way.
This allows the developer to navigate between different APIs and contents, only having to learn about the content itself.
The machine-readable nature enables automatic handling of different scenarios, which in turn improves resilience. For instance it’s common for an HTTP client to retry a request upon receiving a 5XX response, but to give up after a 4XX response.
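That retry behavior can be sketched in a few lines (a generic illustration of the convention, not any specific client library):

```python
def call_with_retry(call, max_attempts: int = 3) -> int:
    """Retry on 5xx (transient server errors); give up immediately
    on anything below 500, where retrying will not help."""
    status = call()
    for _ in range(max_attempts - 1):
        if status < 500:        # 2xx/3xx success or 4xx permanent failure
            break
        status = call()         # 5xx: worth another attempt
    return status

responses = iter([500, 500, 200])   # service recovers on the third call
status = call_with_retry(lambda: next(responses))
# status == 200 after two retried 5xx responses;
# a 404, by contrast, would be returned right away
```

Because the decision is driven purely by the status code, this logic works unchanged against any well-behaved REST API.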
Given our situation, during an internal hackathon we developed what came to be called restQL. restQL is a query language for microservices that makes it easier to fetch information from multiple services in the most efficient manner.
It aims to integrate seamlessly with a REST microservice architecture. From the client perspective, restQL Server is like any other REST service, so it doesn’t require any special access or specific client.
restQL is now an open-source project available under the MIT license.
A restQL query is expressed using a statement describing the resources to be fetched and its parameters:
from product with productId = "156AB"
This query format allows easy use of common service orchestration scenarios.
Chained and parallel calls
Upon receiving a query, restQL builds a dependency graph of the listed resources. Those resources which have no dependencies are executed in parallel, and those which have dependencies are parked until their dependencies return.
In the example below we have a query free from dependencies. In this case both resources are going to be fetched in parallel.
from productSearch with productName = "TV"
from customer with customerName = "Joe"
In the query below we have an example of a chained request. In this case restQL will wait for the customer service response and right after it will call the customerPurchases service, passing the customer information on as a parameter.
from customer with customerName = "Joe"
from customerPurchases with customerId = customer.id
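restQL's internals aren't shown here, but the execution model described above can be sketched as follows (hypothetical resource functions, with the service call simulated; independent resources start immediately, while dependent ones are parked until their input arrives):

```python
import asyncio

async def fetch(resource: str, params: dict) -> dict:
    await asyncio.sleep(0.05)  # simulated service call
    # Hypothetical response shape: echo the resource name, a fake
    # id, and the parameters it was called with.
    return {"resource": resource, "id": f"{resource}-1", **params}

async def run_query() -> dict:
    # productSearch and customer have no dependencies: run in parallel.
    search_task = asyncio.create_task(
        fetch("productSearch", {"productName": "TV"}))
    customer_task = asyncio.create_task(
        fetch("customer", {"customerName": "Joe"}))
    customer = await customer_task
    # customerPurchases is "parked" until customer returns,
    # then receives the customer id as its parameter.
    purchases = await fetch("customerPurchases", {"customerId": customer["id"]})
    search = await search_task
    return {"productSearch": search, "customer": customer,
            "customerPurchases": purchases}

result = asyncio.run(run_query())
```

The key point is that the chain only serializes the calls that truly depend on each other; everything else overlaps.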
A common scenario when orchestrating microservices is when a service returns a list and for each item of that list another service invocation is required. For instance, the product search service may return the product ids and it’s necessary to call another service to get the details of each product (price tag, images, etc.).
restQL syntax is the same for simple and multiplexed calls. If the return of the first call is a simple value, restQL will perform one call. If it’s a list, it will multiplex the call, making one call for each value on the list.
from productSearch with productName = "TV"
from product with productId = productSearch.result.productId
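The scalar-versus-list rule can be sketched like this (hypothetical product service, simulated with a stub):

```python
import asyncio

async def fetch_product(product_id: str) -> dict:
    await asyncio.sleep(0.05)  # simulated call to the product service
    return {"id": product_id, "title": f"Product {product_id}"}

async def resolve(value):
    """restQL-style multiplexing: one call for a scalar value,
    one call per item (in parallel) when the chained value is a list."""
    if isinstance(value, list):
        return await asyncio.gather(*(fetch_product(v) for v in value))
    return await fetch_product(value)

# productSearch returned a list of ids -> three parallel calls.
products = asyncio.run(resolve(["1", "2", "3"]))
```

Because the multiplexed calls run in parallel, fetching the details of N products costs roughly one round-trip rather than N.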
We can use the restQL query language to filter the resource response, thus reducing the payload sent to the client. This is useful when the API doesn’t implement filtering itself. Filtering is achieved by using the only modifier.
from product with productId = "156AB" only title
restQL server is the application which handles restQL queries. restQL server is agnostic of the underlying services, which means it doesn’t hold any service schema or special ways of invocation. It acts like a bridge between the client and the backend APIs.
The only information stored in restQL server is the resource name and its invocation end-point.
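A minimal sketch of such a mapping (the resource names and URLs below are hypothetical, and real restQL configuration syntax differs):

```python
# The server only knows resource name -> endpoint; no schemas,
# no special invocation logic.
RESOURCES = {
    "product": "http://product-service/products",
    "customer": "http://customer-service/customers",
}

def build_url(resource: str, params: dict) -> str:
    """Resolve a query resource to a plain REST call."""
    query = "&".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"{RESOURCES[resource]}?{query}"

url = build_url("product", {"productId": "156AB"})
# http://product-service/products?productId=156AB
```

Keeping the server this thin is what lets it act as a bridge: adding a new backend API is just one more name-to-endpoint entry.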
restQL server is implemented in Clojure using HTTP Kit and Clojure CSP implementation (core.async).
For clients in the JVM world it’s possible to use restQL core directly, writing and executing restQL queries without the need for an additional server.
REST impedance mismatch
There’s an impedance mismatch in a solution like restQL, where one service invokes another N services. To keep compatibility with a REST architecture some topics need to be addressed.
HTTP Status Code
Because restQL may invoke many services in the same query, what should be done if one service returns success code (200) and another one in the same query returns error code (500)?
Given the practical implications, if restQL returned 200 in this case, the intermediate proxies would consider the HTTP call successful and would potentially cache the response, carrying the error to the whole user base. For HTTP clients, automatic retry wouldn’t work either.
To address this issue, restQL always returns the highest HTTP code from the invoked services as the query response status code.
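The rule is easy to state in code (a sketch of the policy, not restQL's actual implementation):

```python
def query_status(resource_statuses: dict) -> int:
    """restQL's rule: the query's HTTP status is the highest status
    among the invoked resources, so a single 500 prevents proxies
    from caching a partial failure as a success."""
    return max(resource_statuses.values())

status = query_status({"cart": 200, "address": 200, "shipping": 500})
# status == 500: the response is not cached,
# and HTTP clients are free to retry automatically
```

Since error classes grow with the numeric code (2xx < 4xx < 5xx), taking the maximum always surfaces the most severe outcome.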
In the same query there may be essential and non-essential resources. Think of a query fetching a cart and its related recommendations: it’s reasonable to show the cart even if we were not able to fetch the recommendations. In such cases it’s possible to use the ignore-errors modifier on the non-essential resources. When this happens, even if the call to that specific resource fails, restQL will return 200. The client can still manually check the restQL response to find out whether each request was successful: the status code of every resource is included as metadata in the response, which can also be used for debugging and troubleshooting.
Cache-control is a header returned in an HTTP response that tells the intermediate proxies and the end client how to handle the caching of the returned content. For example, a cache-control header with max-age=60 says that the client can safely cache the response for 1 minute.
To avoid stale data, restQL takes the smallest max-age among the services’ responses and uses it in the query response.
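This mirrors the status-code rule, but in the other direction (again a sketch of the policy, with hypothetical resource names):

```python
def query_max_age(resource_max_ages: dict) -> int:
    """Use the smallest max-age among the resource responses, so the
    aggregated response never outlives its most volatile part."""
    return min(resource_max_ages.values())

max_age = query_max_age({"product": 600, "price": 60, "stock": 10})
cache_control = f"max-age={max_age}"   # "max-age=10"
```

A product description may be cacheable for 10 minutes, but if the stock counter is only fresh for 10 seconds, the combined response can't claim to be fresher than that.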
It’s also possible to configure the cache-control for a specific query with the use directive inside the query.
Each microservice in a query may have a different SLA, which means different response times.
When each microservice is called in isolation, it’s easy to configure the timeout of each call, as most HTTP clients allow such a setting.
In the case of an orchestrator like restQL it’s necessary to provide fine-grained control over the timeout of each underlying invocation. In restQL this is achieved with the timeout modifier in the query, which specifies the timeout of each resource. If the timeout triggers, a 408 (Request Timeout) status code is assigned to the resource.
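The per-resource behavior can be sketched with asyncio (hypothetical resources and latencies; a slow resource is reported as 408 instead of stalling the whole query):

```python
import asyncio

async def fetch(resource: str, delay: float) -> tuple:
    await asyncio.sleep(delay)        # simulated service latency
    return resource, 200

async def fetch_with_timeout(resource: str, delay: float, timeout: float):
    """Each resource gets its own timeout, as restQL's timeout
    modifier allows; on expiry the resource is marked 408."""
    try:
        return await asyncio.wait_for(fetch(resource, delay), timeout)
    except asyncio.TimeoutError:
        return resource, 408

async def main() -> dict:
    fast = fetch_with_timeout("cart", delay=0.01, timeout=0.1)
    slow = fetch_with_timeout("recommendations", delay=0.5, timeout=0.1)
    return dict(await asyncio.gather(fast, slow))

statuses = asyncio.run(main())
# {"cart": 200, "recommendations": 408}
```

Combined with the status-code aggregation rule above, a timed-out non-essential resource degrades the query gracefully rather than failing it outright.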
Although the URI RFC doesn’t set any limit on URL length, in practice some browsers and proxies (e.g. Apache, Nginx) won’t work with long URLs. For example, Nginx’s default URL length limit is 8k, beyond which it returns 414 (URI Too Long). This may be a problem if the query is too long.
To work around this issue restQL has saved queries. Saved queries are a way for clients to save a query on the server ahead of time and invoke it by just referencing its name and revision.
Each saved query update generates a new query revision and keeps the existing ones untouched. This avoids breaking existing consumers and makes caching the query statement retrieval on the server very efficient (you can actually cache it forever).
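The append-only revision scheme can be sketched as a small registry (hypothetical class and query text; real restQL storage and endpoints differ):

```python
class SavedQueries:
    """Append-only registry: updating a saved query creates a new
    revision and never mutates earlier ones, so existing consumers
    keep working and each (name, revision) pair is immutable and
    can be cached forever."""

    def __init__(self):
        self._queries: dict[str, list[str]] = {}

    def save(self, name: str, text: str) -> int:
        revisions = self._queries.setdefault(name, [])
        revisions.append(text)
        return len(revisions)          # 1-based revision number

    def get(self, name: str, revision: int) -> str:
        return self._queries[name][revision - 1]

registry = SavedQueries()
registry.save("product-page", 'from product with productId = $id')
registry.save("product-page", 'from product with productId = $id only title')
first = registry.get("product-page", 1)   # old revision still intact
```

Immutability is what makes the "cache forever" claim safe: a given revision can never change underneath a consumer.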
As we’ve seen, a microservice-based architecture isn’t a silver bullet. Despite its immediate benefits to the service teams, it poses by its very nature a challenge in terms of complexity and performance to the consumers.
Fortunately there are some technologies available to overcome those problems, notably Falcor, GraphQL and the recently introduced restQL.