How we make inter-service communication less painful with a gem

Akihiko Ito
Getsafe
7 min read · May 21, 2021


We at Getsafe have an architecture consisting of several distributed services. Each service is responsible for a particular domain (e.g. user management or payment processing), talks to the other services via HTTP APIs, and is mostly developed by one team.

In this article, I’m going to discuss how we deal with the complexity in the inter-service communication that comes with the architecture, by leveraging shared API client classes called nxt_clients (it’s a private gem).

Topics such as what a microservices architecture is, or what its general benefits and drawbacks are, won’t be part of this article.

Background

Distributed services of Getsafe

We have the following services (the list is not exhaustive).

User service

As the name suggests, this service is responsible for user management, such as CRUD operations on users and user authentication.

Insurance service

This service is responsible for the insurance products, such as CRUD operations on users’ insurance portfolios and price calculation.

Payment service

This service is responsible for managing data and processes around payment, such as determining how much is to be collected from which user on which date.

Admin panel

This service provides various information from the above-mentioned services for our customer support agents and lets them perform various operations.

The user service, for example, as the source of truth regarding our customers, is a service that virtually all other services depend on. The insurance service, as the source of truth regarding the insurance products and the customers’ insurance portfolios, often has mutual dependencies with other services. All services (except for the admin panel) offer endpoints for various resources, either as a public or as an internal API.

All these services are written in Ruby. One of the nice things about the architecture is that each service can be developed with a completely different tech stack from each other if need be. We just haven’t seen a need for using other programming languages. However, we do use different Ruby gems in different services to do essentially the same thing, depending on the specific needs in each service and the responsible developers’ liking.

Problem

Now, as you can imagine, this architecture introduces considerable complexity in terms of inter-service communication.

For example, one of the most frequently used endpoints is the read operation on a user that the user service offers. Typically, every service that needs to fetch a user from the user service would have to implement its own API client class.
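A minimal sketch of such a per-service client, built on Net::HTTP from the standard library. The class name, the base URL, the /users/:id path and the injectable http: argument are assumptions made for illustration, not Getsafe's actual code.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical client that each consuming service would have to maintain itself.
class UserFetcher
  # Assumed base URL; the real host is not given in the article.
  BASE_URL = ENV.fetch('USER_SERVICE_URL', 'http://user-service.internal')

  # http: is injectable so the class can be exercised without a network.
  def initialize(user_id, http: Net::HTTP)
    @user_id = user_id
    @http = http
  end

  # Returns the parsed JSON body as a raw Hash, or raises on a non-2xx status.
  def call
    response = @http.get_response(URI("#{BASE_URL}/users/#{@user_id}"))
    unless response.code.to_i.between?(200, 299)
      raise "Failed to fetch user #{@user_id}: HTTP #{response.code}"
    end

    JSON.parse(response.body)
  end
end
```

Note that the caller gets back a plain Hash, so the shape of the response is only known by convention.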

This is fine as long as only one service is using the endpoint. However, as soon as multiple services start depending on the endpoint, this naïve approach becomes problematic.

Number of client classes

First and foremost, the sheer number of client classes across all services would grow as O(M•N²), M being the number of endpoints per service and N the number of services. With N services offering M endpoints each, every service would have to implement up to M•(N-1) client classes for the endpoints of the other services, which adds up to M•N•(N-1) classes overall. With 5 services and 10 endpoints each, for instance, that would be up to 200 client classes. This clearly doesn’t scale.

Maintenance burden

Stemming from the first issue, keeping all the client classes scattered across different services up to date with changes to the endpoints would be nearly impossible. Classes that are supposed to do the same thing would also tend to diverge in their implementation or even in their functionality.

Testing is another issue. One endpoint could be used by different services for different purposes, so each service would test its client classes with different scenarios. That in itself isn’t bad, but it certainly adds even more complexity.

Illogical

Implementing API client classes in each service is not only impractical, it is also illogical. Within a single application, you don’t define multiple classes that do exactly the same thing. Why would you do that just because the application is now split into multiple services?

Solution

One of the things we have developed over time to mitigate the above-mentioned issues is a private gem that provides a collection of API client classes.

For example, the gem provides a class called UserService::Api::UserFetcher, which does essentially what the API client class in the example above does, built on our public gem nxt_http_client.
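A rough sketch of how such a shared client could look. It is not the gem's actual code: the base URL, the attribute names and the FetchError class are assumptions, and it uses Net::HTTP from the standard library in place of nxt_http_client, whose API is not shown here.

```ruby
require 'json'
require 'net/http'
require 'uri'

module UserService
  module Models
    # Response model; the attribute names are assumptions for illustration.
    User = Struct.new(:id, :email, :first_name, :last_name, keyword_init: true)
  end

  module Api
    # Hypothetical error raised when the user service answers with a non-2xx status.
    class FetchError < StandardError; end

    class UserFetcher
      # Assumed base URL; the real host is not given in the article.
      BASE_URL = ENV.fetch('USER_SERVICE_URL', 'http://user-service.internal')

      # http: is injectable so the class can be tested without a network.
      def initialize(user_id, http: Net::HTTP)
        @user_id = user_id
        @http = http
      end

      # Fetches the user and wraps the JSON payload in a model instead of a raw Hash.
      def call
        response = @http.get_response(URI("#{BASE_URL}/users/#{@user_id}"))
        unless response.code.to_i.between?(200, 299)
          raise FetchError, "Failed to fetch user #{@user_id}: HTTP #{response.code}"
        end

        payload = JSON.parse(response.body, symbolize_names: true)
        # Keep only the attributes the model knows about.
        Models::User.new(**payload.slice(*Models::User.members))
      end
    end
  end
end
```

A consuming service would then call UserService::Api::UserFetcher.new(42).call and read, for example, email from the returned model.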

Now, each service that needs to fetch a user from the user service only has to add this gem as a dependency and use the client class, instead of implementing its own UserFetcher class. If something changes in the upstream services, we release a new version of the gem, and thanks to depfu, the developers of each service learn about it and can upgrade the gem accordingly.

Number of client classes

With this approach, each atomic API operation is implemented and tested in a single place, as opposed to being implemented and tested in several different places across different services. As a result, the number of API client classes we have to implement grows only as quickly as O(M•N), in other words, just the number of endpoints.

This works well for us because all the services are written in the same programming language, Ruby. If we ever decide to develop a service in, for example, Go, we obviously cannot use the Ruby gem. Even then, it would be a good idea to create a similar shared client library for Go. In that case, the effort required to develop and maintain the API client classes would be O(L•M•N), L being the number of programming languages used in the ecosystem. Given L ≤ N, the number of client classes (or structs, functions, etc.) still grows more slowly with shared libraries, unless every service is written in a different programming language (L = N), which is very unlikely in our case for the foreseeable future.
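To make the comparison concrete, the two counts can be computed side by side. The method names and the sample numbers (10 endpoints per service, 5 services) are made up for illustration and are not Getsafe's real figures.

```ruby
# m: endpoints per service, n: number of services, l: number of languages in use.

# Without a shared library, every service implements a client for each
# endpoint of the other n - 1 services.
def per_service_client_count(m, n)
  m * n * (n - 1)
end

# With a shared library, each endpoint gets one client per language.
def shared_client_count(m, n, l = 1)
  l * m * n
end

puts per_service_client_count(10, 5) # 200
puts shared_client_count(10, 5)      # 50
puts shared_client_count(10, 5, 2)   # 100
```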

Testing

We test the client classes thoroughly within the gem, not in the dependent services. Dependent services don’t need to worry about the client classes being outdated or having bugs, and can focus on higher-level logic that they are responsible for.

The gem provides a couple of handy additional features to help us developers work with inter-service communication.

Wrapped response

As you can see in the method #call, the response from the API is wrapped into a model instead of being a raw hash, which gives the developers of the dependent services a clear understanding of what they should expect to get. They don’t have to look at tests or actually invoke an API call to see what the response looks like, but they can just look at the model, in which the type of each attribute is clearly defined. In a way, the models serve as (a part of) API documentation.
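The article does not show how the gem's models declare their attribute types, so the following is only a plain-Ruby sketch of the idea. The attribute names, the types and the ATTRIBUTES constant are assumptions.

```ruby
module UserService
  module Models
    class User
      # The schema doubles as documentation of the API response.
      # Names and types are assumptions for illustration.
      ATTRIBUTES = {
        id: Integer,
        email: String,
        first_name: String,
        last_name: String
      }.freeze

      attr_reader(*ATTRIBUTES.keys)

      # Rejects missing attributes and values of the wrong type.
      def initialize(**attrs)
        ATTRIBUTES.each do |name, type|
          value = attrs.fetch(name) { raise ArgumentError, "missing attribute: #{name}" }
          raise TypeError, "#{name} must be a #{type}, got #{value.class}" unless value.is_a?(type)

          instance_variable_set("@#{name}", value)
        end
      end
    end
  end
end
```

With a model like this, a response in which id arrives as a String would fail loudly with a TypeError when the model is built, rather than surfacing later in the consumer's business logic.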

Shared factories

The gem also provides shared FactoryBot factories for the models, which by design conform to the schema of the API responses. Developers of the consumer services can use these factories to mock the API client classes without having to fear that their mock response is wrong or outdated.
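The idea can be sketched without any dependencies. To keep the example self-contained, a hand-rolled builder stands in for the FactoryBot factory here. The Factories module, the default values and the WelcomeMessage consumer are all hypothetical.

```ruby
module UserService
  module Models
    # Attribute names are assumptions for illustration.
    User = Struct.new(:id, :email, :first_name, :last_name, keyword_init: true)
  end

  # Hypothetical stand-in for a shared factory shipped with the gem.
  module Factories
    USER_DEFAULTS = {
      id: 1,
      email: 'jane.doe@example.com',
      first_name: 'Jane',
      last_name: 'Doe'
    }.freeze

    # Builds a model with valid defaults; overrides replace single attributes.
    # An attribute the model does not know raises an ArgumentError.
    def self.build_user(**overrides)
      Models::User.new(**USER_DEFAULTS.merge(overrides))
    end
  end
end

# Consumer-side code under test: depends on anything that responds to #call.
class WelcomeMessage
  def initialize(user_fetcher)
    @user_fetcher = user_fetcher
  end

  def to_s
    "Welcome, #{@user_fetcher.call.first_name}!"
  end
end

# In the consumer's test, the client is replaced by a stub that returns a
# factory-built model, so no request to the user service is made.
stubbed_fetcher = -> { UserService::Factories.build_user(first_name: 'Akira') }
puts WelcomeMessage.new(stubbed_fetcher).to_s # Welcome, Akira!
```

Because the builder lives next to the model in the shared gem, a change to the response schema updates both at once.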

Remaining problems

With the shared API client classes, we have less work to do in terms of inter-service communication. However, that doesn’t mean we are now pain-free. The shared API clients merely reduce the pain; they don’t cure it. There are still issues we have to address, with or without the gem.

In the “shared factories” section above, I wrote

… shared FactoryBot factories of the models, which by design conform to the schema of API responses.

This is premised upon the assumption that the API client classes are always tested against the current version of the endpoints, which is not easy to guarantee. To test the client classes in the gem, we use VCR to record HTTP interactions and then replay them as mocks, so that the tests don’t make real requests to the provider services.

This means that if, for example, the user service decides to implement a phone number validation, some or all of the VCR cassettes (recorded HTTP interactions) will no longer reflect the actual behaviour of the user service. Then the developers of the user service have to make sure that the client classes have been updated accordingly, a new version of the gem has been released and the dependent services have upgraded the gem, before they deploy their changes to the user service. And they have to know which other services are relying on the user service. Otherwise, the developers of the dependent services won’t know about the change until something breaks at runtime due to an unhandled phone number validation error.
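The drift can be illustrated with a small dependency-free sketch. A frozen hash stands in for a VCR cassette, and all class names, payloads and the validation rule are made up. The client keeps passing against the replayed recording while the live service has already started rejecting the same request.

```ruby
require 'json'

# Stands in for a VCR cassette: one response recorded before the validation existed.
RECORDED_RESPONSE = { status: 201, body: '{"id": 7, "phone": "12345"}' }.freeze

# Replays the recording, whatever the live service would answer today.
class ReplayedUserService
  def create_user(_params)
    RECORDED_RESPONSE
  end
end

# The live service after the hypothetical phone number validation is deployed.
class LiveUserService
  VALID_PHONE = /\A\+\d{7,15}\z/

  def create_user(params)
    if params[:phone].match?(VALID_PHONE)
      { status: 201, body: JSON.generate({ id: 7, phone: params[:phone] }) }
    else
      { status: 422, body: JSON.generate({ error: 'phone is invalid' }) }
    end
  end
end

# Client written against the old behaviour: it never expects a validation error.
class UserCreator
  def initialize(service)
    @service = service
  end

  # Returns the id of the created user.
  def call(params)
    JSON.parse(@service.create_user(params)[:body]).fetch('id')
  end
end

# The test against the stale recording still succeeds ...
UserCreator.new(ReplayedUserService.new).call({ phone: '12345' })

# ... while the same request against the live service fails at runtime.
begin
  UserCreator.new(LiveUserService.new).call({ phone: '12345' })
rescue KeyError => e
  puts "Unhandled at runtime: #{e.message}"
end
```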

In an upcoming article, we’ll discuss how we tackle the remaining issues with inter-service communication.

We have been improving our software systems continuously, and we will keep doing so. What we have discussed in this article is just a baby step in that process, and there is much more to do.

Do you have further improvement ideas or experiences dealing with distributed services? Are you interested in innovating the centuries-old industry? Come join our journey of shaping the future of insurance!
