How not to lose sleep when integrating with a 3rd party REST API — Part 1
The Do’s & Don’ts of coding a successful integration, so that you don’t get support calls in the middle of the night
Have you ever spent weeks and months integrating with a 3rd party REST API, only to have it suddenly return error codes when least expected, lock out users randomly, create frighteningly random side effects and frequently fail with internal errors?
Integrating with external REST APIs can be challenging at the best of times, and a complete nightmare when things don’t go as expected.
In this short series of blog posts, I’ll share our challenges in integrating the Scala-based Wix Stores service with one of these rather challenging REST APIs, and how we took control of these seemingly uncontrollable interactions to build a robust, resilient and elegant solution. No project is real without failures, so I’ll also share what went wrong, and how we could try to avoid those problems.
In this post, I’ll focus mostly on the integration design and how we dealt with challenges. In the subsequent post, I will talk about how we implemented the client code that calls the REST API and how we tackled testing this implementation.
Doesn’t sound too tricky, right?
At Wix.com, we integrated with ReliableVendor.com (names have been changed to protect the innocent). ReliableVendor.com provide a service for calculating and tracking complex eCommerce transactions on behalf of merchants, and offer a well-documented REST API for intermediate ‘integrators’ such as ourselves. From now on, I’ll refer to the REST API as the RV.com API. To keep things simple, let’s assume that we wanted to create and administer accounts in RV.com on behalf of our merchants, and then transparently register financial transactions when our users execute purchases.
So what’s the end game when integrating with a REST API? Well, first and foremost, we want to make sure that we fulfil all the functional requirements that we planned on. And maybe as important, we want to make sure that it works, and works well. That sounds pretty obvious, but let’s take a step back and try to figure out what it means when we say something works.
Getting it to work
When thinking about what it means to have a successful integration, it made sense to split it into a few main categories.
Functional — map our functional requirements to REST API calls on RV.com. This means learning and mastering their REST API, involving a research phase where simple prototype scripts are crafted to validate our understanding of what we think that the APIs should do.
Failure — Design the integration with the knowledge that software will fail, and it’ll fail when we least expect it. Software for the internet can be so complex — tens of dynamically changing services communicating synchronously and asynchronously over active/active data centers around the globe, persisting data in multiple databases with different consistencies — failure is inevitable.
Monitoring and alerts — We must know how our integration is behaving — is the latency of the calls to RV.com acceptable, what % of responses fail, etc…, and is the system behaving as defined?
Security and performance are other important aspects of the integration, but we’ll spend less time on them in this series, focusing more on the integration design.
So, okay, I get it, we must design this integration with failure in mind.
In the next section I’ll explain how we managed to address both functional requirements and dealing with failure in the design.
Designing the integration
Let’s take a bash at implementing the logic for creation of an account entity in RV.com. I’ll use Scala in the code examples, since that’s what we used.
Since it was imperative that we have full test coverage of all functionality, I realized that I didn’t want to access the real RV.com API in my tests, since I would need to control the API’s responses. So I decided to create an interface that encompasses the functionality, with one implementation that calls the real RV.com API and another, fake implementation for testing purposes. The fake implementation would need to fake responses according to test requirements.
Here is how it looked:
First try at the interface
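The original listing is not reproduced here, so the following is only a minimal sketch of what such an interface and its fake might look like. The fields of Account, the name parameter of createAccount, and the in-memory bookkeeping in the fake are assumptions made for illustration.

```scala
// Assumed shape of an account; the real RV.com entity likely has more fields.
final case class Account(id: String, name: String)

// The interface the rest of our code depends on.
trait TransactionService {
  def createAccount(name: String): Account
}

// Test double: no network involved, just in-memory state we fully control.
class FakeTransactionService extends TransactionService {
  private var accounts = Vector.empty[Account]

  override def createAccount(name: String): Account = {
    val account = Account(id = s"fake-${accounts.size + 1}", name = name)
    accounts = accounts :+ account
    account
  }

  // Helper for assertions in tests (hypothetical, not part of the interface).
  def createdAccounts: Vector[Account] = accounts
}
```

Code that depends only on TransactionService can then be handed either the fake in tests or the real implementation in production.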
My first test went something like this:
After doing some quick prototyping against the REST API, I realized that account name must be unique within RV.com.
Dealing with errors
As I started coding this change, it dawned on me that more kinds of errors would probably appear over time, and that, in the same way that we declare that creating an account returns an account, it can just as easily return an error object. I wanted the programmers who use the transaction service to be able to look at the signature of the methods and know what failures can happen, especially since some of these failures may be displayed back to the UI with a message such as ‘User name already exists’.
Since Scala, as a functional programming language, encourages using objects to describe errors over throwing exceptions, we decided to use the Either[Left, Right] construct. The Either type can be either one type (Left) or another (Right), and the code that receives an Either will test for the type and act accordingly. Traditionally, the Left type represents an error type, and the Right type represents the result returned in a ‘happy path’. Read more about
Next stab at the interface
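As a rough sketch of this second version (the original listing is not available), createAccount could return an Either whose Left carries the duplicate-name error. The error type AccountNameAlreadyExists and the helper object SignupMessages are hypothetical names introduced for this example.

```scala
final case class Account(id: String, name: String)

// Hypothetical error object for the uniqueness rule described above.
final case class AccountNameAlreadyExists(name: String)

trait TransactionService {
  // Left = what went wrong, Right = the created account.
  def createAccount(name: String): Either[AccountNameAlreadyExists, Account]
}

class FakeTransactionService extends TransactionService {
  private var accounts = Map.empty[String, Account]

  override def createAccount(name: String): Either[AccountNameAlreadyExists, Account] =
    if (accounts.contains(name)) Left(AccountNameAlreadyExists(name))
    else {
      val account = Account(s"fake-${accounts.size + 1}", name)
      accounts = accounts + (name -> account)
      Right(account)
    }
}

object SignupMessages {
  // The caller inspects which side it received and acts accordingly.
  def describe(result: Either[AccountNameAlreadyExists, Account]): String =
    result match {
      case Right(account)                       => s"Created account ${account.name}"
      case Left(AccountNameAlreadyExists(name)) => s"User name '$name' already exists"
    }
}
```

Because the error is part of the return type, a caller cannot get at the Account without also deciding what to do with the failure case.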
and another new test case
The Accounting service logic was now updated to deal with duplicate accounts.
After some more testing, I noticed that at certain times of the day, I started to see random failures when calling createAccount() against the RV.com API. After a bunch of calls with RV.com tech support, we came to the conclusion that during scheduled configuration upgrades at RV.com, account creation failed with some sort of internal configuration error. Together with tech support, we decided the best approach was to convert this HTTP error to an RVInternalError and let the user simply resubmit the request.
We realized that there are two different ways that account creation can fail: bad user input, or other errors that originate in the RV.com API or in network issues. We did not want to retry errors caused by bad input, but for requests that failed with other errors, we wanted to be able to retry in the hope that a subsequent call would succeed.
So we created a base exception called RVException for ALL types of errors, and a subtype of RVException called RetriableRVException. We changed the createAccount method to return Either[RVException, Account], allowing the caller to use pattern matching on the response to catch all cases.
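A minimal sketch of how this hierarchy and the caller-side pattern match could fit together. RVException, RetriableRVException and RVInternalError are named in the text; treating RVInternalError as retriable, the AccountNameAlreadyExists class, and the Outcome types are assumptions made for the example.

```scala
final case class Account(id: String, name: String)

// Base type for every failure coming out of the integration.
class RVException(message: String) extends Exception(message)

// Marker subtype: failures that are worth retrying.
class RetriableRVException(message: String) extends RVException(message)

// Concrete errors; their exact shape here is assumed for illustration.
final class AccountNameAlreadyExists(name: String)
  extends RVException(s"Account name '$name' already exists")
final class RVInternalError(details: String)
  extends RetriableRVException(s"RV.com internal error: $details")

object CreateAccountHandling {
  // Hypothetical result of handling, used only to make the branches visible.
  sealed trait Outcome
  final case class Created(account: Account) extends Outcome
  final case class TryAgain(reason: String) extends Outcome
  final case class Rejected(reason: String) extends Outcome

  // Order matters: the more specific retriable case must come first.
  def handle(result: Either[RVException, Account]): Outcome = result match {
    case Right(account)                => Created(account)
    case Left(e: RetriableRVException) => TryAgain(e.getMessage)
    case Left(e)                       => Rejected(e.getMessage)
  }
}
```

The subtype check is what lets the caller separate errors that may succeed on a later attempt from those caused by bad input.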
In parallel, we added a giveErrorOnCreateAccount method to FakeTransactionService to allow us to test AccountService with these errors.
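One way such a hook might be implemented is sketched below. Only the method name giveErrorOnCreateAccount comes from the text; its parameter, and the choice to make the programmed error apply to a single call, are assumptions.

```scala
final case class Account(id: String, name: String)
class RVException(message: String) extends Exception(message)
class RetriableRVException(message: String) extends RVException(message)

trait TransactionService {
  def createAccount(name: String): Either[RVException, Account]
}

class FakeTransactionService extends TransactionService {
  // When set, the next createAccount call fails with this error.
  private var nextCreateAccountError: Option[RVException] = None

  def giveErrorOnCreateAccount(error: RVException): Unit =
    nextCreateAccountError = Some(error)

  override def createAccount(name: String): Either[RVException, Account] =
    nextCreateAccountError match {
      case Some(error) =>
        nextCreateAccountError = None // one-shot, so a retry can succeed
        Left(error)
      case None =>
        Right(Account(id = s"fake-$name", name = name))
    }
}
```

Making the error one-shot means a test can check both the failure path and a successful retry against the same fake.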
Tweaking the created account
Part of the onboarding process of a new account involved editing the default settings of the account created by RV.com. This required calling a different API method in RV.com, so we added a new method, editAccount, to TransactionService. This time, realizing that it could fail with anything from bad input to network outages, we coded failure into the signature of the new method.
Here’s the new API now:
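The original listing is missing, so this is only an approximation of the interface at this stage. The parameters of editAccount and the settings field on Account are assumptions; the point is that both methods now declare failure in their return type.

```scala
// The settings field is an assumed stand-in for the account's default settings.
final case class Account(id: String, name: String, settings: Map[String, String] = Map.empty)

class RVException(message: String) extends Exception(message)

trait TransactionService {
  def createAccount(name: String): Either[RVException, Account]

  // Assumed signature: identify the account and pass the settings to change.
  def editAccount(accountId: String, settings: Map[String, String]): Either[RVException, Account]
}
```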
Since the setup of the account is now a multi-stage process, in which we first create the account with createAccount and then tweak it with a call to editAccount, we needed a way to chain these calls, whilst dealing with the possibility that either one of the calls could also return an error. Since both calls to the API could return either a Left or a Right, this could create quite ugly code: we only want to call editAccount if the call to createAccount succeeded.
Just imagine how things get more complicated if we had a third chained call to the TransactionService, such as activate on the account.
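To make the problem concrete, here is a sketch of the nested version with just two calls. The setupAccount method, the AccountService constructor and the editAccount parameters are assumptions for illustration.

```scala
final case class Account(id: String, name: String, settings: Map[String, String] = Map.empty)
class RVException(message: String) extends Exception(message)

trait TransactionService {
  def createAccount(name: String): Either[RVException, Account]
  def editAccount(accountId: String, settings: Map[String, String]): Either[RVException, Account]
}

class AccountService(transactions: TransactionService) {
  // Each extra step adds another level of nesting and another Left branch.
  def setupAccount(name: String, settings: Map[String, String]): Either[RVException, Account] =
    transactions.createAccount(name) match {
      case Left(error) => Left(error)
      case Right(created) =>
        transactions.editAccount(created.id, settings) match {
          case Left(error)   => Left(error)
          case Right(edited) => Right(edited)
        }
    }
}
```

A third step such as activate would add one more nested match inside the innermost Right branch.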
Luckily for us, Scala for comprehensions come to the rescue. Since a for comprehension provides syntactic sugar over the composition of multiple monadic operations, and Either happens to be a monad type, we are able to write our chain of operations in a fluent manner.
Now, if at any stage in the account setup there is a failure, the chain of operations will be elegantly short-circuited and the error can be returned.
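A minimal sketch of such a chain, assuming Scala 2.12 or later (where Either is right-biased and so works directly in a for comprehension). The setupAccount method, the AccountService constructor and the editAccount parameters are assumptions for illustration.

```scala
final case class Account(id: String, name: String, settings: Map[String, String] = Map.empty)
class RVException(message: String) extends Exception(message)

trait TransactionService {
  def createAccount(name: String): Either[RVException, Account]
  def editAccount(accountId: String, settings: Map[String, String]): Either[RVException, Account]
}

class AccountService(transactions: TransactionService) {
  // Each <- unwraps a Right; the first Left ends the chain and becomes the result.
  def setupAccount(name: String, settings: Map[String, String]): Either[RVException, Account] =
    for {
      created <- transactions.createAccount(name)
      edited  <- transactions.editAccount(created.id, settings)
    } yield edited
}
```

Adding a further step is then a matter of adding one more line to the for block, rather than another level of nesting.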
Until now, we have been focusing on designing the integration with the REST API, and testing the logic that calls out to the TransactionService, but have not yet written and tested the actual code which will call the RV.com REST API in production.
In the next post in this series, I will share how we went about testing this implementation, RVTransactionService, including how, when we discovered erratic and chaotic behaviour in production, we were able to write tests to reproduce this behaviour, increasing our confidence in the integration, and allowing us to confidently push this new functionality out to production.