One of our main products here at EBANX is a payment gateway. In simple words, it is a system that receives a payment request and routes it to the desired payment method. Very often, we need to increase the number of external services we consume, for three main reasons: to isolate new software components, to increase our resilience when an external provider fails, and to support new payment methods.
We need an SDK!
As in any company that needs to grow fast, the platform must not be an impediment to growth. We started with the idea of outsourcing the development of some components of our platform, and the first candidates were the parts of the system that deal with integrations of credit-card acquirers. At that time, to decrease the risk of integrating that new bunch of code into our codebase, we needed to make sure that no component would put the rest of the platform in danger and that no component would allow sensitive data to be logged or leaked. This was our first thought in creating our own SDK.
Some of the architectural requirements for new packages that used our SDK were:
- The software components must be isolated from the platform and should not have dependencies between them;
- No package malfunction must be able to break the platform;
- All communication with the external world must go through a robust mechanism which strips sensitive data;
- The packages should not depend on parts of the platform we don’t explicitly let them depend on;
- The interface between the platform and those external components must be simple.
A first thought when this sort of problem occurs is to create a set of interfaces (I’m talking about PHP interfaces) that every external package implements and that serve as the entry point for that package. Here’s a simple high-level view of what I just described:
This approach removes the dependency of our platform on the external packages, but it still has some weaknesses that we can’t accept. First of all, the contract between the platform and the package must be very flexible: it is not desirable to touch every component that implements an interface each time we add a new field to it. We also still have the problem that a package might bypass our HTTP request service, and the developer might forget to add timeouts or, worse, log sensitive data that must not be logged. As the obvious implementation showed so many weaknesses, we started thinking about something else, and at this point the first draft of our SDK was born.
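To make the first weakness concrete, here is a minimal sketch of the interface approach. It is written in Python for brevity (the platform itself is PHP), and the names `AcquirerIntegration`, `ExistingPackage`, `authorize` and `refund` are hypothetical. It only illustrates how growing a shared interface forces a change in every package that implements it.

```python
from abc import ABC, abstractmethod


class AcquirerIntegration(ABC):
    """Hypothetical entry-point interface owned by the platform."""

    @abstractmethod
    def authorize(self, payment: dict) -> dict:
        ...

    @abstractmethod
    def refund(self, payment: dict) -> dict:
        """Added to the interface after the package below was written."""


class ExistingPackage(AcquirerIntegration):
    """A package written before `refund` existed; it only knows `authorize`."""

    def authorize(self, payment: dict) -> dict:
        return {"status": "authorized", "amount": payment["amount"]}


# The package can no longer be instantiated until someone edits it,
# even though it never needed the new method.
try:
    ExistingPackage()
    package_still_works = True
except TypeError:
    package_still_works = False
```

With dozens of packages, every change to the shared interface turns into dozens of edits, which is the coupling the SDK was designed to avoid.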
Our SDK consists of two main types of classes: Messages and Effects. A message is basically the interface of a single piece of functionality, and both the platform and the package depend on it. To be more concrete, one example is a message called GetCustomerInfo. The package calls the get method of this class without knowing how the call is going to be resolved; it only knows that it will be. Inside the package, this dependency works as a simple method call. For a message to be resolved, though, there has to be an effect implemented for it, so we have a GetCustomerInfoEffect which responds to that message and lives inside the platform, not the package. What is great about this mechanism is that when we implement an effect, we can ensure that all constraints related to the data it handles are satisfied. The effect I used as an example retrieves only the data for a given payment, nothing more, nothing less. Another example is the MakeHttpRequestWithSensitiveData message. It provides HTTP communication with the outside world, and its effect ensures that every request is logged with sensitive data stripped out and that a predefined timeout is applied if the developer forgets to set one explicitly.
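The HTTP example can be sketched as follows. This is an illustration in Python, not our PHP implementation: the field names, the list of sensitive keys, the default timeout value and the `transport` hook are all assumptions made for the example. To keep it short, the effect is called directly instead of going through the message class.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

DEFAULT_TIMEOUT_SECONDS = 10.0           # assumed platform default
SENSITIVE_KEYS = {"card_number", "cvv"}  # assumed list of sensitive fields


@dataclass(frozen=True)
class MakeHttpRequestWithSensitiveData:
    """Message: lives in the SDK and only describes what the package wants."""
    url: str
    body: dict
    timeout: Optional[float] = None  # the package may forget to set this


def redact(body: dict) -> dict:
    """Return a copy of the body that is safe to log."""
    return {k: "***" if k in SENSITIVE_KEYS else v for k, v in body.items()}


@dataclass
class MakeHttpRequestWithSensitiveDataEffect:
    """Effect: lives in the platform and enforces the constraints."""
    transport: Callable[[str, dict, float], dict]  # hypothetical HTTP client hook
    log: list = field(default_factory=list)

    def handle(self, message: MakeHttpRequestWithSensitiveData) -> dict:
        # Fall back to the platform default when no timeout was given.
        timeout = message.timeout if message.timeout is not None else DEFAULT_TIMEOUT_SECONDS
        # Only the redacted body is ever written to the log.
        self.log.append({"url": message.url, "body": redact(message.body), "timeout": timeout})
        return self.transport(message.url, message.body, timeout)


def fake_transport(url: str, body: dict, timeout: float) -> dict:
    """Stand-in for a real HTTP client, so the sketch runs offline."""
    return {"status": 200, "timeout_used": timeout}


effect = MakeHttpRequestWithSensitiveDataEffect(transport=fake_transport)
response = effect.handle(MakeHttpRequestWithSensitiveData(
    url="https://acquirer.example/charge",
    body={"card_number": "4111111111111111", "amount": 100},
))
```

The package never sees the logging or timeout logic, so a developer cannot accidentally skip either of them.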
The code is organized as follows: all messages reside inside the SDK, which is a dependency of both the platform and the packages. As the packages live outside the platform, there is also a simulator which, for testing purposes, resolves every message that the platform resolves. Finally, all effects live inside the platform (or the simulator).
The control flow of this SDK is handled by what we call the Effect Engine, which receives a message, verifies whether it is fulfilled or not, and then dispatches it to the correct effect. Visually, the flow of execution of a package developed using our SDK looks something like this:
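As a rough illustration of the simulator's role, the sketch below tests a piece of package code against canned data instead of the real platform. It is Python rather than PHP, and `Simulator`, `build_charge_payload` and the `resolve` callable are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class GetCustomerInfo:
    """Message from the SDK, shared by package, platform and simulator."""
    payment_id: str


def build_charge_payload(payment_id: str, resolve: Callable[[Any], Any]) -> dict:
    """Package code: it sends a message and does not care who answers it."""
    customer = resolve(GetCustomerInfo(payment_id))
    return {"payment_id": payment_id, "customer_name": customer["name"]}


class Simulator:
    """Test-only stand-in for the platform, answering from canned data."""

    def __init__(self, customers: Dict[str, dict]) -> None:
        self._customers = customers

    def resolve(self, message: Any) -> Any:
        if isinstance(message, GetCustomerInfo):
            return self._customers[message.payment_id]
        raise LookupError(f"simulator has no effect for {type(message).__name__}")


simulator = Simulator({"pay_1": {"name": "Ana", "country": "BR"}})
payload = build_charge_payload("pay_1", simulator.resolve)
```

In production the same package code would receive the platform's resolver, with no change to the package itself.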
It can be noted from this diagram that a package depends only on the messages it really uses. So it is possible to introduce new messages without ever touching an existing package, unless that package needs to use them.
After some months of developing and maintaining packages which use the SDK, and of extending the SDK itself, we had developed around 40 packages, ranging from integrations with credit-card acquirers to payment risk analysis. Along the way, some lessons were learned.
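The dispatch step can be sketched in a few lines. This is a simplified Python illustration, not the real engine: the `register` and `dispatch` methods are hypothetical, and "fulfilled" is assumed here to mean that the message carries all of its required data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Type


@dataclass(frozen=True)
class GetCustomerInfo:
    """Message defined in the SDK; the package only fills it in."""
    payment_id: str

    def is_fulfilled(self) -> bool:
        # Assumption: "fulfilled" means every required field is set.
        return bool(self.payment_id)


class EffectEngine:
    """Maps each message type to the effect that resolves it."""

    def __init__(self) -> None:
        self._effects: Dict[Type, Callable[[Any], Any]] = {}

    def register(self, message_type: Type, effect: Callable[[Any], Any]) -> None:
        self._effects[message_type] = effect

    def dispatch(self, message: Any) -> Any:
        if not message.is_fulfilled():
            raise ValueError(f"{type(message).__name__} is missing required data")
        effect = self._effects.get(type(message))
        if effect is None:
            raise LookupError(f"no effect registered for {type(message).__name__}")
        return effect(message)


# Platform side: the effect only returns data tied to the given payment.
CUSTOMERS_BY_PAYMENT = {"pay_1": {"name": "Ana", "country": "BR"}}


def get_customer_info_effect(message: GetCustomerInfo) -> dict:
    return CUSTOMERS_BY_PAYMENT[message.payment_id]


engine = EffectEngine()
engine.register(GetCustomerInfo, get_customer_info_effect)

# Package side: send a message, receive a result, know nothing else.
customer = engine.dispatch(GetCustomerInfo(payment_id="pay_1"))
```

Because effects are looked up by message type, registering an effect for a new message leaves every existing package untouched.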
The first packages were created in repositories outside of our platform, and we declared them as dependencies using Composer. As all code we send to production must be reviewed, updating these dependencies became really tedious.
At the time, if you needed to change one line of code in one package, you needed to create a Pull Request on the package repository and, after that PR was approved and merged, create another PR on the platform to update the package there, which in turn needed to be approved, merged and deployed. Between the merge of your first PR and the preparation of your second one, other people would update their own packages, and those updates could conflict with the changes introduced in your PR. After that, we decided to move all packages to a monorepo, which is the repository of the platform itself. With that, our Composer file points to the directory where each package lives, removing the intermediate step of upgrading the package.
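For reference, Composer supports this setup through repositories of type `path`. The fragment below is only a sketch: the directory `packages/acquirer-example` and the package name `acme/acquirer-example` are made up for illustration and are not our real layout.

```json
{
    "repositories": [
        {
            "type": "path",
            "url": "packages/acquirer-example"
        }
    ],
    "require": {
        "acme/acquirer-example": "*"
    }
}
```

With a path repository, Composer installs the package straight from the local directory (symlinking it when possible), so a change to the package and the platform code that uses it can go through review in a single PR.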
As all tests inside the packages run against effects implemented inside the simulator, the simulator and the platform sometimes end up with divergent implementations, which leads to bugs in production. As new features or changes are added to the effects inside the platform, we need to make sure their behavior doesn't differ across environments, and this is a hard problem in itself.
Finally, the balance from implementing this whole mechanism is still positive and keeps improving. We got to a point where the SDK is stable enough and the effects we implemented rarely need to change. There is a learning curve when a new engineer starts working on the platform, but it pays off because the many details already hidden inside the effects don't need to be implemented again.
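One possible way to catch this kind of divergence earlier, shown here only as an idea and not as something we described above, is to run the same contract checks against both implementations of an effect. The sketch is in Python, and all function names and data in it are hypothetical.

```python
from typing import Callable, Dict

# Assumed platform storage; it holds more than the package is allowed to see.
PLATFORM_DB = {"pay_1": {"name": "Ana", "country": "BR", "internal_score": 7}}


def platform_get_customer_info(payment_id: str) -> Dict[str, str]:
    """Stand-in for the effect that lives inside the platform."""
    row = PLATFORM_DB[payment_id]
    return {"name": row["name"], "country": row["country"]}


def simulator_get_customer_info(payment_id: str) -> Dict[str, str]:
    """Stand-in for the effect that lives inside the simulator."""
    canned = {"pay_1": {"name": "Ana", "country": "BR"}}
    return canned[payment_id]


def check_contract(effect: Callable[[str], Dict[str, str]]) -> None:
    """Checks that every implementation of the effect must pass."""
    result = effect("pay_1")
    assert set(result) == {"name", "country"}, "unexpected fields in response"
    try:
        effect("unknown")
    except KeyError:
        pass
    else:
        raise AssertionError("unknown payments must raise KeyError")


# The same checks run against both environments.
for implementation in (platform_get_customer_info, simulator_get_customer_info):
    check_contract(implementation)
```

Such checks only cover the behavior someone remembered to write down, so they reduce the problem rather than solve it.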