Exploring various approaches for testing external calls in Elixir

Published in Welcome Tech · 9 min read · Oct 2, 2023

When I started out as a backend developer, I didn’t know which portions of code to test or how to test them properly. Although I recognised the importance of testing code, setting up the necessary boilerplate seemed complex.

The difficulties I encountered often came from the code complexity rather than the testing process itself. Now I have come to realize that writing tests can actually be an enjoyable experience when the code is neatly organized, bringing me an unparalleled sense of peace of mind, so much so that I can no longer do without it.

Of course, it’s essential to focus testing efforts strategically on critical features rather than attempting to cover every nook and cranny of the codebase. While it’s (relatively) straightforward to write tests for CRUD (create, read, update, delete) models, writing extensive tests for code sections that you don’t fully understand can be intimidating, for example when dealing with external services such as APIs. But this is exactly where testing becomes relevant: ensuring that your application responds accurately to any given scenario.

The Elixir ecosystem offers a multitude of possibilities to streamline the testing of external services. At Welcome to the Jungle, we had the opportunity to compare several of these techniques to efficiently test our numerous external services (emailing, queues, payment systems, etc.), most of which lack a dedicated test environment.

In this article, we will summarize what we discovered and provide an overview (albeit not an exhaustive one) of various Elixir methods for testing payment API calls, ranging from the simplest to the most intricate. The focus will be solely on unit testing through input/output examples. We will not cover alternative unit-testing approaches such as mutation or property-based testing, nor end-to-end or integration testing.

Although the testing methods we will explore use different technical approaches, they all share a common inherent logic: A focus on the internal code and business logic over the final implementation, ensuring independence from the underlying service (which may change over time). While our article only features examples in Elixir, it is very likely that these concepts can be adapted for other programming languages.

Presentation of the use case

In this article, we will simulate a simple call to a fictional payment API called Stipe. Our comparison will focus on the following use case: A user initiates a payment using a credit card stored in the model (note, this should never be done in a real application). The user’s status will then be updated whether the payment succeeds or fails.

In our simulation, we will treat Stipe as an external library: we have no control over the code in charge of executing the external HTTP call. Our tests will therefore have to focus on examining the payment logic rather than the library itself.
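To make the use case concrete, here is a minimal sketch of what such business logic could look like. The `make_payment/2` function, its arguments, the struct fields, and the status values are assumptions made for illustration; they are not taken from the repository.

```elixir
# Stand-in for the external Stipe library (hypothetical API). In the real
# project this module would perform an HTTP call that we do not control.
defmodule Stipe do
  def make_payment(card_number, amount) when is_binary(card_number) and amount > 0 do
    {:ok, %{card: card_number, amount: amount, status: "succeeded"}}
  end

  def make_payment(_card_number, _amount), do: {:error, %{reason: "invalid_request"}}
end

defmodule MyApp.User do
  # The card number lives on the struct only to keep the example short.
  defstruct [:name, :card_number, status: :pending]
end

defmodule MyApp do
  alias MyApp.User

  # Charges the user's card and updates the status whatever the outcome.
  def pay(%User{} = user, amount) do
    case Stipe.make_payment(user.card_number, amount) do
      {:ok, _response} -> %{user | status: :paid}
      {:error, _response} -> %{user | status: :payment_failed}
    end
  end
end
```

The interesting part for testing is `MyApp.pay/2`: it is the only code we own, and both branches of the `case` deserve a test.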

TLDR: Summary table

It is worth noting that some libraries, such as Exq or Oban, offer built-in mock modules for testing function calls. This avoids having to reimplement this logic on the application side.

You can clone our public Git repository to run the code on your end. The master branch contains the initial project without any testing, and each solution we experimented with is available on a separate branch.

The initial architecture of the project is built as follows:

lib
| my_app
| misc => ignore this folder, it contains the fictitious implementations of the Stipe library and the Stipe backend
| my_app.ex => the file where our business logic is located (the user’s payment)
| user.ex => the user structure
test => the test folder, empty at the moment

Naive tests

Mock

https://github.com/maximemenager/elixir-testing/tree/with-mock

During this initial testing phase, we will look at an approach often used in many programming languages: monkey patching. This technique involves replacing one function with another in the program memory on the fly.

Your code will no longer call the code of the external library. Instead, it will call a function defined by the developer in “test” mode.

For example, this code snippet shows that we can “overload” the external make_payment function:
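As a minimal sketch of the idea, the script below replaces the whole module in memory by defining it a second time. This is an assumption-laden illustration: the `Stipe.make_payment/2` signature and the map-based user are hypothetical, and the repository branch may well rely on a mocking library rather than on plain module redefinition.

```elixir
# Hypothetical stand-in for the external library. The real function would
# perform an HTTP call, so here it simply refuses to run.
defmodule Stipe do
  def make_payment(_card_number, _amount) do
    raise "real HTTP call attempted"
  end
end

defmodule MyApp do
  # Business logic under test: maps the library response to a user status.
  def pay(user, amount) do
    case Stipe.make_payment(user.card_number, amount) do
      {:ok, _response} -> %{user | status: :paid}
      {:error, _response} -> %{user | status: :payment_failed}
    end
  end
end

# Test-only patch: silence the redefinition warning, then replace the Stipe
# module in memory with a version that returns a canned response.
Code.compiler_options(ignore_module_conflict: true)

defmodule Stipe do
  def make_payment(_card_number, _amount), do: {:ok, %{status: "succeeded"}}
end

user = %{card_number: "4242424242424242", status: :pending}
IO.inspect(MyApp.pay(user, 1_000))
```

`MyApp.pay/2` is untouched, yet it now reaches the patched function because remote calls always dispatch to the latest loaded version of a module.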

This concept is interesting for achieving quick results. However, the main concern here is that it changes the behaviour exclusively in test mode and not in production. To put it simply, it is as if your code followed two distinct logics, one in testing and another in production.

This can lead to significant problems, as any change in the library’s behaviour in production would go unnoticed in testing. The simplest example is a change in the library’s response after an update. If the library initially returns a tuple ({:ok, response} or {:error, response}) and a version bump makes it return just “response” or “nil”, the patched function keeps returning the old tuple, so the change remains invisible during testing and the code breaks once deployed to production.

While this test mode can be useful for temporary development, it does not guarantee long-term robustness of the code.

More advanced testing

Fake HTTP server

https://github.com/maximemenager/elixir-testing/tree/with-http-server

This approach involves redirecting the calls meant for the external API to a custom-built API instead of the actual one. The principle is simple: Replicate the responses of the real API by reverse engineering.

The main weakness of the previous approach has been removed. Here, the production and test code are identical and follow the same path. The only difference is the final endpoint, which varies based on the context.
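A minimal sketch of this setup, using only modules shipped with Erlang/OTP (`:gen_tcp` for the fake server, `:httpc` for the client). The `/payments` path, the JSON body, the `:stipe_base_url` configuration key, and the use of a GET request (chosen to keep the server tiny) are all assumptions for illustration.

```elixir
# A throwaway HTTP server that answers every request with a canned JSON body,
# imitating what the real payment API is assumed to return.
defmodule FakeStipeServer do
  @body ~s({"status":"succeeded"})

  # Starts listening on a random free port and returns that port.
  def start do
    {:ok, listen_socket} =
      :gen_tcp.listen(0, [:binary, packet: :raw, active: false, reuseaddr: true])

    {:ok, port} = :inet.port(listen_socket)
    spawn_link(fn -> accept_loop(listen_socket) end)
    port
  end

  defp accept_loop(listen_socket) do
    {:ok, socket} = :gen_tcp.accept(listen_socket)
    # Read the request and ignore it: every request gets the same answer.
    {:ok, _request} = :gen_tcp.recv(socket, 0)

    :gen_tcp.send(socket, [
      "HTTP/1.1 200 OK\r\n",
      "content-type: application/json\r\n",
      "content-length: #{byte_size(@body)}\r\n",
      "connection: close\r\n\r\n",
      @body
    ])

    :gen_tcp.close(socket)
    accept_loop(listen_socket)
  end
end

# Hypothetical client: the same code runs in test and production, only the
# base URL read from the configuration changes.
defmodule StipeClient do
  def payment_status do
    base_url = Application.fetch_env!(:my_app, :stipe_base_url)
    url = String.to_charlist(base_url <> "/payments")

    case :httpc.request(:get, {url, []}, [], body_format: :binary) do
      {:ok, {{_version, 200, _reason}, _headers, body}} -> {:ok, body}
      other -> {:error, other}
    end
  end
end

{:ok, _apps} = Application.ensure_all_started(:inets)
port = FakeStipeServer.start()
Application.put_env(:my_app, :stipe_base_url, "http://127.0.0.1:#{port}")
IO.inspect(StipeClient.payment_status())
```

In a Mix project the base URL would typically be set per environment in the configuration files, with the fake server started from the test helper.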

However, we still encounter the same issue as before: a change in the API (which is not supposed to happen) will still break the production environment, because the fake server keeps returning the old responses. Once the bug is discovered, the fix has to be applied twice, in the application and in the fake server, so the tests always lag behind production. While this type of test is convenient for end-to-end scenarios, it still overlooks the most important aspect: testing your business logic, not HTTP calls (because, yes, this approach comes close to testing the HTTP protocol itself).

Mock with behaviour

https://github.com/maximemenager/elixir-testing/tree/with-behaviour

Like the previous approach, this one replaces what sits behind the external call, but it eliminates the entire HTTP component to focus on a behaviour. A behaviour generalizes a piece of logic through a contract and delegates the final implementation of that logic to a module that adheres to the contract.

There’s a major subtlety here in comparison to testing with a fake HTTP server or a mock. In this case, you will call the same logic (defined by the behaviour), regardless of whether you are in test or production mode. However, the actual implementation at the end will differ, provided that it is consistent with the contract.

This way, we can very easily create functions that produce the same response as those present in the library. Compared to using a mock, we are more likely to detect an error before production. If an external function changes its signature, the implementation will no longer respect the contract and a warning will be raised during compilation, making the problem easier to anticipate and resolve.
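The idea can be sketched as follows. The module names, the callback signature, and the `:payment_gateway` configuration key are hypothetical; the production implementation only raises here, where a real project would delegate to the external library.

```elixir
defmodule MyApp.PaymentGateway do
  # The contract: every implementation must expose this function.
  @callback make_payment(card_number :: String.t(), amount :: pos_integer()) ::
              {:ok, map()} | {:error, map()}
end

# Production implementation: in a real project it would delegate to the
# external Stipe library (omitted here, so it only raises).
defmodule MyApp.PaymentGateway.Stipe do
  @behaviour MyApp.PaymentGateway

  @impl true
  def make_payment(_card_number, _amount), do: raise("real HTTP call attempted")
end

# Test implementation: same contract, fixed response.
defmodule MyApp.PaymentGateway.InMemory do
  @behaviour MyApp.PaymentGateway

  @impl true
  def make_payment(_card_number, _amount), do: {:ok, %{status: "succeeded"}}
end

defmodule MyApp do
  # The same business logic runs in every environment.
  def pay(user, amount) do
    case gateway().make_payment(user.card_number, amount) do
      {:ok, _response} -> %{user | status: :paid}
      {:error, _response} -> %{user | status: :payment_failed}
    end
  end

  # Only the module behind the contract changes with the configuration.
  defp gateway do
    Application.get_env(:my_app, :payment_gateway, MyApp.PaymentGateway.Stipe)
  end
end

# In a Mix project this line would typically live in config/test.exs.
Application.put_env(:my_app, :payment_gateway, MyApp.PaymentGateway.InMemory)

IO.inspect(MyApp.pay(%{card_number: "4242424242424242", status: :pending}, 1_000))
```

Because both implementations declare `@behaviour` and `@impl true`, the compiler warns when one of them stops matching the callbacks declared in the contract.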

Nevertheless, there is a limitation to this approach: It does not allow for checking the parameters/headers sent or modifying the response on the fly (to differentiate between an HTTP 401 code and an HTTP 200 code, for example).

Mock with a GenServer

https://github.com/maximemenager/elixir-testing/tree/with-genserver

This solution builds on the previous one. Instead of locking the returned response, we store the received parameters in a global state (using an Agent/GenServer, for example) so that they can be validated later.

One of the advantages of this solution is that we will be able to check the parameters sent since they are stored in a global state. Furthermore, it is possible to modify the response returned based on the received content.
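A minimal sketch of such a recording implementation, using an `Agent` rather than a full `GenServer` to keep it short. The module names, the callback signature, and the rule that one specific card number is always declined are assumptions for illustration.

```elixir
defmodule MyApp.PaymentGateway do
  @callback make_payment(String.t(), pos_integer()) :: {:ok, map()} | {:error, map()}
end

# Test implementation that records every call in a named Agent and picks its
# response from the arguments it receives.
defmodule MyApp.PaymentGateway.Recorder do
  @behaviour MyApp.PaymentGateway
  use Agent

  def start_link(_opts \\ []) do
    Agent.start_link(fn -> [] end, name: __MODULE__)
  end

  # Returns the recorded calls in the order they were made.
  def calls, do: Agent.get(__MODULE__, &Enum.reverse/1)

  @impl true
  def make_payment(card_number, amount) do
    Agent.update(__MODULE__, fn calls -> [{card_number, amount} | calls] end)

    # Hypothetical rule: this card number is always declined.
    if card_number == "4000000000000002" do
      {:error, %{status: "card_declined"}}
    else
      {:ok, %{status: "succeeded"}}
    end
  end
end

defmodule MyApp do
  def pay(user, amount) do
    case gateway().make_payment(user.card_number, amount) do
      {:ok, _response} -> %{user | status: :paid}
      {:error, _response} -> %{user | status: :payment_failed}
    end
  end

  defp gateway, do: Application.fetch_env!(:my_app, :payment_gateway)
end

{:ok, _pid} = MyApp.PaymentGateway.Recorder.start_link()
Application.put_env(:my_app, :payment_gateway, MyApp.PaymentGateway.Recorder)

MyApp.pay(%{card_number: "4242424242424242", status: :pending}, 1_000)
IO.inspect(MyApp.PaymentGateway.Recorder.calls())
```

The Agent is registered under a single global name, so every caller writes to the same list of recorded calls.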

However, the solution does have limitations in terms of flexibility, since it is impossible to validate that it was our specific function call that saved the parameter in the global state. It is also impossible to guarantee that the function has only been called once and that no other equivalent parameters have been saved.

Mox/Hammox

https://github.com/maximemenager/elixir-testing/tree/with-hammox

Mox is a library that grew out of José Valim’s excellent article Mocks and explicit contracts, which presents a clean approach to testing an external service. Based on Mox, Hammox takes the concept further by also checking calls against the defined typespecs. To keep things simple, we will only refer to Mox for the rest of this article.

Despite being a very small library (fewer than 1,000 lines of code), Mox combines the benefits of the two previous methods (with-genserver and with-behaviour). It provides a shared-state mechanism to ensure that a function is called and tracks how many times it has been called.

The core principle is to generate functions at runtime based solely on the callbacks defined in a behaviour. By default, the functions exist but have no underlying implementation. Therefore, each test that calls a function must specify the desired return value. To avoid the tedious task of implementing a function in every test, even when the result is irrelevant to the current test case, you can use a stub: a default implementation that applies when no test-specific implementation is defined.

Mox also makes it possible to track the number of calls to a function. While the most common case is a single call, we have run into more intricate scenarios where Mox proved very helpful. For example, after a production bug that made an unwanted external call, we were able to write a unit test that ensured no external calls were made. We have also encountered the reverse situation, where we needed to validate that two consecutive external calls had been made.
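The script below is a sketch of these three features: `expect` with a call count, `stub` for a default implementation, and an expectation of zero calls. It assumes it is run as a standalone script (hence `Mix.install`), and the behaviour, the mock name, and the rule that a zero amount never reaches the gateway are hypothetical.

```elixir
# Assumption: run as a standalone script, so Mox is fetched with Mix.install.
# In a Mix project it would be a test-only dependency instead.
Mix.install([{:mox, "~> 1.0"}])
ExUnit.start()

defmodule MyApp.PaymentGateway do
  @callback make_payment(String.t(), pos_integer()) :: {:ok, map()} | {:error, map()}
end

# Generates the MyApp.StipeMock module from the behaviour's callbacks.
Mox.defmock(MyApp.StipeMock, for: MyApp.PaymentGateway)
Application.put_env(:my_app, :payment_gateway, MyApp.StipeMock)

defmodule MyApp do
  # Hypothetical rule for the example: a zero amount never reaches the gateway.
  def pay(user, 0), do: %{user | status: :paid}

  def pay(user, amount) do
    case gateway().make_payment(user.card_number, amount) do
      {:ok, _response} -> %{user | status: :paid}
      {:error, _response} -> %{user | status: :payment_failed}
    end
  end

  defp gateway, do: Application.fetch_env!(:my_app, :payment_gateway)
end

defmodule MyAppTest do
  use ExUnit.Case, async: true
  import Mox

  # Fails a test whose expectations were not met when it exits.
  setup :verify_on_exit!

  @user %{card_number: "4242424242424242", status: :pending}

  test "marks the user as paid after exactly one gateway call" do
    expect(MyApp.StipeMock, :make_payment, 1, fn "4242424242424242", 1_000 ->
      {:ok, %{status: "succeeded"}}
    end)

    assert %{status: :paid} = MyApp.pay(@user, 1_000)
  end

  test "a stub provides a default when the call count is irrelevant" do
    stub(MyApp.StipeMock, :make_payment, fn _card, _amount ->
      {:error, %{status: "card_declined"}}
    end)

    assert %{status: :payment_failed} = MyApp.pay(@user, 1_000)
  end

  test "makes no external call for a zero amount" do
    expect(MyApp.StipeMock, :make_payment, 0, fn _card, _amount ->
      flunk("the gateway must not be called")
    end)

    assert %{status: :paid} = MyApp.pay(@user, 0)
  end
end
```

Expectations are tied to the process that sets them, which is why these tests can run with `async: true` without interfering with one another.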

Exploring more

In this article, we explored a subset of the many testing possibilities offered by the Elixir ecosystem. Using different examples, we have tried to transcribe the evolution of our testing approach over the years at Welcome to the Jungle. The progression of our testing methodology has been an ongoing process over several months, and it is worth noting that there may be alternative approaches to testing your applications.

While Mox was presented in this specific use case, keep in mind that this library is useful in any situation where a function has side effects, such as writing to a file or even performing read/write operations in a database (even if the latter is subject to debate).

In addition to the mock tests, it is worth exploring other types of tests that have not been presented here (such as mutation or property-based testing). These concepts can be more complex by their very nature, but they remain an interesting place to start not only in the Elixir ecosystem, but also in terms of professional growth and development.


Written by Maxime Ménager, Senior backend developer @ WTTJ

Edited by Anne-Laure Civeyrac

Illustration by David Adrien
