Diffx: Your Diff Friend Forever

Matan Keidar
Published in Wix Engineering
4 min read · Dec 14, 2022

Introduction

As Scala developers, we frequently store the application data in case classes. One of our daily tasks is to write tests comparing instances of case classes in order to make sure our application logic holds.

One of the most common patterns in tests is asserting object equality (e.g., should be). When such an assertion fails, however, it provides no helpful information beyond printing both compared objects to the console.

What is the problem?

At Wix, we strongly believe that every aspect of our system has to be tested in order to avoid problems in production. Since we live in a complex world, the business domain of a typical application is usually complex as well, because it has to reflect reality. The problem arises when a test assertion on such a complex model fails without providing any useful information.

For example, let’s design a simple HTTP request model. In our example, an HTTP request has a method, a set of HTTP headers, and an optional payload.
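A minimal sketch of such a model could look like the code below. The field names (key, value, method, headers, payload) and the use of a plain String for the method are assumptions made for illustration; only the overall shape follows the description above.

```scala
// Hypothetical model: the field names and types are illustrative assumptions.
final case class Header(key: String, value: String)

final case class HttpRequest(
    method: String,          // e.g. "GET"
    headers: Set[Header],    // a set of HTTP headers
    payload: Option[String]  // the optional payload
)
```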

So, let’s write a very simple test that compares 2 instances of an HTTP request whose payloads differ only slightly.
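Such a test might be sketched as follows, assuming specs2 and the hypothetical model fields used in this post. The test is meant to fail, and the exact failure message depends on the test framework and the matcher used.

```scala
import org.specs2.mutable.Specification

// Hypothetical model: the field names are illustrative assumptions.
final case class Header(key: String, value: String)
final case class HttpRequest(method: String, headers: Set[Header], payload: Option[String])

class HttpRequestSpec extends Specification {
  "an HTTP request" should {
    "be equal to the expected request" in {
      val actual   = HttpRequest("GET", Set(Header("k1", "v1")), Some("hello world bla bla bla bla"))
      val expected = HttpRequest("GET", Set(Header("k1", "v1")), Some("hel1o world bla bla b1a bla"))

      // Plain equality: this assertion fails on purpose, because the payloads differ
      // in two characters ('l' vs '1').
      actual must beEqualTo(expected)
    }
  }
}
```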

However, the output is not very human-readable, since the developer has to read through the following message and find the exact differences by eye:

HttpRequest(GET,Set(Header(k1,v1)),Some(hello world bla bla bla bla)) is not the same as 'HttpRequest(GET,Set(Header(k1,v1)),Some(hel1o world bla bla b1a bla))'

Can we do better than this?

Diffx to the rescue

Diffx is an open-source library (sponsored by SoftwareMill) for Scala developers that produces friendly, human-readable comparison results for values of various types. Diffx integrates with popular test frameworks as well as with other libraries such as Refined and Cats.

There are other similar libraries for producing effective human-readable diffs for testing purposes. However, some of them are no longer actively developed or do not support auto-derivation (i.e., automatically creating the type class instances required for the comparison).

Let’s use Diffx in our previous HTTP example and analyse the output.

Instead of printing both objects in full, the Diffx output shows that there are 2 differences, both of them in the payload content.
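A minimal sketch of the same comparison with Diffx is shown below. It assumes the diffx-core module, the compare function, and automatic derivation as we recall them from the 0.8.x documentation; the model fields are the same hypothetical ones as before, so please check the names against the version you use.

```scala
import com.softwaremill.diffx._
import com.softwaremill.diffx.generic.auto._ // automatic derivation of Diff instances

// Hypothetical model: the field names are illustrative assumptions.
final case class Header(key: String, value: String)
final case class HttpRequest(method: String, headers: Set[Header], payload: Option[String])

object DiffxCompareExample extends App {
  val actual   = HttpRequest("GET", Set(Header("k1", "v1")), Some("hello world bla bla bla bla"))
  val expected = HttpRequest("GET", Set(Header("k1", "v1")), Some("hel1o world bla bla b1a bla"))

  // compare returns a DiffResult describing the comparison field by field.
  val result: DiffResult = compare(actual, expected)

  // The payloads differ, so the two requests are not identical.
  println(result.isIdentical)

  // Render the diff as a human-readable string (colored in a terminal).
  println(result.show())
}
```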

Advanced usage

1. Using Nested Matchers

We use the specs2 library heavily as our backend testing framework at Wix. To use Diffx in tests, the developer calls the matchTo matcher, which checks the equality of objects. In some cases, however, the test asserts something other than simple equality. In these cases, the matchTo matcher can be combined with other, higher-level matchers. For example, matchTo can be nested inside a collection matcher to check whether a given collection contains an element.
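A sketch of such a nested matcher is shown below. It assumes the diffx-specs2 module with its DiffMatcher trait and the specs2 contain matcher; the model fields are hypothetical and the exact imports may vary between versions.

```scala
import com.softwaremill.diffx.generic.auto._
import com.softwaremill.diffx.specs2.DiffMatcher
import org.specs2.mutable.Specification

// Hypothetical model: the field names are illustrative assumptions.
final case class Header(key: String, value: String)
final case class HttpRequest(method: String, headers: Set[Header], payload: Option[String])

class ContainRequestSpec extends Specification with DiffMatcher {
  "a collection of requests" should {
    "contain the expected request" in {
      val expected = HttpRequest("GET", Set(Header("k1", "v1")), Some("hello"))
      val requests = Seq(
        HttpRequest("POST", Set.empty, None),
        expected
      )

      // matchTo is nested inside contain: the assertion passes if at least one
      // element of the collection matches the expected request according to Diffx.
      requests must contain(matchTo(expected))
    }
  }
}
```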

2. Ignoring Fields

In some cases, when comparing 2 objects, we need to ignore some fields because they are not relevant to the comparison. For this purpose, Diffx can be configured to ignore specific fields.

For example, Diffx can ignore the timestamp field of a LogMessage class and take only the message into account.
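This could be sketched roughly as follows, assuming the Diff.derived and ignore methods described in the Diffx documentation. The field types of LogMessage (a Long timestamp and a String message) are assumptions made for illustration.

```scala
import com.softwaremill.diffx._

// Hypothetical model: the field types are illustrative assumptions.
final case class LogMessage(timestamp: Long, message: String)

object IgnoreFieldExample extends App {
  // Derive a Diff instance for LogMessage and exclude the timestamp from the comparison.
  implicit val logMessageDiff: Diff[LogMessage] =
    Diff.derived[LogMessage].ignore(_.timestamp)

  val first  = LogMessage(timestamp = 1000L, message = "user logged in")
  val second = LogMessage(timestamp = 2000L, message = "user logged in")

  // Only the ignored timestamp differs between the two messages.
  println(compare(first, second).isIdentical)
}
```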

3. Comparing with Tolerance

Diffx can compare nested values using custom comparison logic. Take a Person class, for example: only its weight field is compared with a tolerance of 5, while the other fields are compared as usual (this example is taken from the official documentation).
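The snippet below is a sketch of this configuration, reconstructed from the Diffx documentation as we recall it (Diff.summon, modify, setTo, and Diff.approximate). The name field and the sample values are assumptions added for illustration.

```scala
import com.softwaremill.diffx._
import com.softwaremill.diffx.generic.auto._

// The name field is an illustrative assumption; weight is the field compared with a tolerance.
final case class Person(name: String, weight: Double)

object ToleranceExample extends App {
  // Replace the default Diff of the weight field with an approximate comparison.
  implicit val personDiff: Diff[Person] =
    Diff.summon[Person].modify(_.weight).setTo(Diff.approximate(epsilon = 5.0))

  // The weights differ by 3, which is within the tolerance of 5.
  println(compare(Person("Bob", 80.0), Person("Bob", 83.0)).isIdentical)

  // The weights differ by 10, which exceeds the tolerance of 5.
  println(compare(Person("Bob", 80.0), Person("Bob", 90.0)).isIdentical)
}
```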

4. Comparing Collections

Not all collections were born equal: Seq-based collections store items by a positional index, while Set-based collections store items based on other criteria (e.g., hash functions). In addition, there are Map-based collections, which store items (i.e., values) indexed by keys. Diffx has great collection support, which lets the developer choose how the elements of each specific collection are paired up before they are compared.

In the example from the official documentation, a separate comparison logic is configured for each of the collection fields of an Organization class.
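A sketch along the lines of that documentation example follows. The method names matchByValue and matchBy are quoted from memory of the 0.8.x documentation and should be checked against the version in use; the Person fields are assumptions.

```scala
import com.softwaremill.diffx._
import com.softwaremill.diffx.generic.auto._

// Hypothetical model: the Person fields are illustrative assumptions.
final case class Person(name: String, age: Int)

final case class Organization(
    peopleList: List[Person],
    peopleSet: Set[Person],
    peopleMap: Map[String, Person]
)

object CollectionMatchingExample {
  // Pair up the elements of each collection by the person's age before comparing them,
  // instead of pairing them by list index, set membership, or map key.
  implicit val organizationDiff: Diff[Organization] = Diff
    .summon[Organization]
    .modify(_.peopleList).matchByValue(_.age)
    .modify(_.peopleSet).matchBy(_.age)
    .modify(_.peopleMap).matchByValue(_.age)
}
```

With such a configuration, a reordered list or a changed map key is reported as a change inside the matching person rather than as one removed and one added element.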

5. Changing the Output Format

The Diffx output format can be tweaked. The color theme can be switched to suit dark or light terminals and, if you are really willing to, you can define your own custom color theme.

Additionally, if your code compares large objects that contain many identical fields, the printed diff will be long. If you want to print the differences only, identical results can be omitted from the output.

Conclusion

Diffx is a neat tool for producing pretty diff outputs. It helps the developer easily understand the exact difference between two compared objects. Although we did not need to apply a complex custom configuration, the Diffx library can be highly customized according to the requirements of the application's business domain.
