Computing the Diff of two Arrays in a functional way in Swift

Sergio Schechtman Sette
Grand Parade
Published in
4 min readOct 28, 2017

This week I got a challenge from my friend Mateusz Maćkowiak to solve an issue in a diff function he created. Given two arrays, his goal was to retrieve:

  • The common elements: Elements that are in both arrays.
  • The inserted elements: Elements that are not in the first array, but are in the second array.
  • The removed elements: Elements that are not in the second array, but are in the first array.

There is a catch though. The elements were not Comparable nor Equatable, and the elements from the first array may not be of the same type as the second one. The function receives an compare closure though, that takes one element from the first array and one from the second array and tell it if they are equal.

He showed me his solution:

So his issue was that he needed a compare2 closure to compare two T2 elements. It took me a while to figure out the algorithm, with many for loops and even a continue with a label. I asked him if doing it in a functional way wouldn’t be easier. He asked me to do it, so I took the challenge.

In the beginning it looked easy. But after a couple of minutes without being able to write a single line of code, he smiled and left me to try it by myself.

So after some time and a lot of thinking, i got a solution:

  • I take the first array, filter all elements that are not in the second array using the compare closure and make it my removed list.
  • Then I take the second array, filter all elements that are not in the first array using the compare closure and make it my inserted list.
  • The common list was pretty tricky to create. First I combine the first array and second array into an array of tuples (T1, T2) with all possible combinations from array1 and array2. Afterwards I filter the tuple by applying the compare closure.

I ran some tests and checked that it worked properly. This solution is a lot more readable than all these for loops. I was a little bit happy, but then I realized that it is a naïve solution. We’re doing way too many comparisons, so I checked that it took about five times longer than the iterative one, meaning the iterative approach ran in 20% of the functional approach time.

I was not satisfied so I tried to optimize the common computation a little bit. I noticed that even if i find a tuple with the first element pretty fast, I still compared all its other permutations. So after a while, I got this:

Not getting all the possible tuples anymore. Now we take each element from the first array, create a tuple with it and the first element in the second array that satisfies our closure. If the second element in the tuple is nil, it means that there is no element in the second array that is equal to it. This approach reduces the computation time to be 60% of the original one.

Time to take a deeper look to the removed and inserted lists.

First it occurred to me that in our common list, when the second element is nil, it means that the element was removed. So by doing the opposite of the common list, we filter all tuples in which the second element is nil, and take the first element.

I tried to apply the same logic to the second list, but it doesn’t work. We have no list in which T1 elements are optional. So I created another combinations list, with the second array instead of the first one, and by checking all T1 elements that were nil, i got the inserted list.

Computation time decreased to 40% of the original approach.

Of course this combinations duplication didn’t make me happy. I was sure it could be improved. So i worked on it again.

The best way i could think to improve it was to actually go back a little bit to the first approach and start filtering the second array again. But now I can improve it a little by comparing with the elements that are common, which have less (or same) elements than the first list.

It was a small change, but just by removing the double combinations list, we improved the computation time to 25% of the original approach, which is really close to the iterative performance.

In my opinion, 5% of performance loss is more than justified by making the code a lot smaller and easier to read and removing the need of a compare2 function.

Another curious fact was that after my final solution, I took a look at the original iterative one and noticed that they actually look alike a lot. They both go over the first list and try to find the first element that is equal according to the compare closure. Then they fill the removed list with the elements that do not compare successfully. And the final step takes the second list and figures the removed elements from a comparison with the common list in the functional approach, and with the helper handledJ in the iterative approach.

Maybe this handledJ gave the little performance superiority, but it requires the compare2 closure to work.

The screenshot below shows the average measured time for all approaches.

The input data was:

let a = [Int](0...1000)
let b = [Int](750...1500)

--

--

Sergio Schechtman Sette
Grand Parade

iOS Developer at Revolut. Swift enthusiast. Coffee lover ☕️