An elegant pattern to craft cache-efficient functions in Swift

Swift is a language used in good part on mobile devices, some of which operate with a (very) limited amount of computational power — think of older iPhones or Apple Watches, for instance.

In such contexts, every CPU cycle counts and well-crafted apps will be expected to make the most of them. A known way to reduce an app footprint is through caching.

Caching is often considered a tricky business to implement. While this is true when external resources — such as web pages — are being put in cache, things are much easier when we consider only caching the result of pure functions.

A function is considered pure if it always returns the same value for a given input, while producing no side-effect along the way.

Caching the results of pure non-recursive functions

For such functions, functional programming is the designated tool to transparently bring them cache-efficiency.

Consider the following code:

The cached function takes a function as argument, and returns a new function, that will compute the exact same result as the original one. But once a value has been computed a first time, it will then be stored in a cache.

Consequently, any subsequent calls made with the same input will no longer require any computation to be carried out in order to bring out the correct result.

Consider that you are writing code for a 2D game, you are now able to easily work with cache-efficient versions of standard trigonometry functions:

If your games frequently makes trigonometry computations for the same values, there is now doubt that such an approach will bear significant improvement in performances 💪

The issue with recursive functions

The technique discussed above is so easy to use that one might think that there is a catch to it! And indeed there is one, or rather a limitation.

When you work with non-recursive functions, all is well and the cached function works like a charm.

But if you try to pass it a recursive function, for instance:

You will notice no boost of performance, and indeed for values of n greater than 20, the function will still take a lot of time to return – remember that, without caching intermediary results, this function has an exponential time complexity.

Why does this happen? Well, fib is a recursive function, so it calls itself within its own body. Indeed, when you put fib through the cached function and call the resulting function, there will be a cache look-up. But after that, the recursive calls to fib will still point towards the original fib, therefore no cache look-up will happen anymore.

In order to allow the same cache mechanism to work with recursive functions, additional work will be required.

Writing cache-efficient recursive functions

First of all, we won’t be able to create cached versions of existing functions. But we can provide a framework to easily write cache-efficient recursive functions.

The key problem we need to solve is that of the recursive call. We need a way to transparently wrap some code around the recursive call, in order to perform the cache look-up before actually performing it.

The solution lies in the following type signature : ((In) -> Out, In) -> OutIn and Out being generic types. This signature describes a function that takes two argument :

  • the recursive function
  • the input of the recursive function

Through this type, we are able to both make recursive calls, using the first argument, and access the input of the function, via the second argument.

From there we only need to write an internal function that actually performs the cache look-up before, if necessary, carrying through the recursive call.

While this implementation is indeed a little bit complex, the great thing is that using is extremely easy, as additional syntax is kept to a bear minimum:

From there you can now use the cached function, and see that it runs much faster!

Where to go from there?

The process I’ve described above, that consist of caching the results of intermediary results is called Memoization. Its principle is rather simple: it trades memory for CPU cycles.

While it can bring some noticeable speedup in computations, as with any kind of compromise, it’s no silver bullet. Thus it’s very important to consider every specific situations before applying it:

  • in some cases, you might need to put a cap on the size of the cache, and implement some cache replacement policy in order to manage the cache.
  • when working with floating numbers, caching might reveal inefficient because the inputs will often differ by very small amount. In such cases rounding the input before the computation might help.
  • etc.

Thanks for reading this article, if you’ve enjoyed its content, make sure to check my GitHub repo dedicated to tips for the Swift language!


Originally published at gist.github.com.