A generic, bottom-up memoizer

Because just memoizing Fib can get tedious

printf(‘Hello World’);

I saw a talk earlier which touched on dynamic programming, ending with the uncontroversial flourish of memoizing a Fibonacci function.

Yup, it’s a bit Hello-Worldy, but it’s the function we love to love. If the idea’s unfamiliar, here’s the basic concept:

That’s not to say this is a ‘good’ implementation: it’s an explicative one. It’s purpose is to highlight some key things to think about with Dynamic Programming, the aim of which is to break a large problem down into its a smaller, soluble subproblems, start there and use the information we already have to solve them and work towards the result of the larger problem we’re solving. An, in my opinion, neater solution that better embodies this paradigm but is less pedagogically useful can be found on my pal’s blog.

Key things to note about the above are:

  • It’s ‘bottom up’: it starts at two and works up to n, solving the first and smallest problem that needs to be solved first;
  • It ‘memoizes’: it saves the smaller solutions that we do have in order to solve the next slightly larger problem;
  • It (rather explicitly) deletes solutions no longer required: we don’t need to use solutions of n-3 again, so get rid of them for constant space complexity;
  • It’s tail-recursive: not unusual or difficult for fib, but something to consider more generally with recursive problems if you’re running them in an environment that can optimise for that.

Surely Peter, you ask, you suitably-well-attired, charismatic and dashing young man, you could get any guy you want? Why would you bother with this? The answer is that I would hate to clobber the stack with 2^n function calls if I could do with n. I’m just that kind of guy: tender, caring, aware of other people’s needs and the number of stack frames. You know, the whole package.

Imagine we’re solving for an n of five. If we took a top-down, unmemoized approach* (i.e. started at 5, then recursively called fib on 4 and 3 etc.), then your function would make both of those calls to fib which would, themselves, make another two calls (to 3 and 2 in the first case and 2 and 1 in the latter). At this stage we have the following calls on the stack:

  • 5 (our call)
  • 4 (5-1)
  • 3 (5–2)
  • 3 (4–1)
  • 2 (4–2)
  • 2 (3–1)
  • 1 (3–2)

The bottom three can all return happily now, but we’ve still got another two calls coming from that second 3. We’re looking at a total of nine calls to fib on our stack, and you can clearly see that it’s calling for several numbers more than once. The function I showed above would be called n — 1 times (for values of n over 1): our initial call puts 2 in the cache, then another puts 3 in the cache, then another puts 4 in the cache and one final call that calculates and returns five. Again, this is better demonstrated by someone else with an image.

The basic principle to remember is that we only know 0 and 1. We could start by calculating 5, but that means we need to calculate 3 and 4. Instead, we could start by finding out 2, then use that to find out 3. We’re making one recursive call not two* to get to our answer.

*Of course, we could have a memoized, top-down approach, where we still have two recursive calls. The number of calls this function would make depends on the arrangement of those calls and the order in which expressions are evaluated.
That is, if the recursive call to n — 1 is made first, you’re sweet: it will fill the cache up with every number from n to 0 and so when it returns to the first call with a bulging cache, n-2 will be trivial. However, if it starts at n-2, I make that 8 function calls. I leave it as an exercise to the reader both to work out what would happen in their environment and to correct my numbers if I’ve missed something.

More about this generic memoizer pls

So, this got me thinking about memoizers. It’s a pretty standard problem for learning about higher-order functions. Obviously, Underscore.js has a good implementation (and, obviously, it’s illustrated with fib 😜).

But could we keep the main advantages of the above? That is, can we make a memoizer with constant space complexity and tail recursion that can run in something approaching liner time. The answer is, of course, that it depends on what exactly is happening in your function, but how about this:

It takes a function, func, and a value, n, and returns a function that memoizes a maximum of n results from func and that will call func a maximum of x times when you call memFunc(x) (I make no guarantees about what func does with that call).

It has certain stipulations: it works only on functions that take a single, positive numerical argument and can only keep its guarantees if your function doesn’t make recursive calls on values greater than the argument with which it was called.

Basically, it works for fib and fib-like problems — e.g. n = func(n-1) + func(n-2) + func(n-3)— but it does work!

In order to use it, one must, of course, assign the returned function to a variable within the current scope that has the name used in the recursive call. Otherwise, you’re doing it all for nothing!


The Fibonacci sequence

The most talked-about string of numbers describing the breeding patterns of rabbits that the world has ever known.