Memoization vs Tabulation in DP

Aryan Jain
9 min read · Jun 11, 2022

What is Dynamic Programming (DP)?

Dynamic Programming is a technique in computer programming that helps to efficiently solve a class of problems that have overlapping subproblems and optimal substructure property.

If a problem can be divided into subproblems, which in turn are divided into smaller subproblems, and if these subproblems overlap, then the solutions to the subproblems can be saved for future reference. In this way, CPU work is not wasted on recomputation. This method of solving a problem is referred to as dynamic programming.

Such problems involve repeatedly calculating the value of the same subproblems to find the optimum solution.

Dynamic Programming is mainly an optimization over plain recursion. Wherever we see a recursive solution that has repeated calls for the same inputs, we can optimize it using Dynamic Programming. The idea is simply to store the results of subproblems so that we do not have to re-compute them when needed later. This simple optimization can reduce time complexity from exponential to polynomial. For example, if we write a simple recursive solution for Fibonacci numbers, we get exponential time complexity, and if we optimize it by storing the solutions of subproblems, the time complexity reduces to linear. It can be implemented by memoization or tabulation.

What is Memoization?

Memoization is an optimization process. In simple terms, we store the intermediate results of the solutions of sub-problems, allowing us to speed up the computation of the overall solution. The improvement can be from an exponential-time solution down to a polynomial-time solution, with the overhead of using additional memory for storing intermediate results.

A typical square function would look something like this, where the input is multiplied by itself and the result returned:
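A minimal sketch in Python (the function names in the sketches throughout are illustrative):

```python
# A plain square function: no caching, the input is simply
# multiplied by itself and the result returned.
def square(n):
    return n * n
```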

A memoized square function would keep a cache that stores previously computed values for reuse, thus trading off space complexity for an improvement in time complexity:
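A minimal sketch of that idea, using a plain dict as the cache (functools.lru_cache is a common ready-made alternative):

```python
# A memoized square function: previously computed results are kept
# in a cache and reused, trading extra space for faster repeat calls.
square_cache = {}

def square_memoized(n):
    if n not in square_cache:       # compute only on a cache miss
        square_cache[n] = n * n
    return square_cache[n]
```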

What is Tabulation?

Tabulation is an approach where you solve a dynamic programming problem by first filling up a table and then computing the solution to the original problem from the results in this table.

While DP problems, such as the Fibonacci computation, are recursive in nature, a tabulation implementation is always iterative.

Case Example:

If you’re computing, for instance, fib(3) (the third Fibonacci number), a naive implementation would compute fib(1) twice:
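A naive recursive sketch in Python makes the duplicated work visible (using the convention fib(0) = 0, fib(1) = 1):

```python
# Naive recursive Fibonacci: overlapping subproblems are recomputed.
def fib(n):
    if n <= 1:
        return n                    # base cases: fib(0) = 0, fib(1) = 1
    return fib(n - 1) + fib(n - 2)

# Call tree for fib(3): fib(1) is evaluated twice.
#
#           fib(3)
#          /      \
#      fib(2)    fib(1)
#      /    \
#  fib(1)  fib(0)
```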

Here we observe the overlapping subproblems and optimal substructure properties; thus Fibonacci calculations can be solved using Dynamic Programming.

With a more clever DP implementation, the tree can be collapsed into a graph (a DAG) in which each distinct subproblem appears only once.

It doesn’t look very impressive in this example, but it’s in fact enough to bring the complexity down from O(2^n) to O(n). The effect is much clearer for a larger input such as fib(7): the full call tree contains 41 calls, while the corresponding collapsed DAG contains only 8 nodes.

This improvement in complexity is achieved regardless of which DP technique (memoization or tabulation) is used.

A memoized fib function would thus look like:
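A sketch in Python, with a module-level dict as the memo:

```python
# Memoized (top-down) Fibonacci: each fib_mem(k) is computed at
# most once; later calls return the cached value.
memo = {}

def fib_mem(n):
    if n <= 1:
        return n                    # base cases: fib(0) = 0, fib(1) = 1
    if n not in memo:               # compute only on a cache miss
        memo[n] = fib_mem(n - 1) + fib_mem(n - 2)
    return memo[n]
```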

As you can see, fib_mem(k) will be computed at most once for each k. (The second time around, it will return the memoized value.)

This is enough to cause the tree to collapse into a graph, as described above. For fib_mem(4), the calls would be made in the following order:
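With the sketch above, the trace looks like this (indentation shows nesting; fib_mem(n - 1) is evaluated before fib_mem(n - 2)):

fib_mem(4)
  fib_mem(3)
    fib_mem(2)
      fib_mem(1) (base case)
      fib_mem(0) (base case)
    fib_mem(1) (base case)
  fib_mem(2) (answer returned from the memo)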

This approach is top-down since the original problem, fib_mem(4), is at the top of the above computation.

Time Complexity

The cache is checked to see if an answer is already stored in the nth spot, and if so it is returned. Otherwise, fibonacci(N - 1) and fibonacci(N - 2) are called recursively and their sum is computed. This sum is placed in the cache array at the Nth spot and then returned. With memoization, whenever fibonacci(N) is called and N has already been solved once, cache[n] already holds the answer and returns it. The time complexity of naively evaluating fibonacci(N - 1) + fibonacci(N - 2) is O(2^N), and with memoization it drops to O(N).

Space Complexity

Using the memoization technique, each ‘fibonacci’ value will be calculated only once. So, the space complexity will be O(N), where ’N’ is the input number to ‘fibonacci’ (the array for memoization will hold ’N’ numbers).

Tabulation is similar in the sense that it builds up a cache, but the approach is different. A tabulation algorithm focuses on filling the entries of the cache, until the target value has been reached.

Tabulation Algorithm

  • An integer function is created that takes the input ‘N’ and creates the cache array.
  • The cache array is initialized for elements ‘0’ and ‘1’.
  • The remaining elements of the Fibonacci series are filled into the cache using a for-loop.
  • Finally, cache[n] is returned as the answer, and a main function prints the output.

The tabulation version of fib looks like this:
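A sketch in Python, following the algorithm above:

```python
# Tabulated (bottom-up) Fibonacci: the cache array is filled from
# the base cases upward; no recursion is involved.
def fib_tab(n):
    if n <= 1:
        return n
    cache = [0] * (n + 1)           # cache[0] = 0
    cache[1] = 1
    for i in range(2, n + 1):
        cache[i] = cache[i - 1] + cache[i - 2]
    return cache[n]
```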

The computation of fib_tab(4) can be described as follows:
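With the sketch above, the table is filled from the bottom up:

cache[0] = 0, cache[1] = 1 (base cases)
cache[2] = cache[1] + cache[0] = 1
cache[3] = cache[2] + cache[1] = 2
cache[4] = cache[3] + cache[2] = 3 (returned as the answer)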

As opposed to the memoization technique, this computation is bottom-up since the original problem, fib_tab(4), is at the bottom of the computation.

Time Complexity

The loop runs while ‘i’ is less than or equal to ‘N’, so that the ‘N’th value is reached. The loop simply iterates, incrementing ‘i’ each time, instead of recursively breaking the problem down. This ensures that no value is recomputed and that the answer is produced in O(N) time.

Space Complexity

As written above, the cache array holds N + 1 entries, so the space complexity is O(N). However, since each step only reads the previous two values, the bottom-up approach can be reduced to constant space, making the space complexity O(1), as the sliding-window sketch below shows.
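A minimal sketch of that constant-space version:

```python
# Space-optimized tabulation: only the last two Fibonacci values
# are kept (a "sliding window"), so extra space is O(1).
def fib_tab_small(n):
    if n <= 1:
        return n
    prev, curr = 0, 1               # fib(0), fib(1)
    for _ in range(2, n + 1):
        prev, curr = curr, prev + curr
    return curr
```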

Although tabulation might be comparatively harder to implement, it comes with a complexity bonus: the complexity of recursive algorithms can be hard to analyze, but with a tabulation-based implementation you get the complexity analysis for free. Tabulation-based solutions always boil down to filling in values in a vector (or matrix) using for-loops, and each value is typically computed in constant time.

Is tabulation always better than memoization?

If all subproblems need to be solved in the given problem, tabulation mostly outperforms memoization since it has no overhead for recursion. The tabulation technique can use a preallocated array instead of a hash map.

What is the disadvantage of memoization?

Some of the disadvantages of memoization are space overhead, slow initial execution, and extra burden on programmers as they might need to modify the code.

How do table entries work differently in tabulation and memoization?

Entries are filled one by one in a tabulated version, while entries are filled on-demand in the memoized version.

Should I use tabulation or memoization?

If the original problem requires all subproblems to be solved, tabulation usually outperforms memoization by a constant factor. This is because tabulation has no overhead for recursion and can use a preallocated array rather than, say, a hash map.

If only some of the subproblems need to be solved for the original problem to be solved, then memoization is preferable, since the subproblems are solved lazily, i.e. precisely the computations that are needed are carried out (e.g. the memoized square function shown above).

Also, tabulation with a bottom-up approach can have better memory complexity when we do not need to “remember” all previous solutions. (In Fibonacci, for example, only the last two values must be remembered.)

In the tabulation approach you have more control over what data to save for later. Inside a memoized function it is hard to make a decision on what previous results can be discarded.

Thus, the bottom line is this: If you want such finer-grained control over memory usage, memoization may not be a viable option.

Pros/Cons

Semantic Complexity

It is often easier to implement DP solutions with memoization. If we define the subproblems recursively by creating a recurrence relation, then it is straightforward to translate the relation into code.

Of course, problems can be formulated with tabulation as well. This requires more thought, however, and a conscious ordering of the subproblems, so the design is subtler.

Storage

In terms of storage, tabulation can potentially have a smaller footprint than memoization. If we order our subproblems carefully, certain subproblems are reused early on and then never used again. This enables us to cache the answer to each subproblem for only a limited amount of time. For example, when generating the Fibonacci sequence from the bottom up, we first add 1 and 1 together to get 2. Then we add 1 and 2 together to get 3. And so on. Any numbers calculated before the last two can be discarded, since we don’t need them anymore to continue generating the sequence. Thus, we create a “sliding window” that enables us to cache just two numbers at any given time, as in the constant-space sketch earlier. This is an extreme example, but it illustrates the importance of ordering your subproblems wisely to minimize space usage in tabulation.

Memoization, on the other hand, does not give us the ability to minimize our storage. Since the calls to subproblems are recursive, we cannot make any smart or definite decisions on how to order the subproblems. Instead, we have to cache all answers to subproblems, in case another recursive call will need them in the future.

Runtime

Memoization on very complex problems can be problematic, since there is considerable overhead that comes with recursion: each recursive call adds a frame to the call stack, and the whole chain of active calls must be kept in memory at once. If the recursion is deep enough, it can overflow the function call stack.

A way to speed up memoization is to parallelize the recursive calls while maintaining global storage for memoized answers to subproblems. Since the ordering of subproblems in memoization does not matter (unlike tabulation), this is viable.

Tabulation is often faster than memoization because it is iterative and solving subproblems requires no overhead. However, it has to go through the entire search space, which means that there is no way to easily optimize the runtime.

Conclusion

Recursion with memoization is better whenever the state space is sparse — in other words if you don’t actually need to solve all smaller subproblems but only some of them. In such cases, the recursive implementation can be much faster. Recursion with memoization is also better whenever the state space is irregular, i.e., whenever it is hard to specify an order of evaluation iteratively.

Iterative solutions are better whenever the state space is dense and regular. If you need to compute the solutions to all the subproblems anyway, you may as well do it without all the function call overhead. An additional advantage of the iterative approach is that you are often able to save memory by forgetting the solutions to subproblems you won’t need in the future. E.g., if you only need row k of the table to compute row k+1, there is no need to remember row k-1 anymore.
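As a hypothetical illustration of that last point, here is a sketch that builds Pascal’s triangle row by row, keeping only the current row in memory:

```python
# Binomial coefficients via Pascal's triangle: row k+1 depends only
# on row k, so a single row of storage suffices.
def pascal_row(n):
    row = [1]                       # row 0
    for _ in range(n):
        # build the next row from the current one, then discard it
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row                      # row n: [C(n, 0), ..., C(n, n)]
```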


Authors:

  1. Ashish Hegde(29)
  2. Aryan Jain(36)
  3. Naman Jain(38)
  4. Aditya Joshi(43)
