How to solve any dynamic programming interview question (part 1)

Alexandru Rotaru
7 min readJul 24, 2021

--

I recently interviewed and came across one of the most challenging problems I encountered in an interview, a 3D dynamic programming problem and I want to present a strategy that I learned throughout the years that helped me solve it within 45 minutes.

In the following sections, I will discuss one classic DP problem. In part 2 of this article, I will discuss a more complex problem, similar in difficulty to the one I encountered at my interview and show how, without too much experience, you can come up with the solution yourself. You need, however, to be familiar with backtracking algorithms and python.

When researching this topic, I went through some materials online, and I can summarise most of the courses and articles about dynamic programming into the following key points:

  • top-down vs. bottom-up
  • memoization (example Fibonacci)
  • problem and subproblem representation
  • intuitive explanation of the recursion

Yes, you have to start with these if you are new to algorithms, but this will not teach you how to deal with problems in general.

There is also the common saying, “The more problems you solve, the better you become at solving,” and while this is true, in general, people seem to use the problems previously solved as a database of solutions. You may develop a strong intuition on solving problems in other cases, but you are not aware of how the right ideas come.

The biggest issues, in most cases, when teaching DP:

  • the solutions are very well optimized; therefore, obtuse
  • the curricula are focussed solely on common, well-known problems (knapsack, Levenstein distance, Fibonacci, etc.)

which do not necessarily help you develop a general strategy for solving DP problems.

Knapsack discrete

For those unfamiliar with this problem, given some objects with weight (w) and value (v) and a knapsack with a given capacity (c), compute the maximum value you can obtain by putting the objects in the knapsack. The solution, in short, looks something like this:

Eq.1: Common core equation for knapsack 0–1 DP solution

where:

  • i is the index of the object;
  • w[i] and v[i] are the weight and the value of the object i;
  • dp[i][c] is the maximum value you can obtain using objects from index i onward in a knapsack of capacity c.

Solutions and explanations for this problem are plentiful online. However, I would like to approach the problem from the standpoint of the first computer scientist that ever tried to solve it.

How do you come up with the problem representation ?

How do you come up with the recursion ?

How do you come up with such a solution ?

And I mean, literally, not explaining how the solution works, how, do you, step by step, reach this solution starting from scratch.

The “meta-algorithm” or strategy I am about to show you for solving any DP problem is not a formal algorithm but rather a guideline, and it consists of the following steps:

I. Solve the problem recursively

  • Backtracking fitting the problem description
  • Adapt the algorithm for the problem at hand
  • Clean and optimize the code
  • Memoization

II. Convert the recursive solution to an iterative solution

  • Determine the number of loops and table dimensions
  • Optimize for memory or time (if possible)

There is a significant advantage in approaching problems this way, as with a bit of practice, you will be able to see soon where exactly you have difficulties in the process. Also, if you reach a dead-end, you go back one or more steps and try a slightly different approach.

Step 1: Recursive solution

The answers to the questions above will come naturally from approaching the problems in the worst and inefficient way possible: backtracking. In fact, most of the DP algorithms you may have seen are just optimized versions of the backtracking algorithms.

Step 1.1: Starting algorithm

In essence, we are dealing with putting objects into the knapsack. Since the objects are represented with indexes, we can start with a solution that generates all the subsets of the objects in question using a vector of flags (1 means presence, 0 means absence).

Generate all subsets of a set using a mask of bits
python subsets.py
[0, 0, 0] # no objects
[0, 0, 1]
...
[1, 1, 0]
[1, 1, 1] # all objects

Step 1.2: Modify the previous algorithm to solve the problem at hand

To adapt to the problem at hand, besides the necessary parameters, we add logic in the base case of the recursive function. We compute the cumulated weight and value of the objects and update the best value seen so far if the weight is under the knapsack capacity.

python knapsack_backtracking.py
Maximum value: 23

Step 1.3: Clean and optimize the code above

Since we check whether the current subset of elements weighs more than the capacity at the end, we can move this check after every object insertion and cutoff some recursive calls. To do this, we introduce two parameters, curr_value, and curr_weight to keep track of the current total value and weight in the knapsack.

In addition, this change allows us to get rid of the parameter mask and the loop inside the base case.

Knapsack backtracking with cutoff

At this point, we notice that we can merge curr_weight and capacity into a single parameter.

We can reduce parameters further by changing the signature of the function and instead return the optimal value. With this change, we eliminate best and curr_value parameters.

Since weights and values do not change throughout the function calls, we can make them global or store them in a class. This is not necessary but makes the code a bit more readable.

Minimum parameters backtracking knapsack

Ignoring the sanity check for the case in which we consider the current object and abbreviate the names of variables and the function, we get the following equation:

Eq.2: Equation inferred from recursive solution

Although the notations are different, the equations (1) and (2) are equivalent. Even though we started with backtracking, we reached a point where we can see where Eq.1 comes from.

Observation: It may not be obvious, but line 9 comes from line 12 of the prior snippet.

In this setup, the actual parameters of this function represent the problems and the returned value is the solution for that given problem.

Analyzing the problem representation, it is worth noting that idx has two meanings. From weights[idx], values[idx] idx conveys index of an object. Because idx controls which objects can be used in the knapsack throughout subsequent recursive calls, it also means a substring of objects.

Step 1.4: Memoisation

weights = [1, 1, 4, 5, 9, 8]
values = [11, 3, 9, 2, 6, 12]
capacity = 9

If we run our algorithm on the previous example and print the function call parameters, we will notice that some pairs of parameters are repeated.

knapsack(idx=0, capacity=9)
knapsack(idx=1, capacity=9)
knapsack(idx=2, capacity=9)
...
# knapsack contains objects with indices 0, 3
knapsack(idx=4, capacity=3)
...
# knapsack contains objects with indices 0, 1, 2
knapsack(idx=4, capacity=3)
...

This is because due to our input's nature, we need the solution to a certain subproblem in different contexts. In the sample above, we see that we reach the subproblem (idx=4, capacity=3) in two different ways. In both cases, we reach the total weight of the objects to be 6, with the remaining capacity being 9–6=3.

Therefore we need to make sure that we do not compute the same problem more than once. For this, we add a map to keep track of the solved problems.

At the beginning of the function, we return the solution for our input if it was previously computed and stored in the cache. If not, we solve the given problem recursively and then store the solution in the cache just before the return statement.

Cached knapsack backtracking

Step 2. Iterative solution

While the cached recursive approach works, there is still a lot of room for improvement. Recursive solutions, in general, are worth converting to iterative algorithms because function calls introduce overhead.

Step 2.1: Convert recursive to iterative.

This step can be done automatically. To have an iterative version of this algorithm, we need to:

  • determine the number of dimensions in our cache
  • the directions of the loops

After refactoring, there are just two parameters: idx and capacity, and therefore the cache will have two dimensions, and there will be two loops iterating through the cache.

We can determine the loop directions from the problem-subproblems relationship (figure below) or equation (1) (step 1.3).

Knapsack problem state space

Observations:

  • ignoring parameter c, ks(i) depends on ks(i+1), we need to solve for i+1 before i; therefore, i will iterate from N-1 to 0
  • ignoring parameter i, ks(c) depends on ks(c) and ks(c-w[i]), we need to solve for c-w, w≥0 before c; therefore, c will iterate from 0 to capacity
Knapsack iterative solution
python knapsack_iterative.py
Maximum value: 23

The direction of the first loop can be changed to [0, len(weights)] and still be correct. However, by doing this, the meaning of parameter i partially change to represent a substring of objects from 0 to i instead of i to len(objects) as it is currently.

Step 2.2: Optimize memory (if possible)

This step is the hardest and worth trying as it can significantly reduce memory usage and speed up the runtime.

From the previous observations, ks(i) depends only on subproblems with ks(i+1), which means we don’t need to store all the intermediate solutions of our subproblems. We need only to keep solutions for the previous i (in our case, previous i means i+1).

Knapsack memory efficient

We can reduce memory further by keeping only one dp vector, but we need to avoid using dp[c] computed for the current i. Knowing that dp[c] depends on dp[c - something], we reverse the direction of the second loop because as we iterate with c from capacity to 0, c-something has been computed at the previous i.

Knapsack iterative memory efficient
python knapsack_memory_efficient.py 
Maximum value: 23

In part 2 of this article, I will present how I solved a problem that I was unfamiliar with using this guideline.

--

--