Dynamic Programming in the Wild — Content-Aware Image Resizing

Jose Antonio Moreno
7 min readOct 12, 2019

--

Possession of an optimal substructure and overlapping sub-problems; two necessary attributes for dynamic programming to be applicable, and too much of a mouthful for the first sentence of any article.

With that said, dynamic programming can be a frustrating and academic concept to grasp, theory you learn in school and not something most developers apply on a daily basis. When removed from the context of a useful real-world application, it’s easy to brush aside this technique as a tedious exercise reserved for exams and interviews.

The truth is that dynamic programming can be very practical.

Seam Carving at A Glance

Seam carving for content-aware image resizing, first introduced in a 2007 paper by Shai Avidan & Ariel Shamir, is a process made possible (and by possible I mean efficient) through dynamic programming.

Simply put, the goal of this technique is to intelligently resize an image by removing pixels of low interest.

An invasive mountain goat on Mt. Ellinor, in the Olympic range

These pixels make up seams, which are visualized in the above image of the goat as green lines, or paths of low energy. Seams go from top to bottom, or from left to right, and contain a single pixel in each row or column. Seams can wrap around objects of prominence in a way that rectangular blocks of columns cannot.

This method is a significant functional improvement over traditional cropping and scaling, since the first can only be used for contractions, and the second leaves visible distortion and stretching for any dramatic size increases.

Defining an Energy Filter

Before any path finding can be done, an energy mapping is created by applying an energy filter over an image. Using a picture of the Louvre, we can quickly see what areas are empty, low-energy dark space:

Sobel operator applied over the Louvre for detection of edges

This filter can be a simple Sobel operator, which is commonly used to identify edges in an image. Conceptually, you can think of this as taking the derivative of an image, with the purpose of calculating the change in neighboring groups of pixels. White edges are defined as high energy pixels, while black space is empty and can be carved out with seams.

300 low-energy seams superimposed over the original image

As expected, the lowest energy seams are all found at the top of this image, in low-energy space. The lowest seam was the last one found, only after most of the space representing the sky was exhausted. If we animate this process of finding seams, we see the following:

Sequential removal of seams

The end result is a resized depiction of the Louvre, in an arguably more pleasing aspect ratio, without stretching or distortion of the buildings and pyramid.

A basic example of seam removal. This image is much “shorter” than the original.

Is Dynamic Programming right for you?

The actual lowest-energy path finding step is where dynamic programming comes in. We want to find the seam with the lowest total energy.

As mentioned, a candidate problem for dynamic programming has an optimal substructure, meaning that its sub-problems can each be solved in an optimal way.

Additionally, those sub-problems have to be overlapping sub-problems. Overlapping in this sense means that any recursive algorithm solving the problem should solve the same sub-problems over and over, rather than generating new sub-problems.

Top-Down

For seam carving, each sub-problem will be defined as finding the best path up to any single pixel in the image. In other words, each pixel will be designated the cumulative value of the lowest-energy seam which ends at that pixel.

We can do this using a top-down approach, coming directly from the recursive definition of our problem. However, the implication is that we will have to calculate every single path up to every single pixel in every single row. We need an efficient way to do this.

Consider calculating a Fibonacci sequence, which can be recursively defined. Notice how many times more fundamental calculations like F(1) appear:

A simple Fibonacci tree

A top-down approach takes advantage of memoization and computes each node by leveraging the sub-problem solutions in a lookup table. Generally speaking:

1. Solve the problem recursively by breaking it down into smaller problems.

2. If you solve a new sub-problem, cache it

3. When solving a new sub-problem that overlaps with a previous, smaller sub-sub-problem that you already solved, don’t solve the sub-sub-problem again; get it from the table

Even though you’re leveraging memoization by caching computations, you can still run into problems when seam carving. Large images can lead to stack overflows when using a recursive approach, not to mention the overhead of recursive lookups.

M̶i̶d̶d̶l̶e̶-̶O̶u̶t̶

A̶ ̶m̶i̶d̶d̶l̶e̶-̶o̶u̶t̶ ̶a̶p̶p̶r̶o̶a̶c̶h̶ ̶w̶i̶l̶l̶ ̶o̶n̶l̶y̶ ̶b̶e̶ ̶u̶s̶e̶f̶u̶l̶ ̶i̶f̶ ̶y̶o̶u̶’̶r̶e̶ ̶l̶o̶o̶k̶i̶n̶g̶ ̶t̶o̶ ̶a̶c̶h̶i̶e̶v̶e̶ ̶a̶ ̶b̶e̶t̶t̶e̶r̶ W̶e̶i̶s̶s̶m̶a̶n̶ ̶s̶c̶o̶r̶e̶.̶

Bottom-Up

A bottom-up approach is slightly different, because you pre-compute and cache all the sub-problems that you will need to solve larger, overlapping problems. This removes the overhead of recursive calls, at the cost of having to pre-compute all sub-problems needed.

For seam carving, this means starting at the top of the image and finding the best path to each pixel with the intent that a later, longer path will overlap with a smaller seam.

Even though there are many sub-problems, they are relatively small. Solving this is much faster than recursively (or even iteratively) breaking down larger problems, even with caching!

Seam Removal

Back to seam carving; once the n lowest-energy seams are defined, their coordinates are stored in a list. From there, all that’s left to do is remove those pixels from the original image, and shift the neighboring pixels in to fill their place.

Seam Insertion

Once low-energy seams are identified, the removal process can be extended to add synthetic information into an image. Consider this example, where n energy is first removed, and then added back in with a size of 2n.

A Berkeley resident overlooking the city at Indian Rock

Similar to seam removal, we first calculate n low-energy seams. Removing them is necessary to make sure you don’t map over the same seam more than once.

Each seam is then expanded by averaging the neighboring two seams and creating an artificial seam, which will be added back in with the original.

A Swiss cat following me on a narrow trail

The rationale for the insertion process is the same. By adding low-energy seams, the objects of interest are left intact and (mostly) without visual distortions.

Limitations

Arguably, the main limitation of seam carving for content-aware image resizing is the effectiveness of your energy filter. In this example, the object is well defined in terms of its edges, except for the top of the helmet:

The sobel operator didn’t distinguish the climbing helmet from the rocks of Mt. Shasta

Since this image is being expanded horizontally, vertical seams are needed; low-energy paths happen to find their way through the person, instead of around them. The result is a small amount of distortion, similar to what you would see if you were traditionally scaling the image.

A reasonable amount of distortion due to seams that found their way through the person

Further possibilities exist, such as object removal through addition of a low-energy mask over the object to be carved out.

Even without that, you can at least reshape those pesky wide-angle vacation pictures to finally fit on Instagram and no one will know (unless you mess up the energy filter).

A fisherman in Paranaguá, Brazil

Seam carving for content-aware image resizing — the results are visually interesting, and the implementation an exercise in effective dynamic programming.

--

--

Jose Antonio Moreno

Software Engineer at Capital One SF — Ideas here are my own — Berkeley, CA