Algorithms: How Prefix Sums Can Help Improve Operations Over Arrays

Tiago Guedes · Published in Beauty Date · Jan 29, 2018

If you've been in the software world for a while, you've probably faced a problem where you needed to do some processing over subsets of an array, right? No?! Well, let me bring you a simple one then.

Let's say I'm riding the Bitcoin wave and I decided to track my gross gains over the past year:

gains = [102, 55, 320, 250, 215, 142, 54, 32, 121, 224, 458, 276]

(remember, these are just sample data :-)

Then I started to think about a few queries I could apply to the gains array to check how much I earned in a given period. For instance, let's say I would like to know the total amount I earned in:

  • The whole year;
  • The first couple of months;
  • The last quarter of the year.

Those queries could easily be translated into Ruby code like this:

queries = [
  0..11, # whole year
  0..1,  # first 2 months
  9..11  # last quarter
]

It seems quite simple to take those indexes and sum the values in their range. And yes, it is. But be careful, my young Padawan, or you may fall into a performance trap.

Ok, so how should I implement the algorithm to sum them?

First Approach (worse): O(N * M)

Naturally, the first solution that comes to mind is to take each slice of gains, based on the given query (range), and then iterate over its elements to sum all the values.

gains   = [102, 55, 320, 250, 215, 142, 54, 32, 121, 224, 458, 276]
queries = [0..11, 0..1, 9..11]

queries.map do |query|
  gains[query].reduce(:+)
end
# Result: [2249, 157, 958]

Done! Problem solved with a neat, small piece of code. Next.

Hold on, not yet: this solution is indeed effective, but it's not that efficient.

For a small input of data (i.e. our current gains and queries arrays) this approach will definitely work fast. But let's assume a distant future where I had tracked those gains over the past 50 years (50 * 12 = 600 values) and, for that period, I ended up producing a collection of 10,000 different combinations of queries. Now, if I want to process all the queries at once, the same code that used to respond very well starts to show substantial performance losses. The problem in a nutshell: its time complexity.
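Just to make that hypothetical scenario concrete, here's a minimal sketch of how such an input could be generated (the value range and the query count are entirely made up):

gains = Array.new(50 * 12) { rand(1..500) } # 600 months of made-up gains

queries = Array.new(10_000) do
  first, last = [rand(gains.length), rand(gains.length)].sort
  first..last # a random, valid range over the gains array
end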

Just in case you're not used to this code "complexity" topic:

When we talk about time complexity we're actually talking about how well our code scales in the face of large inputs. We usually represent this scalability potential with Big-O notation, which by convention considers the algorithm's worst-case scenario. You can also understand the complexity as the maximum number of constant-time primary operations (O(1)) that a program may execute. Such operations are addition, subtraction, comparison, assignment, etc.

So, with the size of gains as N and the size of queries as M, the first approach takes each query (O(M)) and traverses the respective slice of gains (O(N)) to sum its values. The algorithm's performance is then represented by the notation O(N * M), i.e. in the worst-case scenario it would have to sum all 600 values (the biggest possible slice) 10,000 times, leading to 6 million primary operations being executed on the CPU while running the algorithm.

And what if we could reduce the worst case to only 10.6k operations?

Second Approach (better): O(N + M)

First, let me quickly introduce the prefix sums technique: it's basically a way to pre-compute the cumulative sum up to each position in a sequence, so that these sums can later be used to calculate the total between any two indexes much faster.

The reason for using this technique: the sum operation on the slices, which previously took linear time (traversing the elements), now happens in constant time, since it's only a matter of subtracting two pre-computed sums: for a range i..j, the total is simply sums[j + 1] - sums[i].

So here's how we can improve the previous algorithm:

gains   = [102, 55, 320, 250, 215, 142, 54, 32, 121, 224, 458, 276]
queries = [0..11, 0..1, 9..11]

sums = [0]

# 1. Prefix sum calculation:
(1..gains.length).each do |i|
  sums[i] = sums[i - 1] + gains[i - 1]
end

# 2. Query resolution:
queries.map do |query|
  sums[query.end + 1] - sums[query.begin]
end
# Result: [2249, 157, 958]

Initially what it does is an O(N) pass to calculate all the cumulative sums of the gains array. For instance, with gains = [1, 2, 3] it would compute sums = [0, 1, 3, 6].

The second part just takes each query, O(M), and subtracts the two pre-computed values found in the sums array. The correctness is ensured because the difference between those two prefix sums is exactly the sum of all values within the query's range.
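To make that subtraction concrete, these are the prefix sums computed for the sample gains above and how each of the three queries resolves:

sums = [0, 102, 157, 477, 727, 942, 1084, 1138, 1170, 1291, 1515, 1973, 2249]

sums[12] - sums[0] # => 2249 (whole year, 0..11)
sums[2]  - sums[0] # => 157  (first 2 months, 0..1)
sums[12] - sums[9] # => 958  (last quarter, 9..11)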

As we've got two separate linear operations happening here, the resulting time complexity is their sum: O(N) + O(M) = O(N + M). Thus, for a large input like 600 gains and 10,000 queries, in the worst case it would execute only about 10,600 primary operations on the CPU. That is, for this input, roughly 99% fewer operations than the previous algorithm.
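If you'd like to verify the difference yourself, here's a minimal benchmark sketch using Ruby's Benchmark module. The input mirrors the hypothetical 600-gain / 10,000-query scenario above (made-up values), and the actual timings will of course vary by machine:

require 'benchmark'

# Hypothetical input: 600 months of gains and 10,000 random range queries.
gains   = Array.new(600) { rand(1..500) }
queries = Array.new(10_000) do
  first, last = [rand(gains.length), rand(gains.length)].sort
  first..last
end

Benchmark.bm(12) do |x|
  x.report('naive:') do
    queries.map { |query| gains[query].reduce(:+) }
  end

  x.report('prefix sums:') do
    sums = [0]
    gains.each_with_index { |gain, i| sums[i + 1] = sums[i] + gain }
    queries.map { |query| sums[query.end + 1] - sums[query.begin] }
  end
end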

Conclusion

Some solutions may appear very obvious at first glance, but it's worth reflecting a bit more on the problem to figure out whether there's any hidden cost (a.k.a. trap) in going with the most obvious one.

Another thing to keep in mind is that, when facing performance issues, people usually blame something in their current tech stack (language, framework, libs, etc.) without realizing that the actual problem may lie in how the solution was designed. So a better understanding not only of algorithmic techniques, but also of architecture and design patterns, for instance, will certainly bring more performant solutions to the software you build.
