Intro to Algorithms: Chapter Four, the Maximum Sub-Array Problem

Come with me on a walk through Introduction to Algorithms by Cormen, et al., referred to on the internet as the “bible” of algorithms. Start at the beginning here.

Not gonna lie, merge sort beat the crap out of me, intellectually. I didn’t quite understand how recursion was working until I was literally writing the blog post and looking for the absolute most simple recursive method before it finally clicked. But that done, it seems like further work with recursion is going to move much smoother. Case in point: The Maximum Sub-Array Problem (which I will now abbreviate MSAP).

The Theory

MSAP deals with finding which contiguous portion of an array yields the greatest sum. Obviously, if all the numbers are positive, the answer is the entire array. Otherwise, here are two quick examples:

array = [-1, 2, 3, -1]

The greatest value we can get by summing any number of contiguous items of this array is 5: array[1..2]. One more:

array = [1, -1, 1, -1, 5, -1]

Here our sum is 5. We could achieve this by taking everything from array[0..4] or array[2..4], but we could also take array[4] alone and have the same sum.
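Before diving into the clever version, a quick brute-force check is handy for keeping ourselves honest (this is my own sketch, not from the book): just try every contiguous slice and keep the best sum.

```ruby
# Brute-force sanity check: try every contiguous slice, keep the best sum.
# Slow, but good for verifying the divide-and-conquer answers later.
def brute_force_max(array)
  best = array[0]
  (0...array.length).each do |i|
    (i...array.length).each do |j|
      sum = array[i..j].sum
      best = sum if sum > best
    end
  end
  best
end

p brute_force_max([-1, 2, 3, -1])        # => 5
p brute_force_max([1, -1, 1, -1, 5, -1]) # => 5
```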

The Process

The basic approach to this algorithm is that the sub-array we’re looking for can be in only one of three places:

  • On the left “side” of the array (between 0 and the mid-point)
  • On the right “side” of the array (between the midpoint + 1 and the end)
  • Somewhere crossing the midpoint.

So we can divide the array up into those three regions, find the largest sum in each, and then compare them.

The Code

Whereas insertion sort took me a while and merge sort took me on the order of days, I got this guy straight off from the pseudocode.

def max_subarray(array, low, high)
  if high == low
    # base case: a single element is its own maximum sub-array
    [low, high, array[low]]
  else
    mid = (low + high) / 2
    left = max_subarray(array, low, mid)
    right = max_subarray(array, mid + 1, high)
    cross = max_crossing(array, low, mid, high)
    if left[2] >= right[2] && left[2] >= cross[2]
      ideal = left
    elsif right[2] >= left[2] && right[2] >= cross[2]
      ideal = right
    else
      ideal = cross
    end
    ideal
  end
end

def max_crossing(array, low, mid, high)
  left_sum = -20
  i = mid
  sum = 0
  while i >= low
    sum = sum + array[i]
    if sum > left_sum
      left_sum = sum
      max_left = i
    end
    i -= 1
  end
  sum = 0
  j = mid + 1
  right_sum = -20
  while j <= high
    sum = sum + array[j]
    if sum > right_sum
      right_sum = sum
      max_right = j
    end
    j += 1
  end
  [max_left, max_right, left_sum + right_sum]
end

So, our max_subarray method breaks the array down into its smallest chunks from the left and the right (so that eventually each piece is a single item), then compares them back up through the “stack,” keeping only the sums that are greater than the ones that came before. At each level, max_crossing also runs, working outward from the middle to find the greatest sum that spans the midpoint. Then, back in max_subarray, we simply compare the left and right sums to the cross sum and come up with our winner.
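To see it run, here’s a compact, self-contained restatement of the two methods against our first example. It follows the same structure as the code above, but max_by is my own substitution for the if/elsif chain (it breaks ties a little differently), so treat it as a sketch rather than the book’s code:

```ruby
def max_crossing(array, low, mid, high)
  # best sum ending at mid, scanning leftward
  left_sum, sum, max_left = -Float::INFINITY, 0, mid
  mid.downto(low) do |i|
    sum += array[i]
    left_sum, max_left = sum, i if sum > left_sum
  end
  # best sum starting at mid + 1, scanning rightward
  right_sum, sum, max_right = -Float::INFINITY, 0, mid + 1
  (mid + 1).upto(high) do |j|
    sum += array[j]
    right_sum, max_right = sum, j if sum > right_sum
  end
  [max_left, max_right, left_sum + right_sum]
end

def max_subarray(array, low, high)
  return [low, high, array[low]] if high == low
  mid = (low + high) / 2
  candidates = [
    max_subarray(array, low, mid),
    max_subarray(array, mid + 1, high),
    max_crossing(array, low, mid, high)
  ]
  candidates.max_by { |bounds| bounds[2] } # keep the triple with the biggest sum
end

array = [-1, 2, 3, -1]
p max_subarray(array, 0, array.length - 1) # => [1, 2, 5]
```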

I arbitrarily chose -20 as my stand-in for maximum negativity in order to avoid looking up (again) how to represent infinity in Ruby, but on the whole, I’m sure this code could use an enormous amount of…

Refactoring

def max_crossing(array, low, mid, high)
  @left_sum = -Float::INFINITY
  @right_sum = -Float::INFINITY
  sum = 0
  array[low..mid].each_with_index.reverse_each do |num, index|
    sum = num + sum
    if sum > @left_sum
      @left_sum = sum
      @max_left = index + low # the slice starts at low, so offset back into the full array
    end
  end
  sum = 0
  array[(mid + 1)..high].each_with_index do |num, index|
    sum = num + sum
    if sum > @right_sum
      @right_sum = sum
      @max_right = index + mid + 1 # same offset trick on the right half
    end
  end
  [@max_left, @max_right, @left_sum + @right_sum]
end

def max_subarray(array, low, high)
  return [low, high, array[low]] if high == low
  mid = (low + high) / 2
  left = max_subarray(array, low, mid)
  right = max_subarray(array, mid + 1, high)
  cross = max_crossing(array, low, mid, high)
  if cross[2] >= left[2] && cross[2] >= right[2]
    cross
  elsif left[2] >= right[2]
    left
  else
    right
  end
end

I was able to lose eight lines of code and those while loops, but after some thinking, I couldn’t get it down further than that. No doubt the whizzes would laugh in my face at this “refactoring,” but it’s getting late on a Saturday, so I feel inclined to just let it ride.

One result of this refactoring does change the behavior of the program: it now prefers larger sub-arrays over the smaller sub-arrays that produce the same sum. On our second example above, the first version settles on array[4..4] while the refactored one returns array[2..4], both summing to 5. For our purposes here, there’s no difference, but it could be a really big difference in certain situations. It’s good to know we can come at this from two different directions, in any case. (The change comes from checking the cross sum first in the new if/elsif chain, so ties now go to the sub-array that spans the midpoint.)

So what happened to Chapter 3?

I read Chapter 3, but honestly I’m not ready for it, math-wise. It covers a number of different ways to describe the speed at which an algorithm will run: you can look at the worst-case scenario, the best-case scenario, or any number of other measures.

One key phrase is “asymptotic.” This means we are interested in what happens as n approaches infinity: when n is really, really big, what is happening to the running time? Otherwise, forget it. I can always come back to the math if I find that it’s really necessary.
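To make that concrete, here’s a tiny numeric sketch (mine, not the book’s): compare n² with 100n as n grows. For small n, the big constant on the linear term wins; eventually the n² term swamps it, and asymptotic analysis only cares about that eventual behavior.

```ruby
# For small n the 100n term is bigger; for large n, n^2 dominates.
[5, 50, 5_000].each do |n|
  puts "n = #{n}: n^2 = #{n * n}, 100n = #{100 * n}"
end
# n = 5:    n^2 = 25,         100n = 500     (100n wins)
# n = 50:   n^2 = 2500,       100n = 5000    (100n still wins)
# n = 5000: n^2 = 25000000,   100n = 500000  (n^2 wins by a mile)
```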

That’s it for comp sci today!