Optimizing the “find the greatest difference” challenge.
The challenge was that you were given as input an array of numbers, and needed to find the greatest positive difference between two items with indexes i and j in the array. The conditions were that j < i, and nums[i] > nums[j]. For instance, if nums = [4, 5, 3, 5, 7, 8, 1], with i = 4 and j = 2, you have 7 - 3 = 4. However, the maximum comes at i = 5 and j = 2, nums[i] = 8 and nums[j] = 3, yielding a difference of 5.
An easy problem to solve brute-force, with something like:
This is going to yield an O(n²) time solution, as it’s a doubly nested loop. Works fine for smaller input arrays, but the challenge specifically fed a few arrays of size > 100,000 in. So, efficiency was key.
I realized that for every value of i (the outer-most loop,) there was a way to immediately determine whether it was possible that a new maximum could be located. Essentially, for every value of i, you also keep track of the lowest value found up until that point; call it smallestNum. Then, for that value of i, before going into the second loop, you determine whether nums[i]-smallestNum is smaller than the current greatestDiff. If so, it means that there’s no possible value for nums[j] for that i which could yield a value greater than greatestDiff, and so you continue to the next iteration.
This yields what I believe to be O(n log n) time, as for each i, whether or not you dive into the nested loop is dependent upon certain conditions which are evaluated at run-time.
(I also snuck in some timing functionality, because benchmarking is groovy.)
So, run times.
To simplify the process, I rewrote the functions to return only their run times. I also created the following helper functions:
Interestingly enough, I ran into a strange bug, which perplexed me to the point where I posted a question on StackOverflow. Once I rewrote the functions to only return their run times, V1 would continually run near-instantly and return a run time of 0 seconds, whereas V2 would run as expected. Their code, aside from the return values, was identical, and I couldn’t figure out the root cause.
It turned out that it was because of Chrome trying to be helpful. Because it saw that I was never returning the value of greatestDiff (as I was only returning the run time,) it assumed it wasn’t needed, and was skipping both loops entirely for V1. However, V2 was a bit more complicated, and so Chrome erred on the side of caution by running the full function. Once I changed the return value to
return [((new Date().getTime() — start) / 1000.0), greatestDiff]
it began returning the correct run times again.
So, here are the run times for both functions for various array sizes and maximum individual sizes:
As expected, the more optimized V2 function is significantly faster, about ten times so. And, as the max possible size of elements in the array increases, it gets more efficient. This is because it’s keeping tabs on the current lowest value and thus the greatest difference currently possible, and with a wider range of individual values, it’s able to skip significantly more of them. On the other hand, the brute-force method’s run time is mostly independent of the max individual size of its input elements, which is also as expected.
So, a fun process all in all. Any day that I end up forced to post a question to StackOverflow means, at the least, something interesting occurred, and this was no exception.