Intro to Algorithms: Binary Search
As I’m learning more about programming and software development I’m quickly realizing the importance of understanding algorithms. Sure, many higher-level languages have all sorts of search and sort functions built-in, but the importance of truly understanding how these functions work cannot be understated. This post will give a walkthrough of the binary search algorithm, a foundational tool that should be in every programmer’s kit.
Starting from Scratch: Linear Search
One of the simplest methods available, the linear search, starts from the beginning of the collection and works towards the end, checking each element until the target is found.
Think of trying to find a name in a phonebook. A linear search would start on the first name on the first page, working name by name towards the back of the phonebook until the target name is found.
Translate that to computer language and you get something like this:
This is fine if you’re looking for Andrew Anderson in the phone book, but is a bit more of a hassle when searching for Zoe Zimmerman. It’s important to note that this method also scales poorly. If you live in a small town with 100 residents it’s feasible to do a linear search every time, even when searching for Zoe Zimmerman’s phone number. But imagine searching through New York City’s phonebook, population 19 million. You could use a linear search and you would eventually find your target, would you really want to?
Binary search to the rescue
If we’re going to search through larger collections we’re going to need another search method. Enter the binary search. This method tears the collection in half and checks the midpoint against the target, halving the searched collection size with each cycle until the target is found.
It is important to note that Binary search only works in cases where the collection is already sorted by ascending value.
It works as follows:
- Determine the midpoint in the sorted collection and grab the corresponding element.
- Is the midpoint element greater than or less than our target? If it is greater we can disregard the back half of the collection. If it is lesser we disregard the front half. If the midpoint element matches our target than we’ve succeeded and our search is over.
- Take the remaining half and start again from step one, continue until the target is found.
Back to The Phonebook:
Going back to the phone book of New York City, let’s say we’re searching for someone with the last name of Emmerson and give binary search a shot.
Our first step is to find the halfway point in our collection. The alphabet has 26 letters, so our halfway point would be the 13th letter “m”. We flip to the first page of “m” names and tear the book in half.
Now we compare that midpoint to our target. Does the midpoint come before or after the target we’re searching for? The letter “e” comes before “m” so right away we know we can disregard the back half and throw those names straight in the trash. Already we’re down to 9.5 million names instead of 19 million.
Next, we pick the halfway point between the first letter and the 13th letter which we’ll say is 7 because there are no half letters. We split our remaining collection in half on the first page of “g”, and since our midpoint is further along than our target we can once again throw out the second half.
We’re down to a little under 5 million names, from 19 million. Keep repeating this process and pretty soon you’ll be able to find your target rather quickly even if you’re searching through a massive collection. Meanwhile, if we were using linear search we likely wouldn’t have even searched through just the “a” section. By now you can likely see the time saved by using binary search.
1. Determine the midpoint in the sorted collection and grab the corresponding element.
First, we find the length of our array by calling
array.length . It’s important to note, the return value of
array.length is one-indexed, but arrays are zero-indexed. This means we need to decrement the value returned by
array.length before dividing it by 2 to find the halfway point. We can use
parseInt to round that value to a whole number, thus ensuring it can be used to index our array. Finally, we grab the element at that index.
2. Is the midpoint element greater than or less than our target? If it is greater we can disregard the back half of the collection. If it is lesser we disregard the front half. If the midpoint element matches our target than we’ve succeeded and our search is over.
Thankfully, most programming languages are able of comparing values regardless of whether they are chars, strings, ints, floats, etc. So we just need to set up an
if...else if...else if statement as a logic gate for our function.
3. Take the remaining half and start again from step one, continue until the target is found.
So how do we fill the inside of our
if...else if...else if statement? This would be a great place to utilize recursion since we know we’re repeating the same steps until the target will be found. But if the function will call itself recursively we need to refactor the function so it knows which section of the array to search.
Let’s include two indices as additional arguments. One will specify where in the array to start searching, and the other will specify where to stop searching. We can use these indices to find the midpoint of our search area as well, removing the need to call
array.length which will allow us to recursively call the function while keeping track of what section of the array to search.
Okay, so we’ve refactored our existing code to allow for recursion. Now it’s time to call the function recursively inside each logic gate.
If the middle element is greater than the target we can disregard the middle element and everything after it in our search. We recursively invoke
binarySearch() with the same array, target, and start index. Our search will go up to the element preceding our current midpoint, so we pass in
mid — 1 as the value for
If the middle element is less than the target we can disregard the middle element and everything before it. Once again this means a recursive invocation of
binarySearch() but this time the stop index will stay the same. Our search now starts at the element after the midpoint, so we pass
mid + 1 as the value for
Put that all together and it looks something like this:
We’re almost done, but there’s one loose end we need to tie up. What if our target does not exist in the collection we’re searching? Eventually, we’re going to end up with an array of a single element which will cause us to recursively call our search function with a start index greater than the stop index. Not only does this make zero sense, but it also causes us to get stuck in an infinite loop, quickly followed by exceeding the stack size limit and our function crashing.
This can be fixed quickly by adding a conditional to ensure that our start index is smaller than the stop index. If this is not the case,
-1 is returned in order to signal that the target does not exist in the given collection.
One Last Refactor
The binary search function can also be written iteratively instead of recursively. The logic functions the same, just with a different syntax. Instead of using an
if...else statement to prevent an infinite loop we can use a
while statement with the same logical condition. From there we adjust the indices directly until a match is found.
Sources and further reading: