Binging Binary Search š
Data Structures and Algorithms(DSA) is one of the essential tools which every engineer should have in his/her arsenal. Generally, they are taught in the first year of engineering but, in my first year, I was busy in making ābinary treesā š. Because of that, I couldnāt focus on learning and started blindly toying with them. But for 2018, Iāve made a resolution that I will tame this demon of DSA and will not let it haunt me every time I try to step out of my comfort zone. So Iāve started studying DSA and it looks like itās a long journey but as we know:
āThe journey of a thousand miles begins with a single stepā.
For me that single step is Binary Search, so letās get started with it.
Searching
Searching/finding is the most elementary task that we do on daily basis e.g. searching a name in the phone book, searching a book on the bookshelf. The way we search an item majorly depends on how items are kept/arranged. For example, on a bookshelf, if books are kept haphazardly then one may have to check every book till he/she finds the desired book. But in case of the phone book, one can directly jump to the initials of the name.
In computer science, we are a handful of search algorithms. Some popular elementary search algorithms are Linear/Sequential Search, Binary Search, Interpolation Search, Ternary Search etc. Every search algorithm has itās own pros and cons. In this post weāll discuss a bit about Linear Search and then weāll dive into Binary Search.
Linear/Sequential Search
In Linear Search as the name suggests we linearly/sequentially compare each item of the collection to the key item. We stop searching when we find the item else we continue searching until all the items are compared. Letās take an example to understand it. Suppose we have been given a list of employee numbers in ascending order present on a particular day. We need to check whether employee 5 was present or not. We can leverage Linear Search to figure it out. We compare every id/number in the list with 5. If itās found then we say that employee 5 was present else weāll say employee 5 was absent. Hereās the implementation in javascript:
By looking at the number of comparisons we make to find the element using Linear Search, we can say that itās time complexity is O(n) (You can read more about time complexity in my previous post An Ode to Asymptotic Notation) which means time taken by Linear Search is directly proportional to the size of the input.
We can conclude that Linear Search is good especially when input size is small but itās sluggish when the input is relatively large. Also, it doesnāt take advantage of the fact that input i.e. employees list is sorted in ascending order. Letās see if we can do better with Binary Search.
Binary Search
Binary Search is based on Divide and Conquer Technique. According to CLRS, Divide and Conquer technique is comprised of followings:
Divide: Break or divide the problem into the number of subproblems which are smaller instances of the same problem.
Conquer: Continue breaking the subproblems into smaller subproblems. If the subproblem size is small enough then solve it in a straightforward manner.
Combine: the solutions of the subproblems into the solution for the original subproblems.
Binary Search repeatedly divides the problem into two subproblems that may contain the item. It discards one of the subproblems by utilising the fact that items are sorted. It continues halving the subproblems until it finds the item or it narrows down the list to a single item. Since binary search discards the subproblems because of that itās pseudo Divide & Conquer algorithm. Hereās the pseudocode for binary search:
1. LET A is the list of items of size N and k is the key item.
2. LET min = 1 and max = N.
3. Repeat step 4, 5, 6 TILL min <= max.
4. LET mid = (min + max) / 2.
5. IF A[mid] === k THEN RETURN 'Item Found.'
6. IF A[mid] < key THEN SET min = mid + 1 ELSE SET max = mid - 1.
7. RETURN 'Item not found'.
Hereās the JavaScript implementation for the same:
The beauty of binary search is that in every iteration if it doesnāt find the key item, then it simply reduces the list to at least itās half for the next iteration which makes a huge difference when we talk about its time complexity. Letās consider worst-case scenario for an array of size 32. In the first iteration, we do not find the key element so we cut down the array length to 16. In next iteration, again we do not find the key element so again we cut it down to 8, then to 4, then to 2, and then to 1. Hereās the visual representation of Binary Search and Linear Search:
Complexity
The best case of binary search is when the first comparison/guess is correct(the key
item is equal to the mid
of array). It means, regardless of the size of the list/array, weāll always get the result in constant time. So the best case complexity is O(1).
In worst case scenario, the binary search finds the item at the end of the list/array when itās narrowed down to a single item. As we saw earlier with the list of size 32, to narrow it down to a single item we make at least 5 comparisons.
32/2 --> 16 // 1
16/2 --> 8 // 2
8/2 --> 4 // 3
4/2 --> 2 // 4
2/2 --> 1 // 5Total = 5 guesses/comparisons.
If we double the size to 64 then only one more comparison is extra made.
64/2 --> 32 // 1
32/2 --> 16 // 2
16/2 --> 8 // 3
8/2 --> 4 // 4
4/2 --> 2 // 5
2/2 --> 1 // 6Total = 6 guesses/comparisons.
If we look at the comparison and size of the input we see a pattern here eg. 32 = 2āµ and 64 = 2ā¶. Letās assume n
is the size of input and k
is the number of comparisons then we can say that
Letās apply logarithm to both sides with base 2(you can read more about logarithms here).
By the help of the equation above we can make a blanket statement that the worst case complexity of binary search is O(log n).
Limitations
- Binary search can be applied iff items are sorted otherwise one must sort items before applying it. And we know that sorting is relatively an expensive operation.
- Binary search can only be applied to data structures which allow direct access to elements. If we do not have direct access then we can not apply binary search on it eg. we can not apply binary search to Linked List.
Conclusion
Binary search is a powerful tool to find an element in a large collection especially when the collection is sorted and allows direct access to its element. We learned that for best-case binary search runs with constant time i.e. O(1) while for average and worst-case its runtime is O(log n).