Algorithm Question Categories — Part 9. Binary Search

Aqua Education
Web Architects
Published in
5 min readJan 9, 2024

In How I Prepare for Algorithm Interviews, I talked about the importance of categorizing the questions. This is the ninth part of my Algorithm Question Categories series. In this article, I want to talk about binary search questions.

Before we start this category, I should make it clear that it’s very important that you should distribute your time evenly for all categories rather than only focusing on a few of them. The goal of practicing is not to increase the chance of encountering a question you practiced before, but rather to get familiar with the categories and the tactics so that you can quickly identify the category of a question and apply the tactics accordingly.

Binary search questions might be the category of questions most people tend to overlook or underestimate. That’s either because you might not run into a lot of binary search questions or because you think binary search questions are easy. The code structure is simple, you define two variables i and j at each end of an array respectively. You defined a while loop that terminates when i and j pass each other, and in the while loop, you compare the middle number with the target, if they equal, you find the target. If the middle number is greater, the target is in the left half and you update j. Otherwise, the target is in the right half and you update i. That’s indeed a very simple piece of code. But if you have seen enough binary search questions, you wouldn’t think binary search questions are easy. In the following, I’m going to go over three binary search questions, each representing a level of difficulty, from easy to difficult.

Level 1. Find whether a number is in a sorted list.

This is the easiest level. It can be solved using the original form of binary search, i.e. if the middle number is equal to the target, you find the number and return. If the middle number is greater than the target, you look into the left part. Otherwise, you look into the right half. An example is Search a 2D matrix.

Level 2. Find the insertion location.

This type of binary search asks you directly or indirectly about the insertion location of a number within an array. The tricky part of Level 2 is the number might already exist in the array, so you need to pick the insert location before or after all its existing occurrences according to the requirement. Let’s go over the question Find first and last position of element in sorted array. This is a perfect example as it asks you to find the first and last position of a number in a sorted array using O(lgn). Basically, you need to apply binary search twice. The first time, you find its insertion location before all its existing occurrences in the array, and the second time find the insertion location after all its occurrences. Whether it’s finding the insertion location before or after existing occurrences only differs at one place in the code, i.e. whether you should move i or j when the middle number equals to target. If you want to find the insertion location before, you update j. Otherwise, you update i. When the while loop terminates, i is the insertion location. You should memorize this as it can come in handy during your coding interview. The code difference is shown below.

# Insert before all existing occurrences
while i<=j:
m = (i+j)//2
if nums[m] >= target: # <--------- greater than or equal to
j=m-1
else:
i=m+1
# Insert after all existing occurrences
while i<=j:
m = (i+j)//2
if nums[m] > target: # <--------- greater than
j=m-1
else:
i=m+1

In the code for this question, you can create a helper method so you don’t need to write the binary search code twice. The helper method is shown below.

def findInsertionLocation(self, nums, target, isBefore):
l = len(nums)
i, j = 0, l-1

while i<=j:
m = (i+j)//2
if nums[m] > target or (nums[m]==target and isBefore):
j=m-1
else:
i=m+1

return i

Homework

Try to solve the question Random pick with weight and let me know if it’s insertion before or after.

Level 3

This level represents the hardest binary search questions. It’s difficult because it’s really hard to identify them as binary search questions at first glance. One great example is Capacity to ship packages within d days. The description is already convoluted enough. Without seeing the editorial, I’ll never be able to figure out that it can be solved by binary search. Now let’s dive into the solution. The first thing you need to figure out is if you sum all the weights, and use it as capacity, you can ship all of them in 1 day, so this sum can be used as the high in the binary search. How about the low? The second thing we need to figure out is that the capacity of the ship cannot be less than the max of the weights. So we use the max weight as the low. We also need to write a helper method that given the capacity and the weights, returns how many days it takes to ship. In the main method, we just need to do a binary search. Another tricky part of this question is the capacity is sorted in increasing order, but the days needed to ship are in decreasing order. Take weights = [3,2,2,4,1,4], days = 3 as an example, below is an illustration of the capacities and their corresponding days to ship.

As you can see, there might be multiple capacities that have the same days to ship, and you need to find the first location and the corresponding capacity is the answer we need. In this example, we should return 6. When comparing the middle days to ship with the target days, we can divide it into two situations. If the middle days to ship > target days, that means the capacity is not big enough, so we need to increase the capacity by updating low. If the middle days to ship ≤ target days, there might be lower capacity in the first half that has days to ship ≥ target days, so we should keep looking into the first half. It’s possible that the current middle capacity is the lowest one we should return and the first half doesn’t contain any lower capacity. This situation is also covered by our code as it will guarantee that low will eventually end at the correct capacity. Below is the code for the solution.

class Solution:
def shipWithinDays(self, weights: List[int], days: int) -> int:
high = sum(weights)
low = max(weights)
while low <= high:
m = (low+high) // 2
daysToShip = self.getDaysToShip(m, weights)
if daysToShip > days:
low = m+1
else:
high = m-1

return low

def getDaysToShip(self, capacity, weights):
days = 0
weight = 0
for w in weights:
if weight + w <= capacity:
weight+=w
else:
weight = w
days+=1
if weight > 0:
days+=1

return days

--

--