How to approach an algorithms/coding interview question

Published in

kesisenyollardernegi

4 min readMar 11, 2022

You are given an integer array nums and an integer k. Implement a function that returns the kth largest element in the nums array.

To better understand the task at hand, we should start by learning the constraints. This problem has at least three constraints:

a range for k
a range for n (length of nums)
a range for nums[i] (elements of nums)

These constraints are important because a tight constraint on one of the variables above may enable us to implement a faster or much simpler algorithm.

Let’s assume the interviewer reveals that n will be less than 10,000, nums[i] can be any integer, and k can be anything less than or equal to n.

Two most naive solutions that come to mind are:

Find the largest element. Remove it from the array. Repeat that k times and return the last found element.
Sort the array and return kth element from the end.

Two main components (“Find the largest element” and “Remove it from the array”) of solution (1) are both linear operations. We can reduce the complexity of “Remove it from the array” to O(1) by swapping the largest element with the last element and then popping the last element. But even then, the overall complexity will still be O(kn). Since we don’t have a tighter constraint on k than the one we have on n, our worst case complexity will be O(n²).

What about the second solution? Sorting the array is O(n log n) and accessing the n — kth element is O(1). So, we have O(n log n) complexity for the entire solution. On top of that, most languages have a built-in mechanism to sort integer arrays. So, the resulting code is a very clear one-liner. It looks something like this in Python:

With this solution, we are creating a sorted copy of the nums array. So, we need O(n) extra space. Now is a good time to ask if we are allowed to modify the original nums array. If we are, then we can do the sorting in place and get rid of that extra O(n) space complexity.

We should also ask the interviewer what result is expected when the nums array is empty. Because our code will clearly fail in that case. We may get a guarantee about nums never being empty or we may be asked to handle that case separately.

At this point, the interviewer may ask if we can do any better than this, if k is guaranteed to be much smaller than n. If you think about it, we don’t need the entire array sorted. We only need the largest k elements sorted (smallest of them is the result). We can use a min-heap to keep the largest k elements (seen so far) sorted as we go through the nums array. Pushing an element into and popping the minimum element from a min-heap are both O(log k) operations (assuming that the min-heap has k elements in it).

You can learn more about heaps from here and you probably should because the interviewer may ask some follow up questions around how heaps achieve these time complexities.

With this solution, we go through the array and for each element, we do at most 2 heap operations. Which makes our time complexity O(n log k). We will also need O(k) extra space for the heap.

At this point, we a have a decent solution. If you still have time, the interviewer may further ask you if you can do any better than that? Well, if you think about it (again), we don’t even need the largest k elements sorted. We just need to know the largest k elements. They don’t have to be sorted. Because their minimum will be the result. If we can find the largest k elements in less than O(n log k) time, then we will have a faster solution.

An algorithm known as quickselect can do that. The idea is not that hard. We pick a random element (called pivot) from the array. Then, we partition array into three:

elements that are greater than the pivot
elements that are equal to the pivot
elements that are less than the pivot

If the kth index falls into partition (1), then the result should be larger than the pivot. So, we should narrow our search space into partition (1).
If the kth index falls into partition (3), then the result should be smaller than the pivot. So, we should narrow our search space into partition (3).
If the kth index falls into partition (2), then the result is equal to the pivot.

We keep narrowing down our search space by the algorithm above. If we have 1 element left in our search space, then it should be our result.

With this quickselect approach, we reduce the search space at every step. To keep things simple, let’s assume we half the search space at each step. Each step’s complexity is linear. So, we have n + n/2 + n/4 + n/8 + … = n (1 + 1/2 + 1/4 + 1/8 + …) = 2n operations in total. This gives us O(n) average time complexity. Every operation is in-place. So, we have O(1) extra space complexity.

But this solution has a drawback. If we pick our pivot badly, we may reduce our search space just by one element at each step. That means n + n-1 + n-2 + n-3 + … = n(n-1)/2 operations. So, the worst case time complexity of quickselect becomes O(n²).

To accommodate this problem, there are better algorithms to use for pivot selection. A popular one is median of medians which is explained here and here. A high level understanding is highly recommended in case your interviewer wants you to back your linear time complexity claim.

Notes

Three-way partitioning used in quickselect is better explained here

How to approach an algorithms/coding interview question

Notes

Written by Can Guler