# Binary Search — Find K-th Smallest Pair Distance

LeetCode: https://leetcode.com/problems/find-k-th-smallest-pair-distance/

Given an integer array, return the k-th smallest **distance** among all the pairs. The distance of a pair (A, B) is defined as the absolute difference between A and B.

Letnumsbe the input array,kbe the targeted pair, andnbe the size of the input arraynums = [1, 3, 1], k = 2 -> d = 2

Explanation: (1, 1), (1, 3), (1, 3) or (3, 1)nums = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2], k = 5 -> d = 0

Explanation: Of all the 45 pairs, know there are 12 pairs' distance are 0 and k < 12 so d = 0.

# Solution 1:

- Sort the input array
- Get numbers’ frequencies (will run faster if there are a lot of repeating numbers)
- Store n-1 sorted lists in a
**binary heap**(Refer to the explanation below) - Count how many pairs’ distance are 0 (repeating numbers).
**Merge**n-1 sorted lists- Use a while loop to
**pop**the smallest distance**push**the next smallest one from the same array onto binary heap if there is any element left - Return the
**k-th**smallest pair’s distance

Explanation: If we look at the 2nd example, we know that once the array is sorted the closest distance for a number at i-th index will always be (nums[i + 1] - nums[i]). We also know that there are(n - 1) * n / 2possible pairs. Each number but the last one in the array can pair with all the numbers after them. Knowing this fact, we can think of this problem as mergingn-1sorted lists. The list represents the distance of all the possible pairs that start from a particular number.1st list: 0, 0, 0, 1, 1, 1, 2, 2, 2 (@nums[0])

2nd list: 0, 0, 1, 1, 1, 2, 2, 2 (@nums[1])

.

.

.

6th list: 0, 1, 1, 1 (@nums[5])

.

.

.

9th list: 0 (@nums[8])

# Complexity Analysis:

Time Complexity:O((n + k)log(n))Sorting the array will takeO(n*log(n))and each iteration in the while loop to pop the smallest distance will takelog(n)as we always have at most(n-1)items in the binary heap. In the worst case, when there is no repeating number andkis too large,kwill be very close ton^2as the number of possible pairs =(n - 1) * n / 2 = (n^2 - n) / 2. Therefore, the runtime would actually becomeO(n^2log(n))for the worst case. Thenfromis dominated by the(n + k)k.Space Complexity:O(n)The memory required for this solution is directly proportional to the input size as we need an array with size of(n-1)to store binary heap.

Now let’s see if we can optimize the runtime and maybe space as well…

# Solution 2:

The goal is to improve the k from(n + k)in solution 1’s runtime. Sorting will always takeO(nlog(n))so we know that there is nothing we can do to improve n. Let's focus on k here. Obviously, we want to prevent it from becomingn^2when it gets too large, to solve the question without the need to go over k pairs in other words. If we look at the numbers' correlation after sorting, we realize that we can immediately get the maximum distance(nums[-1] - nums[0])in constant time. We cannot really get the minimum distance in constant time. However, we know for a fact that the smallest it could be is 0. Thus, we know the distance we're looking for will certainly fall in the range of 0 to(nums[-1] - nums[0]), and so far gathering these informations only takesO(nog(n)).nums = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2], k = 25Let's use this example again. Now instead of using binary heap and popping k smallest elements, which the runtime will depend on k, we can try another approach with the informations we just get. The maximum distance is 2 so we know the all the possible distances are[0, 1, 2]. WHAT IFwe can efficiently calculate how many pairs' distance are <= to a specific distance we choose from the range above. This is where the window sliding technique comes in handy, having two pointers, one fast and the other slow, to go through the entire array in linear time. For instance, we start from distance 1. See the illustration below. 1. Start from the 0th index 0 0 0 0 1 1 1 2 2 2

i j2. Increment j untilnums[j] - nums[i] > kand decrementjby 1 0 0 0 0 1 1 1 2 2 2

i j3. # of pairs <= 1 @i-thindex =(j - i + 1)4. Incrementiby 15. Repeat 2-5 againThis window sliding technique will run2*nto get the result but onlynwill change proportionally based on the input size so this technique will only takeO(n). Now we have all the possible distances and a efficient way to calculate# of pairs <= d. It is only searching left to do. Given a sorted array (distance array), we can perform a binary search that will only take O(log(n)) rather than O(n) for linear search. In each search, calculating # of pairs <= the selected distance takesO(n). Therefore, letdbe the size of the distance array. The totalruntimecomes toO(nlog(n))for sorting plusO(nlog(d))for searching. As for thememory, it will be optimized toO(1)since no extra memory is needed.

That sounds like a perfect solution for this problem, BUT… can we do better than that???

# Optimizing Window Sliding

Previously, we mention that window sliding takes O(n) and we know there is no way we could bring it down to constant time. So the question is can we let it run slightly faster? Counter intuitively, what if we count # of pairs > the distance we choose instead of <=?

Say the distance is 1.1.istarts from the 0th index 0 0 0 0 1 1 1 2 2 2i,j2. Only incrementjwhennums[i] - nums[j] > 1and increment the total count by(n - i)every timejis incremented 0 0 0 0 1 1 1 2 2 2j i3.istops @ 4th index 0 0 0 0 1 1 1 2 2 2

j iBoth window sliding methods will runniterations but the difference is the 2nd pointer. In method 1, the fast pointerjstops @ 6th index wheni = 0and eventually go up to the last index which also takesniterations in total. On the other hand, for method 2, the slow pointerjstops @ 4th index and never gets incremented. Both are very optimal approach but generally speaking, method 2 will run slightly faster than method 1.

- Initialize the left and right bounds as 0 and len(nums)-1
- Use a while loop to binary search the distance
- Find # of pairs with distance <= the middle one

**Please leave a comment below if you have any suggestion/thought!**

**Thanks for reading :)**