Kth Largest Element in a Stream
Statement
Given a stream of integers sorted or unsorted, create a class to find the kth
largest integer of the stream. It’s the kth
largest integer in sorted order, not the kth
distinct integer.
Constraints
- 1 ≤
k
≤ 10³ - 0 ≤
numbers.length()
≤ 10³ - -10³ ≤
numbers[i]
≤ 10³ - -10³ ≤
value
≤ 10³
Solution
"""
production algorithm
"""
from data_structures.heap.heap import Heap
class KthLargest:
def __init__(self, k, numbers):
"""
time complexity O(nlogk)
space complexity O(k)
"""
self.k = k
self.heap = Heap()
for i in range(len(numbers)):
self.add(numbers[i])
def add(self, value):
"""
time complexity O(logk)
space complexity O(k)
"""
self.heap.push(value)
if self.heap.size() > self.k:
self.heap.pop()
return self.heap.top()
"""
unit tests
"""
from unittest import TestCase
from algorithms.top_k_elements.kth_largest_element_of_stream.solution import KthLargest
class SolutionTestCase(TestCase):
def test_small_stream(self):
# Given
k = 7
nums = [7, 13, 29, 11, 23, 2, 19, 5, 21, 26, 17, 0, 14, 8, 25]
kth_largest = KthLargest(k, nums)
# When
actual = kth_largest.heap.top()
# Then
expected = 17
self.assertEqual(actual, expected)
def test_large_stream(self):
# Given
k = 7
nums = [34, 67, 250, 99, 301, 483, 76, 192, 456, 127, 348, 284, 13, 222, 401, 178, 69, 415, 230, 351, 82, 245, 300, 157, 60, 197, 412, 268, 91, 145, 377, 110, 44, 256, 309, 118, 35, 269, 325, 214, 275, 158, 71, 485, 48, 124, 205, 226, 286, 19,
342, 289, 223, 67, 434, 57, 143, 188, 241, 177, 122, 316, 159, 379, 248, 173, 31, 149, 96, 317, 95, 333, 290, 134, 245, 215, 400, 49, 392, 367, 14, 107, 284, 371, 173, 93, 483, 52, 199, 28, 259, 233, 64, 315, 445, 224, 161, 312, 389, 417]
kth_largest = KthLargest(k, nums)
# When
actual = kth_largest.heap.top()
# Then
expected = 417
self.assertEqual(actual, expected)
Strategy
Top K Elements.
Explanation
A min heap is used to hold the k
largest integers. At the top of the min heap, is the kth
largest integer of the stream. When a new integer comes in through the stream it gets added to the heap, and then the k+1th
largest integer gets popped from the heap, keeping the kth
largest integer at the top of the heap.
Time Complexity
Initializing the class takes an integer k
and an initial integer list numbers
of size n
. The min heap is filled up to size k
with the integers of numbers
. At worst, the min heap reaches its maximum size k
and then integers of numbers
continue to be pushed to and popped from the heap. Overall, the worst case time complexity to initialize the class is O(nlogk)
.
After initialization, the more integers of the stream get pushed to and popped from the heap. Since the maximum size of the heap is k
, then the time complexity to operate on just one integer with the min heap is O(k)
.
Space Complexity
The auxiliary space complexity depends on the size of the heap which is k
. Therefore, the auxiliary space complexity of the algorithm is O(k)
.