Kth Largest Element in a Stream

Ethan Davis
Data Structures and Algorithms DSA
3 min readApr 3, 2024
Data Structures and Algorithms

Statement

Given a stream of integers sorted or unsorted, create a class to find the kth largest integer of the stream. It’s the kth largest integer in sorted order, not the kth distinct integer.

Constraints

  • 1 ≤ k ≤ 10³
  • 0 ≤ numbers.length() ≤ 10³
  • -10³ ≤ numbers[i] ≤ 10³
  • -10³ ≤ value ≤ 10³

Solution

"""
production algorithm
"""

from data_structures.heap.heap import Heap


class KthLargest:
def __init__(self, k, numbers):
"""
time complexity O(nlogk)
space complexity O(k)
"""
self.k = k
self.heap = Heap()
for i in range(len(numbers)):
self.add(numbers[i])

def add(self, value):
"""
time complexity O(logk)
space complexity O(k)
"""
self.heap.push(value)
if self.heap.size() > self.k:
self.heap.pop()
return self.heap.top()
"""
unit tests
"""

from unittest import TestCase
from algorithms.top_k_elements.kth_largest_element_of_stream.solution import KthLargest


class SolutionTestCase(TestCase):
def test_small_stream(self):
# Given
k = 7
nums = [7, 13, 29, 11, 23, 2, 19, 5, 21, 26, 17, 0, 14, 8, 25]
kth_largest = KthLargest(k, nums)

# When
actual = kth_largest.heap.top()

# Then
expected = 17
self.assertEqual(actual, expected)

def test_large_stream(self):
# Given
k = 7
nums = [34, 67, 250, 99, 301, 483, 76, 192, 456, 127, 348, 284, 13, 222, 401, 178, 69, 415, 230, 351, 82, 245, 300, 157, 60, 197, 412, 268, 91, 145, 377, 110, 44, 256, 309, 118, 35, 269, 325, 214, 275, 158, 71, 485, 48, 124, 205, 226, 286, 19,
342, 289, 223, 67, 434, 57, 143, 188, 241, 177, 122, 316, 159, 379, 248, 173, 31, 149, 96, 317, 95, 333, 290, 134, 245, 215, 400, 49, 392, 367, 14, 107, 284, 371, 173, 93, 483, 52, 199, 28, 259, 233, 64, 315, 445, 224, 161, 312, 389, 417]
kth_largest = KthLargest(k, nums)

# When
actual = kth_largest.heap.top()

# Then
expected = 417
self.assertEqual(actual, expected)

Strategy

Top K Elements.

Explanation

A min heap is used to hold the k largest integers. At the top of the min heap, is the kth largest integer of the stream. When a new integer comes in through the stream it gets added to the heap, and then the k+1th largest integer gets popped from the heap, keeping the kth largest integer at the top of the heap.

Time Complexity

Initializing the class takes an integer k and an initial integer list numbers of size n. The min heap is filled up to size k with the integers of numbers. At worst, the min heap reaches its maximum size k and then integers of numbers continue to be pushed to and popped from the heap. Overall, the worst case time complexity to initialize the class is O(nlogk).

After initialization, the more integers of the stream get pushed to and popped from the heap. Since the maximum size of the heap is k, then the time complexity to operate on just one integer with the min heap is O(k).

Space Complexity

The auxiliary space complexity depends on the size of the heap which is k. Therefore, the auxiliary space complexity of the algorithm is O(k).

Links

--

--