Minimum Window Substring

Ethan Davis

Published in

Data Structures and Algorithms DSA

3 min readMay 25, 2024

Statement

Given two strings p and q find the minimum substring of p such that it has the following properties:

It’s the shortest substring of p that includes all the characters of q
It contains at least the same frequency of each character of q
The order of the characters doesn’t matter

Constraints

p and q consist of uppercase and lowercase English characters
1 ≤ p.length(), q.length() ≤ 10³

Solution

"""
production algorithm
"""

from collections import defaultdict, Counter


class Solution:
    def minimum_window_substring(self, string1, string2):
        """
        time complexity O(n)
        space complexity O(1)
        """
        if len(string1) < len(string2):
            return str()

        m, n = len(string1), len(string2)
        low, high = 0, 0
        frequencies, visited, total = Counter(string2), defaultdict(int), 0
        length, result = float("inf"), str()

        # step foward high index
        for high in range(m):
            if string1[high] in frequencies:
                if visited[string1[high]] < frequencies[string1[high]]:
                    total += 1
                visited[string1[high]] += 1

            # found match
            if total == n:

                # step forward low index
                while total == n:
                    if string1[low] in frequencies:
                        if visited[string1[low]] == frequencies[string1[low]]:
                            total -= 1
                        visited[string1[low]] -= 1
                    low += 1

                # found local minimum window substring match
                if high - low + 2 < length:
                    length = high - low + 2
                    result = string1[low-1:high+1]

        return result

"""
unit tests
"""

from unittest import TestCase
from algorithms.sliding_window.minimum_window_substring.solution import Solution


class SolutionTestCase(TestCase):
    def test_shorter_first_string(self):
        # Given
        string1 = "azxeytjmnkwhzbdloifg"
        string2 = "azxeytjmnkwhzbdloifgazxeytjmnkwhzbdloifg"
        solution = Solution()

        # When
        actual = solution.minimum_window_substring(string1, string2)

        # Then
        expected = str()
        self.assertEqual(actual, expected)

    def test_zero_matches(self):
        # Given
        string1 = "azxeytjmnkwhzbdloifg"
        string2 = "aaaaaaaaaaaaaaaaaaaa"
        solution = Solution()

        # When
        actual = solution.minimum_window_substring(string1, string2)

        # Then
        expected = str()
        self.assertEqual(actual, expected)

    def test_one_match(self):
        # Given
        string1 = "azxeytpzqrwhzbdloifg"
        string2 = "pqr"
        solution = Solution()

        # When
        actual = solution.minimum_window_substring(string1, string2)

        # Then
        expected = "pzqr"
        self.assertEqual(actual, expected)

    def test_two_matches_return_frist(self):
        # Given
        string1 = "azpzqrjmnkwhzpzqrifg"
        string2 = "pqr"
        solution = Solution()

        # When
        actual = solution.minimum_window_substring(string1, string2)

        # Then
        expected = "pzqr"
        self.assertEqual(actual, expected)

    def test_two_matches_return_second(self):
        # Given
        string1 = "apzqrtjmnkwhzbpqrifg"
        string2 = "pqr"
        solution = Solution()

        # When
        actual = solution.minimum_window_substring(string1, string2)

        # Then
        expected = "pqr"
        self.assertEqual(actual, expected)

    def test_worst_case(self):
        # Given
        string1 = "aaaaaaaaaaaaaaaaaaaz"
        string2 = "z"
        solution = Solution()

        # When
        actual = solution.minimum_window_substring(string1, string2)

        # Then
        expected = "z"
        self.assertEqual(actual, expected)

Strategy

Sliding Window.

Explanation

Before iteration of the characters of the first string, create character frequency hash maps for the first and second strings. Low and high indexes are used to record a window of characters of the first string throughout iteration.

For each iteration of the characters of the first string, bump the high index, and record the frequency of the character at the high index. If all character frequencies of the second string have been iterated in the window of the first string, then bump the low index, until the character frequencies in the window of the first string and the character frequencies of the second string no longer match. Record the window of characters of the first string if it’s the shortest substring visited so far.

After iteration of the characters of the first string, return the shortest substring recorded.

Time Complexity

At worst the low and high indexes iterate the length of the first string. Therefore the time complexity of the algorithm is O(n).

Space Complexity

The auxiliary space complexity of the algorithm is O(1) since at most 52 characters are kept in the character frequency hash maps.

Minimum Window Substring

Statement

Constraints

Solution

Strategy

Explanation

Time Complexity

Space Complexity

Links

data_structures_algorithms/algorithms/sliding_window/minimum_window_substring at main ·…

Contribute to davisethan/data_structures_algorithms development by creating an account on GitHub.

Written by Ethan Davis