Minimum Window Subsequence

Ethan Davis
Data Structures and Algorithms DSA
3 min readMay 24, 2024
Data Structures and Algorithms

Statement

Given two strings s1 and s2 find the shortest substring of s1 such that s2 is a subsequence of that substring.

A substring is a contiguous sequence of characters in a string. A subsequence is a sequence of characters derived from another sequence of characters by deleting zero or more characters and without changing the order of the remaining characters.

For example, if s1 = abbcb and s2 = ac, then the shortest substring of s1 such that s2 is a subsequence of that substring is abbc.

If there’s no substring of s1 that covers all characters of s2, then return an empty string.

If there’s multiple minimum length substrings that meet the subsequence requirement, then return the substring with the lowest starting index.

Constraints

  • 1 ≤ s1.length() ≤ 2 × 10³
  • 1 ≤ s2.length() ≤ 100
  • s1 and s2 consist of uppercase and lowercase English letters

Solution

"""
production algorithm
"""


class Solution:
def minimum_window_subsequence(self, string1, string2):
"""
time complexity O(mn)
space complexity O(1)
"""
m, n = len(string1), len(string2)
i, j = 0, 0
length, result = float("inf"), str()

# step forward
while i < m:
if string2[j] == string1[i]:
j = j + 1
i = i + 1

# found match
if j == n:
high = i
i, j = i - 1, j - 1

# step backward to start of match
while 0 <= j:
if string2[j] == string1[i]:
j = j - 1
i = i - 1

i, j = i + 1, j + 1
low = i

# found new minimum window subsequence match
if high - low < length:
length, result = high - low, string1[low:high]

i = i + 1

return result
"""
unit tests
"""

from unittest import TestCase
from algorithms.sliding_window.minimum_window_subsequence.solution import Solution


class SolutionTestCase(TestCase):
def test_zero_matches(self):
# Given
string1 = "hzcxyvksqwmfzjpturlo"
string2 = "pqr"
solution = Solution()

# When
actual = solution.minimum_window_subsequence(string1, string2)

# Then
expected = str()
self.assertEqual(actual, expected)

def test_one_match(self):
# Given
string1 = "hzcpyqrsqwmfzjpturlo"
string2 = "pqr"
solution = Solution()

# When
actual = solution.minimum_window_subsequence(string1, string2)

# Then
expected = "pyqr"
self.assertEqual(actual, expected)

def test_two_matches_return_first(self):
# Given
string1 = "hzcpyqrsqwmfzjptqrlo"
string2 = "pqr"
solution = Solution()

# When
actual = solution.minimum_window_subsequence(string1, string2)

# Then
expected = "pyqr"
self.assertEqual(actual, expected)

def test_two_matches_return_second(self):
# Given
string1 = "hzcpyqrsqwmfzjtpqrlo"
string2 = "pqr"
solution = Solution()

# When
actual = solution.minimum_window_subsequence(string1, string2)

# Then
expected = "pqr"
self.assertEqual(actual, expected)

def test_two_matches_overlap(self):
# Given
string1 = "hzcppqrsqwmfzjpturlo"
string2 = "pqr"
solution = Solution()

# When
actual = solution.minimum_window_subsequence(string1, string2)

# Then
expected = "pqr"
self.assertEqual(actual, expected)

def test_worst_case(self):
# Given
string1 = "aaaaaaaaaaaaaaaaaaaa"
string2 = "aaa"
solution = Solution()

# When
actual = solution.minimum_window_subsequence(string1, string2)

# Then
expected = "aaa"
self.assertEqual(actual, expected)

Strategy

Sliding Window.

Explanation

Step forward with the character index of s1. If the current character of s2 equals the current character of s1, then also step forward with the character index of s2.

If the end of s2 is reached, then step backward with the character index of s1. If the current character of s2 equals the current character of s1, then also step backward with the character index of s2. When the start of s2 is reached, then record the shortest substring of s1 such that s2 is a subsequence of that substring.

Finally, repeat until the end of s1 is reached.

Time Complexity

In the worst case both s1 and s2 consist of the same repeated character. Then for each character of s1, an iteration for each character of s2 occurs. Therefore the time complexity of the algorithm is O(mn).

Space Complexity

The auxiliary space complexity of the algorithm is O(1).

Links

--

--