Leetcode 159 — Longest Substring with At Most Two Distinct Characters

Yanxi Chen
LeetNotes
Published in
2 min readFeb 21, 2019

The problem can be found here. This is essentially a duplicate for Leetcode 904 — Fruit Into Baskets.

With this problem, I try to take my solution (which takes O(n) time and O(n) space) and optimize it.

Dynamic Programming Inspired

An important lesson I’ve learned from trying to tackle DP problems on my own is that very often, we do not have to take a greedy approach as we do for the knapsack problem, where the last updated result is the final result. Instead, we can record a result at every index, and at the last step, take the maximum of all the results, and still keep the time complexity at O(n).

For our problem, my initial thought is to keep a record of the length of the longest substring ending at each index (including that index) as we iterate through the input string, and finally take the maximum of all records. Essentially, I’m treating it like the Maximum Subarray question.

At each new index, we use a hashmap to keep track of the characters we’ve seen, as well as the most recent index of the characters we’ve seen. This way we know where to break off the substring when the new character we encounter exceeds the 2 distinct character limit.

The code can be found below:

chs = {}
dp = [0 for _ in s]
for i, ch in enumerate(s):
if ch in chs:
chs[ch] = i
dp[i] = dp[i-1] + 1
else:
if len(chs) < 2:
dp[i] = dp[i-1] + 1
else:
break_i, break_ch = float('inf'), ''
for key in chs.keys():
if chs[key] < break_i:
break_i, break_ch = chs[key], key
dp[i] = i - break_i
del chs[break_ch]
chs[ch] = i
return max(dp) if dp else 0

DP Optimization (Sliding Window)

You may notice that the way we accumulate the dp array in the solution above stays consistent; we simply increment it by 1 until we see a character that exceeds the 2 distinct characters, at which point we update the length to count from the leftmost index that satisfies the 2 distinct character constraint. Therefore it is not necessary to keep the dp array at all; we can simply keep a maximum result, and compare results at each breakpoint against it and update it if necessary.

The optimized code is as follows:

chs = {}
max_length, curr_length = 0, 0
for i, ch in enumerate(s):
if ch in chs:
chs[ch] = i
curr_length += 1
else:
if len(chs) < 2:
curr_length += 1
else:
break_i, break_ch = float('inf'), ''
for key in chs.keys():
if chs[key] < break_i:
break_i, break_ch = chs[key], key
max_length = max(max_length, curr_length)
curr_length = i - break_i
del chs[break_ch]
chs[ch] = i
return max(max_length, curr_length)

As it turns out, this is what they call a “Sliding Window” approach. Because we no longer keep a record at each index, the space complexity has been reduced from O(n) to O(1).

--

--