Word Break

Ethan Davis
Data Structures and Algorithms DSA
2 min readApr 24, 2024
Data Structures and Algorithms

Statement

Given a string s, and a dictionary of strings d, check if s can be segmented into space-separated sequence of one or more dictionary words. If yes, then return true, otherwise return false. The same word in the dictionary may be used multiple times.

Constraints

  • 1 ≤ s.length() ≤ 250
  • 1 ≤ d.length() ≤ 1000
  • 1 ≤ d[i].length() ≤ 20
  • s and d[i] consist of only lowercase English characters

Solution

"""
production algorithm
"""


class Solution:
def word_break(self, string, dictionary):
"""
time complexity O(n^3)
space complexity O(n)
"""
n = len(string)
memo = [True if k == 0 else None for k in range(n + 1)]

for k in range(1, n + 1):
memo[k] = False
for i in range(k):
if string[i:k] in dictionary and memo[i]:
memo[k] = True

return memo[n]
"""
unit tests
"""

from unittest import TestCase
from algorithms.dynamic_programming.word_break.solution import Solution


class SolutionTestCase(TestCase):
def test_existent_solution(self):
# Given
string = "catsanddogsandbirds"
dictionary = {"cats", "and", "sand", "dogs", "birds"}
solution = Solution()

# When
actual = solution.word_break(string, dictionary)

# Then
expected = True
self.assertEqual(actual, expected)

def test_nonexistent_solution(self):
# Given
string = "catsandogsandbirds"
dictionary = {"cats", "and", "sand", "dogs", "birds"}
solution = Solution()

# When
actual = solution.word_break(string, dictionary)

# Then
expected = False
self.assertEqual(actual, expected)

Strategy

Dynamic Programming.

Explanation

A 1-D matrix, whose dimension is the length of the string, is maintained to hold the solution for lengths of the string seen so far. In the trivial case, where the length of the string is 0, the solution is true. Then, iteration of the length of the string occurs. The solution for smaller lengths of the string is used to find the solution for larger lengths of the string. Iteration ends when the solution has been found for the length of the input string.

Notice, that the problem has optimal substructure, i.e. the solution of larger lengths of the string can be found from smaller lengths of the string. Also notice, that the problem has overlapping subproblems, i.e. the solution of a smaller length of the string can be reused to find the solution of multiple larger lengths of the string. Since the problem has optimal substructure and overlapping subproblems, then dynamic programming can be used to solve the problem.

Time Complexity

The algorithm iterates the length of the string. For each iteration, the algorithm also iterates the length of the current string’s substrings. Then, for each inner-iteration, the current substring is sliced to see if it’s in the dictionary. Therefore, the time complexity of the algorithm is O(n^3).

Space Complexity

The auxiliary space complexity depends on the size of the matrix which is n. Therefore, the auxiliary space complexity of the algorithm is O(n).

Links

--

--