Leetcode 854 K-Similar Strings

4 min readJun 29, 2019

https://leetcode.com/problems/k-similar-strings/solution/

This is a leetcode hard question, and I spent one whole day on this one. I tried two different ways to solve it:

Cycle Decomposition by BFS and traceback
BFS in Search Space

The former one is the reason why it took me so many time, and in the end, I still failed on some edge cases. The latter is pretty straightforward once we can find the optimal choice. Even though I still want to take some time to explain what is cycle decomposition. I hope this is useful for you.

Cycle Decomposition (Failed)

There is a mathematical idea behind this question. Each pair of an anagram is a pair of permutation and can be visualized by using a graph. This graph is called a permutation graph. We can solve this question based on the mathematical idea, but it turned out the solution is very complicated and includes many edge cases. But it’s still worth to mention how we can do that.

For example, we have A='abacadc', B='bcdaaca'. Let each character be a vertex in a graph and connect the edge from A to B, and then we have edges: A[0]->B[0],A[1]->B[1] and so on. This can be drawn as the following:

If at position i, A[i] = B[i], then there should be a self-connected edge. Besides this, notice that for each cycle inside the graph, for example, a->b->c->a, it requires us to swap at least edge-1 times to remove all non-self-connected edge. This is an important observation we need.

To solve this question, we want to transform the whole graph so that there is no non-self-connected edge, which means we need to find an optimal way to break down the graph into different cycles and then the times of swapping we need are the sum of edge-1 in each cycle we get.

The process of break down the graph into cycles is called cycle decomposition

where Ck is the lengths of the cycles in some cycle decomposition.

To make the #swap as small as possible, we need to find a way to do cycle decomposition so that we can maximize the number of cycles in a cycle decomposition. And this leads to a possible solution to the answer.

The core idea is to use BFS and traceback to find a minimal-sized cycle inside the graph, and once we find one we remove the edges of that cycle in the graph and try to find the second one.

However, after investing over 5 hours into this method, I still have some edge cases that cannot be solved. One of them is how to deal with multiple same-direction edges and when to stop adding edge into the queue of BFS. Anyway, I find it is not trivial to solve them and its kind of impossible to use this method in a real interviewing setting.

Explore in Search Space (Succeed)

Still, we have A='abacadc', B='bcdaaca'.

We want to transform B into A, let i be the first index that A[i]!=B[i], and we need to find j so that B[j]==A[i] where i<j and B[j]!=A[j]. Then we swap B[i] and B[j], plus one to the step we take, and we are one or two steps closer to the target A. The whole process could be visualized in a search tree space.

We know that for an n-size string B, we at most need to swap n–1 time to convert it to A. And there could be finite possible pair of (i, j) at each root and taking any one of them will contribute to one child of the original string B.

By exploring this search tree space, we try to find the optimal path to minimize the step(the depth of the tree). We can use DFS or BFS to solve this, and BFS would give a better time complexity.

So, yeah, we can use BFS to solve it. No cycle decomposition.

def kSimilarity(self, A: str, B: str) -> int:
    if B == A:
        return 0
    queue = deque()
    visited = set()
    queue.append({'value': B, 'idx': -1})
    visited.add(B)
    step = 0
    while queue:
        level_size = len(queue)
        for _ in range(level_size):                 
            node = queue.popleft()
            
            if node['value'] == A:
                return step
            
            for i in range(node['idx']+1, len(A)):
                if A[i] != node['value'][i]:
                    break
           
            new_lst = list(node['value'])
            for j in range(i+1, len(B)):  
                if new_lst[j] == A[i] and A[j] != new_lst[j]:
                    new_lst[j], new_lst[i] = new_lst[i], new_lst[j]
                    new_str = ''.join(new_lst)
                    if new_str not in visited:
                        visited.add(new_str)
                        queue.append({'value': new_str, 'idx': i})new_lst[j], new_lst[i] = new_lst[i], new_lst[j]
        step += 1return step

Leetcode 854 K-Similar Strings

Cycle Decomposition (Failed)

Explore in Search Space (Succeed)

Written by Jolyon