Longest String Chain

Patrick Leaton
Problem Description

You are given an array of words where each word consists of lowercase English letters.

wordA is a predecessor of wordB if and only if we can insert exactly one letter anywhere in wordA without changing the order of the other characters to make it equal to wordB.

  • For example, "abc" is a predecessor of "abac", while "cba" is not a predecessor of "bcad".

word chain is a sequence of words [word1, word2, ..., wordk] with k >= 1, where word1 is a predecessor of word2word2 is a predecessor of word3, and so on. A single word is trivially a word chain with k == 1.

Return the length of the longest possible word chain with words chosen from the given list of words.

 

Example 1:

Input: words = ["a","b","ba","bca","bda","bdca"]
Output: 4
Explanation: One of the longest word chains is ["a","ba","bda","bdca"].

Example 2:

Input: words = ["xbc","pcxbcf","xb","cxbc","pcxbc"]
Output: 5
Explanation: All the words can be put in a word chain ["xb", "xbc", "cxbc", "pcxbc", "pcxbcf"].

Example 3:

Input: words = ["abcd","dbqca"]
Output: 1
Explanation: The trivial word chain ["abcd"] is one of the longest word chains.
["abcd","dbqca"] is not a valid word chain because the ordering of the letters is changed.

 

Constraints:

  • 1 <= words.length <= 1000
  • 1 <= words[i].length <= 16
  • words[i] only consists of lowercase English letters.

 

The description was taken from https://leetcode.com/problems/longest-string-chain/.

Problem Solution

#O(L^2*N) Time, O(N) Space Where L is Length of Each Word and N is Count of Words
class Solution:
    def longestStrChain(self, words: List[str]) -> int:
        longest = 1
        words_set = set(words)
        words.sort(key = lambda x: len(x), reverse = True)
        seen = {}
       
        def bfs(word:str) -> int:
            chain = 1
            queue = collections.deque([(word)])
            while queue:
                width = len(queue)
                for _ in range(width):
                    word = queue.popleft()
                    for i in range(len(word)):
                        neighbor = word[:i] + word[i+1:]
                        if neighbor in words_set and neighbor not in seen:
                            seen[neighbor] = True
                            queue.append(neighbor)
                if queue:
                    chain += 1
            return chain
       
        for word in words:
            if len(word) < longest:
                break
            if word not in seen:
                longest = max(longest, bfs(word))
               
        return longest

Problem Explanation


This problem is very similar to Word Ladder.

Although, here we aren't replacing a character with another within a transformation sequence. 

Instead, we are removing a character to see if we can transform a word into another that exists within the input array.

To solve this, similar to Word Ladder, we can use a Breadth-First Search.  

We will sort the words list from the greatest length to the smallest and perform a BFS on each word.  Within the BFS, we will iteratively exclude every single character of the word and see if a predecessor exists so that we can create a chain.  Once the BFS has ended, we will compare the longest chain we were able to form from the root word to the running maximum.


Let's start by initializing our running longest chain counter, our words set for constant lookup times, sorting our words list by lengths in descending order, and our seen words dictionary.

        longest = 1
        words_set = set(words)
        words.sort(key = lambda x: len(x), reverse = True)
        seen = {}


Next, we will create our BFS function.

This function will take a root word and will return the longest chain we were able to form from that root word.

        def bfs(word:str) -> int:

 

Let's start by initializing our chain to one word, the root word, and pushing the root word onto our queue.

            chain = 1
            queue = collections.deque([(word)])

 

While the queue still has words to chain, we will continue our search.

            while queue:

 

We'll start each iteration of our BFS by saving the width of the queue.

This will let us know how many words are in the current layer, which will be useful since each additional layer will be an additional link in the longest word chain we can form from the root word.

                width = len(queue)

 

Then, for each word in the given layer, we will pop it from the queue and exclude each character to see if we have a neighbor word that exists in the given words.

                    word = queue.popleft()
                    for i in range(len(word)):
                        neighbor = word[:i] + word[i+1:]

 

If we do have a neighbor that is in the words set and this neighbor hasn't been seen in a previous chain before, we will mark it as seen and append it to the queue.

                        if neighbor in words_set and neighbor not in seen:
                            seen[neighbor] = True
                            queue.append(neighbor)

 

Once we have finished traversing a layer, if we still have words to chain in our queue, we will increment the chain count by one.

                if queue:
                    chain += 1

 

When we have emptied the queue, we will return the length of the chain we were able to form from the root word.

            return chain


Now that we have our BFS function built, we will begin utilizing it to find the longest chain.

For each word in words, if the length of the current word is less than the longest chain we have seen, we will break and stop our search.

The reason is, we are moving from the longest words to the shortest ones so if we have already found a longer chain than the length of the current word, we aren't going to be able to build any greater chain from here.  If our longest word contains six characters, our best-case scenario is a chain of six words.  If we found that and then arrived at a five-character chain, then it'd be impossible to create a longer chain.

This optimization will allow us to only consider the greatest chains and prune our search.

        for word in words:
            if len(word) < longest:
                break

 

If we are still looking for the longest chain, we will first check if the current word hasn't already been seen.

This allows us to consider chains starting from the root word because if the word has already been seen, then it was already included in a previous, longer chain so there would be no point to consider it a second time since we would have already taken its path all the way down to the last word in its word chain before.

This optimization also allows us to only consider the greatest chains to prune our search.

If the word has not been seen, we will run a BFS on it as the root word and compare the longest chain we were able to form from it to the current longest chain we have seen.

            if word not in seen:
                longest = max(longest, bfs(word))

 

After we have found our longest word chain, we will return its length.

        return longest