Delete Operation for Two Strings

Patrick Leaton
Problem Description

Given two strings word1 and word2, return the minimum number of steps required to make word1 and word2 the same.

In one step, you can delete exactly one character in either string.

 

Example 1:

Input: word1 = "sea", word2 = "eat"
Output: 2
Explanation: You need one step to make "sea" to "ea" and another step to make "eat" to "ea".

Example 2:

Input: word1 = "leetcode", word2 = "etco"
Output: 4

 

Constraints:

  • 1 <= word1.length, word2.length <= 500
  • word1 and word2 consist of only lowercase English letters.

 

The description was taken from https://leetcode.com/problems/delete-operation-for-two-strings/.

Problem Solution

#O(M*N) Time, O(M*N) Space
class Solution:
    def minDistance(self, word1: str, word2: str) -> int:
        dp = [[0 for col in range(len(word2)+1)] for row in range(len(word1)+1)]
       
        for i in range(1, len(word1) + 1):
            for j in range(1, len(word2) + 1):
                if word1[i-1] == word2[j-1]:
                    dp[i][j] = dp[i-1][j-1] + 1
                else:
                    dp[i][j] = max(dp[i-1][j], dp[i][j-1])
       
        total_chars = len(word1) + len(word2)
        same_chars = dp[-1][-1] * 2
        return total_chars - same_chars

Problem Explanation


This question is like a combination of Edit Distance and Maximum Length of Repeated Subarray.

If we are given two inputs and we are asked to find if those two inputs can be correlated, what the minimum amount of iterations it takes for one to be the other, or what amount of one exists in the other, things like that, typically that is an indication that Dynamic Programming can be applied.

If we can find a substring that exists in both strings, then we can get the minimum number of deletions by subtracting the total number of characters minus that common substring in both strings.


Let's create our dp table.

        dp = [[0 for col in range(len(word2)+1)] for row in range(len(word1)+1)]

 

The code above translates to "we want zeros in each column for as many characters are in the first word, and we will want as many rows as there are characters in the second word".

We will also want an additional index at the beginning of the first row and column to satisfy the base case if we are given two empty strings.  In that case, the minimum deletion count would be zero.

Next, let's begin building our DP table.

        for i in range(1, len(word1) + 1):
            for j in range(1, len(word2) + 1):

 

The choices for each subproblem will be within the top, top-left, and left indices during each iteration as follows:

Extend  Delete   
Delete  Current 

 

If we find two of the same characters in both strings, then we can extend the current longest common substring between these two words by incrementing the top-left subproblem answer by one.

                if word1[i-1] == word2[j-1]:

 

Otherwise, we will consider our two delete options, based on the maximum longest subsequence that we could yield.

If we have "tuna" and "lunar" as our inputs,  our choices would be to delete both "t" then "l", extend "u", "n", "a", then delete "r".

If our DP table is set up like this:

  " " l u n a r
" "             
t            
u            
n            
a            

 

Deleting from "lunar" would be extending the left neighbor cell.  Deleting from "tuna" would be extending the top neighbor cell.

                else:
                    dp[i][j] = max(dp[i-1][j], dp[i][j-1])

 

Once our traversal is finished, we will have the longest common subsequence count in the last cell of the DP table.

We can then take the total number of chars that we have, subtract that by the longest common subsequence in both strings, and that will give us our answer.

        total_chars = len(word1) + len(word2)
        same_chars = dp[-1][-1] * 2
        return total_chars - same_chars