Remove All Occurrences of a Substring

Patrick Leaton
Problem Description

Given two strings s and part, perform the following operation on s until all occurrences of the substring part are removed:

  • Find the leftmost occurrence of the substring part and remove it from s.

Return s after removing all occurrences of part.

substring is a contiguous sequence of characters in a string.

 

Example 1:

Input: s = "daabcbaabcbc", part = "abc"
Output: "dab"
Explanation: The following operations are done:
- s = "daabcbaabcbc", remove "abc" starting at index 2, so s = "dabaabcbc".
- s = "dabaabcbc", remove "abc" starting at index 4, so s = "dababc".
- s = "dababc", remove "abc" starting at index 3, so s = "dab".
Now s has no occurrences of "abc".

Example 2:

Input: s = "axxxxyyyyb", part = "xy"
Output: "ab"
Explanation: The following operations are done:
- s = "axxxxyyyyb", remove "xy" starting at index 4 so s = "axxxyyyb".
- s = "axxxyyyb", remove "xy" starting at index 3 so s = "axxyyb".
- s = "axxyyb", remove "xy" starting at index 2 so s = "axyb".
- s = "axyb", remove "xy" starting at index 1 so s = "ab".
Now s has no occurrences of "xy".

 

Constraints:

  • 1 <= s.length <= 1000
  • 1 <= part.length <= 1000
  • s​​​​​​ and part consists of lowercase English letters.

 

The description was taken from https://leetcode.com/problems/remove-all-occurrences-of-a-substring/.

Problem Solution

#O(N) Time, O(N) Space
class Solution:
    def removeOccurrences(self, s: str, part: str) -> str:
        stack = []
        removal = part[::-1]
       
        for char in s:
            stack.append(char)
            if len(stack) >= len(part):
                delete = 0
                bucket = []
                while stack and stack[-1] == removal[delete % len(removal)]:
                    bucket.append(stack.pop())
                    delete += 1
                    if len(bucket) == len(removal):
                        bucket = []
                if bucket:
                    stack += bucket[::-1]
               
        return "".join(stack)

Problem Explanation


A brute force method would be to traverse the string for the part that we need to remove, remove it when we find it, then look through the string again to handle the scenario where the removal of that part caused a new part to form at the indices before and after the part we just removed.

aabcbc -> abc

The problem with that is we'd have to traverse the entire string to look for new parts each time we delete one until there are none remaining.  This would require quadratic time.

We may notice that this motion is kind of like playing Candy Crush or Tetris.

If we look at the second example, intuitively we see that we would have to remove xy then we'd look left and see that we have an x, right to see that we have a y, and see that we'd have another xy to remove.

That intuition is last in first out, LIFO.  

We look at the last character we traversed, look at the current character and see that this is the part we need to remove.

To fulfill that intuition, we can utilize a stack. 

To solve this problem, we can store each character into a stack.  If we see that the top of the stack is the last character of the removal part, we'll enter a loop where we compare the current top character of the stack to the current character of the removal part.  If the characters are the same, we'll pop the character off the stack and place it into a temporary array.  If that temporary array becomes the same length as the removal part then we will delete it, removing the part from our output string.

This way, if we had the prefix of the removal part in the stack prior to popping off a consecutive removal part, we will immediately be able to remove it once we add the suffix in the next iteration.  This will allow us to only have to traverse the string once.

c    
b  
 
a

 

When we finish traversing the string, the output will be the characters remaining in the stack.


Let's start by initializing our stack.

        stack = []

 

Next, we will create a helper string for us of the removal part in reverse.  

When we have the removal part in our stack, it is going to be in reverse when we're popping it off.  This will help us for comparisons between the removal part and the top of the stack.

        removal = part[::-1]

 

After, we'll traverse the string.

        for char in s:

 

At the beginning of the iteration, we will append the current character to the stack and then we will see if the stack is greater than or equal to the length of the removal part.

If it isn't then there is no possible way we have characters to remove yet so we'll disregard the stack for now.

            stack.append(char)
            if len(stack) >= len(part):

 

Otherwise, we will create a deletion pointer that we will use to traverse the removal part and also a bucket.

                delete = 0
                bucket = []

 

While we have characters in the stack and the stack's top character is the same as the current character in the removal part at the deletion pointer, we will pop the top character from the stack and place it into the bucket.

                while stack and stack[-1] == removal[delete % len(removal)]:
                    bucket.append(stack.pop())
                    delete += 1

 

The modulo operator is just a trick that forces the deletion pointer to wrap around if it becomes greater than the length of the removal part.  We could have also just added a break statement within the loop for this scenario instead.

If we followed the removal chain of characters down the stack and now the length of our bucket is the size of the string that we need to remove, then the removal string has to be in our bucket so we will empty it.

                    if len(bucket) == len(removal):
                        bucket = []

 

If we have any remaining characters in our bucket after we find a different character between the removal chain and the top of our stack, we will put the characters within the bucket back onto a stack, reversed since their order was reversed when we popped them off of the stack.

                if bucket:
                    stack += bucket[::-1]

 

After we have traversed the string, we will join the stack characters together to form a string and return that string.

        return "".join(stack)