Longest Substring without Repeating Characters

Patrick Leaton
Problem Description

Given a string s, find the length of the longest substring without repeating characters.

Example 1:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.

Example 4:

Input: s = ""
Output: 0

 

The description was taken from https://leetcode.com/problems/longest-substring-without-repeating-characters/.

Problem Solution

#O(N) Time, O(K) Space where K is length of longest substring.
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:    
        left, right = 0, 0

        longest = 0
        window = {}

       
        while right < len(s):
            if s[right] in window:
                window[s[right]] += 1
            else:
                window[s[right]] = 1
            right += 1
           
            distance, unique = right - left, len(window)
           
            while unique < distance:
                window[s[left]] -= 1
                if window[s[left]] == 0:
                    del window[s[left]]
                left += 1
                distance -= 1
               
            longest = max(longest, distance)
       
        return longest

Problem Explanation


This question is very similar to Longest Substring with At Most K Distinct Characters and Longest Substring with At Most Two Distinct Characters.  Similar to those, the question requires us to return the length of a specific substring that satisfies a certain constraint.

When we see that requirement, we know that a sliding window may be a good approach.

What we can do is have two pointers that signify the left-bound and right-bound of our sliding window.  We will use a hashmap to track the distinct characters along with their count.  If we have a wider window than we have distinct characters, we know that we need to shrink the left side of the sliding window.

That is the general template for sliding window problems, we will expand the right side until our window is no longer valid in which case we will shrink the left.  In this case, the window will no longer be valid if the count of distinct characters within the window is shorter than our window.


Let's start by initializing our left and right pointers to zero, the first index of the string.

        left, right = 0, 0

 

We will set our running, longest substring counter to a zero.  That would be the case if we had an empty substring.

        longest = 0

 

We will also use a hashmap to signify the window.  The hashmap will contain each distinct character within the window and its frequency.

        window = {}

 

Now let's begin traversing the string.

We will continue until the right pointer has gone past the end of the string.

        while right < len(s):

 

Just like other sliding window problems, our right pointer will be in charge of throwing values into the window.  The left will be in charge of evicting values from it.

Let's throw the character at the right pointer into the window then expand.

            if s[right] in window:
                window[s[right]] += 1
            else:
                window[s[right]] = 1
            right += 1

 

We can continue expanding until our window does not satisfy the given constraint, when we have repeated characters in our window.  We would know if we do by calculating the distance between pointers to find how many characters are in the sliding window and then comparing that to the hashmap.  If we have five characters between pointers but only four in the hashmap, that means that one of the characters is repeated due to the Pigeonhole Principle.

Let's calculate those now.

            distance, unique = right - left, len(window)
 

If that is the case, we will need to start shrinking it before we expand again.

While the number of unique characters in the window is less than the size of the window, we will continuously remove the character at the left pointer from the window by decrementing the frequency of the character within our hashmap and incrementing the left pointer.

Remember, if we leave character keys in the hashmap when their frequency gets decremented to zero, that character will stay in our window still and will increase the length.  To fix this, we will check if the character frequency is zero and delete it from the hashmap if it is.

            while unique < distance:
                window[s[left]] -= 1
                if window[s[left]] == 0:
                    del window[s[left]]
                left += 1

 

We will also decrement the distance value instead of recalculating it within every iteration.

                distance -= 1

 

Once we have made sure our window is of the given constraint, we will test it against the current longest substring without repeating characters.

            longest = max(longest, distance)

 

Once we have slid our window across the entire string, we will return the longest substring we found of the given constraint.

        return longest