Subarrays with K Different Integers

Patrick Leaton
Problem Description

Given an integer array nums and an integer k, return the number of good subarrays of nums.

good array is an array where the number of different integers in that array is exactly k.

  • For example, [1,2,3,1,2] has 3 different integers: 12, and 3.

subarray is a contiguous part of an array.

 

Example 1:

Input: nums = [1,2,1,2,3], k = 2
Output: 7
Explanation: Subarrays formed with exactly 2 different integers: [1,2], [2,1], [1,2], [2,3], [1,2,1], [2,1,2], [1,2,1,2]

Example 2:

Input: nums = [1,2,1,3,4], k = 3
Output: 3
Explanation: Subarrays formed with exactly 3 different integers: [1,2,1,3], [2,1,3], [1,3,4].

 

Constraints:

  • 1 <= nums.length <= 2 * 10^4
  • 1 <= nums[i], k <= nums.length

 

 

The description was taken from https://leetcode.com/problems/subarrays-with-k-different-integers/.

Problem Solution

#O(N) Time, O(K) Space
class Solution:
    def subarraysWithKDistinct(self, nums: List[int], k: int) -> int:
       
        def sliding_window(k:int) -> int:
            left, right = 0, 0
            output = 0
            window = {}
            while right < len(nums):
                if nums[right] not in window:
                    window[nums[right]] = 1
                else:
                    window[nums[right]] += 1
                right += 1
                while len(window) > k:
                    window[nums[left]] -= 1
                    if window[nums[left]] == 0:
                        del window[nums[left]]
                    left += 1
                output += right - left + 1
            return output
       
        return sliding_window(k) - sliding_window(k-1)

Problem Explanation


This problem is very close to Longest Substring with At Most K Distinct Characters.

If we are looking for a specific subarray or substring that satisfies a specific constraint, typically a sliding window is a good approach.  The main distinction here is we aren't looking for the best substring, but all substrings that satisfy a specific constraint.

The only real tweak we will have to make to a traditional sliding window algorithm is we will not be comparing the current window to whatever maximum window size we have saved for the output, but instead, we will be incrementing the output by the range of the current sliding window since that will provide us new combinations of sublists within each additional window expansion.

There's also one mathematical trick that will be explained later.


First off, let's make a sliding window function since we will have to run this algorithm twice and will be able to reuse the code.

The function will take the number of distinct numbers that a sublist must contain and will return a sublist count.

        def sliding_window(k:int) -> int:

 

The rest of this function is a near-exact copy of Longest Substring with At Most K Distinct Characters, besides the output as mentioned in the beginning.

Let's start by initializing our left and right pointers to zero, the first index of the string.

            left, right = 0, 0

 

We will then initialize our output to zero.

            output = 0

 

We will also use a hashmap to signify the window.  The hashmap will contain each distinct number within the window and its frequency, this way we can know whether we have more than k distinct integers within the window and need to shrink it.

            window = {}

 

Now let's begin traversing the array.

We will continue until the right pointer has gone past the end of the window.

            while right < len(nums):

 

Just like other sliding window problems, our right pointer will be in charge of throwing values into the window.  The left will be in charge of evicting values from it.

Let's throw the number at the right pointer into the window then expand.

                if nums[right] not in window:
                    window[nums[right]] = 1
                else:
                    window[nums[right]] += 1
                right += 1

 

We can continue expanding until our window does not satisfy the given constraint, when we have more than k distinct numbers in our window.

Now we need to start shrinking it before we expand again.

While the number of distinct numbers in the window is greater than k, we will remove the number at the left pointer from the window by decrementing the frequency of the character in our hashmap and incrementing the left pointer.  But remember, if we leave number keys in the hashmap when their frequency gets decremented to zero, that number will stay in our window still.  To fix this, we will check if the number frequency is zero and delete it from the hashmap if it is.

                while len(window) > k:
                    window[nums[left]] -= 1
                    if window[nums[left]] == 0:
                        del window[nums[left]]
                    left += 1

 

Once we have made sure our window is of the given constraint, we will increment the output by the number of digits in our window. We are now including this many combinations of sublists.

                output += right - left + 1

 

Once we have slid our window across the entire array, we will return the number of sublists we found of the given constraint.

            return output


Okay, so here is the trick.

A sliding window algorithm is only going to shrink when the constraint isn't met.  Ours wouldn't shrink unless there were more than k distinct numbers in the window.

That means that we were counting sublists where the number of distinct digits was less than or equal to k, but the question was wanting us to return the count of sublists where the number of distinct digits was exactly k.

To solve this, we will need to run the algorithm again one more on k-1 to find the number of sublists where the number of distinct digits that are less than k.

This way, we can take the number of sublists that are less than or equal to k, subtract that by the number of sublists that are less than k, and we'll get the number of sublists that equal k.

        return sliding_window(k) - sliding_window(k-1)