Maximum Length of Repeated Subarray

Patrick Leaton
Problem Description

Given two integer arrays nums1 and nums2, return the maximum length of a subarray that appears in both arrays.

 

Example 1:

Input: nums1 = [1,2,3,2,1], nums2 = [3,2,1,4,7]
Output: 3
Explanation: The repeated subarray with maximum length is [3,2,1].

Example 2:

Input: nums1 = [0,0,0,0,0], nums2 = [0,0,0,0,0]
Output: 5

 

Constraints:

  • 1 <= nums1.length, nums2.length <= 1000
  • 0 <= nums1[i], nums2[i] <= 100

 

 

The description was taken from https://leetcode.com/problems/maximum-length-of-repeated-subarray/.

Problem Solution

#O(M*N) Time, O(M*N) Space
class Solution:
    def findLength(self, nums1: List[int], nums2: List[int]) -> int:
        max_len = 0
        dp = [[0 for col in range(len(nums2)+1)] for row in range(len(nums1)+1)]
       
        for i in range(1, len(nums1)+1):
            for j in range(1, len(nums2)+1):
                if nums1[i-1] == nums2[j-1]:
                    dp[i][j] = dp[i-1][j-1] + 1
                    max_len = max(max_len, dp[i][j])
       
        return max_len

Problem Explanation


We are asked to find a subarray that exists in both arrays, but won't always be in the same order.  

This would be a similar approach to using dynamic programming for finding a subset in the other array, we can break up the subproblems to seeing if one value in the first array exists within the other, building up a cache of each consecutive time this subproblem is true.  

From there, this is a somewhat traditional dynamic programming problem.


Let's start by creating a maximum repeated subarray length variable.

Usually, we would just have the answer as the last index of our dp cache, but the subarray may not span to the end of the array so we will have to track this answer another way.

        max_len = 0

 

Next, we will make our dp cache by creating a matrix.

There is a hint for knowing whether a dp cache is an array or matrix that is usually given based on the inputs.  Since we are given two same-type inputs, that usually means we will need a matrix.  If we were just given one, usually we'd just need an array.

We will have as many columns as there are indices in the second array and as many rows as there are indices in the first array.

        dp = [[0 for col in range(len(nums2)+1)] for row in range(len(nums1)+1)]

 

If we had the inputs:

nums1 = [1,2,3,2,1], nums2 = [3,2,1,4,7],

Our cache would look like this, taking into account the base case of two arrays of size zero.

  [ ] 3 2 1 4 7
[ ] 0 0 0 0 0 0
1 0 0 0 0 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 0
2 0 0 0 0 0 0
1 0 0 0 0 0 0

 

 

 

We will then have two pointers, one for each iteration through both arrays, the first will expand the partition of solved sub-problems, the second will solve the subproblem of the current partition.   

        for i in range(1, len(nums1)+1):
            for j in range(1, len(nums2)+1):

 

The subproblem we are needing to answer is simply, does one of the values in one array exists within the other. 

                if nums1[i-1] == nums2[j-1]:

 

If that is the case, we will mark the current cell of the dp cache as the upper-left cell plus one.

                    dp[i][j] = dp[i-1][j-1] + 1

 

The reasoning is we can add onto a previous iteration's subproblem, "yes", if the current iteration is also "yes".  If we answered that the value in the first array exists in the second array directly before, and we answered it again, we begin noticing a subarray.

  [ ] 4 7
[ ] 0 0 0 0 0 0
1 0 0 0 1 0 0
2 0 0 1 0 0 0
0 1   0 0 0 0
0 0 0 0 0
0 0 0 0 0

 

Since the subarray sub-problems don't span to the last index of the cache, we will just need to test the max_len variable for each cell as we traverse the dp matrix.

                    max_len = max(max_len, dp[i][j])

 

Once we are done traversing the matrix, we will return the maximum length of the repeated subarray.

        return max_len