Longest Consecutive Sequence

Patrick Leaton
Problem Description

Given an unsorted array of integers nums, return the length of the longest consecutive elements sequence.

Follow up: Could you implement the O(n) solution? 

 

Example 1:

Input: nums = [100,4,200,1,3,2]
Output: 4
Explanation: The longest consecutive elements sequence is [1, 2, 3, 4]. Therefore its length is 4.

Example 2:

Input: nums = [0,3,7,2,5,8,4,6,0,1]
Output: 9

 

Constraints:

  • 0 <= nums.length <= 10^4
  • -10^9 <= nums[i] <= 10^9

 

The description was taken from https://leetcode.com/problems/longest-consecutive-sequence/.

Problem Solution

#O(N) Time, O(N) Space
class Solution:
    def longestConsecutive(self, nums: List[int]) -> int:
        longest_streak = 0
        nums_set = set(nums)
       
        for num in nums_set:
            if num - 1 not in nums_set:
                running_num = num
                running_streak = 1
                while running_num + 1 in nums_set:
                    running_num += 1
                    running_streak += 1
                longest_streak = max(longest_streak, running_streak)
           
        return longest_streak

Problem Explanation


A brute force approach could be how we would imagine it.  We would store each number in a HashMap or HashSet, then from each number, we would try to find if its next number within a sequence exists within our storage.  The problem with this is that we would be visiting the same numbers within the same sequence multiple times.  If we visited earlier numbers in the same sequence already during this process, then the later numbers would also have been visited already as well.  This would require a quadratic time complexity and a linear space complexity.

From there we may think to sort the input so that we visit the numbers within a sequence starting from the beginning of the sequence only.  This would be better, with an O(N(Log(N))) time complexity and linear space, but we can do better.   

The thing is, we don't need to sort the list to visit the starting numbers of the sequence first and avoid revisiting the same sequence.  We can solve this problem by putting each number in a set, then only performing a running count on the first number of a potential sequence so that we don't repeat work by scanning the same sequences over and over again.  We can find candidate first numbers based on if their predecessor exists within our set.  If it does, we know that this isn't the first number of the sequence.  If it doesn't, then we know that this is a candidate starting number of a sequence.


We will start by creating the longest_streak counter and converting the input array into a set.

        longest_streak = 0
        nums_set = set(nums)

 

We will then traverse each value from the input array.

        for num in nums_set:

 

During each iteration, if the current number minus one is in the set then we can assume that we have either scanned the current sequence candidate before, or we will in the future.  

If that is the case, we will continue to the next number from the input array.

If that is not the case, then we will apply a running count to begin considering the current number as a starting number for the longest consecutive sequence.

            if num - 1 not in nums_set:

 

Let's initialize a running number as the current number from the input array and a running streak that will keep track of the length of this current consecutive subsequence.

                running_num = num
                running_streak = 1

 

We will then create a loop that will run while the running number is an existing value in our set.

While this condition is true, we will increment the running number to continue looking up the next consecutive values for this sequence and we will also increment the running streak to keep track of this sequence's length.

                while running_num + 1 in nums_set:
                    running_num += 1
                    running_streak += 1

 

Once we have broken from the loop, we have reached the end of a sequence so we will compare the running streak we counted against the longest saved streak to see if that sequence is the current longest.

                longest_streak = max(longest_streak, running_streak)

 

This will continue until we have traversed the entire array, then we will return the longest counted streak.

        return longest_streak