Find the Duplicate Number

Patrick Leaton
Problem Description

Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] inclusive.

There is only one duplicate number in nums, return this duplicate number.

Follow-ups:

  1. How can we prove that at least one duplicate number must exist in nums
  2. Can you solve the problem without modifying the array nums?
  3. Can you solve the problem using only constant, O(1) extra space?
  4. Can you solve the problem with runtime complexity less than O(n^2)?

 

Example 1:

Input: nums = [1,3,4,2,2]
Output: 2

Example 2:

Input: nums = [3,1,3,4,2]
Output: 3

Example 3:

Input: nums = [1,1]
Output: 1

Example 4:

Input: nums = [1,1,2]
Output: 1

 

Constraints:

  • 2 <= n <= 3 * 10^4
  • nums.length == n + 1
  • 1 <= nums[i] <= n
  • All the integers in nums appear only once except for precisely one integer which appears two or more times.

 

The description was taken from https://leetcode.com/problems/find-the-duplicate-number/.

Problem Solution

#O(N) Time, O(1) Space
class Solution:
    def findDuplicate(self, nums: List[int]) -> int:
        slow = fast = nums[0]
        while True:
            slow = nums[slow]
            fast = nums[nums[fast]]
            if fast == slow:
                break
 
        slow = nums[0]
        while slow != fast:
            slow = nums[slow]
            fast = nums[fast]
        return slow

Problem Explanation


There are several ways to solve this problem.

We could sort and check neighboring duplicates, store each number in external memory and see if we encounter the same number twice, use a binary search after sorting, etc.

However, let's notice that the problem description states that the values in the array range from one to n exclusively.  That is a hint that Floyd's Tortoise and Hare algorithm can be implemented to detect cycles in how we'd increment a fast and slow pointer.


Let's start by setting our slow and fast pointers at the first element of the array.

        slow = fast = nums[0]

 

We will then make a loop that will run infinitely until the two pointers meet since there will always exist a duplicate number according to the problem description.

        while True:

 

During each iteration, we will increment the slow pointer once and the fast pointer twice.

            slow = nums[slow]
            fast = nums[nums[fast]]
            if fast == slow:
                break

 

The incrementing logic may be confusing, so let's take this test case for example:[1,3,4,2,2].

The slow pointer will be incremented by the element at the index place of the slow pointer's value, nums[slow].  In the first iteration, the slow pointer is one.  The element at index one of the array is three, so the slow pointer will be moved to three.

The fast pointer will be incremented by the element at the index place of the element at the index place of the fast pointer's value, nums[nums[fast]]. In the first iteration, the fast pointer is one.  The element at index one of the array is three.  The element at index three is two, so the fast pointer will be moved to two.

Notice that the fast pointer is now twice as far as the slow pointer from the beginning, this is good.

Once the fast pointer, the hare, laps around and slaps a "kick me" sign on the back of the slow pointer, the tortoise, this loop will break.

Once the two pointers meet, we can prove that there is a cycle, sweet.

However, the question isn't asking to prove there is a cycle but to find the duplicate number.  The number that the two pointers met each other on is likely not the duplicate number.

To find the duplicate number, we will keep the fast pointer's location but send the slow pointer back to the first number in the array.  

        slow = nums[0]

 

We will increment the slow and fast pointer at the same speed until they meet again.  

When this happens, they will have met at the entrance to the cycle which is the duplicate number.  

        while slow != fast:
            slow = nums[slow]
            fast = nums[fast]

 

Once we have broken from the loop, we can return the element at either pointer's index since they are in the same place.  We have found our duplicate number.

        return slow