Why the “Find the Duplicate Number” problem doesn’t accept 0

The “Find The Duplicate Number” problem is a classic challenge in computer science and algorithm design, often encountered in coding interviews and competitive programming. The optimal solution to this problem involves cycle detection using fast and slow pointers. However, this algorithm works only when the input array doesn’t contain $0$ . Let’s look into the problem statement and its naive and optimized approaches before delving into the details.

Problem statement

Given an unsorted array of positive numbers, nums, such that the values lie in the range $[1,n]$ , inclusive, and that there are $n+1$ numbers in the array, find and return the duplicate number present in nums. There is only one repeated number in nums.

Constraints:

$1 \leq n \leq 10^3$
nums.length $=n+1$
$1 \leq$ nums[i] $\leq n$
All the integers in nums are unique, except for one integer that will appear more than once.

Examples:

Let’s look into a couple of examples to get an idea about the sample input and output.

Naive approach

The naive solution to this problem is to iterate through the array and for each element, check the rest of the array if there’s a duplicate. If a duplicate is found, we return that duplicate number.

The time complexity of this algorithm is $O(n^2)$ and it operates in $O(1)$ space.

Optimized approach

The optimal solution employs Floyd’s Tortoise and Hare algorithm, utilizing two pointers, slow and fast, initially set to the first element of the array. The slow pointer advances by one step, while the fast pointer advances by two steps in each iteration. When they meet, it indicates the presence of a cycle. At this point, one pointer is reset to the beginning of the array while both move at the same speed. The meeting point of the two pointers is the start of the cycle, which corresponds to the duplicate number in the array. This approach ensures linear time complexity $O(n)$ and constant space complexity $O(1)$ , offering an optimal solution for finding the duplicate number.

In this answer, we’re interested in talking about the cycle detection algorithm and why it fails if the input array contains $0$ .

Cycle detection

To find the cycle, we iterate nums using the $f(i)=nums[i]$ function, where $i$ is the index of the array. This function constructs the following sequence to iterate:

From the above algorithm, we know that any index that we visit in the array is due to the value present at the previously visited index, except the first index ( $i=0$ ) that we visit at the start of the algorithm. It’s worth mentioning that we can’t visit the first index twice because the array doesn’t contain $0$ .

What happens if the input contains $0$ ?

In case when $nums$ contains $0$ , there’s a chance of visiting the first index twice, once at the start of the algorithm and again due to the $0$ in the array. Since our algorithm treats a repeated visit as a duplicate number, it will declare $nums[0]$ as the duplicate number, which will be incorrect.

Let’s try to understand this concept with the help of the following illustration.

def find_duplicate(nums):
    # Initialize the fast and slow pointers at the first index of the array
    fast = slow = nums[0]
    # Traverse the array until the intersection point is found
    while True:
        slow = nums[slow]
        # Move the fast pointer two steps
        fast = nums[nums[fast]]
        # Break the loop when the slow and fast pointers meet
        if slow == fast:
            break
    # Set the slow pointer at the start of the array
    slow = nums[0]
    # Traverse in the array until the slow pointer meets the fast pointer
    while slow != fast:
        slow = nums[slow]
        # Move the fast pointer one step
        fast = nums[fast]
    return fast
# Driver code
def main():
    nums = [[2, 5, 8, 6, 3, 9, 4, 8, 1, 7],
        [2, 5, 8, 6, 3, 9, 7, 4, 1, 0]]
    for i in range(len(nums)):
        print(i + 1, ".\tnums = ", nums[i], sep="")
        print("\tDuplicate number = ", find_duplicate(nums[i]), sep="")
        print("-" * 100)
if __name__ == '__main__':
    main()

The Find the Duplicate Number problem

Explanation

Let’s try to understand the code above:

Line 3: We initialize fast and slow pointers at the first index of the array.
Lines 6–12: We traverse the array until the intersection point is found.
Lines 17–20: We traverse in the array until the slow pointer meets the fast pointer.

It’s evident from the output of the second test case mentioned on line 27 that the algorithm fails if the input array contains 0. The output says that the duplicate number in the array is 2, but the array doesn’t contain any duplicate numbers.

Free AI Mock Interviews

Coding Interview

Coding PatternsFree Interview

Gain insights and practical experience with coding patterns through targeted MCQs and coding problems, designed to match and challenge your expertise level.

System Design

YouTubeFree Interview

Learn to design a video streaming platform like YouTube by tackling functional and non-functional requirements, core components, and high-level to detailed design challenges.

Free Resources

Input	Output
[1, 8, 8, 4, 2, 5]	8
[1, 5, 3, 4, 2, 5]	5
[1, 2, 3, 4, 3, 6, 3]	3