# Find a Duplicate in an Array

Sep 30, 2018 · 4 min read

# Problem

• Find a duplicate in an array

Category: Arrays

# Solving Process

Before to see the solutions, let’s talk a bit about the problem. We get an array of n + 1 element with integers between 1 and n.

For example, with an array of 5 integers, it implies that each integer will have a value between 1 and 4 (included). This means there will automatically be at least one duplicate.

The only exception is with an array of size 1. This is the only case where we should return a -1.

The brute force solution is to implement two nested loops this way:

O(n²) time complexity and O(1) space complexity.

Another solution is to have a data structure to count the number of iterations of each integer. This solution could be implemented either with an array or a hash table.

An implementation in Java:

The value of index i represent the number of iterations of i+1.

This solution is O(n) time but O(n) space as we need an extra structure.

If we apply the simplification technique, we could try to find a solution with a sorted array.

In this case, we would just have to compare each element with its right neighbor.

In Java:

O(1) in space but O(n log(n)) in time as we need to sort the collection up front.

A direction we may think about is to sum the elements of the array and to compare it with 1 + 2 + … + n.

Let’s see an example:

In this example, we are able to find the solution in O(n) time and O(1) space. Yet, it is working solely in the case where we have one duplicate.

A counter-example:

This direction is going nowhere. But sometimes we need to try stuff to find the most optimal solution.

There is something interesting to mention. So far, our solutions are not really making use of the fact that each integer is between 1 and n. Due to this interesting constraint, each value has its own corresponding index in the array.

The solution is to consider this particular array as a sort of linked list. Any index is pointing to the value of that index.

We iterate over each element and mark its corresponding index by setting its sign to minus. If we already marked it as negative, it means its index is a duplicate.

Let’s see a concrete step by step example:

In Java:

This solution is O(n) time and O(1) space. Yet, it requires to mutate the input list.

There is another solution which also considers the given array as a sort of linked list (again, this is possible because of the constraint that each integer is between 1 and n).

Let’s analyze the example [1, 2, 3, 4, 2]:

With this representation, we can simply say that a duplicate exists when a cycle does exist. Moreover, the duplicate is the entry point of the cycle (in this case, the second element).

If we take inspiration from Floyd’s cycle-finding algorithm, we can derive the following algorithm:

• Initiate two pointers slow and fast
• For each step, slow is moving one step at a time with slow = a[slow], whereas fast is moving two steps at a time with fast = a[a[fast]]
• When slow == fast, we are in a cycle

Is the algorithm completed? Not yet. The entry point of this cycle will be the duplicate. We have to reset slow and move both pointers step by step until they are equals again.

A possible implementation in Java:

This solution is O(n) time and O(1) space and does not require to mutate the input list.

Written by

Written by