# Find a Duplicate in an Array

# Problem

- Find a duplicate in an array

Given an array of n + 1 integers between 1 and n, find one of the duplicates.If there are multiple possible answers, return one of the duplicates.If there is no duplicate, return -1.Example:Input: [1, 2, 2, 3]

Output: 2Input: [3, 4, 1, 4, 1]

Output: 4 or 1

Category: Arrays

# Solving Process

Before to see the solutions, let’s talk a bit about the problem. We get an array of `n + 1`

element with integers between 1 and `n`

.

For example, with an array of 5 integers, it implies that each integer will have a value between 1 and 4 (included). This means there will automatically be **at least** one duplicate.

The only exception is with an array of size 1. This is the only case where we should return a -1.

## Brute Force

The brute force solution is to implement two nested loops this way:

`for i = 0; i < size(a); i++ {`

for j = i+1; j < size(a); j++ {

if(a[i] == a[j]) return a[i]

}

}

O(n²) time complexity and O(1) space complexity.

## Count Iterations

Another solution is to have a data structure to count the number of iterations of each integer. This solution could be implemented either with an array or a hash table.

An implementation in Java:

The value of index `i`

represent the number of iterations of `i+1`

.

This solution is O(n) time but O(n) space as we need an extra structure.

## Sorted Array

If we apply the simplification technique, we could try to find a solution with a sorted array.

In this case, we would just have to compare each element with its right neighbor.

In Java:

O(1) in space but O(n log(n)) in time as we need to sort the collection up front.

## Sum of the Elements

A direction we may think about is to sum the elements of the array and to compare it with 1 + 2 + … + n.

Let’s see an example:

Input: [1, 4, 3, 3, 2, 5]

Sum = 18As in this example, we have n = 5:

Sum of 1 to 5 = 1 + 2 + 3 + 4 + 5 = 15=> 18 - 15 = 3 so 3 is the duplicate

In this example, we are able to find the solution in O(n) time and O(1) space. Yet, it is working solely in the case where we have **one duplicate**.

A counter-example:

Input: [1, 2, 3, 2, 3, 4]

Sum = 15As in this example we have n = 5,

Sum of 1 to 5 = 1 + 2 + 3 + 4 + 5 = 15/!\ Not working

This direction is going nowhere. But sometimes we need to try stuff to find the most optimal solution.

## Marker

There is something interesting to mention. So far, our solutions are not really making use of the fact that each integer is between 1 and `n`

. Due to this interesting constraint, each value has its own corresponding index in the array.

The solution is to consider this particular array as a sort of linked list. Any index is pointing to the value of that index.

We iterate over each element and *mark* its corresponding index by setting its sign to minus. If we already marked it as negative, it means its index is a duplicate.

Let’s see a concrete step by step example:

Input: [2, 3, 3, 1]* Index 0:

Absolute value = 2

Put a minus sign to a[2] => [2, 3, -3, 1]* Index 1:

Absolute value = 3

Put a minus sign to a[3] => [2, 3, -3, -1]* Index 2:

Absolute value = 3

As a[3] is already negative, it means 3 is a duplicate

In Java:

This solution is O(n) time and O(1) space. Yet, it requires to mutate the input list.

## Runner Technique

There is another solution which also considers the given array as a sort of linked list (again, this is possible because of the constraint that each integer is between 1 and `n`

).

Let’s analyze the example `[1, 2, 3, 4, 2]`

:

With this representation, we can simply say that a duplicate exists when a cycle does exist. Moreover, the duplicate is the entry point of the cycle (in this case, the second element).

If we take inspiration from Floyd’s cycle-finding algorithm, we can derive the following algorithm:

- Initiate two pointers
`slow`

and`fast`

- For each step, slow is moving one step at a time with
`slow = a[slow]`

, whereas fast is moving two steps at a time with`fast = a[a[fast]]`

- When
`slow == fast`

, we are in a cycle

Is the algorithm completed? Not yet. The entry point of this cycle will be the duplicate. We have to reset `slow`

and move both pointers step by step until they are equals again.

A possible implementation in Java:

This solution is O(n) time and O(1) space and does not require to mutate the input list.