Finding all prime numbers up to N less than quadratic time

Adem Atalay
The Startup
Published in
5 min readMay 17, 2020

A prime number is known as a number which has exactly two division factor which is 1 and itself. Since the definition requires exactly two numbers, 1 is not accepted as prime number even thought it is divisible by 1 and itself. Thus, the prime numbers start from 2 and goes to infinity with an irregular distribution.

A function which results the number of prime numbers up to a given number n is known as prime-counting function and denoted by π(n). The distribution of this function shows the irregularity of prime numbers more clearly in the following chart.

The values of π(n) for the first 60 positive integers (Wikipedia)

The naive way to get the list of all prime numbers in a given interval [2,n] is to apply primality test to each of the numbers and pick the prime ones. This means running primality test n-1 times. So, the complexity of it would be proportional to the complexity of the primality test. There are couple of methods to test the primality such as trying to find a divisor between [2, √n], heuristic tests or Fermat’s test. Unfortunately, there is no way to test primality of a number in a constant time which means the complexity of the solution will be bigger than linear time in any case for picking prime numbers.

Note that a number is either a prime or composite. There is no other choice for a number. If all composite numbers can be removed in a linear time from the list of numbers, only prime numbers would be left. Now the question turns out that if it is possible to check a number is composite in a constant time.

Actually, there is no difference between checking a number is prime or composite for only one given number. However, it is possible get composite number series from a starting prime number where each next values of the serie can be calculated in a constant time. The serie is shown below.

Serie of composite numbers starting from a prime number p

For example, for a given interval [2, n] we know that the first number 2 is prime. According to the serie above the numbers 4, 6, 8, 10, 12, 14 … 2k where 2k ≤ n are composite and removing them from the given interval takes just k = n/2 constant steps. Similarly, after removing all the composite numbers starting from 2, the next value will be the next prime number which is 3. Same way can be applied for 3 and all composite numbers starting from 3 can be removed in n/3 steps. The following illustration shows removing composite numbers up to 100.

Removing composite numbers from interval

The list will only contain prime numbers when all of the composite numbers are removed. Even if it gives a feeling that the algorithm runs in a linear time, it runs in a little bit bigger complexity. The algorithm is shown below.

Algorithm for listing all prime numbers less than n

In the algorithm above, it is clear that the while loop which marks the composite numbers run 23071 times and the algorithm runs through all 10000–1 numbers. So, the total number of operations is 33070 which is 3.3 times bigger than 10000. The coefficient changes proportional to n which concludes that the complexity of the algorithm is more than a linear complexity.

Algorithm runs the outer while loop n times and runs the inner while loop for each prime numbers p for (n/p)-1 times. So, the total number of operations can be declared as follows.

number of operations for n

This sum can be simplified as follows

number of operations for n

Obviously, the summation of 1/p makes the complexity non linear. Finding an upper bound of that summation would give some idea for the complexity of the solution.

upper bounding serie of 1/p

The upper bound is known as harmonic series and the upper bound of a harmonic serie can be found by normalizing the serie inputs to a power of 1/2 as shown below.

the upper bound of harmonic serie for some k

There are two 1/2s and four 1/4s and so on. So, the number of elements in the serie can be taken as the sum of power of 2s.

relation between n and k values

Adding 2 both sides and taking the logarithm results k=log(n+2) which is the upper bound of both harmonic serie and summation of 1/p values. This sets the upper bound of the complexity of the algorithm as nlog(n).

In the algorithm, all the numbers in the interval are taken and the composite ones are removed. It is something like taking a big piece of soil on a sieve and getting rid of the big pieces until some special sized ones left. This is the metaphor introduced by Eratosthenes and the algorithm that we analysed in this article is known as Sieve of Eratosthenes.

I hope you like the article, and thanks for reading.

--

--