The Sieve of Eratosthenes: Ancient Algorithm for Prime Numbers

9 min readJun 3, 2023

Sieve of Erathosthenes after all prime and the multiples are marked (source: Visnos simulation)

In the realm of mathematics, prime numbers have captivated the curiosity of scholars and mathematicians for centuries. These unique integers, divisible only by 1 and themselves, possess an inherent elegance that continues to enthrall researchers to this day. Among the numerous methods devised to identify prime numbers, the Sieve of Eratosthenes stands as a remarkable algorithm that showcases both simplicity and effectiveness. Named after the ancient Greek mathematician Eratosthenes, this technique enables the identification of all prime numbers up to a given limit. Let us embark on a journey to unravel the secrets of this ancient algorithm and explore its timeless significance.

Eratosthenes, a scholar from ancient Alexandria, developed the Sieve of Eratosthenes around 240 BCE. The technique is based on the fundamental principle that all non-prime numbers can be expressed as a product of prime factors. Eratosthenes realized that by systematically eliminating multiples of prime numbers, he could sieve out all non-prime numbers, leaving behind only the primes.

Algorithmic Steps

1. Initialization: Begin by creating a list of numbers from 2 to the desired limit. Initially, assume all numbers are prime.
2. Marking Multiples: Starting with the first prime number, 2, mark all its multiples as non-prime. Move on to the next unmarked number and repeat the process until all multiples of primes up to the square root of the limit are marked.
3. Identifying Primes: The remaining unmarked numbers in the list are prime.

Sieve of Eratosthenes animation (source: Wikipedia)

Let’s understand the steps with an example:

To find all prime numbers up to 30, you can use the Sieve of Eratosthenes algorithm. Here’s how:

1. Create a list of numbers from 2 to 30.

2. Start with the first prime, which is 2, and mark its multiples: 4, 6, 8… and so on until you get to 30.

3. Move on to the next unmarked number (in this case it’s 3), and mark its multiples: 6,9,… up till 30.

4. Repeat step 3 for all remaining unmarked numbers in your list — that gives you your primes! Using this property of prime numbers greatly reduces computational time over other methods that rely on brute force calculations or trial division.

Intuition (Why It Works)

The intuition behind the algorithm can be understood in the following steps:

1. Initialization: Initially, all numbers from 2 to the given limit are assumed to be prime. This is represented by setting their corresponding values in a boolean array to True.

2. Marking Multiples: Starting with the first prime number, which is 2, we mark all its multiples as non-prime. By doing so, we effectively eliminate numbers that are divisible by 2 (other than 2 itself) and, therefore, cannot be prime. Similarly, we proceed to the next unmarked number, which is a prime, and mark all its multiples as non-prime. This process continues until we reach the square root of the given limit.

The rationale behind this step is that any composite number (a number that is not prime) can be expressed as a product of prime factors. If a number is divisible by a prime number, it means it has that prime number as a factor, making it composite.

By marking the multiples of each prime number, we eliminate the non-prime numbers efficiently. For example, when we mark the multiples of 2, we remove all even numbers (except 2) from consideration. When we mark the multiples of 3, we eliminate numbers divisible by 3 (except 3 itself).

This process continues with subsequent prime numbers, progressively eliminating multiples of primes and reducing the search space for prime numbers.

3. Identifying Primes: After marking all multiples, the remaining unmarked numbers in the array are prime numbers. This is because any number that has not been marked as non-prime must have survived the marking process. Hence, it does not have any smaller prime factors and, by definition, is a prime number.

By systematically marking multiples of prime numbers and eliminating non-prime numbers, the Sieve of Eratosthenes algorithm efficiently identifies all prime numbers up to the given limit.

The algorithm’s intuition lies in leveraging the fact that prime numbers are the building blocks of all numbers. By iteratively sieving out composite numbers, we reveal the prime numbers, which form the foundation of number theory and have significant applications in various fields of mathematics and computer science.

Implementation in Python

Here’s a simple implementation of the Sieve of Eratosthenes algorithm in Python:

def sieve_of_eratosthenes(n):
 # Create a boolean array "prime[0..n]" and initialize
 # all entries it as true.
 prime = [True for _ in range(n + 1)]
 p = 2
 while p * p <= n:
 # If prime[p] is not changed, then it is a prime
   if prime[p] is True:
   # Update all multiples of p
     for i in range(p * p, n + 1, p):
       prime[i] = False
   p += 1
 
 # Print all prime numbers
 for p in range(2, n + 1):
 if prime[p]:
 print(p, end=" ")
# Example usage:
sieve_of_eratosthenes(30)

In this implementation, we first create a boolean array called `prime`, where `prime[i]` represents whether `i` is a prime number. Initially, we assume all numbers are prime by setting their corresponding array values to `True`.

Then, we start with the smallest prime number, 2, and iterate through the array. For each prime number `p`, we mark all its multiples as non-prime by setting their corresponding array values to `False`. We continue this process until we reach the square root of `n`, as multiples larger than the square root would have already been marked by smaller primes.

Finally, we print all the numbers that are marked as prime.

The example usage in the code will generate and print all prime numbers up to 30 using the Sieve of Eratosthenes algorithm.

Why start from p*p

for i in range(p * p, n + 1, p):
   prime[i] = False

In this section of the code, we iterate over the multiples of the current prime number p. We start from p * p because any smaller multiple of p would have already been marked as non-prime by previous iterations. We increment i by p in each iteration to cover all multiples of p up to n. If prime[i] is currently marked as True, indicating it is a prime number, we update it to False to mark it as non-prime.

The statement “We start from p * p because any smaller multiple of p would have already been marked as non-prime by previous iterations” is true in the context of the Sieve of Eratosthenes algorithm.

In the algorithm, we start with the smallest prime number, which is 2, and mark all its multiples as non-prime. When we move to the next unmarked number, which is a prime number, we want to eliminate its multiples from consideration as well.

Consider a prime number p and one of its multiples, m, where m is greater than p. In earlier iterations of the algorithm, when we encountered the prime number p', where p' is less than p, we would have already marked the multiple m as non-prime. This is because m can be expressed as m = p' * k, where k is an integer.

If we had encountered p' before p, we would have already marked m as non-prime during the iteration for p'. Therefore, when we reach the iteration for p, we can safely skip marking the multiples of p that are less than p * p. Marking these multiples again would be redundant, as they have already been marked by previous iterations.

Starting from p * p ensures that we only consider multiples of p that are greater than or equal to p * p. Any smaller multiple of p would have already been marked as non-prime by previous iterations, so we can safely skip them. This optimization significantly reduces the number of iterations needed and improves the efficiency of the Sieve of Eratosthenes algorithm.

Time Complexity Analysis

Let’s break down the time complexity analysis of the algorithm:

1. Initialization: Creating the boolean array of size n takes O(n) time.

2. Marking Multiples: For each prime number p from 2 to sqrt(n), we mark its multiples as non-prime. The number of iterations required for this step is approximately sqrt(n). For each iteration, we mark (n/p) numbers as non-prime. Therefore, the overall time complexity of this step is O(n/2 + n/3 + n/5 + …), which can be simplified to O(n log log n) using the harmonic series approximation.

3. Identifying Primes: Printing or storing the prime numbers takes O(n) time in the worst case, as there can be up to n/ln(n) prime numbers within the range from 2 to n.

Overall, the dominant time complexity comes from the marking multiples step, which is O(n log log n).

It is important to note that the Sieve of Eratosthenes is highly efficient compared to other methods for finding prime numbers. Its time complexity makes it suitable for generating prime numbers within large ranges efficiently.

Harmonic Series

The harmonic series approximation is a useful tool for estimating the sum of the reciprocals of natural numbers. It is often used to simplify the time complexity analysis of algorithms like the Sieve of Eratosthenes.

To derive the harmonic series approximation, we start with the definition of the harmonic series:

H(n) = 1 + 1/2 + 1/3 + 1/4 + … + 1/n

We can observe that the sum of the harmonic series can be approximated by integrating the function f(x) = 1/x over the interval [1, n+1]:

∫(1 to n+1) 1/x dx ≈ ∫(1 to n+1) dx/x

Evaluating the integral gives:

ln(n+1) — ln(1) ≈ ln(n+1)

So, we have:

H(n) ≈ ln(n+1)

This approximation suggests that the sum of the harmonic series H(n) can be approximated by the natural logarithm of (n+1).

Now, let’s consider the number of iterations required in the marking multiples step of the Sieve of Eratosthenes algorithm. We have:

Iterations ≈ (n/2) + (n/3) + (n/5) + …

Using the harmonic series approximation, we can rewrite this as:

Iterations ≈ n * (1/2 + 1/3 + 1/5 + …)

And further simplify it to:

Iterations ≈ n * H(p)

Where p represents the largest prime number less than or equal to the square root of n.

Since the number of iterations dominates the time complexity of the algorithm, we can say that the time complexity of the Sieve of Eratosthenes is approximately O(n * H(p)), which can be further approximated to O(n log log n) using the harmonic series approximation of H(p) as ln(ln(n)).

Therefore, the time complexity of the Sieve of Eratosthenes algorithm is often stated as O(n log log n) for large values of n.

The harmonic series approximation, ln(n+1), used in the time complexity analysis of the Sieve of Eratosthenes, provides a reasonable estimate of the actual number of iterations required, despite not being an exact representation of the series. The approximation grows at a slower rate than the actual harmonic series, allowing it to serve as an upper bound. Although not precise, it becomes increasingly accurate as the input size grows. This approximation simplifies the analysis and provides insight into the algorithm’s efficiency. By capturing the growth rate well, it indicates that the time complexity of the Sieve of Eratosthenes is O(n log log n). Thus, while not exact, the harmonic series approximation is a valuable technique for estimating time complexity without delving into the precise summation of the series.

Significance and Applications

The Sieve of Eratosthenes may have originated over two millennia ago, but its significance persists in modern mathematics and computer science. It serves as an essential algorithm for prime number generation and plays a crucial role in various domains, including cryptography, number theory, and data security.

1. Prime Number Generation: The sieve provides a fast and efficient method to generate prime numbers within a given range, aiding in various computational tasks that involve prime numbers.

Prime number illustration (source: Wikipedia)

2. Cryptography: Prime numbers lie at the heart of modern cryptographic systems. The Sieve of Eratosthenes assists in generating large prime numbers necessary for encryption and security protocols.

3. Prime Factorization: The algorithm serves as a foundational tool for prime factorization, a key component in solving complex mathematical problems and cryptography algorithms.

4. Mathematical Research: The Sieve of Eratosthenes continues to inspire researchers in the field of number theory. It offers insights into the distribution and properties of prime numbers.

Conclusion

The Sieve of Eratosthenes, a mathematical algorithm conceived centuries ago, stands as a testament to the enduring power of human intellect. Eratosthenes’ ingenious method of eliminating multiples to unveil prime numbers continues to shape the way we understand and work with these extraordinary integers. From its humble origins in ancient Alexandria to its continued applications in modern cryptography and computational research, the sieve showcases the remarkable marriage of simplicity and efficiency. As we explore further into the realm of numbers, the Sieve of Eratosthenes remains an invaluable tool, shedding light on the timeless beauty of prime numbers.