Count number of coprimes to all numbers in array (in Python)

Andriy Lazorenko
3 min readMay 8, 2019

--

The code from the article is available on github:

https://github.com/AndriyLazorenko/medium_blog/blob/master/Coprimes.ipynb

Part 1. Problem setup

During one of tech interviews I was asked to solve the following challenge on HackerRank (nice platform, auto-completion IDE and auto-tests included):

Given an array of numbers, e.g. [5, 8, 14] , return an array of numbers representing counts of coprimes to given numbers.

E.g. for 5, its coprimes are:

1, 2, 3, 4

A simple test for coprime is if greatest common divisor (gcd) is equal to one.

import math
a = math.gcd(5,1)
# a = 1

So, for an input array of [5,8,14] we expect an output array of [4,4,6]. Let’s go step by step.

Coprimes of 5: 1,2,3,4, so counting them all we arrive at 4

Coprimes of 8: 1,3,5,7, so counting them all we arrive at 4

Coprimes of 14: 1,3,5,9,11,13, so counting them all we arrive at 6

Returning counts of coprimes for every number in input array we receive output array of [4,4,6]

Part 2. Solution, first attempt

My first attempt at a solution successfully passed 5/7 tests, failing last tests on pre-defined timeout, which points out to complexity issues (will be discussed below).

First, I decided to use implementation of gcd taken from math lib. I decided to iterate over all elements in range of given number and look for coprimes with the following test math.gcd(a,b) == 1 , collecting the numbers passing comprimes test in array and counting them with sum()

def coprimes(a):
return sum([math.gcd(a,el) == 1 for el in range(a)])

Having implemented coprimes count for a single number, it’s easy to scale it to an array using listcomp as follows:

def cnt_coprimes_in_array(arr):
return [coprimes(el) for el in arr]

However, the solution didn’t pass several speed tests. Let’s try to understand why.

Part 3. Solution (speeding up the algorithm)

So, we have two list comprehensions, which can be interpreted as two loops. The outer loop iterates over different input numbers in the array, while the inner loop iterates over all the integers up to a given one. For large integers the inner loop can be quite slow (even omitting the fact that gcd(a,b) is sort of a loop as well). So, let’s try to speed up the inner loop.

After some digging on the internet one can find a generalized formula of number of coprimes of a number.

If we factorize a number N into prime numbers as follows:

N = a^x*b^y*c^z*..., where a,b,c are primes

we can express the number of coprimes of N as follows:

coprimes = N * (1 - 1/a) * (1 - 1/b) * (1 - 1/c) * ...

So, let’s implement new method coprimes_2(a) and time both methods on a test case.

Constructing a test case

ten_thousand = int(1E4)
a =[np.random.randint(ten_thousand) for i in range(ten_thousand)]

Now, a new method:

import numpy as np
from sympy import primefactors
def coprimes_2(a):
p_f = primefactors(a)
ar = [(1-1/el) for el in p_f]
num_coprimes = int(np.prod(ar) * a)
return num_coprimes

We can test that the method yields the same result on a number 1504 as our working method:

assert coprimes_2(1504) == coprimes(1504)

We are satisfied. Let’s construct a method processing entire array with our new ‘inner loop’ method:

def cnt_coprimes_in_array_2(arr):
return [coprimes_2(el) for el in arr]

Let’s time both methods using jupyter notebook magic:

%%timeit
cnt_coprimes_in_array(a)
# 13.2 s ± 448 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

and the second one:

%%timeit
cnt_coprimes_in_array_2(a)
# 143 ms ± 22.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

We can clearly see a winner here. If only I was that smart while solving tech interview challenge :)

P.S.: this story is a part of my series on algorithmic challenges. Check out other cool algorithms decomposed with tests and jupyter notebooks!

--

--