Sherlock and Anagrams Solution

carlosbf
2 min readAug 28, 2018

This is one of the medium difficulty problems in the Dictionaries and Hashmaps section of hackerrank’s interview preparation kit problem set. Link here.

The problem states that given a string s you need to count the number of anagrammatic substring pairs contained in s. For example let s = “cdcd”, then we have 3 anagrammatic substrings c, d and cd occurring 2, 2 and 3 times respectively. The number of anagrammatic substring pairs is 5. Consisting of [c, c], [d,d], [cd, cd], [dc, cd], [cd, dc].

Solution

The strategy I used to solve this problem is to break it down into 2 parts.

First step

First counting all occurrences anagrammatic substrings, there are
(n *(n-1)/2) -1 substrings in any string of length n, we can use 3 for loops to get the substrings of all lengths. Now to count them we can assign a unique key to each substring such that the same value is returned for any anagrammatic string. We can achieve that by assigning a prime integer to each ascii character and multiplying the values. The result modulo some large prime integer will be our unique key. For each occurrence of this key we add one in an unordered map. After this we have an unordered map with the number of occurrences of all substrings.

Second step

Second we can iterate in this unordered map and count the number of pairs that can be formed with the anagrammatic strings that occur at least twice. We can count the pairs of k anagrammatic occurrences of a string with the formula:
k*(k-1)/2 . We do this for all entries with at least 2 occurrences and add the result.

Code

LINK TO FULL SOLUTION ON GITHUB

int sherlockAndAnagrams(string s) {

int n;
long key;
int result;
const char* s_chars;
unordered_map<long,int> map;

result = 0;
n = s.size();
init_primes();
s_chars = s.c_str();

for (int len = 1; len <= n; len++)
{
for (int i = 0; i <= n - len; i++)
{
int j = i + len - 1;
key = 1;
for (int k = i; k <= j; k++) {
key = (primes_map[ s_chars[k] ]*key) %1000000007;
}
map[key]+=1;
}
}


for(unordered_map<long,int>::iterator it = map.begin(); it !=map.end(); ++it){
int val = it->second;
if(val >= 2){
result += (val*(val-1))/2;
}
}

return result;

}
void init_primes(){
primes_map['a']= 2;
primes_map['b']= 3;
primes_map['c']= 5;
primes_map['d']= 7;
primes_map['e']= 11;
primes_map['f']= 13;
primes_map['g']= 17;
primes_map['h']= 19;
primes_map['i']= 23;
primes_map['j']= 29;
primes_map['k']= 31;
primes_map['l']= 37;
primes_map['m']= 41;
primes_map['n']= 43;
primes_map['o']= 47;
primes_map['p']= 53;
primes_map['q']= 59;
primes_map['r']= 61;
primes_map['s']= 67;
primes_map['t']= 71;
primes_map['u']= 73;
primes_map['v']= 79;
primes_map['w']= 83;
primes_map['x']= 89;
primes_map['y']= 97;
primes_map['z']= 101;
}

--

--

carlosbf

Software developer that publishes his interview prep on this blog