Python Anagram Solver

Raghav Gurung
Python Anagram Solver
5 min readAug 8, 2019

A simple and efficient way to get all anagrams from a scrambled letter using Python.

In this article our main purpose is to get all the anagrams from a scrambled letter. The anagrams can be of all sizes varying from length of 1 to the maximum length of the scrambled letter input (i.e. a perfect anagram).

Different ways to approach this problem

There are numerous ways that this could be done. If you have searched on the net, you will find a number of ways that this could be done.

In most of the cases people use a naive approach of taking all the permutations (combinations) of the scrambled letters and then running every permutation through a text file containing eligible words (dictionary) and then appending it to a data structure. This concerns performance issue as the more the permutations the scrambled letter has the more loops it will make over the dictionary text file to find the words. Hence it takes a lot of time when you have bigger words which have a lot of possible combinations.

In some cases, you will only find code to check if a given word is an anagram of another given word. This code is mostly used in technical interviews and is simple but the only thing is you have to find a fast algorithm.

I another case you will find people converting the letters into hash values or prime numbers. This is also an efficient way but to a beginner this seems really messy.

A simple and faster approach

Here I will show you a way to get all the anagrams from a scrambled letter in the most simplest and fastest way you can imagine. Our approach will only loop once through the dictionary and get all the anagrams that are there in the dictionary you provided.

This will make it faster as no matter how many anagrams the scrambled letters have it will take almost the same amount of time to output all the words. Does not matter how many words it needs to take out. We will not use permutations at all and so no need to use the itertools modules as is suggested in most of the code you find on the internet.

The time complexity only depends on the size of the dictionary file you have provided as the code has to loop once through the entire dictionary to validate all the words.

We will be using two built-in modules in python3 — Sets and Counter.

Before we start coding I want you to refresh your memory on these two modules we’ll be using in our code.

Get Dictionary file

First we load in the text file containing all the words. You can get this file from anywhere just search on google. Or if you’re using MacOS like mine you can get the system dictionary file. Just copy the file to your current directory where your python file resides.

Open your terminal and type the following line:

sudo cp /usr/share/dict/words .

Once you have the file rename it to “words.txt” :

mv words words.txt

Now create a python file “anagram_solver.py” :

touch anagram_solver.py

Now open your program with your favourite IDE. I like to use SublimeText.

WRITING THE PROGRAM

Import the dependencies

We will use Counter from collections. Counter has a special ability to output a dictionary file containing keys as the unique elements in the list and values as the number of count for that element.

We will also be using the sys module and time module to test the the performance of the running program.

from collections import Counter
import sys
import time

Load the “words.txt” file:

with open('words.txt', 'r') as f:
dictionary = f.read()

Split the words and lowercase all of them:

dictionary = [x.lower() for x in dictionary.split('\n')]

Now we have a list with all the words in the English dictionary file. Check the number of words listed in your dictionary.

print(len(dictionary))
>> 235887

This number will be the only factor that will statistically effect the performance of the code i.e. size of the dictionary is proportional to the time taken by the program to execute.

Create the main function

We will now create a function “return_anagrams” which will take in any string (scrambled letters mostly) as a parameter and return a list containing all the anagrams.

def return_anagrams(letters: str) -> list:    # Lowercasing the input letters
letters = letters.lower()

Now, for each word in the dictionary file we take the unique letters in the word and see if all the letters in the word is contained the given scrambled input letters.

Add the following code to the “return_anagrams” function:

anagrams = set()
for word in dictionary:
if not set(word) - set(letters):
check_word = set()
for k, v in Counter(word).items():
if v <= letters_count[k]:
check_word.add(k)
if check_word == set(word):
anagrams.add(word)

Let me explain it to you step by step:

  • First we initialise an empty set. Remember set() does not allow duplicated values.
  • Next we iterate over each of the words in the dictionary.
  • Next, we check if all the letters in the word are also contained in the given scrambled letters input. It actually returns an empty set() if all the letters in the dictionary ‘word’ are in the scrambled string ‘letters’. If not then it returns a set containing the letters in the dictionary word that are not contained in the given scramble input ‘letters’.
  • Next we initialise another empy set ‘check_word’.
  • Now we iterate over key , value in the Counter dictionary type of ‘word’ and check if the count of those elements are less than or equal to the count of the same elements in the scrambled letter input. If they are we add that to the ‘check_word’ set.
  • Now, we check if the ‘check_word’ set is exactly equal to the ‘word’ set. If they are equal we add that word to the ‘anagram’ set.

Now we simply convert the set into a list and return the anagram list sorted by length.

return sorted(list(anagrams), key=lambda x: len(x))

To test your program, write the following code:

if __name__ == '__main__':
start = time.time()
test_anagrams = return_anagrams(sys.argv[1])
print(test_anagrams)
stop = time.time()
print(f"Number of anagrams: {len(test_anagrams)}")
print(f"Time Taken: {round(stop - start, 2)} seconds")

Here is the full code:

Save the file and open your terminal. Make sure you are in the same working directory as your python file.

Type the following in the terminal with your scrambled letters input instead of the curly braced command:

python3 anagram_solver.py {your scrambled letters}

Thank You!

--

--