Spellcheck in Python

Michael Dendinger
Published in Analytics Vidhya
4 min read · Jul 31, 2020

Recently, I was working on a project as a mental test. I was looking at the New York Times Spelling Bee game and I thought to myself, “I bet it would be really easy to make a Python script that could give me all the words that would work within the Spelling Bee game.” Now, I know, writing code to beat this game defeats the point of the mental exercise, but I thought I would try anyway.

Spelling Bee Layout

For those who are unfamiliar with Spelling Bee, you are given seven letters to make words with. In the center of the list, you have one character that must be used in every word. Each word needs to be a minimum of four characters long.
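The rules above can be written down as a small check. This is only a sketch: the function name `is_valid_entry` and the seven letters are hypothetical, chosen for illustration, and the function tests the puzzle rules only, not whether a string is a real word.

```python
# Hypothetical puzzle: seven distinct letters, with "t" as the center letter.
LETTERS = "toaline"
CENTER = "t"


def is_valid_entry(word, letters=LETTERS, center=CENTER, min_length=4):
    """Return True if `word` satisfies the Spelling Bee rules described above."""
    word = word.lower()
    return (
        len(word) >= min_length          # at least four characters
        and center in word               # the center letter is mandatory
        and set(word) <= set(letters)    # only the seven given letters are used
    )


for attempt in ["total", "lane", "tan", "totem"]:
    print(attempt, is_valid_entry(attempt))
```

Letters may be reused (as in "total"), which is why the check compares sets of letters rather than counting them.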

It is a simple concept, but I thought I could speed up the process. I figured I could generate every arrangement of the characters and then compare them against some kind of spellchecker to verify that they are actually words. From there I would be able to quickly input the words and call it a day.

Creating the iteration portion wasn’t very difficult. I did it in an admittedly inefficient way for my mental exercise and just did nested loops for the words.
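A minimal sketch of what the nested-loop approach might look like for four-character strings. The original script is not reproduced here, so the seven letters and the variable names are hypothetical.

```python
# Hypothetical seven letters for the puzzle.
letters = "toaline"

# One loop per character position; every letter may appear in every position.
four_letter = []
for a in letters:
    for b in letters:
        for c in letters:
            for d in letters:
                four_letter.append(a + b + c + d)

print(len(four_letter))  # 7 ** 4 = 2401
```

Five-character strings need a fifth nested loop, which is why this approach gets unwieldy quickly.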

Now, like I said, this is a pretty over-coded way to do this. I did try a couple of iteration tools first, such as importing the combinations function from the itertools library. However, that gave me far fewer results than I needed. For example, I was only getting around 300 combinations with the combinations function, presumably because it ignores ordering and never reuses a letter, while Spelling Bee words can do both. When I used the loops for the four-character combinations, I got 2,401 (7^4). The five-character loops gave 16,807 (7^5).
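To see why the counts differ so much, here is a small comparison of three `itertools` functions on a hypothetical set of seven letters. It is a sketch, not the original script, and the exact figure obtained from `combinations` depends on how it was called, which the article does not show. `product` with `repeat` is the standard-library equivalent of the nested loops.

```python
from itertools import combinations, permutations, product

# Hypothetical seven letters for the puzzle.
letters = "toaline"

counts = {
    # Order ignored, no letter reused.
    "combinations, length 4": len(list(combinations(letters, 4))),
    # Order matters, no letter reused.
    "permutations, length 4": len(list(permutations(letters, 4))),
    # Order matters, letters may repeat: same as four nested loops.
    "product, length 4": len(list(product(letters, repeat=4))),
    # Same as five nested loops.
    "product, length 5": len(list(product(letters, repeat=5))),
}

for label, count in counts.items():
    print(label, count)
```

Only `product` matches the 2,401 and 16,807 figures, because it is the only one of the three that allows a letter to be used more than once.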

Now, once I had my lists of combinations as a part of my test, I needed to find which combinations are actually words. For computational efficiency, though, I first removed all combinations without the necessary letter, which in my test case was ‘T’. This cut the list down from roughly 19,000 to around 14,000.
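The filtering step can be sketched in a couple of lines. The seven letters are hypothetical, and the candidate list is rebuilt here with `itertools.product` so the snippet runs on its own; the counts it prints are for this sketch's own four- and five-character lists.

```python
from itertools import product

letters = "toaline"   # hypothetical seven letters
required = "t"        # the mandatory center letter

# All four- and five-character strings over the seven letters.
candidates = [
    "".join(chars)
    for length in (4, 5)
    for chars in product(letters, repeat=length)
]

# Keep only the strings that contain the required letter.
with_required = [word for word in candidates if required in word]

print(len(candidates), len(with_required))
```

Doing this before any spellcheck or API call means fewer lookups later, which matters once requests are rate limited.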

Now, to the crux of this discussion: how to find which combinations are actual words! My first thought was, “Hey, I bet there is a spell check Python library.” I was right and wrong. First, I found pyspellchecker, and I liked the idea. However, it didn’t exactly give me what I was looking for. It tells me what the word could be, but as far as I could find it doesn’t give me a boolean True or False for being spelled correctly, and neither does any other Python library I came across. So, my logical test was to see whether the suggested word from pyspellchecker was equal to the word I fed in, and I got a mess.
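The comparison described above can be sketched as follows. This is a sketch under assumptions: `suggest` stands for whatever function returns a checker's best guess (with pyspellchecker that would be the `correction` method of a `SpellChecker` instance), while `toy_suggest` and its tiny vocabulary are hypothetical stand-ins built on the standard library so the snippet runs without installing anything.

```python
import difflib

# Hypothetical mini vocabulary, for illustration only.
VOCAB = ["that", "tote", "total", "hotel"]


def toy_suggest(word):
    """Stand-in for a spellchecker's best guess.

    Returns the closest vocabulary entry, or the input unchanged
    when nothing in the vocabulary is close enough.
    """
    matches = difflib.get_close_matches(word, VOCAB, n=1, cutoff=0.6)
    return matches[0] if matches else word


def looks_correct(word, suggest):
    """Treat a word as correctly spelled when the suggestion equals the input."""
    return suggest(word) == word


for candidate in ["tote", "totl", "tqqq"]:
    print(candidate, toy_suggest(candidate), looks_correct(candidate, toy_suggest))
```

With this stand-in, a near miss such as "totl" is rejected, but a jumble with no close match is handed back unchanged and so passes the equality test. Whether a real library behaves the same way for jumbled input depends on the library and its version, but it shows how fragile the comparison is as a True/False check.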

Boy, was this a mess for my work. If the combos are close to a word, the spellchecker does a pretty good job, but if you just feed it a jumble you do not get a coherent answer. Sifting through this was a huge headache, so I thought to look to the actual ambassadors of the written word: Merriam-Webster’s dictionary. Luckily, Merriam-Webster has an API! However, the free version of the API only allows 1,000 requests a day, and that just won’t do for my test. The Oxford dictionary API has a similar pricing structure.
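A quick back-of-the-envelope calculation shows why the daily cap is a problem. The figures are the rough ones quoted in this article (about 14,000 candidates, 1,000 requests per day), and `days_needed` is a hypothetical helper name.

```python
import math


def days_needed(candidates, daily_limit):
    """Number of days required if each candidate costs one API request."""
    return math.ceil(candidates / daily_limit)


print(days_needed(14_000, 1_000))  # 14
```

Two weeks of waiting for a single day's puzzle is clearly not practical on the free tier.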

I would like to advertise Postman real quick! I love using it to get API responses, and if you haven’t used it but get frustrated with API requests, you have to start now! (I am not paid by them by any means, but they are great!)

Now, when I used Postman and got my request scripted out for me, I got these results from Merriam-Webster:

This is better, but rare words like “hight” (which means “being named” and doesn’t use the necessary letters) still show up and muddy the exercise I am going through.

So, in my experience, if you are trying to do simple spelling corrections, feel free to use pyspellchecker, but if you are trying to find coherent results among thousands of combinations, you may have to pay for an API, and even that may not give you the result you want.

Michael Dendinger

M.A. International Relations/M.S. Data Analytics. Certified in Data Analytics, GIS, and Humanitarian Assistance. Returned Peace Corps Volunteer.