Word Hunt: Cracking the Code

Abhay Khanna
10 min read · Jan 5, 2023

--

Word Hunt is a word-building game on the popular iOS app GamePigeon. In this article, I analyze the underlying cognitive principles driving the gameplay and explain how I built a full Word Hunt solver in C++. Enjoy!

Android users, I apologize.

In early 2017, an iOS app called GamePigeon found its way onto nearly every teenager's iPhone, seemingly overnight. The app features an impressive 25 multiplayer minigames fully integrated with the iPhone's native messaging app. Chances are you, the reader, have received an 8-Ball request at some point in your life. If you made it to Word Hunt as well, even better.

Word Hunt is essentially a fresh take on Boggle. Two players are given the same randomly generated 4x4 grid of letters along with a 90-second time limit. The objective is to make as many words as you can by sliding your finger across the touchscreen board, and the winner is the player with the most points at the end. Longer words are worth more points, so the game is a balance between making as many words as possible and keeping an eye out for those juicy big-word opportunities.

Over this past semester at Vanderbilt University, I completed two class projects about Word Hunt as a Cognitive Studies major. The projects had us apply a key cognitive principle to activities from our everyday lives, and they redefined the way I think about the game, or more accurately, the way I think about thinking about the game. I am also a Computer Science major, so I thought it would be fun to contrast how humans play Word Hunt with how machines do. I dusted off my projects and, using a recursive backtracking algorithm, built a full Word Hunt solver in C++. Here it is on GitHub.

First, let's walk through the solver.

To begin, I needed a dictionary file in order to check for valid words. I found the NASPA Word List (2019) online, and I downloaded it as dictionary.txt.

// Dictionary.txt
// NASPA Word List 2019

aa
aah
aahed
aahing
aahs
aal
aalii
aaliis
aals
aardvark
...

Next, I used a Trie data structure to store all of the words in dictionary.txt. A Trie is a tree data structure made up of nodes that each hold a letter, and it lets us efficiently check whether a given string is a prefix of any word in the dictionary. More on this later.
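
To make that concrete, here is a minimal sketch of the kind of Trie the solver relies on. The names TrieNode, insert, and loadDictionary are my illustrative assumptions, not necessarily the exact classes in the repository:

// Minimal Trie sketch (illustrative; names are assumptions, not the exact
// classes in the repository).
#include <fstream>
#include <string>

struct TrieNode {
    TrieNode* children[26] = {nullptr}; // one slot per lowercase letter
    bool isEndOfWord = false;           // marks the end of a complete word
};

// Inserts one lowercase word into the Trie rooted at 'root'.
void insert(TrieNode* root, const std::string& word){
    TrieNode* node = root;
    for (char c : word){
        int index = c - 'a';
        if (node->children[index] == nullptr){
            node->children[index] = new TrieNode();
        }
        node = node->children[index];
    }
    node->isEndOfWord = true;
}

// Loads every line of dictionary.txt into the Trie.
void loadDictionary(TrieNode* root, const std::string& path){
    std::ifstream file(path);
    std::string word;
    while (file >> word){
        insert(root, word);
    }
}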

I then ask the user for the letters of the game board they want to solve.

Welcome to the Word Hunt solver.

Enter four letters to make the top row (ex: 'abcd'):
regf
Enter four more letters to make the second row:
poes
Enter four more letters to make the third row:
aerd
Enter four more letters to make the bottom row:
afeo


Here is the game board you entered:

r e g f
p o e s
a e r d
a f e o

Lets Play!
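
Reading the board is simple enough. Here is a sketch of what that input step might look like; the prompts mirror the transcript above, but the exact I/O code in the repository may differ:

// Sketch of reading the 4x4 board from the user (illustrative).
#include <cctype>
#include <iostream>
#include <string>

int main(){
    char gameBoard[4][4];
    const char* prompts[4] = {
        "Enter four letters to make the top row (ex: 'abcd'):",
        "Enter four more letters to make the second row:",
        "Enter four more letters to make the third row:",
        "Enter four more letters to make the bottom row:"
    };

    for (int row = 0; row < 4; ++row){
        std::cout << prompts[row] << std::endl;
        std::string line;
        std::cin >> line; // expects exactly four letters per row
        for (int column = 0; column < 4; ++column){
            gameBoard[row][column] = static_cast<char>(std::tolower(line[column]));
        }
    }

    std::cout << "\nHere is the game board you entered:\n" << std::endl;
    for (int row = 0; row < 4; ++row){
        for (int column = 0; column < 4; ++column){
            std::cout << gameBoard[row][column] << " ";
        }
        std::cout << std::endl;
    }
    return 0;
}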

Now to the fun part. Since the program has the board in memory, the computer is tasked with finding as many words as possible. In case you have been living in the woods for the past 50 years, computers have gotten pretty powerful, and thanks to the Trie data structure, the computer is able to find, sort, and print every possible word in under a second. Here is the recursive solver function:

// Finds all possible 3+ letter words on the board with the given tile as
// the starting tile, adds all those words to the answers vector.

// pre: gameBoard array is passed, a bool array (visited) is also passed
// to keep track of visited letters, tile is specified with row and column

// post: all possible 3+ letter words on the board with the given tile
// as the starting tile are in the answers vector, visited[][]
// is still false for all values

void Solver::recurse(char gameBoard[4][4], std::string word,
                     int row, int column, bool visited[4][4]){

    // The game board is stored in the 2D array called gameBoard.

    // Check bounds first, then whether this tile has already been used
    // (reading visited[][] out of bounds would be undefined behavior)
    if (column < 0 || column > 3 || row < 0 || row > 3 ||
        visited[row][column]){
        return;
    }

    // Check if the current string of letters can still become a word.
    // isPrefix also returns true when the string is already a valid word.
    if (dictionary.isPrefix(word)){

        char letter = gameBoard[row][column];

        // Ex: "b" now becomes "ba"
        word += letter;

        visited[row][column] = true;

        // Try to build the rest of the word with all of the bordering tiles
        recurse(gameBoard, word, row + 1, column + 1, visited);
        recurse(gameBoard, word, row + 1, column,     visited);
        recurse(gameBoard, word, row + 1, column - 1, visited);
        recurse(gameBoard, word, row,     column + 1, visited);
        recurse(gameBoard, word, row,     column - 1, visited);
        recurse(gameBoard, word, row - 1, column + 1, visited);
        recurse(gameBoard, word, row - 1, column,     visited);
        recurse(gameBoard, word, row - 1, column - 1, visited);

        visited[row][column] = false;

        // There may be more than one way to create the same word,
        // and we wouldn't want to record the same word twice.
        // checkDuplicates checks for that in O(1) lookup speed.
        // Word Hunt only scores words of 3+ letters.
        if (word.length() >= 3 && dictionary.isWord(word) &&
            !checkDuplicates[word]){
            checkDuplicates[word] = true; // new map entry
            ++numWords;
            answers.push_back(word);
        }

    } else {
        return;
    }
}
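
To cover the whole board, the solver just needs to launch this recursion from each of the 16 tiles and then sort whatever it collects. Here is a minimal sketch of what that driver might look like; the member name solve and the sort-by-length comparison are my assumptions, not necessarily the exact code in the repository:

// Sketch of the driver that starts the recursion from every tile
// (illustrative; member names in the repository may differ).
#include <algorithm>
#include <iostream>
#include <string>

void Solver::solve(char gameBoard[4][4]){
    bool visited[4][4] = {{false}};

    // Start a search from each of the 16 tiles
    for (int row = 0; row < 4; ++row){
        for (int column = 0; column < 4; ++column){
            recurse(gameBoard, "", row, column, visited);
        }
    }

    // Longest (highest-scoring) words first
    std::sort(answers.begin(), answers.end(),
              [](const std::string& a, const std::string& b){
                  return a.length() > b.length();
              });

    for (const std::string& word : answers){
        std::cout << word << std::endl;
    }
}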

The solver works by tracing every possible combination of letters starting from a given tile. Let's walk through an example board.

The program will start with a given tile; let's say that tile is 'M' on the top row. The program will grab the bordering 'T', 'R', 'L', 'I', and 'A' and append each one to the 'M' to make 'MT', 'MR', etc. It will then perform a recursive call and append the new letters bordering the letter just grabbed, like 'MTI', 'MTU', etc. If you are a programmer, you might notice a problem here: the redundancy of calling the function again to keep building off of 'MT'. There are no words that start with 'MT'! Everyone knows that. Without a check, this strategy would explore hundreds, if not thousands, of candidate words starting with 'MT', with predictably no results and a significant cost in processing time and energy.

// Returns true if pre is a prefix in the sub-Trie starting at the given
// TrieNode, else returns false
// pre: TrieNode and Trie exist, string exists
// post: returns true if the string is a prefix in the subTrie, false
// otherwise
bool TrieNode::isPrefix(const std::string& pre) const{

    // base case: the empty string is a prefix of everything
    if (pre == ""){
        return true;
    }

    // Just in case the letter is not lowercase already
    char frontLetter = tolower(pre[0]);

    // Anything that isn't a letter (digits, symbols) can't be a prefix
    if (!isalpha(frontLetter)){
        return false;
    }

    if (children[frontLetter - 'a'] == nullptr){
        return false;
    } else {
        return children[frontLetter - 'a']->isPrefix(pre.substr(1));
    }
}

The function above is what the recursive solver calls for exactly this reason. A prefix checker is a key feature of a Trie: it accepts a string and returns a Boolean indicating whether the string is a prefix of some word in the Trie. The recursive function walks the string character by character, making sure each letter's node is a child of the node for the letter before it.

Source: cs.emory.edu

In the Trie representation above, a total of eight words have been entered (compared to the 172,820 words in our dictionary Trie). Given 'bell', the isPrefix function checks that each letter is a child of the one before it. The string 'bell' would return true, and so would 'st' and 'sel'. The function lets us quickly rule out obvious non-prefixes like 'bt', which would immediately return false.
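
The solver also relies on dictionary.isWord(word) to decide when a traced string is a complete word rather than just a prefix. Here is a minimal sketch of that companion check, assuming each TrieNode carries an end-of-word flag (a detail not shown in the code above):

// Sketch of the companion word check (illustrative; assumes each TrieNode
// stores an isEndOfWord flag set when a complete word is inserted).
#include <cctype>
#include <string>

bool TrieNode::isWord(const std::string& word) const{

    // base case: the whole string has been consumed, so this node decides
    if (word == ""){
        return isEndOfWord; // assumed member flag
    }

    char frontLetter = tolower(word[0]);

    if (!isalpha(frontLetter) || children[frontLetter - 'a'] == nullptr){
        return false;
    }

    return children[frontLetter - 'a']->isWord(word.substr(1));
}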

You can easily come up with even more obvious non-prefixes. Is 'rwq' a prefix? Is 'a@jk6#' a prefix? Obviously not, but how do we know that? It seems like we are able to apply the concept of a prefix to this abstract series of squiggles we call language and act like our own Trie. So that means… we have a Trie in our brain? Yes! Pretty cool, right?

This raises the question… who put it there? It turns out the human brain can create flexible data structures without even understanding what the structure is. Take the reverse Trie! If I asked you whether 'ead' is a valid suffix, you would say yes fairly quickly, and in the process you built a structure in which each letter must be a child of the one after it. Confused? That just proves the point even further: you knew 'ead' was a valid suffix without understanding the underlying data structure that gave you that knowledge.
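
If you want to see the reverse Trie as actual code rather than a brain process, here is a minimal sketch that reuses the TrieNode and insert sketch from earlier: store every word reversed, and a suffix check becomes an ordinary prefix check. The names insertReversed and isSuffix are illustrative assumptions:

// Sketch of a reverse Trie for suffix checks (illustrative; names are
// assumptions, and it reuses the earlier TrieNode/insert sketch).
#include <cctype>
#include <string>

// Insert every dictionary word reversed, e.g. "head" is stored as "daeh".
void insertReversed(TrieNode* reverseRoot, const std::string& word){
    std::string reversed(word.rbegin(), word.rend());
    insert(reverseRoot, reversed);
}

// "ead" is a valid suffix of some word if, reading it right to left,
// each letter is a child of the one after it in the reverse Trie.
bool isSuffix(const TrieNode* reverseRoot, const std::string& suffix){
    const TrieNode* node = reverseRoot;
    for (auto it = suffix.rbegin(); it != suffix.rend(); ++it){
        char letter = static_cast<char>(std::tolower(*it));
        if (!std::isalpha(letter) || node->children[letter - 'a'] == nullptr){
            return false;
        }
        node = node->children[letter - 'a'];
    }
    return true;
}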

So with our ability to conjure up flexible data structures, how do we play Word Hunt? As someone who has played almost 2500 games, I can say that while a computer may have the advantage in speed, the way humans play is a *lot* more interesting.

It took my brain a while to get the hang of this game. At first, I only saw the board as 16 random letters, and I patiently stared at it until a word popped out at me. As I played more and more, I underwent what the cognitive science community terms "perceptual learning." The basic definition of this phenomenon is that people get better at extracting relevant perceptual information from the environment as a result of experience or practice. Moreover, we become more attuned to relevant features, particularly structural relations, that define important classifications within the domain of expertise (Kellman & Garrigan, 2009). In other words, the more we are exposed to an environment, the better we get at perceiving its important patterns and structures.

Let's take football defenses as an example.

Source: NFL Game Rewind; Broncos (orange) vs Seahawks (blue) 2013; SB XLVIII

Legendary Broncos/Colts quarterback Peyton Manning still holds the record for the most passing yards in a single NFL season. His ability to "read" defenses was unmatched thanks to his high level of perceptual learning. What looks to us like 11 blue-uniformed men is actually a well-thought-out defensive scheme full of balanced rotations and complex structures, which Peyton could dissect and understand because of his practice and experience. Peyton Manning exemplified a key aspect of perceptual learning: being able to "see" more than a novice would in the same environment. This allowed him to better exploit weaknesses in a defense, ultimately resulting in a superstar career.

In the same way, I grew from seeing Word Hunt as a grid of 16 letters, like a novice would, to seeing Word Hunt as a series of patterns, opportunity zones, and word fragments. Both Peyton Manning and I used perceptual learning to perform better at perceptual tasks within our domain of expertise… yet somehow he is the only one appearing in Nationwide commercials. Hm.

Hot zones (red) and Cold zones (blue)

With Word Hunt, I dubbed these key structural relations "hot zones" and "cold zones." Hot zones are areas of the game board likely to yield better words thanks to features such as common word fragments, more vowels, and more common letters. Cold zones are the opposite, and they often look like a jumble of unpopular consonants with very few vowels.
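
As a rough illustration of how a program could quantify the same "hot zone" intuition, here is a small sketch that scores each tile with an approximate English letter-frequency table. The rounded frequency values and the printHeatScores name are my assumptions; the actual solver does not do this:

// Sketch of scoring tiles by approximate letter frequency (illustrative;
// the solver itself does not use this, and the table is rounded).
#include <iostream>

// Approximate relative frequencies (percent) of 'a' through 'z' in English text
const double letterFrequency[26] = {
    8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.77, 4.0, 2.4,
    6.7, 7.5, 1.9, 0.095, 6.0, 6.3, 9.1, 2.8, 0.98, 2.4, 0.15, 2.0, 0.074
};

// Prints a simple "heat" score for each tile on a lowercase board:
// hotter tiles hold letters that appear more often in English.
void printHeatScores(char gameBoard[4][4]){
    for (int row = 0; row < 4; ++row){
        for (int column = 0; column < 4; ++column){
            std::cout << letterFrequency[gameBoard[row][column] - 'a'] << "\t";
        }
        std::cout << std::endl;
    }
}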

These zones then give way to the patterns that truly separate the good players from the rest of the pack. When I look at the board, the 'S' jumps out at me, and with it come the suffixes 'ANS', 'ATS', and 'GS'. I quickly trace my finger over the hot zones, forming word after word. Eventually, my finger finds words like 'GRANTS', 'GIANTS', and 'STRAND', all mostly contained within that zone. Not me. My finger. The way to play this game is to never stop moving your hands. In some games, I am able to reach 70 words in 90 seconds. There is no time to think, only swipe.

It is no surprise that the hot zones I identify contain many of the most common letters in English. The most common letters in the English language are e, t, a, i, o, n, s, h, and r, meaning half of them are present in our hot zone. I only looked that up once I started writing this article, yet my brain was attuned to these letters anyway… So. How did I correctly identify the hot zones?

AI-Generated Image

It seems like the human brain can create data structures subconsciously. While engaging in the perceptual learning process for Word Hunt, I created a Trie, a reverse Trie, and a rough list of the most common letters in the English language, all without knowing I was creating them. I used these data structures fluidly as my fingers flew across the board.

In contrast, I gave my solver only one data structure: the Trie. It diligently found every possible word by tracing candidates systematically: no bias, no zones. The solver worked, but the strategy was rather mundane; more efficient strategies exist, but they require more data structures. Why can't computers create those structures at runtime the way humans do? Doing so could yield solutions that are both ultra-fast and ultra-efficient; the possibilities are endless.

In my first semester studying Cognitive Studies and Computer Science, I have organically stumbled across the foundational motives behind AI research. The recent release of groundbreaking AI tools such as ChatGPT and DALL-E 2 has caused a buzz around the topic. One of these tools generated the image above from a simple sentence prompt, and in fact, this entire article was written by ChatGPT (kidding, though we are not far off). I hope to explore AI technology more in the future.

I hope this article was insightful, beyond just proving that I play too much Word Hunt. I have certainly enjoyed diving deeper into Cognitive Studies and Computer Science, though I know I am only scratching the surface. Feel free to review the full solver on GitHub and use it to impress your friends. Happy swiping!
