Dominating a Skill-Based Casino Game With Deep Learning

Evan Kolberg
17 min read · Jun 22, 2024


Using an OpenCLIP Vision Transformer Model, Color Quantization, Multiprocessing, and Other Miscellaneous Algorithms.

GitHub Repo

Eye Candy Attention Grabber

In today’s world, applications for machine learning algorithms are practically inexhaustible. No, really. Approximately 99% of Fortune 500 companies use AI to expedite operations, reduce costs, and improve decision-making. Globally, it is being used to flip burgers, tell fortunes, write songs, create images, and hold back-and-forth conversations, not to mention save lives in the medical and emergency response sectors. Try to name something to which AI cannot be applied.

Since this revolutionary technology is all around us, chances are you unwittingly give feedback to some company’s algorithm every day. The actions you perform, the pages you view, and the products you buy are all data fed to large companies and used to generate more profit.

Who's to say we can’t use machine learning to make money?

For a while, I had been searching for ways to monetize my programming and technological skills. I have real talent and felt I should channel my energy toward original, profitable projects rather than cliché, tutorial-backed guides. Yet, no matter how hard I looked, every niche seemed to be occupied by either large companies or small startups. As a lone programmer still in high school, I stood no chance.

Nevertheless, on the last day of my junior year, it finally came to me. While playing a game called Blokus with a few friends, one of them mentioned that it reminded him of Block Trail, a game he plays on FanDuel Faceoff. Before then, I did not know that FanDuel offered games; I thought they focused solely on sports betting. Curious, I Googled the games FanDuel offers in its casino.

Source: FanDuel

There it was. A word-based game where players can earn real money. A match consists of three rounds, and the goal is to find as many words as you can by connecting letters horizontally, vertically, and diagonally. My programmer mind was inundated with dozens of ideas about how I could take advantage of this. The first step: research.

Players can go “head-to-head” and directly bet against each other for real money. After a match, the winner keeps 100% of their own stake and earns 75% of what the loser wagered in the $0.60 and $3.00 games, and 80% in the $20.00 game.

Statistically, that means a player of middling skill will not break even. Start with $0.60 and win, and you’ll have $1.05; lose the next match, and you’re left with $0.45, which is not enough to play again.

You would have to be well above average for profit to even be a possibility. For a meticulously crafted program, that should be rather effortless, and developing one is exactly what I set out to do.
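To put a number on “well above average,” here is a quick back-of-the-envelope sketch (my own illustration, using the $0.60 payout structure described above; the filename is hypothetical):

# break_even.py (illustrative only)

STAKE = 0.60
WIN_NET = 0.75 * STAKE   # a win nets 75% of the opponent's $0.60 stake: +$0.45
LOSS_NET = -STAKE        # a loss forfeits your own stake: -$0.60

def expected_profit(win_rate: float) -> float:
    # Expected profit per match at a given win rate
    return win_rate * WIN_NET + (1 - win_rate) * LOSS_NET

# Break-even: p * 0.45 = (1 - p) * 0.60  =>  p = 0.60 / 1.05
print(f"Break-even win rate: {STAKE / (STAKE + WIN_NET):.1%}")  # ~57.1%
print(f"EV at a 50% win rate: ${expected_profit(0.5):+.3f}")    # about -$0.075

In other words, you would need to win roughly 57% of your matches just to break even.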

Wager Pricing
Source: FanDuel Faceoff App

Over the past two weeks, I worked tirelessly on a program that can attain superhuman scores in Boggle, a game owned by Hasbro that can be played in the FanDuel Faceoff app.

There were three main steps to accomplishing this:

  1. Identifying the game pieces on the board
  2. Finding the best words to play
  3. Entering the words in a humanesque fashion

Step 1

Logically, Optical Character Recognition (OCR) would seem to be an easy-to-implement technique for extracting text from the board. However, after days of testing, I found this method unsuccessful. OCR works best when you have full words that can be found in the dictionary; when given isolated letters, it often returns inaccuracies and misidentifications. I tried numerous OCR programs, including PaddleOCR, EasyOCR, and even Apple’s native OCR feature baked into macOS. PaddleOCR came close, but none gave me the results I was hoping for. I considered training a custom OCR model on this specific game font, but since I already had experience with object detection, I figured I might as well explore that avenue. And that’s exactly what I did.

Object detection didn’t go that well, either. This wasn’t the fault of the model I was using, YOLOv8x, a complex model that excels in a multiplicity of situations. As a one-man team, I was limited by the size of the datasets I could put together. My first dataset consisted of 100 grayscale boards, which I then had to painstakingly annotate; that part alone took days of work. After I trained my first model on this dataset, I quickly put it to the test. It performed… terribly.

Predictions — Failure
Predictions — Failure

I soon realized that my dataset was riddled with unintended distortion because I hadn’t collected the data properly. I took screenshots of games using my operating system’s native screenshot tool, and each image had a slightly different aspect ratio. When I then resized them all to the same dimensions, some images were stretched and distorted. Moreover, some images were “close-ups” and some were “long-shots.” Regrettably, I noticed this mistake only after it was too late. I promptly started collecting a dataset of images in a more appropriate manner using OpenCV, but stopped shortly after because of how arduous the annotation process was. Even though the decorations around the tiles can vary, the core text is always the same for a given letter. In light of this, I knew there had to be a more efficient method than object detection.
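For what it’s worth, the standard fix for this kind of distortion is letterboxing: scale each screenshot’s longest side to the target size and pad the remainder instead of stretching. A minimal sketch (not code from this project, and assuming 3-channel BGR images):

import cv2
import numpy as np

def letterbox(image: np.ndarray, size: int = 640, pad_value: int = 114) -> np.ndarray:
    # Scale the longest side to `size`, preserving the aspect ratio
    h, w = image.shape[:2]
    scale = size / max(h, w)
    resized = cv2.resize(image, (round(w * scale), round(h * scale)))
    # Center the resized image on a square padded canvas instead of stretching it
    canvas = np.full((size, size, 3), pad_value, dtype=np.uint8)
    top = (size - resized.shape[0]) // 2
    left = (size - resized.shape[1]) // 2
    canvas[top:top + resized.shape[0], left:left + resized.shape[1]] = resized
    return canvas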

By that point, creating a large dataset by myself wasn’t feasible. Next, I tried a hard-coded approach using primitive computer vision techniques to compare each of the 16 pieces on the board to each of the 26 basic tiles. To do this, I collected all the basic tiles, A-Z, and created a series of Python scripts testing techniques such as mean-squared error (MSE), overlap ratio, and histogram of oriented gradients (HOG). After further testing, these methods often produced flat, tightly clustered spreads of similarity scores. Even when filtered, minor decorations (e.g., bonus tiles) could cause the board to be read inaccurately and thus derail the entire program.
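For illustration, here is a minimal sketch of the simplest of those techniques, MSE between a board tile’s bitmask and each A-Z reference bitmask (my reconstruction, not the project’s actual script; the helper names are hypothetical, and lower scores mean more similar):

import os
import cv2
import numpy as np

def mse(a: np.ndarray, b: np.ndarray) -> float:
    # Mean-squared error between two same-sized grayscale bitmasks
    return float(np.mean((a.astype(np.float32) - b.astype(np.float32)) ** 2))

def best_match_by_mse(tile: np.ndarray, reference_dir: str) -> str:
    # Compare one board tile against every A-Z reference bitmask
    best_letter, best_score = "?", float("inf")
    for filename in sorted(os.listdir(reference_dir)):
        if not filename.endswith(".png"):
            continue
        ref = cv2.imread(os.path.join(reference_dir, filename), cv2.IMREAD_GRAYSCALE)
        ref = cv2.resize(ref, (tile.shape[1], tile.shape[0]))  # align sizes first
        score = mse(tile, ref)
        if score < best_score:
            best_letter, best_score = filename.removesuffix(".png"), score
    return best_letter

The trouble, as noted above, is that the scores for different letters often landed too close together to be trusted.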

I was looking for something that could loosely be called a combination of these systems: neither overkill nor too primitive. I needed a way to reliably identify the pieces on the board every single time.

Unexpectedly, I found it! After searching online for some time, I came across a Medium article titled “Exploring Image Similarity Approaches in Python” by Vasista Reddy that outlines the best methods for evaluating image similarity. At the bottom of the article, I found exactly what I was searching for.

Using a model from OpenCLIP, you can quantify “the similarity between images… based on the cosine similarity or Euclidean distance of these feature vectors.”

Bingo! Or should I say — Boggle!

Without hesitation, I went straight to work with this renewed motivation. In just a few days, I created a fully operational program that lived up to my goal and could consistently reach superhuman scores.

Now, for the moment you’ve been waiting for. The code!

# gen_control_group.py

import os
import cv2
import numpy as np
from typing import List

def extract_black_pixels_from_images(image_pieces: List[np.ndarray]) -> List[np.ndarray]:
    processed_pieces: List[np.ndarray] = []
    lower_black: np.ndarray = np.array([0, 0, 0], dtype="uint8")
    upper_black: np.ndarray = np.array([48, 48, 48], dtype="uint8")
    for piece in image_pieces:
        mask: np.ndarray = cv2.inRange(piece, lower_black, upper_black)
        processed_pieces.append(mask)
    return processed_pieces

def save(input_image_path: str, output_dir: str) -> None:
    input_image: np.ndarray = cv2.imread(input_image_path)
    binary_image: np.ndarray = extract_black_pixels_from_images([input_image])[0]
    os.makedirs(output_dir, exist_ok=True)
    output_path: str = os.path.join(output_dir, os.path.basename(input_image_path))
    cv2.imwrite(output_path, binary_image)

if __name__ == "__main__":
    input_directory: str = "raw_basic_letters"
    output_directory: str = "control_group"
    for filename in os.listdir(input_directory):
        if filename.lower().endswith(".png"):
            input_image_path: str = os.path.join(input_directory, filename)
            save(input_image_path, output_directory)

The above code was used to create the transformation depicted below.

Raw A
Pre-Processed A

Pre-processing the images is necessary to create a baseline for the identifications, and it is done only once to avoid redundant processing. The program can reference these bitmask images whenever needed, since they are saved to a folder within the working directory.

In the main program, just like this control group, each unidentified piece on the board is stripped of all colors other than black and converted to a bitmask image. This is used to isolate the characters from any of the decorations around the tiles such as the DL, DW, TL, and TW bonuses.

# main.py

def imageEncoder(img: np.ndarray) -> torch.Tensor:
    img = Image.fromarray(img).convert('RGB')
    img = preprocess(img).unsqueeze(0).to(device)
    with torch.no_grad():
        img = model.encode_image(img)
    return img

def generateScore(img1: np.ndarray, img2: np.ndarray) -> float:
    img1, img2 = map(imageEncoder, (img1, img2))
    cos_scores = util.pytorch_cos_sim(img1, img2)
    score = round(float(cos_scores[0][0]) * 100, 2)
    return score

def compareAllImages(img: np.ndarray, directory: str) -> List[Tuple[str, float]]:
    img1_tensor = imageEncoder(img)
    filenames = [f for f in os.listdir(directory) if f.endswith('.png')]
    def process_image(filename):
        image_path = os.path.join(directory, filename)
        img2 = cv2.imread(image_path)
        img2_tensor = imageEncoder(img2)
        cos_scores = util.pytorch_cos_sim(img1_tensor, img2_tensor)
        score = round(float(cos_scores[0][0]) * 100, 2)
        return filename, score
    with ThreadPoolExecutor() as executor:
        scores = list(executor.map(process_image, filenames))
    return scores

These functions calculate the similarity between each of the 16 tiles on the board and each of the 26 basic A-Z tiles. In more detail, feature vectors are derived from encodings of each tile using a neural network model from OpenCLIP, and the cosine similarity between these vectors is calculated and scaled to a percentage for easy comparison. This accentuates the best options and significantly aids in discerning between similar-looking letters.
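Under the hood, util.pytorch_cos_sim is computing the cosine of the angle between the two embedding vectors. A minimal NumPy sketch of the same idea (with made-up three-dimensional vectors standing in for real embeddings):

import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: 1.0 means the vectors point in the same direction
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two near-identical embeddings score close to 1.0, i.e., ~100%
a = np.array([0.12, 0.80, 0.35])
b = np.array([0.10, 0.82, 0.33])
print(f"{cosine_similarity(a, b) * 100:.2f}%")  # ~99.9%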

Piece 0 - Most similar letter: Y with score: 99.08
Piece 1 - Most similar letter: L with score: 98.5
Piece 2 - Most similar letter: H with score: 98.84
Piece 3 - Most similar letter: F with score: 99.25
Piece 4 - Most similar letter: N with score: 99.32
Piece 5 - Most similar letter: A with score: 99.16
Piece 6 - Most similar letter: H with score: 99.18
Piece 7 - Most similar letter: E with score: 98.92
Piece 8 - Most similar letter: E with score: 98.15
Piece 9 - Most similar letter: N with score: 99.29
Piece 10 - Most similar letter: M with score: 98.53
Piece 11 - Most similar letter: S with score: 97.79
Piece 12 - Most similar letter: L with score: 98.74
Piece 13 - Most similar letter: T with score: 98.15
Piece 14 - Most similar letter: O with score: 98.38
Piece 15 - Most similar letter: T with score: 98.19
Board: [['Y', 'L', 'H', 'F'], ['N', 'A', 'H', 'E'], ['E', 'N', 'M', 'S'], ['L', 'T', 'O', 'T']]

Step 2

In order to find the best possible words to play, the bonus tiles on the board must be identified. This can be achieved with color quantization, the process of reducing the number of distinct colors in an image. In this case, the average color of each tile (in practice, the center of its largest k-means color cluster) is sufficient to recognize bonuses accurately.

# main.py

def get_average_color(image: np.ndarray, k: int = 3) -> np.ndarray:
    pixels = image.reshape(-1, 3).astype(np.float32)
    _, labels, centers = cv2.kmeans(pixels, k, None,
                                    (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2),
                                    10, cv2.KMEANS_RANDOM_CENTERS)
    average_color = centers[np.argmax(np.bincount(labels.flatten()))]
    return average_color

def classify_and_store_bonus_tiles(color_pieces: List[np.ndarray]) -> Dict[str, List[Tuple[int, int]]]:
    bonus_tiles = {'DL': [], 'DW': [], 'TL': [], 'TW': []}
    for i, piece in enumerate(color_pieces):
        avg_color = get_average_color(piece)
        logging.debug(f"Piece {i} average color: {avg_color}")
        print(f"Piece {i} average color:", avg_color)
        x, y = i % 4, i // 4
        if np.allclose(avg_color, [230, 190, 70], atol=20):  # TWEAK
            bonus_tiles['DL'].append((x, y))
        elif np.allclose(avg_color, [100, 180, 55], atol=20):  # TWEAK
            bonus_tiles['DW'].append((x, y))
        elif np.allclose(avg_color, [245, 100, 75], atol=20):  # TWEAK
            bonus_tiles['TL'].append((x, y))
        elif np.allclose(avg_color, [145, 100, 230], atol=20):  # TWEAK
            bonus_tiles['TW'].append((x, y))
    return bonus_tiles

After each average color is calculated, it is checked to see whether it falls within a tolerance of each bonus type’s reference color.

Piece 0 average color: [234.28247 215.43208 181.48859]
Piece 1 average color: [234.28447 215.44847 181.2834 ]
Piece 2 average color: [234.15767 215.21616 180.8396 ]
Piece 3 average color: [228.74945 189.43506 59.826733]
Piece 4 average color: [234.45782 215.80875 182.3849 ]
Piece 5 average color: [111.63389 185.93488 56.020657]
Piece 6 average color: [234.59401 215.70477 181.70276]
Piece 7 average color: [234.54495 215.78358 182.02042]
Piece 8 average color: [234.12727 215.25221 181.67554]
Piece 9 average color: [234.18199 200.7824 74.05492]
Piece 10 average color: [234.41765 215.48843 181.59122]
Piece 11 average color: [231.89584 198.74094 73.40631]
Piece 12 average color: [234.57059 215.50755 181.63367]
Piece 13 average color: [234.38069 215.48698 181.96625]
Piece 14 average color: [234.61804 215.77003 181.88376]
Piece 15 average color: [234.71565 215.95825 182.32326]
Bonus tiles: {'DL': [(3, 0), (1, 2), (3, 2)], 'DW': [(1, 1)], 'TL': [], 'TW': []}

By this time, all the information needed in order to find the best words has been extracted from the game board.

# main.py

letter_points = {
    'A': 1, 'B': 3, 'C': 3, 'D': 2, 'E': 1, 'F': 4, 'G': 2, 'H': 4, 'I': 1, 'J': 8,
    'K': 5, 'L': 1, 'M': 3, 'N': 1, 'O': 1, 'P': 3, 'Q': 10, 'R': 1, 'S': 1, 'T': 1,
    'U': 1, 'V': 4, 'W': 4, 'X': 8, 'Y': 4, 'Z': 10
}

def calculate_word_score(word: str,
                         coords: List[Tuple[int, int]],
                         board: List[List[str]],
                         bonus_tiles: Dict[str, List[Tuple[int, int]]],
                         letter_points: Dict[str, int]) -> int:
    word_score = 0
    word_multipliers = []
    logging.debug(f'Scoring: {word}...')
    logging.debug(f'Coords: {coords}')
    logging.debug(f'Bonus tiles: {bonus_tiles}')
    for (x, y) in coords:
        letter = board[y][x]
        base_score = letter_points[letter]
        logging.debug(f'Letter: {letter}, Base Score: {base_score}')
        for bonus, positions in bonus_tiles.items():
            if (x, y) in positions:
                if bonus == 'TL':
                    base_score *= 3
                    logging.debug(f'Applied TL bonus at: {(x, y)}, new score: {base_score}')
                elif bonus == 'DL':
                    base_score *= 2
                    logging.debug(f'Applied DL bonus at: {(x, y)}, new score: {base_score}')
                elif bonus == 'TW':
                    word_multipliers.append(3)
                    logging.debug(f'Applied TW bonus at: {(x, y)}')
                elif bonus == 'DW':
                    word_multipliers.append(2)
                    logging.debug(f'Applied DW bonus at: {(x, y)}')
        word_score += base_score
    for multiplier in word_multipliers:
        word_score *= multiplier
        logging.debug(f'Applied word multiplier, new score: {word_score}')
    if len(word) >= 5:
        bonus_points = (len(word) - 4) * 5
        word_score += bonus_points
        logging.debug(f'Applied length bonus, new score: {word_score}')
    logging.debug(f'Final score for {word}: {word_score}')
    return word_score

To get all the possible words, I modified a library called Pyggle, adding a Trie data structure for searching the NASPA Word List 2020 dictionary, the same list FanDuel uses. Boggle-solving algorithms have existed for decades, so there was no need to reinvent the wheel. The above code is run for every possible word that can be played so that the highest-scoring ones can be found.
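To make the idea concrete, here is a minimal trie-pruned Boggle search. This is a sketch of the same technique rather than my actual Pyggle modification; the word-list filename is hypothetical, and it assumes the usual three-letter minimum word length:

from typing import Dict, List, Set, Tuple

def build_trie(words: List[str]) -> Dict:
    # Nested-dict trie; '$' marks the end of a valid word
    root: Dict = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node['$'] = True
    return root

def solve_boggle(board: List[List[str]], trie: Dict) -> Set[str]:
    # DFS from every tile to its 8 neighbors, pruning any path
    # that is not a prefix of some dictionary word
    rows, cols = len(board), len(board[0])
    found: Set[str] = set()

    def dfs(x: int, y: int, node: Dict, path: str, visited: Set[Tuple[int, int]]) -> None:
        letter = board[y][x]
        if letter not in node:
            return  # no dictionary word continues along this path
        node = node[letter]
        path += letter
        if '$' in node and len(path) >= 3:
            found.add(path)
        visited.add((x, y))
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (dx or dy) and 0 <= nx < cols and 0 <= ny < rows and (nx, ny) not in visited:
                    dfs(nx, ny, node, path, visited)
        visited.remove((x, y))

    for y in range(rows):
        for x in range(cols):
            dfs(x, y, trie, "", set())
    return found

# Usage sketch: trie = build_trie(open("nwl2020.txt").read().split())
#               words = solve_boggle(board, trie)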

Step 3

Figuring out how to enter the words was by far the easiest part. Since the program knows where each letter sits in the 4x4 grid, those board coordinates just need to be scaled up to the resolution of the captured screen region.

# main.py

def get_word_screen_coords(word: str, board_coords: list[tuple], top_left: tuple, bottom_right: tuple) -> list[tuple]:
    box_width = (bottom_right[0] - top_left[0]) // 4
    box_height = (bottom_right[1] - top_left[1]) // 4
    word_screen_coords = []
    for coord in board_coords:
        screen_x = coord[0] * box_width + box_width // 2 + top_left[0]
        screen_y = coord[1] * box_height + box_height // 2 + top_left[1]
        print(f"Screen coordinates for {coord}: ({screen_x}, {screen_y})")
        word_screen_coords.append((screen_x, screen_y))
    return word_screen_coords

Lastly, once the screen coordinates for the mouse cursor to follow are calculated, a Catmull-Rom-style curve (built here with SciPy’s CubicSpline) smooths the path so it looks like a human is dragging across the screen.

# main.py

def glide_mouse_to_positions(word_screen_coords: list[tuple], duration: float = 2.0, steps_multiplier_if_gliding: int = 3, glide: bool = True) -> None:
    if not glide:
        for coord in word_screen_coords:
            pyautogui.moveTo(coord[0], coord[1], duration)
            time.sleep(duration)
    elif len(word_screen_coords) >= 4:
        # Smooth the path with a clamped cubic spline (a Catmull-Rom-style curve)
        x = [coord[0] for coord in word_screen_coords]
        y = [coord[1] for coord in word_screen_coords]
        t = np.arange(len(word_screen_coords))
        cs = CubicSpline(t, np.c_[x, y], bc_type='clamped')
        steps = steps_multiplier_if_gliding * len(word_screen_coords)
        for i in np.linspace(0, len(word_screen_coords) - 1, steps):
            pyautogui.moveTo(cs(i)[0], cs(i)[1], duration)
            time.sleep(duration / steps)
    else:
        # Fall back to linear interpolation for short words
        x = [coord[0] for coord in word_screen_coords]
        y = [coord[1] for coord in word_screen_coords]
        t = np.linspace(0, 1, len(word_screen_coords))
        fx = interp1d(t, x, kind='linear')
        fy = interp1d(t, y, kind='linear')
        steps = steps_multiplier_if_gliding * len(word_screen_coords)
        for i in np.linspace(0, 1, steps):
            pyautogui.moveTo(fx(i), fy(i), duration)
            time.sleep(duration / steps)

Smaller words containing fewer than 4 letters fall back to linear interpolation, since the spline fit requires at least 4 points.

Final Words

Superhuman Score
Superhuman Score

Below is the complete file with all three steps in conjunction:

# main.py

import os
import cv2
import time
import torch
import logging
import open_clip
import pyautogui
import numpy as np
from PIL import Image, ImageGrab
from pynput import mouse, keyboard
from sentence_transformers import util
from pynput.keyboard import Key, KeyCode
from pynput.mouse import Controller, Button
from typing import List, Tuple, Union, Dict
from pyggle.lib.pyggle import Boggle, boggle
from concurrent.futures import ThreadPoolExecutor
from scipy.interpolate import CubicSpline, interp1d


def imageEncoder(img: np.ndarray) -> torch.Tensor:
    img = Image.fromarray(img).convert('RGB')
    img = preprocess(img).unsqueeze(0).to(device)
    with torch.no_grad():
        img = model.encode_image(img)
    return img

def generateScore(img1: np.ndarray, img2: np.ndarray) -> float:
    img1, img2 = map(imageEncoder, (img1, img2))
    cos_scores = util.pytorch_cos_sim(img1, img2)
    score = round(float(cos_scores[0][0]) * 100, 2)
    return score

def compareAllImages(img: np.ndarray, directory: str) -> List[Tuple[str, float]]:
    img1_tensor = imageEncoder(img)
    filenames = [f for f in os.listdir(directory) if f.endswith('.png')]
    def process_image(filename):
        image_path = os.path.join(directory, filename)
        img2 = cv2.imread(image_path)
        img2_tensor = imageEncoder(img2)
        cos_scores = util.pytorch_cos_sim(img1_tensor, img2_tensor)
        score = round(float(cos_scores[0][0]) * 100, 2)
        return filename, score
    with ThreadPoolExecutor() as executor:
        scores = list(executor.map(process_image, filenames))
    return scores

def capture_screen_region_for_colors(top_left: tuple, bottom_right: tuple) -> np.ndarray:
    img = ImageGrab.grab(bbox=(top_left[0], top_left[1], bottom_right[0], bottom_right[1]))
    open_cv_image = np.array(img)
    open_cv_image = open_cv_image[:, :, ::-1].copy()
    color_image = cv2.cvtColor(open_cv_image, cv2.COLOR_BGR2RGB)
    return np.array(color_image)

def capture_screen_region_for_comparison(top_left: tuple, bottom_right: tuple) -> np.ndarray:
    img = ImageGrab.grab(bbox=(top_left[0], top_left[1], bottom_right[0], bottom_right[1]))
    open_cv_image = np.array(img)
    open_cv_image = open_cv_image[:, :, ::-1].copy()
    return open_cv_image

def split_image_into_4x4_grid(image: np.ndarray) -> list[np.ndarray]:
    height, width = image.shape[:2]
    cell_width = width // 4
    cell_height = height // 4
    image_pieces = []
    for i in range(4):
        for j in range(4):
            start_y = i * cell_height
            start_x = j * cell_width
            end_y = (i + 1) * cell_height if i < 3 else height
            end_x = (j + 1) * cell_width if j < 3 else width
            piece = image[start_y:end_y, start_x:end_x]
            image_pieces.append(piece)
    return image_pieces

def get_most_similar_letter(scores: List[Tuple[str, float]]) -> Tuple[str, float]:
    scores.sort(key=lambda x: x[1], reverse=True)
    return scores[0][0].replace('.png', ''), scores[0][1]

def binary_image_pieces(image_pieces: List[np.ndarray]) -> List[np.ndarray]:
    processed_pieces = []
    lower_black = np.array([0, 0, 0], dtype="uint8")
    upper_black = np.array([50, 50, 50], dtype="uint8")
    for piece in image_pieces:
        mask = cv2.inRange(piece, lower_black, upper_black)
        processed_pieces.append(mask)
    return processed_pieces

def save_images(image_pieces: List[np.ndarray], output_dir: str) -> None:
    os.makedirs(output_dir, exist_ok=True)
    for i, piece in enumerate(image_pieces):
        output_path = os.path.join(output_dir, f"piece_{i}.png")
        cv2.imwrite(output_path, piece)

def list_to_board(lst: list) -> list[list[str]]:
    return [lst[i * 4: i * 4 + 4] for i in range(4)]

def get_average_color(image: np.ndarray, k: int = 3) -> np.ndarray:
    pixels = image.reshape(-1, 3).astype(np.float32)
    _, labels, centers = cv2.kmeans(pixels, k, None,
                                    (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2),
                                    10, cv2.KMEANS_RANDOM_CENTERS)
    average_color = centers[np.argmax(np.bincount(labels.flatten()))]
    return average_color

def classify_and_store_bonus_tiles(color_pieces: List[np.ndarray]) -> Dict[str, List[Tuple[int, int]]]:
    bonus_tiles = {'DL': [], 'DW': [], 'TL': [], 'TW': []}
    for i, piece in enumerate(color_pieces):
        avg_color = get_average_color(piece)
        logging.debug(f"Piece {i} average color: {avg_color}")
        print(f"Piece {i} average color:", avg_color)
        x, y = i % 4, i // 4
        if np.allclose(avg_color, [230, 190, 70], atol=20):  # TWEAK
            bonus_tiles['DL'].append((x, y))
        elif np.allclose(avg_color, [100, 180, 55], atol=20):  # TWEAK
            bonus_tiles['DW'].append((x, y))
        elif np.allclose(avg_color, [245, 100, 75], atol=20):  # TWEAK
            bonus_tiles['TL'].append((x, y))
        elif np.allclose(avg_color, [145, 100, 230], atol=20):  # TWEAK
            bonus_tiles['TW'].append((x, y))
    return bonus_tiles

def calculate_word_score(word: str,
                         coords: List[Tuple[int, int]],
                         board: List[List[str]],
                         bonus_tiles: Dict[str, List[Tuple[int, int]]],
                         letter_points: Dict[str, int]) -> int:
    word_score = 0
    word_multipliers = []
    logging.debug(f'Scoring: {word}...')
    logging.debug(f'Coords: {coords}')
    logging.debug(f'Bonus tiles: {bonus_tiles}')
    for (x, y) in coords:
        letter = board[y][x]
        base_score = letter_points[letter]
        logging.debug(f'Letter: {letter}, Base Score: {base_score}')
        for bonus, positions in bonus_tiles.items():
            if (x, y) in positions:
                if bonus == 'TL':
                    base_score *= 3
                    logging.debug(f'Applied TL bonus at: {(x, y)}, new score: {base_score}')
                elif bonus == 'DL':
                    base_score *= 2
                    logging.debug(f'Applied DL bonus at: {(x, y)}, new score: {base_score}')
                elif bonus == 'TW':
                    word_multipliers.append(3)
                    logging.debug(f'Applied TW bonus at: {(x, y)}')
                elif bonus == 'DW':
                    word_multipliers.append(2)
                    logging.debug(f'Applied DW bonus at: {(x, y)}')
        word_score += base_score
    for multiplier in word_multipliers:
        word_score *= multiplier
        logging.debug(f'Applied word multiplier, new score: {word_score}')
    if len(word) >= 5:
        bonus_points = (len(word) - 4) * 5
        word_score += bonus_points
        logging.debug(f'Applied length bonus, new score: {word_score}')
    logging.debug(f'Final score for {word}: {word_score}')
    return word_score

def get_word_screen_coords(word: str, board_coords: list[tuple], top_left: tuple, bottom_right: tuple) -> list[tuple]:
    box_width = (bottom_right[0] - top_left[0]) // 4
    box_height = (bottom_right[1] - top_left[1]) // 4
    word_screen_coords = []
    for coord in board_coords:
        screen_x = coord[0] * box_width + box_width // 2 + top_left[0]
        screen_y = coord[1] * box_height + box_height // 2 + top_left[1]
        print(f"Screen coordinates for {coord}: ({screen_x}, {screen_y})")
        word_screen_coords.append((screen_x, screen_y))
    return word_screen_coords

def glide_mouse_to_positions(word_screen_coords: list[tuple], duration: float = 2.0, steps_multiplier_if_gliding: int = 3, glide: bool = True) -> None:
    if not glide:
        for coord in word_screen_coords:
            pyautogui.moveTo(coord[0], coord[1], duration)
            time.sleep(duration)
    elif len(word_screen_coords) >= 4:
        # Smooth the path with a clamped cubic spline (a Catmull-Rom-style curve)
        x = [coord[0] for coord in word_screen_coords]
        y = [coord[1] for coord in word_screen_coords]
        t = np.arange(len(word_screen_coords))
        cs = CubicSpline(t, np.c_[x, y], bc_type='clamped')
        steps = steps_multiplier_if_gliding * len(word_screen_coords)
        for i in np.linspace(0, len(word_screen_coords) - 1, steps):
            pyautogui.moveTo(cs(i)[0], cs(i)[1], duration)
            time.sleep(duration / steps)
    else:
        # Fall back to linear interpolation for short words
        x = [coord[0] for coord in word_screen_coords]
        y = [coord[1] for coord in word_screen_coords]
        t = np.linspace(0, 1, len(word_screen_coords))
        fx = interp1d(t, x, kind='linear')
        fy = interp1d(t, y, kind='linear')
        steps = steps_multiplier_if_gliding * len(word_screen_coords)
        for i in np.linspace(0, 1, steps):
            pyautogui.moveTo(fx(i), fy(i), duration)
            time.sleep(duration / steps)

def on_press(key: Union[Key, KeyCode]) -> None:  # callback function
    if key == Key.enter:
        print('Enter key pressed. Current mouse position is:', mouse_controller.position)
        mouse_positions.append(mouse_controller.position)
    if key == Key.shift:
        print('Shift key pressed. Exiting...')
        exit()

def get_words_until_min_letters(word_scores: List[Tuple[str, int]], min_letters: int) -> List[Tuple[str, int]]:
    total_letters = 0
    for i, (word, _) in enumerate(word_scores):
        total_letters += len(word)
        if total_letters >= min_letters:
            return word_scores[:i + 1]
    return word_scores

if __name__ == '__main__':
    logging.basicConfig(filename='scoring_debug.log', level=logging.DEBUG)

    letter_points = {
        'A': 1, 'B': 3, 'C': 3, 'D': 2, 'E': 1, 'F': 4, 'G': 2, 'H': 4, 'I': 1, 'J': 8,
        'K': 5, 'L': 1, 'M': 3, 'N': 1, 'O': 1, 'P': 3, 'Q': 10, 'R': 1, 'S': 1, 'T': 1,
        'U': 1, 'V': 4, 'W': 4, 'X': 8, 'Y': 4, 'Z': 10
    }

    print('Loading model...')
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print('Using device:', torch.cuda.get_device_name(0) if device == "cuda" else "cpu")
    model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-16-plus-240', pretrained='laion400m_e32')
    model.to(device)
    print('Model loaded.')

    mouse_controller = mouse.Controller()
    keyboard_listener = keyboard.Listener(on_press=on_press)
    keyboard_listener.start()

    while True:
        mouse_positions = []
        print('Listening for Enter key...')

        # Wait for two Enter presses marking the board's corners
        while len(mouse_positions) < 2:
            pass

        top_left = mouse_positions[0]
        bottom_right = mouse_positions[1]
        print('First mouse position:', mouse_positions[0])
        print('Second mouse position:', mouse_positions[1])
        print('Positions captured.')

        region = capture_screen_region_for_comparison(top_left, bottom_right)

        image_pieces = split_image_into_4x4_grid(region)
        image_pieces = binary_image_pieces(image_pieces)
        save_images(image_pieces, "pieces_output")

        color_region = capture_screen_region_for_colors(top_left, bottom_right)
        color_pieces = split_image_into_4x4_grid(color_region)
        save_images(color_pieces, "color_pieces_output")
        bonus_tiles = classify_and_store_bonus_tiles(color_pieces)

        letters = []
        for index, piece in enumerate(image_pieces):
            scores = compareAllImages(piece, "control_group")
            most_similar_letter, highest_score = get_most_similar_letter(scores)
            letters.append(most_similar_letter)
            logging.debug(f'Piece {index} - Most similar letter: {most_similar_letter} with score: {highest_score}')
            print(f'Piece {index} - Most similar letter:', most_similar_letter, 'with score:', highest_score)

        board = list_to_board(letters)
        print('Board:', board)
        print('Bonus tiles:', bonus_tiles)
        logging.debug(f'Board: {board}')
        logging.debug(f'Bonus tiles: {bonus_tiles}')

        boggle = Boggle(board)
        solved = boggle.solve()
        print(solved)

        word_scores = []
        for word, coords in solved.items():
            score = calculate_word_score(word, coords, board, bonus_tiles, letter_points)
            word_scores.append((word, score))

        word_scores.sort(key=lambda x: x[1], reverse=False)

        for word, score in word_scores:
            print(f"{word}: {score}")

        word_scores.sort(key=lambda x: x[1], reverse=True)

        # *** minimum letters to enter, rounds up a word *** #
        filtered_entries = get_words_until_min_letters(word_scores, 125)  # TWEAK
        print("Filtered entries:", filtered_entries)

        # move slowly to first pos of first word
        if filtered_entries:
            first_word, _ = filtered_entries[0]
            first_board_coords = solved[first_word]
            first_word_screen_coords = get_word_screen_coords(first_word, first_board_coords, top_left, bottom_right)
            glide_mouse_to_positions([mouse_controller.position, first_word_screen_coords[0]], duration=0.08, glide=False)

        for i in range(len(filtered_entries)):
            word, score = filtered_entries[i]
            print(f"Entering word: {word} with score: {score}")
            board_coords = solved[word]
            word_screen_coords = get_word_screen_coords(word, board_coords, top_left, bottom_right)

            mouse_controller.position = word_screen_coords[0]
            mouse_controller.press(Button.left)
            # *** glide = False means instantly go to each coord *** #
            glide_mouse_to_positions(word_screen_coords, duration=0, steps_multiplier_if_gliding=3, glide=True)  # TWEAK
            mouse_controller.release(Button.left)

            # If there is a next word, slowly move to its first position
            if i < len(filtered_entries) - 1:
                next_word, _ = filtered_entries[i + 1]
                next_board_coords = solved[next_word]
                next_word_screen_coords = get_word_screen_coords(next_word, next_board_coords, top_left, bottom_right)
                glide_mouse_to_positions([mouse_controller.position, next_word_screen_coords[0]], duration=0.08, glide=False)  # TWEAK

In summary, this project taught me a great deal about the importance of selecting the right technology for the task at hand. Just because a method seems like it could work doesn’t mean it will. Everything has its strengths and weaknesses. Looking back, perseverance proved invaluable after three unsuccessful methods of letter identification. Each time, I cleared my head, started a new file, and logically worked through the challenge. By the end, I was incredibly proud of myself.

Running the Program

Note: I did not enter head-to-head matches against real people. The free-entry matches are against bots that reflect the skill of real players. Since I am under the age of 18, I cannot legally gamble and did not partake in gambling of any kind throughout this project. But yes, I would’ve made a pleasing amount of money.

This project is open-source and available on GitHub. Just know that I am not responsible for your irresponsibility. If you find this helpful or informative, please consider giving the repo a star. Your support truly makes a difference. You are free to use this however you’d like. Made with ❤️.
