Building a SET Solver Using Python and OpenCV

Coby Simler · The Startup · Sep 9, 2020 · 10 min read

Have you ever played SET? SET is a game in which players race to identify patterns of three cards — or SETs — within twelve unique cards in play at a given time. Each SET card has four properties: shape, shade/fill, color, and count. Below is an example twelve-card layout with some card descriptions.

A standard twelve card SET layout with some card descriptions

Notice that each of the card’s four properties can be expressed by one of three variations. These four properties and their variations are summarized in the table below.

Because no two cards are repeated, a SET deck contains 3⁴ = 81 cards (3 variations per property, 4 properties). A valid SET consists of three cards that, for each of the four properties, either all share the same variation or all have different variations. You can find the complete rules online. To visually demonstrate, here are three example valid SETs:

(1) Shape: all different (2) Shade: all different (3) Color: all different (4) Count: all the same
(1) Shape: all different (2) Shade: all the same (3) Color: all different (4) Count: all the same
(1) Shape: all the same (2) Shade: all different (3) Color: all the same (4) Count: all different

With some time to spare between the end of my summer internship and the beginning of the semester, I built a SET solver: a computer program that took an image of SET cards and returned all valid SETs. I used OpenCV — an open source computer vision library — and Python. To familiarize myself, I skimmed the library’s documentation and watched a tutorial series. In addition, I read blog posts and GitHub repositories for some similar projects.¹ I then broke the project down into four tasks:

  1. Locating cards within an input image (CardExtractor.py)
  2. Identifying each card’s unique properties (Card.py)
  3. Evaluating the identified cards for SETs (SetEvaluator.py)
  4. Displaying SETs to the user (set_utils.display_sets)

I created a dedicated class for each of the first three tasks, which you can see in my type-hinted main method below.
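Here is a sketch of what that main method might look like, built from the class names in the task list above; the method names (get_card_images, get_sets) and exact signatures are assumptions about the implementation.

```python
from typing import List

def main(image_path: str) -> None:
    # Locate and extract card images from the input photo (Task 1)
    extractor = CardExtractor(image_path)
    card_images = extractor.get_card_images()  # hypothetical method name

    # Identify each card's four properties (Task 2)
    cards: List[Card] = [Card(image, coords) for image, coords in card_images]

    # Evaluate the identified cards for SETs (Task 3)
    evaluator = SetEvaluator(cards)
    valid_sets = evaluator.get_sets()  # hypothetical method name

    # Display SETs to the user (Task 4)
    display_sets(valid_sets, image_path)
```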

All of the project’s code is publicly available on GitHub. Let’s jump in!

Locating Cards in Input Image

1. Image Preprocessing

After importing OpenCV and NumPy (an open source array and matrix manipulation library), the first step in locating cards was to apply image preprocessing techniques to accentuate the boundaries of the cards. Specifically, this approach involved converting the image to greyscale, applying a Gaussian blur, and thresholding the image. Briefly:

  • Conversion to greyscale removes an image’s coloration by only retaining each pixel’s intensity or luminance (a weighted sum of the RGB color channels).
  • Applying a Gaussian Blur to an image transforms the intensity value of each pixel to a weighted average of that pixel’s neighborhood. Weights are determined by a Gaussian distribution centered at the current pixel. This removes noise and ‘smooths’ the image. After experimentation, I decided on a Gaussian kernel size of (3,3). Wikipedia has a great explanation of how this works here.
  • Thresholding converts a greyscale image to a binary image — a new matrix in which each pixel has one of two values (typically black or white). To do this, a constant value threshold is used to segment the pixels. Because I anticipated input images with varying lighting conditions, I used the cv2.THRESH_OTSU flag to estimate the optimal threshold constant at runtime. You can find a cool explanation of how Otsu’s method works here.

OpenCV makes these three steps a piece of cake:

Original → Greyscaled and Blurred → Thresholded

2. Finding Card Contours

Next, I used OpenCV’s findContours() and approxPolyDP() methods to locate the cards. Leveraging the image’s binary property, the findContours() method finds curves “joining all the continuous points (along the boundary) having the same color or intensity.”² The first step is to use the following function call on the preprocessed image:

contours, hierarchy = cv2.findContours(processed_image, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

The cv2.RETR_TREE flag retrieves all found contours as well as a hierarchy describing the level at which a given contour is nested or embedded within other contour(s). The cv2.CHAIN_APPROX_SIMPLE flag compresses contour information by only encoding contour endpoints. After doing some error checking to exclude non-cards, I used the approxPolyDP() method to estimate a polygonal curve using contour endpoints. Here are some identified card contours overlaid on top of an original image.

Contours are drawn in red

3. Refactor Card Images

With contours identified, the card’s boundaries must be refactored to standardize the angle and orientation of the card in the original image. This can be done with an Affine warp transform, a type of geometric transformation that preserves collinear points and parallelism between lines on the image. You can see the below code snippet in action with the example image.

Affine transformation overlaid on original image to demonstrate standardized angle and orientation

I then passed each refactored card image and its coordinates as parameters into the Card class constructor. Here is an abridged version of the constructor:
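A sketch of what that abridged constructor might look like; the attribute and helper-method names are assumptions:

```python
class Card:
    """Represents a single SET card and its four identified properties."""

    def __init__(self, card_image, coordinates):
        self.image = card_image            # refactored (warped) card image
        self.coordinates = coordinates     # card corners in the original image
        processed = self.process_card(card_image)
        # Each property is encoded as an integer in {1, 2, 3}
        self.shape = self.identify_shape(processed)
        self.shade = self.identify_shade(processed)
        self.color = self.identify_color(card_image)
        self.count = self.identify_count(processed)
```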

Identifying Card Properties

As a preliminary step, a static method named process_card applied the same preprocessing techniques described above — as well as binary dilation and erosion — to the refactored card image. A brief explanation and example:

  • Dilation is an operation in which the value of a pixel P becomes the value of the maximum pixel in pixel P’s “neighborhood.” Erosion is the opposite: the value of a pixel P becomes the value of the minimum pixel in pixel P’s “neighborhood.”³
  • The size and shape of that neighborhood — or the ‘kernel’ — can be passed as an input in OpenCV (the default is a 3x3 square matrix).
  • For binary images, compositions of erosion and dilation (called Opening and Closing) are used to remove noise by eliminating any pixels that fall outside the “reach” of relevant pixels. Check this out in a contrived example I made below.
# Close the card image (dilate, then erode)
kernel = np.ones((3, 3), np.uint8)  # kernel size and iterations are tunable
dilated_card = cv2.dilate(binary_card, kernel, iterations=1)
eroded_card = cv2.erode(dilated_card, kernel, iterations=1)
Card with noise → Preprocessed Image → Dilated + Eroded “Closed” Image. Notice the noise removal.

I took the resulting image and used different approaches to extract each property — shape, shade, color, and count — from the processed card. I used the code from @piratefsh’s set-solver repo on Github to identify card color and shade, and devised my own approaches for shape and count. See the GitHub for the implementation.

Shape

  • To identify the shape of the symbol shown on a card, I used the area of the card’s largest contour. This approach assumes that the largest contour is a symbol on the card — an assumption that, barring extreme lighting, is almost always correct.

Shade

  • The approach to identifying the shade or ‘fill’ of a card used the pixel density inside the card’s largest contour.

Color

  • The approach to identifying a card’s color involved evaluating the values of the three color channels (RGB) and comparing their ratios.

Count

  • To identify the number of symbols on a card, I began by finding the four largest contours. Although count never exceeds three in practice, I chose four and then did error checks to exclude non-symbols. After filling in contours using cv2.drawContours to avoid double counting, I checked the value of the contour area as well as the hierarchy (to ensure the contour wasn’t embedded within another contour).
Original symbols were filled in to ensure no internal boundaries were counted as contours.

Aside: an alternative approach to identifying card properties might be to apply a supervised ML classification model to card images. Based on some quick research, it looks like this could be done with Scikit’s SVM or KNN and the Keras ImageDataGenerator to augment the dataset.

Each variation was then encoded as an integer, such that any card could be represented by an array of four integers. For instance, a purple card with two empty diamond symbols could be represented as [1,1,3,2]. The table below shows the property encoding key.

With cards now represented as arrays, let’s evaluate for SETs!

Evaluating for SETs

To check for sets within the identified cards, an array of Card objects was passed to the SetEvaluator class.

Approach 1: All Possible Combinations

There are at least two approaches to evaluating array representations of cards for valid SETs. The first approach requires evaluating all possible three-card combinations. When twelve cards are displayed, for example, there are ₁₂C₃ = 12!/(9!·3!) = 220 possible combinations of three cards. Using Python’s itertools module, these can be computed as follows:

from itertools import combinations
set_combinations = list(combinations(cards, 3))  # cards: List[Card]

Remember that for each property, the three cards in a SET must either share or differ in their variations. If three card-arrays are stacked atop each other, all values in a given column/property must therefore be all the same or all different.

This characteristic can be checked by summing all the values in that column. If all three cards have the same value for that property, the resulting sum is by definition divisible by three. Similarly, if all three values differ (i.e. equal a permutation of 1,2, and 3) then the resulting sum of six is also divisible by three. No other sum of those values is divisible by three without a remainder. Check out the example below!

All values in the Sum%3= row are 0 so… SET!

I applied this method to all 220 combinations, saved the valid ones, and voila — we have our SETs! Here’s a code snippet that demonstrates this approach naively (without using a generator to return False early when possible):
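That snippet boils down to the following check, written here compactly with all() (which, unlike the original snippet, does stop early); cards are represented by their four-integer arrays:

```python
from itertools import combinations

def find_sets(cards):
    """Check every three-card combination with the sum-mod-3 test.

    Each card is an array of four integers in {1, 2, 3}, one per property.
    """
    valid_sets = []
    for triplet in combinations(cards, 3):
        # A triplet is a SET iff every property column sums to a multiple of 3
        if all(sum(column) % 3 == 0 for column in zip(*triplet)):
            valid_sets.append(triplet)
    return valid_sets
```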

But there’s a better approach…

Approach 2: Verifying a SET Key

Note that for any two cards in the deck, there is one card (and only one card) that completes the SET. Let’s call this third card the SET Key. A more efficient alternative to Approach 1 involves iteratively choosing two cards, calculating their SET Key, and checking whether that Key appears in the remaining cards. Checking membership in a Python set has an average time complexity of O(1).
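Since the variations are encoded as 1, 2, and 3, the SET Key’s value for each property is either the value the two cards share or 6 minus their sum when they differ. A sketch of this approach (function names are mine):

```python
from itertools import combinations

def set_key(card_a, card_b):
    """Return the unique third card completing a SET with card_a and card_b."""
    return tuple(a if a == b else 6 - a - b for a, b in zip(card_a, card_b))

def find_sets_with_keys(cards):
    """O(n^2) SET search: for each pair, test membership of its SET Key."""
    card_set = set(map(tuple, cards))           # O(1) average-case lookups
    found = set()
    for a, b in combinations(card_set, 2):
        key = set_key(a, b)
        if key in card_set:
            # Sort so the same SET isn't reported for each of its pairs
            found.add(tuple(sorted((a, b, key))))
    return found
```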

This reduces the algorithm’s time complexity to O(n²), as it lessens the number of combinations that need to be evaluated. Given that only small inputs were expected (there is a 96.77% chance of a SET with 12 cards in play, a 99.96% chance with 15 cards, and a 99.9996% chance with 16 cards⁴), efficiency wasn’t top of mind. Using the first approach, I timed the program on my mid-tier laptop and found that it ran in an average of 1.156 seconds (including rendering the final image) and 1.089 seconds (without rendering) on my test inputs. In one case, the program identified seven separate SETs in 1.146 seconds. If human performance is a benchmark, then seven SETs in just over a second is fast enough for me!

Displaying SETs to the User

Finally, I followed the lead of piratefsh and Nicolas Hahn in displaying SETs to users by circling the respective SET’s cards with a unique color on the original image. I stored a list of each card’s original coordinates as an instance variable, which I used to draw colored contours.

Cards that belonged to multiple SETs required multiple outlines. To avoid drawing contours on top of each other, the expanded_coordinates() method iteratively expanded a card’s contours depending on how many SETs the card appeared in. Here is the result in action using cv2.imshow():

And that’s it — a SET solver using Python and OpenCV! This project was a great introduction to OpenCV and computer vision fundamentals. In particular, I learned about:

  • Image processing, noise-reduction, and standardization techniques like Gaussian blurring, affine transformations, and morphological operations.
  • Otsu’s method for automatic binary thresholding.
  • Contour and Canny edge detection.
  • The OpenCV library and some of its uses.

I look forward to the next project, and please reach out with any feedback or ideas!

Until next time,

— Coby

Citations and Resources

  1. Piratefsh’s set-solver on Github was particularly informative. After finding her approach to color identification very accurate, I ended up simply copying the method. Arnab Nandi’s card game identification project was also a useful starting point, and Nicolas Hahn’s set-solver also proved useful. Thank you Sherr, Arnab, and Nicolas, if you are reading this!
  2. Here’s a basic explanation of contours and how they work in OpenCV. I initially implemented the program with Canny edge detection, but subsequently removed it because it did not improve card identification accuracy for test cases.
  3. You can find a more detailed description of morphological transformations on the OpenCV site here.
  4. Some interesting probabilities related to the game SET.
