Understanding Machine Learning Models with a Puzzle Analogy

Liz Waithaka
Women in Technology
5 min read · Oct 12, 2023

Models are, in essence, a reflection of the data they’re trained on. It is the data that determines the upper bound of performance. Simply put, a machine learning model cannot surpass the capabilities and constraints of the data it’s fed.

Quality and quantity play a crucial role in defining the best achievable performance of a machine learning model. Therefore, the primary aim of any machine learning endeavor is to select a model and fine-tune its parameters to come as close as possible to the performance cap set by the data.

This emphasizes the importance of choosing the right model and optimizing its parameters to make the most of the available information. Machine learning models, though powerful, are bound by the confines of the data provided to them. They cannot glean insights that are absent in the data, and they certainly cannot invent knowledge beyond what the data offers. In other words, the old adage holds true: “garbage in, garbage out.” High quality, representative and diverse data is crucial for training effective models.
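The "garbage in, garbage out" effect is easy to demonstrate. The sketch below trains the same model twice, once on clean labels and once with a portion of the training labels deliberately flipped to simulate poor-quality data; the dataset, model, and noise rate are arbitrary choices for illustration.

```python
# Illustrative sketch: the same model trained on clean vs. noisy labels.
# The synthetic dataset and 30% flip rate are arbitrary choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Train on clean labels
clean_acc = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)

# Flip 30% of the training labels to simulate low-quality data
rng = np.random.default_rng(0)
noisy = y_tr.copy()
flip = rng.random(len(noisy)) < 0.3
noisy[flip] = 1 - noisy[flip]
noisy_acc = LogisticRegression().fit(X_tr, noisy).score(X_te, y_te)

print(f"clean labels: {clean_acc:.2f}, noisy labels: {noisy_acc:.2f}")
```

No amount of model tuning recovers the information destroyed by the flipped labels; the data sets the ceiling.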

Photo by Jonny Gios on Unsplash

PUZZLE ANALOGY

Imagine you are trying to build a puzzle, and the pieces of the puzzle represent your data, while the completed puzzle represents the best possible model performance.

  1. The puzzle pieces (your data): You have a set of puzzle pieces, each with a specific shape, color, and pattern. These are your data, and they are all you have to work with.
  2. The puzzle strategies (your models): You have different models to choose from, each representing a different way you can approach solving the puzzle.
  3. Completing the puzzle (model performance): The goal is to complete the puzzle, which translates to optimum performance. The better a particular model is at arranging the puzzle pieces, the closer you get to completing the puzzle.
  4. The upper bound (the data's potential): You can only complete the puzzle as well as the quantity and quality of your pieces allow. You can't add extra pieces, and you can't change the pieces themselves. They are what they are.
  5. Selecting the right model: The task is to select a model and strategy that arranges the pieces in the most efficient and effective way, getting as close as possible to the upper bound set by the pieces. You can't expect the puzzle to show an image that is not present in the pieces. The challenge, therefore, is to make the most of the available pieces to complete the puzzle as well as you can.
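Trying several "strategies" on the same fixed set of pieces can be sketched in a few lines: the data stays constant while different models compete to get closest to the upper bound. The models and dataset below are illustrative choices, not recommendations.

```python
# Sketch: the data is fixed; we compare how close different models
# (puzzle strategies) get to the performance ceiling it sets.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # the puzzle pieces never change
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k-nearest neighbors": KNeighborsClassifier(),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validation scores each strategy on the same pieces
    results[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {results[name]:.3f}")
```

Whichever model scores highest has arranged the pieces best, but none of them can exceed what the data supports.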

The Machine Learning Workflow

Image from LeetCode

We earlier learnt about the different types of machine learning problems, and we now need to determine which type we are trying to solve, i.e. supervised or unsupervised.

For supervised machine learning, we further determine the type of output we expect from the model, i.e. whether it is a discrete value or a continuous value. If it is a discrete value, we would use a classification model; for a continuous value, it would be a regression model.
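In code, that decision simply determines which family of model you instantiate. The tiny dataset below is invented for illustration: the same inputs paired with discrete labels call for a classifier, and paired with continuous values call for a regressor.

```python
# Sketch: discrete targets -> classification; continuous targets -> regression.
# The toy data here is invented purely for illustration.
from sklearn.linear_model import LinearRegression, LogisticRegression

X = [[1], [2], [3], [4]]
y_class = [0, 1, 1, 0]        # discrete labels
y_reg = [1.5, 2.7, 3.1, 4.0]  # continuous values

clf = LogisticRegression().fit(X, y_class)  # classification model
reg = LinearRegression().fit(X, y_reg)      # regression model

clf_pred = clf.predict([[2.5]])  # predicts a class label
reg_pred = reg.predict([[2.5]])  # predicts a real number
print(clf_pred, reg_pred)
```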

Once we determine the type of model we would like to build from our data, we then go ahead and perform feature engineering, which is a group of activities that transform the data into the desired format. A few examples of feature engineering are given below.

Imagine you are still working on the puzzle and you have realized that some of the existing puzzle pieces are not providing enough detail to complete it. Feature engineering is the process of creating new, more informative puzzle pieces to improve the chances of solving the puzzle. It involves identifying specific relationships or patterns in the existing pieces and crafting additional pieces to highlight those relationships. The new pieces (engineered features) provide extra information that can help you complete the puzzle more easily.

The model can then use these features to better capture the underlying patterns and relationships in the data.

This could include:

  1. Extracting components like the year, month, day, or hour from timestamps.
  2. Converting categorical variables such as country or gender into numerical or binary (0 or 1) vectors, due to the constraints of the algorithm.
  3. Binning and discretization: grouping continuous data into discrete bins or intervals.
  4. Dealing with incomplete data using various strategies, such as filling in missing values with the average value.
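All four techniques can be sketched with pandas on a toy table; the column names and values here are invented for illustration.

```python
# Hedged sketch of the four feature-engineering examples above,
# on an invented two-row dataset.
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2023-10-12 08:30", "2023-10-13 17:45"]),
    "country": ["Kenya", "Ghana"],
    "age": [23, None],
})

# 1. Extract components from timestamps
df["year"] = df["timestamp"].dt.year
df["hour"] = df["timestamp"].dt.hour

# 2. Convert a categorical variable into binary (0/1) vectors
df = pd.get_dummies(df, columns=["country"])

# 4. Fill incomplete data with the average value
df["age"] = df["age"].fillna(df["age"].mean())

# 3. Bin continuous data into discrete intervals
df["age_group"] = pd.cut(df["age"], bins=[0, 18, 35, 120],
                         labels=["minor", "young adult", "older"])
print(df)
```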

The process of feature engineering is not a one-off step. Often you will need to come back to feature engineering later in the workflow.

The dataset is then split into two groups: training and testing. The training dataset is used to train the model, while the testing data is used to validate whether the model is generic enough to be applied to unseen data.
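The split itself is a one-liner; the 80/20 proportion and the dataset below are common illustrative choices, not requirements.

```python
# Sketch of the train/test split described above, using an 80/20 split.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train only on the training portion...
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# ...and judge generalization on the held-out test portion
acc = model.score(X_test, y_test)
print(f"accuracy on unseen test data: {acc:.2f}")
```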

It is rare that you will be happy with the first model. You will often have to go back to the training process and tune some of the parameters exposed by the model you selected, a step called hyperparameter tuning. Back to our puzzle analogy: you now have your puzzle and you have created additional informative puzzle pieces (feature engineering), but you are faced with a different challenge. You have a set of puzzle-solving strategies that you can use to complete the puzzle.

Each strategy comes with a default set of rules (hyperparameters) that represent how you approach assembling the puzzle. These defaults, however, might be too conservative or too aggressive in their approach, and they will not always lead to the best solution.

Hyperparameter tuning is like adjusting these strategies to improve the puzzle-solving process by experimenting with different ways of approaching the puzzle under each strategy. For example, you may change the order in which you try pieces or the techniques you use to evaluate their fit. You then systematically modify the parameters of each strategy to find which settings work best for the given puzzle. This could mean changing the time limit for a particular strategy or the order of its moves.
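A grid search is one common way to do this systematic experimentation: it tries every combination of candidate settings and keeps the best. The parameter grid below is an arbitrary example, not a recommendation.

```python
# Illustrative hyperparameter tuning with an exhaustive grid search.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Try every combination of these candidate settings with 5-fold CV
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 3, 5, None],
                "min_samples_split": [2, 5, 10]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

The best combination found becomes the tuned "strategy" you carry forward.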

As you fine-tune the strategies, you start to complete the puzzle more efficiently and more accurately. That is the main goal of machine learning: to improve model performance and enhance the model's ability to find patterns and make accurate predictions.

And there you have it: a simple analogy that sheds light on the exciting world of machine learning, all through the lens of a puzzle. Just as we piece together puzzles to get a complete picture, we assemble and fine-tune machine learning models to extract meaningful insights from data. Remember, the quality of the puzzle pieces, or in this case the data, determines the ultimate performance.


Liz Waithaka

AI Enthusiast || Machine Learning || Data Scientist || StoryTelling || GitHub: https://github.com/liznjoki