Instance-Based Learning vs. Model-Based Learning

Machine Learning Tales

Jarrett Evans
3 min read · Jun 30, 2020

Overview:

The main difference between these two approaches is how they generalize from the training data. An instance-based learner memorizes the training set and, for a new data point, returns the output of the most similar data point it has stored, or the average (or most common) output of several similar data points. A model-based learner instead fits a model to the training data, such as a prediction line or a set of sections defined by the attributes. A new data point then gets its prediction from where its attributes place it along that line or within those sections.
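
To make the contrast concrete, here is a minimal sketch in plain Python. It is only an illustration under assumptions: the single `wealth_score` feature, the spending amounts, and the function names are all invented, nearest neighbors stands in for instance-based learning, and a least-squares line stands in for model-based learning.

```python
# Hypothetical training data: (wealth_score, amount_spent). All numbers are invented.
training = [(1.0, 20.0), (2.0, 40.0), (3.0, 60.0), (4.0, 80.0)]

def knn_predict(x, data, k=1):
    """Instance-based: average the outputs of the k stored points nearest to x."""
    nearest = sorted(data, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def fit_line(data):
    """Model-based: fit y = slope * x + intercept by ordinary least squares."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def line_predict(x, model):
    """Read the prediction off the fitted line."""
    slope, intercept = model
    return slope * x + intercept

model = fit_line(training)

# A new customer far outside anything in the training data.
new_customer = 10.0
knn_estimate = knn_predict(new_customer, training, k=1)
line_estimate = line_predict(new_customer, model)
print(knn_estimate, line_estimate)
```

With these invented numbers, the nearest-neighbor estimate is simply the output of the closest stored point, while the fitted line extends the trend past the training range. The line is only more accurate if the trend really does continue, which is an assumption about the data rather than a guarantee.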

Story:

In the middle of a small town, there was a popular clothing store that was run by a mother and her daughter. The mother hated surprises, so she wanted to know approximately how much money a customer was about to spend in her store. The daughter studied computer science at the local college, and she decided to build a system so her mother would not need to deal with the stress of a customer's unknown spending habits. This system would look at the characteristics of a customer as they were pulling up to the store. Some of the characteristics included the type of car the customer drove and how high-end their clothes were. It was common courtesy in this town to always reflect your spending habits through your car and clothes.

A regular customer at the store was a young man named Shane. Shane was a successful businessman and one of the wealthiest members of the town. He drove a Tesla and constantly flaunted his outfits from Nordstrom. The model that the daughter had built would take new customers who also had high-end cars and expensive clothes and predict that they would spend the same amount in the store as Shane did. For the most part, this worked quite well, as every rich person in this small town had about the same amount of money.

Then one day a big-time producer from Hollywood named Kevin came into town. Kevin refused to be seen in anything less than the newest Lamborghini and the finest custom clothes directly from a boutique in Italy. As Kevin approached the store, the model predicted that he would spend the same amount as Shane. Even though Kevin’s car and clothes were significantly more expensive than Shane’s, Shane was the closest data point the system had to reference. Kevin ended up spending much more money in the store than Shane ever had.

The mother was distraught over this occurrence, which prompted the daughter to rethink her model. She decided to replace the instance-based system she had in place with one that used model-based learning. That way, if a new customer came in whose attributes did not closely match anything the system had already memorized, the prediction would be more likely to reflect the amount they actually spent in the store.

The next time an unfamiliar car pulled up with a customer sporting unfamiliar clothing, the new system made an accurate prediction of how much they would spend in the store. The mother was then able to sleep soundly, confident she would never again be thrown for a loop the way she was when Kevin visited the store.

Recap:

In this story, we saw a situation where an instance-based learning model did not provide an accurate prediction. The reason is that the new data point (Kevin) was an outlier compared to the data the model was trained on. Instance-based learning models can perform quite well if the data they are trained on resembles the new data they are making predictions for. However, when a new data point falls far outside the training data, as Kevin did, an instance-based model may drastically underestimate the true value. The type of model you ultimately use for your machine learning problem will depend on the situation. It is best to think through the different conditions you could encounter with new input data and to try different algorithms when you are testing.
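
One way to act on that last piece of advice is to hold out the most extreme customer you have on record and see how each candidate algorithm does on them. The sketch below is a rough illustration only: the customer records, the single `wealth_score` feature, and every function name are hypothetical, and holding out a single point is far from a complete evaluation.

```python
# Hypothetical customers: (wealth_score, amount_spent). Invented for illustration.
customers = [(1.0, 25.0), (2.0, 38.0), (3.0, 61.0), (4.0, 79.0), (9.0, 185.0)]

def nearest_neighbor(train, x):
    """Instance-based: return the output of the closest stored customer."""
    return min(train, key=lambda point: abs(point[0] - x))[1]

def least_squares_line(train, x):
    """Model-based: fit a straight line to the training data and evaluate it at x."""
    n = len(train)
    mean_x = sum(px for px, _ in train) / n
    mean_y = sum(py for _, py in train) / n
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in train)
    sxx = sum((px - mean_x) ** 2 for px, _ in train)
    slope = sxy / sxx
    return mean_y + slope * (x - mean_x)

def error_on_most_extreme(predict, data):
    """Hold out the highest-scoring customer and return the absolute error on them."""
    ordered = sorted(data)
    train, (x, y) = ordered[:-1], ordered[-1]
    return abs(predict(train, x) - y)

candidates = {
    "nearest_neighbor": nearest_neighbor,
    "least_squares_line": least_squares_line,
}
errors = {name: error_on_most_extreme(fn, customers) for name, fn in candidates.items()}

for name, err in errors.items():
    print(f"{name}: absolute error on held-out customer = {err:.1f}")
```

On this made-up data the fitted line lands much closer to the held-out customer than the nearest neighbor does, but that outcome depends entirely on spending following a roughly straight trend. With differently shaped data the comparison could easily go the other way, which is why it is worth running rather than assuming.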
