Machine Learning — Look-ahead Bias
Understand the bias with a concrete example
The concept
In Data Science and Machine Learning, look-ahead bias is a problem that can degrade both the performance of a model and the quality of its predictions.
This bias is easy to understand but difficult to avoid, because the model often performs well during training, so no issue is visible at that stage.
In this article, I illustrate this bias with a simple example: building a supervised model to predict a product price. Afterward, we dive into explanations and tips to avoid it.
Example
To illustrate this bias, imagine you want to build a model to predict the price of a product. To train the model, you use a bunch of data such as:
- Product characteristics (brand, weight, size, product category)
- Historical data (previous price, number of sales)
- Geographical data (the country and region in which the product is sold)
- Customer reviews
- Consumer characteristics (customer age, preferences)
- Etc.
During the training phase, you gather all this data, perform feature engineering, and start to train…
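To make the setup concrete, here is a minimal sketch of how one training row could be assembled from the data groups listed above. Everything in it is an assumption made for illustration: the `ProductRecord` class, its field names, the `to_features_and_target` helper, and the values are hypothetical, not taken from a real dataset or from a specific library.

```python
from dataclasses import dataclass, asdict


@dataclass
class ProductRecord:
    """One hypothetical training row; all field names are illustrative."""
    # Product characteristics
    brand: str
    weight_kg: float
    category: str
    # Historical data
    previous_price: float
    units_sold: int
    # Geographical data
    country: str
    # Customer reviews and consumer characteristics
    avg_review_score: float
    avg_customer_age: float
    # Target: the price the model should learn to predict
    price: float


def to_features_and_target(record):
    """Split a record into a feature dict and the target value."""
    row = asdict(record)
    target = row.pop("price")  # the target must not stay among the features
    return row, target


# Made-up values, only to show the shape of the data.
record = ProductRecord(
    brand="Acme",
    weight_kg=1.2,
    category="kitchen",
    previous_price=19.9,
    units_sold=340,
    country="FR",
    avg_review_score=4.3,
    avg_customer_age=37.0,
    price=21.5,
)

features, target = to_features_and_target(record)
```

In a real project, each of these fields would come from a different source and would be transformed during feature engineering before being passed to the model.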