Guess What? Model Extrapolation!

Jinbin · Published in unpack · Jan 18, 2021 · 3 min read


[Image source: The Spokesman-Review]

Extrapolation, in mathematical terms, is the estimation of a value that lies beyond the range of known values in a pattern or function. In other words, you are making your best guess about further outcomes based on the knowledge or information you already have. For example, knowing the pattern of oil demand in the market over the last two years, we try to predict demand over the next decade based on that pattern. Similarly, in machine learning (ML), model extrapolation means using a trained model to make predictions on data that lies outside the training and/or validation set, like a bear classifier attempting to recognise a polar bear after being trained on data that contains no polar bears at all.
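
To make the idea concrete, here is a tiny toy example of my own (the numbers are made up, not from this post): a straight line is fitted to a handful of observed points, then queried both inside the observed range (interpolation) and far outside it (extrapolation).

```python
# Toy illustration: fit a line to known points, then predict inside and
# outside the observed range. All values here are invented for the example.
import numpy as np

x_known = np.array([1.0, 2.0, 3.0, 4.0])   # observed inputs
y_known = np.array([2.1, 3.9, 6.2, 8.1])   # observed outputs (roughly 2x)

coeffs = np.polyfit(x_known, y_known, deg=1)  # fit y = a*x + b
predict = np.poly1d(coeffs)

print(predict(2.5))   # interpolation: inside [1, 4], usually reliable
print(predict(10.0))  # extrapolation: far outside [1, 4], a best guess only
```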

Model extrapolation, however, seems fairly tricky to achieve without problems, as machines generally cannot make accurate predictions or recommendations about information they have never learned. Neither can humans, for that matter. Despite the challenges, there are techniques in place that could help models extrapolate.

First things first, building an efficient ML model is the key. The model should be neither under-fitted nor over-fitted. To get there, practitioners can lean on data augmentation and other machine learning techniques, such as collecting more suitable data, implementing Mixup to create synthetic data, generalising the architecture with Batch Normalisation, and regularising the model with weight decay. Finally, we could reduce the complexity of the architecture where possible.
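
As a quick illustration of the Mixup point above, here is a minimal sketch in PyTorch. The batch tensors `x` and `y`, the helper name `mixup_batch`, and the value alpha=0.2 are my own illustrative assumptions, not something specified in this post.

```python
# Minimal Mixup sketch: blend each example in a batch with a random partner,
# producing synthetic inputs and correspondingly blended (soft) targets.
import torch

def mixup_batch(x: torch.Tensor, y: torch.Tensor, alpha: float = 0.2):
    """x: batch of inputs; y: one-hot or soft labels as floats."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))         # random pairing within the batch
    x_mix = lam * x + (1 - lam) * x[perm]    # synthetic inputs
    y_mix = lam * y + (1 - lam) * y[perm]    # blended targets
    return x_mix, y_mix

# Weight decay (L2 regularisation) is usually just an optimiser argument, e.g.:
# optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
```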

Another technique for model extrapolation, in my opinion, is feedback loops. Although they are a controversial topic these days, especially on social media platforms, feedback loops could, without a doubt, improve the accuracy of a model after training. If a positive feedback loop were implemented properly, the model would carry on learning from new real-world data through the feedback it gathers. However, it doesn't come without cost: while it is in place, a feedback loop could also make the model more biased toward a particular direction.
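
To show what such a loop might look like in practice, here is a toy, runnable sketch using scikit-learn and synthetic data; it is purely illustrative, not a production pattern, and the drifted feedback data is invented to make the caveat visible.

```python
# Toy feedback loop: a deployed classifier receives batches of "new" labelled
# examples (the feedback), which are folded back into the training set before
# periodic retraining. If the feedback over-represents one region, the model
# drifts with it -- the bias risk mentioned above.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

model = LogisticRegression().fit(X_train, y_train)

for round_ in range(3):
    # Pretend these are real-world inputs plus human-verified feedback labels.
    X_new = rng.normal(loc=2.0, size=(50, 2))            # drifted inputs
    y_new = (X_new[:, 0] + X_new[:, 1] > 0).astype(int)
    X_train = np.vstack([X_train, X_new])                # fold feedback back in
    y_train = np.concatenate([y_train, y_new])
    model = LogisticRegression().fit(X_train, y_train)   # periodic retraining
```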

Certainly, there has been more research recently on extrapolation in machine learning models, such as a study in Austria proposing a function-learning network, the equation learner (EQL), which aims to help ML models make predictions across all possible domains.
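
For a rough feel of the EQL idea, below is a simplified toy rendering in PyTorch: a layer whose linear outputs feed identity, sine, cosine, and multiplication units, so the network resembles an algebraic expression rather than a generic function approximator. This is my own sketch of the concept, not the authors' implementation, and the layer sizes are arbitrary.

```python
# Simplified EQL-style layer: a linear map followed by "algebraic" units
# (identity, sin, cos, and pairwise products) instead of generic activations.
import torch
import torch.nn as nn

class EQLLayer(nn.Module):
    def __init__(self, in_dim: int, units: int = 4):
        super().__init__()
        # 3 unary groups (id, sin, cos) plus 1 binary group needing 2 inputs each
        self.linear = nn.Linear(in_dim, 3 * units + 2 * units)
        self.units = units

    def forward(self, x):
        z = self.linear(x)
        u = self.units
        ident = z[:, :u]
        sine = torch.sin(z[:, u:2 * u])
        cosine = torch.cos(z[:, 2 * u:3 * u])
        prod = z[:, 3 * u:4 * u] * z[:, 4 * u:5 * u]   # multiplication units
        return torch.cat([ident, sine, cosine, prod], dim=1)

model = nn.Sequential(EQLLayer(1), nn.Linear(4 * 4, 1))  # tiny 1-D regressor
out = model(torch.randn(8, 1))                            # forward pass check
```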

When it comes to extrapolation performance across different ML models, they don't seem to hold up well so far. In a comparison study in Germany, several ML algorithms, including linear regression, neural networks, random forest regression, and Gaussian Processes, were tested on the interpolation and extrapolation of Flame Describing Functions. All of the algorithms reportedly performed poorly in the extrapolation test; even with EQL, performance improved by only 4%. We certainly cannot jump to conclusions based on one study, so further research would be needed.
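
To illustrate the kind of gap such studies report, here is a small synthetic experiment of my own (not the cited study): a few common regressors are trained on x in [0, 3] against a sine target and evaluated both inside and well outside that range.

```python
# Toy interpolation-vs-extrapolation comparison on a sine target.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 3, size=(300, 1))
y_train = np.sin(x_train).ravel()

x_interp = rng.uniform(0, 3, size=(100, 1))   # inside the training range
x_extrap = rng.uniform(5, 8, size=(100, 1))   # far outside it

for name, model in [("linear", LinearRegression()),
                    ("forest", RandomForestRegressor(random_state=0)),
                    ("mlp", MLPRegressor(max_iter=2000, random_state=0))]:
    model.fit(x_train, y_train)
    err_in = np.mean((model.predict(x_interp) - np.sin(x_interp).ravel()) ** 2)
    err_out = np.mean((model.predict(x_extrap) - np.sin(x_extrap).ravel()) ** 2)
    print(f"{name}: interpolation MSE={err_in:.3f}, extrapolation MSE={err_out:.3f}")
```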

In any case, model extrapolation has made scientists and researchers scratch their heads hard, and I am still looking for a better solution. So, I guess ML models will have to keep on guessing in unknown domains.
