Transfer Learning

Tech in 3
Published in Nerd For Tech · 3 min read · Feb 10, 2021

[Image source: DataCamp]

Transfer Learning will be the next driver of Machine Learning success. — Andrew Ng

What is Transfer Learning?

The general idea of transfer learning is to use knowledge learned from tasks for which a lot of labeled data is available in settings where only a little labeled data is available. Creating labeled data is expensive, so optimally leveraging existing datasets is key.

In a traditional machine learning model, the primary goal is to generalize to unseen data based on patterns learned from the training data. With transfer learning, you attempt to kickstart this generalization process by starting from patterns that have been learned for a different task. Essentially, instead of starting the learning process from an (often randomly initialized) blank sheet, you start from patterns that have been learned to solve a different task.
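
To make this concrete, here is a minimal sketch of that idea using TensorFlow/Keras (not from the original article): we load an ImageNet-pretrained ResNet50 as the "patterns learned for a different task", freeze it, and train only a new classification head on a small target dataset. The dataset `train_ds`, the input shape, and the 10-class head are all placeholder assumptions.

```python
# A minimal transfer-learning sketch with TensorFlow/Keras.
# Assumes: `train_ds` is a hypothetical tf.data.Dataset of (image, label)
# pairs for your own small, labeled target task.
import tensorflow as tf

# Start from patterns learned on ImageNet instead of random initialization.
base = tf.keras.applications.ResNet50(
    weights="imagenet",       # pretrained weights: the transferred knowledge
    include_top=False,        # drop the original 1000-class ImageNet head
    input_shape=(224, 224, 3),
    pooling="avg",
)
base.trainable = False        # freeze the pretrained feature extractor

# Add a fresh head for the target task (a hypothetical 10-class problem).
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Only the new head is learned from the small target dataset:
# model.fit(train_ds, epochs=5)
```

Because the frozen backbone already encodes general visual features, the only parameters fit to the small dataset are those of the new head, which is exactly what lets transfer learning work with far less labeled data.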

[Image source: Towards Data Science]

There are three reasons to have a good understanding of transfer learning as a data scientist:

Transfer learning is essential to any kind of learning. Humans are not taught every single task or problem before becoming successful at it. Everyone gets into situations they have never encountered before, and we still manage to solve problems in an ad-hoc manner. The ability to learn from a large number of experiences and to export that ‘knowledge’ into new environments is exactly what transfer learning is all about.

Transfer learning is key to ensuring the breakthrough of deep learning techniques in the many small-data settings that exist in practice. Deep learning is pretty much everywhere in research, but deep learning techniques require massive amounts of data to tune the millions of parameters in a neural network, and most real-life scenarios simply do not offer millions of labeled data points to train a model.

Especially in the case of supervised learning, this means that you need a lot of (highly expensive) labeled data. Labeling images may sound trivial, but in fields such as Natural Language Processing (NLP), expert knowledge is required to create a large labeled dataset. Transfer learning is one way of reducing the dataset size required for neural networks to be a viable option. Another viable option is moving towards more probabilistically inspired models, which are typically better suited to limited datasets.

Transfer learning has significant advantages as well as drawbacks, and understanding these drawbacks is vital for successful machine learning applications. Transfer of knowledge is only possible when it is ‘appropriate’, and exactly defining what appropriate means in this context is not easy; experimentation is typically required. You should not trust a toddler who drives around in a toy car to be able to drive a Ferrari. The same principle holds for transfer learning: although hard to quantify, there is an upper limit to what can be transferred. It is not a solution that fits every problem.

[Image source: freeCodeCamp]

The requirements of transfer learning

Transfer learning, as the name states, requires the ability to transfer knowledge from one domain to another. It can be interpreted at a high level: NLP model architectures can be reused for sequence-prediction problems, since many NLP problems can inherently be reduced to sequence prediction. It can also be interpreted at a low level, where you literally reuse parameters from one model in a different model, as the sketch below illustrates.
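
Here is a minimal sketch of that low-level interpretation, again in TensorFlow/Keras and not taken from the original article: two models share the same backbone architecture, and the parameters learned on the data-rich source task are copied directly into the model for the target task. The backbone layout, input size, and class counts are all hypothetical.

```python
# A minimal sketch of "low-level" transfer: reusing learned parameters.
# Assumes the source task has 20 classes and the target task has 3,
# and that both models share an identical backbone architecture.
import tensorflow as tf

def build_backbone():
    # Hypothetical backbone shared by both tasks.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(64,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
    ])

source_model = tf.keras.Sequential([build_backbone(),
                                    tf.keras.layers.Dense(20)])
# ... source_model is trained here on the large, labeled source dataset ...

target_model = tf.keras.Sequential([build_backbone(),
                                    tf.keras.layers.Dense(3)])

# Copy the learned backbone parameters into the new model: this is the
# literal transfer of knowledge at the parameter level.
target_model.layers[0].set_weights(source_model.layers[0].get_weights())
target_model.layers[0].trainable = False  # optionally freeze what was transferred
```

Whether to freeze the transferred parameters or fine-tune them on the target data is itself an experimental choice, which echoes the earlier point that deciding when transfer is ‘appropriate’ usually requires trying it out.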

