What is Transfer Learning?

Vidhi
AI for High Schoolers by High Schoolers
3 min read · Dec 23, 2023

Learning how to ice skate may help people learn how to ski. If humans can reuse skills from one task, why can’t AI?

Collecting a large amount of data when tackling a completely new task can be challenging, to say the least. Obtaining satisfactory model performance (think model accuracy) using only a limited amount of data for training is also tricky… if not impossible. Fortunately, there is a solution that can address this very problem, and it is called Transfer Learning.

It almost sounds too good to be true, and the idea behind it is simple: you can train a model with only a small amount of data and still achieve a high level of performance. Pretty cool, right?

Transfer learning models focus on storing knowledge gained while solving one problem and applying it to a different but related problem. Transfer learning is used in scenarios where there is not enough data for training or when we want better results in a short amount of time.

For example, if you trained a simple classifier to predict whether an image contains a backpack, you could use the model’s training knowledge to identify other objects, such as sunglasses.

With transfer learning, we basically try to use what we’ve learned in one task to better understand the concepts in another. In practice, the weights learned by a network trained on the source task (“task A”) are transferred to a network being trained on the related target task (“task B”).

Now that we know what transfer learning is, how does it work?

1. Obtain a pre-trained model (note: do some research on which model works best for your task). Instead of training a neural network from scratch, many pre-trained models can serve as the starting point for training. These pre-trained models provide a proven architecture and save time and resources.

For computer vision:

1. VGG-16
2. VGG-19
3. Inception V3
4. Xception
5. ResNet-50

For NLP tasks:

1. Word2Vec
2. GloVe
3. FastText
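As a sketch of step 1, a pre-trained vision backbone such as VGG-16 can be obtained in a single call with Keras. (Here `weights=None` keeps the example offline and the small input shape is just for illustration; in real use you would pass `weights="imagenet"` to download the actual pre-trained weights.)

```python
from tensorflow import keras

# Load the VGG-16 architecture without its original ImageNet
# classification head. weights=None keeps this sketch offline;
# pass weights="imagenet" in practice to get the pre-trained weights.
backbone = keras.applications.VGG16(
    weights=None,
    include_top=False,        # drop the original 1000-class output layers
    input_shape=(32, 32, 3),  # tiny input, just for illustration
)
print(backbone.output_shape)  # feature maps, not class predictions
```

With `include_top=False`, the model outputs feature maps rather than class probabilities, which is exactly what the later steps build on.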

2. Create a base model: reuse the same early layers as the pre-trained model, typically everything except its original output layers.

3. Add additional layers on top of the feature-extraction layers to predict the specialized task; these are generally the final output layers. It is common to fine-tune the higher-level layers of the model while freezing the lower ones, since the basic knowledge they encode is what transfers from the source task to the target task.
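Steps 2 and 3 together might look like this in Keras. This is only a sketch: the input shape, the pooling layer, and the single sigmoid unit (for a binary task like the backpack example above) are illustrative assumptions, and `weights=None` stands in for `weights="imagenet"`.

```python
from tensorflow import keras

# Step 2: the pre-trained network becomes the base model.
# weights=None keeps the sketch offline; use weights="imagenet" in practice.
base = keras.applications.VGG16(
    weights=None, include_top=False, input_shape=(32, 32, 3)
)
base.trainable = False  # freeze the lower-level feature-extraction layers

# Step 3: stack new task-specific layers on top of the frozen base,
# e.g. one sigmoid unit for "contains a backpack: yes/no".
model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1, activation="sigmoid"),
])
```

Freezing the base (`base.trainable = False`) is what preserves the transferred knowledge: only the newly added layers will learn during the next step.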

4. Train the model with the new output layer in place.
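Continuing the sketch, step 4 is an ordinary training loop; only the new head's weights are updated because the base is frozen. The random arrays below are toy stand-ins for a real labeled dataset.

```python
import numpy as np
from tensorflow import keras

# Rebuild the frozen base + new head from the previous step.
base = keras.applications.VGG16(
    weights=None, include_top=False, input_shape=(32, 32, 3)
)
base.trainable = False
model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Step 4: train; only the new output layers' weights change.
model.compile(optimizer="adam", loss="binary_crossentropy")
x = np.random.rand(4, 32, 32, 3).astype("float32")  # toy stand-in images
y = np.array([0, 1, 0, 1], dtype="float32")         # toy labels
history = model.fit(x, y, epochs=1, verbose=0)
```

Even a small dataset can work here, because the hard part (feature extraction) was already learned on the source task.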

5. Fine-tuning involves unfreezing part of the base model and training the entire model again on the whole dataset at a very low learning rate. The low learning rate lets the model adapt to the new dataset without drastically overwriting the pre-trained weights, which helps prevent overfitting (where the model fits its training data exactly but cannot perform accurately on unseen data).
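Step 5 can be sketched as follows. How many layers to unfreeze (the last four here) and the exact learning rate are illustrative choices, not fixed rules.

```python
from tensorflow import keras

# Rebuild the model from the earlier steps.
base = keras.applications.VGG16(
    weights=None, include_top=False, input_shape=(32, 32, 3)
)
model = keras.Sequential([
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Step 5: unfreeze only the last few layers of the base model;
# the lower layers stay frozen.
base.trainable = True
for layer in base.layers[:-4]:
    layer.trainable = False

# Recompile with a very low learning rate so the pre-trained weights
# are only nudged toward the new dataset, limiting overfitting.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
)
```

After recompiling, training proceeds as in step 4, now gently adjusting the top of the base model as well as the new head.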

While transfer learning showcases its powerful capabilities, it’s crucial to recognize the potential for false outcomes stemming from machine learning algorithms and biases within training data. It is essential that we prioritize transparency in AI systems. The development process should include rigorous testing, ongoing validation, and continuous monitoring. Only through these can we guarantee our AI model’s alignment with human values and use it to make a positive impact.
