Transfer Learning Will Radically Change Machine Learning for Engineers

The Problem

Harsh Sikka · ModelDepot · Feb 1, 2018


In traditional supervised machine learning, we teach a model to perform a task by training it on example data. Generally, once the model performs well on the training data for its domain or problem, we expect it to generalize reasonably to new data. But, if you think about it, there are a few issues with this traditional supervised learning process.

As engineers, we’re forced to construct models that excel only at a single, narrow problem. This costs us valuable engineering time to create, train, and tune models from scratch for every new problem we want to tackle, even if it’s a problem that has already been solved elsewhere in industry.

From a product perspective, this is detrimental to progress, and it can seriously hamper feature releases and engineering productivity. Transfer Learning offers an interesting solution to this problem.

As a paradigm, transfer learning lets us leverage existing knowledge and data from a related domain when training for a new one. In 2016, Andrew Ng posited that Transfer Learning would be a key driver of machine learning’s commercial and industrial success.

Applications of Transfer Learning

Making use of pre-trained models and related domain data promises to supercharge much of general machine learning development. By tapping into a model pre-trained for a purpose related to your own, your team can leapfrog the data cleaning, setup, and training required to bring a model up to par for the task.

Two common areas where Transfer Learning has already been applied to great success are images and text.

Transfer Learning has been particularly effective with image data, where it is common to leverage a deep learning model trained on a large image dataset like ImageNet. These pre-trained models can be included directly in new models that expect some form of image input.

Mask R-CNN: an ML model from Facebook built on top of a pre-trained ResNet, an image classification backbone.
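As a concrete illustration, here is a minimal sketch of this idea in Keras: a ResNet50 pre-trained on ImageNet is loaded, its weights are frozen, and a small new classification head is added for a hypothetical 5-class task (the class count and input shape are placeholders, not from the original post).

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Load a ResNet50 backbone with ImageNet weights, dropping the original
# 1000-class classifier so we can attach our own head.
base = tf.keras.applications.ResNet50(
    weights="imagenet",
    include_top=False,
    input_shape=(224, 224, 3),
)
base.trainable = False  # freeze the pre-trained layers

# Add a new classification head for our own (hypothetical) 5-class problem.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Only the small new head is trained, so far less data and compute are needed than training the whole network from scratch.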

With textual data, words are mapped to vectors such that words with similar meanings have similar vector representations. Pre-trained models that learn these representations, such as word2vec and GloVe, are widely available. These embeddings can then be incorporated into deep learning language models, at either the input or output stage.
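A minimal sketch of how this works, assuming pre-trained GloVe vectors in a file named "glove.6B.100d.txt" (the path and the tiny vocabulary below are placeholders): we copy the pre-trained vectors into an embedding matrix and use it to initialize a frozen Keras Embedding layer.

```python
import numpy as np
from tensorflow.keras import layers

embedding_dim = 100
vocab = {"movie": 1, "great": 2, "boring": 3}  # hypothetical word index

# Load pre-trained GloVe vectors into a dictionary: word -> vector.
glove = {}
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.split()
        glove[parts[0]] = np.asarray(parts[1:], dtype="float32")

# Copy vectors for the words we know; unknown words stay at zero.
embedding_matrix = np.zeros((len(vocab) + 1, embedding_dim))
for word, idx in vocab.items():
    if word in glove:
        embedding_matrix[idx] = glove[word]

# A frozen embedding layer initialized with the pre-trained representations.
embedding_layer = layers.Embedding(
    input_dim=len(vocab) + 1,
    output_dim=embedding_dim,
    weights=[embedding_matrix],
    trainable=False,
)
```

The rest of the language model can then be trained on top of these fixed word representations, rather than learning them from scratch.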

Transfer Learning and pre-trained models are the future of machine learning in general development, and as such they need to be made more accessible and discoverable for everyone.

That’s why we’re building ModelDepot, to decrease the friction associated with model access and contribute to the democratization of AI in the 21st century.

Join the conversation on Gitter! 👋
