Multi-Task Deep Learning with PyTorch

Alessandro Lamberti · Published in Artificialis · 4 min read · Sep 26, 2022



In Machine Learning, we typically aim to optimize a single metric for a single task.
Multi-task learning (MTL), by contrast, trains one model to handle several tasks at once. It has led to successes in many applications of machine learning, from natural language processing and speech recognition to computer vision and drug discovery.

The most famous case of MTL is probably Tesla's Autopilot.
Tesla has to solve a multitude of tasks: object detection, depth estimation, 3D reconstruction, video analysis, tracking, and more. You might expect them to run 10+ separate Deep Learning models working together, but that's not the case.

Introduction to HydraNet

Tesla’s main idea is fairly simple: one body, several heads. A single model is leveraged to solve multiple tasks, with a shared backbone feeding one output head per task.
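
As a rough sketch of that idea in PyTorch, a shared backbone ("body") can feed several task-specific heads. This is an illustrative toy, not Tesla's actual HydraNet; the ResNet-18 backbone and the two heads here are assumptions chosen only to show the pattern:

```python
import torch
import torch.nn as nn
from torchvision import models

class HydraNetSketch(nn.Module):
    """Toy 'one body, several heads' model (illustrative, not Tesla's HydraNet)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Shared backbone ("body"): a ResNet-18 used as a feature extractor
        backbone = models.resnet18(weights=None)
        self.body = nn.Sequential(*list(backbone.children())[:-1])  # drop the final FC layer
        feat_dim = backbone.fc.in_features  # 512 for ResNet-18

        # Task-specific heads sharing the same features
        self.classification_head = nn.Linear(feat_dim, num_classes)
        self.depth_head = nn.Linear(feat_dim, 1)  # e.g. one scalar depth estimate per image

    def forward(self, x):
        features = self.body(x).flatten(1)  # shared representation
        return {
            "classes": self.classification_head(features),
            "depth": self.depth_head(features),
        }

model = HydraNetSketch()
outputs = model(torch.randn(4, 3, 224, 224))
print(outputs["classes"].shape, outputs["depth"].shape)  # torch.Size([4, 10]) torch.Size([4, 1])
```

Because every head reads from the same features, the backbone is computed once per image instead of once per task, which is exactly what makes the single-model approach cheaper than running 10+ networks.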


At a high level, features are extracted from each camera image by feature-extraction backbones. These features are then combined into a single “super image”, which is in turn combined with previous super images, bringing time into the equation. That temporal fusion is usually done with 3D CNNs, RNNs, Transformers, etc.
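
As a minimal illustration of the RNN option, per-frame feature vectors can be fused over time with a GRU. This is a simplified assumption for the example, not the actual fusion module Tesla uses:

```python
import torch
import torch.nn as nn

class TemporalFusionSketch(nn.Module):
    """Toy example: fuse per-frame feature vectors over time with a GRU."""
    def __init__(self, feat_dim: int = 512, hidden_dim: int = 256):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True)

    def forward(self, frame_features):
        # frame_features: (batch, time, feat_dim), e.g. backbone outputs for each frame
        fused, _ = self.gru(frame_features)
        return fused[:, -1]  # latest frame's representation, informed by the past

fusion = TemporalFusionSketch()
video_feats = torch.randn(2, 8, 512)  # 2 clips, 8 frames each, 512-dim features per frame
print(fusion(video_feats).shape)      # torch.Size([2, 256])
```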
