Opportunities in Deep Learning

Although the Gartner hype cycle suggests that Deep Learning reached peak hype in 2019, I believe Deep Learning is still in its early stages, and numerous extremely valuable opportunities lie ahead. Here are some of the opportunities I see for the next decade.

1. Real-world deep learning

While the previous generation of deep learning models lived on large servers in the cloud, the next generation will be on-device models (e.g., phones, cars, perhaps even people). The web and its ranking algorithms self-select for canonical, low-complexity representations of things, while the real world is unforgiving in its endless complexity and has irreversible consequences. As deep networks move into the real world, it is essential that we get this right.
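To make this concrete, here is a minimal sketch of packaging a trained model for on-device inference by exporting it to TorchScript; the choice of MobileNetV2 and the file name are illustrative assumptions, not a prescription.

```python
import torch
import torchvision

# A small, mobile-friendly backbone (illustrative choice; any trained model works).
model = torchvision.models.mobilenet_v2(weights=None)
model.eval()

# Trace the model into TorchScript so it can run without a Python runtime,
# e.g. inside a phone app or an embedded controller in a car.
example_input = torch.randn(1, 3, 224, 224)
scripted = torch.jit.trace(model, example_input)

# Quantization and mobile-specific optimizations would typically follow here.
scripted.save("mobilenet_v2_on_device.pt")
```

Getting a model onto the device is the easy part; the hard part is making it robust to the long tail of inputs the real world will throw at it.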

Stay tuned for our next post on bringing safer self-driving cars to market where we will elaborate on this in greater detail.

2. Automating the creation of datasets

Today, most datasets are still built through painstaking manual labeling. However, I believe that a large portion of this manual labeling is unnecessary and will become obsolete in the near future. Once a rule is adequately expressed through human labeling, an algorithm can learn the rule and apply it to the majority of cases; humans need only handle the long tail. There are already a few known techniques for this, including classical unsupervised learning, self-supervised learning, and distillation/active learning. However, these approaches are just the tip of the iceberg, and there is a lot of research left to be done.
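As one hedged sketch of the distillation/active-learning idea: train a model on a small hand-labeled set, let it pseudo-label the unlabeled pool, keep only its high-confidence predictions, and route the uncertain long tail to human annotators. The threshold and function interfaces below are illustrative assumptions, not a specific library API.

```python
import torch
import torch.nn.functional as F

CONFIDENCE_THRESHOLD = 0.95  # illustrative cutoff; tuned per task in practice

@torch.no_grad()
def pseudo_label(model, unlabeled_loader):
    """Auto-label the easy majority; defer the uncertain long tail to humans."""
    auto_labeled, needs_human = [], []
    model.eval()
    for x in unlabeled_loader:
        probs = F.softmax(model(x), dim=1)
        conf, pred = probs.max(dim=1)
        for xi, ci, pi in zip(x, conf, pred):
            if ci.item() >= CONFIDENCE_THRESHOLD:
                auto_labeled.append((xi, pi.item()))  # accept the model's label
            else:
                needs_human.append(xi)  # long tail: send to annotators
    return auto_labeled, needs_human
```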

3. Recycling old models

While I think progress and sustainability are equally important, it's no secret that the carbon footprint of training Deep Learning models is increasing over time. According to some estimates, training a single large model can emit roughly five times the lifetime carbon emissions of an average car.

This might sound surprising to some, but I think there's a huge opportunity in recycling old models. Researchers in academia and industry typically discard old models whenever a better one comes along, effectively throwing away the compute and resources that went into them. At some point, the gains of each new model no longer offset the compute spent training its predecessors, and we make a net loss over time (assuming growth saturates at some point). If we can learn how to leverage previously trained models to improve the system instead of throwing them away, we can greatly improve growth and productivity.
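One hedged sketch of what recycling could look like in practice: warm-starting a new model from an old checkpoint, copying over whatever weights still match instead of training from scratch. The checkpoint path and model are placeholders.

```python
import torch

def warm_start(new_model, old_checkpoint_path):
    """Reuse compatible weights from a previously trained model."""
    old_state = torch.load(old_checkpoint_path, map_location="cpu")
    new_state = new_model.state_dict()

    # Copy only tensors whose names and shapes still match the new architecture.
    reused = {
        k: v for k, v in old_state.items()
        if k in new_state and v.shape == new_state[k].shape
    }
    new_state.update(reused)
    new_model.load_state_dict(new_state)
    print(f"Recycled {len(reused)}/{len(new_state)} tensors from the old model")
    return new_model
```

Warm-starting is only one option; distilling old models into new ones, or ensembling them, are others.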

We have thought about this and developed several ways to improve the sustainability of our models at NuronLabs but this is still a vastly underexplored opportunity.

4. Reproducibility of deep learning research

The fact is, a large proportion of current Deep Learning research is not reproducible. The pressure to publish or perish has led the community to report results regardless of whether they truly move the field forward.

Practices such as not publishing code, pre-selecting random seeds, reporting the maximum result achieved instead of the mean and variance, and cherry-picking visual results all contribute to this.
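Reporting the mean and variance over several seeds, for example, costs little. Here is a minimal sketch; `train_and_evaluate` stands in for whatever training script a paper actually uses.

```python
import statistics
import torch

def run_experiment(train_and_evaluate, seeds=(0, 1, 2, 3, 4)):
    """Run the same experiment across several seeds and report mean +/- std."""
    scores = []
    for seed in seeds:
        torch.manual_seed(seed)  # seed everything your pipeline actually uses
        scores.append(train_and_evaluate(seed))
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)
    print(f"accuracy: {mean:.3f} +/- {std:.3f} over {len(seeds)} seeds")
    return mean, std
```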

There are other important issues with the current state of Deep Learning, such as using graduate student descent to boost results or relying on enormous amounts of compute, which I will discuss in another post. Reproducibility, however, is something that can be solved imminently.

Check out this blog post that outlines a potential solution to this problem.

5. AI powertools

Deep learning tooling is still low-level and boilerplate-heavy. Many people know this already and are trying to solve it by introducing greater abstractions over the code (e.g., PyTorch Ignite, Keras, fast.ai, PyTorch Lightning). These are a vast improvement over prior tools and greatly improve productivity.
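To make the contrast concrete, here is roughly what these abstractions buy you: the framework owns the training loop (device placement, backward pass, checkpointing), and the researcher only declares the model and the optimization step. The snippet below is a hedged sketch in the spirit of PyTorch Lightning's public API; exact details vary across versions.

```python
import torch
from torch import nn
import pytorch_lightning as pl

class LitClassifier(pl.LightningModule):
    """The researcher declares *what* to train; the framework runs the loop."""

    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.cross_entropy(self.model(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Device placement, logging, checkpointing, etc. are handled by the Trainer:
# trainer = pl.Trainer(max_epochs=3)
# trainer.fit(LitClassifier(), train_dataloaders=my_dataloader)
```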

However, these tools feel more PC and less Mac. I think there’s a huge opportunity to introduce a set of well designed, vertically integrated tools that are powerful yet elegant.

6. Using human filters for AI creativity

I believe that there’s an intrinsic set of aesthetic axioms that are distributed across the human race. The challenge is to understand these motifs. Integrating the biological feedback loops that we’ve evolved over millennia with modern deep learning feedback loops could be extremely powerful.
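A hedged sketch of what such a human-in-the-loop filter might look like: a generative model proposes candidates, humans rate them, and the ratings flow back as a training signal. Every interface here (`generator.sample`, `collect_human_ratings`, the reward-weighted update) is a hypothetical placeholder, not an existing API.

```python
import torch

def human_filtered_update(generator, optimizer, collect_human_ratings, n_candidates=32):
    """One round of a human-in-the-loop feedback loop for a generative model.

    Hypothetical sketch: sample candidates, let humans score them (e.g. 0 to 1),
    then nudge the generator toward the samples people actually liked.
    """
    samples, log_probs = generator.sample(n_candidates)     # assumed interface
    ratings = torch.tensor(collect_human_ratings(samples))  # human aesthetic scores

    # Reward-weighted update: higher-rated samples get reinforced more strongly.
    loss = -(ratings * log_probs).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```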
