2020 in Review With Jürgen Schmidhuber

Synced · Published in SyncedReview · Dec 25, 2020 · 6 min read

In 2020, Synced covered many memorable moments in the AI community: the current situation of women in AI, the birth of GPT-3, AI's fight against COVID-19, heated debates around AI bias, MT-DNN surpassing human baselines on GLUE, AlphaFold cracking a 50-year-old biology challenge, and more. To close the chapter on 2020 and look forward to 2021, we are introducing a year-end special issue, following Synced's tradition of looking back at recent AI achievements and exploring possible future trends with leading AI experts. Here, we invite Prof. Jürgen Schmidhuber to share his insights on the current development and future trends of artificial intelligence.

Meet Jürgen Schmidhuber

The media have called Jürgen Schmidhuber the father of modern AI. Since age 15, his main goal has been to build a self-improving AI smarter than himself, then retire. His lab's deep learning neural networks (developed at TU Munich and the Swiss AI Lab IDSIA, USI & SUPSI) have revolutionized machine learning. By 2017 they were on over 3 billion smartphones and used many billions of times per day, for Facebook's automatic translation, Google's speech recognition and Google Translate, Apple's Siri & QuickType, Amazon's Alexa, etc. He also pioneered adversarial networks, artificial curiosity, and meta-learning machines that learn to learn. He is the recipient of numerous awards and chief scientist of the company NNAISENSE, which aims to build the first practical general-purpose AI. He also advises various governments on AI strategies.

The Best AI Technology Developed in the Past 3 to 5 Years: “Deep Learning & Highway Networks”

The basic ideas behind the deep learning revolution were published deep in the previous millennium. However, these old ideas (plus certain improvements) work much better with the faster computers of today. Since 1941, when Konrad Zuse built the first working programmable general-purpose computer, compute has become 10 times cheaper every 5 years. In 1990, compute was thus 1 million times more expensive than in 2020 (six factors of 10 over those 30 years). Nevertheless, it was back then that we published many of the basic deep learning ideas within fewer than 12 months, in our "Annus Mirabilis" or "Miraculous Year" 1990–91 at TU Munich, for example: artificial curiosity and its special case now called GANs (1990), deep learning through unsupervised pre-training (1991), vanishing gradients & LSTM, planning with recurrent world models, etc. See http://people.idsia.ch/~juergen/deep-learning-miraculous-year-1990-1991.html for more. Deep learning has greatly profited from this hardware acceleration between 1990 and 2020, especially in the recent 3–5 years you mentioned.

One big thing of the past 5 years was the Highway Network. In the early 2010s, neural nets (NNs) were not yet extremely deep, reaching at most a few tens of layers, e.g., 20–30. In May 2015, however, there was something new: our Highway Networks were the first working, really deep feedforward neural networks with hundreds of layers. This was made possible by my PhD students Rupesh Kumar Srivastava and Klaus Greff. Highway Nets are essentially feedforward versions of our earlier recurrent LSTM nets. If we keep the gates of a Highway Net always open, we obtain the so-called Residual Net or ResNet (Dec 2015), a special case of the Highway Net. Microsoft Research won the ImageNet 2015 contest with a very deep ResNet of 150 layers. Today, many are using such networks. More: http://people.idsia.ch/%7Ejuergen/highway-networks.html
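To make the gating idea concrete, here is a minimal sketch of a highway layer, assuming PyTorch (the framework and all names below are illustrative choices, not original code from the paper). Each layer blends a transformed signal H(x) with the untransformed input x, weighted by a learned transform gate T(x); the carry path is what lets gradients flow through hundreds of layers.

```python
import torch
import torch.nn as nn

class HighwayLayer(nn.Module):
    """y = H(x) * T(x) + x * (1 - T(x)): a transform gate T decides how much
    of the new activation H(x) to use; the rest of the input is carried over."""
    def __init__(self, dim: int):
        super().__init__()
        self.transform = nn.Linear(dim, dim)  # computes the candidate H(x)
        self.gate = nn.Linear(dim, dim)       # computes the transform gate T(x)
        # A negative gate bias makes layers initially carry the input through,
        # which helps very deep stacks start training.
        nn.init.constant_(self.gate.bias, -2.0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.transform(x))
        t = torch.sigmoid(self.gate(x))
        return h * t + x * (1.0 - t)

# A stack of 100 such layers still trains, thanks to the carry path.
net = nn.Sequential(*[HighwayLayer(64) for _ in range(100)])
y = net(torch.randn(8, 64))  # batch of 8 vectors, feature dim 64
```

Removing the gates entirely, so that the layer simply computes H(x) + x, gives the ResNet special case mentioned above.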

The Most Promising AI Technology in the Next 1 to 3 Years: “Non-Traditional Methods for Reinforcement Learning”

The most promising AI technology is non-traditional methods for reinforcement learning (RL). For example, our RL LSTM can be trained by policy gradients (PG), as shown in 2007–2010 with collaborators including my PhD student Daan Wierstra, who later became employee number 1 of DeepMind, the company co-founded by his friend Shane Legg, another PhD student from my lab. (In fact, Shane and Daan were the first persons at DeepMind with AI publications and PhDs in computer science.) Policy gradients for LSTM have become important. For example, in 2019, DeepMind beat a pro player in the game of StarCraft, which is harder than chess or Go in many ways, using AlphaStar, whose brain has a deep LSTM core trained by PG. And the famous OpenAI Five learned to defeat human experts in the Dota 2 video game in 2018. Its core also was a PG-trained LSTM, accounting for 84% of the model's total parameter count. Bill Gates called this a "huge milestone in advancing AI." And combined with ideas about recurrent world models (since 1990 — see above, and the recent work with Google's David Ha, 2018), this will become even more important.
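As a rough illustration of what training an LSTM policy by policy gradients means, here is a minimal REINFORCE-style sketch, assuming PyTorch; the observation dimensions, episode, and rewards are placeholders of my own, not the actual AlphaStar or OpenAI Five setup.

```python
import torch
import torch.nn as nn

class LSTMPolicy(nn.Module):
    """Recurrent policy: a sequence of observations in, action distributions out."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq, state=None):
        out, state = self.lstm(obs_seq, state)
        return torch.distributions.Categorical(logits=self.head(out)), state

policy = LSTMPolicy(obs_dim=4, n_actions=2)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# One policy-gradient update on a toy 10-step episode with placeholder rewards.
obs = torch.randn(1, 10, 4)                   # (batch, time, obs_dim)
dist, _ = policy(obs)
actions = dist.sample()
rewards = torch.randn(1, 10)                  # stand-in for environment rewards
returns = rewards.flip(1).cumsum(1).flip(1)   # undiscounted reward-to-go
loss = -(dist.log_prob(actions) * returns).sum()  # REINFORCE objective
opt.zero_grad(); loss.backward(); opt.step()
```

The recurrent core is what lets such a policy handle partially observable games like StarCraft and Dota 2, where the current screen alone does not reveal the full state.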

The Biggest Challenge in the Field of AI: “AI in the Physical World”

The big thing of the future will be AI in the physical world, for robots, industrial processes, and so on. Today most profits from AI are in the virtual world, for marketing and selling ads — that's what the big platform companies on the Pacific Rim do: Alibaba, Amazon, Facebook, Tencent, Google, Baidu … But marketing is just a tiny fraction of the world economy. A much bigger part is everything else, which is going to be invaded by AI as well, like in the movies. That's what our company NNAISENSE is about. It is pronounced like "nascence" (meaning "birth") but spelled differently, because it's about the birth of a general-purpose Neural Network-based Artificial Intelligence for the physical world.

The Latest Noteworthy Development: “Upside Down RL”

One very interesting recent development is called Upside Down RL (UDRL). It turns traditional RL on its head: standard RL predicts rewards, while UDRL instead uses rewards as task-defining inputs, together with representations of time horizons and other computable functions of historic and desired future data. UDRL learns to interpret these inputs as commands, mapping them to actions (or action probabilities) through supervised learning on past (possibly accidental) experience. UDRL then generalizes to achieve high rewards or other goals through input commands such as: get lots of reward within at most so much time! Certain problems of traditional RL with high-dimensional actions, partial observability, and reward discount factors disappear. First experiments showed that even a pilot version of UDRL can outperform traditional baseline algorithms on certain challenging RL problems.
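To sketch the training step, here is a minimal UDRL-style policy in PyTorch (a toy illustration under my own assumptions, not the paper's code): the network receives an observation plus a command (desired return, desired horizon) and is trained by supervised learning to reproduce the actions that actually achieved that return within that horizon in replayed experience.

```python
import torch
import torch.nn as nn

class UDRLPolicy(nn.Module):
    """Maps (observation, command) to action logits; the command encodes the
    desired return and the desired time horizon as extra inputs."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 2, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, obs, desired_return, desired_horizon):
        cmd = torch.stack([desired_return, desired_horizon], dim=-1)
        return self.net(torch.cat([obs, cmd], dim=-1))

policy = UDRLPolicy(obs_dim=4, n_actions=2)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Supervised step on a replayed batch: the targets are the actions that were
# actually taken and did achieve the observed return within the horizon.
obs = torch.randn(32, 4)
achieved_return = torch.randn(32)               # return actually obtained
horizon = torch.randint(1, 100, (32,)).float()  # steps it actually took
taken_actions = torch.randint(0, 2, (32,))
logits = policy(obs, achieved_return, horizon)
loss = nn.functional.cross_entropy(logits, taken_actions)
opt.zero_grad(); loss.backward(); opt.step()
```

At evaluation time, one would feed in a command asking for a high return within a chosen horizon and let the trained network generalize to actions that fulfill it.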

Synced Report | A Survey of China’s Artificial Intelligence Solutions in Response to the COVID-19 Pandemic — 87 Case Studies from 700+ AI Vendors

This report offers a look at how China has leveraged artificial intelligence technologies in the battle against COVID-19. It is also available on Amazon Kindle. Along with this report, we also introduced a database covering an additional 1,428 artificial intelligence solutions across 12 pandemic scenarios.

Click here to find more reports from us.

We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.


AI Technology & Industry Review — syncedreview.com | Newsletter: http://bit.ly/2IYL6Y2 | Share My Research http://bit.ly/2TrUPMI | Twitter: @Synced_Global