2017 Retrospective on Predictions for Deep Learning
Last year, I wrote my predictions for Deep Learning in 2017. I will recap those predictions and present new predictions for the coming year.
Here’s the recap of the 2017 predictions. Refer to the predictions article for more detail.
1. Hardware will accelerate doubling Moore’s law (i.e. 2x in 2017).
Nvidia continues to dominate as predicted. They’ve extended their lead with the Volta V100, which includes a 110 teraflop tensor core component.
I expected Intel to show up in mid-2017; however, they have not shown up at all, with the exception of a 1 teraflop Movidius embedded deep learning chip. Intel threw a great party at NIPS 2017 to unveil its hardware. However, without any details on performance, it’s clear to me that this is more vapor than real.
Amazon’s FPGA-based cloud instance did not go anywhere. As expected, the upfront time investment needed to use FPGAs is too steep a price for anyone to pay.
Google revealed details of their TPU chip and then a few months later, unexpectedly revealed the TPU 2 system that is able to crank out 180 teraflops in a single module.
AMD delivered their Vega architecture in the middle of the year with fp16 and int8 capability. The specs for a single card are 25 teraflops at fp16 and 50 teraops at int8. This is competitive for novel or unique loads; however, it is hard to compete with a dedicated tensor core as found in Nvidia’s V100 and Google’s TPU. AMD’s ROCm support for Deep Learning frameworks still needs improvement, though it is incrementally moving forward.
2. Convolution Networks (CNN) will Dominate
CNNs have emerged in areas where RNNs have traditionally been used, as well as in planning and prediction. RNNs have not completely disappeared from the landscape. Geoffrey Hinton unveiled his Capsule Network, which is designed to address the many flaws of CNNs.
There’s been no progress in differentiable memory networks. The early work on this appears to have died down and there seems to be less interest in creating Turing-like machines based on Deep Learning.
3. Designers will rely more on Meta-Learning
A lot of strong research in Meta-learning appeared in 2017. The most impactful development in this field comes from the MAML research where training is performed across diverse tasks. There have been plenty of discoveries using brute force search, which is a kind of meta-learning.
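As a rough illustration of what "training across diverse tasks" means in MAML, here is a toy sketch. It is not the authors' implementation: the model is a single scalar weight, each task is a hypothetical regression target y = a*x, and the names `loss_grad` and `maml_train` are made up for this example. The one idea it tries to show is that the outer (meta) update differentiates through the inner adaptation step.

```python
def loss_grad(w, a, xs):
    """Gradient of mean((w*x - a*x)^2) with respect to w, plus mean(x^2)."""
    c = sum(x * x for x in xs) / len(xs)
    return 2.0 * (w - a) * c, c


def maml_train(task_slopes, xs, inner_lr=0.1, outer_lr=0.05, steps=500):
    """Meta-learn an initial weight w that adapts well to every task.

    task_slopes: one slope `a` per task (hypothetical toy tasks y = a*x).
    xs: shared input points used for both the inner and the outer loss.
    """
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for a in task_slopes:
            # Inner loop: one gradient step on this task, starting from w.
            g, c = loss_grad(w, a, xs)
            w_adapted = w - inner_lr * g
            # Outer loss is evaluated at the adapted weight.
            g_adapted, _ = loss_grad(w_adapted, a, xs)
            # Chain rule through the inner step:
            # d(w_adapted)/dw = 1 - 2 * inner_lr * c for this quadratic loss.
            meta_grad += g_adapted * (1.0 - 2.0 * inner_lr * c)
        w -= outer_lr * meta_grad / len(task_slopes)
    return w


if __name__ == "__main__":
    w0 = maml_train([1.0, 2.0, 3.0], [-1.0, 0.0, 1.0])
    print(f"meta-learned initial weight: {w0:.4f}")
```

With task slopes 1, 2 and 3, the meta-learned weight should settle close to 2.0, the starting point from which a single gradient step gets closest to each task on average. Real MAML does the same thing with neural networks and automatic differentiation in place of the hand-derived chain rule.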
4. Reinforcement Learning will only become more creative
Reinforcement learning has exceeded expectations with AlphaGo Zero and AlphaZero. Despite the concerns of many practitioners that RL can’t scale due to its huge appetite for data, the most impressive developments in Deep Learning can be found in this area.
5. Adversarial and Cooperative Learning will be King
6. Predictive Learning or Unsupervised Learning will not progress much
If you consider the self-play of AlphaGo Zero and AlphaZero as unsupervised learning, then my prediction of a lack of progress is entirely incorrect. Indeed, if self-play is the key to unsupervised learning, then this is a massive leap forward and is truly unexpected!
7. Transfer Learning leads to Industrialization
We haven’t developed good enough methods to allow our models to be transferred more easily into different domains. Deep Learning methods are very far from being mature industrialized methods. The most impactful development in transfer learning is the amazing work done by an Nvidia team to create stunning high resolution images using GANs.
8. More Applications will use Deep Learning as a component
We already saw this in 2016 where Deep Learning was used as a function evaluation component in a much larger search algorithm. AlphaGo employed Deep Learning in its value and policy evaluations. Google’s Gmail auto-reply system used DL in combination with beam searching. I expect to see a lot more of these hybrid algorithms rather than new end-to-end trained DL systems. End-to-end Deep Learning is a fascinating area of research, but for now hybrid systems are going to be more effective in application domains.
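To make the hybrid idea concrete, here is a minimal sketch of beam search in which the scoring function is a pluggable component. In a real system that component would be a trained network; here `toy_model` is a hypothetical lookup table standing in for one, and all names are illustrative rather than taken from any Google system.

```python
import math


def beam_search(next_token_probs, beam_width, length):
    """Keep the `beam_width` most probable sequences at each step.

    next_token_probs(prefix) returns a dict mapping token -> probability.
    In a hybrid system this callable would wrap a learned model.
    """
    beams = [((), 0.0)]  # (token tuple, cumulative log-probability)
    for _ in range(length):
        candidates = []
        for prefix, logp in beams:
            for token, p in next_token_probs(prefix).items():
                if p > 0.0:
                    candidates.append((prefix + (token,), logp + math.log(p)))
        # Classical search decides what to keep; the model only scores.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Hypothetical stand-in for a learned next-token model.
_TABLE = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def toy_model(prefix):
    return _TABLE.get(prefix, {})


if __name__ == "__main__":
    print(beam_search(toy_model, beam_width=2, length=2)[0])
    print(beam_search(toy_model, beam_width=1, length=2)[0])
```

With a beam width of 1 the search commits to "a" at the first step and ends with a sequence of probability 0.3. With a width of 2 it keeps "b" alive and finds ("b", "x") at 0.36, which is the kind of gain the search wrapper adds on top of the model's raw scores.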
A revealing presentation (NIPS 2017, December) by Jeff Dean of Google shows that Deep Learning as a component is finding use as an indexing structure. The paper is “The Case for Learned Index Structures”. I had not expected this, but it portends even greater ubiquity of Deep Learning in software applications.
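The core idea of a learned index can be sketched in a few lines: a model predicts where a key sits in a sorted array, and a bounded local search corrects the prediction. The sketch below is an assumption-laden toy, not the paper's design. It uses a single least-squares line instead of a neural network or a hierarchy of models, and the class name `LearnedIndex` is invented for illustration.

```python
import bisect


class LearnedIndex:
    """Toy learned index over a non-empty sorted list of unique numeric keys."""

    def __init__(self, sorted_keys):
        if not sorted_keys:
            raise ValueError("sorted_keys must be non-empty")
        self.keys = list(sorted_keys)
        n = len(self.keys)
        # Fit position ~ slope * key + intercept by ordinary least squares.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2.0
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error on the stored keys bounds the search.
        self.max_err = max(abs(i - self._predict(k)) for i, k in enumerate(self.keys))

    def _predict(self, key):
        pos = int(round(self.slope * key + self.intercept))
        return min(max(pos, 0), len(self.keys) - 1)

    def lookup(self, key):
        """Return the position of `key`, or None if it is absent."""
        pos = self._predict(key)
        lo = max(0, pos - self.max_err)
        hi = min(len(self.keys), pos + self.max_err + 1)
        # Binary search only inside the window guaranteed by max_err.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < len(self.keys) and self.keys[i] == key:
            return i
        return None


if __name__ == "__main__":
    index = LearnedIndex([2, 3, 5, 8, 13, 21, 34, 55])
    print(index.lookup(13), index.lookup(4))
```

The better the model fits the key distribution, the smaller `max_err` becomes and the less work the fallback search has to do. That trade-off is what makes a learned model a plausible drop-in for a traditional index structure.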
9. Design Patterns will be increasingly Adopted
This has not happened. The field is growing at an extremely rapid pace and our understanding of how these systems work keeps on changing. Nobody seems to have the luxury to create a clean set of design patterns that can guide the further maturity and industrialization of these techniques. Progress may need to plateau first before we can even consider curating and organizing our knowledge.
10. Engineering will outpace Theory
Ali Rahimi’s talk at NIPS 2017 finally branded Deep Learning as what it really is: “alchemy”. This was a problem in 2016 and it wasn’t addressed in 2017. However, perhaps better theories will come out in 2018, given that a lot of people were sympathetic to Rahimi’s talk.
Naftali Tishby had an elegant theory based on information theory, only to have it questioned by a paper submitted to ICLR 2018. Tomaso Poggio also came up with a set of papers that address Deep Learning fundamentals. The questions raised by the “Rethinking Generalization” paper haven’t been resolved. There have been some approaches where adversarial inputs aren’t present.
I am working on my 2018 predictions, so please stay tuned.