Retrospective from Deep Learning Predictions Made 4 Years Ago
Progress in deep learning is subtle, and it takes considerable effort to discern the trends. I used to publish predictions every year; the last set was at the end of 2018. It's instructive to look back at what I guessed roughly four years ago.
My first prediction was that there would be no new hardware innovation beyond Tensor Cores. This has held up over four years, but do let me know if I've missed anything new or revolutionary.
My second prediction concerned unsupervised learning. Its decline is enough of a consensus that @ylecun now states the term "is useless," despite his claiming in 2019 that it was the future (see: The AI technique that could imbue machines with the ability to reason).
My third prediction concerned meta-learning. There has certainly been scant deliberate progress in the area, unless you count generalist architectures like Gato and few-shot learning in models like GPT-3: emergent meta-learning rather than meta-learning by design.
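To make the "emergent" part concrete, here is a minimal sketch of few-shot in-context learning. The translation pairs and formatting are hypothetical illustrations, not from any real system prompt; the point is that the "learning" happens entirely in the prompt at inference time, with no gradient updates.

```python
# Few-shot prompting: demonstrations are placed in the context window,
# and the model is expected to continue the pattern. No weights change.
examples = [
    ("cheese", "fromage"),
    ("bread", "pain"),
]
query = "apple"

# Build the prompt from the demonstrations, then append the query.
prompt = "\n".join(f"English: {en}\nFrench: {fr}" for en, fr in examples)
prompt += f"\nEnglish: {query}\nFrench:"
print(prompt)
```

A model trained only on next-token prediction that completes this prompt correctly is, in effect, performing a learning task it was never explicitly trained to do.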
My fourth prediction concerned generative models and their use in scientific prediction. AlphaFold 1 appeared in 2018, but it's intriguing that AlphaFold 2 solved the problem using transformer models rather than by learning physics. In biology, you treat sequences like language: as first-class objects!
My fifth prediction was about the use of hybrid models. This surprisingly lacks any traction despite @GaryMarcus's constant promotion of the idea. I'm perhaps just as surprised as he is at the scant progress.
My sixth prediction is related to my fourth: that we can't conjure up good simulations, but we can achieve good mimicry. We see this in AlphaFold 2 and in the mimicry of GPT-3 and DALL-E 2. Deep learning can't derive underlying causal effects.
My seventh prediction concerned DL in design exploration. Well, it's finally here with DALL-E 2, Stable Diffusion, and Midjourney. It did not arrive in 2019 but roughly four years later; for me, this indicates that my expectations ran ahead of their time.
My eighth prediction concerned the decline of end-to-end training in favor of curriculum-style training. This is partially true: language models are pretrained in self-supervised mode and then fine-tuned for downstream tasks. But I expected greater modularity, which has yet to materialize.
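The two-stage pipeline described above can be sketched with a toy example. Everything here is a hypothetical stand-in: a linear "encoder" is pretrained on an unlabeled reconstruction task (an analogue of masked-token prediction), then its output is reused as a feature for a small supervised head.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Stage 1: self-supervised pretraining (toy stand-in) ---
# Predict one held-out column of the data from the others; no labels used.
X = rng.normal(size=(200, 8))               # "unlabeled" data
masked = X[:, 0]                            # feature to reconstruct
inputs = X[:, 1:]
W, *_ = np.linalg.lstsq(inputs, masked, rcond=None)  # pretrained encoder

# --- Stage 2: supervised fine-tuning on a downstream task ---
# Reuse the pretrained representation (inputs @ W) and fit a small head
# on labeled data, rather than training everything end to end.
y = (X.sum(axis=1) > 0).astype(float)       # toy downstream labels
feats = np.column_stack([inputs @ W, np.ones(len(X))])
head, *_ = np.linalg.lstsq(feats, y, rcond=None)
preds = (feats @ head > 0.5).astype(float)
accuracy = (preds == y).mean()
```

The design point is the decoupling: stage 1 never sees the downstream labels, and stage 2 treats the pretrained weights as a fixed, reusable component.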
My ninth prediction was that the transformer model would take over the world. It has come true (see: An Argument for Modeling Consolidation).
My tenth prediction was a hopeful one: that the DL community would adopt more holistic perspectives. This has not happened and remains an important project for me.
A lot is happening in the DL space, and I find it useful to consolidate the trends into ten predictions for the future. It's not obvious how all of these stitch together into a larger synthesis, but I think a synthesis is essential to understand where we are heading.
So, whatever synthesis my intuition has developed seems to have survived the test of time. I notice these general trends, and I hope my observations prove useful well beyond today.
But which prediction would I change after four years? I would point to DL's inability to generate abstractions and perform abduction. Although DL is a new kind of artificial cognition, it is not like biological cognition.
DL will continue to be very useful, but it hits a wall with respect to autonomous systems and higher-level abstract thinking. Its capabilities are bounded from below as well as from above. This is curious because it hints that humans still need to conjure up the revolutionary ideas.