Paradigm shifts: deep learning’s lesson to modern science

Hadayat Seddiqi
5 min read · Mar 22, 2017


When trying to effect a paradigm shift, we have to think beyond the idea, methods, or technology itself. These are not the drivers of change; they are only the tools. It is people you have to convince, and it is people who will build and expand on those ideas in order to bring them to ubiquity.

There is a bit of ickiness to this. As scientists, artists, or engineers, we want to focus on the idea itself. We think that if we create something great enough, it will win people over by its magic. Magic is necessary, but it is only the beginning of effecting real change.

“This is the science of programming: make building blocks that people can understand and use easily, and people will work together to solve the very largest problems.”
―Pieter Hintjens, ZeroMQ

Deep learning is a paradigm-shifting technology. I bought in because I saw magic, but what impressed me far more was how effective the community became in the years that followed. I think the magic enabled a lot, but it is the philosophy held by the early pioneers and those who followed that made deep learning what it is today. That philosophy focuses on lowering the barrier to entry in a difficult technical field in order to build a strong community.

The first and most obvious driver was the supporting technology. Deep learning, as you may know, has existed as a theory for many decades, but the increase in computer performance over the last decade has been its great enabler. Graphics cards built to quickly render triangles for video games and movies turned out to be curiously good at training deep learners. Unfortunately, for a long time they were very difficult to program.

Today we have TensorFlow, probably the most advanced machine learning library to date. It is a revolution on par with “Big Data” software like MapReduce. The barrier to building a sophisticated machine learning system has been lowered considerably: with TensorFlow we can prototype on a laptop (with or without a graphics card) and then move to large computer systems with ease. This is critical for building reliable software products with intelligence, and it will go a long way toward bringing deep learning into widespread industrial use.
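To make that concrete, here is a minimal sketch of what such prototyping looks like, written against the TensorFlow 1.x API of the time; the toy model, random data, and hyperparameters are all invented for illustration, not taken from any particular project:

```python
# A minimal sketch (TensorFlow 1.x, circa 2017) of prototyping a small
# network. The same graph definition runs on a laptop CPU or a GPU:
# TensorFlow places operations on an available device automatically,
# so scaling up later is a matter of configuration, not a rewrite.
import numpy as np
import tensorflow as tf

# A tiny two-layer network for a toy regression problem.
x = tf.placeholder(tf.float32, shape=[None, 4], name="inputs")
y = tf.placeholder(tf.float32, shape=[None, 1], name="targets")

hidden = tf.layers.dense(x, units=16, activation=tf.nn.relu)
prediction = tf.layers.dense(hidden, units=1)

loss = tf.reduce_mean(tf.square(prediction - y))
train_op = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Invented random data, standing in for a real dataset.
    data_x = np.random.rand(32, 4).astype(np.float32)
    data_y = np.random.rand(32, 1).astype(np.float32)
    for step in range(100):
        _, loss_value = sess.run([train_op, loss],
                                 feed_dict={x: data_x, y: data_y})
    print("final loss:", loss_value)
```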

The second driver is strong and open communication, something that clearly runs deep in the field. You can glean a lot about this just by reading some early works. Yoshua Bengio is my personal favorite: his expositions on deep learning and why it is such a fundamental advancement in AI are fantastic models of how technical communication should be done[*]. These days many researchers do more than write scientific publications. Ian Goodfellow, Andrew Ng, Charles Martin, and many other talented individuals write about deep learning on Quora, a popular Q&A platform. Andrej Karpathy, Denny Britz, and Christopher Olah explain deep learning concepts on their blogs with great skill. Many of these researchers recently created Distill, a new publication platform for machine learning research.

“Clear writing benefits everyone.” — Distill team

Distill is the clearest indicator of how important communication is in the eyes of the community. In the final note on his blog[**], Chris Olah talked about some of the problems he faced in prioritizing his writing over research. This is a problem that is personally very dear to me. I wrote a lot about computer science and physics during my research in quantum computing, and I was really lucky to have a group that was very supportive of it. But many in the field frown upon this sort of activity. For instance, read Sean Carroll's ideas on how to get tenure in cosmology: one of the bullet points says, "Don't write a book," which is not at all in the spirit of effective dissemination of knowledge. This is a dismal state of affairs for the rest of science, no doubt.

Many other scientific fields are stuck in a world before open collaboration became common. Physics is somewhat progressive in this sense: physicists invented the arXiv to get around entrenched journal institutions and their restrictive publishing rules. But physics and other sciences can be competitive in all the wrong ways, for instance forcing secrecy to avoid being scooped. I remember many cases where getting code from a collaborator was impossible, and this was someone who was committed to being helpful! Much work in experimental sciences like biology is impossible to verify or reproduce because the details required to replicate experiments do not appear in journal publications[***]. The situation is even worse in healthcare. A broken patent system, along with the very high cost of developing new technology, means that IP law can be abused to silo important progress in pursuit of profit. This is a great loss for almost everyone.

Deep learning has shown that big companies like Facebook and Google can fund labs, build incredible technology, and open it all up to the world without losing their competitive edge. The philosophy is one of cross-collaboration between academia and industry, of open-source software tools, and of the efficient exchange and dissemination of ideas on new platforms and mediums. It is a philosophy that allows the deep learning community to charge ahead like no field has ever done before.

For the rest of science, it is imperative to move toward a model of open collaboration and the free exchange of tools and technology. In an age when the internet connects more people than ever before, we still erect artificial barriers to scientific progress. Deep learning as a technology will continue to change the world in unforeseen ways, but deep learning as a paradigm shift holds just as much value for us: it is a model for how science should be done.

[*] They were also very important to me in switching from theoretical physics to neural networks research.

[**] As far as I understand, this marks only the end of the deep learning posts on his blog.

[***] This can happen for a plethora of reasons. It is not always that someone is maintaining secrecy for competitive advantage; sometimes the journal frowns upon wordy explanations, enforcing this through peer review or length limits. It is also the job of a journal to ensure that published results can be reproduced by a third party (but this is just my opinion).
