An alternate viewpoint on “Artificial Intelligence Is Stuck”
This article takes a contrarian viewpoint to Prof. Gary Marcus’s article “Artificial Intelligence Is Stuck. Here’s How to Move It Forward”, featured in The New York Times.
While I do acknowledge the merit of the central argument that academic research labs are often cash-strapped, and that industry at large has neither much incentive nor, often, the ambition or mettle to think beyond the narrow confines of its immediate cash returns, the last few years have been quite fascinating for me as a researcher, in the sense of seeing a tighter exchange of ideas and people between academia and industry. The solution, in my opinion, is strengthening various aspects of these collaborations, rather than formulating a newer model.
I am adopting the perspective of a researcher/entrepreneur who has shifted from academia to industry, and who is trying to build up a lab/group of computer vision researchers to explore applications of computer vision and machine learning to solve problems and explore newer possibilities in the medium of photography ( https://www.eyeem.com/tech ).
What currently works
What is interesting to see is that the handful of true research labs in industry ( Google Brain/DeepMind, FAIR, OpenAI, MSR, etc. ) are structured in a model heavily inspired by, and synchronized with, the academic model. Their core emphasis is on publishing ideas, facilitating reproducible research, and releasing datasets ( often a primary asset ) and code.
To give an example, a few years back while I was a PhD student/post-doc, a good majority of my time was spent reimplementing ideas from related work, writing various optimizers, or just compiling code written for a different platform, which invariably seemed to depend on various esoteric libraries. This process is so much easier now with the support of well-tested and documented libraries. For example, the libraries we use in my group: Tensorflow ( developed inside Google, with a lot of people contributing to .contrib ), pytorch ( developed inside FAIR, with a lot of active researchers and practitioners submitting pull requests ) and Theano ( Uni Montréal, with the backing of a similarly excited community ) have made this process so much more efficient. I now spend an afternoon or two, instead of weeks, on this process.
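To make that concrete, here is a minimal sketch of what these libraries buy you, using pytorch ( the model, toy data and hyperparameters here are illustrative assumptions, not anything from our actual work ). The network layers, the loss, and the optimizer below are each a single line; every one of them used to be hand-written, hand-debugged code:

```python
import torch
import torch.nn as nn

# A tiny MLP: each of these pieces used to be a multi-day implementation effort.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()                          # well-tested loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # well-tested optimizer

x = torch.randn(32, 4)           # toy batch of random features
y = torch.randint(0, 2, (32,))   # toy binary labels

for _ in range(5):               # a few training steps
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()              # autograd replaces hand-derived gradients
    optimizer.step()
```

The point is not this particular model, but that gradient computation, optimization and numerics all come battle-tested out of the box.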
This equally applies when translating an idea into a prototype. For example: for various projects, at various stages, I had implemented SVMs, Random Forests, SIFT descriptors, MLPs etc. from scratch. All of these were multi-week efforts. At the moment they are 1 to 2 lines of Python code ( yay! ). I remember there was a popular myth floating around at EPFL ( around 7–8 years back ), where I did my post-doc, that convolutional neural networks could only be trained by a select few who were students of Prof. Yann LeCun. Now, however, a fresh bachelor student can train such a network, courtesy of all the interest and discovery since abstracted into tutorials and demo code.
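The “1 to 2 lines” claim is easy to illustrate with scikit-learn ( used here as a stand-in for whichever library one prefers; the toy data is made up for the example ):

```python
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# Toy, linearly separable data: the label equals the first feature.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]

# Each of these was once a multi-week from-scratch effort; now it is two lines.
svm = SVC(kernel="linear").fit(X, y)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
```

Swapping one model for another is a one-line change, which is exactly what makes rapid prototyping possible.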
The other thing I noticed is that at conferences like CVPR/NIPS, the nature of the conversations I have with research counterparts has remained the same as in pre-deep-learning days ( though the sheer scale of these conferences has exploded ); it still revolves around newer ideas, implications, possible spins and extensions. This openness is in contrast with a few years back, when work in Computer Vision and Machine Learning at Google/Microsoft/Facebook/and Apple(?) :-) was mostly kept a trade secret.
Primary Objection
Think as a community, not as an isolated academic group
“Too many separate components are needed for any one lab to tackle the problem. A full solution will incorporate advances in natural language processing (e.g., parsing sentences into words and phrases), knowledge representation (e.g., integrating the content of sentences with other sources of knowledge) and inference (reconstructing what is implied but not written). Each of those problems represents a lifetime of work for any single university lab.”
In my opinion, this is where the power of the research community can be leveraged to the maximum. Being primarily a computer vision researcher, I am interested in the space between how we cross-communicate about visuals using language and abstract ( language-free? ) thoughts about visuals ( an example being our work on personalized aesthetics: https://devblogs.nvidia.com/parallelforall/personalized-aesthetics-machine-learning/ ). This involves thinking deeply and working extensively in knowledge representation and inference, as mentioned in the article. Though our group’s expertise is mainly computer vision focused, the publications, interactions, code and datasets released by the research community on this front have made such a task accessible to us, and equipped us with the ability to think of these problems from our own perspective and narrative. We should be thinking of these problems as a community, not necessarily as a single group.
Differences from CERN
Unlike the search for particles at CERN ( or gravitational waves at LIGO ), the current state of AI research is still in too relative an infancy to warrant a “focused” search by international entities. That is, we do not have an equivalent of a credible postulate, such as the Higgs boson or gravitational waves, around which to conduct a search of that scale.
Neither does it require hardware of that scale. I do drool over the possibility of access to GPU clusters of ATLAS scale; but that is me being infantile. The postulates we have in AI ( for example, how to tackle unsupervised learning ) can be cross-checked with a relatively much smaller set of hardware, albeit on a limited set of data. Off the top of my mind, it sure would be very interesting to train VAE/GAN-style encoder-decoder networks on multiple trillions of data points, but is this the fundamental problem we need to solve at the moment for understanding/modeling intelligence? There are many interesting questions we need to answer first; for example, what are the generalization capabilities of VAE/GAN-style models: are they memorizing, or do they have interesting generalization possibilities? But my point is that these are not questions we can answer by having scale or volume. In my experience, various groups approaching the problem in various forms is usually the best way to reach the optimal answer ( think of all the contributions from various groups, in terms of architectures, non-linearities, sampling schemes, loss functions etc., that are commonly used in training a deep learning architecture nowadays ).
The current bottleneck
Access to data, access to computational resources, and the number of high-quality researchers working on problems that are fundamental in nature remain the current bottlenecks for the growth of this field. At the moment, both from an industrial and an academic perspective, setting up large-scale AI research labs outside cash-rich corporations ( which are sitting on top of multiple billions of dollars of pocket money ) is still a tall order and a risky proposition. There is no denying that the threat of A.I. becoming the property of a privileged few is real.
However, it is something the research community can actively address by staying true to a sense of openness, and to a commitment to the sheer pleasure of finding things out as the main criterion. To be honest, I am more optimistic about our field than I was 4 to 5 years ago; we just need to keep the idea of research alive in the true sense of the word!
