Power to the People: How One Unknown Group of Researchers Holds the Key to Using AI to Solve Real…
Greg Borenstein

Greg Borenstein, the author of this Medium post, is a researcher at MIT Media Lab, so we can assume he’s a smart guy. The post is well-written in the sense that the sentences flow well from one to the next — but they also contain some deep factual misstatements, to wit:

> And then, as Deep Learning gained more support from big companies like Google and Facebook, it started to produce achievements that were legible — and extremely impressive — to the wider public. AlphaGo won historic victories against the world’s leading Go players. IBM Watson dominated human players at Jeopardy on network TV.

While IBM’s current Watson business unit may employ deep learning here and there, the Watson system that won at Jeopardy in 2011 had absolutely nothing to do with deep learning. It’s not part of the same trend at all. The Watson team at that time was not inspired by Hinton et al. to apply deep learning to NLP for the purpose of answering trivia questions.

Which leads us to this, the supposed reveal:

> But now for a splash of cold water: while AI systems have made rapid progress, they are nowhere near being able to autonomously solve any substantive human problem.

Self-driving vehicles are a substantive human problem, and deep learning is at the heart of the computer vision technologies that help steer them. Most tasks of machine perception, whether we’re dealing with images, time series, sound or text, have seen great gains as well. Insubstantive? Hardly. For a brief review of deep neural networks and a non-exhaustive list of what they can do, please see these pages:



If Borenstein means that AI doesn’t solve problems entirely by itself without human intervention … well, sure. It doesn’t. But nobody who counts is saying that it does, so it would be a strange straw man to argue against. Do humans need to be involved in constructing and tuning AI solutions? Of course they do.

Here’s what Borenstein is driving at:

> What’s needed for AI’s wide adoption is an understanding of how to build interfaces that put the power of these systems in the hands of their human users. What’s needed is a new hybrid design discipline, one whose practitioners understand AI systems well enough to know what affordances they offer for interaction and understand humans well enough to know how they might use, misuse, and abuse these affordances.

So human-computer interfaces and augmented intelligence are the only “substantive” human problems Borenstein recognizes. While those are great problems to work on, I think few people would actually agree with him that they are the only significant problems to be solved.

> As Recurrent Neural Nets surpass Convolutional Neural Nets only to be outpaced by Deep Reinforcement Learning which in turn is edged out by the inevitable Next Thing in this incredibly fast moving field…

This is unfortunately untrue. While they have some overlap, CNNs and RNNs each do well on different problems, and few researchers would claim that one has surpassed the other, or that RL has surpassed them both. (Yann LeCun would be especially surprised to hear that…) DeepMind’s AlphaGo algorithm combines CNNs with RL and Monte Carlo Tree Search, which suggests that the best systems arise from combining techniques rather than from one technique winning a horse race.

Such a poor understanding of these algorithms bodes ill for Mr. Borenstein’s attempts to bring them to the masses by building a VisiCalc equivalent for AI…

> The core job of most machine learning systems is to generalize from sample data created by humans. The learning process starts with humans creating a bunch of labeled data…. At the end of training the learning algorithm produces a classifier

Borenstein has glossed over unsupervised learning, which deep learning can perform well and which has no need of labeled data. Unsupervised learning can be used for both search and anomaly detection, two enormous problems. This omission makes many of his subsequent assertions irrelevant or wildly off base when applied to ML as a whole. He’s really just talking about classifiers.

On the subject of building classifiers, yes, we do need systems that allow us to label data easily, and we need to improve the ones we have. CrowdFlower and Mechanical Turk are two tools that researchers use now, and many companies roll their own solutions in house. Unsupervised learning does not have that problem.
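To make the "no labels" point concrete, here is a minimal sketch of unsupervised anomaly detection. It is deliberately not deep learning: it uses a simple robust statistic (the modified z-score, based on the median absolute deviation) rather than a neural network, and the function name, threshold and sample readings are all made up for illustration. The only thing it is meant to show is that nobody has to mark any example as "normal" or "anomalous" beforehand.

```python
from statistics import median


def find_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    No labels are used: what counts as "normal" is estimated from the
    data itself. The threshold of 3.5 is a common rule of thumb, not a
    tuned value.
    """
    center = median(values)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [v for v in values if 0.6745 * abs(v - center) / mad > threshold]


# Hypothetical, unlabeled sensor readings with one obvious outlier.
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 25.0, 10.0]
print(find_anomalies(readings))
```

A deep unsupervised model such as an autoencoder would replace the median-based score with a learned reconstruction error, but the workflow is the same: fit on raw data, then flag whatever the model finds surprising.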

> So, if we want to build systems that users trust and that we can rapidly improve, we should select algorithms not just for how often they produce the right answer, but for what hooks they provide for explaining their inner workings.

This is one of the most baffling sentences in the whole essay. If we were to select algorithms for their explanatory power, we would not be working with deep neural networks at all, because they lack feature introspection. And by doing that, we would give up most of the gains AI has made in recent years. His stance is perplexing and scientifically regressive.

While I support initiatives that integrate AI and HCI to augment UX and technology in general, I find this essay to be a particularly poor expression of efforts in that direction. It’s disappointing to see a researcher employ the sensationalist tactics, misdirections and exaggerations of the press to draw attention to his writing and school of thought, and it does not do credit to his cause.