The man who revolutionized computer vision, machine translation, games and robotics

Published in AI Frontiers, Aug 29, 2018

On stage, the man described how neural networks work. His clarity was astonishing. I have never seen the concept of deep learning stated so simply and so precisely. All the jargon suddenly took on a vivid form, woven together into one framework. But why should I be surprised? He is the person who invented much of that jargon. Ilya Sutskever, the co-inventor of AlexNet, is also the inventor of sequence to sequence learning (as the first author). He has led the revolution in both computer vision and natural language processing, a remarkable achievement for a man who later also contributed to AlphaGo and TensorFlow.

But he continues to invent. Today Ilya Sutskever leads OpenAI, a billion-dollar nonprofit organization funded by Elon Musk and others. OpenAI’s mission is to prepare humanity for the unavoidable proliferation of AI. To that end, Ilya wants to open up AI technology to everyone. To have a say in the AI field, OpenAI stays at the cutting edge of AI research and has made big progress in robotics and deep reinforcement learning.

The Road to AlexNet

Ilya Sutskever arrived at this role after a long journey. Sixteen years ago, when he entered the University of Toronto as an undergraduate, he knew little about AI, though he was fascinated by computers. At U of Toronto he met Geoffrey Hinton, a professor and pioneer of deep learning research. Hinton gave Sutskever a research project: improve the Stochastic Neighbor Embedding algorithm. Sutskever did it well, and that project started their collaboration. Sutskever joined Hinton’s group when he entered the Ph.D. program.

“Thanks to working with Geoff, I had the opportunity to work on some of the most important scientific problems of our time and pursue ideas that were both highly unappreciated by most scientists, yet turned out to be utterly correct,” says Sutskever in an interview with U of Toronto.

In 2012, under Hinton’s guidance, Sutskever and fellow Ph.D. student Alex Krizhevsky developed AlexNet, which won the 2012 ImageNet competition by a large margin. AlexNet was a novel neural network architecture with five convolutional layers and three fully connected layers. The AlexNet paper is widely regarded as pioneering work because it showed for the first time how deep neural networks trained on GPUs could take image recognition to the next level.
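To make the "five convolutional layers and three fully connected layers" concrete, here is a minimal sketch that traces how an image shrinks as it passes through such a stack. The kernel sizes, strides, channel counts, pooling layers and the 227-pixel input are assumptions taken from commonly cited descriptions of AlexNet, not from this article, and the helper names are hypothetical.

```python
# Assumed layer settings (not stated in the article): commonly cited AlexNet
# configuration with a 227x227 RGB input and three max-pooling layers.

def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * pad - kernel) // stride + 1

# (kind, out_channels, kernel, stride, pad)
LAYERS = [
    ("conv", 96, 11, 4, 0),
    ("pool", None, 3, 2, 0),
    ("conv", 256, 5, 1, 2),
    ("pool", None, 3, 2, 0),
    ("conv", 384, 3, 1, 1),
    ("conv", 384, 3, 1, 1),
    ("conv", 256, 3, 1, 1),
    ("pool", None, 3, 2, 0),
]
FC_SIZES = [4096, 4096, 1000]  # three fully connected layers

def trace_shapes(input_size=227, channels=3):
    """Return (kind, channels, height, width) after every layer."""
    shapes = []
    size = input_size
    for kind, out_ch, kernel, stride, pad in LAYERS:
        size = conv_out(size, kernel, stride, pad)
        if kind == "conv":  # pooling keeps the channel count unchanged
            channels = out_ch
        shapes.append((kind, channels, size, size))
    return shapes

shapes = trace_shapes()
# Number of values fed into the first fully connected layer.
flat = shapes[-1][1] * shapes[-1][2] * shapes[-1][3]

for kind, ch, h, w in shapes:
    print(f"{kind}: {ch} x {h} x {w}")
print("flattened:", flat, "-> fully connected:", FC_SIZES)
```

Under these assumptions, most of the spatial reduction happens in the first convolution and the pooling layers, while the later convolutions mainly add channels.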

More importantly, AlexNet convinced Sutskever that deep learning could solve any pattern recognition problem, as long as a deep neural network is trained on a significant amount of data.

Ilya Sutskever (left), Alex Krizhevsky (centre), Geoffrey Hinton (right)

After graduating in 2012, Sutskever spent two months as a postdoc with Andrew Ng at Stanford University. He then returned to U of T and joined Hinton’s new research company DNNResearch, a spinoff of Hinton’s research group. Four months later, in March 2013, Google acquired DNNResearch and hired Sutskever as a research scientist at Google Brain.

Sequence to Sequence Learning

AlexNet marked the beginning of the AI revolution in 2012. However, many areas remained untouched by deep learning algorithms, natural language processing among them. The type of neural network used by AlexNet, the Convolutional Neural Network (CNN), does not work well with sequential data like text.

After joining Google, Sutskever threw himself into sequence modeling problems, which can be applied to speech, text, and videos. One application is machine translation, which could really benefit from good sequential modeling.

In 2014, Sutskever proposed Sequence to Sequence Learning together with fellow Google researchers Oriol Vinyals and Quoc Le. It captures the sequential structure of the input (such as a sentence in English) and maps it to an output that also has sequential structure (such as a sentence in French).

This method uses Recurrent Neural Networks (RNNs) and began an era of wide application of RNNs to language tasks. Their work was applied to machine translation and outperformed a phrase-based statistical machine translation baseline on very large datasets.

Seq2seq learning requires fewer engineering design choices and allows Google’s translation system to work efficiently and accurately on extensive datasets. It is mainly used for machine translation and has proved applicable to a broader range of tasks, including text summarization, conversational AI, and question answering.
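The encoder-decoder idea behind sequence to sequence learning can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a plain tanh RNN rather than whatever recurrent unit the original work used, a made-up four-word vocabulary, and random untrained weights, so the "translation" it produces is meaningless. It shows the data flow only: one RNN reads the input into a fixed-size vector, and a second RNN generates output tokens from that vector.

```python
import math
import random

random.seed(0)
HIDDEN = 4
# Hypothetical toy vocabularies; index 0 doubles as start and end marker.
SRC_VOCAB = ["<eos>", "the", "cat", "sat"]
TGT_VOCAB = ["<eos>", "le", "chat", "assis"]

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def rnn_step(w_in, w_hid, x, h):
    """One vanilla RNN step: h' = tanh(W_in x + W_hid h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_hid, h))]

# Untrained random weights: this sketch shows structure, not a working translator.
enc_in, enc_hid = rand_matrix(HIDDEN, len(SRC_VOCAB)), rand_matrix(HIDDEN, HIDDEN)
dec_in, dec_hid = rand_matrix(HIDDEN, len(TGT_VOCAB)), rand_matrix(HIDDEN, HIDDEN)
dec_out = rand_matrix(len(TGT_VOCAB), HIDDEN)

def encode(tokens):
    """Read the whole input sequence into one fixed-size hidden vector."""
    h = [0.0] * HIDDEN
    for tok in tokens:
        x = one_hot(SRC_VOCAB.index(tok), len(SRC_VOCAB))
        h = rnn_step(enc_in, enc_hid, x, h)
    return h

def decode(h, max_len=5):
    """Generate output tokens greedily until <eos> or max_len is reached."""
    out, prev = [], 0  # start from the <eos> index
    for _ in range(max_len):
        h = rnn_step(dec_in, dec_hid, one_hot(prev, len(TGT_VOCAB)), h)
        scores = matvec(dec_out, h)
        prev = scores.index(max(scores))  # greedy choice of the next token
        if prev == 0:
            break
        out.append(TGT_VOCAB[prev])
    return out

context = encode(["the", "cat", "sat"])
translation = decode(context)
print(context, translation)
```

In a real system the weights would be learned from paired sentences, and the input and output lengths are free to differ, which is what makes the approach a natural fit for translation.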

In 2015, MIT Technology Review named Sutskever one of its “Innovators Under 35” in the Visionaries category.

TensorFlow

On the Google Brain team, Sutskever joined the development of TensorFlow, Google’s open-source library for large-scale machine learning.

With its many convenience functions and utilities, TensorFlow is now the world’s most prominent machine learning system among researchers and application developers. It uses dataflow graphs to represent computation, maps the nodes of a dataflow graph across many machines in a cluster, and runs on a wide range of computational devices, including CPUs, GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs).
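The dataflow-graph idea can be illustrated without TensorFlow itself. The sketch below is not TensorFlow’s API: `Node`, `evaluate`, `placement` and the device labels are hypothetical names invented for this example. It represents a computation as nodes connected by input edges, tags each node with a device label, and evaluates the graph by computing each node’s inputs first.

```python
class Node:
    """One vertex of a dataflow graph: an operation plus its input edges."""

    def __init__(self, name, op=None, inputs=(), device="cpu:0"):
        self.name = name
        self.op = op              # None marks a placeholder fed by the caller
        self.inputs = tuple(inputs)
        self.device = device      # a label only; nothing is actually dispatched

def evaluate(node, feed, cache=None):
    """Evaluate a node by evaluating its inputs first, reusing cached results."""
    cache = {} if cache is None else cache
    if node.name not in cache:
        if node.op is None:
            cache[node.name] = feed[node.name]
        else:
            args = [evaluate(i, feed, cache) for i in node.inputs]
            cache[node.name] = node.op(*args)
    return cache[node.name]

def placement(root):
    """Group the names of all nodes reachable from root by device label."""
    groups, stack, seen = {}, [root], set()
    while stack:
        node = stack.pop()
        if node.name in seen:
            continue
        seen.add(node.name)
        groups.setdefault(node.device, []).append(node.name)
        stack.extend(node.inputs)
    return groups

# Build the graph for relu(x * w + b).
x, w, b = Node("x"), Node("w"), Node("b")
mul = Node("mul", lambda a, c: a * c, [x, w], device="gpu:0")
add = Node("add", lambda a, c: a + c, [mul, b], device="gpu:0")
relu = Node("relu", lambda a: max(0.0, a), [add], device="cpu:0")

result = evaluate(relu, {"x": 2.0, "w": 3.0, "b": -1.0})
print(result)            # 2.0 * 3.0 - 1.0 = 5.0
print(placement(relu))
```

Because the whole computation is known as a graph before it runs, a system built this way can decide where each node should execute, which is the property the paragraph above describes.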

AlphaGo

At Google Brain, Sutskever was in contact with researchers at DeepMind who worked on AlphaGo, the epoch-making Go program that combined deep neural networks with Monte Carlo Tree Search and improved through self-play using a reinforcement learning algorithm.

AlphaGo made history by defeating the human champion Lee Sedol in a best-of-five match in March 2016. For the first time, a machine surpassed top human capabilities in the most complicated strategy board game. The AlphaGo paper, with Sutskever as a co-author, was published in Nature in 2016.

Founding OpenAI

Sutskever constantly thinks about the long-term future: what will happen to humanity when AI is everywhere? Being at the forefront of AI development, he sees how fast AI capabilities have increased. In the not-so-distant future, AI will replace every job we have. AI will be able to make big decisions for us as it gradually takes over all responsibilities. What will happen to humans?

In July 2015, Sutskever attended a dinner organized by Sam Altman, President of Y Combinator, at a restaurant on Sand Hill Road, where he met Elon Musk and Greg Brockman. Brockman later wrote in a blog post that “Ilya was a source of grounding: he was a clear technical expert with a breadth of knowledge and vision, and could always dive into the specifics of the limitations and capabilities of current systems.”

Everyone present agreed on one thing: an organization built to tackle these questions needed to be a non-profit, without any competing incentives to dilute its mission; it would also require the best AI researchers in the world.

Ilya agonized over the choice to leave Google. He tremendously enjoyed working at Google Brain. But he wanted to do more. In December 2015, he took the leap.

With $1 billion in funding from Elon Musk, Sam Altman, and LinkedIn founder Reid Hoffman, Sutskever and Greg Brockman (now OpenAI CTO) co-founded OpenAI, whose goal is “to advance digital intelligence in the way that is most likely to benefit humanity as a whole.”

“It seems likely there will come a day, quite possibly in our lifetimes, when we will build an AI system that is as cognitively capable as a human being in every meaningful dimension,” says Sutskever.

Leading OpenAI

Under the humble title “Research Director”, Sutskever leads the research and operation of OpenAI. The organization has brought several world-renowned AI researchers on board, including Ian Goodfellow, who invented GANs, Pieter Abbeel from UC Berkeley, and Andrej Karpathy, who now leads the AI effort at Tesla.

Located in an unremarkable office in San Francisco, OpenAI has accomplished some amazing feats in the last two years. They created a software platform called Universe for measuring and training AI systems across the world’s supply of games, intended to let an agent learn general strategies. They created AI players that play the complex game Dota 2 better than 99.95 percent of gamers worldwide. They also built agents that sumo wrestle and play soccer.

OpenAI leads research in robotics. They apply deep reinforcement learning to robots, with household chores such as cleaning a room or cooking a meal in mind. Recently, they trained their Dactyl robot hand to spin an alphabet block and bring a new face to the top.

OpenAI also leads research in AI safety. Their concern is making AI safe for people. Two years ago, OpenAI listed many research problems around ensuring that modern machine learning systems operate as intended.

Sutskever has been at the forefront of the AI revolution for the last six years. His next journey is spreading that revolution to benefit all of humanity, while his team pushes the envelope of AI toward the ultimate peak of general intelligence. What’s his plan? How far can we reach? We are waiting to hear from him about the next chapter.

Ilya Sutskever will speak at the AI Frontiers Conference on Nov 9, 2018 in San Jose, California.

The AI Frontiers Conference brings together AI thought leaders to showcase cutting-edge research and products. This year, our speakers include: Ilya Sutskever (Founder of OpenAI), Jay Yagnik (VP of Google AI), Kai-Fu Lee (CEO of Sinovation), Mario Munich (SVP of iRobot), Quoc Le (Google Brain), Pieter Abbeel (Professor at UC Berkeley) and more.

Buy tickets at aifrontiers.com. For questions and media inquiries, please contact: info@aifrontiers.com