Recent Developments in Artificial Intelligence

A few days ago, the dean of our faculty asked me to present what is behind the recent success of artificial intelligence, after AlphaGo defeated the legendary player Lee Sedol in the ancient game of Go. I do not know anything about playing Go, so I decided to focus on artificial intelligence in general and talk about recent advances, background, the state of the art and applications. It looks like many people understand the importance of the topic and want to come, but the capacity of the lecture room is limited, so I decided to share the storyline of my talk here on Medium.

This is the outline of the talk. The talk is not about philosophy, superhuman intelligence or the AI singularity, regardless of the importance of those topics.

For those who are interested in this topic, I recommend the book by Nicholas Joll.

Before I start, I have to explain some basic machine learning concepts for those who are not familiar with the subject. Supervised learning is the most popular; you can learn how machines learn from data, for example, here. An example of unsupervised learning is clustering (used when a target variable is not available). And reinforcement learning comes into play when an immediate evaluation of the machine's action is not available and the reward can come later, after a series of actions.
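To make the first two concepts concrete, here is a minimal sketch (assuming scikit-learn and its toy iris dataset) that trains a supervised classifier with labels and then runs unsupervised clustering on the same data without ever looking at the labels:

```python
# A minimal sketch contrasting supervised and unsupervised learning,
# using scikit-learn (assumed installed) on a toy dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Supervised learning: a target variable y is available.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print("supervised accuracy:", clf.score(X_test, y_test))

# Unsupervised learning: clustering ignores y entirely.
kmeans = KMeans(n_clusters=3, random_state=0, n_init=10)
labels = kmeans.fit_predict(X)
print("cluster sizes:", [int((labels == k).sum()) for k in range(3)])
```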

Artificial neural nets are an increasingly popular machine learning model. This short intro is all you need to understand how to train them by backpropagation of error. Recurrent neural networks are harder to train but can remember sequences internally. With increased computing power and improved training, we can now produce deep neural networks that outperform us in a number of tasks. Ensembling and meta-learning can further improve the accuracy of machine learning models.
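To show what backpropagation actually computes, here is a from-scratch sketch (NumPy only; the network size and learning rate are arbitrary choices of mine) that trains a tiny two-layer net on the XOR problem:

```python
# A minimal sketch of backpropagation: a two-layer network learning XOR,
# written from scratch in NumPy to expose the gradient computations.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error derivative layer by layer.
    d_out = (out - y) * out * (1 - out)   # dLoss/d(pre-activation of layer 2)
    d_h = (d_out @ W2.T) * h * (1 - h)    # dLoss/d(pre-activation of layer 1)
    W2 -= 0.5 * h.T @ d_out;  b2 -= 0.5 * d_out.sum(0)
    W1 -= 0.5 * X.T @ d_h;    b1 -= 0.5 * d_h.sum(0)

print(out.round(2).ravel())  # should approach [0, 1, 1, 0]
```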

There are several promising directions in machine learning and artificial intelligence that are already showing their potential. The first important area of research is

Deep Reinforcement Learning

We have been able to observe superhuman performance in game playing for a long time already. Let's look at chess: the top chess engines are rated over 3300 points, while the top human is barely above 2800. That is a gap of more than 500 rating points at the extreme top of the scale.

However, most of these artificial intelligence engines were handcrafted for one task and not universal at all. Recently, deep reinforcement learning (DRL) showed that it can match humans in feature engineering and model construction as well. DRL models are able to learn to play games from visual input and rewards alone, without any task-specific adjustment of the network topology.

As you can see, or read in “Human-level control through deep reinforcement learning”, superhuman performance was achieved in 25 of the games.

How does it work?

The machine has to act in a state s based on its inputs, rewards and historical data.

A Q-function, represented by a deep network, can learn how valuable individual actions are in a given state. A very nice tutorial on DRL is here.
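The core of it is the Q-learning update. Here is a toy sketch where, instead of a deep network, the Q-function is a plain table and the environment is a made-up one-dimensional corridor (both simplifications are mine); DQN replaces the table with a deep net trained on the same update:

```python
# Tabular Q-learning on a tiny corridor: move right to reach the goal.
import numpy as np

n_states, n_actions = 6, 2        # actions: 0 = left, 1 = right
GOAL = n_states - 1               # reward 1 for reaching the right end
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != GOAL:
        # Epsilon-greedy action selection.
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s_next = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s_next == GOAL else 0.0
        # Bellman update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))  # learned policy: should always move right
```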

Watch how it works in a simulated environment.

AlphaGo's DRL is not so universal, but it still learns from self-play and has many general features. The potential of general AI is very high, and investors can smell it.

Some of our students collaborated with GoodAI last summer; have a look at their models. In order to push things forward, one has to work on improving the components of the system. Luckily, many companies, including Google, have open-sourced their AI frameworks.

One of the most important elements of the system is the memory (not a simple hard drive, but a more complex memory, e.g. one resembling the human brain).

Better memory to learn complex patterns

Memory in machine learning systems is often represented by a recurrent neural network. A simple RNN is able to learn simple tasks, such as driving simple robots, but it does not work well with long-term patterns and fails on more complex tasks.

The handwriting demo is one example where a simple RNN fails but RNNs with longer-term memory, such as LSTMs, succeed. How to learn long-term dependencies efficiently is still an open research problem.
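To make the long-term dependency problem concrete, here is a sketch (assuming PyTorch; the task and sizes are my own toy setup) where the network must recall a bit it saw 50 steps earlier; swapping nn.LSTM for nn.RNN shows how the simple recurrence struggles as the lag grows:

```python
# Long-lag recall: the network sees a signal at step 0 and must
# output it after T noisy steps.
import torch
import torch.nn as nn

T = 50  # time lag between the signal and when it must be recalled

class Recaller(nn.Module):
    def __init__(self, cell=nn.LSTM):
        super().__init__()
        self.rnn = cell(input_size=1, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, 1)
    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out[:, -1])  # predict from the final hidden state

def batch(n=64):
    x = torch.randn(n, T, 1) * 0.1          # noise
    target = torch.randint(0, 2, (n, 1)).float()
    x[:, 0, 0] = target.squeeze() * 2 - 1   # signal appears only at step 0
    return x, target

model = Recaller(nn.LSTM)                   # try Recaller(nn.RNN) as well
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for step in range(2000):
    x, tgt = batch()
    loss = loss_fn(model(x), tgt)
    opt.zero_grad(); loss.backward(); opt.step()
print("final loss:", loss.item())
```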

An alternative approach was proposed by our colleague Jan, who has since moved to Nnaisense. Another direction is to add external memory, such as a stack or a list, that the neurons can use.

And multiple stacks are even better.

We are very interested in the Neural Turing Machine (NTM), which is even more general: a neural computer trained by machine learning.

But it is very difficult to train.

With our implementation, we managed to reproduce experiments showing that the NTM can generalize on many simple tasks, such as copying sequences.
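For the curious, here is a sketch of the content-based addressing at the heart of the NTM, which is what makes its memory trainable end to end (NumPy only; the sizes and query are made up, and a full NTM adds write heads, location-based shifts and a learned controller):

```python
# NTM-style content addressing: the controller emits a key, attention
# weights are a softmax over cosine similarities to memory rows, and
# the read is a differentiable weighted sum over all slots.
import numpy as np

def cosine(a, B):
    return (B @ a) / (np.linalg.norm(B, axis=1) * np.linalg.norm(a) + 1e-8)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

N, M = 8, 4                      # 8 memory slots, 4 numbers each
memory = np.random.default_rng(0).normal(size=(N, M))

key = memory[3] + 0.05           # a slightly noisy query for slot 3
beta = 10.0                      # key strength: sharpens the focus
w = softmax(beta * cosine(key, memory))   # attention over slots
read = w @ memory                # "fuzzy" read, differentiable in key and memory

print("focus on slot:", w.argmax(), "weights:", w.round(2))
```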

And it is universal, so it can also be used to learn patterns of human behaviour. There are several directions in which NTM training can be improved; if you are interested, write to me and I will send you more details.

Another important application of machine learning systems with memory is language translation.

See this great comprehensive tutorial or this or this paper.
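The common backbone of these translation systems is the encoder-decoder architecture: one recurrent net compresses the source sentence into a vector, and a second one unrolls the target sentence from it. A minimal sketch, assuming PyTorch and with made-up vocabulary sizes:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, dim=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True)
        self.decoder = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, tgt_vocab)

    def forward(self, src, tgt):
        # The final encoder state is the "memory" of the source sentence.
        _, state = self.encoder(self.src_emb(src))
        # The decoder is conditioned on it while reading the target so far.
        dec, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec)  # logits over the target vocabulary

model = Seq2Seq(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sentences
tgt = torch.randint(0, 1000, (2, 5))
print(model(src, tgt).shape)           # torch.Size([2, 5, 1000])
```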

Another application with great industry potential is question answering.

You can join colleagues from our research group. There are many research directions.

Think twice before you release any learning system.

Probably the most significant area of current ML research is deep learning.

Deep learning

These are feed-forward, supervised artificial neural nets. As you can see, they are able to beat humans and other approaches in image classification.

With deep structures — a high number of layers.

Again, the challenge is to train them efficiently. In the future, you can expect more adaptive variants of deep learning algorithms.
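Conceptually, a deep image classifier is simple; here is a small convolutional sketch (assuming PyTorch, with arbitrary sizes of my choosing), and modern networks just stack many more such layers:

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                      # 32x32 -> 16x16
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                      # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(64 * 8 * 8, 10),            # 10 image classes
)
x = torch.randn(1, 3, 32, 32)             # one 32x32 RGB image
print(model(x).shape)                     # torch.Size([1, 10])
```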

Look at what can be done when you combine neural embeddings of text and images.

And use a lot of data to train the system.

The results are impressive and show a higher-level understanding of concepts.

And it can be used as a generative model as well.

Interesting results:

You can also entangle neural nets (gate a convolutional neural net with an LSTM)

and get impressive results.
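Such a combination typically looks like the following sketch (PyTorch assumed; all sizes and the vocabulary are made up), as in image captioning: the convolutional net encodes the picture, and its code conditions the LSTM that emits the description word by word:

```python
import torch
import torch.nn as nn

class Captioner(nn.Module):
    def __init__(self, vocab=1000, dim=256):
        super().__init__()
        self.cnn = nn.Sequential(                      # tiny image encoder
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, dim),
        )
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, image, words):
        h0 = self.cnn(image).unsqueeze(0)              # image code as h0
        c0 = torch.zeros_like(h0)
        dec, _ = self.lstm(self.emb(words), (h0, c0))
        return self.out(dec)                           # next-word logits

model = Captioner()
logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, 1000, (2, 6)))
print(logits.shape)  # torch.Size([2, 6, 1000])
```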

Conv nets can also be used to predict the memorability of pictures. See the interesting demo where you can sort images according to this criterion.

There are applications in art:

Speech recognition:

Or autonomous driving:

Deep networks have a long history and there are many approaches to building them. Some years ago, we also proposed a variant that allows building hybrid networks, which show potential in predictive modeling.

Ensemble learning

The data mining challenges of recent years have been dominated by ensemble solutions, not only in predictive modelling tasks, but also in recommendation, learning to rank and other machine learning tasks.
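As a small taste of why ensembles win, here is a sketch (assuming scikit-learn) that averages the predicted probabilities of three diverse models; the blend is usually more accurate than any single member:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (VotingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=0)
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",  # average the predicted probabilities
)
print("ensemble CV accuracy:", cross_val_score(ensemble, X, y).mean())
```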

Currently, we are working on new scalable ensembling strategies and implementing them in Java on top of h2o.ai, complementary to its other ensembling approaches. Again, contact me if you are interested.

Better optimization algorithms

At the core of machine learning research is the optimization of models, both in terms of structure and parameters.

Better optimization techniques allow machines to solve more complex tasks and make training faster and more robust. Look at what you can do with advanced evolutionary optimization methods such as CMA-ES. Help us move beyond simple algorithms such as SGD and extend our JCOOL optimization library.
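To illustrate the evolutionary idea without reproducing the full CMA-ES machinery, here is a toy (mu, lambda) evolution strategy on the Rosenbrock function; CMA-ES additionally adapts a full covariance matrix and the step size of the sampling distribution, which is what makes it so effective on hard, non-separable problems:

```python
import numpy as np

def rosenbrock(x):
    return sum(100 * (x[1:] - x[:-1] ** 2) ** 2 + (1 - x[:-1]) ** 2)

rng = np.random.default_rng(0)
mean, sigma = np.zeros(5), 0.5        # search distribution
mu, lam = 10, 40                      # parents, offspring per generation

for gen in range(300):
    pop = mean + sigma * rng.normal(size=(lam, 5))        # sample offspring
    fitness = np.array([rosenbrock(p) for p in pop])
    elite = pop[np.argsort(fitness)[:mu]]                 # select the best
    mean = elite.mean(axis=0)         # move the mean toward the best samples
    sigma *= 0.99                     # crude step-size decay (CMA-ES adapts this)

print("best f:", rosenbrock(mean).round(4), "at", mean.round(2))
```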

Meta-learning and live data

Last but not least, an important area of machine learning research is the automation of the modeling process. We proposed meta-learning templates to generate machine learning algorithms tailored to a given problem. This approach is applicable not only to predictive modelling, but also to recommender systems, data clustering, feature engineering, etc.
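Our templates are evolved and much richer than this, but the spirit of automating model construction can be sketched as a search over candidate pipelines, keeping whichever fits the problem best (scikit-learn assumed; the template set here is made up for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, random_state=0)
templates = {
    "scaled_svm": make_pipeline(StandardScaler(), SVC()),
    "forest": RandomForestClassifier(random_state=0),
    "scaled_knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}
# Evaluate every template by cross-validation and keep the best.
scores = {name: cross_val_score(m, X, y).mean() for name, m in templates.items()}
best = max(scores, key=scores.get)
print("selected template:", best, "CV accuracy:", round(scores[best], 3))
```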

Even more important is optimizing machine learning systems for real objectives. Thanks to the Recombee company, we are able to work with real user interaction data, and our experiments show that the performance metrics used in academia (e.g. generalization performance on offline data) often do not correspond to online metrics. Optimizing business objectives with machine learning and artificial intelligence is an even more difficult problem and an important research direction.

This is probably the main reason why many top AI researchers are moving from academia to companies. Universities have to work with companies so that researchers can access their big data and, which is harder but even more important, their live data. By online optimization of artificial intelligence systems that interact directly with users, one is able to create really interesting and useful things.

Fortunately, many companies are becoming very open: they publish their research results immediately and even open-source their machine learning infrastructures. Models pre-trained on large data sets are now freely available, as are the machines to run them, and you can use them in your research or business. You do not need a big team to come up with something revolutionary. Go and build something!

Conclusion

You can see that something big is happening in the field of artificial intelligence. The world as you know it is going to change rapidly in the next few years thanks to new developments in AI. This also gives you the opportunity to be part of the crowd that is transforming our world with smart algorithms.

What can you do if you would like to get involved? Learn the field: our faculty offers a bachelor's and a master's program in knowledge engineering and many related courses on artificial intelligence, machine learning and data mining. There are also several relevant MOOCs online. If you are a student of FIT CTU, there are several interesting machine learning projects with companies available in our portal. If you are a company with an interesting project, we can add you to the portal so you can work with our students (several international companies are already involved).

If you are based in Prague, come to the machine learning meetups or the conference. Almost every Friday, we organize meetings of the Machine Learning and Computational Intelligence research group; they are open, and we welcome new members and students who are willing to work with us.

Every summer, we organize a camp (not exclusively for CTU students), which you can join as a student or as a company. We provide guidance, technologies, infrastructure, a working environment and stipends. Our plan is to create something big and show the potential of AI on real tasks.