Let’s talk about Artificial Intelligence

The term artificial intelligence (AI) was coined by John McCarthy for a summer workshop held at Dartmouth College in 1956, which many of the world’s leading computing researchers attended [1]. The main purpose of the workshop was to re-create the mechanisms of the human brain in machines, with the aim of emulating human intelligence. AI researchers soon realized that this is not a trivial task, although the ideas presented at the time proved very important for the later rule-based expert systems. Simulating human intelligence in computer systems is very difficult, firstly because we do not have a strong grasp of how the brain works in its entirety. Furthermore, human intelligence is not just about the brain: other factors, such as education, memory, motivation and emotions, are essential parts of it. Today, the brain is regarded as the main inspiration for the AI field, just as birds were the inspiration for the first airplanes.

Currently, AI systems are based on mathematical functions that exhibit intelligence in a given domain. For example, they can easily perform abstract and formal tasks such as playing chess (or, more recently, Go) or predicting the weather. In contrast, the real challenges for AI systems lie in everyday tasks where knowledge is subjective and intuitive, and therefore difficult to describe formally [2]. Machine Learning is a powerful approach that allows a computer to make decisions that seem subjective. It usually consists of two steps: feature extraction and a learning algorithm. Suppose you would like to write a computer program that recognizes different types of vehicles in an image. Given an image, the program has to decide whether it shows a bicycle, a scooter, a bus, a car, etc. We humans know that a certain number of features allow us to make this distinction, such as texture, the ratio between height and width, the dominant color, the shape, the number of wheels, etc. For the computer to identify these features, we first need to write a program that extracts the features most important for recognizing a vehicle. This first step is known as feature extraction. These features are then used as input to a learning algorithm (e.g. Support Vector Machines, Naive Bayes, Logistic Regression, K-Means, etc.), which in turn builds a model that automatically recognizes the different types of vehicles. These steps are illustrated in Figure 1 (top).

Figure 1: Machine Learning versus Deep Learning
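The two steps of the classic pipeline can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: each vehicle is a hypothetical dictionary of measurements (height, width, wheels) standing in for values that a real program would have to measure from the image, and a simple nearest-centroid rule stands in for the learning algorithms named above. All function names and numbers are invented for the example.

```python
from collections import defaultdict
from math import dist

def extract_features(vehicle):
    """Step 1, feature extraction: features hand-picked by a human.
    Here: height/width ratio and number of wheels (assumed measurements)."""
    return (vehicle["height"] / vehicle["width"], float(vehicle["wheels"]))

def train(examples):
    """Step 2, learning algorithm: store the mean feature vector per label.
    Nearest-centroid is a stand-in for SVM, Naive Bayes, etc."""
    grouped = defaultdict(list)
    for vehicle, label in examples:
        grouped[label].append(extract_features(vehicle))
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(model, vehicle):
    """Return the label whose centroid is closest to the vehicle's features."""
    features = extract_features(vehicle)
    return min(model, key=lambda label: dist(model[label], features))

# Toy, made-up training data (sizes in meters).
TRAINING_EXAMPLES = [
    ({"height": 1.0, "width": 1.8, "wheels": 2}, "bicycle"),
    ({"height": 1.1, "width": 1.7, "wheels": 2}, "bicycle"),
    ({"height": 1.5, "width": 4.5, "wheels": 4}, "car"),
    ({"height": 1.4, "width": 4.2, "wheels": 4}, "car"),
    ({"height": 3.0, "width": 12.0, "wheels": 6}, "bus"),
    ({"height": 3.2, "width": 11.0, "wheels": 6}, "bus"),
]

model = train(TRAINING_EXAMPLES)
print(predict(model, {"height": 1.45, "width": 4.3, "wheels": 4}))
```

Notice that all the "intelligence" sits in `extract_features`: if the chosen features do not separate the classes, no learning algorithm placed after it can recover.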

Machine learning works well as long as the features given to the learning algorithm are significant and discriminative. The recognition performance therefore depends heavily on which features are extracted and how. To extract a set of relevant features from an image, it is extremely important to understand the problem we are tackling. For instance, to identify vehicles in an image, we need to know which essential features help differentiate one vehicle from another. In machine learning, it is the developer of the feature extraction algorithm who does most of the intelligent work, and this requires a great deal of human time and effort. Feature extraction has been studied for decades by an entire community, and it was even the subject of my PhD thesis [3]. However, there is a technique called Deep Learning that can help. It is a Machine Learning approach that automates the feature extraction step, discovering the relevant features by itself, without human assistance. Generally speaking, in deep learning we have a neural network composed of many neurons; we give it the raw image, and it learns the most relevant features and then recognizes what the image represents. Figure 1 (bottom) shows the deep learning pipeline.
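To make the contrast concrete, here is a minimal sketch of the forward pass of a tiny neural network in plain Python. It is an illustration under assumptions, not a working recognizer: the weights are random and untrained, the 4x4 "image" is random noise standing in for raw pixel intensities, and the layer sizes and class names are chosen arbitrarily. Training (adjusting the weights from labeled examples) is deliberately left out.

```python
import math
import random

CLASSES = ["bicycle", "car", "bus"]  # assumed labels for the example

def init_layer(n_inputs, n_outputs, rng):
    """One weight row per output neuron, small random values (untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
            for _ in range(n_outputs)]

def dense(weights, inputs):
    """Each neuron computes a weighted sum of all its inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def relu(values):
    """Simple non-linearity: negative activations are set to zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(pixels, hidden_weights, output_weights):
    # The hidden layer plays the role of the feature extractor; after
    # training, its weights would encode features found by the network.
    hidden = relu(dense(hidden_weights, pixels))
    return softmax(dense(output_weights, hidden))

rng = random.Random(0)
hidden_weights = init_layer(16, 8, rng)             # 16 pixels -> 8 neurons
output_weights = init_layer(8, len(CLASSES), rng)   # 8 neurons -> 3 classes
raw_image = [rng.random() for _ in range(16)]       # stand-in for raw pixels
probabilities = forward(raw_image, hidden_weights, output_weights)
print(dict(zip(CLASSES, probabilities)))
```

There is no hand-written feature extraction function here: the raw pixels go straight into the network, and the hidden layer is where learned features would live once the weights are trained.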

Deep learning was relatively unpopular for many years, and its name changed several times. In [2], the authors report that deep learning went through three important waves over time. It was first known as cybernetics in the 1940s-1960s. It was then called connectionism and neural networks in the 1980s-1990s, and from 2006 until today it has been known as deep learning. Figure 2 plots the historical waves of artificial neural network research, measured by the frequency of these phrases in Google Books [2].

Figure 2: The frequency of the phrases “cybernetics” and “neural network,” according to Google Books (the deep learning wave is too recent to appear). Reprinted from [2]

AI researchers have believed since the 1980s that deep networks work well, but they could not demonstrate it at the time, because deep networks were too computationally costly to allow many experiments with the hardware then available. Nevertheless, researchers such as Yann LeCun, Yoshua Bengio and Geoffrey Hinton continued to work in the field, and they received the 2018 ACM A.M. Turing Award [4] for their valuable contributions to AI. The rise of deep learning started in 2012, when a convolutional network won the ImageNet Large-Scale Visual Recognition Competition for the first time and by a wide margin, bringing the state-of-the-art error down from 26.1% to 15.3%. Many people may ask: why is an approach that was neglected for many years now one of the most talked-about AI approaches in both the scientific and the business world (Microsoft, Nvidia, Google, Facebook, etc.)? According to [2], this is due to the following factors:

1. The increase in the amount of training data (thanks to the Big Data era).
2. The availability of powerful computational resources (faster CPUs, GPUs) that accelerate deep learning calculations.
3. The accuracy of deep learning for complex applications has increased over time.

In the illustrations below, Maria talks with Ben about how to write her first program to recognize different types of vehicles. I hope you enjoy it! 😊


The biggest problem of deep learning today is that it needs labels in the training phase (supervised learning), while in real life labels are not always available. However, great progress has been made in this area, for example with Generative Adversarial Networks (GANs). A GAN is a deep learning architecture, usable for unsupervised and semi-supervised learning, that is composed of two parts: the generator and the discriminator. The generator takes random noise and tries to generate something similar to the real data. The discriminator takes both the fake data produced by the generator and real data, and learns to distinguish between them.
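The tug-of-war between the two parts can be sketched through their loss functions. The snippet below is a minimal illustration, assuming the discriminator outputs a score between 0 and 1 (the probability that a sample is real) and using the commonly used binary cross-entropy form of the losses. The scores are made-up numbers, not the output of a trained network.

```python
from math import log

def discriminator_loss(real_scores, fake_scores):
    """Low when real samples score near 1 and fake samples score near 0."""
    real_term = sum(-log(s) for s in real_scores) / len(real_scores)
    fake_term = sum(-log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_loss(fake_scores):
    """Low when the discriminator is fooled, i.e. fakes score near 1."""
    return sum(-log(s) for s in fake_scores) / len(fake_scores)

# Hypothetical discriminator scores, all strictly between 0 and 1.
early_real, early_fake = [0.9, 0.8], [0.1, 0.2]   # fakes are easy to spot
later_real, later_fake = [0.6, 0.5], [0.5, 0.4]   # fakes have improved

print(discriminator_loss(early_real, early_fake), generator_loss(early_fake))
print(discriminator_loss(later_real, later_fake), generator_loss(later_fake))
```

As the fakes improve, the generator's loss goes down while the discriminator's goes up: each network is trained to reduce its own loss, which pushes the other to get better.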

Deep learning has revolutionized AI, and it remains one of the best options, mainly for unstructured data such as images, video, audio and text. Deep learning systems are powerful for specific tasks, but they are far from equalling the diverse cognitive abilities of a young child. However, AI is a relatively new field that has not yet been sufficiently explored. Can you imagine the next AI revolution? I’m very excited about the future of AI 😉.


[1] McCarthy, J., Minsky, M., Rochester, N., Shannon, C. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence, August 1955.

[2] Goodfellow, I., Bengio, Y., Courville, A. Deep learning. Genetic Programming & Evolvable Machines, 19(1–2):1–3, 2017.

[3] Silva, C. Extraction and Selection for Background Modeling and Foreground Detection, Ph.D. thesis, University of La Rochelle, France, May 2017.

[4] Available at https://en.wikipedia.org/wiki/Turing_Award. Accessed May 13, 2019.