Hacking Artificial Minds

Dario Incalza
3 min read · Feb 15, 2019

Artificial neural networks are being used more and more in mission-critical and user-facing applications. We use artificial neural networks in self-driving cars, we talk to them, and social networks use deep learning to understand what you're interested in or who is present in a picture.

It's clear that neural networks augment our daily lives and tasks at an ever-increasing rate, but what if these artificial minds can be hacked? What if, just like a human, we could fool these neural nets into making a mistake? How would that be possible? What would the consequences be?

Adversarial Attacks

Adversarial attacks are carried out by presenting a neural network with inputs that an attacker has intentionally crafted to cause the network to make a mistake.

Consider the following research by Ian Goodfellow, a research scientist at Google Brain. In his paper, Goodfellow shows how he was able to generate adversarial examples that fool the GoogLeNet architecture trained on ImageNet.

Reference: Explaining and Harnessing Adversarial Examples, by Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy, 2014

The GoogLeNet network initially classifies a picture of a panda correctly as 'panda' with a confidence of 57.7%. However, when a small perturbation, or noise, is added to the original image, GoogLeNet is tricked into thinking it is looking at a picture of a 'gibbon' with 99.3% confidence. It is important to note that the added noise is not perceptible to the human eye: the image that is classified as a gibbon still looks, to a human, identical to the original image that was correctly classified as a panda. What is going on here?

The added noise is generated in such a way that it steers the decision-making inside GoogLeNet's network. The factor 0.007 by which the noise is scaled "corresponds to the magnitude of the smallest bit of an 8-bit image encoding after GoogLeNet's conversion to real numbers". In other words, the noise lives in the least significant bits of the image, which are invisible to the human eye; but once the pixels are converted to real numbers inside the network, those least significant bits are enough to push the classification across a decision boundary.
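To make this concrete: the perturbation in the referenced paper is produced by the Fast Gradient Sign Method (FGSM), which takes the sign of the gradient of the loss with respect to the input pixels and scales it by a small epsilon (0.007 in the panda example). Below is a minimal sketch of that idea; the use of PyTorch, the function name and the tensor shapes are my own assumptions for illustration, not code from the paper.

```python
# Minimal FGSM sketch (assumed PyTorch; names and shapes are illustrative).
import torch
import torch.nn.functional as F

def fgsm_example(model, image, label, epsilon=0.007):
    """Return x_adv = x + epsilon * sign(grad_x loss(model(x), y))."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)  # how wrong is the model on the true label?
    loss.backward()                              # gradient of the loss w.r.t. the input pixels
    # Keep only the sign of each pixel's gradient, scaled by a tiny epsilon:
    # imperceptible to a human, but it nudges every pixel in the worst direction for the model.
    adv = image + epsilon * image.grad.sign()
    return adv.clamp(0.0, 1.0).detach()          # stay inside the valid pixel range
```

Feeding a pretrained ImageNet classifier the output of a function like this is, in principle, how you reproduce the kind of panda-to-gibbon flip described above.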

What could go wrong?

Imagine a malware detection system that uses a neural network to decide whether a given software sample is malicious. We can imagine encoding a certain combination of bytes into a malware sample such that the network is tricked into classifying the sample as benign, effectively bypassing the anti-malware security controls.

What if you played an adversarial piece of audio next to your home automation system? Imagine your hacker friend Bob sends you a malicious audio file. When you play it, you hear something along the lines of "Hey Alice, how are you doing, everything fine?", but your home voice assistant hears "switch off the security cameras and open the front door". This is exactly what researchers at Ruhr-Universität Bochum, Germany, have shown to work.

A lot of modern cars now come with automatic traffic sign recognition, based on a camera incorporated in the windshield. Imagine an evil hacker applying stickers to traffic signs that trick the neural network into classifying a STOP sign as a 70 km/h sign. Researchers at Princeton University have shown this to work as well. This poses a whole new set of problems for manufacturers of self-driving cars.

Adversarial Security at Overture

Neural networks and deep learning are at the core of what we do at Overture. We take the security and privacy of our users very seriously. As such, adversarial security is an important consideration when we develop neural networks that are accessible on our platform. We have incorporated and tested several countermeasures to protect neural networks against adversarial attacks, which is not a trivial thing to do. In the near future we will extend this effort and roll out a set of features that put adversarial security controls in the hands of our users when they develop, train and deploy neural networks on our platform.

Feel free to reach out if you are interested in discussing adversarial security for neural networks; we are happy to help out!



Dario Incalza

Dario is a computer scientist, interested in anything related to cybersecurity, cryptography, privacy and artificial intelligence.