Today I’m going to talk about a small practical example of using neural networks — training one to play a snake game.

This article is for beginners, so if you are already good at machine learning you probably won’t find anything new here. It helps if you know something about machine learning, neural networks, and TensorFlow, but it’s fine if you don’t.

And finally, there are obviously better ways to write the logic for a snake game, but let’s pretend that it is a real task and we need to solve it this way.

Part 0: The game

First we need to write the game itself. It will have a 20x20 field, a snake of 3 pieces at the start, one randomly generated apple on the field at any moment, and an API to use with our network. You can find the code of the game here.

Now let’s start with a neural network.

Part 1: Survive

Features

To make the snake “smart” we need to give it some knowledge, which means creating features for it to learn from. Always try to choose the features that will be most useful. If you provide too few features, the network will not get enough information to perform well. On the other hand, if there are too many features, it will be hard for the network to decide which ones matter most, and training will take longer.

In the first step we will teach the snake how to survive and will not think about apples. To choose the right direction it should know whether there are any obstacles around it. Given these obstacles and a suggested direction, the network will decide whether the action is a good one.

So the input to our neural network will be an array of 4 numbers:

  • Is there an obstacle to the left of the snake (1 — yes, 0 — no)
  • Is there an obstacle in front of the snake (1 — yes, 0 — no)
  • Is there an obstacle to the right of the snake (1 — yes, 0 — no)
  • Suggested direction (-1 — left, 0 — forward, 1 — right)

As the output we want to receive a decision: 1 means we should go in the suggested direction, 0 means we should choose another one.
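The three obstacle flags can be derived from the head position and the current heading. The sketch below is only an illustration and not the API of the linked game: it assumes the snake is a list of (x, y) cells with the head first, that headings are unit vectors, and that y grows downward as on a screen. All function names are hypothetical.

```python
BOARD_SIZE = 20  # the field in this article is 20x20


def turn_left(direction):
    """Rotate a heading 90 degrees to the left (screen coordinates, y down)."""
    dx, dy = direction
    return (dy, -dx)


def turn_right(direction):
    """Rotate a heading 90 degrees to the right (screen coordinates, y down)."""
    dx, dy = direction
    return (-dy, dx)


def is_blocked(snake, direction, board_size=BOARD_SIZE):
    """True if moving the head one cell in `direction` hits a wall or the body.

    Simplification: every body cell counts as an obstacle, including the tail.
    """
    x = snake[0][0] + direction[0]
    y = snake[0][1] + direction[1]
    outside = not (0 <= x < board_size and 0 <= y < board_size)
    return outside or (x, y) in snake


def build_input(snake, direction, suggested):
    """Return [left, front, right, suggested], suggested being -1, 0 or 1."""
    return [
        int(is_blocked(snake, turn_left(direction))),
        int(is_blocked(snake, direction)),
        int(is_blocked(snake, turn_right(direction))),
        suggested,
    ]


# Head in the top-left corner, heading up: walls to the left and in front.
corner_snake = [(0, 0), (0, 1), (0, 2)]
print(build_input(corner_snake, (0, -1), 1))
```

In the corner example only the cell to the right is free, so the first two flags are 1 and the third is 0.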

Input data

The neural network needs some data to learn from. Training data is a very important part of machine learning. With a huge amount of data you can achieve great results even if the architecture of your network is not good. That’s why companies like Google try to collect all the information they can from their users (not because their architectures are bad, of course, but because really big data is precious).

So we need to generate some data. You can sit and play as many games as you can, but it is always better when you can generate data automatically (from scratch or by modifying data that you already have). In our case it is easy to create data by choosing a direction at random and observing whether the snake is still alive after the turn.

After 100 games I got 5504 training examples, which is enough to train the snake to survive.
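To make the data generation step concrete, here is a minimal sketch. It does not use the linked game. It assumes a stand-in simulation with a fixed-length snake, no apples, and a cap on the number of steps per game, so the number of examples it produces will differ from the figure above. All names are hypothetical.

```python
import random

BOARD_SIZE = 20
# Headings in clockwise order (screen coordinates, y grows downward).
HEADINGS = [(0, -1), (1, 0), (0, 1), (-1, 0)]  # up, right, down, left


def is_blocked(snake, heading):
    """True if the cell next to the head in `heading` is a wall or the body."""
    x, y = snake[0][0] + heading[0], snake[0][1] + heading[1]
    outside = not (0 <= x < BOARD_SIZE and 0 <= y < BOARD_SIZE)
    return outside or (x, y) in snake


def observe(snake, heading_idx):
    """Obstacle flags for the cells to the left, in front and to the right."""
    return [int(is_blocked(snake, HEADINGS[(heading_idx + turn) % 4]))
            for turn in (-1, 0, 1)]


def play_random_game(rng, max_steps=500):
    """Play one game with random turns and return (features, label) pairs.

    The label is 1 if the snake survived the turn and 0 if it died.
    """
    snake = [(10, 10), (10, 11), (10, 12)]  # head first, 3 pieces, heading up
    heading_idx = 0
    examples = []
    for _ in range(max_steps):
        action = rng.choice((-1, 0, 1))  # -1 left, 0 forward, 1 right
        features = observe(snake, heading_idx) + [action]
        heading_idx = (heading_idx + action) % 4
        heading = HEADINGS[heading_idx]
        alive = not is_blocked(snake, heading)
        examples.append((features, int(alive)))
        if not alive:
            break
        new_head = (snake[0][0] + heading[0], snake[0][1] + heading[1])
        snake = [new_head] + snake[:-1]  # move without growing
    return examples


def generate_training_data(n_games, seed=0):
    """Collect examples from `n_games` random games."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_games):
        data.extend(play_random_game(rng))
    return data


data = generate_training_data(100)
print(len(data), "training examples")
```

Seeding the random generator keeps the dataset reproducible, which makes it easier to compare architectures later on the same data.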

Architecture of neural network

Choosing the right architecture for your neural network is always hard. You can choose the number of layers, the number of neurons in each layer, and the types of neurons. It always depends on the task you are trying to solve. It’s better to try different variations and pick the one that works best.

Our task is very simple, so we will use only an input layer and an output layer. No hidden layers are needed.

In TensorFlow it will look like this (I’m using TFLearn):

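import tflearn
from tflearn.layers.core import input_data, fully_connected
from tflearn.layers.estimator import regression

network = input_data(shape=[None, 4, 1], name='input')
network = fully_connected(network, 1, activation='linear')  # single output neuron, no hidden layers
network = regression(network, optimizer='adam', learning_rate=1e-2, loss='mean_square', name='target')
model = tflearn.DNN(network)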

You can find the full code here

Results

Each turn we feed the network three inputs, one for each possible action, and choose the action with the highest output. After training, the snake chose the easiest way to survive.
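The selection step can be sketched as follows. Here `toy_predict` is a hypothetical stand-in for the trained network so that the example runs on its own. With the TFLearn model you would presumably reshape each candidate to the declared input shape and call `model.predict` instead.

```python
def choose_action(predict, obstacles):
    """Score every candidate action and return the one with the highest score.

    predict:   callable that takes [left, front, right, action] and returns a score
    obstacles: the [left, front, right] flags for the current position
    """
    candidates = [-1, 0, 1]  # left, forward, right
    scores = [predict(obstacles + [action]) for action in candidates]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]


def toy_predict(features):
    """Hypothetical model: scores 1.0 if the chosen cell is free, else 0.0."""
    left, front, right, action = features
    return 1.0 - (left, front, right)[action + 1]


print(choose_action(toy_predict, [1, 1, 0]))  # only the right cell is free
```

When several actions get the same score, this sketch keeps the first one in the candidate list. That is an arbitrary choice, and a random tie-break would work too.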

Part 2: Let’s feed the snake

New feature

Now that the snake knows how to survive, it’s time to think about apples. To teach the snake how to find apples we need to add a new feature. I chose the angle between the snake’s movement direction and the direction to the apple.

If the apple is to the left of the snake the number will be positive; if it’s to the right, negative. It is also always good to normalize your features. In this case we divide the angle by 180 degrees so the number ranges from -1 to 1.
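One way to compute such a signed, normalized angle is with `atan2` on the cross and dot products of the two vectors. This is a sketch under an assumption about the coordinate system, not necessarily how the linked code does it: x grows to the right and y grows upward, so a counterclockwise (left) turn comes out positive. With screen coordinates, where y grows downward, the sign would need to be flipped. The function name is hypothetical.

```python
import math


def normalized_angle(direction, head, apple):
    """Signed angle from the heading to the apple, divided by 180 degrees.

    Positive when the apple is to the left of the heading, negative when it
    is to the right (assuming y grows upward). The result lies in [-1, 1].
    """
    to_apple = (apple[0] - head[0], apple[1] - head[1])
    cross = direction[0] * to_apple[1] - direction[1] * to_apple[0]
    dot = direction[0] * to_apple[0] + direction[1] * to_apple[1]
    # atan2 returns radians in [-pi, pi]; dividing by pi equals dividing
    # the angle in degrees by 180.
    return math.atan2(cross, dot) / math.pi


heading_up = (0, 1)
print(normalized_angle(heading_up, (0, 0), (-5, 0)))  # apple to the left
print(normalized_angle(heading_up, (0, 0), (5, 0)))   # apple to the right
```

An apple straight ahead gives 0, and an apple directly behind the snake gives a magnitude of 1.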

So the new input will be an array of 5 numbers:

  • Is there an obstacle to the left of the snake (1 — yes, 0 — no)
  • Is there an obstacle in front of the snake (1 — yes, 0 — no)
  • Is there an obstacle to the right of the snake (1 — yes, 0 — no)
  • Normalized angle between snake’s movement direction and direction to an apple (from -1 to 1)
  • Suggested direction (-1 — left, 0 — forward, 1 — right)

New output

Now, instead of just observing whether the snake is alive after a turn, we need to decide whether the turn was successful. To do this we will calculate the distance between the snake’s head and the apple before and after the turn. If the distance decreased, the direction was right.

And the new output will be:

  • -1 if the snake didn’t survive
  • 0 if the snake survived but the direction is wrong
  • 1 if the snake survived and the direction is right
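The labelling rule above can be sketched like this. The article does not say which distance metric is used, so Euclidean distance is an assumption here, and the function names are hypothetical. The apple position passed in is the one from before the move, so a turn that eats the apple counts as moving closer.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y) cells."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def label_turn(alive, head_before, head_after, apple):
    """Return -1 if the snake died, 1 if it moved closer to the apple, else 0."""
    if not alive:
        return -1
    if distance(head_after, apple) < distance(head_before, apple):
        return 1
    return 0


print(label_turn(True, (5, 5), (5, 4), (5, 0)))   # moved toward the apple
print(label_turn(True, (5, 5), (5, 6), (5, 0)))   # moved away from the apple
print(label_turn(False, (5, 5), (5, 4), (5, 0)))  # died on this turn
```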

After 10000 initial games I got:

  • 294078 turns with a right direction
  • 280527 turns with a wrong direction
  • 10000 turns on which the snake died

Let’s see whether it is enough to teach the snake, but first we need to change the network’s architecture.

New architecture

Now we have more complicated input data and more options for the output, so let’s choose a more sophisticated neural network architecture. Let’s add a hidden layer with 25 neurons.

Code:

network = input_data(shape=[None, 5, 1], name='input')  # 5 features per example
network = fully_connected(network, 25, activation='relu')  # hidden layer with 25 neurons
network = fully_connected(network, 1, activation='linear')  # single output: score of the suggested turn
network = regression(network, optimizer='adam', learning_rate=1e-2, loss='mean_square', name='target')
model = tflearn.DNN(network)

You can find the full code here

First results

After training, the snake shows the following results in 1000 test games:

  • Average number of steps — 166.61
  • Average number of points — 12.171

How it looks:

Not bad. But we can do better. Let’s look at some inputs when the snake was wrong:

  • [ 1, 1, 1, 0.36420025]
  • [ 1, 1, 0, 0.04516724]
  • [ 1, 0, 1, 0.2912856]

where numbers are: [obstacle to the left, obstacle in the front, obstacle to the right, angle to an apple]

In the first example there was nothing the snake could do given the input data, but in the other two examples there was a chance to survive and the snake chose the wrong action. What can we do about it?
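This kind of check is easy to automate. The sketch below assumes each logged input has the layout described above, [left, front, right, angle], and marks a death as avoidable whenever at least one neighbouring cell was free. The function name is hypothetical.

```python
def death_was_avoidable(observation):
    """True if at least one of the left, front and right cells was free."""
    left, front, right = observation[:3]
    return not (left and front and right)


# The three inputs listed above, recorded when the snake made a wrong move.
wrong_moves = [
    [1, 1, 1, 0.36420025],
    [1, 1, 0, 0.04516724],
    [1, 0, 1, 0.2912856],
]

for observation in wrong_moves:
    print(observation, "avoidable" if death_was_avoidable(observation) else "unavoidable")
```

Counting the avoidable deaths over all test games gives a single number to watch while tuning the network.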

Let’s make snake great again

There are different things you can change when tuning your network: the architecture, the features, the learning rate, and the number of training samples. If you can get more training data, it’s always worth trying to use it. In our case it is very easy to generate more games. Let’s try 100000 games.

  • Average number of steps — 398.959
  • Average number of points — 25.333

And all inputs on which the snake dies look like [1, 1, 1, x]. That means the snake dies only when there is no way to survive.

Summary

I chose features for a neural network, picked an architecture, and generated some training data. As a result, the network predicts the best action for the given features.

Next time I’ll show you how to make it better and how to use Convolutional Neural Networks.