# CrAIg: Using Neural Networks to learn Mario

Joe Crozier and I recently came back from YHack, a 36-hour, 1500 person hackathon held by Yale University. This is our second year in a row attending, and for the second time we managed to place in the top 8!

Our project, named “crAIg”, is a self-teaching algorithm that learns to play Super Mario Bros. for the Nintendo Entertainment System (NES). It begins with absolutely no knowledge of what Mario is, how to play it, or what winning looks like, and using neuroevolution, slowly builds up a foundation for itself to be able to progress in the game.

My focus on this project was the gritty details of the implementation of crAIg’s evolution algorithm, so I figured I’d make a relatively indepth blog post about it.

crAIg’s evolution is based on a paper titled Evolving Neural Networks through Augmented Topologies, specifically an algorithm titled “NEAT”. The rest of the blog post is going to cover my implementation of it, hopefully in relatively layman’s terms.

# crAIg’s brain

Above is the second part of this project, a Node.js server that displays the current state of crAIg’s brain, or what he is “thinking”. Let’s go through it quickly to understand what it’s representing.

On the left you see a big grid of squares. This is what the game looks like right now, or what crAIg can “see”. He doesn’t know what any of the squares mean, but he knows that an “air” tile is different from a “ground” tile in some way. Each of the squares is actually an input neuron.

On the right side you can see the 4 “output neurons”, or the buttons that crAIg can press. You can also see a line going from one of the black squares on the left grid to the “R” neuron, labelled “1”. This is a synapse, and when the input neuron fires on the left, it will send a signal down the synapse and tell crAIg to press the “R” button. In this way, crAIg walks right. As crAIg evolves, more neurons and synapses are created until his brain might look something more like this:

In this one I’ll just point out a couple things. First of all, the green square on the left is a goomba. Second, you can see another neuron at the very bottom (labelled 176). This is called a hidden neuron, and represents a neuron that is neither input nor output. They appear in crAIg’s brain for added complexity as he evolves. You can also see that at his time of death (Mario just died to a goomba), he was trying to press the “R” and “B” buttons.

# Why Neuroevolution is so cool

crAIg is a cool peek into the future where machines no longer need to be programmed to complete specific tasks, but are instead given guidelines and can teach themselves and learn from experience. As the tasks we expect machines to complete become more and more complex, it becomes less possible to “hard code” their tasks in. We need more versatile machines to work for us, and evolving neural networks are a step in that direction.

# NEAT (NeuroEvolution of Augmented Topologies)

NEAT is a genetic algorithm that puts every iteration of crAIg’s brain to the test and then selectively breeds them in a very similar way to the evolution of species in nature. The hierarchy is as follows:

Synapse/Neuron: Building blocks of crAIg’s brain.

Genome: An iteration of crAIg’s brain. Essentially a collection of neurons and synapses.

Species: A collection of Genomes.

Generation: An iteration of the NEAT algorithm. This is repeated over and over to evolve crAIg.

## 1. Calculating Fitness

Once the fitness of every genome has been calculated, we can move on to the next portion of the algorithm.

The proper implementation of this algorithm is relatively intensive, so for crAIg’s implementation we simplified it to the following:

The important part here is that each genome now has an adjusted fitness value associated with it.

## 3. Survival of the Fittest

The first step is to determine how many off a species will die to make room for more babies. This is done proportionally to a species’ adjusted fitness: the higher the adjusted fitness, the more die off to make room for babies.

The second step is to determine how many children should be born in the species. This is also proportional to the adjusted fitness of the species.

By the end of these two functions, the species will have a certain number of genomes left as well as a “baby quota” — the difference between the number of genomes and the populationSize.

## 5. Mating Season

Each empty population spot needs to be filled, but can be filled through either “asexual” or “sexual” reproduction. In other words, offspring can result from either two genomes in the species being merged or from a mutation of a single genome in the species. Before I discuss the process of “merging” two genomes, I’ll first discuss mutations.

## Mutations

1. Mutate Synapse Weights

This involves a re-distribution of all synapse weights in a genome. They can be either completely re-distributed or simply “perturbed”, meaning changed slightly.

Adding a synapse means finding two previously unconnected nodes and connecting them with a synapse. This new synapse is given a random weight.

This is the trickiest of the mutations. When adding a node, you need to split an already existing synapse into two synapses and add a node in between them. The weight of the original synapse is copied on to the second synapse, while the first synapse is given a weight of 1. One important fact to note is that the first synapse (bright red in the above picture) is not actually deleted, but merely “disabled”. This means that it exists in the genome, but it is marked as inactive.

Synapses added in either Mutate Add Node or Mutate Add Synapse are given a unique “id” called a “historical marking”, that is used in the crossover (mating) algorithm.

## Crossover Algorithm

When two genomes “mate” to produce an offspring, there is an algorithm detailed in the NEAT paper that must be followed. The intuition behind it is to match up common ancestor synapses (remember we’ve been keeping their “historical marking”s), then take the mutations that don’t match up and mix and match them to create the child. Once a child has been created in this way, it undergoes the mutation process outlined above. I won’t go into too much detail on this algorithm but if you’re curious about it you can find a more detailed explanation of it in section 3.2 of the original paper, or you can see the code I used to implement it here.

## 6. Respeciation

This equation determines how similar (or different) any two given genomes are. I won’t go into the gritty details of how the equation works, as it is well explain in section 3.3 of the original paper, as well as here in crAIg’s code.

If a genome is too different from any of the candidate genomes, it is placed in its own species. Using this process, all of the genomes in the generic pool are re-placed into species.

Once this process has completed, the generation is done, and we are ready to re-calculate the fitness of each of the genomes.

# Lessons Learned

First of all, the NEAT algorithm is a very complex one. Learning how to implement a complex algorithm without losing myself in its complexity was an exercise in code cleanliness, despite being pressed for time because of the hackathon.

It was also very interesting to create an algorithm that is mostly based off a paper as opposed to one that I have example code to work with. Often this meant carefully looking into the wording used in the paper to determine whether I should be using a > or a >=, for example.

One of the most difficult parts of this project was that I was unable to test as I was programming. I essentially wrote all of the code blind and then was able to test and debug it once it had all been created. This was for a couple reasons, partially because of the time constraints of a hackathon, and partially because the algorithm as a whole has a lot of interlocking parts, meaning they needed to be in a working state to be able to see if the algorithm worked.

Overall I’m happy and proud by how Joe and I were able to deal with the stress of creating such a deep and complex project from scratch in a short 36 hour period. Not only did we enjoy ourselves and place well, but we also managed to teach crAIg some cool skills, like jumping over the second pipe in Level 1:

Written by