A brief introduction to Genetic Algorithms
Note: a Portuguese version of this article is available at “Uma breve introdução aos algoritmos genéticos”
A genetic algorithm is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms. They are well suited for optimization and search problems.
As Darwin’s natural selection says:
Individuals possessing traits well suited for the struggle for local resources will contribute more offspring to the next generation. Because of that, helpful traits tend to become more common in the next generations and the population will become adapted to its environment.
In genetic algorithms, these traits can be measured by a fitness score. It is basically how close the population is to archive its aim. The less a fitness value is, the closer to the final solution.
Concepts such as Crossing Over and Mutation have great relevance on how the algorithm generates each individual from a population. With Crossing Over, we guarantee that the genes from the fittest individuals will remain in the next generation, and the Mutation doesn’t allow the process to stagnate without reaching the target.
Let’s see how it works in practice.
Infinite Monkey
There is a theorem called Infinite Monkey. It says that an unlimited number of monkeys will eventually produce a text, such as Shakespeare’s Hamlet, given typewriters and sufficient time. We can model it as a Genetic Algorithm.
As our gene pool, we’ll set the chars from alphabet (lowercase and uppercase), numbers and punctuation marks.
The target sentence will be one of the most famous from Hamlet:
To be, or not to be, that is the question
You can check the entire code here or clicking the link at the end of this post.
Initialization and Fitness Score
First of all, we need an initial population. It will be generated randomly selecting the genes from our pool, and then, the fitness score is calculated for each individual.
In our algorithm, we want it to generate the target sentence. The fitness score can be modeled as the difference between chars:
Selection
As explained before, the individuals with well suited traits will contribute more offspring. To accomplish that, the population is sorted according to the fitness score.
The less fit part of the population won’t leave any descendants and the genes will disappear.
Crossing Over and Mutation
Parents selected, the crossing over phase can start. Basically we’ll randomly select each of the offspring genes from one of its parents. Also, a mutation rate is applied. It will get a random gene from our pool, when requested.
Except the initialization, all the phases will occur on each generation. The population from one generation will be the output from the previous one.
Finally, let’s run the code 🤞
It works! 🎉 🎉
Genetic Algorithm in Neural Networks? Welcome NEAT
What about applying the concept of genetic algorithms in neural networks? That is exactly what the algorithm Neuroevolution of Augmenting Topologies, or simply NEAT, does. Instead of relying on fixed size neural nets, it evolves the model to fit the target.
We call a genotype the collection of genes from the Neural Network. It can have node genes and connection genes. Eventually, a genotype will be mapped to a phenotype, i.e. the physical representation of the network.
Input and Output nodes won’t evolve. Hidden nodes can be added or removed. A Connection carry information from where it comes and go, a weight and a historical marker.
Competing Conventions and Historical Marker
When combining 2 genotypes, there is a big problem: it may result in worse networks, specially when using a blind approach when doing the Crossing over. That is called Competing Conventions.
To avoid it, we put a historical marker to the connections. This way, we ensure that each gene will be aligned, like what occurs in biology, generating better results.
Making a Game AI using NEAT 🕹
There is a Python library called neat-python that implements the algorithm. Using it with retro, we could train an AI and play a game.
I’ve trained the model to beat a single stage of the game Sonic the Hedgehog 2. You can see the code clicking here or on the GitHub link.
As you can see above, during early generations, it performs poorly. After some generations (in my case, 7) the result is as follows:
Wrapping up
A Genetic algorithm is a class of evolutionary algorithms, inspired by the Darwin’s natural selection. It performs well on optimization and search problems.
It uses biological concepts such as crossing over and mutation. Also, there is a score called fitness to measure how much a population is fit for the task.
NEAT is an algorithm that uses the concept from genetic algorithms in neural networks. The idea behind it is the evolution of the genes and build a neural network as fit as possible.