Genetic Algorithm: Part 4 -CartPole-v0

Satvik Tiwari
Apr 28 · 3 min read

So far, we have learned the basics of Genetic Algorithm(GA) and solved a classical problem using GA. GA can be applied to a variety of real world problems.

So, today we will use Open AI Gym environment to simulate a simple game known as CartPole-v0 and then we will use GA to automate the playing of the game. Sounds fun…..

So, lets jump right into it.

Problem Statement

A pole is standing upright on the cart. The goal is to balance the pole in an upright position by moving the cart left or right. You lose the game if the angle of the pole with the cart is more than 15 degrees. You win the game if you manage to keep the pole balanced for given number of frames. For every frame you mange to keep the pole in upright position you get a ‘+1’ score.

Solution

To solve this problem we will first create some random game data and then we will feed it to our GA model which in turn will predict the movement of cart(left or right) for every frame. For those of you new to GA, do refer to my previous tutorials.

As usual, we begin with importing libraries and making necessary initializations.

If you are new to Open AI then you can check this out. It will give you a better intuition of its terminologies.

Now we will collect the data by running the game environment 10000 times for random moves i.e. for every frame it will be randomly decided whether our cart goes left or right. If our score is greater than or equal to 50 then we are going to store every move we made thereafter.

Note that we are storing current action with previous observation. The reason behind it is that at first as the cart was still, there was no observation but when there was an action the observation was returned from the action. Hence, pairing the previous observation to the action we’ll take.

We are considering ‘0’ for ‘left’ and ‘1’ for ‘right’.

We will follow the same flow chart for GA as we did earlier.

Initial population will be 8 randomly created set of genes/weights.

The fitness function that we will be using is :

where,

w = weights

x = observations

We will select the fittest individuals as parents i.e. solutions with the highest fitness value.

Selected parents will be mated to produce offsprings.

Now we will mutate few offsprings to create diversity in solutions and for that we will add some noise to randomly selected gene of individuals selected for mutation.

Now we will call the functions defined above in the order of flow chart.

Output:

Initial Population: 
[[ 0.67999273  1.20045524 -0.31810563 -1.14804361]  
[-1.51475165 -1.42250336  0.03428274  0.63371852]  
[-0.8970108   1.03936397  1.84329259  1.72682724]  
[ 1.94204407 -0.77717282  0.14019162 -1.1903907 ]  
[ 0.41835458 -1.22852332  0.9296547  -1.12009693]  
[-0.65292285 -1.40827788  1.55964313 -0.23029554]  
[-1.44485637  0.02821767  0.48371509 -0.67509993]  
[ 1.51550571 -1.66566025  0.17737747 -1.76249427]] 
Weights: [array([ 7.53967288, 14.12987549, -2.29013767, -8.39335431])]

Lets visualize the fitness growth with respect to generations.

Output:

So everything is set and good to go. We will feed our model with training data and then run the game environment 10 times. Initially, we will decide a random move and based on that our model will play the game.

Output:

Required Score: 50 
Average Score Achieved: 140.2

Note that everytime you run the code, its not necessary that you get a good score. But if you combine this model with some other algorithm you can make the model more robust.

Thank you for reading this article. If you like it then mention the use cases you think Genetic Algorithm can have in the comment section below. Stay tuned for more Machine Learning stuff…. :)

Koderunners

We believe in Open Source, Open Education and Open Innovations

Satvik Tiwari

Written by

Machine Learning Enthusiast || Moderator at Koderunners https://www.linkedin.com/in/satvik-tiwari-1a2955155

Koderunners

We believe in Open Source, Open Education and Open Innovations