Creating an Ai to play Tetris (part 2)
Previously I have introduced the interesting Tetris AI problem and how we are going to model the problem. We came up with a Heuristic function that evaluates “how good” is an arrangements of blocks on the game board. Basically, we are going to have a linear combination of num_of_holes, sum_of_difference, landing_height, lines_cleared, height_range and total_height. The problem that we left previously was that how to find corresponding coefficient for each factor. There are basically two ways to do that, Particle Swarm Optimization and Genetic Algorithm. Today we are going to talk about PSO.
An interesting idea
From my understanding of this algorithm, it is trying to simulate the so called “society”. So how does the algorithm work? Imagine that you have a group of people who do not know each other before, everyone of them has his/her own ideas and principles. Of cause, they have their own position in this society at the beginning. Then they work, communicate, each time they have some changes in their lives, they might consider
- “I want to reach the best position in the society!”
- “Oh, from my own experience, where is my best position? When I had the best result, reach the best position, how was I dealing with my affairs?”
- “Oh, from my observation of others, how the person who has very good result deal with his/her affairs?”
Then people might tend adjust their ideas and principles according to the considerations above. In the end, all people will be heading to the best position range in the society. That is it!
How would the idea be applied?
Right, let us regard those people as Particles. Each Particle will have its own initial velocity, which is just an interpretation of their initial "ideas". Then there is a vector that contains the weights (coefficient) of each attributes defined previously, let us call it position. Since people's idea will have influence on their positions in the society, we just say

Where xi(k+1) refers to the new position of an individual, similar to vi(k+1). This is rather simple, but how we are going to update our velocity? There are three factors that take control of the updating: previous velocity, cognitive factor and society factor.

Where c1 is a number that shows how important is the cognitive influence. c2 is similar. Then r1 and r2 are just two random numbers. pi(k) is the current best position for individual while pg(k) is the current best position for the whole society. xi(k) is of course, current position. Then we add them up

Where here inertia is just a term that defines how hard it is to change the velocity of a Particle. This is a constant.
After all, we just keep looping, let particles “do work”, evaluate the result, and update velocity and position accordingly. After sufficient number of iterations, we are expected to observe optimal solution for position.
So that is basically what is PSO and how it can be applied to our specific problem (over all idea). In next section I will be talking about the specific implementation (using pseudo code).