Genetic Algorithms: Tournament Selection

Calvin McGowan
4 min readMar 9, 2017

--

In my previous post I went over the whole of genetic algorithms at a basic level. In this post I will go into more detail, specifically in the area of selection methods.

The selection method I went over previously is called “roulette selection”. Without going back over it in too much detail, it essentially just assigns each individual in the population a probability based on their share of the total fitness of the whole population. Then individuals are randomly selected, with those individuals with a larger share being selected more frequently.

Roulette selection does a fairly good job of replicating biological evolution, and with sufficiently large populations the law of averages means that roulette selection won’t usually give you a generation made up entirely of the bottom of the fitness table. The trouble there is the word “usually”, and if you’re a hobbyist working with genetic algorithms just to experiment and learn more about programming then the word “usually” causes few problems. For professional applications the word “usually” can cause significant problems. Roulette selection might do a good job of replicating selection in biological evolution, but then we’re not really trying to replicate biological evolution. What we’re trying to do is get a workable solution once the algorithm is done running. This means using the helpful parts of biological evolution, and replacing the parts that might not work for us. In this case that means paying less attention to the bottom of the fitness table, and trying to iron out some of the randomness.

When you start looking for alternative selection methods tournament selection is one of the first ones that comes up, and it does a decent job of solving our two problems. Tournament selection does basically what it says so, if your company makes widgets I suggest pitching this to superiors as “March Madness for Widgets”. There are two main variables that determine how the algorithm works in tournament selection, tournament size, and probability.

Tournament size is what it sounds like, March Madness for example has a tournament size of 32, it’s simply the number of individuals in each tournament. Now this is not as simple as setting your tournament size to equal your population size. We want to run enough tournaments to get a full population for the next generation, 20 individuals in the population, 20 tournaments. Setting your tournament size to 20 in this case would just give you 20 copies of the highest fitness individual, which is not terribly helpful. On the other end of the scale, a tournament size of 1 is essentially just wholly random selection, so it’s even less helpful. Tournament size is another one of those things that will take a little trial and error on the part of the programmer to come up with a good tournament size. A good rule of thumb though is to include roughly 20% of your population in each generation, though for simplicity’s sake you might also consider setting your tournament size to the nearest power of 2 to avoid accounting for byes.

Probability is named less helpfully, but it’s essentially the likelihood that the selection algorithm chooses the most highly ranked individual in the tournament bracket. Now this would be a rather strange consideration in March Madness (it might even be… madness) for there to be some percent chance that the number two or number three ranked team is selected that year as the “winner” but in genetic algorithms it has it’s purpose. This is mostly due to the fact that it’s assumed that even the individuals not in first place have something to contribute to the next generation, even if it’s just to avoid ending up with a generation consisting of 20 duplicates. The formula to use for probability is “ ρ*((1- ρ)^ α)” where α is the individual’s fitness in the table, and ρ is probability. As an example, if we select a probability of 0.9, then the first place has a 90% chance of success, second place a 9% chance, third place a 1% chance. As you can see, probability will depend a lot on population size. A ρ of 0.9 works fine when your tournament size is 8, and you’re only choosing between the top 37.5%, but if your tournament size is 4 then it might be less appropriate. Again, this depends a lot on your own situation, a ρ of 0.9 and a tournament size of 4 might be fine for you. It will take some tuning.

Taken together the two variables need some balancing. You might want larger tournaments that account for maybe half the population with a relatively lower probability (~0.75), or you might want a tournament size of only 4 with a probability of 1. Larger values for both variables will put more pressure on larger fitness values. A larger tournament size will result in higher fitness individuals winning the tournament, and larger probability values will result in those winners being selected more often.

Like almost anything to do with genetic algorithms there are few specific values when it comes to tournament selection, but generally speaking tournament selection should give both more options, and better options when it comes to tuning your algorithm.

--

--