FROM THE ARCHIVES OF PRAGPUB MAGAZINE, SEPTEMBER 2018
Genetic Algorithms: A Facet of Machine Learning
by Frances Buontempo
In this article adapted from her book from The Pragmatic Bookshelf, Frances introduces one approach to machine learning.
- How do you split your investments, maximizing your pension and avoiding buying shares in companies you have moral qualms over?
- How do you create a timetable for classes, making sure there are no clashes, and all the classes get taught and have a room?
- How do you make a seating plan for a wedding, making sure each guest knows someone else at the table and keeping certain relatives as far apart as possible?
These seem like very different problems. The investments will be amounts of money in different schemes. The timetable will tell you what happens when and where. The seating plan will be seat numbers for each guest. Despite these differences, they have something in common. You have a fixed number of investments that need values or classes or guests to arrange in a suitable order. You also have some constraints or requirements, telling you how good a potential solution is. Any algorithm returning a fixed-length array organized or populated to fulfill conditions will solve the problem.
For some problems, you can work through the options or use mathematics to find a perfect solution. For one fixed-rate bond in a portfolio, the mathematics to find its future value is straightforward. With one class, one teacher, and one room, there is only one possible timetable. For a single guest at a wedding, the seating plan is apparent.
For two or three guests you have more options, but can try out each and decide what to do. Once you have 25 guests, there are 15,511,210,043,330,985,984,000,000 possible arrangements. Trying each of these, known as brute force, against your constraints will take far too long. You could reject a few combinations up front, but will still be left with far too many to try. All you need is a list of seat numbers for 25 people. You could try a few at random and might get lucky, but you might not. Ideally, you want a way to try enough of the possible arrangements as quickly as possible to increase the chance of finding a good enough seating plan — or timetable — or investment.
One machine learning or evolutionary computing method called a genetic algorithm (GA) is ideal for problems like this. A GA finds a solution of fixed length, such as an array of 25 guests’ seat numbers, using your criteria to decide which are better. The algorithm starts with randomly generated solutions, forming the so-called initial population, and gradually hones in on better solutions over time. It is mimicking Darwinian evolution, using a population of solutions, and using the suitability criteria to mirror natural selection. It also makes small changes, from time to time, imitating genetic mutation.
The algorithm makes new populations over time. It uses your criteria to pick some better solutions and uses these to generate or breed new solutions. The new solutions are made by splicing together parent solutions. For investments of bonds, property, foreign exchange and shares, combine bonds and property from one setup with foreign exchange and shares from another, and you have a new solution to try out. For seating plans, swap half of one table with half of another, or swap parts of two seating plans. You might end up with the same seat used twice, so you need to do some fixing up. There are lots of ways to splice together arrays. The GA also mutates elements in the solution from time to time, such as swapping two people’s seats. This can make things worse — splitting up a couple might not be good — but can make things improve too. Nonetheless, this keeps variety in the solutions, thereby exploring several of the possible combinations.
Frances Buontempo shares more wisdom about genetic algorithms in her book with The Pragmatic Bookshelf, Genetic Algorithms and Machine Learning for Programmers:
About Frances
Frances Buontempo is the editor of ACCU’s Overload magazine. She has published articles and given talks centered on technology and machine learning. With a Ph.D. in data mining, she has been programming professionally since the 1990s. During her career as a programmer, she has championed unit testing, mentored newer developers, deleted quite a bit of code, and fixed a variety of bugs.