# Genetic Algorithms in Practice

Jun 30, 2017 · 5 min read

A Genetic Algorithm (GA) belongs to a family of algorithms known as Evolutionary Computing. These algorithms are all inspired by biological evolution and are used for a variety of optimization problems. In this post we discuss what types of projects use Genetic Algorithms and walk through a step-by-step example.

The basics of a GA are as follows: candidate solutions to a problem are randomly initialized and form the first population. Each individual (candidate solution) receives a fitness score based on how well it solves the problem. A selection step follows, in which the fittest individuals have a higher chance of survival. New individuals are then created via crossover and mutation and form the next population. These steps can be repeated numerous times, until an individual is found that meets the acceptance criteria.
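The loop just described can be sketched as a generic skeleton, where the problem-specific pieces (initialization, fitness, selection, crossover, mutation) are passed in as functions. This is an illustrative sketch, not code from any particular library:

```python
import random

def genetic_algorithm(init, fitness, select, crossover, mutate,
                      pop_size=50, generations=100):
    # Generic GA loop: the five helper functions encode the problem.
    population = [init() for _ in range(pop_size)]
    for _ in range(generations):
        # Score every individual, then breed a full new population.
        scored = [(fitness(ind), ind) for ind in population]
        population = [mutate(crossover(select(scored), select(scored)))
                      for _ in range(pop_size)]
    return max(population, key=fitness)
```

Everything that follows in this post is about filling in these functions for one concrete problem.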

As an example, NASA created a space antenna using a GA; the antenna had to meet a difficult set of requirements. They claimed that: “The current practice of designing and optimizing antennas by hand is limited in its ability to develop new and better antenna designs because it requires significant domain expertise and is both time and labor intensive”. The individuals (random antennas) were represented as a list of operations, such as forward(length, radius) or rotate-x(angle). Both the operations and their parameters were mutable. The fitness score was mostly a combination of a wide beamwidth for a circularly polarized wave and a wide impedance bandwidth.

To illustrate the workings of a Genetic Algorithm in depth, we use a simpler case than NASA’s antenna: the knapsack problem. Given a set of items, each with a certain weight and value, the goal is to maximize the total value of the items that fit in a fixed-size knapsack.

The optimization version of this problem is NP-hard, meaning that there is no known algorithm that finds the optimal solution in polynomial time. Genetic algorithms typically tackle such NP-hard problems by approximating the optimal solution.

The first step in creating a GA is representing the problem in a way that is suitable for applying genetic operators. For the knapsack problem this is rather easy: a knapsack is represented as a variable-length array of item identifiers. The fitness score is the summed value of the items in the knapsack. If a knapsack exceeds the carrying limit, its fitness score is 0.
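A minimal sketch of this representation and fitness score, assuming items are given as (weight, value) pairs and a knapsack is a list of item indices:

```python
def fitness(knapsack, items, capacity):
    # knapsack: list of item indices; items: list of (weight, value) pairs.
    weight = sum(items[i][0] for i in knapsack)
    value = sum(items[i][1] for i in knapsack)
    # A knapsack over the carrying limit scores 0, as described above.
    return value if weight <= capacity else 0
```

Scoring invalid knapsacks as 0 (rather than rejecting them outright) keeps the fitness function total, which simplifies the selection step.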

Step 2 is creating an initial population: N knapsacks are filled with random items until the weight limit is reached.
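A sketch of this initialization step, under the same assumed representation (items as (weight, value) pairs, knapsacks as index lists):

```python
import random

def random_knapsack(items, capacity):
    # Visit items in random order, adding any that still fit.
    order = list(range(len(items)))
    random.shuffle(order)
    knapsack, weight = [], 0
    for i in order:
        if weight + items[i][0] <= capacity:
            knapsack.append(i)
            weight += items[i][0]
    return knapsack

def initial_population(n, items, capacity):
    return [random_knapsack(items, capacity) for _ in range(n)]
```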

Step 3 is determining the fitness scores and transitioning to the next population. Two individuals are chosen pseudo-randomly, where a higher fitness score means a higher probability of being chosen. The first genetic operator applied is crossover, which is analogous to giving birth: a new individual is formed from the combined genes of the parents. It is difficult, however, to determine which parts of the parents to use, since two constraints need to be satisfied: the knapsack cannot contain duplicate items, and it should not exceed the weight limit. We say “should” instead of “cannot” because it is a design choice whether invalid individuals can be “falsely” born into the world; if so, new random individuals take their place in the next population, just as in the first population.
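One way to sketch the selection and crossover steps. The merge-and-refill crossover below is one possible design choice that keeps children valid by construction (no duplicates, weight within the limit); it is an illustration, not the only option:

```python
import random

def select(population, scores):
    # Roulette-wheel selection: probability proportional to fitness.
    # The +1 keeps the weights valid even when every individual scores 0.
    weights = [s + 1 for s in scores]
    return random.choices(population, weights=weights, k=1)[0]

def crossover(parent_a, parent_b, items, capacity):
    # Merge the parents' genes without duplicates, then add items
    # in random order while the weight limit is respected.
    pool = list(dict.fromkeys(parent_a + parent_b))
    random.shuffle(pool)
    child, weight = [], 0
    for i in pool:
        if weight + items[i][0] <= capacity:
            child.append(i)
            weight += items[i][0]
    return child
```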

The second operator is mutation. Mutation can be broken down into three operators, one of which is chosen at random: adding a (valid) item, removing an item, or replacing an item.
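A sketch of such a mutation operator, again assuming the index-list representation. This variant keeps the knapsack valid after every operation:

```python
import random

def mutate(knapsack, items, capacity):
    # Apply one of three operators at random; the result stays valid
    # (no duplicate items, weight within the limit).
    knapsack = knapsack[:]
    in_pack = set(knapsack)
    weight = sum(items[i][0] for i in knapsack)
    op = random.choice(["add", "remove", "replace"])
    if op == "add":
        addable = [i for i in range(len(items))
                   if i not in in_pack and weight + items[i][0] <= capacity]
        if addable:
            knapsack.append(random.choice(addable))
    elif op == "remove" and knapsack:
        knapsack.remove(random.choice(knapsack))
    elif op == "replace" and knapsack:
        out = random.choice(knapsack)
        knapsack.remove(out)
        addable = [i for i in range(len(items))
                   if i not in in_pack and
                   weight - items[out][0] + items[i][0] <= capacity]
        if addable:
            knapsack.append(random.choice(addable))
    return knapsack
```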

After the creation of a new population, the steps repeat. For the knapsack problem a single iteration computes in only a few milliseconds, so the number of repetitions can be enormous, which in practice yields a good approximation of the optimal solution.
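Putting the steps together, a compact end-to-end version might look as follows. The operator details and parameter choices (population size, a 0.3 drop-mutation rate) are illustrative assumptions, not prescriptions from this post:

```python
import random

def run_knapsack_ga(items, capacity, pop_size=30, generations=50):
    # items: list of (weight, value) pairs; returns the best knapsack found.
    def fitness(pack):
        weight = sum(items[i][0] for i in pack)
        return sum(items[i][1] for i in pack) if weight <= capacity else 0

    def fill(order):
        # Greedily add items from `order` while they still fit.
        pack, weight = [], 0
        for i in order:
            if weight + items[i][0] <= capacity:
                pack.append(i)
                weight += items[i][0]
        return pack

    def breed(a, b):
        # Crossover: merge the parents' genes, then refill within limits.
        pool = list(dict.fromkeys(a + b))
        random.shuffle(pool)
        return fill(pool)

    def mutate(pack):
        # Mutation: occasionally drop a random item.
        if pack and random.random() < 0.3:
            pack.remove(random.choice(pack))
        return pack

    population = [fill(random.sample(range(len(items)), len(items)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(p) + 1 for p in population]  # +1 avoids all-zero weights
        parents = random.choices(population, weights=scores, k=2 * pop_size)
        population = [mutate(breed(parents[2 * j], parents[2 * j + 1]))
                      for j in range(pop_size)]
    return max(population, key=fitness)
```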

Recent developments in deep learning, and especially Convolutional Neural Networks (CNNs), have caused increased interest in Evolutionary Computing. Google wrote a blog post showing their research in this field. The “architecture” of a CNN is the specification of its layers and their parameters. The search space of architectures is immense: a 10-layer network can have on the order of 10^10 variations. Many researchers study new types of architectures and the optimization of architectures, yet no known architecture is proven to be optimal. Evolutionary Computing can optimize these architectures by exploring the number, types, and order of the layers. The parameters of the layers and the global parameters of the architecture, such as the learning rate and learning function, can also be optimized. One issue is the computing power required to optimize these architectures: the fitness score of an architecture is determined by training and evaluating it on a certain task, which for a typical GA has to be repeated thousands or millions of times.

To summarize: Evolutionary Computing uses techniques derived from natural processes to optimize problems that have no known efficient exact solution. It has been applied in various projects and has recently seen a resurgence in deep learning research, of which we will definitely see more in the upcoming years.

Students from Radboud University used an Evolutionary Algorithm to form a target image using small pieces of a base image. A live demo can be found here!

BigData Republic provides these types of Big Data solutions. We are experienced in advanced predictive modeling and in deploying large-scale big data pipelines. Interested in what we can do for you? Feel free to contact us.
