Evolving design

Danil Nagy
Published in Generative Design
Jan 26, 2017 · 17 min read

This article will describe the basics of the Genetic Algorithm (GA), which is one of the oldest and most popular metaheuristic algorithms for design optimization.

The history of the Genetic Algorithm begins (as with most modern ideas in computer science) with Alan Turing’s famous paper from 1950 called Computing Machinery and Intelligence. In this paper Turing developed many key early ideas behind artificial intelligence, including the test he called the “Imitation Game” (but now often called the “Turing Test”) for evaluating the degree to which an artificial intelligence can imitate full human intelligence. In the same paper, Turing also explores how concepts from natural evolution might be used to evolve artificially intelligent machines. Although this paper does not propose any actual implementations of such artificial evolutionary approaches, it is often cited as the first example of such a system.

The first description of an actual algorithm for evolutionary computation was proposed by John Holland in his book Adaptation in Natural and Artificial Systems published in 1975. In this book Holland describes the first-ever Genetic Algorithm (GA). Although some of the details of Holland’s algorithm are particular to the early computer systems he was using at the time, most of the basic operations he describes are still used in modern Genetic Algorithms, including the one we will use in this course.

To see how this algorithm works, let’s consider another difficult problem, taken from Daniel Shiffman’s excellent book chapter describing genetic algorithms. In this problem, we try to recreate the first phrase from Shakespeare’s Hamlet — “to be or not to be” — letter by letter with a vocabulary of 96 possible characters:

Problem definition

The design space of this problem has 18 inputs (one for each character in the sentence), with each input taking on one of 96 character values. At first this doesn’t seem like a particularly complicated problem, and certainly the solution seems obvious enough to us. However, if we actually calculate the total number of possible solutions in this design space we get:

96¹⁸ = 479,603,335,372,621,236,652,373,132,533,825,536

This is 4.796 x 10³⁵ or 479.6 decillion possibilities. If we assumed that generating and evaluating each option took only one computer cycle, to work through all the options on a modern 2.6 GHz processor would take roughly 58.5 x 10¹⁷ years. Considering the estimated age of the entire universe is only 13.8 x 10⁹ years, separately computing each design is clearly not a tractable solution. Let’s look at how a GA fares at solving this problem.
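
As a quick sanity check, here is that arithmetic in Python (assuming, as above, one design evaluated per clock cycle):

```python
# Brute-force search time for the Shakespeare problem.
options = 96 ** 18                     # size of the design space
cycles_per_second = 2.6e9              # modern 2.6 GHz processor
seconds_per_year = 60 * 60 * 24 * 365

years = options / cycles_per_second / seconds_per_year
print(f"{options:.4g} options would take about {years:.3g} years")
# -> 4.796e+35 options would take about 5.85e+18 years
```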

Using the very simple GA described by Shiffman, we are able to solve this problem in 32 seconds after looking at only 38,000 possible solutions — a big improvement over our previous calculations!

Solution to Shakespeare problem using a genetic algorithm

Although there are other algorithms and more complex implementations of GAs that can solve this problem faster, the fact that an algorithm written in only 36 lines of code can solve this problem after looking at only 38,000 out of 4.796 x 10³⁵ possible designs is pretty amazing, and speaks to the power and versatility behind the GA’s most fundamental concepts.

Basic operators

Although the Genetic Algorithm is quite good at solving some very complex problems, the algorithm itself is driven by only four basic operators. Let’s review these operators and see how they are applied to solve the Shakespeare problem.

1. Generation
The algorithm begins by generating a set of designs which form the initial ‘generation’. GAs can use different strategies for generating these initial designs, including regular sampling from the design space using methods such as Sobol sequences. However, the most common method is to randomly sample the designs from the design space. In this case we use a population size of 1,000 designs.
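
Here is a minimal sketch of this step in Python (the 96-character vocabulary and the population size of 1,000 come from the text; the function and variable names are my own):

```python
import random

TARGET = "to be or not to be"
CHARSET = [chr(i) for i in range(32, 128)]  # 96 possible characters (ASCII 32-127)
POP_SIZE = 1000

def random_design():
    # A design is one candidate string, stored as a list with one
    # 'gene' (character) per position in the target phrase.
    return [random.choice(CHARSET) for _ in range(len(TARGET))]

population = [random_design() for _ in range(POP_SIZE)]
```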

2. Selection (Ranking)
Next the algorithm selects which of the initial designs are going to be used to generate the next generation. There are several ways to do this, but in general we want to make sure that better designs have a higher probability of being selected, so that good strategies found in the first generation will be carried into the next. In this case we create a ‘mating pool’ containing the best designs according to the number of letters they have in common with the target.

The selection operator
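
A sketch of one common way to build such a pool, reusing TARGET and the population from the previous snippet. Adding fitness-proportional copies of each design approximates the idea that better designs should be more likely to be selected:

```python
def fitness(design):
    # Score a design by the ratio of characters matching the target.
    return sum(c == t for c, t in zip(design, TARGET)) / len(TARGET)

def build_mating_pool(population):
    # Add each design to the pool in proportion to its fitness, so
    # better designs have a higher probability of being picked as parents.
    pool = []
    for design in population:
        pool.extend([design] * int(fitness(design) * 100))
    return pool
```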

3. Crossover
Once the algorithm selects the promising designs, it recombines them to create a new population of designs. This is similar to the idea of ‘breeding’ in natural evolution, where genetic information from two parents is randomly recombined to create a new offspring. The basic idea is that since the parents survived long enough to reproduce, they must both have some genetic material that could be useful to the survival of the species overall. By recombining elements of its parents’ genetic information, the child will likely inherit some winning strategies from both parents, and thus have a better chance of surviving and later reproducing.

The way this crossover is implemented depends on the input types and the particular GA we use. In this case, we select two designs randomly from the ‘mating pool’ as parents. Then we choose a random ‘crossover point’, and give the child design all the information up to the crossover point from the first parent, and all the information after the crossover point from the second parent. This strategy is called ‘single-point crossover’. The child produced by this operation is then added to the next generation.

The crossover operator
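
Single-point crossover takes only a few lines; a sketch:

```python
import random

def crossover(parent_a, parent_b):
    # Pick a random crossover point, then take everything before it
    # from the first parent and everything after it from the second.
    point = random.randrange(len(parent_a))
    return parent_a[:point] + parent_b[point:]
```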

4. Mutation
Selection and crossover ensure that most of the good strategies from each generation make their way into subsequent generations of designs. However, with only these methods we can easily get stuck short of the best solution. For example, if no designs in the first generation had an ‘e’ in the last place, no child would ever receive this information and the right solution would never be found. As in nature, we need a mechanism which can inject new information randomly into the gene pool. This is done through a ‘mutation’ operator, which randomly changes the inputs of a random number of children (typically a small percentage) before they enter the next generation. In this case, we randomly reassign 1% of the inputs in each generation.

The mutation operator
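
A sketch of the mutation step, reusing the character set from the generation snippet (the 1% rate matches the text):

```python
import random

CHARSET = [chr(i) for i in range(32, 128)]  # same 96-character vocabulary
MUTATION_RATE = 0.01

def mutate(design):
    # Randomly reassign each input with a small probability, injecting
    # information that selection and crossover alone could never produce.
    return [random.choice(CHARSET) if random.random() < MUTATION_RATE else c
            for c in design]
```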

By applying these operators to a sequence of generations, the algorithm is eventually able to arrive at the correct solution. The following image shows a portion of designs from the first and last generation. You can see that in the first generation the designs are completely random. However, 38 generations later, all the designs have many of the right components of the target, including one design that got it totally right. The window on the right shows the best design in each generation and its score (as the ratio of correct characters). This shows how the algorithm is able to improve the designs gradually with each generation.

GA process for finding solution to Shakespeare problem
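
Tying the four operators together into a complete loop looks something like this (a sketch built from the helper functions above, not Shiffman’s exact code):

```python
population = [random_design() for _ in range(POP_SIZE)]

for generation in range(200):
    pool = build_mating_pool(population)
    # Breed a full new generation: pick two parents from the pool,
    # recombine them, and mutate the child.
    population = [mutate(crossover(random.choice(pool), random.choice(pool)))
                  for _ in range(POP_SIZE)]
    best = max(population, key=fitness)
    print(generation, round(fitness(best), 2), "".join(best))
    if fitness(best) == 1.0:
        break
```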

Optimality in multi-objective optimization

The example above describes a very basic Genetic Algorithm used to solve a problem with only a single objective — to get the string as close to the target as possible. However, most interesting design problems are defined by many different goals, which may be related to each other in complex, non-intuitive ways. In this case, deciding which design is better than another is not as clear cut. As an example let’s consider a simple model which we define by a closed surface lofted through three circles:

Model input/output

The diagram above shows the inputs and outputs of the model. The inputs are the diameters of the three circles. The goals are to maximize its volume and minimize its surface area. We can easily imagine how these goals are in competition with each other. Although intuition tells us that a sphere will give us the most efficient trade-off between these goals, there is clearly no single design which has both the smallest surface area and the largest volume.

Now let’s look at a set of designs taken from the design space of the model:

Visualization of designs according to two goals

The image above is a 4-d scatter plot, where each circle represents a design found somewhere in the design space. This chart is a great way to visualize a design space because it allows us to see four different pieces of data simultaneously. In this case we plot the two goals along the x and y dimensions, while the color and size of the circles represent the sequence in which the designs were created.

Plotting the designs relative to the two goals clearly shows the tradeoff between them, which is seen as a line of designs moving from the lower left corner to the upper right. In this case the tradeoff is fairly simple and close to linear, as designs with more volume tend to also have greater surface area. The designs along this line can all be considered optimal, since for each one there is no other design which beats it in both goals. For example, designs #126 and #137 can both be considered optimal while design #5 cannot because there are many designs that both have more volume and less surface area.

This line is called the Pareto optimal front, which refers to the concept of Pareto efficiency in economics. This optimal front is a kind of boundary that divides the design space in terms of its goals. All designs occurring on the boundary are optimal, because it is impossible to make any of them better in one goal without making it worse in at least one other goal. As you move along the boundary you find other designs that are equally optimal but solve the tradeoffs in other ways. Any designs found inside the boundary are feasible but non-optimal, since you can easily find designs which perform better in all goals simply by moving closer toward the boundary. No designs outside the boundary are possible according to the definition of the design space.

To make the boundary’s orientation more clear we can define a ‘utopia point’ on our chart as the hypothetical best solution for all goals. In our case this would be an object with infinite volume and no surface area, which is clearly impossible both in our model as well as the physical world. One way to think about this intuitively is that all designs in the design space want to get as close as possible to the utopia point, while the optimal front marks the boundary of how far they can get.

Visualization of Pareto optimal front and the utopia point

In practice, this front is not always as clear and continuous as the one above. With a discontinuous design space, the front will also be discontinuous, with several sections that show optimal boundaries within different areas of the design space. For example, consider the following plot, showing the design space of a chair, optimized with the two goals of minimizing weight while also minimizing deformation. In this model, there is a discontinuity every time the configuration of the legs changes or a new bar is added.

Discontinuous optimal fronts for discontinuous design space

To find the best possible designs within our design space, the task of the Genetic Algorithm is to find all of the designs along the optimal front. To do this, the algorithm must combine some level of exploration to make sure it does not miss any parts of the front, with some level of exploitation to make the designs as optimal as possible and ‘push’ the front as far as possible towards the utopia point.

In the chart above, the color of each circle denotes the time at which it was explored by the algorithm, with blue being designs explored early, and red being designs explored later in the search process. Here we can clearly see that the algorithm is performing as it should, finding multiple areas of the optimal front and exploiting them more and more over time, yet exploring enough so as not to get stuck in one particular solution to the trade-off.

As with previous concepts, the optimal front is easy to visualize with only two goals. However, the concept extends to any number of goals, and you can imagine the optimal front of any design space as a conceptual (n-1)-dimensional hyper-surface, where n is the number of goals. We can also see how this framework still holds in the case of a single goal: here the optimal front is a 0-dimensional surface (in other words a single point), which is the single best design in the design space.

At this point we can easily see that, at least when faced with more than one objective, generative design is rarely able to provide us with one single best design solution. This shows us once again that the goal of these methods is not to replace the human designer, since the designer must still select the final design from the pool of optimal ones. All the search algorithm can do is reveal some of the structure of the design space to the designer, who can then use that knowledge to either select a design based on other goals or preferences, or reframe the design space and explore further.

Ranking for multi-objective optimization

Considering the potentially complex trade-offs between multiple objectives, how can our Genetic Algorithm rank designs in order to encourage the most optimal ones to contribute to the next generation?

One very simple solution to this problem is to reduce the multi-objective problem to a single-objective one by combining the multiple objectives linearly using a set of weights. For example, if you have two objectives which you are trying to minimize, you can simply add them together and minimize the sum. You can also weight the objectives differently by multiplying them by different factors to give one objective more importance than another. In practice this is not very effective because it artificially limits the ability of the algorithm to fully search the design space. By predetermining the best trade-off between objectives, it in effect focuses the algorithm on a specific point on the Pareto optimal front, thus limiting the chances of discovering unexpected results.
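
For example, a weighted-sum ranking might look like this sketch, where each design is reduced to a single score (all objective values assumed to be minimized):

```python
def weighted_score(objectives, weights):
    # Linear scalarization: collapse several objective values into one
    # number. Fixing the weights predetermines the trade-off, which is
    # why this tends to converge on a single point of the Pareto front.
    return sum(w * f for w, f in zip(weights, objectives))

# e.g. equal weighting of two minimization objectives (dummy values):
print(weighted_score(objectives=(12.5, 3.8), weights=(0.5, 0.5)))
```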

A better way to rank designs along multiple objectives is to order them in terms of relative dominance, as described in the following rule:

Solution A dominates another solution B if A performs at least as well as B in every objective, and better than B in at least one objective.

Based on this definition, in the example above b dominates h because it performs better in both objectives, while b does not dominate a because it performs better only in objective y2.

Using this definition, we can say that one solution is more optimal than another if and only if it dominates that solution. If neither of two solutions dominates the other they are considered equally optimal. We can also define the most optimal designs on the Pareto frontier as those designs that are not dominated by any other design in the design space.
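
In code, the dominance test is straightforward; a sketch, assuming each design is represented by a tuple of objective values that are all being minimized:

```python
def dominates(a, b):
    # a dominates b if it is at least as good in every objective
    # and strictly better in at least one.
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))
```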

To rank the designs according to dominance, we can calculate successive Pareto fronts, each time ignoring previously optimal designs. The rank of each design is the number of the front it is located on, which we can use to decide which designs are selected for crossover for the next generation.

Calculation of fronts for ranking
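
A sketch of this ranking procedure, reusing the dominates() helper above (rank 1 is the Pareto optimal front itself):

```python
def pareto_ranks(designs):
    # Peel off successive Pareto fronts: find the non-dominated designs,
    # assign them the current rank, remove them, and repeat.
    remaining = list(range(len(designs)))
    ranks = {}
    rank = 1
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(designs[j], designs[i])
                            for j in remaining if j != i)]
        for i in front:
            ranks[i] = rank
        remaining = [i for i in remaining if i not in front]
        rank += 1
    return ranks
```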

Encouraging exploration

As mentioned previously, the goal of multi-objective optimization is to find and explore the Pareto optimal front in order to discover a wide range of high-performing designs. Although the ranking process allows the GA to push towards the front, we need a separate mechanism to encourage it to also explore the front as much as possible when it finds it. Otherwise, it is very easy for a GA to get stuck and prematurely converge on one type of solution, which may be a local optimum and not the overall optimal design in the design space.

Typically, this problem is handled using a crowding factor, which penalizes designs found in highly-explored areas of the design space and gives preference to those in areas which are relatively less explored. One of the most popular ways to compute this crowding factor was introduced by the NSGA-II algorithm. It calculates the factor for each design as the distance between the two adjacent designs of the same rank (occurring on the same optimal front). It then uses this distance as a factor when selecting designs for crossover to encourage novel strategies found in under-explored areas of the design space.

Calculation of crowding distance
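
A sketch of the NSGA-II crowding distance for one front, where each design is again a tuple of objective values. Designs at the ends of the front get an infinite distance, so the extremes of the trade-off are always preserved:

```python
def crowding_distances(front):
    # For every objective, sort the front and add each design's
    # normalized gap between its two neighbors to its total distance.
    n, n_obj = len(front), len(front[0])
    distance = [0.0] * n
    for k in range(n_obj):
        order = sorted(range(n), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        distance[order[0]] = distance[order[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            distance[cur] += (front[nxt][k] - front[prev][k]) / (hi - lo)
    return distance
```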

Handling constraints

In addition to multiple objectives, another issue not addressed by the simple GA above is how to deal with constraints. So far we have only looked at how the algorithm can be guided by the objective functions, which specify the goals of the optimization problem. Constraints, on the other hand, do not describe relative performance between design options, but dictate whether a design is a feasible option at all. Constraints are a much more rigid criterion than objectives, in that a design that fails even one constraint cannot be considered as a solution, even if it performs exceptionally well in all objectives.

Because constraints are not strictly necessary to define an optimization problem, they are not included in the most basic version of a GA. However, because they can be very useful for describing more complex design problems, most sophisticated GAs provide methods for dealing with constraints during the ranking and selection process.

There are three basic approaches for handling constraints in a GA:

  1. Designs which break one or more constraints can be immediately disqualified and never allowed to participate in crossover. Although this method is easy to implement, it can be overly strict. In the early stages of the optimization process, even unfeasible designs may contain useful genetic information, which is completely lost if we eliminate them from the gene pool. This makes optimization slower, and with highly constrained problems can even cause it to get stuck if few of the designs in the early generations are feasible.
  2. Another approach is to discount the rank of each design based on its feasibility. The amount of discounting is typically increased as the optimization progresses, to allow unfeasible designs to reproduce in the early generations but penalize them more and more in the later stages, discouraging such designs in the final solution set. This method works in principle, but can be difficult to tune because it requires setting the discount factor as a parameter of the optimization.
  3. The third option, which is both the most obvious and most robust, is to simply include the feasibility of the design as a separate criterion when deciding which options to promote for crossover.
Designs c, g, and k are unfeasible because they break at least one constraint

Selection through tournament

So far, I have described three separate criteria which can be used to judge a design’s relative performance compared to others in its generation:

  1. rank (based on dominance in objectives)
  2. crowding (based on distance to closest neighbors in same rank)
  3. feasibility (based on constraints)

Ideally, the GA should consider all three of these criteria when selecting which designs participate in crossover, so that it can discover a range of different solutions that are both feasible and high-performing. A common strategy for selecting designs based on more than one criterion is to have them compete directly with each other in ‘tournaments’. For example, each parent might be chosen by first picking two designs at random from the population, and then selecting the winner based on a set of rules relative to the criteria.

A common set of rules for selecting a winner is as follows:

  1. If one of the designs is feasible and the other is not, the feasible design always wins.
  2. If both designs are feasible, the design with the better (lower-numbered) rank wins.
  3. If both designs are feasible and have the same rank, the design with the higher crowding factor wins.
  4. If both designs are not feasible, the design with smaller constraint violations is chosen.

These rules can be visualized as a decision tree:

See if you can guess, based on these rules, which among two randomly chosen candidates would be selected to participate in crossover:
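
A sketch of such a binary tournament. The design attributes used here (feasible, rank, crowding, violation) are assumptions standing in for the three criteria above plus a measure of total constraint violation:

```python
import random

def tournament_winner(a, b):
    # Apply the rules above in order: feasibility first, then rank
    # (lower front number is better), then crowding (larger is better).
    if a.feasible != b.feasible:
        return a if a.feasible else b
    if not a.feasible:
        return a if a.violation < b.violation else b
    if a.rank != b.rank:
        return a if a.rank < b.rank else b
    return a if a.crowding > b.crowding else b

def select_parent(population):
    # Binary tournament: pick two designs at random, keep the winner.
    return tournament_winner(*random.sample(population, 2))
```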

Elitism

The final method we have to consider for the GA is how to avoid losing valuable genetic information over the course of the optimization. By default, the GA does not have any memory, and performs the optimization by only operating on a single generation of designs at a time. This is actually an important feature of the algorithm because it means that the algorithm only consumes a fixed amount of memory and won’t run out no matter how long the process is run for (which for complex problems can be days or even weeks).

The problem, however, is that when a high-performing design is for some reason not chosen for crossover (which is very possible given the stochastic nature of the algorithm), its genetic information is lost forever. In some advanced GAs this issue is solved by forcing the children of a given generation to compete with their parents to form the following generation. This ensures that the dominant parents always stay in the population, even if they are not chosen for crossover, or if the crossover does not recombine their information in the best way. Another commonly used method is called elitism, in which a set number of the best-performing designs is carried directly over to the following generation.
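
Elitism itself is a one-line idea; a sketch, assuming a score function that maps a design to its fitness (higher is better):

```python
def next_generation(population, children, score, n_elite=10):
    # Carry the n_elite best designs over unchanged, then fill the rest
    # of the new generation with newly created children.
    elites = sorted(population, key=score, reverse=True)[:n_elite]
    return elites + children[:len(population) - n_elite]
```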

Parameter tuning

Using only the set of operators and methods described here, a GA is able to take an arbitrary optimization problem defined by a set of inputs, objectives, and constraints and gradually derive better and better performing designs. However, while this process happens automatically and has been shown to be quite effective for a wide range of problems, its performance is substantially dictated by a set of parameters which must be set before running the optimization. Some of these parameters include:

  1. The size of each generation
  2. The number of generations to compute
  3. The elitism rate (number of high performing designs which are carried over directly to the next generation). Higher values lead to more exploitation but can cause premature convergence (getting stuck in a local optimum).
  4. Mutation rate (the percentage of generated designs whose input parameters are randomly changed). Higher values lead to more exploration but can cause the algorithm to diverge (never finding the optimal front).

Although these parameters can have a big effect on the performance of the algorithm and its ability to find the best designs, there are no hard rules for how they should be set. One very robust yet time-consuming approach is to define the tuning of parameters as an optimization problem, and to use an optimization algorithm (possibly another Genetic Algorithm) to search for the best combination of parameters. Called meta-optimization, this approach is very interesting but is beyond the scope of this course. In practice, the most typical approach is to tune the parameters through experimentation, by running multiple optimizations and seeing which settings work best for a given problem.

This section concludes our discussion of the Genetic Algorithm, and design optimization in general. The following sections will present advanced computational design and simulation methods that can be used to develop complex design spaces and sets of measures which can be efficiently searched and optimized by a GA.
