Once we have defined our design space and established some metrics to evaluate it, we are ready to enlist the help of an optimization algorithm to explore the design space and find a variety of high-performing designs. As you may recall from a previous article, any optimization problem can be described in the following form:
The optimization problem is defined by three main components: (1) a vector of input data which describes every possible design in the system, (2) a set of one or more objective functions that describe the goals of the system, and (3) an optional set of constraint functions which determine the feasibility of any design.
Once an optimization problem is defined, the goal of the optimization process is to find combinations of input data that best satisfy the objective functions while working within the limits of the constraints. To solve these kinds of problems mathematicians and computer scientists have developed a range of different strategies called optimization algorithms. Let’s describe the three components of optimization problems in more detail before discussing some of the optimization strategies we might use to solve them.
1. Input parameters
The optimization problem is described by a design vector which combines all of the input parameters that define different solutions to the problem. Input parameters can take on one of three types:
- Discrete values (usually represented by integers) define a set of categories or options. An example of a discrete parameter is the type of material used to construct a building (for example: brick, steel, or concrete).
- Continuous values (represented by real numbers) define continuous measures. An example of a continuous parameter is the size of a building’s footprint or it’s number of floors.
- Permutation sequences define an ordering of a fixed number of elements. An example of a permutation sequence is the assignment of programmatic areas within an architectural space plan.
Although all three types of inputs are represented by numbers, it is important to specify the type of input because most optimization algorithms handle them differently while searching through the design space for optimal designs.
2. Objective functions
Objective functions describe the goals of the optimization problem. When defining an optimization problem you have to specify whether your goal is to minimize or maximize the value of each objective function. The optimization algorithm will then try to drive this value as low or as high as possible based on your goals.
Any optimization problem has to include at least one objective function to guide the optimization process. A problem defined by one objective function is called a single-objective optimization. Because they only have a single goal, such problems are typically easier to solve and can usually produce a single optimal solution. A problem defined by more than one objective function is called a multi-objective optimization. Because they have multiple goals, some of which may compete with each other, such problems are typically harder to solve and may produce a range of optimal solutions which describe different approaches to negotiating the trade-offs between different objectives.
3. Constraint functions
Constraint functions determine the feasibility of different solutions in the design space. Like objective functions, they describe something about each design’s performance. However, instead of measuring the relative desirability of a design, they dictate whether a design should even be considered as an option. Constraints can be described in three ways:
- The value of the function has to exactly equal a certain value. For example, the number of bedrooms in a house must equal exactly 3 for a house design to be valid.
- The value of the function has to be greater than a certain value. For example, a parking lot has to provide at least a certain amount of spaces in order to be valid.
- The value of the function has to be less than a certain value. For example, the deformation of a structure must not be beyond a certain fixed value, although smaller deformations are still acceptable.
Given a set of objective and constraint functions, the goal of the optimization process is to produce designs that maximize or minimize the values of the objectives as far as possible, while still working within the limits of the constraints.
Design space model as optimization problem
Because they are driven by a set of input parameters and evaluated by a set of measures, our design space models as described in the last two articles are already set up as optimization problems. Although the design space model is composed of geometric operations and produces formal design options, for the purposes of optimization it can be thought of as an abstract function that incorporates all of the objective and constraint functions that describe any given design.
The last step in setting up our design spaces as optimization problems is to determine whether each measure should be treated as an objective or constraint (or both) and to specify exactly how the objectives and constraints should be applied. If a given measure is used as an objective, you need to specify whether the goal is to minimize or maximize its value. If the measure describes a constraint, you need to specify the exact condition which would make the design unfeasible according to the measure.
Any numerical measure can be described as an objective or constraint, and the decision to use one over the other is based on the problem you are trying to solve. For example, the amount of material used for a certain object can be defined as an objective if you are trying to push the value as low as possible. It can also be defined as a constraint if there is a set limit to the amount of material that can be used, but all designs within that limit are equally valid.
Once an optimization problem is defined, it can be solved using a variety of approaches. While some simple optimization problems have direct solutions, most require the problem to be solved incrementally using ‘optimization algorithms’. These algorithms can be classified into two basic categories — deterministic methods which achieve the solution through the direct application of a series of defined steps, and stochastic methods which introduce some level of randomness while ‘searching’ for a solution. Some examples of these two categories include:
Deterministic methods — ‘solve’
- Direct analysis
- Gradient descent
- Exhaustive enumeration
- Heuristic solutions
Stochastic methods — ‘search’
- Monte Carlo (MC) — random sampling
- Stochastic gradient descent
- Metaheuristic search algorithms
Which method you choose depends on the way in which an optimization problem is described, and the kind of information you have about the objective and constraint functions. In the rest of this section we will briefly study these various optimization strategies, before concluding with a discussion of metaheuristic algorithms, which turn out to be the best approaches for the types of problems we typically deal with in generative design. This will set us up for the next section where I will describe the Genetic Algorithm, one of the oldest and most popular metaheuristic algorithms which we will use to optimize our design space models in the rest of the course.
With very simple optimization problems where the objective and constraint functions are defined through linear or quadratic equations, the best solution can be found through the direct analysis or solving of the functions. For example, consider the following optimization problem defined by one objective function (P) and four constraint functions which are all linear functions of the two input variables x1 and x2:
In this case we can find the optimum values of x1 and x2 graphically by plotting the curves of the objective and constraint functions or analytically by solving the functions. In the graphic solution above the constraint functions define a region of feasible solutions (the yellow area). We can find the best solution by plotting the curve of the cost function (P), and finding the point in the yellow region which maximizes the value of the function. In this case it is the upper right corner of the feasible region.
This type of direct analytical solution is applicable to a wide variety of real-world optimization problems. When possible, it is extremely fast and efficient. However, in order to be applicable, it requires that all the objective and constraint functions are represented as equations which can be solved, which is typically not the case in our design space models. Thus, the direct analytical approach is rarely applicable for solving general optimization problems in generative design.
When the objective and constraint functions cannot be directly expressed as equations, another option for optimization is the gradient descent algorithm. This method can be applied when the objective functions cannot be described globally but can be analyzed locally at specific points. The gradient descent algorithm uses this local information to iteratively change the solution in the direction which best fits the objective functions until the best possible solution is reached.
Let’s imagine the simplest case of a problem with one input parameter and one objective function. In this case, the optimization problem can be described as a curve. If we did not know the shape of the curve to begin with, we could find the highest point on the curve (the highest value of the objective function) by looking at the slope of the curve, and moving in the direction of the highest slope to reach the peak:
This idea can easily be extended for multiple input parameters, in which case the combination of the slopes (partial derivatives) of the objective function along each input parameter forms the objective function’s gradient vector, which gives the direction in which the parameters must be tuned in order to find the optimal result.
The name ‘gradient descent’ comes from the example of a minimization problem where we can imagine the algorithm gradually ‘descending’ down the slope of computed gradients until it reaches the lowest possible value. Another common name for this algorithm is “hill-climbing” which relates to how a climber might find the peak of a hill by continuously taking steps in the direction of the greatest slope.
One major issue with the gradient descent algorithm is that it can get stuck in local optima, or regions of the design space which are locally the best performing, but not the best performing in the entire design space. For example, if the algorithm happens to start at the bottom of the second hill in our two dimensional example, it will follow the slope of the curve until it reaches the second peak, at which point it will terminate. Since the algorithm cannot ‘see’ the entire landscape of the curve, it has no way of knowing that there is another even larger peak (the global optimum) nearby.
One solution for problems with many local minima is to execute a series of gradient descent trials using different starting points each time and then taking the best final solution from each run as the global optimum. The chance of finding the globally optimal solution increases with the number of trials increases, although there is never an absolute guarantee of it being found.
The gradient descent algorithm is extremely effective for solving optimization problems defined by objective functions which cannot be directly solved but whose gradients can be directly computed. For example, gradient descent is the dominant algorithm used to train neural networks because even though the network cannot be described as a single solvable equation there is a direct way to compute the gradient of the network’s cost function with regard to each of the network’s parameters.
However, this is rarely the case with design models, which are typically composed of many interlinked geometric operations, each of which are difficult if not impossible to differentiate. Thus it is usually impossible to know exactly how changes in any of the input parameters would change any of the output metrics. This is especially the case with more complex, simulation-based metrics. Although you can still apply gradient descent by sampling several local points and approximating the gradient (a technique called evolution strategies), this tends to be more computationally expensive and less effective than other strategies when applied to design models that often take a long time to compute. Thus gradient descent is rarely applied directly to solve problems in generative design.
One question you might be asking yourself at this point is — why do we even need optimization algorithms at all? With complex problems where the objective functions cannot be solved globally or analyzed locally, can’t we just test all the different possibilities and then choose the best one?
The problem with this naive approach is that for most of our design spaces, the full enumeration of all possible input values would generate such a large number of designs that it would be beyond the limits of not only a human designer to evaluate them all, but even the most powerful computer that exists today.
An approach based on full enumeration presents an even larger problem when using continuous variables. For example, let’s say we wanted to design a chair, and created a design space with several continuous inputs that describe the position of each leg. As you can imagine, the use of such continuous inputs is quite common with design models. The problem, however, is that a full permutation of such a design space does not exist at all. With even a single continuous variable, there exist an infinite number of different design solutions, each with a slightly different value of the continuous input.
If we really wanted to produce a finite set of designs from such a design space, we could first discretize the possible values of each continuous parameter (in other words convert the continuous variable into a discrete one by giving each parameter a fixed set of possible values). But then how do we choose the number of choices for each input? If we pick a small number of choices we decrease the number of possible options, but might miss the best possible value. If we increase the number of choices, however, the number of designs increases exponentially, and the enumeration quickly becomes intractable. Ideally we would have a way to find the best value of continuous parameters directly, without having to first establish a fixed number of possibilities.
When solving complex optimization problems where the objectives functions cannot be globally solved or locally analyzed, and where it is impossible to test each possible solution, we can often rely on approximate, ‘rule of thumb’ solutions that can guarantee that we achieve a reasonable solution, even if is not the absolute best one possible. In computer science such methods are called ‘heuristics’, and are actually one of the most common ways in which we solve complex problems in our every day life.
To see an example of such a problem, let’s consider a classically difficult problem from computer science — The Travelling Salesman Problem (TSP). In this problem you are given a set of cities, and the goal is to find the shortest route that visits each city once before returning to the starting city. Although most of us aren’t professional salesmen, all of us have encountered a version of this problem when trying to book a trip, especially a more complicated one that involves several destinations. So how do you optimize your trip to visit all cities while spending the least amount of money?
As the number of cities grows, testing each possible route quickly becomes in possible. The number of possible solutions to this problem is the factorial of the number of cities not counting the starting city, since at each step the number of options to choose from decreases by one. This means that the number of possible solutions grows more than exponentially with the addition of each extra city.
With 10 cities, there are 3,628,800 possible routes — slow but not impossible to solve with a computer. As we increase the number to 30, we increase the number of possible solutions to 2.65 x 10³², which is much higher that the estimated number of grains of sand in the entire world. Even for such a small number of cities, directly computing the best possible route by testing each option will be impossible even for the most powerful computer for years to come.
If we can’t determine the single best possible route, we can try a heuristic strategy to get a reasonable solution the problem. For example, we might plan the route by always going to the closest unvisited city.
Following this strategy will yield a reasonable solution which is definitely far from the worst possible option. However, we also know for a fact that this is not the overall best possible solution.
Although such heuristic approaches can be applied very quickly and can achieve reasonable results for many problems, they are also never guaranteed to find the best possible solution, especially with more complex problems defined by discontinuous design spaces with several competing objectives. Furthermore, heuristic strategies must be developed specifically for each design strategy, and thus cannot be generalized to different design problems.
Since we as humans are fundamentally unable to think through every possible solution to a problem, much of our decisions are made based on heuristic ‘rule-of-thumb’ approaches. This is also the case in design, where designers solve problems by relying on standard ‘best practices’ and using experiences from similar projects done in the past. Although this might produce reasonable and satisfactory results, it can also keep the designer from discovering unique high-performing designs which may not be intuitive.
All the strategies described so far have been direct methods — ones that can be derived deterministically by following a fixed set of rules. Another approach to solving a complex optimization problem where little is know about the relationship of input parameters to objective functions, is to simply start testing random solutions, and then take the best solution found as the optimal. Such purely random approaches are typically called Monte Carlo methods, a name originally inspired by the reliance on luck exhibited in casinos.
Although this purely random strategy would be very unlikely to find a good solution within any reasonable amount of time, it turns out that some degree of randomness can actually be quite useful in tackling some particularly thorny problems. For example, instead of running a single iteration of gradient descent, which is likely to end up in a local optimum, you can run many iterations of the algorithm while starting at a different point each time (an approach called stochastic gradient descent). As the number of trials increases, the likelyhood that one of the runs will find the global optimum increases as well, until you can be reasonably sure that the best solution has found. In probability theory, this principle is called the weak law of large numbers. Today, stochastic algorithms based on this principle are used in many applications including optimization, machine learning, and cryptography.
The class of optimization algorithms which rely on stochastic principles to solve complex optimization problems are called metaheuristics. The general strategy with these methods is to start by sampling one or more designs at random, and then use the knowledge derived from those designs to sample other, hopefully even better performing designs. By iteratively following this process the algorithm can efficiently locate the best designs without knowing anything about how the design space works and without having to sample every possible design.
The name ‘metaheuristics’ refers to the set of rules that guide the optimization process for each algorithm. In this case, the rules are ‘meta-’ because they are applied to a general algorithm instead of directly to a particular problem. This means that, unlike heuristic methods, metaheuristic approaches are generalizeable to any optimization problem. Although such approaches are also not guaranteed to find the optimal result within any given period of time, they have been shown to be extremely effective for solving complex problems where direct solutions do not exist.
Intuitively, we can think of this process as searching for something in an unknown space with limited information. For example, imagine how you would search for a light switch in a dark room. You might start by holding out your hands and feeling randomly in the air. When you encounter the first wall, you might use your knowledge of light switches to limit your search to the surface of that wall. If after a while of searching the wall you still don’t find the light switch you might go back to searching randomly until you find another wall, and so on.
This analogy to a searching process gives these methods the common name of ‘search algorithm’. This type of algorithm is much less direct and less deterministic than the first two methods. Since many of the decisions are made randomly, we are not guaranteed to find the same results every time we run the algorithm, nor are we guaranteed to find the absolute best solution at all. Despite these limitations, however, many years of research have proven them to be extremely effective. Even with complex problems which have deterministic solutions, stochastic approaches have been shown to be much faster at finding solutions which are very close to optimal.
These algorithms are especially useful for generative design applications because they work by directly sampling the design space and don’t rely on any global or local knowledge of the design model or the objective functions. They have also been show to be very effective at avoiding conditions of local optima and learning complex, discontinuous design spaces. For these reasons, metaheuristics are the dominant strategy for solving optimization problems in generative design.
Compared to functional analysis and gradient descent, metaheuristics are a relatively new approach for solving optimization problems. However, there are still many different types to choose from. Interestingly enough, many of them are inspired by processes found in nature. This makes sense, since we have already seen how good nature can be at working through complex problems through an iterative, stochastic approach. In addition to algorithms based on natural evolution, there are stochastic optimization methods based on the crystallization of metals during annealing, as well as the behavior of ants, bees, and fireflies. For a good theoretical foundation in stochastic methods you can look at Introduction to Stochastic Search and Optimization by James C. Spall. For a summary of various metaheuristic algorithms, including a collection of more recent approaches you can look at Nature-Inspired Metaheuristic Algorithms by Xin-She Yang.
In the next section, I will introduce the Genetic Algorithm, one of the oldest and most popular metaheuristic algorithms for design optimization. This algorithm is particularly interesting because its rules and operations are directly inspired by the evolutionary process in nature, allowing us to tap into some of the potentials of natural design we discussed in the first section. To show the effectiveness of this algorithm, we can apply it to the Travelling Salesman Problem with 10 cities. Even a basic GA can find the single best solution of over 3 million possibilities after considering only 863 designs: