Solve Multi-Objective Problem using NSGA-II and DEAP in Python
Introduction
Non-dominated Sorting Genetic Algorithm II was improved by NSGA. It was Proposed by K.Deb, A.Pratap, S.Agarwal, T.Meyarivan with the similar structure with GA but especially used to deal with the multi-objective optimization problems. This article will briefly introduce how it works and two primary functions of NSGA-II in python package — DEAP.
The concept of “Dominated”
This a simple case that illustrates the concept of Pareto-Optimal. Above we have 4 members A, B, C and D with two features: Height and Salary. Now, if we try to compare their height and salary simultaneously, we will find that it’s not quite intuitive because they have multiple objectives.
Now that these two objectives are the larger the better, we can simply compare each of them. At first, we observe that A and B have both higher number them C and D, so we say A and B “dominate” C and D with respect to height and salary. With the same idea, C dominates D and D is dominated by all of them.
How about A and B? A has higher height than B but lower salary. In the contrary, B confronts the same situation. We call the situation “non-dominated”.
If we can find a set of solutions that they don’t dominate each other and not dominated by any other solutions, we call them “Pareto-optimal” solutions. In the case above, A and B are just on the Pareto-optimal front.
In NSGA-II, we compute two attributes Sp and np to help us identify better individuals.
NSGA-II Algorithm
Let me briefly introduce the NSGA-II algorithm. At first, we have a bunch of individuals as parents in GA with multiple objectives. In each iteration, we combine the parent and the offspring after GA operations. Through the Non-dominated Sorting (will talk about the details later), we classify all individuals to different Pareto-optimal front level. We then select individuals as the next population from Pareto-optimal front in the order of different levels. As for diversity preservation, the “Crowding Distance” is also computed. The crowded distance comparison guides the selection process at the various stages of the algorithm toward a uniformly spread-out Pareto-optimal front.
Let’s take a look at the previous case. The identities of np and Sp are shown at the end of each row. np means “How many people dominate you?” and Sp means “Who are you dominating?”. Because A and B are not dominated by any solution and not dominating each other, their np equals to 0 and Sp contains C and D. C is dominated by A and B, its np equals to 1. C also dominates D, so its Sp contains D.
Selection mechanism
Finally, each solution in the population will have two attributes:
- Non-domination Rank
- Crowding Distance
The next population will be selected follow the principles:
- The individuals with higher non-domination rank (smaller number) will have higher priority
- If individuals have identical non-domination rank, the ones with larger Crowding Distance have higher priority
Non-dominated Sorting & Crowded Distance in DEAP
Let’s go over how python package DEAP accomplish 2 key methods in NSGA-II:
Non-dominated Sorting
There are several steps of Non-dominated Sorting Algorithm. I also added some comments in the code.
- Make a python dictionary with fitness values as keys and indexes as values.
- Perform Non-dominated Sorting Algorithm and record individuals’ Sp and Np identities.
- Iterate all Sp in the current front. If the Np-1=0, this solution (Sp) will be the key that belongs to the next front.
- Match the key to find the chromosomes and append to fronts.
For more information, please refer to DEAP.
Crowding Distance
To get an estimate of the density of solutions surrounding a particular solution in the population, the author calculated the average distance of two points on either side of this point along each of the objectives. As the below figure, the crowding distance of the ith solution in its front (marked with solid circles) is the average side length of the cuboid (shown with a dashed box).
Implementation:
Algorithm:
- Initialize zeros for all individuals’ crowding distances.
- Look through all individuals and objective values. Select bound solutions by assigning them with Inf values.
- Calculate mth maximum and minimum each objective to get denominators for normalization.
- Sum up crowding distances of m objectives for each ith individual
(More) Constrained NSGA-II
This constraint-handling method uses the binary tournament selection, where two solutions are picked from the population and the better solution is chosen. In the presence of constraints, each solution can be either feasible or infeasible. Thus, there may be at most three situations: 1) both solutions are feasible; 2) one is feasible and other is not; and 3) both are infeasible. For single objective optimization, we used a simple rule for each case.
- Choose the solution with better objective function value.
- Choose the feasible solution.
- Choose the solution with smaller overall constraint violation.
Since in no case constraints and objective function values are compared with each other, there is no need of having any penalty parameter, a matter that makes the proposed constraint-handling approach useful and attractive.
Conclusion
NSGA-II is an elitist MOEA based on a nondominated sorting approach. In practice, NSGA-II is still a classic approach to find a much better spread of solutions and better convergence near the true Pareto-optimal front. It’s also a great example to design a simple yet efficient algorithm. Also, the paper proposed the simple extension for constrained multi-objective optimization problems based on the binary tournament selection. When it comes to implementation, DEAP provides a good toolkit in python to perform NSGA-II. Next time, when confronting multi-objective problems, don’t hesitate to find DEAP and use it!