# DEAP, a Python evolutionary computation framework.

**Distributed Evolutionary Algorithms in Python** (DEAP) is described as an evolutionary computation framework for rapid prototyping and testing of ideas [1]. It incorporates tools and data structures to easy implement genetic algorithms, genetic programmings, evolution strategies, and particle swarm optimization. It is developed at Université Laval since 2009.

The easiest way to install it is through pip:

`pip install deap`

Once it is installed, building an EA is very straightforward.

# EA basics

An EA uses mechanisms inspired by biological evolution, such as reproduction, mutation, recombination, and survival selection. Possible solutions to the optimization problem are considered the individuals in a population, and their quality is determined by a fitness function. Sequential application of the operators drives the evolution of the population. In other words this can be written as:

Step One: Generate the initial population of individuals randomly. (First generation)

Step Two: Evaluate the fitness of each individual in that population (time limit, sufficient fitness achieved, etc.)

Step Three: Repeat the following regenerational steps until termination:

a: Select the best-fit individuals for reproduction. (Parents)

b: Breed new individuals through crossover and mutation operations to give birth to offspring.

c: Evaluate the individual fitness of new individuals.

d: Replace least-fit population with new individuals.

# DEAP implementation

Four important steps are necessary before running the program: the **type** of the individuals must be decided, to encode the problem in the smartest way. How to **initialise** the individual, and **how many** of them we want to apply is another important decision we have to take. How to **combine** them is very important, a wrong operator can block evolution at all, wasting a lot of time. Finally, the **type of algorithm** we want to use has to be decided. All these steps are explained in the following lines.

## Types

Before to generate the **initial population**, the type of the individual must be defined. For example, if we want to codify the individual with float values, we just create it with the “list” tag and that is it. A fitness function has to be added, therefore the

class is perfect for a minimization problem. A **FitnessMin****minimizing** fitness is built using *negatives* weights, while a **maximizing** fitness has *positive* weights. Other types are available to be implemented and a list with examples is available at [2].

**from** **deap** **import** base, creator

creator.create("FitnessMin", base.Fitness, weights=(-1.0,))

creator.create("Individual", list, fitness=creator.FitnessMin)

## Initialization

How to implement Step One? DEAP provides a useful container called **Toolbox **exactly for this task.** **It is used to create the initializers for a population and its individuals (implemented with random floating-point numbers, as shown before).

**import** **random**

**from** **deap** **import** tools

INDIVIDUAL_SIZE = 50

toolbox = base.Toolbox()

toolbox.register("attribute", random.random)

toolbox.register("individual", tools.initRepeat, creator.Individual,

toolbox.attribute, n=INDIVIDUAL_SIZE)

toolbox.register("population", tools.initRepeat, list, toolbox.individual)

The* Attribute* registration to the toolbox defines how the **individuals** are **initialised**, “random” in this case. This is passed to the *individual *registration together with a number to define the **number** of individuals to generate. The *individual *registration is then specified into the *population *registration in order to be able to **generate** it through the following line:

`toolbox.population()`

## Operators

Another important step to define is the *Operators* to use in Step Three B. They are going to drive the evolution, therefore attention must be paid to define them correctly. DEAP already implements several of them, making our life a little bit easier. Their declaration is very similar to the way we used to declare individuals.

In the next example, the **mate** operator executes a **uniform crossover** that modifies in place the two sequence individuals. The **mutation** operator applies a **gaussian mutation** of mean *mu* and standard deviation *sigma* on the input individual. The *indpb* argument is the **probability** of each attribute to be mutated. The **parent selection** is done by selecting the best individual among *tournsize* randomly chosen individuals, *k* times.

`toolbox.register("mate", tools.cxUniform)`

toolbox.register("mutate", tools.mutGaussian, mu=0, sigma=1, indpb=0.1)

toolbox.register("select", tools.selTournament, tournsize=5)

There is a great variety of mutation operators in the

module. If you want to implement your own operator, remember that they **deap.tools****only** mutate, therefore an independent copy must be made prior to mutating the individual.

`mutant = toolbox.clone(ind1)`

ind2, = tools.mutGaussian(mutant, mu=0.0, sigma=0.2, indpb=0.2)

**del** mutant.fitness.values

There is also a big variety of crossover operators implemented in the

module. Be aware to select the right operator for the representation chosen. If you want to implement your own operator, remember that they **deap.tools****only** mate individuals, therefore an independent copy must be made prior to mate the individual.

`child1, child2 = [toolbox.clone(ind) `**for** ind **in** (ind1, ind2)]

tools.cxTwoPoint(child1, child2)

**del** child1.fitness.values

**del** child2.fitness.values

Selection is made among a population by the selection operators that are available in the

module. The **deap.tools****selection operator** usually takes as first argument a container of individuals and the number of individuals to select.

## Algorithm

There are two ways to write the main loop of the evolutionary algorithm. Either we write our own implementation or we use one provided by the

module. The following algorithms are available:**algorithms**

**eaSimple:**This algorithm reproduce the simplest evolutionary algorithm as presented in chapter 7 of [3].**eaMuPlusLambda:**This is the “*mu+lambda”*evolutionary algorithm.**eaMuCommaLambda:**This is the “*mu,lambda”*evolutionary algorithm.**eaGenerateUpdate:**This is algorithm implements the ask-tell model proposed in [4], where ask is called generate and tell is called update.**different Variations**: Variations are smaller parts of the algorithms that can be used separately to build more complex algorithms.

The algorithms are very easy to extend through subclassing in order to add new features that are problem-dependent. For example, in the following algorithm, everything discussed so far is presented.

**def** main():

pop = toolbox.population(n=100)

mate_probability = 0.5

mutation_probability = 0.2

number_generations = 40

*# Evaluate the entire population*

fitnesses = map(toolbox.evaluate, pop)

# Set fitnesses to individuals

**for** ind, fit **in** zip(pop, fitnesses):

ind.fitness.values = fit

# loop for the number of generation we need

**for** g **in** range(number_generations):

*# Select the next generation individuals*

offspring = toolbox.select(pop, len(pop))

*# Clone the selected individuals*

offspring = map(toolbox.clone, offspring)

*# Apply crossover and mutation on the offspring*

**for** child1, child2 **in** zip(offspring[::2], offspring[1::2]):

**if** random.random() < mate_probability:

toolbox.mate(child1, child2)

**del** child1.fitness.values

**del** child2.fitness.values

**for** mutant **in** offspring:

**if** random.random() < mutation_probability:

toolbox.mutate(mutant)

**del** mutant.fitness.values

*# Evaluate the individuals with an invalid fitness*

invalid_ind = [ind **for** ind **in** offspring **if** **not** ind.fitness.valid]

fitnesses = map(toolbox.evaluate, invalid_ind)

**for** ind, fit **in** zip(invalid_ind, fitnesses):

ind.fitness.values = fit

*# The population is entirely replaced by the offspring*

pop[:] = offspring

**return** pop

## Checkpointing

As we all know, thousand of bad things can happen when a program is running. For this, **persistence** is needed to be able to restart from the point the program crashed. To do so, a si a simple

and a serialization method are required. Important data will be inserted in the dictionary and **dict****serialized** to a file so that if something goes wrong, the **evolution** can be **restored** from the last saved checkpoint. This can be also used to continue the evolution after a pre-fixed termination criterion.

Checkpointing is not offered in the standard algorithms. Either you create your own algorithm, or you copy one of the implemented algorithms and you add the feature by yourself. Only a few lines have to be added: you read the checkpoint using *pickle *and assigned the saved stated to the variables for the population, hall of fame, the random state (very important for reproducibility), number of generation reached and the logbook.

**if** checkpoint:

*# A file name has been given, then load the data from the file*

**with** open(checkpoint, "r") **as** cp_file:

cp = pickle.load(cp_file)

population = cp["population"]

start_gen = cp["generation"]

halloffame = cp["halloffame"]

logbook = cp["logbook"]

random.setstate(cp["rndstate"])

**else**:

*# Start a new evolution*

population = toolbox.population(n=300)

start_gen = 0

halloffame = tools.HallOfFame(maxsize=1)

logbook = tools.Logbook()

To save the states another piece of code is required. It just fills a dictionary with all the data needed to start the evolution from this point.

**if** gen % every_how_many_gen_save_state== 0:

*# Fill the dictionary using the dict(key=value[, ...]) constructor*

cp = dict(population=population, generation=gen, halloffame=halloffame,logbook=logbook, rndstate=random.getstate())

**with** open("checkpoint_name.pkl", "wb") **as** cp_file:

pickle.dump(cp, cp_file)

## LogBook & Statistics

In the previous paragraph, the **logbook** class came out. What does it do? It records evolution as a chronological list of dictionaries. For example, you could record the number of generation together with the number of evaluation done during it plus all the statistics that we defined with the command **Statistics. **In the following example, the statistics defined are the mean, standard deviation, min, and max of the fitness. DEAP automatically computes them at every generation.

`stats = tools.Statistics(lambda ind: ind.fitness.values)`

stats.register("avg", np.mean, axis=0)

stats.register("std", np.std, axis=0)

stats.register("min", np.min, axis=0)

stats.register("max", np.max, axis=0)

## History

DEAP provides also a way to visualise the graph about the genealogy of all the individuals produced in the evolution. The produced genealogy tree is compatible with the NetworkX library, and on the DEAP website, there is the code on how to produce it.

# References

[0] https://www.inverse.com/topic/evolution

[1]https://en.wikipedia.org/wiki/DEAP_(software)

[2]https://deap.readthedocs.io/en/master/tutorials/basic/part1.html

[3]Back, Fogel and Michalewicz, “Evolutionary Computation 1 : Basic Algorithms and Operators”, 2000.

[4] Collette, Y., N. Hansen, G. Pujol, D. Salazar Aponte and R. Le Riche (2010). On Object-Oriented Programming of Optimizers — Examples in Scilab. In P. Breitkopf and R. F. Coelho, eds.: Multidisciplinary Design Optimization in Computational Mechanics, Wiley, pp. 527–565;

[5] https://deap.readthedocs.io/en/master/api/tools.html#deap.tools.History