A Genetic Algorithm for Optimizing Neural Network Parameters using Python

Luana Gonçalves · Published in Analytics Vidhya · Sep 27, 2019

Artificial neural networks are supervised machine learning algorithms that are very popular in fields such as speech and image recognition, time series forecasting, and machine translation, among others. They are useful in research because of their ability to handle stochastic problems, which often allows approximate solutions to extremely complex problems.

However, it is very difficult to define an ideal network architecture. There are no clear rules for how many neurons the hidden layers should contain, how many layers to use, or how the connections between these neurons should be implemented. To tackle this problem, this article shows how to use a genetic algorithm to automatically find good neural network architectures in Python.

First, you need to install the scikit-learn package, a simple and efficient tool for data mining and data analysis.
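If you manage your packages with pip, the installation is a one-liner (NumPy comes along as a dependency):

pip install scikit-learn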

To train the hybrid algorithm, we will use the Iris flower dataset, whose samples belong to three classes (Setosa, Virginica, and Versicolor).

from sklearn import datasets
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from random import randint
import random
import numpy as np

# Load the Iris dataset and hold out a third of it for testing
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

Now we can start structuring the genetic algorithm. Each individual in the population encodes an activation function, a solver, and the number of neurons in each of the two hidden layers of the neural network. The code below shows how the population is initialized; its size is defined by size_mlp.

def inicialization_populacao_mlp(size_mlp):
    activation = ['identity', 'logistic', 'tanh', 'relu']
    solver = ['lbfgs', 'sgd', 'adam']
    # Each individual: [activation, solver, neurons in layer 1, neurons in layer 2]
    pop = [[random.choice(activation), random.choice(solver), randint(2, 100), randint(2, 100)] for i in range(0, size_mlp)]
    return pop
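Calling it with a small size makes the encoding concrete; since the genes are drawn at random, your output will differ:

pop = inicialization_populacao_mlp(3)
print(pop)
# e.g. [['tanh', 'adam', 17, 64], ['relu', 'sgd', 88, 5], ['logistic', 'lbfgs', 41, 23]]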

Crossover is an operator that combines the information of two parents to generate new individuals. The objective is to increase genetic variability and provide better candidate solutions. The recombination used here exchanges genes position by position: the child takes the activation function and the first hidden layer size from one parent, and the solver and the second hidden layer size from the other.

def crossover_mlp(mother_1, mother_2):
    # Activation and layer-1 size come from mother_1,
    # solver and layer-2 size from mother_2
    child = [mother_1[0], mother_2[1], mother_1[2], mother_2[3]]
    return child
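For example, crossing two individuals gives a child with genes from both parents:

print(crossover_mlp(['tanh', 'adam', 17, 64], ['relu', 'sgd', 88, 5]))
# ['tanh', 'sgd', 17, 5]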

To further increase genetic variability and avoid local minima, a mutation operator is also used. The probability of mutation is defined by prob_mut.

def mutation_mlp(child, prob_mut):
    # 'child' is a list of individuals; each one mutates with probability prob_mut
    for c in range(0, len(child)):
        if np.random.rand() <= prob_mut:
            # add 1-10 neurons to one of the two hidden layers (genes 2 and 3)
            k = randint(2, 3)
            child[c][k] = int(child[c][k]) + randint(1, 10)
    return child
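Applied to a list of children, the operator perturbs the hidden layer sizes in place (which genes change is random):

children = [['tanh', 'sgd', 17, 5], ['relu', 'adam', 88, 41]]
print(mutation_mlp(children, prob_mut=0.8))
# e.g. [['tanh', 'sgd', 24, 5], ['relu', 'adam', 88, 49]]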

Since this example is a classification task, the fitness function is computed from the accuracy of the neural network on the test set; the objective of the genetic algorithm is to maximize this accuracy.

def function_fitness_mlp(pop, X_train, y_train, X_test, y_test):
    fitness = []
    for w in pop:
        # Build an MLP from the individual's activation, solver, and layer sizes
        clf = MLPClassifier(learning_rate_init=0.09, activation=w[0], solver=w[1], alpha=1e-5,
                            hidden_layer_sizes=(int(w[2]), int(w[3])), max_iter=1000, n_iter_no_change=80)
        try:
            clf.fit(X_train, y_train)
            f = accuracy_score(y_test, clf.predict(X_test))
            fitness.append([f, clf, w])
        except Exception:
            # Skip individuals whose configuration fails to train
            pass
    return fitness
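Each entry of the returned list holds the accuracy, the trained classifier, and the hyperparameters that produced it, so a quick sanity check looks like this (scores and genes are random, so yours will differ):

pop = inicialization_populacao_mlp(3)
fitness = function_fitness_mlp(pop, X_train, y_train, X_test, y_test)
print(fitness[0][0])  # accuracy of the first individual
print(fitness[0][2])  # its hyperparameters, e.g. ['relu', 'adam', 35, 12]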

Finally, the body of the genetic algorithm is structured.

def ag_mlp(X_train, y_train, X_test, y_test, num_epochs=10, size_mlp=10, prob_mut=0.8):
    pop = inicialization_populacao_mlp(size_mlp)
    fitness = function_fitness_mlp(pop, X_train, y_train, X_test, y_test)
    pop_fitness_sort = np.array(list(reversed(sorted(fitness, key=lambda x: x[0]))), dtype=object)
    for j in range(0, num_epochs):
        length = len(pop_fitness_sort)
        # Select the parents
        parent_1 = pop_fitness_sort[:, 2][:length//2]
        parent_2 = pop_fitness_sort[:, 2][length//2:]

        # Crossover
        child_1 = [crossover_mlp(parent_1[i], parent_2[i]) for i in range(0, np.min([len(parent_2), len(parent_1)]))]
        child_2 = [crossover_mlp(parent_2[i], parent_1[i]) for i in range(0, np.min([len(parent_2), len(parent_1)]))]
        child_2 = mutation_mlp(child_2, prob_mut)

        # Compute the children's fitness to decide who moves on to the next generation
        fitness_child_1 = np.array(function_fitness_mlp(child_1, X_train, y_train, X_test, y_test), dtype=object)
        fitness_child_2 = np.array(function_fitness_mlp(child_2, X_train, y_train, X_test, y_test), dtype=object)
        pop_fitness_sort = np.concatenate((pop_fitness_sort, fitness_child_1, fitness_child_2))
        sort = np.array(list(reversed(sorted(pop_fitness_sort, key=lambda x: x[0]))), dtype=object)

        # Select the individuals of the next generation
        pop_fitness_sort = sort[0:size_mlp, :]
        best_individual = sort[0][1]  # the trained classifier with the highest accuracy

    return best_individual
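A minimal end-to-end run then looks like this; ag_mlp returns the trained classifier with the highest test accuracy found (exact results vary between runs, since initialization and the genetic operators are random):

best_clf = ag_mlp(X_train, y_train, X_test, y_test, num_epochs=10, size_mlp=10, prob_mut=0.8)
print(accuracy_score(y_test, best_clf.predict(X_test)))
print(best_clf.activation, best_clf.solver, best_clf.hidden_layer_sizes)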

ENJOY YOUR CODING!

To download the code, click here.
P.S. If you like to read more stuff like this on Medium, consider supporting me and thousands of other writers by signing up for a membership. Or you can buy me a coffee here instead. Have a nice day :)
