Building an Image Colorization Neural Network — Part 2: Artificial Neural Networks

George Kamtziridis
5 min read · Sep 5, 2022


Welcome back to the second part of this series, where we are trying to build a neural network capable of applying realistic color to black and gray images. If you haven’t already checked the first part, in which we described what generative models and autoencoders are, be sure to read it through before this one. However, if you feel confident about generative models and autoencoders, then by all means read on.

The entire series consists of the following 4 parts:

  1. Part 1: Outlines the basics of generative models and Autoencoders.
  2. Part 2 (Current): Showcases the fundamental concepts around Artificial Neural Networks.
  3. Part 3: Presents the basic knowledge of Convolutional Neural Networks.
  4. Part 4: Describes the implementation of the actual model.

Disclaimer: This is not a tutorial in any way. It provides some rudimentary knowledge, but the main goal is to showcase how one can build such a model.

Artificial Neural Networks

In the previous article we mentioned that our Autoencoder will consist of 2 separate artificial neural networks. But what is an artificial neural network, or ANN? In Artificial Intelligence, a neural network attempts to mimic the functionality of a biological neural network. To that end, ANNs are composed of artificial neurons, which simulate the actual neurons of the human brain. Each artificial neuron can be linked to other neurons through connections, or links, called synapses. Just like in our brain, a neuron can send signals of arbitrary intensity to other neurons. When a signal is strong enough, the receiving neuron is activated and, in turn, sends a new signal to the next neurons. A visual example can be found below, where the circles are the neurons (nodes) and the edges are the synapses.

Basic layout of an Artificial Neural Network

Do note that neurons are grouped into distinct layers — the intermediate ones are known as hidden layers — and the information flows from left to right. The input layer always holds the features of the dataset, and the output layer is the one that provides the final answer to the problem. In regression tasks, the output layer usually has a single node whose value is the answer. In classification tasks, the output layer generally has as many nodes as there are classes, where the output of each neuron indicates the degree to which the given sample belongs to the corresponding class. Both cases are depicted in the following images.

Regression: predicting house prices based on size and location
Classification: predicting whether a house, given its size and location, is more profitable to sell or buy

How do we measure the intensity of the signal? Each neuron has its own incoming synapses, each synapse has a weight, and the neuron itself has a bias. To calculate the intensity of the incoming signal, we multiply each incoming signal by the corresponding weight and sum everything up. Then, to this sum we add the bias b. However, we are not done yet. The result must pass through an additional function, called the activation function, which is responsible for determining whether the signal will propagate forward and with what intensity. In other words, every neuron, or every layer, must have an activation function. A detailed view of a single neuron is shown in the next figure:

Single neuron with weights w1,… wn, bias b and activation function σ

Putting it formally, the output y of a neuron with inputs x1, …, xn is:

y = σ(w1·x1 + w2·x2 + … + wn·xn + b)
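As a minimal sketch of this computation, here is the weighted sum plus bias passed through a sigmoid activation, using NumPy (the specific input and weight values are made up for illustration):

```python
import numpy as np

def neuron_output(x, w, b):
    """Output of a single neuron: weighted sum of inputs plus bias,
    passed through a sigmoid activation sigma(z) = 1 / (1 + e^-z)."""
    z = np.dot(w, x) + b              # intensity of the incoming signal
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid squashes it into (0, 1)

# Example: a neuron with 3 incoming synapses
x = np.array([0.5, -1.0, 2.0])   # incoming signals
w = np.array([0.4, 0.3, 0.1])    # synapse weights w1..w3
b = 0.1                          # bias
print(neuron_output(x, w, b))    # a value in (0, 1)
```

Other activation functions (ReLU, tanh, etc.) follow the same pattern: compute the weighted sum, then apply the chosen nonlinearity.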

Ok, but how can we use a neural network to solve a problem? First of all, we must decide on the network architecture: how many layers, how many nodes per layer and which activation functions to use. The nature of the problem can usually give us some guidelines, but there is no bulletproof recipe. In fact, there is a whole field in AI called Neural Architecture Search, or NAS, that tries to tackle exactly this problem. After that, we initialize all the weights and biases in some way (there are some very sophisticated initialization schemes, but they are out of the scope of this series), feed the features to the first layer and read the answer from the output layer.

This process is known as a feed-forward pass. During training, we pass every sample of the dataset through the network in this way and get an answer. Initially, this answer is way off the desired one, so we must find a way to calculate the error and somehow adjust the network to better predict the next samples. To measure the error we have to choose a metric, such as Mean Squared Error for regression or Cross Entropy for classification. These are the loss functions, which directly indicate how well the neural network is performing.
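The two loss functions mentioned above can be sketched in a few lines (the sample predictions and targets are made-up numbers for illustration):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average squared distance between
    predictions and targets (regression)."""
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross Entropy: y_true is a one-hot class vector, y_pred are
    predicted class probabilities (classification)."""
    return -np.sum(y_true * np.log(y_pred + eps))

# Regression: targets 3.0 and 5.0, predictions 2.5 and 5.5
print(mse(np.array([3.0, 5.0]), np.array([2.5, 5.5])))                 # 0.25

# Classification: true class is the 2nd of 3, predicted with prob. 0.8
print(cross_entropy(np.array([0, 1, 0]), np.array([0.1, 0.8, 0.1])))   # ~0.223
```

In both cases, a lower value means the network's answers are closer to the desired ones, which is exactly what training tries to achieve.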

Ok, so we have passed the features, obtained an answer and calculated the loss. How can we train the model? To train the model, we need to adjust the weights of the network so that it becomes more accurate on the task at hand. This is achieved by the Back-Propagation algorithm, which takes the derivative of the loss with respect to the output layer and propagates the result backwards, adjusting each weight in the direction (the negative gradient) in which the loss decreases. The most popular optimization algorithm built on these gradients is Gradient Descent, where we run through all the samples before updating the network parameters. Once the entire dataset has been fed to the neural network, we say that an epoch has been completed. The problem with plain Gradient Descent is that it needs a lot of time to converge to an optimal solution. This can be mitigated with Stochastic Gradient Descent, which updates the parameters after each sample. Lastly, one can have the best of both worlds with Mini-batch Gradient Descent, which updates the parameters after a batch of samples has passed through the network.
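A minimal sketch of the mini-batch variant, fitting a single weight and bias to a made-up toy dataset (the model is a one-neuron "network" with no activation, so the gradients can be written by hand; the learning rate, batch size and epoch count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy dataset: y = 2x + 1 plus a little noise
X = rng.uniform(-1, 1, size=200)
y = 2.0 * X + 1.0 + rng.normal(scale=0.05, size=200)

w, b = 0.0, 0.0          # parameters to learn
lr, batch_size = 0.1, 20 # learning rate and mini-batch size

for epoch in range(50):                       # one epoch = one full pass over the data
    idx = rng.permutation(len(X))             # shuffle before forming batches
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        err = (w * X[batch] + b) - y[batch]   # feed-forward + error on this batch
        grad_w = 2 * np.mean(err * X[batch])  # dMSE/dw
        grad_b = 2 * np.mean(err)             # dMSE/db
        w -= lr * grad_w                      # step against the gradient
        b -= lr * grad_b

print(w, b)   # should end up close to the true values 2 and 1
```

Setting `batch_size = len(X)` recovers plain Gradient Descent, while `batch_size = 1` gives Stochastic Gradient Descent — the loop structure stays the same.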

If the overall architecture is appropriate and the features are properly engineered, after some epochs the neural network can be sufficiently accurate and we can have a model that provides a solution to the given problem. And that’s how an artificial neural network works! Again, this is a very basic introduction to ANNs which will allow you to understand the following sections.

So, that’s it for now! In the next part we will take a look at Convolutional Neural Networks. Stay tuned!


George Kamtziridis

Full Stack Software Engineer and Data Scientist at fromScratch Studio. BEng, MEng (Electrical Engineering/Computer Engineering) MSc (Artificial Intelligence)