Feed Forward Neural Network — Explainable AI Visualization (Part 6)

Parvez Kose · Published in DeepViz · Feb 10, 2023

This article continues the background research for the study ‘Explainable Deep Learning and Visual Interpretability.’

The multi-layer perceptron is a good example of a feed-forward neural network: multiple layers of functions composed sequentially. Each layer is a set of function units whose output vectors serve as input to the next layer.

[Figure: Multi-layer perceptron]

There are three types of layers:

  • The input layer is a collection of the raw input data
  • Hidden layers are sequences of functions applied to either the inputs or the outputs of previous hidden layers
  • The output layer is the final function or set of functions.

The above multi-layer neural network with three hidden layers and multiple outputs is suitable for the MNIST handwritten-digit problem. The input layer has p = 784 units. Such a network, with hidden-layer sizes (1024, 1024, 2048) and particular choices of tuning parameters, achieves a state-of-the-art error rate of 0.93%.
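To make this concrete, here is a minimal sketch of such a network in PyTorch (a framework choice assumed for illustration; the article does not prescribe one), using the hidden-layer sizes quoted above:

```python
import torch
import torch.nn as nn

# A multi-layer perceptron for MNIST: 784 input units,
# three hidden layers of sizes (1024, 1024, 2048), 10 output classes.
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(784, 1024), nn.ReLU(),
            nn.Linear(1024, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, 10),   # one logit per digit class
        )

    def forward(self, x):
        # Flatten 28x28 images into 784-dimensional input vectors.
        return self.layers(x.view(x.size(0), -1))

model = MLP()
logits = model(torch.randn(32, 1, 28, 28))  # a batch of 32 stand-in images
print(logits.shape)  # torch.Size([32, 10])
```

In practice, the tuning parameters mentioned above (learning rate, regularization, and so on) matter as much as the layer sizes themselves.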

Convolutional Neural Networks

A convolutional neural network (CNN) is a feed-forward artificial neural network that preserves the hierarchical structure of its input by learning internal feature representations that generalize across common image problems such as classification. It is not restricted to images; it has also been applied to natural language processing and speech recognition.

Network architecture

The classic CNN architecture comprises stacked convolutional layers, each typically followed by an activation function and optional batch normalization, periodically interleaved with pooling layers, and ending in fully connected layers. As an image moves through the network, its spatial dimensions are periodically down-sampled while the number of feature maps increases. The final layer outputs the predicted class probabilities.
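The stacking pattern just described can be sketched as a small, hypothetical architecture (the layer counts and channel sizes here are illustrative, not taken from any of the named networks):

```python
import torch
import torch.nn as nn

# Classic CNN pattern: convolution -> batch norm -> activation,
# periodically down-sampled by pooling, then fully connected layers.
class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))      # class logits

model = SmallCNN()
print(model(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 10])
```

Note how the spatial resolution halves at each pooling step while the number of feature maps grows (3 → 16 → 32), exactly the trade described above.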

These architectures can be broadly classified into classic network architectures such as LeNet-5, AlexNet, and VGG16, and modern network architectures such as Inception, ResNet, and DenseNet. While the classic architectures mainly comprise stacked convolutional layers and a few fully connected layers, the modern state-of-the-art architectures explore innovative ways of constructing convolutional layers that allow for more efficient learning.

These architectures also serve as baselines to build on for a variety of computer vision tasks. They act as rich feature extractors for problems such as object detection, object tracking, pose estimation, text detection, visual saliency detection, semantic segmentation, image captioning, visual question answering, action recognition, and scene labeling, as well as speech and natural language processing.
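As an illustration of the feature-extractor role, a pretrained backbone can be reused with its classification head removed. This sketch assumes a recent torchvision with bundled ImageNet weights:

```python
import torch
from torchvision import models

# Load a ResNet-18 pretrained on ImageNet and drop its classification head,
# keeping the convolutional trunk as a generic feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
extractor = torch.nn.Sequential(*list(backbone.children())[:-1])
extractor.eval()

with torch.no_grad():
    feats = extractor(torch.randn(1, 3, 224, 224))  # stand-in image batch
print(feats.flatten(1).shape)  # torch.Size([1, 512]): a 512-d descriptor
```

The resulting descriptors can then feed a task-specific head (a detector, a segmenter, a captioner) instead of training a vision model from scratch.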

The effectiveness of CNNs in image recognition is one of the main reasons the world recognizes the power of deep learning. CNNs are good at building position- and rotation-invariant features from raw image data, which has led to significant advances in machine vision, with critical applications in self-driving cars, robotics, drones, and treatments for the visually impaired.

A simple example of a computer vision model is seen below:

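This minimal sketch assumes torchvision's pretrained ResNet-18 and its bundled preprocessing; the random tensor stands in for a real image:

```python
import torch
from torchvision import models

# Classify an image with a pretrained ResNet-18; the bundled
# transforms resize and normalize the input as the model expects.
weights = models.ResNet18_Weights.DEFAULT
model = models.resnet18(weights=weights).eval()
preprocess = weights.transforms()

img = torch.randint(0, 256, (3, 300, 300), dtype=torch.uint8)  # stand-in image
with torch.no_grad():
    probs = model(preprocess(img).unsqueeze(0)).softmax(dim=1)
top = probs.argmax(dim=1).item()
print(weights.meta["categories"][top], probs[0, top].item())
```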

The broad range of CNN model families includes:

  1. CNNs with fully-connected layers (e.g. VGG)
  2. CNNs used for structured outputs (e.g. image captioning)
  3. CNNs used in tasks with multi-modal inputs (e.g. Visual Question Answering, aka VQA) or reinforcement learning.

Types of Tasks:

  • Classification
  • Regression
  • Similarity Matching
  • Clustering
  • Co-occurrence grouping
  • Profiling
  • Link prediction
  • Data Reduction
  • Causal Modelling

Types of Learning:

There are several types of learning processes: supervised, semi-supervised, unsupervised, and reinforcement learning.

Supervised learning

Supervised learning refers to training a model on a labeled dataset, where every training example pairs an input with a known output. It is a machine learning process that learns a function from an input type to an output type using data comprising examples with both input and output values. Two typical examples of supervised learning are classification and regression; their output types are respectively categorical (the classes) and numeric.
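A minimal supervised sketch, assuming scikit-learn and its bundled digits dataset, where every training example pairs pixel inputs with a known digit label:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Classification: learn a function from pixel inputs to digit labels
# using examples that pair each input with a known output.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```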

Unsupervised Learning

Unsupervised learning refers to any machine learning process that finds patterns or structure in a dataset in the absence of labels or an identified output. Examples of unsupervised learning are clustering, dimensionality reduction, recommendation, and self-organizing maps.
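A minimal clustering sketch, again assuming scikit-learn; the labels produced alongside the synthetic data are deliberately discarded, so the model sees only unlabeled points:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Clustering: group unlabeled points purely by structure in the data.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels discarded

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)  # three learned group centers
```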

Reinforcement Learning

Reinforcement learning refers to a large class of learning problems characteristic of autonomous agents interacting with an environment: sequential decision-making problems with delayed reward, in which the agent seeks to maximize the reward it receives. Reinforcement-learning algorithms learn a policy (a mapping from states to actions) that maximizes the reward received over time.

Unlike supervised learning problems, reinforcement-learning problems have no labeled examples of correct and incorrect behavior; unlike unsupervised learning problems, the agent does perceive a reward signal. The technique is inspired by how animals appear to learn from positive feedback, and it typically requires an enormous amount of data.
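A tabular Q-learning sketch on a hypothetical five-state chain (illustrative only, not from the article): the reward appears only at the far end, so the agent must learn a state-to-action policy from a delayed signal.

```python
import numpy as np

# Toy chain MDP: states 0..4, actions 0 = left, 1 = right.
# Reward 1 arrives only on reaching state 4 -- a delayed reward signal.
n_states, n_actions, goal = 5, 2, 4
Q = np.zeros((n_states, n_actions))          # action-value table
alpha, gamma, eps = 0.5, 0.9, 0.2            # step size, discount, exploration
rng = np.random.default_rng(0)

for _ in range(200):                         # episodes
    s = 0
    for _ in range(50):                      # cap episode length
        if rng.random() < eps:               # epsilon-greedy exploration
            a = rng.integers(n_actions)
        else:                                # greedy, with random tie-breaking
            a = rng.choice(np.flatnonzero(Q[s] == Q[s].max()))
        s2 = max(s - 1, 0) if a == 0 else s + 1
        r = 1.0 if s2 == goal else 0.0
        # Q-learning update: nudge Q[s, a] toward r + gamma * max_a' Q[s2, a']
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
        if s == goal:
            break

# Greedy action per non-terminal state: [1 1 1 1], i.e. always move right.
print(Q.argmax(axis=1)[:goal])
```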

The next article in this series covers the advent of black-box models and why they are harder to interpret than most machine learning models, owing to their large number of layers and parameters.

https://medium.com/@parvez__/the-advent-of-black-box-models-explainable-ai-and-visualization-part-7-6ac896f4adc4
