Convolutional Neural Network

Long
4 min read · Jun 1, 2017

  1. Architecture Overview
  2. ConvNet Layers
  3. ConvNet Kernel
  4. CIFAR-10 (Example)

1. Architecture Overview

Training a ConvNet model means learning the weights and biases of its layers: Conv layers, Pooling layers, Normalization layers and Fully-Connected layers. A forward pass through these layers aggregates the input into class scores, the scores go into a loss function (usually softmax), and backpropagation then updates the weights. This process is repeated in a loop.
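To make the loss step concrete, here is a minimal sketch of the softmax loss and its gradient with respect to the class scores, which is the signal that backpropagation starts from. The scores and the function names are hypothetical, chosen only for illustration.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_loss(scores, true_class):
    """Softmax loss: negative log-probability of the correct class."""
    return -math.log(softmax(scores)[true_class])

def loss_gradient(scores, true_class):
    """Gradient of the loss w.r.t. the scores: probabilities minus one-hot."""
    probs = softmax(scores)
    return [p - (1.0 if i == true_class else 0.0) for i, p in enumerate(probs)]

scores = [2.0, 1.0, 0.1]  # hypothetical scores for 3 classes
probs = softmax(scores)
loss = cross_entropy_loss(scores, true_class=0)
grad = loss_gradient(scores, true_class=0)
print(probs, loss, grad)
```

The gradient is negative for the correct class and positive for the others, so a gradient-descent step pushes the correct score up and the rest down.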

The network as a whole is a non-linear function that maps the raw image pixels on one end to class scores at the other.

Below is the typical ConvNet process:

In real training, an activation function is inserted between layers, except around the fully-connected layers, so the process is: Conv -> ReLU -> Pool -> ReLU -> Conv -> ReLU -> Pool -> FullCon -> FullCon

ConvNet Architecture

A ConvNet is made up of Layers. Every Layer has a simple API: It transforms an input 3D volume to an output 3D volume with some differentiable function that may or may not have parameters.

2. ConvNet Layers

The layers of a ConvNet have neurons arranged in 3 dimensions: width, height and depth. E.g. the input images in CIFAR-10 form an input volume of dimensions 32*32*3, where depth represents the RGB channels. The final output layer has dimensions 1*1*10, because by the end of the ConvNet architecture we will have reduced the full image to a single vector of class scores, arranged along the depth dimension.
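As a rough sketch, the volume shapes can be traced through the Conv -> ReLU -> Pool pipeline from the overview, starting at 32*32*3 and ending at 1*1*10. The filter counts, the 5*5 filter size with padding 2, the 2*2 pooling and the 64-unit hidden layer are all assumptions for illustration, not values from a specific model.

```python
def conv_shape(shape, num_filters, size, stride=1, pad=0):
    """Conv output: (W - F + 2P) / S + 1 per spatial side, depth = filter count."""
    w, h, _ = shape
    out_w = (w - size + 2 * pad) // stride + 1
    out_h = (h - size + 2 * pad) // stride + 1
    return (out_w, out_h, num_filters)

def pool_shape(shape, size=2, stride=2):
    """Pooling shrinks width and height but keeps depth."""
    w, h, d = shape
    return ((w - size) // stride + 1, (h - size) // stride + 1, d)

def fc_shape(shape, num_outputs):
    """A fully-connected layer maps its whole input to 1 x 1 x num_outputs."""
    return (1, 1, num_outputs)

# Hypothetical hyperparameters for each step of the pipeline.
steps = [
    ("Conv", lambda s: conv_shape(s, num_filters=16, size=5, pad=2)),
    ("ReLU", lambda s: s),  # activations never change the shape
    ("Pool", pool_shape),
    ("ReLU", lambda s: s),
    ("Conv", lambda s: conv_shape(s, num_filters=32, size=5, pad=2)),
    ("ReLU", lambda s: s),
    ("Pool", pool_shape),
    ("FullCon", lambda s: fc_shape(s, 64)),
    ("FullCon", lambda s: fc_shape(s, 10)),
]

shape = (32, 32, 3)  # CIFAR-10 input volume
shapes = [shape]
for name, layer in steps:
    shape = layer(shape)
    shapes.append(shape)
    print(f"{name:8s} -> {shape}")
```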

Convolutional Layer

This layer is the core layer of a ConvNet; here is a visual demo that shows how this layer works.

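The Conv Layer parameters consist of a set of learnable filters:

  1. A filter is a small matrix along width and height, but it extends through the full depth of its input (e.g. 2*2*3 for an RGB input).
  2. Filter attributes: size, stride, zero-padding.
  3. All positions within one depth slice of the output share the same filter (parameter sharing).
  4. Each filter detects one type of feature.

Each filter slides across the input and produces one depth slice of the output. E.g. for an input layer of 10*10*3 and 10 filters of size 2*2*3 (2*2*10 for short), with stride 1 and no zero-padding, multiplying them gives an output layer of 9*9*10.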
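The 10*10*3 example above can be checked with a naive convolution written in plain Python. This is only a sketch that assumes stride 1 and no zero-padding; the random input and filter values are placeholders, and a real framework would do this far more efficiently.

```python
import random

def conv_forward(volume, filters, biases, stride=1):
    """Naive convolution without zero-padding.

    volume:  H x W x D nested lists
    filters: list of F x F x D nested lists, one per output depth slice
    Returns an out_H x out_W x len(filters) volume.
    """
    height, width, depth = len(volume), len(volume[0]), len(volume[0][0])
    size = len(filters[0])
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    output = [[[0.0] * len(filters) for _ in range(out_w)] for _ in range(out_h)]
    for k, (filt, bias) in enumerate(zip(filters, biases)):
        for i in range(out_h):
            for j in range(out_w):
                total = bias
                # Dot product between the filter and the patch under it.
                for di in range(size):
                    for dj in range(size):
                        for c in range(depth):
                            total += volume[i * stride + di][j * stride + dj][c] * filt[di][dj][c]
                output[i][j][k] = total
    return output

random.seed(0)
volume = [[[random.random() for _ in range(3)] for _ in range(10)] for _ in range(10)]
filters = [[[[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
            for _ in range(2)] for _ in range(10)]
biases = [0.0] * 10

out = conv_forward(volume, filters, biases)
print(len(out), len(out[0]), len(out[0][0]))  # 9 9 10
```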

ReLU Layer

This layer is an activation layer; see the activation chapter of this blog for details. The layer does not change the dimensions of the volume: it keeps positive values unchanged and sets negative values to zero, i.e. max(0, x).
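ReLU is usually defined as max(0, x) applied to every value. A small sketch on a hypothetical 2*2*2 volume shows that the shape stays the same while negative values become zero:

```python
def relu(volume):
    """Apply max(0, x) to every value of an H x W x D volume."""
    return [[[max(0.0, x) for x in cell] for cell in row] for row in volume]

volume = [[[-1.5, 2.0], [0.0, -0.3]],
          [[4.0, -2.0], [1.0, 0.5]]]  # a tiny 2 x 2 x 2 volume

activated = relu(volume)
print(activated)
# [[[0.0, 2.0], [0.0, 0.0]], [[4.0, 0.0], [1.0, 0.5]]]
```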

Pooling Layer

There are two types of pooling layers: max pooling and average pooling. When we mention a pooling layer, max pooling is meant by default. This layer is a downsampling operation along the spatial dimensions (width, height).

Average pooling & max pooling
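Both pooling types can be sketched on a single 4*4 channel. The 2*2 window with stride 2 is an assumption (a common choice), and the input values are made up for the example.

```python
def pool2d(channel, size=2, stride=2, mode="max"):
    """Pool one H x W channel with a size x size window."""
    out_h = (len(channel) - size) // stride + 1
    out_w = (len(channel[0]) - size) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [channel[i * stride + di][j * stride + dj]
                      for di in range(size) for dj in range(size)]
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        output.append(row)
    return output

channel = [[1, 3, 2, 4],
           [5, 6, 7, 8],
           [3, 2, 1, 0],
           [1, 2, 3, 4]]

print(pool2d(channel, mode="max"))      # [[6, 8], [3, 4]]
print(pool2d(channel, mode="average"))  # [[3.75, 5.25], [2.0, 2.0]]
```

In a full volume the same operation is applied to every depth slice independently, so depth is unchanged.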

Fully-Connected Layer

This is the final layer. It combines every output of the previous layer with its weights to produce the classification scores.
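A fully-connected layer can be sketched as one weighted sum per class over the flattened input. The tiny 2*2*2 volume, the three classes and the weight values below are hypothetical.

```python
def flatten(volume):
    """Unroll an H x W x D volume into a flat list of values."""
    return [x for row in volume for cell in row for x in cell]

def fully_connected(inputs, weights, biases):
    """Each output score is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

volume = [[[1.0, 2.0], [0.0, 1.0]],
          [[3.0, 0.0], [1.0, 1.0]]]  # 2 x 2 x 2 output of the previous layer

inputs = flatten(volume)                    # 8 values
weights = [[0.1] * 8, [-0.1] * 8, [0.0] * 8]  # one row of weights per class
biases = [0.0, 0.5, 1.0]

scores = fully_connected(inputs, weights, biases)
print(scores)  # one score per class
```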

Normalization Layer

This layer is rarely used...

A ConvNet architecture is in the simplest case a list of Layers that transform the image volume into an output volume (e.g. holding the class scores)

There are a few distinct types of Layers (e.g. CONV/FC/RELU/POOL are by far the most popular)

Each Layer accepts an input 3D volume and transforms it to an output 3D volume through a differentiable function

Each Layer may or may not have parameters (e.g. CONV/FC do, RELU/POOL don’t)

Each Layer may or may not have additional hyperparameters (e.g. CONV/FC/POOL do, RELU doesn’t)
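To see which layers carry parameters, here is a short sketch that counts them. It assumes one bias per filter and one bias per output neuron, which is the usual convention; the layer sizes reuse the 10 filters of 2*2*3 from earlier plus a hypothetical fully-connected layer on the resulting 9*9*10 volume.

```python
def conv_params(filter_size, in_depth, num_filters):
    """Each filter has filter_size * filter_size * in_depth weights plus 1 bias."""
    return (filter_size * filter_size * in_depth + 1) * num_filters

def fc_params(num_inputs, num_outputs):
    """Each output neuron has one weight per input plus 1 bias."""
    return (num_inputs + 1) * num_outputs

def relu_params():
    return 0  # a fixed function, nothing to learn

def pool_params():
    return 0  # window size and stride are hyperparameters, not learned

print(conv_params(2, 3, 10))       # 130
print(fc_params(9 * 9 * 10, 10))   # 8110
print(relu_params(), pool_params())  # 0 0
```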

3. ConvNet Kernel

A kernel is also called a filter, and it extracts image features. Which types of features we want is not known in advance, so our purpose is to feed in the dataset and train the filters, adjusting them on every epoch.

Here are some simple kernels:

Average Kernel:

Average kernel
Average image

Gaussian Kernel

Gaussian Kernel
a) original image, b) Gaussian image

Edge Kernel

Edge kernel
Edge kernel image

Sharpen Kernel

average kernel - gaussian kernel = sharpen kernel

sharpen kernel
sharpen image
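The kernel figures are not reproduced here, so the sketch below uses common textbook 3*3 kernels. These values are an assumption on my part and not necessarily the ones pictured; in particular, SHARPEN is a widely used sharpen kernel, not one derived from the formula above. The test image is a made-up 5*5 grayscale patch with a vertical edge.

```python
def convolve2d(image, kernel):
    """Slide a k x k kernel over a grayscale image (no padding, stride 1).

    The kernel is not flipped; for the symmetric kernels below
    that gives the same result as a true convolution.
    """
    k = len(kernel)
    out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(k) for dj in range(k))
             for j in range(out_w)]
            for i in range(out_h)]

# Assumed kernel values (common choices, not taken from the figures).
AVERAGE = [[1 / 9] * 3 for _ in range(3)]
GAUSSIAN = [[1 / 16, 2 / 16, 1 / 16],
            [2 / 16, 4 / 16, 2 / 16],
            [1 / 16, 2 / 16, 1 / 16]]
EDGE = [[-1, -1, -1],
        [-1, 8, -1],
        [-1, -1, -1]]
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]

# Dark left half, bright right half: a vertical edge.
image = [[0, 0, 10, 10, 10] for _ in range(5)]

for name, kernel in [("average", AVERAGE), ("gaussian", GAUSSIAN),
                     ("edge", EDGE), ("sharpen", SHARPEN)]:
    print(name, convolve2d(image, kernel)[1])
```

The edge kernel responds only near the boundary and gives 0 in the flat region, while the average and Gaussian kernels smooth the step.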

4. CIFAR-10 (Example)

I have already added comments to the code; click through to check the details.
