# Simple CNN using NumPy: Part I (Introduction & Data Processing)

# Introduction

Convolutional Neural Networks (CNNs) are a class of neural networks that work well with grid-like data, such as images. They extract useful features from images, which makes image recognition more robust. These networks are inspired by the experiments of **David Hunter Hubel & Torsten Nils Wiesel**, who observed different neural activity in a cat’s brain in response to different orientations of a straight line.

## First Convolutional Networks

The first convolutional neural network was the Neocognitron, implemented by Dr Kunihiko Fukushima in 1980. This system used a hierarchical structure to learn simple & complex features of an image. An unsupervised learning procedure was used to recognize handwritten characters.

These ideas were improved upon by Dr Yann LeCun and his team, who trained convolutional networks with backpropagation to recognize handwritten postal codes. This line of work led to LeNet-5 in the 1990s. Implementations of this kind have led to drastic improvements in image recognition tasks.

CNNs can detect useful features (edges, horizontal lines, vertical lines, curvature, etc.) in an image, which is akin to different neurons firing in the brain for different orientations of a line. This ability comes from the convolutional layer.
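As a small preview of what "detecting an edge" means, here is a minimal sketch, not the implementation used later in this series. The helper `cross_correlate2d`, the toy 5x5 image and the two hand-written filters are assumptions made purely for illustration. The vertical-edge filter responds where the toy image goes from dark to bright, while the horizontal-edge filter stays silent.

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy image: dark (0) on the left, bright (1) in the two right-most columns.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

# Hand-written filters (illustrative only; a CNN learns its filters).
vertical_edge = np.array([[-1.0, 0.0, 1.0]] * 3)
horizontal_edge = vertical_edge.T

response_v = cross_correlate2d(image, vertical_edge)    # rows are [0, 3, 3]
response_h = cross_correlate2d(image, horizontal_edge)  # all zeros
```

The non-zero columns of `response_v` line up with the dark-to-bright boundary, which is the sense in which a filter "fires" for one orientation and not another.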

The experiments by **David Hunter Hubel & Torsten Nils Wiesel** can be seen in the following YouTube video (Credits: Ali Moeeny).

In this series of articles, I will try to implement a rudimentary Convolutional Neural Network using NumPy.

# Input Data

The input used here will be Kannada digits sourced from the Kannada Digits MNIST data repository on Kaggle. Kannada is a Dravidian language spoken by over 45 million people. The Kannada digits are as follows:

## Input Data Processing

The input is sourced from a CSV file that contains the flattened version of the images. The data preprocessing involves converting each of these entries into a 28x28 array of pixel values.

The code below creates the training and test datasets:

```python
import pandas as pd
import numpy as np

np.random.seed(42)

## Import the data set
data = pd.read_csv('../input/Kannada-MNIST/train.csv')
data['row_number'] = range(0, data.shape[0])

## Shuffling the data
data = data.sample(frac=1, random_state=42)

## Getting a balanced training set with 600 entries per class
data_train = pd.concat([data[data['label'] == label].head(600) for label in range(10)])

## Rows that were not picked for training remain available for the test set
row_numbers_in_train_set = data_train['row_number'].values
test_set = data.loc[~data['row_number'].isin(row_numbers_in_train_set)]

## Create one hot encoding (one 0/1 column per digit, named 0 to 9)
## dtype=int keeps the columns numeric on pandas versions that default to booleans
one_hot = pd.get_dummies(data_train['label'].unique(), dtype=int)
one_hot['label'] = one_hot.index
data_train = pd.merge(data_train, one_hot)

## Getting a balanced test set with 120 entries per class
data_test = test_set.sample(frac=1)
data_test = pd.concat([data_test[data_test['label'] == label].head(120) for label in range(10)])
data_test = pd.merge(data_test, one_hot)

data_train.drop('label', axis=1, inplace=True)
data_test.drop('label', axis=1, inplace=True)

## Create the train and test set and normalize the inputs
one_hot_columns = list(range(10))
X_train = data_train.drop(one_hot_columns + ['row_number'], axis=1).values / 255
y_train = data_train[one_hot_columns].values
X_test = data_test.drop(one_hot_columns + ['row_number'], axis=1).values / 255
y_test = data_test[one_hot_columns].values
```

The flattened data is imported to create a training set of 6000 entries (600 per class) and a test set of 1200 entries (120 per class). The pixel values range from 0 to 255 (greyscale) and are normalized by dividing by the maximum value (255).
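The balanced sampling described above can also be expressed with `groupby(...).head(n)`. The sketch below is an alternative formulation on synthetic data, not the code used in this article. The helper name `balanced_split`, the toy DataFrame and its fake pixel columns are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def balanced_split(df, n_train, n_test, label_col="label", seed=42):
    """Shuffle, then take n_train rows per class for training and
    n_test rows per class from the remainder for testing."""
    shuffled = df.sample(frac=1, random_state=seed)
    train = shuffled.groupby(label_col).head(n_train)
    remaining = shuffled.drop(train.index)  # assumes a unique index
    test = remaining.groupby(label_col).head(n_test)
    return train, test

# Synthetic stand-in for the real CSV: 10 classes, 50 rows each, 4 fake pixel columns.
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.integers(0, 256, size=(500, 4)),
                   columns=[f"pixel{i}" for i in range(4)])
toy.insert(0, "label", np.repeat(np.arange(10), 50))

toy_train, toy_test = balanced_split(toy, n_train=6, n_test=2)
```

Because the test rows are drawn only from what is left after the training rows are removed, the two sets cannot share an image.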

The output classes are changed to one-hot encoding representations.
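To make the two transformations above concrete, here is a minimal NumPy-only sketch of normalization and one-hot encoding. The helper names `normalize` and `one_hot` are hypothetical and do not appear in the article's code, which performs the same steps with pandas.

```python
import numpy as np

def normalize(pixels, max_value=255):
    """Scale greyscale pixel values from [0, max_value] to [0, 1]."""
    return np.asarray(pixels, dtype=float) / max_value

def one_hot(labels, num_classes=10):
    """Turn integer labels of shape (n,) into a 0/1 matrix of shape (n, num_classes)."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype=int)[labels]

pixels = normalize([0, 51, 255])        # 0.0, 0.2, 1.0
encoded = one_hot(np.array([0, 3, 9]))  # one row per label, a single 1 per row
```

For example, the label 3 becomes a row with a 1 in position 3 and zeros everywhere else.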

The following reshapes the input vectors into 28x28 NumPy arrays:

```python
from matplotlib import pyplot as plt

## Transpose so that each column holds one sample
X_train = X_train.T
y_train = y_train.T
X_test = X_test.T
y_test = y_test.T

## Reshape every training column into a single-channel 28x28 image
X_train_reshape = np.zeros((X_train.shape[1], 1, 28, 28))
for i in range(X_train.shape[1]):
    temp = X_train[:, i]
    temp = np.ravel(temp)
    temp = temp.reshape(28, 28)
    X_train_reshape[i, 0, :, :] = temp
X_train = X_train_reshape

## Do the same for the test set
X_test_reshape = np.zeros((X_test.shape[1], 1, 28, 28))
for i in range(X_test.shape[1]):
    temp = X_test[:, i]
    temp = np.ravel(temp)
    temp = temp.reshape(28, 28)
    X_test_reshape[i, 0, :, :] = temp
X_test = X_test_reshape
```
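The per-image loop above should be equivalent to a single `reshape` call on the untransposed matrix. The sketch below checks this on random stand-in data. The array `flat` and its five fake images are assumptions for illustration, not the real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
flat = rng.random((5, 784))  # stand-in: 5 flattened images, one per row

# Loop version, mirroring the article: transpose, then reshape column by column.
flat_t = flat.T
looped = np.zeros((flat_t.shape[1], 1, 28, 28))
for i in range(flat_t.shape[1]):
    looped[i, 0, :, :] = flat_t[:, i].reshape(28, 28)

# Vectorized version: -1 lets NumPy infer the number of images.
vectorized = flat.reshape(-1, 1, 28, 28)

same = np.array_equal(looped, vectorized)
```

The loop is easier to follow step by step, while the single `reshape` avoids the Python-level iteration.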

Some of the re-shaped arrays are as follows

After the processing, the training data has the dimensions (6000,1,28,28) and the test data has the dimensions (1200,1,28,28), since 120 images are kept for each of the 10 classes.

# CNN Architecture

After data processing, the images, now in the form of NumPy arrays, are passed through a series of layers.

Let’s assume that we are passing a single image of dimension (1,1,28,28). The structure of the neural network will then be as follows:

- Input Layer (1,1,28,28)
- Convolutional Filters (2,1,5,5)
- Max Pool Layer (2x2)
- Fully Connected Layer (1,288)
- Second Fully Connected Layer (1,60)
- Output Layer (1,10)

The above diagram shows the rough “blueprint” of the network. I will explain convolutional filters and the convolutional operation in the next post.
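As a sanity check on the layer sizes listed above, the 288 inputs of the first fully connected layer can be reproduced with a little arithmetic. This sketch assumes a stride of 1 and no padding for the convolution, and non-overlapping 2x2 pooling windows. These settings are not stated in this post, but they are consistent with the figure of 288. The helper names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_output_size(size, window, stride=None):
    """Spatial size after pooling; stride defaults to the window size."""
    stride = window if stride is None else stride
    return (size - window) // stride + 1

image_size = 28
num_filters, kernel_size = 2, 5

conv_size = conv_output_size(image_size, kernel_size)  # 24
pool_size = pool_output_size(conv_size, 2)             # 12
flattened = num_filters * pool_size * pool_size        # 2 * 12 * 12 = 288
```

Under these assumptions, each of the two 5x5 filters turns the 28x28 image into a 24x24 feature map, pooling halves it to 12x12, and flattening both maps gives the (1,288) vector fed to the first fully connected layer.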

Thanks for reading! Please feel free to e-mail me at padhokshaja@gmail.com with any feedback or queries. I will do my best to respond.