Simple CNN using NumPy: Part I (Introduction & Data Processing)

Published in Analytics Vidhya, Jun 20, 2021 (5 min read)

Introduction

Convolutional Neural Networks (CNNs) are a class of neural networks that work well with grid-like data, such as images. They extract useful features from images, which makes image recognition more robust. These networks are inspired by the experiments of David Hunter Hubel & Torsten Nils Wiesel, who observed different neural activity in a cat’s brain in response to different orientations of a straight line.

First Convolutional Networks

The first convolutional neural network was the Neocognitron, implemented by Dr Kunihiko Fukushima in 1980. This system used a hierarchical structure to learn simple & complex features of an image. An unsupervised learning procedure was used to recognize handwritten characters.

These ideas were improved upon by Dr Yann LeCun and his team in the 1990s, who used backpropagation to train networks that recognize handwritten postal codes (LeNet-5). Implementations of this kind have led to drastic improvements in image recognition tasks.

CNNs can detect useful features (edges, horizontal lines, vertical lines, curvature, etc.) in an image, which is akin to different neurons firing in the brain for different orientations of a line. This ability comes from the convolutional layer.
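To make the idea of feature detection concrete, here is a minimal sketch of what a single convolutional filter does. It is not the implementation built later in this series: the helper `correlate2d_valid`, the toy 6×6 image, and the hand-picked vertical-edge kernel are all assumptions chosen for illustration. In a real CNN the kernel values are learned rather than fixed by hand.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 image (illustrative): dark left half (0), bright right half (1)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to left-to-right intensity changes
vertical_edge_kernel = np.array([[-1.0, 0.0, 1.0],
                                 [-1.0, 0.0, 1.0],
                                 [-1.0, 0.0, 1.0]])

feature_map = correlate2d_valid(image, vertical_edge_kernel)
print(feature_map)  # every row is [0. 3. 3. 0.]
```

The output is non-zero only where the window straddles the boundary between the dark and bright halves, so the filter "fires" on the vertical edge and stays silent on flat regions.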

The experiments by David Hunter Hubel & Torsten Nils Wiesel can be found in the following YouTube video (Credits: Ali Moeeny).

In this series of articles, I will try to implement a rudimentary Convolutional Neural Network using NumPy.

Input Data

The input used here will be Kannada digits sourced from the Kannada-MNIST dataset on Kaggle. Kannada is a Dravidian language spoken by over 45 million people. The Kannada digits are as follows:

Input Data Processing

The input is sourced from a CSV file that contains the flattened version of the images. The data preprocessing involves converting each of these entries to a 28×28 array of pixel values.

**Each entry (row) is converted to a 28×28 array**
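As a small illustration of this conversion, the sketch below reshapes one flattened row into a 28×28 grid. The array `flat_row` is a made-up stand-in for a row of pixel values, and the sketch assumes the pixels are stored in row-major order, which is what a plain `reshape(28, 28)` implies.

```python
import numpy as np

# Hypothetical flattened image: 784 values standing in for one CSV row
flat_row = np.arange(784)

# Row-major reshape: the first 28 values become the first image row, and so on
image = flat_row.reshape(28, 28)

# Pixel k of the flat row lands at position (k // 28, k % 28) in the image
k = 60
assert image[k // 28, k % 28] == flat_row[k]
print(image.shape)  # (28, 28)
```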

The code below creates the training and test datasets:

import pandas as pd
import numpy as np


np.random.seed(42)
## Import the data set
data = pd.read_csv('../input/Kannada-MNIST/train.csv')
data['row_number'] = range(0,data.shape[0])
## Shuffling the data
data = data.sample(frac=1,random_state=42)
tmp = pd.DataFrame()
## Getting a balanced dataset with 600 entries per class
for label in range(10):
    if label==0:
        tmp = data[data['label']==label].head(600)
    else:
        temp = data[data['label']==label].head(600)
        tmp = pd.concat([tmp,temp])
data_train = tmp
row_numbers_in_train_set = tmp['row_number'].values
test_set = data.loc[~data['row_number'].isin(row_numbers_in_train_set)]
## Create one hot encoding
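## dtype=int keeps the encoding as 0/1 integers (recent pandas versions default to booleans)
one_hot = pd.get_dummies(data_train['label'].unique(), dtype=int)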
one_hot['label'] = one_hot.index

data_train = pd.merge(data_train,one_hot)
data_test = test_set.sample(frac=1)
tmp = pd.DataFrame()
## Getting a balanced test set with 120 entries per class
for label in range(10):
    if label==0:
        tmp = data_test[data_test['label']==label].head(120)
    else:
        temp = data_test[data_test['label']==label].head(120)
        tmp = pd.concat([tmp,temp])
data_test = tmp
data_test = pd.merge(data_test,one_hot)
data_train.drop('label',axis=1,inplace=True)

data_test.drop('label',axis=1,inplace=True)

## Create the train and test set and normalize the inputs
X_train = np.array(data_train.drop([0,1,2,3,4,5,6,7,8,9,'row_number'],axis=1).values)/255
y_train = np.array(data_train[[0,1,2,3,4,5,6,7,8,9]].values)
X_test = np.array(data_test.drop([0,1,2,3,4,5,6,7,8,9,'row_number'],axis=1).values)/255
y_test = np.array(data_test[[0,1,2,3,4,5,6,7,8,9]].values)

The flattened data is imported to create a balanced training set of 6,000 entries (600 per class) and a balanced test set of 1,200 entries (120 per class), drawn from rows that are not in the training set. The pixel values range from 0 to 255 (greyscale) and are normalized to the range [0, 1] by dividing by the maximum value (255).
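The balanced sampling and normalization steps can also be sketched more compactly with `groupby(...).head(n)`. This is an alternative to the loop used in this article, not a copy of it. The DataFrame `toy` is a made-up stand-in for the real CSV, with 3 classes and 4 pixel columns, and its column names are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-in for train.csv: 3 classes, 10 rows each, 4 "pixels" per row
toy = pd.DataFrame(rng.integers(0, 256, size=(30, 4)),
                   columns=[f"pixel{i}" for i in range(4)])
toy.insert(0, "label", np.repeat([0, 1, 2], 10))

# Shuffle, then keep the first 5 rows of each class
shuffled = toy.sample(frac=1, random_state=42)
balanced = shuffled.groupby("label").head(5)

# Scale pixel values from [0, 255] to [0, 1]
X = balanced.drop(columns="label").to_numpy() / 255

print(balanced["label"].value_counts().sort_index().tolist())  # [5, 5, 5]
print(X.shape)  # (15, 4)
```

One difference from the loop version: `groupby(...).head(n)` keeps the rows in their shuffled order, whereas concatenating class by class leaves the rows grouped by label.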

The output classes are changed to one-hot encoding representations.
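For readers who want to see what a one-hot representation looks like, here is a minimal NumPy-only sketch. It uses `np.eye` indexing instead of the `pd.get_dummies` and merge approach used in this article, and the `labels` array is a made-up example.

```python
import numpy as np

labels = np.array([3, 0, 9, 3])  # hypothetical digit labels
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype=int)[labels]

print(one_hot.shape)  # (4, 10)
print(one_hot[0])     # [0 0 0 1 0 0 0 0 0 0]

# argmax recovers the original labels
assert (one_hot.argmax(axis=1) == labels).all()
```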

The following code reshapes the flattened input vectors into 28×28 NumPy arrays, stored with shape (N, 1, 28, 28), i.e. one single-channel image per entry:

X_train = X_train.T
y_train = y_train.T

X_test = X_test.T
y_test = y_test.T
X_train_reshape = np.zeros((X_train.shape[1],1,28,28))

for i in range(X_train.shape[1]):
    temp = X_train[:,i]
    temp = np.ravel(temp)
    temp = temp.reshape(28,28)
    X_train_reshape[i,0,:,:] = temp
    
X_train= X_train_reshape  

X_test_reshape = np.zeros((X_test.shape[1],1,28,28))

from matplotlib import pyplot as plt

for i in range(X_test.shape[1]):
    temp = X_test[:,i]
    temp = np.ravel(temp)
    temp = temp.reshape(28,28)
    X_test_reshape[i,0,:,:] = temp

X_test= X_test_reshape
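The per-image loops above can likely be replaced by a single `reshape` call. The sketch below checks this on random data: `X_flat` is a made-up stand-in with the same (784, N) layout that `X_train` has after the transpose, and `n_samples` is an arbitrary small number chosen for the check.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 5

# Hypothetical stand-in with the same layout as X_train after .T: (784, N)
X_flat = rng.random((784, n_samples))

# Loop version, mirroring the code above
X_loop = np.zeros((n_samples, 1, 28, 28))
for i in range(n_samples):
    X_loop[i, 0, :, :] = X_flat[:, i].reshape(28, 28)

# Vectorized version: transpose back to (N, 784), then reshape in one step
X_vec = X_flat.T.reshape(n_samples, 1, 28, 28)

print(np.array_equal(X_loop, X_vec))  # True
```

Both versions produce the same (N, 1, 28, 28) array. The vectorized form avoids the Python-level loop, which matters little for 6,000 images but keeps the code shorter.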