Simple CNN using NumPy: Part I (Introduction & Data Processing)

Pradeep Adhokshaja
Jun 20 · 5 min read


Convolutional Neural Networks (CNNs) are a class of neural networks that work well with grid-like data, such as images. They extract useful features from images to make the image recognition process more robust. These networks are inspired by the results of experiments conducted by David Hunter Hubel & Torsten Nils Wiesel who observed different neural activity in the cat’s brain in response to different orientations of a straight line.

First Convolutional Networks

The first convolutional neural network was the Neocognitron, implemented by Dr Kunihiko Fukushima in 1980. This system used a hierarchical structure to learn simple & complex features of an image. An unsupervised learning procedure was used to recognize handwritten characters.

These ideas were improved upon by Dr Yann LeCun and his team in the 1990s, to include back propagation to recognize handwritten postal codes (Le-Net5). These types of implementations have led to drastic improvements in image recognition tasks.

CNNs have the ability to detect useful features (edges, horizontal lines, vertical lines, curvature, etc) from an image, which is akin to different neurons firing in the brain for different orientations of an image. This feature is made possible by the convolutional layer.

The experiments by David Hunter Hubel & Torsten Nils Wiesel can be found in the following Youtube Video (Credits: Ali Moeeny)

In these series of articles, I will try to implement a rudimentary Convolutional Neural Network using NumPy.

Input Data

The input used here will be Kannada Digits sourced from the Kannada Digits MNIST data repository in Kaggle. Kannada is a Dravidian language, which is spoken by over 45 million people. The Kannada Digits are as follows;

Input Data Processing

The input is sourced from a CSV file, that contains the flattened version of the images. The data preprocessing involves converting each of these entries to 28X28 arrays of pixel values.

The code below creates the training dataset

The flattened data is imported to create a training data set of 6000 entries and a test dataset of 1000 entries. These pixel entries range from 0 to 255 (greyscale). These entries are normalized by dividing by the max value (255).

The output classes are changed to one-hot encoding representations.

The following re-shapes the input vectors to 28X28 NumPy arrays

Some of the re-shaped arrays are as follows

After the processing, the input train data set now has the dimensions (6000,1,28,28) and the test data has (1000,1,28,28).

CNN Architecture

After data processing, the images, that are in the form of NumPy arrays, are passed through a series of layers as follows.

Let’s assume that we are passing a single image of dimension (1,1,28,28). The structure of the neural network will then be , as follows

  • Input Layer (1,1,28,28)
  • Convolutional Filters (2,1,5,5)
  • Max Pool Layer (2x2)
  • Fully Connected Layer (1,288)
  • Second Fully Connected Layer (1,60)
  • Output Layer (1,10)

The above diagram shows the rough “blueprint” of the network. I will explain convolutional filters and the convolutional operation in the next post.

Thanks for reading! Please feel free to e-mail me at in case of feedbacks/queries. I will do my best to get back to them.


Next Post

Convolution Operation

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data…

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem

Pradeep Adhokshaja

Written by

Data Scientist @Philips. Passionate about ML,Statistics & hiking. Come say 👋LinkedIn:

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem