Training a Convolutional Neural Network from scratch
A simple walkthrough of deriving backpropagation for CNNs and implementing it from scratch in Python.
In this post, we’re going to do a deep-dive on something most introductions to Convolutional Neural Networks (CNNs) lack: how to train a CNN, including deriving gradients, implementing backprop from scratch (using only numpy), and ultimately building a full training pipeline!
This post assumes a basic knowledge of CNNs. My introduction to CNNs covers everything you need to know, so I’d highly recommend reading that first. If you’re here because you’ve already read that, welcome back!
Parts of this post also assume a basic knowledge of multivariable calculus. You can skip those sections if you want, but I recommend reading them even if you don’t understand everything. We’ll incrementally write code as we derive results, and even a surface-level understanding can be helpful.
1. Setting the Stage
Buckle up! Time to get into it.
We’ll pick back up where my introduction to CNNs left off. We were using a CNN to tackle the MNIST handwritten digit classification problem:
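As a refresher, here's a minimal numpy sketch of that setup: a 28x28 grayscale digit passed through a 3x3 conv layer (8 filters), a 2x2 max pool, and a softmax over the 10 digit classes. The function names and random initialization here are illustrative placeholders, not the exact code from the intro post.

```python
import numpy as np

np.random.seed(0)

def conv3x3(image, filters):
    # Valid convolution: (28, 28) -> (26, 26, num_filters)
    h, w = image.shape
    num_filters = filters.shape[0]
    out = np.zeros((h - 2, w - 2, num_filters))
    for i in range(h - 2):
        for j in range(w - 2):
            region = image[i:i + 3, j:j + 3]
            out[i, j] = np.sum(region * filters, axis=(1, 2))
    return out

def maxpool2(x):
    # 2x2 max pooling: (26, 26, f) -> (13, 13, f)
    h, w, f = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, f).max(axis=(1, 3))

def softmax_layer(x, weights, biases):
    # Flatten, apply an affine transform, then softmax -> 10 class probabilities
    totals = x.flatten() @ weights + biases
    exp = np.exp(totals - totals.max())
    return exp / exp.sum()

# Placeholder input and randomly initialized parameters (for shape-checking only)
image = np.random.rand(28, 28)
filters = np.random.randn(8, 3, 3) / 9
weights = np.random.randn(13 * 13 * 8, 10) / (13 * 13 * 8)
biases = np.zeros(10)

probs = softmax_layer(maxpool2(conv3x3(image, filters)), weights, biases)
print(probs.shape)  # (10,)
```

The forward pass is all we had by the end of the intro post; the rest of this post is about computing gradients for these layers and using them to train.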

