Using PCA to Reduce the Number of Parameters in a Neural Network by 30x
While still getting even better performance! — Neural Networks and Deep Learning Course: Part 17
In the previous article, we created a Multilayer Perceptron (MLP) classifier model to identify handwritten digits.
We used two hidden layers with 256 neurons each for the network architecture. Even with such a small network, we got 269,322 total parameters (weights and bias terms). The main reason for such a large number of parameters is the size of the input layer.
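As a quick check, we can reproduce that figure by hand. The sketch below assumes the architecture from the previous part: a 784-neuron input layer, two hidden layers of 256 neurons, and a 10-neuron output layer for the ten digit classes.

```python
# Parameter count for a fully connected 784-256-256-10 MLP
# (architecture assumed from the previous part of this series)
layer_sizes = [784, 256, 256, 10]

total_params = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    # each dense layer has n_in * n_out weights plus n_out bias terms
    total_params += n_in * n_out + n_out

print(total_params)  # -> 269322
```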
Because the input layer of an MLP takes 1D tensors, we need to reshape 2-dimensional MNIST image data into 1-dimensional data. This process is technically called flattening the image.
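As a rough illustration, flattening a single 28 x 28 MNIST image with NumPy might look like the sketch below (the zeros array is just a stand-in for a real image from the dataset).

```python
import numpy as np

# stand-in for a single 28 x 28 MNIST image (real data would come from the dataset)
image = np.zeros((28, 28))

# flattening reshapes the 2D image into a 1D vector of 28 * 28 = 784 values,
# one value per neuron in the MLP's input layer
flattened = image.reshape(-1)

print(flattened.shape)  # -> (784,)
```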
Each pixel in the image represents one input. If there are 784 pixels in the image, we need 784 neurons in the input layer of the MLP. As the input layer grows, the total number of parameters in the network grows with it. That’s why MLPs are not parameter efficient. The size of the input layer increases significantly when we use high-pixel…