What Exactly Is the “Gaussian Connection”?
The Gaussian connection was probably inspired by the “Feature Extractor”, a traditional image recognition technique.
The development of image recognition techniques can be divided into three stages:
- Traditional image recognition
- Multilayer neural network
- Convolutional neural network
Here, let’s dive into the first stage. The traditional image recognition technique consists of a “Feature Extractor” and a “Trainable Classifier”. The feature extractor has to be designed by researchers: they build up experience by studying the sample images, and then apply that experience to design the extractor by hand. The limitation of such a “Feature Extractor” is that it is not reusable; for image sets with different features, researchers have to design a new feature extractor each time to keep recognition performance high. In fact, the “Gaussian Connection” takes a page from the “Feature Extractor”, and I will shed more light on this in the following explanation.
Gaussian Connection vs Full Connection
Here are my definitions of the Gaussian connection: from a macro perspective, it is a connection method that is applied only between the fully connected layer and the output layer; from a micro perspective, it is the calculation of the Euclidean radial basis function (RBF) with a set of hand-designed, fixed weights.
* Gaussian Connection
- The Gaussian connection is densely connected.
- The output value of the Gaussian connection is not passed through any activation function. Instead, the output layer emits it directly.
* Full Connection
- The full connection is densely connected.
- Generally, the output value of the fully connected layer is passed through an activation function.
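The contrast above can be sketched in a few lines of NumPy. This is only an illustration under assumptions of mine: the function names, the choice of tanh as the activation, and the constant input and weight values are all made up for the example and are not taken from LeNet-5.

```python
import numpy as np

def full_connection(x, weights, bias):
    # Weighted sum, then an activation function (tanh is an assumption here).
    return np.tanh(weights @ x + bias)

def gaussian_connection(x, weights):
    # Squared Euclidean distance to each fixed weight vector; no activation.
    return np.sum((x - weights) ** 2, axis=1)

# Illustrative values only: 84 inputs, 10 output neurons.
x = np.full(84, 0.5)
weights = np.full((10, 84), 0.1)
bias = np.zeros(10)

fc_out = full_connection(x, weights, bias)   # squashed by tanh
rbf_out = gaussian_connection(x, weights)    # raw, non-negative distances
```

Both layers connect every input to every output neuron; the difference is what each neuron computes (a weighted sum versus a distance) and whether an activation function follows.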
The Mathematical Principles of Gaussian Connection
In the LeNet-5 architecture, the Gaussian connection sits between F6 (i.e., the sixth layer, which is fully connected) and the Output layer. The mathematical principles of the Gaussian connection are explained below:
- x(j) (j = 0,1,2,…,83) is the output value of F6 (i.e., the sixth layer, which is fully connected). These 84 values can be reshaped into a matrix with 12 rows and 7 columns.
- w(ij) (i = 0,1,2,…,9 and j = 0,1,2,…,83) are the hand-designed, fixed weights of the Gaussian connection mentioned earlier. Taking w(1j) as an example, this weight vector can also be reshaped into a 12 x 7 matrix. Together, these weights (i.e., w(0j), w(1j), w(2j), …, w(9j)) form the “core” of the Gaussian connection, with one weight vector assigned to each of the 10 neurons in the output layer.
For each output neuron i, the weights w(ij) are subtracted from the F6 outputs x(j), and the squared differences are summed over j (i.e., the calculation of the Euclidean radial basis function). Writing y(i) for the output of neuron i, this is y(i) = Σ_j (x(j) − w(ij))². The output layer then emits this result directly. This is the whole process of the Gaussian connection.
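The subtract, square, and sum steps can be written out as a short NumPy sketch. The weights below are placeholders I constructed so that the ten rows differ from one another; they are not the actual LeNet-5 weights. Treating the neuron with the smallest output as the best match is also an interpretation added here.

```python
import numpy as np

def gaussian_connection(x, w):
    """Return y with y[i] = sum_j (x[j] - w[i, j]) ** 2."""
    x = np.asarray(x, dtype=float)    # F6 output, shape (84,)
    w = np.asarray(w, dtype=float)    # fixed weights, shape (10, 84)
    diff = x - w                      # broadcasts x against every row of w
    return np.sum(diff ** 2, axis=1)  # one squared distance per output neuron

# Placeholder weights (NOT the LeNet-5 values): every entry is +1 except a
# distinct block of eight -1 entries per row, so all ten rows differ.
w = np.ones((10, 84))
for i in range(10):
    w[i, i * 8:(i + 1) * 8] = -1.0

x = w[3].copy()                # pretend F6 reproduced weight vector 3 exactly
y = gaussian_connection(x, w)  # y[3] is 0; every other entry is larger
best_match = int(np.argmin(y))
```

Because the output is a distance, a smaller value means the F6 output is closer to that neuron's weight vector.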
The Answers for Some Important Questions
- Q: Why should the weights of the Gaussian connection (i.e., w(ij)) be hand-designed and fixed?
The Gaussian connection made its debut in the LeNet-5 deep learning model, which was built to recognize handwritten Arabic numerals: there are only 10 patterns in total (i.e., ‘0’, ‘1’, ‘2’, …, ‘9’). The number of patterns is very small, and humans can easily recognize and learn their features. The authors of LeNet-5 were likely inspired by this observation when they designed the 10 weight vectors described above. If we reshape each weight vector of the Gaussian connection into a matrix with 12 rows and 7 columns, this is what they look like:
As expected, the weights of the Gaussian connection are nothing but pixel matrices that mimic each Arabic numeral. To summarize, my personal view is that the Gaussian connection lets human knowledge supply part of the patterns that the neural network would otherwise have to learn.
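To make the “pixel matrix” idea concrete, here is a small sketch that turns a hand-drawn 12 x 7 bitmap into an 84-element weight vector. The bitmap is one I drew for illustration and is not the one used in LeNet-5. The original paper describes the weight components as being set to -1 or +1; mapping stroke pixels to +1 and background pixels to -1 is an assumption of this example.

```python
import numpy as np

# Hypothetical stylized bitmap for the digit "1" ('#' = stroke pixel).
# Hand-drawn for this example; not the bitmap used in LeNet-5.
ONE = [
    "...#...",
    "..##...",
    ".###...",
    "...#...",
    "...#...",
    "...#...",
    "...#...",
    "...#...",
    "...#...",
    "...#...",
    ".#####.",
    ".......",
]

def bitmap_to_weights(rows):
    # Assumed mapping: stroke -> +1, background -> -1.
    grid = np.array([[1.0 if ch == "#" else -1.0 for ch in row] for row in rows])
    return grid.reshape(-1)  # flatten 12 x 7 into 84 values

w_one = bitmap_to_weights(ONE)

# An F6 output that differs from the bitmap in a single position
# gives a small squared distance: (1 - (-1)) ** 2 = 4.
x = w_one.copy()
x[0] = -x[0]
distance = float(np.sum((x - w_one) ** 2))
```

Reshaping `w_one` back with `w_one.reshape(12, 7)` recovers the picture of the digit, which is the sense in which the fixed weights encode human knowledge of what each numeral looks like.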
- Q: What exactly is the connection between the feature extractor and the Gaussian connection?
We now know that the “core” of the Gaussian connection (i.e., w(0j), w(1j), w(2j), …, w(9j)) consists of pixel matrices that mimic the Arabic numerals. The way the authors created the Gaussian connection is similar to the way other researchers developed feature extractors: both techniques rely on human experience gained from studying and recognizing the sample images.
Reference and Image Source
LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278–2324.