SpinalNet: Deep Neural Network with Gradual Input

A Neural Network Mimicking Human Nervous System

Ratnam Parikh
VisionWizard
7 min read · Aug 11, 2020



AI winters were not due to imagination traps, but due to lack of imagination. Imaginations bring order out of chaos. Deep learning with deep imagination is the road map to AI springs and AI autumns.

Advancements in the field of Deep Neural Networks (DNNs) have helped us achieve many milestones over time and have brought machines closer to human performance.

Alongside the many advantages of DNNs, there are issues that are being addressed now and will continue to be addressed in the future. DNNs usually take a large number of input features, since considering more features usually improves prediction accuracy. This makes the size of the first hidden layer critical: a small first hidden layer fails to propagate all input features properly, while a large one drastically increases the number of weights. Another limitation of traditional DNNs is the vanishing gradient: when the network has many layers, the gradient is large at neurons near the output but becomes negligible at neurons near the input, which makes training difficult.

Table Of Contents

  1. Introduction
  2. Theoretical Background
  3. Proposed SpinalNet
  4. Universal Approximation of Proposed SpinalNet
  5. Results
  6. Future Prospects of SpinalNet
  7. Conclusion
  8. References

1. Introduction

  • The human brain receives a lot of information from different parts of the body. The brain senses pressure, heat, vibrations, complex textures, hardness, state of matter, etc.
  • The human spinal cord receives the sense of touch from different parts of the body at different locations along its length.
  • The next figure presents a simplified, rough view of the connections between human touch sensors and the spinal cord.
Figure 1: Connections of the Spinal Cord in the human body (Source: [1])
  • The proposed SpinalNet architecture works on the principle of gradual input; in this way, neural networks can achieve behavior similar to that of the spinal cord and the human brain.
  • SpinalNet tries to overcome common DNN problems such as heavy computation, vanishing gradients, and an excessively large number of connections and layers.

2. Theoretical Background

2.1 Human Somatosensory System and the Spinal Cord

  • The authors of SpinalNet mimic only a few attributes of the human somatosensory system, as this system is still not fully understood.
  • SpinalNet tries to mimic the following:

— Gradual input and nerve plexus

— Voluntary and involuntary actions

— Attention to pain intensity

  • Sensory neurons reach the spinal cord through a complex network known as the nerve plexus.
  • The following figure shows a part of the nerve plexus.
Figure 2: Nerve Plexus (source: [1])
  • A single vertebra does not receive all the information; the tactile sensory network consists of millions of sensors.
  • Moreover, our tactile system is more stable than the visual or auditory systems, as there are far fewer ‘touch-blind’ patients than blind ones.
  • The nerve plexus network sends all tactile signals to the spinal cord gradually. Different locations along the spinal cord receive pain from the leg and from the hand.
  • Neurons in the vertebrae transfer the sense of touch to the brain and may also initiate actions.
  • Our brain can control the spinal neurons to increase or decrease the perceived pain intensity.
  • Sensory neurons may also convey information to the lower motor neurons before receiving instructions from the brain; these are called involuntary, or reflex, movements.

3. Proposed SpinalNet

  • SpinalNet is designed to share commonalities with the workings of the human spinal cord.
  • The similarities are as follows:

— Gradual Input

— Local output and probable global influence

— Weights reconfigured during training

  • Like the spinal cord, SpinalNet takes its inputs gradually.
  • All layers of the proposed model contribute to a local output, which can be compared to a reflex; a modulated portion of the input is also sent to the global output, which can be compared to the brain.
  • The following figure is the proposed SpinalNet architecture.
Figure 3: SpinalNet Architecture (source: [1])
  • The proposed architecture consists of an input row, an intermediate row, and an output row.
  • First, the input is split and sent to the intermediate row, which consists of multiple hidden layers.
  • In the model above, each hidden layer of the intermediate row consists of two neurons, and each hidden layer of the output row also consists of two neurons.
  • Both the number of intermediate neurons and the number of inputs per layer are usually kept small to reduce the number of multiplications. Because these numbers are low, the network may underfit.
  • To overcome this issue, each layer also receives the output of the previous layer. By repeating the input, a feature that fails to influence the output through one hidden layer may still influence it through a later one.
  • The hidden layers of the intermediate row use a non-linear activation function, while the hidden layer of the output row uses a linear activation function. A minimal code sketch of this structure follows below.
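To make the gradual-input idea concrete, here is a minimal PyTorch sketch of a spinal hidden layer. The class name SpinalLayer, the layer widths, and the alternating half-split of the input are illustrative assumptions rather than the authors' exact design; the official implementation is available at [2].

```python
import torch
import torch.nn as nn

class SpinalLayer(nn.Module):
    """Illustrative spinal layer: the input is split into two halves, and
    sub-layer i sees one half plus the previous sub-layer's output."""

    def __init__(self, in_features, sub_width, num_sublayers, num_classes):
        super().__init__()
        self.half = in_features // 2
        layers = []
        for i in range(num_sublayers):
            prev = 0 if i == 0 else sub_width  # the first sub-layer has no predecessor
            layers.append(nn.Sequential(nn.Linear(self.half + prev, sub_width),
                                        nn.ReLU()))
        self.sublayers = nn.ModuleList(layers)
        # the output row sees the concatenated outputs of all sub-layers
        # and uses a linear activation (a plain nn.Linear)
        self.out = nn.Linear(sub_width * num_sublayers, num_classes)

    def forward(self, x):
        outs, prev = [], None
        for i, layer in enumerate(self.sublayers):
            # alternate halves so every input feature is presented repeatedly
            chunk = x[:, :self.half] if i % 2 == 0 else x[:, self.half:2 * self.half]
            prev = layer(chunk if prev is None else torch.cat([chunk, prev], dim=1))
            outs.append(prev)
        return self.out(torch.cat(outs, dim=1))

# e.g. a 4-sub-layer spinal classifier for flattened 28x28 MNIST digits
net = SpinalLayer(in_features=784, sub_width=20, num_sublayers=4, num_classes=10)
logits = net(torch.randn(32, 784))  # -> shape (32, 10)
```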

4. Universal Approximation of Proposed SpinalNet

Universal Approximation Theorem: The aim of a neural network is to map attributes (x) to an output (y), which can be written mathematically as a function y = f(x). The function f can be arbitrarily complex, as long as it maps x to y. The Universal Approximation Theorem states that for any such continuous function there always exists a single-hidden-layer neural network, with enough neurons, that approximates it to any desired accuracy, for any number of inputs and outputs.
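For reference, this is the classic single-hidden-layer form of the theorem (Cybenko, 1989); the notation below is standard and is not taken from [1]:

```latex
% For any continuous f on a compact domain and any eps > 0, there exist
% N, alpha_i, w_i, b_i (with sigmoidal activation sigma) such that
G(x) = \sum_{i=1}^{N} \alpha_i \,\sigma\!\left(w_i^{\top} x + b_i\right),
\qquad |G(x) - f(x)| < \varepsilon \ \text{ for all } x .
```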

  • The universal approximation theorem can be proved for SpinalNet using the following approach.

1. A single-hidden-layer NN of large width is a universal approximator.

2. If we can show that a SpinalNet of large depth is equivalent to a single-hidden-layer NN of large width, universal approximation follows.

Figure 4: Simplified versions of SpinalNet (source: [1])
  • The figure above shows how a simplified version of SpinalNet can be converted to a single-hidden-layer NN.
  • In Fig-4(a), the first-layer neurons are simplified by giving them linear activation functions, so the first layer only computes weighted sums of inputs x1 to x5.
  • From the first layer, each output goes only to the corresponding neuron in the second layer. All cross-connections between neurons of the two layers, and the first layer's connections to the output layer, are disconnected by assigning them zero weights.
  • The second layer therefore receives a weighted sum of x6 to x10 as one input and a weighted sum of x1 to x5 from the output of the first layer.
  • As a result, the second layer effectively applies its activation function to a weighted combination of all data points x1 to x10. These two layers combined are equivalent to a neural network containing a single hidden layer with two neurons, as shown in Fig-4(b).
  • A simplified SpinalNet with 4 hidden layers of 2 neurons each is shown in Fig-4(d); by the same argument, it is equivalent to a neural network with one hidden layer of 4 neurons.
  • From the points discussed above, we can conclude that a SpinalNet of large depth can be made equivalent to a NN with a single hidden layer containing a large number of neurons.
  • Since a NN with a single hidden layer containing a large number of neurons achieves universal approximation, a SpinalNet of large depth achieves universal approximation in the same way. The numerical sketch below checks this equivalence on the simplified network.
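The equivalence argument can be checked numerically. Below is a small NumPy sketch of the Fig-4(a) construction; the weight shapes and the tanh activation are hypothetical choices for illustration. It builds the two-layer simplified SpinalNet (linear first layer, cross-connections zeroed) and verifies that its hidden activations match those of a single-hidden-layer network with merged weights.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10)           # inputs x1..x10

# Simplified SpinalNet (Fig-4(a)): the first layer is linear on x1..x5;
# the second layer applies tanh to x6..x10 plus the first layer's output.
A = rng.normal(size=(2, 5))       # first-layer weights (linear activation)
B = rng.normal(size=(2, 5))       # second-layer weights on x6..x10
C = rng.normal(size=(2, 2))       # connections from layer 1 to layer 2
h1 = A @ x[:5]                    # linear, so it just forwards a weighted sum
h2 = np.tanh(B @ x[5:] + C @ h1)  # the only nonlinear activations

# Equivalent single-hidden-layer network (Fig-4(b)): merge the weights so
# each of the two hidden neurons sees all ten inputs at once.
W = np.hstack([C @ A, B])         # shape (2, 10)
h_single = np.tanh(W @ x)

print(np.allclose(h2, h_single))  # True: the two networks compute the same thing
```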

5. Results

  • The authors of [1] have verified SpinalNet on both classification and regression problems.
  • The popular MNIST and CIFAR datasets are used for the classification experiments.
  • For a traditional NN with about 300 hidden neurons, a SpinalNet with the same number of hidden neurons cuts the number of multiplications by 35.5%: the traditional NN performs 21,700 multiplications, while the SpinalNet performs about 14,000 (a counting sketch follows this list).
Figure 5: SpinalNet (Arch2) (source: [1])
  • The figure above shows the second proposed architecture, which uses 3 SpinalNets; this model is used for the classification problems.
Figure 6: SpinalNet Classification Results (source: [1])
  • The table above lists the dataset, base model, SpinalNet model specifications, number of epochs, accuracy, and the gain or loss in accuracy relative to state-of-the-art models.
  • A more detailed account of which model SpinalNet outperforms on which dataset is given in Section 3 (Results) of [1].
Figure 7: Convert Traditional Hidden Layer to Spinal Hidden Layer (source: [1])
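The savings in multiplications come from the smaller fan-in of each sub-layer: every sub-layer multiplies only a slice of the input plus the previous sub-layer's output, rather than the full input. The counting sketch below illustrates the formula; the layer sizes are hypothetical and do not reproduce the paper's exact 21,700 vs. 14,000 configuration.

```python
def mults_traditional(n_in, hidden, n_out):
    # dense hidden layer: every input reaches every hidden neuron
    return n_in * hidden + hidden * n_out

def mults_spinal(n_in, sub_width, num_sublayers, n_out):
    # each sub-layer sees half the input plus the previous sub-layer's output
    half = n_in // 2
    total = half * sub_width                                       # first sub-layer
    total += (num_sublayers - 1) * (half + sub_width) * sub_width  # later sub-layers
    total += num_sublayers * sub_width * n_out                     # output connections
    return total

# hypothetical sizes: 200 inputs, 100 hidden neurons in total, 10 outputs
print(mults_traditional(200, 100, 10))  # 21000
print(mults_spinal(200, 25, 4, 10))     # 12875, roughly a 39% reduction
```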

6. Future Prospects of SpinalNet

  • As this is the first paper on SpinalNet, the authors of [1] have proposed a few future prospects for it.
  • The prospects are listed below; for a brief discussion of each topic, refer to Section 4 (Prospects of SpinalNet) of [1]. A hypothetical transfer-learning sketch follows the list.
  1. Auto Dimension Reduction
  2. Transfer Learning
  3. Very Deep NN
  4. Spinal Hidden Layer
  5. Better Accuracy and New Datasets
  6. NN Ensemble and Voting
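Prospects 2 and 4 fit together naturally: keep a pretrained backbone and convert only its final dense layer into a spinal one, in the spirit of Figure 7. Below is a hypothetical sketch using torchvision's ResNet-18 and the illustrative SpinalLayer class sketched in Section 3; this is not the authors' released code (see [2] for that).

```python
from torchvision import models

# load a pretrained backbone and freeze its convolutional features
backbone = models.resnet18(pretrained=True)
for p in backbone.parameters():
    p.requires_grad = False

# ResNet-18 ends with a 512 -> 1000 dense layer; swap it for a spinal head
# (the SpinalLayer class defined earlier) and train only this new head
backbone.fc = SpinalLayer(in_features=512, sub_width=64,
                          num_sublayers=4, num_classes=10)
```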

7. Conclusion

  • The nervous system and spinal cord of a human being have a unique way of sensing information and locating its source.
  • SpinalNet, proposed in [1], mimics functions of the brain and spinal cord, which ultimately helps deep neural networks work with less computation and respond faster.

8. References

[1] Kabir, H. M. Dipu, et al. “SpinalNet: Deep Neural Network with Gradual Input.” arXiv preprint arXiv:2007.03347 (2020).

[2] GitHub repository for SpinalNet: https://github.com/dipuk0506/SpinalNet
