GSoC 2018: Final Work Report on LSTM Networks

Describes the project status, future work, and my experience working with CERN mentors.


In this blog post, I’ll summarise my GSoC 2018 project work. I worked on the project ‘Recurrent Neural Networks and LSTM on GPUs for Particle Physics Applications’. More details of the project can be found here: https://summerofcode.withgoogle.com/projects/#4728727959764992

I’ve written blog posts about my project in the past. They are listed below:

  1. GSoC 2018: Starting with CERN — Part 1: https://medium.com/@harshit.prasad/google-summer-of-code-2018-the-beginning-5ed4b8504d18
  2. GSoC 2018: RNN and LSTM Networks — Part 2: https://medium.com/@harshit.prasad/gsoc-2018-rnn-and-lstm-networks-part-ii-96661bf24442
  3. GSoC 2018: Forward Propagation in LSTM Network — Part 3: https://medium.com/@harshit.prasad/gsoc-2018-forward-propagation-feature-in-lstm-part-iii-4b3363257d44
  4. GSoC 2018: Backpropagation through time in LSTM Network — Part 4: https://medium.com/@harshit.prasad/gsoc-2018-backpropagation-through-time-in-lstm-network-part-iv-7e33a7b3d729
  5. GSoC 2018: Performance of LSTM Network — Part 5: https://medium.com/@harshit.prasad/gsoc-2018-performance-of-lstm-network-part-v-73c452820dc5

Current State of the Work

The current LSTM work includes complete implementations of forward propagation and backpropagation. The LSTM layer was designed from scratch and currently supports the Reference and CPU architectures. GPU support is still in progress and is a low priority at present.

  1. Forward propagation is working, but the results are not the expected ones: the error in the forward-pass test is around 0.90.
  2. Backward propagation has been implemented following a design similar to the RNN backward pass. Since the forward pass does not yet produce the expected results, the backward pass does not either.
  3. The current design only updates the internal parameters: state weights, input weights, and biases.
  4. The error during backpropagation is currently too high: it lies in the range 0.64 to 1.32, whereas it is expected to lie in the range 10^(-7) to 10^(-13).
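For reference, the forward pass implements the standard LSTM cell equations. Below is a minimal NumPy sketch of a single forward step; it is an illustration only, not the actual TMVA C++ code, and the packed [input, forget, candidate, output] gate layout and function names are my own assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM timestep.

    W: input weights, shape (4H, D); U: state (recurrent) weights,
    shape (4H, H); b: biases, shape (4H,). The four row blocks hold
    the input, forget, candidate, and output gates (ordering assumed).
    """
    H = h_prev.shape[0]
    a = W @ x_t + U @ h_prev + b        # pre-activations for all four gates
    i = sigmoid(a[0*H:1*H])             # input gate
    f = sigmoid(a[1*H:2*H])             # forget gate
    g = np.tanh(a[2*H:3*H])             # candidate cell update
    o = sigmoid(a[3*H:4*H])             # output gate
    c_t = f * c_prev + i * g            # new cell state
    h_t = o * np.tanh(c_t)              # new hidden state
    return h_t, c_t
```

A full forward pass simply iterates this step over the input sequence, carrying h and c from one timestep to the next.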

Future Work

The design of the LSTM layer and its tests was implemented by taking torch-rnn in Lua as a reference. The repository can be found here: https://github.com/jcjohnson/torch-rnn. Future work on the LSTM layer design is highlighted below:

  1. The backpropagation design has to be improved by updating the values of each gate at every timestep; a sketch of these per-gate gradients is given after this list. The related blog post is Part 4, linked above.
  2. Decreasing the error during the forward pass, which will in turn improve backpropagation.
  3. Implementing a variant of the LSTM, i.e. the GRU layer design (also sketched after this list).
  4. A test to check the full working of the LSTM network, similar to this numerical example for verification purposes: https://medium.com/@aidangomez/let-s-do-this-f9b699de31d9
  5. Training and testing the network on the ECAL dataset.
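To make the first item concrete, here is a minimal NumPy sketch of the backward step for one timestep, with the gradient of every gate computed explicitly. It assumes the same packed [input, forget, candidate, output] layout as the forward sketch above and standard BPTT equations, so it is an illustration rather than the code planned for the PR:

```python
import numpy as np

def lstm_backward_step(dh_t, dc_t, cache):
    """Per-timestep LSTM gradients, given dL/dh_t and dL/dc_t flowing
    back from later timesteps. `cache` holds values saved during the
    corresponding forward step."""
    x_t, h_prev, c_prev, i, f, g, o, c_t, U = cache
    tanh_c = np.tanh(c_t)

    do = dh_t * tanh_c                        # output gate
    dc = dc_t + dh_t * o * (1.0 - tanh_c**2)  # cell state (both paths)
    di = dc * g                               # input gate
    dg = dc * i                               # candidate
    df = dc * c_prev                          # forget gate
    dc_prev = dc * f                          # carried to timestep t-1

    # back through the gate nonlinearities, in the packed [i, f, g, o] order
    da = np.concatenate([di * i * (1.0 - i),
                         df * f * (1.0 - f),
                         dg * (1.0 - g**2),
                         do * o * (1.0 - o)])

    dW = np.outer(da, x_t)     # input-weight gradient for this timestep
    dU = np.outer(da, h_prev)  # state-weight gradient for this timestep
    db = da                    # bias gradient for this timestep
    dh_prev = U.T @ da         # carried to timestep t-1
    return dh_prev, dc_prev, dW, dU, db
```

Summing dW, dU, and db over all timesteps gives the full parameter gradients. Comparing them against finite-difference estimates is the usual way such gradients are tested, and the 10^(-7) to 10^(-13) error range quoted above is typical of such checks.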
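For the third item, a single GRU step under the same conventions looks roughly like this (one common formulation; the gate ordering and names are again my own assumptions, not the planned TMVA design):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_forward_step(x_t, h_prev, W, U, b):
    """One GRU timestep. W: input weights (3H, D); U: state weights
    (3H, H); b: biases (3H,). Row blocks: reset, update, candidate."""
    H = h_prev.shape[0]
    r = sigmoid(W[:H] @ x_t + U[:H] @ h_prev + b[:H])              # reset gate
    z = sigmoid(W[H:2*H] @ x_t + U[H:2*H] @ h_prev + b[H:2*H])     # update gate
    n = np.tanh(W[2*H:] @ x_t + U[2*H:] @ (r * h_prev) + b[2*H:])  # candidate state
    return (1.0 - z) * n + z * h_prev                              # new hidden state
```

Compared to the LSTM, the GRU needs only three gate blocks and no separate cell state, which makes both its forward pass and its backward pass somewhat simpler.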

Conclusion and Experience

All of the work related to this project can be found here: https://github.com/tmvadnn/root/pull/7

I’ve gained a lot of experience in the field of deep learning, as this was my first internship in machine learning. It was a great experience working with my CERN mentors. Special mention to Lorenzo Moneta, Saurav Shekhar, and Kim Albertsson for giving me the opportunity to work on this project and for all their help and suggestions during our weekly meetings. It was also a great experience working with my project partners: Siddhartha, Ravi, Manos, and Anushree.