Zafarali Ahmed
Aug 15, 2017 · 1 min read


Thanks for reading. The step function is applied at each time step (technically, this is a for loop).

Regarding point #1, _stm has shape (batch_size, timesteps, dim) because we use K.repeat to copy S_{t-1} once per input position, which effectively implements equation 1. Equation 1 is calculated for every character j in the input sequence using S_{t-1}. The purpose of _stm is to let us vectorize that calculation, so we can compute it for the whole input sequence at once instead of looping over j.
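To make the vectorization concrete, here is a minimal NumPy sketch, not the layer's actual code. It assumes equation 1 is an additive (Bahdanau-style) score of the form v_a · tanh(W_a s_{t-1} + U_a h_j), and it uses np.repeat as a stand-in for K.repeat, which turns a (batch_size, dim) tensor into (batch_size, timesteps, dim). The names scores_loop, scores_vectorized, W_a, U_a and v_a and all the sizes are made up for illustration.

```python
import numpy as np

def scores_loop(s_prev, h, W_a, U_a, v_a):
    """Score every input position j with an explicit Python loop."""
    batch, timesteps, _ = h.shape
    e = np.empty((batch, timesteps))
    for j in range(timesteps):
        # assumed form of equation 1 for a single position j
        e[:, j] = np.tanh(s_prev @ W_a + h[:, j, :] @ U_a) @ v_a
    return e

def scores_vectorized(s_prev, h, W_a, U_a, v_a):
    """Same scores, computed for all positions j at once."""
    timesteps = h.shape[1]
    # stand-in for K.repeat: (batch, dim) -> (batch, timesteps, dim)
    s_rep = np.repeat(s_prev[:, None, :], timesteps, axis=1)
    return np.tanh(s_rep @ W_a + h @ U_a) @ v_a

# Illustrative sizes only
rng = np.random.default_rng(0)
batch, timesteps, dim, enc_dim, att_dim = 2, 5, 4, 3, 6
s_prev = rng.normal(size=(batch, dim))          # previous decoder state
h = rng.normal(size=(batch, timesteps, enc_dim))  # encoder outputs
W_a = rng.normal(size=(dim, att_dim))
U_a = rng.normal(size=(enc_dim, att_dim))
v_a = rng.normal(size=(att_dim,))

e_loop = scores_loop(s_prev, h, W_a, U_a, v_a)
e_vec = scores_vectorized(s_prev, h, W_a, U_a, v_a)
```

Both functions return an array of shape (batch_size, timesteps). The repeated state is what lets the second version drop the loop over j.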

Regarding point #2, were you able to reconcile equations 4–6? The only for loop here is the one that applies step to every element in the input sequence (over index t).

Regarding point #3, this is a good idea! If you want to take a stab at it, please do make a PR, or you can open an issue and I’ll work on it soon :)

