Hi!
Thanks for reading. The step function is applied to each time step (technically, this is a for loop).
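
As a rough sketch of what "applied to each time step" means, here is a plain NumPy loop. The names toy_step and run_steps are made up for illustration, and the update rule inside toy_step is a placeholder, not the computation the layer actually performs.

```python
import numpy as np

def toy_step(x_t, state):
    # Placeholder update (an assumption for illustration only):
    # the new state is the average of the old state and the current input.
    new_state = 0.5 * (state + x_t)
    return new_state, new_state  # (output, next state)

def run_steps(step_fn, inputs, initial_state):
    # inputs has shape (batch_size, timesteps, dim); loop over the time axis.
    state = initial_state
    outputs = []
    for t in range(inputs.shape[1]):
        out, state = step_fn(inputs[:, t, :], state)
        outputs.append(out)
    # Stack the per-step outputs back into (batch_size, timesteps, dim).
    return np.stack(outputs, axis=1), state

x = np.ones((2, 3, 4))
outputs, final_state = run_steps(toy_step, x, np.zeros((2, 4)))
print(outputs.shape)
```

The state returned by one call is fed into the next call, which is why this part cannot be vectorized over t.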
Regarding point #1, _stm is of size (batch_size, timesteps, dim) because we use K.repeat to effectively implement equation 1: K.repeat copies the (batch_size, dim) state once per time step. Equation 1 is calculated for every character j in the input sequence using the previous state S_{t-1}. The purpose of _stm is to let us vectorize the calculation so that it runs over the whole input sequence at once.
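
To make the vectorization concrete, here is a small NumPy sketch. It uses np.repeat as a stand-in for K.repeat (which, per the Keras backend docs, maps (samples, dim) to (samples, n, dim)). The score formula tanh(s W + h_j U) v and the weight names W, U, v are assumptions for illustration and may not match equation 1 exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, timesteps, dim = 2, 5, 3

s_prev = rng.normal(size=(batch_size, dim))        # previous state S_{t-1}
h = rng.normal(size=(batch_size, timesteps, dim))  # one vector per input position j
W = rng.normal(size=(dim, dim))                    # hypothetical weights
U = rng.normal(size=(dim, dim))
v = rng.normal(size=(dim,))

# Stand-in for K.repeat(s_prev, timesteps):
# (batch_size, dim) -> (batch_size, timesteps, dim)
s_rep = np.repeat(s_prev[:, None, :], timesteps, axis=1)

# Vectorized: score every input position j in one expression.
scores_vec = np.tanh(s_rep @ W + h @ U) @ v        # (batch_size, timesteps)

# Equivalent explicit loop over j, for comparison.
scores_loop = np.stack(
    [np.tanh(s_prev @ W + h[:, j, :] @ U) @ v for j in range(timesteps)],
    axis=1,
)

assert np.allclose(scores_vec, scores_loop)
print(s_rep.shape, scores_vec.shape)
```

Repeating the state costs some memory, but it lets a single matrix product replace the inner loop over j.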
Regarding point #2, were you able to reconcile equations 4–6? The only for loop here is that step is applied to every element in the input sequence (over index t).
Regarding point #3, this is a good idea! If you want to take a stab at it, please do make a PR, or you can open an issue and I’ll work on it soon :)
Zaf