Hi!

Thanks for reading. The `step` function is applied to each time step (technically, this is a for loop).

Regarding point #1, `_stm` is of size `(batch_size, dim, timesteps)` because we use `K.repeat` to effectively implement equation 1. Equation 1 is calculated for every character `j` in the input sequence using S_{t-1}. The purpose of `_stm` is to let us vectorize that calculation, so it is done for the whole input sequence at once instead of one character at a time.
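To make the vectorization idea concrete, here is a minimal NumPy sketch, not the layer's actual code. It assumes an additive score of the form `v · tanh(W s + U h_j)` as a stand-in for equation 1; the names `W_a`, `U_a`, `v_a`, `s_prev`, and `h`, as well as the axis order, are illustrative assumptions. It compares a per-character loop with the version that repeats the previous state along the time axis.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, timesteps, dim = 2, 5, 3

# Hypothetical stand-ins: previous decoder state and one encoder output per character j.
s_prev = rng.normal(size=(batch_size, dim))
h = rng.normal(size=(batch_size, timesteps, dim))

# Hypothetical weights for an assumed additive score.
W_a = rng.normal(size=(dim, dim))
U_a = rng.normal(size=(dim, dim))
v_a = rng.normal(size=(dim,))

# Loop version: score each input position j separately.
e_loop = np.empty((batch_size, timesteps))
for j in range(timesteps):
    e_loop[:, j] = np.tanh(s_prev @ W_a + h[:, j, :] @ U_a) @ v_a

# Vectorized version: copy s_prev once per input position, then score all positions at once.
s_rep = np.repeat(s_prev[:, np.newaxis, :], timesteps, axis=1)
e_vec = np.tanh(s_rep @ W_a + h @ U_a) @ v_a

# Both versions produce the same scores.
assert np.allclose(e_loop, e_vec)
```

The repeated copy costs extra memory, but it turns the loop over `j` into a single batched matrix product.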

Regarding point #2, were you able to reconcile equations 4–6? The only for-loop here is that `step` is applied to every element in the input sequence (over index `t`).
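For reference, the loop over `t` can be pictured roughly like this. It is a plain-Python sketch under stated assumptions: `run_rnn` and the toy update inside `step` are hypothetical and only show how a step function is applied once per time step while the state is carried forward.

```python
def step(x_t, state):
    # Hypothetical toy update; a real step would compute attention and the recurrent cell.
    new_state = 0.5 * state + x_t
    return new_state, new_state


def run_rnn(step_fn, inputs, initial_state):
    # Apply step_fn once per time step t, threading the state through the loop.
    state = initial_state
    outputs = []
    for x_t in inputs:
        y_t, state = step_fn(x_t, state)
        outputs.append(y_t)
    return outputs, state


outputs, final_state = run_rnn(step, [1.0, 2.0, 3.0], 0.0)
```

Everything inside `step` can stay vectorized; only the outer iteration over `t` is sequential, because each step needs the state from the previous one.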

Regarding point #3, this is a good idea! If you want to take a stab at it, please do make a PR, or you can open an issue and I’ll work on it soon :)

Zaf