Memo Akten
Aug 25, 2017

This is fantastic. Thanks for taking the time to write this (and the whole series). One tiny thing I found a bit confusing, which could perhaps do with a little more clarification:

Output will be the last state of every layer in the network as an LSTMStateTuple stored in current_state as well as a tensor states_series with the shape [batch_size, truncated_backprop_length, state_size] containing the hidden state of the last layer across all time-steps.

Could possibly be expanded as:

Output will be the **internal state (both cell state and hidden state)** of every layer in the network **for the final timestep** as **a tuple (for each layer) of** LSTMStateTuple stored in current_state as well as a tensor states_series with the shape [batch_size, truncated_backprop_length, state_size] containing the **output** of the last layer **for each time-step**.

(The bold bits are not for emphasis, they’re just to indicate which bits I changed).

This doesn’t contradict what you say; it just avoids some ambiguity (at least it wasn’t clear to me just from reading it).

Finally, with batch_size=3, state_size=3 and truncated_backprop_length=3, it’s a bit tricky to read the diagrams, since so many dimensions are of size 3! If, say, batch_size were 4 and state_size were 5, it would be much more immediately obvious which dimension is which.
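To make that concrete, here’s a minimal sketch of the two outputs I’m describing, assuming the TensorFlow 1.x API (tf.nn.rnn_cell and tf.nn.dynamic_rnn; your series builds the graph slightly differently) and using the distinct dimension sizes suggested above. num_features is just a hypothetical input size for illustration:

```python
import tensorflow as tf  # assumes TensorFlow 1.x, as in the series

batch_size = 4                     # distinct sizes so each axis is recognisable
truncated_backprop_length = 3
state_size = 5
num_layers = 2
num_features = 1                   # hypothetical input feature size

inputs = tf.placeholder(tf.float32,
                        [batch_size, truncated_backprop_length, num_features])

# One LSTMCell per layer, stacked into a MultiRNNCell.
cells = [tf.nn.rnn_cell.LSTMCell(state_size) for _ in range(num_layers)]
multi_cell = tf.nn.rnn_cell.MultiRNNCell(cells)

# states_series: output (hidden state) of the LAST layer at EVERY time-step.
# current_state: one LSTMStateTuple (c, h) PER layer, for the FINAL time-step only.
states_series, current_state = tf.nn.dynamic_rnn(multi_cell, inputs,
                                                 dtype=tf.float32)

print(states_series.shape)       # (4, 3, 5) = [batch_size, truncated_backprop_length, state_size]
print(len(current_state))        # 2 = num_layers
print(current_state[0].c.shape)  # (4, 5) = [batch_size, state_size] (cell state, layer 0)
print(current_state[0].h.shape)  # (4, 5) = [batch_size, state_size] (hidden state, layer 0)
```

With these sizes, states_series comes out as (4, 3, 5) and each per-layer LSTMStateTuple as a pair of (4, 5) tensors, so it’s immediately clear which axis is the batch, which is time, and which is the state.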

