Ceshine Lee
1 min read · May 15, 2019


Hi Abin Jo Abraham,

Thanks for reading and for your kind words.

I haven’t followed this track of research for a while, so please take my following response with a grain of salt.

The TCN used here was meant to be a starting point, but the paper already showed some promising results. There may already be newer papers that expand on this work. However, I'd guess that the answer to whether TCN-like structures can replace LSTMs still largely depends on the data set.

Theoretically, a TCN should be faster than an LSTM, especially for longer sequences, because its convolutions process all time steps in parallel rather than stepping through the sequence one element at a time. Other factors, such as the size of the hidden states and the number of layers, come into play as well. (The sequential MNIST task, even the permuted variant, may still be too easy for a meaningful comparison.)
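As a rough illustration (my own PyTorch sketch, not code from the paper; the channel count, dilations, and sequence length of 1024 are arbitrary), a TCN-style block applies dilated causal convolutions to the whole sequence at once, while the LSTM has to recur over every time step:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """Dilated 1-D convolution that only looks at past time steps."""
    def __init__(self, channels, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation          # left padding only
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                                 # x: (batch, channels, time)
        return self.conv(F.pad(x, (self.pad, 0)))         # pad on the left to stay causal

seq_len, batch, channels = 1024, 8, 64
x = torch.randn(batch, channels, seq_len)

# TCN-style stack: all 1024 time steps are convolved in parallel
tcn = nn.Sequential(
    CausalConv1d(channels, kernel_size=3, dilation=1), nn.ReLU(),
    CausalConv1d(channels, kernel_size=3, dilation=2), nn.ReLU(),
)
y_tcn = tcn(x)                                            # (batch, channels, 1024)

# LSTM baseline: computation is inherently sequential over the 1024 steps
lstm = nn.LSTM(input_size=channels, hidden_size=channels,
               num_layers=2, batch_first=True)
y_lstm, _ = lstm(x.transpose(1, 2))                       # (batch, 1024, channels)
```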

In my experience, TCN-like models are generally harder to tune than RNN models. You might also want to check out self-attention models such as the Transformer for sequence modelling.
