Pre-padding and post-padding in LSTM

Xin Inf · Apr 28, 2020


My text-classification experiments with an LSTM showed a huge difference depending on whether I used pre-padding or post-padding during preprocessing.

The default setting of Keras pad_sequences is to pre-pad and pre-truncate sequences, which gave good results. When I switched the setting to post-padding, however, the performance was really bad.
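A minimal sketch of the two modes (the token ids here are made up for illustration):

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

seqs = [[3, 7, 12], [5, 9, 2, 8, 4, 6]]

# Default behavior: pad and truncate at the front of each sequence.
pre = pad_sequences(seqs, maxlen=5)  # padding='pre', truncating='pre'
# [[ 0  0  3  7 12]
#  [ 9  2  8  4  6]]

# Post mode instead appends zeros and drops tokens from the end.
post = pad_sequences(seqs, maxlen=5, padding='post', truncating='post')
# [[ 3  7 12  0  0]
#  [ 5  9  2  8  4]]
```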

With post-padding, the LSTM reads the real tokens first and then steps through a long run of padding values; by the final time step, whose hidden state the classifier consumes, the signal from the meaningful words has largely washed out, so the model effectively "forgets" the sentence content. Pre-padding places the real tokens right before the final state, which is why it works better in this case.

Another option worth exploring is to add a start symbol <s> and an end symbol </s> at the beginning and end of each sentence. That may help the model recognize where the real part of the sentence starts and stops (perhaps even with post-padding?); see the sketch below.
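A hedged sketch of that preprocessing step. START_ID and END_ID are hypothetical reserved vocabulary indices, not something from the original experiment:

```python
# Assumed: ids 1 and 2 are reserved in the vocabulary for <s> and </s>.
START_ID, END_ID = 1, 2

def add_boundary_tokens(seqs, start=START_ID, end=END_ID):
    """Prepend a start symbol and append an end symbol to every sequence."""
    return [[start] + list(s) + [end] for s in seqs]

wrapped = add_boundary_tokens([[3, 7, 12]])
# [[1, 3, 7, 12, 2]]  -- the wrapped sequences would then go through pad_sequences
```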
