Also notice that I do not exclude stopwords if they have embeddings. I will later give them a random representation; this is done for the sake of simplicity. A far better approach would be to train your own embeddings to better capture the context of the problem.
How to predict Quora Question Pairs using Siamese Manhattan LSTM
Elior Cohen

Unrecognized words are words that are not stopwords and do not exist in the word2vec embeddings supplied by Google; these words are assigned random weights.

Zeros are used only for the padding index; they do not represent words.
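
To make the distinction concrete, here is a minimal sketch of how such an embedding matrix could be laid out. It is not the article's exact code: the function name `build_embedding_matrix`, the toy `pretrained` dictionary (a stand-in for the Google word2vec vectors), and the 3-dimensional vectors are all assumptions for illustration. It also assumes the vocabulary has already been filtered, so stopwords without embeddings are not in it.

```python
import numpy as np

def build_embedding_matrix(vocabulary, pretrained, dim, seed=0):
    """Return (matrix, word_to_index). Row 0 is reserved for padding.

    `pretrained` is a hypothetical stand-in for the word2vec lookup:
    any mapping from word to a vector of length `dim`.
    """
    rng = np.random.default_rng(seed)
    # Start from random rows, so every word missing from `pretrained`
    # (an "unrecognized" word) keeps a random representation.
    matrix = rng.standard_normal((len(vocabulary) + 1, dim))
    word_to_index = {}
    for index, word in enumerate(vocabulary, start=1):  # real words start at 1
        word_to_index[word] = index
        if word in pretrained:
            matrix[index] = pretrained[word]  # known word: use its pretrained vector
    matrix[0] = 0.0  # index 0 is padding only, never a word
    return matrix, word_to_index

# Toy data: "quora" is deliberately missing from the pretrained vectors.
pretrained = {
    "question": np.array([0.1, 0.2, 0.3]),
    "pairs": np.array([0.4, 0.5, 0.6]),
}
vocabulary = ["question", "pairs", "quora"]
embeddings, word_to_index = build_embedding_matrix(vocabulary, pretrained, dim=3)
```

In this layout `embeddings[0]` is the all-zero padding row, the rows for "question" and "pairs" hold their pretrained vectors, and the row for "quora" stays random. Sequences padded with index 0 therefore never collide with a real word.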

Hope this answers your doubt.
