Also notice I do not exclude stopwords if they have embeddings; I will later give them a random representation. This is done for the sake of simplicity; a far better approach would be to train your own embeddings to better capture the context of the problem.
How to predict Quora Question Pairs using Siamese Manhattan LSTM
Elior Cohen

Unrecognized words are words that are not stopwords and do not exist in the word2vec model supplied by Google; these words are assigned random weights.

Zeros are used only for the padding index; they do not represent words.
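
To make the distinction concrete, here is a minimal sketch of how such an embedding matrix could be filled. It is not the exact code from the article: the function name `build_embedding_matrix`, the toy `pretrained` dictionary (a stand-in for the Google word2vec vectors), and the 3-dimensional vectors are all assumptions for illustration.

```python
import numpy as np

def build_embedding_matrix(vocabulary, pretrained, dim, seed=0):
    """Hypothetical helper: one row per vocabulary index, plus row 0 for padding."""
    rng = np.random.default_rng(seed)
    # Start with random vectors everywhere; unrecognized words keep these values.
    matrix = rng.normal(size=(len(vocabulary) + 1, dim))
    for word, index in vocabulary.items():
        if word in pretrained:
            # Known words are overwritten with their pretrained vector.
            matrix[index] = pretrained[word]
    # Index 0 is reserved for padding and does not correspond to any word.
    matrix[0] = 0.0
    return matrix

# Toy stand-in for the pretrained vectors (assumed; the real ones are much larger).
pretrained = {"cat": np.array([1.0, 2.0, 3.0]), "dog": np.array([4.0, 5.0, 6.0])}
# Word -> index mapping; indices start at 1 so that 0 stays free for padding.
vocabulary = {"cat": 1, "dog": 2, "quora": 3}

embeddings = build_embedding_matrix(vocabulary, pretrained, dim=3)
```

Here "quora" is missing from the toy `pretrained` dictionary, so its row stays random, while row 0 is all zeros because it is the padding row.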

Hope this answers your doubt.