Learning Day 25: RNN concept and implementation in PyTorch
May 10, 2021
Data representation for text/sentences
One-hot encoding
- The feature size grows with the vocabulary of the language, and the data is sparse (many 0s and a single 1 per word)
- A sentence is analysed word by word, with no context carried between words
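The sparsity described above can be seen in a minimal sketch; the toy vocabulary and the `one_hot` helper below are made up for illustration:

```python
# Toy vocabulary (hypothetical words) to show why one-hot vectors
# are sparse and scale with vocabulary size.
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_idx = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    # A single 1 at the word's index, 0s everywhere else
    vec = [0] * len(vocab)
    vec[word_to_idx[word]] = 1
    return vec

print(one_hot("cat"))  # [0, 1, 0, 0, 0]
```

With a realistic vocabulary of tens of thousands of words, each vector would be that long and still contain only one 1.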
Word embedding
- Existing methods like word2vec or GloVe
- Uses a vector to represent each word, so the relationship between two words can be measured by the angle between their vectors (cos θ)
- Smaller feature size, e.g. GloVe embedding’s feature size = 300
from torchnlp.word_to_vector import GloVe
vectors = GloVe()  # 2.18GB download
print(vectors['hello'])
# OUT: tensor([ 0.2523, 0.1018, -0.6748, 0.2112, 0.4349, 0.1654, 0.4826, -0.8122, ...])
print(vectors['hello'].shape)
# OUT: torch.Size([300])
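The cos θ idea above can be sketched without the 2.18GB download; the two 300-dimensional vectors below are random stand-ins, not real GloVe embeddings:

```python
import torch
import torch.nn.functional as F

# The angle between two word vectors measures their relationship.
# Random placeholders here; real word vectors would come from GloVe.
torch.manual_seed(0)
v_a = torch.randn(300)
v_b = torch.randn(300)

# cosine_similarity expects a batch dimension, hence unsqueeze(0)
cos = F.cosine_similarity(v_a.unsqueeze(0), v_b.unsqueeze(0)).item()
print(cos)  # a value in [-1, 1]; closer to 1 means more similar
```

With real embeddings, related words (e.g. synonyms) give a cosine close to 1, unrelated words close to 0.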
Question
- Why are torchnlp and torchtext not part of torch, when torchvision is? Installed torchnlp using
pip install pytorch-nlp
Recurrent Neural Network (RNN)
CNN deals with spatial data and RNN deals with temporal data
RNN
- Weight sharing: the same w and b are reused for every word (time step)
- Carries context through the network via the hidden state h, like a persistent memory
- For time step t, hₜ = xₜ@wₓₕ + hₜ₋₁@wₕₕ, e.g. h₁ = x₁@wₓₕ + h₀@wₕₕ
- Batch representation [word num, b, word vec]: feeding a few sentences for training at one time
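The recurrence above can be sketched as a plain loop; the sizes are arbitrary, and a tanh nonlinearity is added around each step, which the bare equation omits but standard RNN cells apply:

```python
import torch

# h_t = tanh(x_t @ w_xh + h_{t-1} @ w_hh), with the SAME w_xh and w_hh
# reused at every time step (weight sharing). Input follows the
# [word num, batch, word vec] convention; sizes are illustrative.
seq_len, batch, word_vec, hidden = 4, 2, 300, 64
x = torch.randn(seq_len, batch, word_vec)  # a few sentences at once
w_xh = torch.randn(word_vec, hidden)
w_hh = torch.randn(hidden, hidden)
h = torch.zeros(batch, hidden)             # h_0

for t in range(seq_len):
    h = torch.tanh(x[t] @ w_xh + h @ w_hh)  # same weights every step

print(h.shape)  # torch.Size([2, 64]): final hidden state carries context
```

PyTorch's built-in `torch.nn.RNN` implements the same recurrence (plus bias terms) and also defaults to tanh.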