Learning Day 25: RNN concept and implementation in PyTorch

De Jun Huang

May 10, 2021

Data representation for text/sentences

One-hot encoding

  • The feature size equals the vocabulary size, which can be huge; the data is sparse (many 0s and few 1s)
  • Each word is encoded independently, so the encoding carries no notion of context or similarity between words (a minimal sketch follows below)
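
A minimal sketch of one-hot encoding with PyTorch's F.one_hot; the toy three-word vocabulary here is an assumption for illustration:

import torch
import torch.nn.functional as F

# Hypothetical tiny vocabulary; real vocabularies hold tens of thousands of words
vocab = {"hello": 0, "world": 1, "rnn": 2}

indices = torch.tensor([vocab["hello"], vocab["world"]])
print(F.one_hot(indices, num_classes=len(vocab)))
# OUTPUT: tensor([[1, 0, 0],
#                 [0, 1, 0]])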

Word embedding

  • Existing pretrained methods such as word2vec or GloVe
  • Use a dense vector to represent each word, so the relationship between two words can be measured by the angle between their vectors (cos θ)
  • Much smaller feature size. Eg. GloVe embedding’s feature size = 300
from torchnlp.word_to_vector import GloVe
vectors = GloVe()  # 2.18GB download

print(vectors['hello'])
# OUTPUT: tensor([ 0.2523, 0.1018, -0.6748, 0.2112, 0.4349, 0.1654, 0.4826, -0.8122, ...])
print(vectors['hello'].shape)
# OUTPUT: torch.Size([300])
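
As a quick follow-up sketch, the cos θ relationship mentioned above can be checked with torch.nn.functional.cosine_similarity (the word pair is an arbitrary illustration):

import torch.nn.functional as F

# Compare two embedding vectors; a value near 1 means similar direction (related words)
sim = F.cosine_similarity(vectors['king'].unsqueeze(0), vectors['queen'].unsqueeze(0))
print(sim.item())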

Question

  • torchnlp and torchtext are not part of torch? Why is torchvision part of torch then? (torchnlp was installed with pip install pytorch-nlp)

Recurrent Neural Network (RNN)

CNNs deal with spatial data, while RNNs deal with temporal (sequential) data.

RNN

  • Weight sharing: the same weights w and bias b are used at every time step (word)
  • Context is carried through the network via the hidden state h, acting as a persistent memory
  • For time step t, hₜ = xₜ@wₓₕ + hₜ₋₁@wₕₕ (typically passed through a tanh nonlinearity), e.g. h₁ = x₁@wₓₕ + h₀@wₕₕ
  • Batch representation: input shape is [word num, batch size, word vec size], i.e. a few sentences are fed for training at one time (see the sketch after this list)
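
A minimal sketch of these ideas with PyTorch's nn.RNN; the sizes here (hidden size 128, batch of 3, sentences of 5 words) are assumptions for illustration, with the 300-dim input matching the GloVe vectors above:

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=300, hidden_size=128)  # default layout: [word num, batch, word vec]

x = torch.randn(5, 3, 300)   # 5 words per sentence, batch of 3 sentences, 300-dim word vec
h0 = torch.zeros(1, 3, 128)  # initial hidden state h₀: [num layers, batch, hidden size]

out, ht = rnn(x, h0)
print(out.shape)  # torch.Size([5, 3, 128]) - hidden state at every time step
print(ht.shape)   # torch.Size([1, 3, 128]) - final hidden state

# Weight sharing: the same wₓₕ and wₕₕ are applied at all time steps
print(rnn.weight_ih_l0.shape)  # torch.Size([128, 300])
print(rnn.weight_hh_l0.shape)  # torch.Size([128, 128])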

