Published in Analytics Vidhya

TorchText: A Not-So-Popular Library

Basic Intro
Fig: Steps used by TorchText
from torchtext.legacy.data import Field, TabularDataset
from torchtext.legacy.data import BucketIterator
# Overview of the data
import pandas as pd
data = pd.read_csv('train.csv')
data.head()
Figure 1
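# Tokenize comments by splitting on whitespace
tokens = lambda x: x.split()
comment = Field(sequential=True, use_vocab=True, tokenize=tokens,
                lower=True)
# Scores are used as-is, so no vocabulary is built for them
score = Field(sequential=False, use_vocab=False)
# Map CSV column names to (attribute name, Field) pairs
fields = {'comments': ('c', comment), 'score': ('s', score)}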
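To make the effect of `tokenize=tokens` and `lower=True` concrete, here is a minimal pure-Python sketch of the preprocessing a sequential field applies to one raw string. It does not call torchtext; `preprocess_comment` is a hypothetical helper and the sample sentence is made up for illustration.

```python
# Whitespace tokenizer, same as the `tokens` lambda used for the comment field.
tokens = lambda x: x.split()

def preprocess_comment(text, tokenize=tokens, lower=True):
    """Hypothetical stand-in for a sequential field's preprocessing step."""
    toks = tokenize(text.rstrip("\n"))  # split the raw string into tokens
    # Lowercase each token only when requested.
    return [t.lower() for t in toks] if lower else toks

print(preprocess_comment("Great Movie, loved it"))
```

Because the tokenizer only splits on whitespace, punctuation stays attached to the neighbouring word (`movie,` in this sample), which is worth keeping in mind when the vocabulary is built later.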
train_set, test_set = TabularDataset.splits(path='/content',
                                            format='csv',
                                            train='train.csv',
                                            test='test.csv',
                                            fields=fields)
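The `fields` mapping decides which CSV column ends up under which attribute of each example. The sketch below imitates that idea with the standard library only. It is not how `TabularDataset` is implemented; `load_examples`, `sketch_fields`, and the inline CSV content are all assumptions made for illustration.

```python
import csv
import io
from types import SimpleNamespace

# Made-up CSV content standing in for train.csv (assumption).
SAMPLE_CSV = "comments,score\nGood Movie,1\nbad plot,0\n"

def load_examples(csv_text, fields):
    """Hypothetical helper: map CSV columns to attributes on each example."""
    examples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        example = SimpleNamespace()
        for column, (attr, convert) in fields.items():
            # Store the converted column value under the short attribute name.
            setattr(example, attr, convert(row[column]))
        examples.append(example)
    return examples

# Same shape as the `fields` dict above, with plain functions
# in place of Field objects.
sketch_fields = {
    "comments": ("c", lambda text: text.lower().split()),
    "score": ("s", int),
}

examples = load_examples(SAMPLE_CSV, sketch_fields)
print(examples[0].c, examples[0].s)
```

This is why the batches are later read through `.c` and `.s` rather than through the original column names.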
# Build the vocabulary from the training set only
comment.build_vocab(train_set,
                    max_size=100,
                    min_freq=1)  # the keyword is min_freq, not min_frequency
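As a rough illustration of what the size limit and the frequency threshold do, here is a small sketch built on `collections.Counter`. It is a simplification, not torchtext's code: `sketch_build_vocab` and the toy corpus are hypothetical, and the sketch assumes the two special tokens `<unk>` and `<pad>` take the first two indices and are not counted against `max_size`.

```python
from collections import Counter

def sketch_build_vocab(token_lists, max_size=None, min_freq=1,
                       specials=("<unk>", "<pad>")):
    """Hypothetical helper: frequency-ordered vocabulary with special tokens first."""
    counter = Counter(tok for toks in token_lists for tok in toks)
    itos = list(specials)  # index -> token
    # Most frequent tokens first; ties are broken alphabetically.
    ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    for word, freq in ordered:
        if freq < min_freq:
            break  # every remaining token is rarer than the threshold
        if max_size is not None and len(itos) - len(specials) >= max_size:
            break  # size limit reached (specials are not counted)
        if word not in specials:
            itos.append(word)
    stoi = {word: index for index, word in enumerate(itos)}  # token -> index
    return itos, stoi

# Toy corpus (assumption): "good" appears 3 times, "movie" 2, "bad" 1.
corpus = [["good", "movie", "good"], ["bad", "good", "movie"]]
itos, stoi = sketch_build_vocab(corpus, max_size=2)
print(itos)
print(stoi.get("bad", stoi["<unk>"]))  # tokens outside the vocabulary fall back to <unk>
```

With `max_size=2` the rarest token is dropped, so at lookup time it maps to the `<unk>` index.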
train_iter, test_iter = BucketIterator.splits((train_set, test_set),
                                              batch_size=2)
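The idea behind bucketing is to batch together examples of similar length so that little padding is needed. The sketch below shows only that core idea in plain Python. The real iterator also handles shuffling and tensor conversion, and `bucket_batches` and the toy examples are hypothetical.

```python
def bucket_batches(examples, batch_size, pad_token="<pad>"):
    """Hypothetical helper: group token lists of similar length and pad each batch."""
    ordered = sorted(examples, key=len)  # shortest examples first
    batches = []
    for start in range(0, len(ordered), batch_size):
        chunk = ordered[start:start + batch_size]
        width = max(len(example) for example in chunk)
        # Pad every example in the chunk to the longest one in that chunk.
        batches.append([example + [pad_token] * (width - len(example))
                        for example in chunk])
    return batches

# Toy tokenized comments of different lengths (assumption).
toy = [["a"], ["a", "b", "c"], ["a", "b"], ["a", "b", "c", "d"]]
for batch in bucket_batches(toy, batch_size=2):
    print(batch)
```

Each batch here needs only one padding token, whereas batching the toy list in its original order would need three.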
# Each batch exposes the fields under the attribute names set in `fields`
for i in train_iter:
    print(i.c, i.s)
Result
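When reading the printed batches, it helps to know that each comment batch holds vocabulary indices rather than words and that, with the default field settings, it is laid out as sequence length by batch size. The following pure-Python sketch mimics that conversion. `numericalize`, the small `stoi` table, and the toy batch are assumptions for illustration, not torchtext output.

```python
def numericalize(padded_batch, stoi, unk_token="<unk>"):
    """Hypothetical helper: map tokens to indices and return a time-major layout."""
    unk = stoi[unk_token]
    # Look up every token; unknown tokens fall back to the <unk> index.
    ids = [[stoi.get(tok, unk) for tok in example] for example in padded_batch]
    # Transpose from (batch, seq_len) to (seq_len, batch).
    return [list(column) for column in zip(*ids)]

# Toy vocabulary and an already padded batch of two comments (assumptions).
stoi = {"<unk>": 0, "<pad>": 1, "good": 2, "movie": 3}
batch = [["good", "movie"], ["bad", "<pad>"]]
print(numericalize(batch, stoi))
```

Each inner list corresponds to one time step across the batch, so the first list holds the first token of every comment.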
