Paper Review 5: Bag of Tricks for Efficient Text Classification

Fatih Cagatay Akyon
Published in NLP Chatbot Survey
Nov 8, 2018

In this post, the paper “Bag of Tricks for Efficient Text Classification” is summarized.

Link to paper: http://aclweb.org/anthology/E17-2068

Armand Joulin, Edouard Grave, Piotr Bojanowski, and Tomas Mikolov, "Bag of Tricks for Efficient Text Classification," EACL 2017.

In this paper, the authors from FAIR (Facebook AI Research) explore ways to scale existing baselines to very large corpora with large output spaces, in the context of text classification. They show that linear models with a rank constraint and a fast loss approximation can be trained on a billion words within ten minutes, while achieving performance on par with the state of the art. Their approach, named fastText, is evaluated on two different tasks, namely tag prediction and sentiment analysis.

The authors use a bag of n-grams to capture partial information about local word order. In the proposed method, the n-gram features are embedded and averaged to form a hidden representation, which is in turn fed to a linear classifier. A fast and memory-efficient mapping of the n-grams is maintained with the hashing trick.
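As a minimal sketch of this pipeline (not the authors' implementation): word n-grams are hashed into a fixed number of buckets, the corresponding embedding rows are averaged into the hidden representation, and a softmax linear classifier maps it to class probabilities. The bucket count, embedding dimension, and class count below are illustrative placeholders (the paper reports, e.g., 10M hash bins for bigram features and 10 hidden units on the sentiment tasks).

```python
import numpy as np

# Illustrative sizes only, chosen small so the sketch runs quickly.
NUM_BUCKETS = 2 ** 18   # hash space for the n-gram features (hashing trick)
EMBED_DIM = 10          # dimension of the averaged hidden representation
NUM_CLASSES = 4         # e.g. a 4-class topic classification task

rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(NUM_BUCKETS, EMBED_DIM))   # n-gram embedding matrix
B = rng.normal(scale=0.1, size=(EMBED_DIM, NUM_CLASSES))   # linear classifier weights

def ngram_ids(tokens, n=2):
    """Map word unigrams and bigrams to bucket ids via the hashing trick.

    Python's built-in hash() is randomized per process; it is good enough
    for a sketch, but a fixed hash function would be used in practice.
    """
    feats = list(tokens)
    feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return [hash(f) % NUM_BUCKETS for f in feats]

def predict_proba(tokens):
    """Average the n-gram embeddings and apply a softmax linear classifier."""
    hidden = A[ngram_ids(tokens)].mean(axis=0)   # averaged hidden representation
    logits = hidden @ B
    exp = np.exp(logits - logits.max())          # numerically stable softmax
    return exp / exp.sum()

print(predict_proba("the movie was surprisingly good".split()))
```

In training, both the embedding matrix and the classifier weights would be learned with stochastic gradient descent on the softmax (or hierarchical softmax) loss; the sketch above shows only the forward pass.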

Compared Text Classifiers:

• n-grams and TFIDF baselines from Zhang et al. (2015)

• Character level convolutional model (char-CNN) of Zhang and LeCun (2015)

• Character-based convolutional recurrent network (char-CRNN) of Xiao and Cho (2016)

• Very deep convolutional network (VDCNN) of Conneau et al. (2016)

• Two approaches based on recurrent networks (Conv-GRNN and LSTM-GRNN) of Tang et al. (2015)

In conclusion, on several tasks fastText obtains performance on par with recently proposed methods inspired by deep learning, while being much faster to train and evaluate. The code is available at: https://github.com/facebookresearch/fastText
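As a usage illustration (not taken from the paper), assuming the Python bindings that ship with the released repository (importable as `fasttext`) and a training file where each line starts with a `__label__<class>` prefix, a supervised classifier could be trained roughly as follows; the file names and hyperparameter values are placeholders.

```python
import fasttext

# Each line of train.txt looks like: "__label__positive this movie was great"
# Hyperparameters below are placeholders, not the paper's exact settings.
model = fasttext.train_supervised(
    input="train.txt",   # placeholder path
    lr=0.5,
    epoch=5,
    wordNgrams=2,        # use word bigrams in addition to unigrams
    dim=10,              # small hidden dimension
)

# Evaluate: returns (number of examples, precision@1, recall@1).
n, p, r = model.test("test.txt")
print(f"P@1={p:.3f} R@1={r:.3f} on {n} examples")

# Predict the top label for a new sentence.
print(model.predict("the plot was thin but the acting was superb"))
```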
