Processing text is the first step toward any successful project in NLP, and fortunately the tools to help us here are growing ever more sophisticated.

For my first deep-learning project, I built a high-functioning English-French translator and wouldn’t have been able to do so without the TorchText library. OK, maybe that is a slightly dramatic way of putting it… I would at least have been slowed down considerably.

TorchText is incredibly convenient as it allows you to rapidly tokenize and batchify (are those even words?) your data. What would’ve required your own functions and a lot of thought now requires only a few lines of code, and boom, you’re ready to build that AI of yours that will take over the world (or just translate a language or something, I don’t know what you’re planning). …
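
To give a feel for the work a library like this takes off your hands, here is a rough sketch of tokenizing and batchifying by hand in plain Python. It is not TorchText's API: the helper names (`tokenize`, `build_vocab`, `batchify`), the whitespace tokenizer, and the `<pad>`/`<unk>` tokens are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical special tokens, chosen for this sketch only.
PAD, UNK = "<pad>", "<unk>"


def tokenize(sentence):
    """Naive whitespace tokenizer (an assumption; real projects often use spaCy)."""
    return sentence.lower().split()


def build_vocab(sentences, min_freq=1):
    """Map each token seen at least min_freq times to an integer id."""
    counts = Counter(tok for s in sentences for tok in tokenize(s))
    itos = [PAD, UNK] + sorted(t for t, c in counts.items() if c >= min_freq)
    return {tok: i for i, tok in enumerate(itos)}


def batchify(sentences, vocab, batch_size):
    """Yield batches of equal-length id lists, padded to the longest sentence in each batch."""
    # Sorting by length keeps similar-sized sentences together, so less padding is wasted.
    ordered = sorted(sentences, key=lambda s: len(tokenize(s)))
    for start in range(0, len(ordered), batch_size):
        chunk = [tokenize(s) for s in ordered[start:start + batch_size]]
        width = max(len(toks) for toks in chunk)
        yield [
            [vocab.get(t, vocab[UNK]) for t in toks] + [vocab[PAD]] * (width - len(toks))
            for toks in chunk
        ]


sentences = ["the cat sat", "hello world", "a b c d"]
vocab = build_vocab(sentences)
batches = list(batchify(sentences, vocab, batch_size=2))
```

Each batch comes out as a rectangular grid of ids, which is the shape a model needs before the data can be turned into tensors.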

Could The Transformer be another nail in the coffin for RNNs?

Doing away with the clunky for loops, it finds a way to allow whole sentences to enter the network simultaneously in batches. The miracle: NLP now reclaims the advantage of Python’s highly efficient linear algebra libraries. The time saved can then be spent deploying more layers into the model.
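
As a toy illustration of the difference (not code from the model itself), the sketch below applies one weight matrix to every word in a batch of sentences twice: once position by position with for loops, and once as a single batched matrix multiplication in NumPy. The shapes and the function names `project_with_loops` and `project_batched` are assumptions for the example. A real RNN step also depends on the previous hidden state, which is exactly why it cannot be collapsed into one call like this.

```python
import numpy as np


def project_with_loops(x, w):
    """Apply w to each position of each sentence, one step at a time."""
    out = np.empty((x.shape[0], x.shape[1], w.shape[1]))
    for b in range(x.shape[0]):        # each sentence in the batch
        for t in range(x.shape[1]):    # each position in the sentence
            out[b, t] = x[b, t] @ w
    return out


def project_batched(x, w):
    """Apply w to every position of every sentence in one matrix multiplication."""
    return x @ w


# Hypothetical sizes: 4 sentences, 6 tokens each, 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 8))
w = rng.standard_normal((8, 8))

looped = project_with_loops(x, w)
batched = project_batched(x, w)
```

Both versions produce the same numbers; the batched one simply hands the whole job to the linear algebra library in a single call.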

So far it seems the result is faster convergence and better results. What’s not to love?

My personal experience of it has been highly promising. It trained on 2 million French-English sentence pairs to create a sophisticated translator in only three days.

You can play with the model yourself on language translation tasks if you go to my implementation on GitHub here. …

It’s hard to believe that only two weeks ago I knew less about neural networks than I did about the inner workings of North Korea.

Yet fortunately, getting to understand neural networks is really not as difficult as it seems from far away. Already today, I’m able to devise my own simple neural network without using any libraries (beyond basic maths operations), and have developed a clear picture of the underlying concepts.

And you can too! So don’t be put off by the naysayers, PhD mathematicians, and fancy names, and instead read this guide in order to:


Samuel Lynn-Evans

Data Engineer @Skyscanner, AI writer @FloydHub, ex-biology teacher, language enthusiast.

