Hi Zachary, thanks for your comment. I love the AIDungeon project.

This post is very outdated compared to what I now know about NLP. I wrote it when I'd just learned about LSTMs and RNNs, but hadn't yet realized the importance of word vectors! That's why these networks perform so terribly: a char-based model with nothing pretrained behind it, trained on such a small dataset, had no way of working.

If I had to redo this, I'd probably go straight to a pretrained model like GPT-2, and if I had to implement it from scratch, I would at least use pretrained word vectors.

Luciano Strika

B. Sc.+M. Sc. Computer Science, Buenos Aires University. Sr Data Scientist at MercadoLibre. I write about Machine Learning and Data, and love NLP and languages.