Hi Zachary, thanks for your comment. I love the AIDungeon project.

This post is very outdated compared to what I now know about NLP. I wrote it when I'd just learned about LSTMs and RNNs, but hadn't yet realized the importance of word vectors! That's why these networks perform so poorly: a char-based model with nothing pretrained behind it, trained on such a small dataset, had no chance of working.

If I had to redo this, I'd probably go straight to a pretrained model like GPT-2, and if I had to implement it from scratch, I would at least use pretrained word vectors.


What is each individual filter doing in a Convolutional Neural Network? Which kinds of images is it learning to detect? Here’s a way to find out.

According to Wikipedia, apophenia is “the tendency to mistakenly perceive connections and meaning between unrelated things”. It is also used as “the human propensity to seek patterns in random information”. …


Markov chains have been around for a while now, and they are here to stay. From predictive keyboards to applications in trading and biology, they’ve proven to be versatile tools.

Here are some industry applications of Markov chains:

  • Text Generation (you’re here for this).
  • Financial modelling and forecasting (including trading algorithms).
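As a rough illustration of the text-generation use case, here is a minimal sketch of a word-level Markov chain in plain Python. The function names (`build_chain`, `generate`) and the toy sentence are made up for this example and do not come from the original article; a real generator would train on a much larger corpus.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words observed right after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, length, seed=None):
    """Walk the chain from `start`, picking each next word at random."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            # Dead end: this word never had a successor in the training text.
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Hypothetical toy corpus, just to show the mechanics.
chain = build_chain("the cat sat on the mat and the cat slept")
sentence = generate(chain, "the", 8, seed=0)
print(sentence)
```

Because repeated successors are stored more than once, words that follow each other often in the corpus are proportionally more likely to be picked.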



Hadoop’s MapReduce is not just a framework; it’s also a problem-solving philosophy.

Borrowing from functional programming, the MapReduce team realized a lot of different problems could be divided into two common operations: map, and reduce.

Both mapping and reducing steps can be done in parallel.

This meant as long as…
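To make the map and reduce split concrete, here is a small sketch of a word count written with Python's built-in `map` and `functools.reduce`. It is not Hadoop's API, only the same idea in plain Python; the helper names and the sample chunks are invented for this illustration.

```python
from collections import Counter
from functools import reduce


def map_chunk(chunk):
    """Map step: count the words of one chunk, independently of the others."""
    return Counter(chunk.split())


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


# Hypothetical input, already split into chunks (e.g. one per machine).
chunks = ["a b a", "b c", "a c c"]

# Each chunk is mapped on its own, so this step could run in parallel.
partial_counts = list(map(map_chunk, chunks))

# Merging is associative, so partial results can be combined in any grouping.
total = reduce(reduce_counts, partial_counts, Counter())
print(total)
```

Since no `map_chunk` call depends on another and the merge is associative, both steps could be spread across workers without changing the final counts.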


Why do Neural Networks Need an Activation Function? Whenever you see a Neural Network’s architecture for the first time, one of the first things you’ll notice is they have a lot of interconnected layers.

Each layer in a Neural Network has an activation function, but why are they necessary? And…


LSTM Neural Networks have seen a lot of use in recent years, both for text and music generation and for Time Series Forecasting.

Today, I’ll teach you how to train an LSTM Neural Network for text generation, so that it can write in H. P. Lovecraft’s style.

In order…


Probably a nice dashboard. Source: Pixabay.

Probability Distributions are like 3D glasses. They allow a skilled Data Scientist to recognize patterns in otherwise completely random variables.

In a way, most of the other Data Science or Machine Learning skills are based on certain assumptions about the probability distributions of your data.

This makes probability knowledge part…


Keep an eye out for Deep Learning. Source: Pixabay.

Convolutional Neural Networks are a part of what made Deep Learning reach the headlines so often in the last decade. Today we’ll train an image classifier to tell us whether an image contains a dog or a cat, using TensorFlow’s eager API.

Artificial Neural Networks have disrupted several industries lately…


300+ colors to 24 colors.

Applying filters to images is not a new concept to anyone. We take a picture, make a few changes to it, and now it looks cooler. But where does Artificial Intelligence come in? Let’s try out a fun use for Unsupervised Machine Learning with K Means Clustering in Python.

I’ve…



There is a Japanese word, tsundoku (積ん読), which means buying and keeping a growing collection of books, even though you don’t really read them all.

I think we Developers and Data Scientists are particularly prone to falling into this trap. …

Luciano Strika

Computer Science student at Buenos Aires University, Sr Data Scientist at MercadoLibre. I write about Machine Learning and Data, and love NLP and languages.
