Meme versus machine

Part 1: In the year 2017

Waltteri Vuorimaa
Geek Culture


Back in 2017, I had this website called Slowmeme. It used an “AI” to generate memes, and they were pretty good. This was long before the meme generators built on those pesky geepeetee twos and threes. I was on the bleeding edge of meme generation technology.

Everything was great, and the meme business was booming.

Until I had to pull the plug.

This four-part series follows my journey back to the top of the AI meme generation game, and shows you how to do the same.

The O.G. meme AI

Slowmeme was a website that used machine learning to generate memes, based on user input.

The input to the model was the title and the meme template selected by the user, and the output was the meme text. This generated text was superimposed on the meme image, giving us the final result.

Below are a few examples of Slowmeme’s finest creations:

Memes generated by the last iteration of the original Slowmeme.

I don’t mean to pat my own back, but come on! They were pretty good!

Back in 2017 (and a few years prior to that), the word “meme” was synonymous with a handful of web-famous images with white, all-caps text superimposed on them. Slowmeme was built to generate specifically this kind of meme, as is evident from the examples above.

The inner workings of the final iteration of Slowmeme were quite simple. At the heart of Slowmeme’s machine learning innards, there were a few thousand LSTM nodes. The basic function of this seq2seq model was as textbook as one might imagine: give the model a sequence of words, and it tries to predict the next word.

I tried to visualize the inner workings of Slowmeme in the graph below.

Graphical explanation of the LSTM/word2vec text generation approach employed in Slowmeme’s last version.

So all in all, a quite simple little thingamajig.

When I was developing Slowmeme, I had a limited amount of training data and not that much computing power. Using the entire lexicon in the meme corpus as input and output tokens would’ve been a) prohibitive computationally, and b) a poor choice for generalization. I tried training a model to predict character-by-character (the GPT models do something similar, but with subword tokens rather than single characters), but either it overfit, owing to the lack of GPU memory or training data, or the output text didn’t really fit the memes in any meaningful way.

These old memes often contained certain semi-static word structures that repeated from meme to meme (within the same meme template, that is). The one-hot word token models and character-by-character models I tried weren’t really getting these right. As a result, I decided to make life simple for myself: instead of the aforementioned approaches, use word2vec vector representations of words as the inputs and outputs.
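To put rough numbers on why one-hot word tokens get expensive, here is a back-of-the-envelope sketch that compares the size of the final dense layer in the two setups. The hidden size, vocabulary size and vector dimension are made-up illustration values, not Slowmeme's actual configuration.

```python
def dense_layer_params(inputs: int, outputs: int) -> int:
    """Number of weights plus biases in a fully connected layer."""
    return inputs * outputs + outputs


# Hypothetical values for illustration only (not Slowmeme's real settings).
HIDDEN_UNITS = 512   # size of the last recurrent layer
VOCAB_SIZE = 50_000  # distinct words in the corpus
VECTOR_DIM = 100     # dimensionality of the word vectors

# One output unit per word in the vocabulary (one-hot / softmax head).
one_hot_head = dense_layer_params(HIDDEN_UNITS, VOCAB_SIZE)
# One output unit per word-vector dimension (regression onto a vector).
vector_head = dense_layer_params(HIDDEN_UNITS, VECTOR_DIM)

print(f"one-hot output layer: {one_hot_head:,} parameters")
print(f"vector output layer:  {vector_head:,} parameters")
print(f"ratio: {one_hot_head / vector_head:.0f}x")
```

With these assumed numbers the one-hot head is a few hundred times larger, and the same blow-up applies on the input side if words are fed in as one-hot vectors.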

You can kinda think of word2vec vectors as the number-form “meaning” of the vectorized word: synonyms are close to each other in the word2vec vector space, whereas completely unrelated words will have a larger distance between them. This allowed for the model to be smaller, while still coming up with somewhat coherent sentences, with some mix-ups between synonyms here and there.
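A minimal sketch of what "close to each other" means in practice, and of why a model that outputs vectors can mix up synonyms: the predicted vector has to be mapped back to the nearest known word. The three-dimensional vectors and the `nearest_word` helper are invented for illustration; real word vectors have many more dimensions, and this is not necessarily how Slowmeme decoded its output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_word(vector, embeddings):
    """Return the word whose embedding is most similar to `vector`."""
    return max(embeddings, key=lambda w: cosine_similarity(vector, embeddings[w]))


# Toy embeddings, made up for this example.
TOY_EMBEDDINGS = {
    "monday": [0.90, 0.10, 0.00],
    "tuesday": [0.85, 0.15, 0.05],
    "pizza": [0.00, 0.90, 0.20],
    "burger": [0.10, 0.80, 0.30],
    "why": [0.10, 0.00, 0.95],
}

# Related words score high, unrelated words score low.
print(cosine_similarity(TOY_EMBEDDINGS["monday"], TOY_EMBEDDINGS["tuesday"]))
print(cosine_similarity(TOY_EMBEDDINGS["monday"], TOY_EMBEDDINGS["pizza"]))

# Pretend this vector came out of the model; decode it to a word.
predicted = [0.05, 0.88, 0.22]
print(nearest_word(predicted, TOY_EMBEDDINGS))
```

Here the pretend prediction decodes to "pizza", but "burger" is a close runner-up, so a slightly different prediction would swap one food word for another. That is the kind of synonym mix-up described above.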

Assuming many of you are not familiar with the slightly outdated tech of word2vec word vectors, it might be a good idea to do some quick and dirty visualization. The snippet below uses 25-dimensional word2vec vector representations of ~10,000 random words, and transforms these 25 dimensions to just two dimensions using principal component analysis. We then plot these 10k words, while also highlighting three distinct word groups: weekdays, food items, and interrogative words.
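A minimal sketch of that projection step, not the original snippet: it assumes the word vectors are already available as a NumPy array with one 25-dimensional row per word. To stay self-contained it uses three synthetic clusters as stand-ins for the word groups, and it computes the PCA directly with an SVD instead of a library PCA class.

```python
import numpy as np


def pca_2d(vectors: np.ndarray) -> np.ndarray:
    """Project row vectors onto their first two principal components."""
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Stand-in data: three synthetic clusters in 25 dimensions.
# Real word vectors would be loaded from a pretrained model instead.
rng = np.random.default_rng(seed=0)
GROUPS = ("weekdays", "food", "interrogatives")
centers = rng.normal(scale=3.0, size=(len(GROUPS), 25))
vectors = np.vstack([c + rng.normal(scale=0.3, size=(50, 25)) for c in centers])
labels = np.repeat(GROUPS, 50)

points = pca_2d(vectors)  # shape: (150, 2)
for group in GROUPS:
    x, y = points[labels == group].mean(axis=0)
    print(f"{group:>15}: centre at ({x:+.2f}, {y:+.2f})")
```

The resulting `points` array can be handed to any scatter plot function, with one color per group for the highlighted words.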

The code produces the graph below. As we can see, the words from the same group do indeed sit at roughly the same locations. I tested the visualization with up to a million words, and the separation of the groups was a lot more pronounced, but the picture wasn’t pretty anymore. So I settled for 10k. I guess you get the gist anyways.

A 2D projection of a 25-dimensional word2vec model, made by fitting a PCA model and plotting the transformed word vectors. Only 10k word vectors used in the graph.

So. These word vectors were the input and the output of the LSTM model. The model itself was trained to predict the next words from sequences containing the titles, meme template identifiers and the captions of the memes.
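One way such training sequences could be put together is sketched below. The marker tokens (`<sep>`, `<end>`), the template identifier format and the context window size are all assumptions made for the example, not details of the original Slowmeme pipeline. In a setup like the one described, each token would then be swapped for its word vector before being fed to the LSTM.

```python
def build_training_pairs(template_id, title, caption, context_size=3):
    """Turn one meme into (context, next_token) pairs for next-word prediction."""
    tokens = (
        [f"<{template_id}>"]        # hypothetical template marker
        + title.lower().split()
        + ["<sep>"]                 # hypothetical title/caption separator
        + caption.lower().split()
        + ["<end>"]                 # hypothetical end-of-caption marker
    )
    pairs = []
    for i in range(1, len(tokens)):
        # Use up to `context_size` preceding tokens to predict token i.
        context = tokens[max(0, i - context_size):i]
        pairs.append((context, tokens[i]))
    return pairs


pairs = build_training_pairs(
    "bad_luck_brian", "Monday again", "sets alarm sleeps through it"
)
for context, target in pairs:
    print(context, "->", target)
```

At generation time the same loop runs in reverse: start from the template marker and the user's title, predict a token, append it to the context, and repeat until the end marker shows up.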

That was it. That was Slowmeme. The app was really simple to create, the generated memes were surreal but fun, and people really liked the whole concept. But what happened to it?

Pulling the plug

Unfortunately for my bleeding-edge meme AI, I was about to move to India for an exchange semester in December of 2017. The all-important GPU back end of my meme generation operation had been on a physical machine running in my closet. And, as I was going to sublet the flat for the duration of my exchange studies, I wasn’t going to be able to self-host the GPU back end anymore.

I looked into renting GPU servers from the cloud to keep the site running, and the absolute lowest-spec machine able to run the model was upwards of 50 bucks a month. And this was from some really shady VPS provider that no longer exists. If I wanted an AWS server specced similarly to my self-hosted hardware, I would’ve been paying like $200+ a month for hosting Slowmeme.

So, I’m a poor university student who’s about to study abroad and wants to spend as much money on traveling and beer as possible. What do I do? The idea of spending $50–200+ a month to host a silly meme site with a handful of (faithful) users just didn’t make sense at the time. Sure, the site was making some AdSense revenue, but not nearly enough to cover the costs of beefy cloud GPUs. The net present value of the enterprise was heavily in the red.

So I decided to pull the plug on Slowmeme.

R.I.P. old Slowmeme.

What are memes, anyway?

The meme game has truly changed in the past five years.

Like, when I was making ye olde Slowmeme, /r/AdviceAnimals was the shit. Nowadays, the number of different kinds of memes has exploded. Even if we gloss over the whole world of short-form videos (TikTok dances, /r/wallstreetbets’s re-mastered meme versions of The Wolf of Wall Street, etc.), the formats and the ways in which people create picture memes are a different game than they were in 2017.

These are just a handful of examples I pulled from Reddit:

Examples of modern memes from /r/memes. From top left to bottom right, credits for the memes go to the following Reddit users: /u/ARTA_THE_KILLER, /u/glizzyMaster108, /u/HeartStoneTV and /u/Meerkat_Mayhem_.

There is no clearly visible common denominator for these. Other than that they try to be funny and that they are pictures. Pictures, that might have some text on them.

A small anecdote regarding the training data I used for the O.G. Slowmeme. There was this bot on Reddit that recognized the captions and meme formats of /r/AdviceAnimals submissions. So, instead of scraping tens of thousands of memes from Reddit and building a text detection pipeline for parsing them, I just downloaded the comments of this bot from Google BigQuery’s Reddit comment dataset. It took like two minutes to come up with the SQL query, and like half a minute for the query to run. So it took me — literally — less than five minutes to gather a decent dataset of all the memes that were relevant back then.

This isn’t the case today. Good luck building a bot that annotates even the quite simple memes shown above. Memes come in these countless different formats, and building a pipeline for generating them isn’t as easy as it was back then.

New Slowmeme

I’m no longer in possession of the Keras models or literally anything else related to the GPU back end doing the actual meme generation. Or if I am, I have no clue where the backups are. And to be fair, the results weren’t so earth-shatteringly good that I shouldn’t just remake the whole thing from scratch.

Which I will, in the coming weeks.

This new god-emperor of AI meme generators will be a lot different than its predecessor. It’s going to be able to continuously learn new templates, and not be limited to the old Advice Animal format. The tricks I use in the process are absolutely not all AI, but I bet it will still be an entertaining journey to the finish line.

The rest of this epic meme saga will be divided into the following topics:

  1. Collecting the data and parsing the meme templates
  2. Generating the memes based on user input
  3. Creating the new web service
The Slowmeme project plan.

The future posts in this series will be a lot more technical than this introductory post, but still written tongue-in-cheek. I do already have working prototypes of most-if-not-all steps, so I’ll be just taking my sweet time to write about the process, pick the best snippets of code, and build the final form of the end result.

If this application of machine learning is interesting to you, subscribe to me or Geek Culture, or bookmark this page. I’ll add links to the topic list above, so you’ll find the rest of the series there.

See you next week!