The Dad Joke Generator

Automatically generating bad jokes.

Paul-luuk Profijt
The Startup
6 min read · Aug 12, 2020

Which Australian animal is known for its southern hospitality?

(If you’re just here for the generated jokes, scroll down to the bottom!)

I thought: what is super easy to build, yet would still get an approving chuckle if someone found it on my GitHub page? Obviously, a dad joke generator.

The first thing I did was think: how exactly do dad jokes work? Let’s make up a few random dad jokes (these are pretty good, right?):

Alright, so let’s try to create three types of jokes that follow this pattern.

Part 1: The dataset

I first tried to create a dataset of words and their definitions. My first instinct was to take a dictionary, but dictionary definitions are much too technical and, frankly, confusing. Look at this example for fish:

fish
any of various cold-blooded, aquatic vertebrates, having gills, commonly fins, and typically an elongated body covered with scales.
(loosely) any of various other aquatic animals.
the flesh of fishes used as food.
(Informal) a person.
a long strip of wood, iron, etc., used to strengthen a mast, joint, etc.

Alright, so how about Wikipedia?

Fish are gill-bearing aquatic craniate animals that lack limbs with digits

Okay, so then my “selfish” joke would become something like:

That’s still a bit too technical for me. So where could I find descriptions of words that even a child would understand? Well, by using a children’s dictionary of course!

Fish: an animal that lives in water and has fins for swimming and gills for breathing

That’s just perfect! That would turn my joke into:

Alright, that’s good enough. I created a dataset of words by scraping about 2000 words from that children’s dictionary to generate something like this:

What do you call a greedy animal living in the sea?

Part 2: Catching a sel-fish

Okay, so now that we have a dataset, the first type of joke is pretty simple: looking at all possible pairs of words, find the pairs where one word starts or ends with the other.
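A minimal sketch of that pairwise check, assuming the dataset boils down to a list of lowercase words. The function name `containment_pairs` and the toy word list are made up for illustration and are not taken from the repository:

```python
from itertools import permutations


def containment_pairs(words):
    """Return (short, long) pairs where `long` starts or ends with `short`."""
    pairs = []
    # permutations() yields every ordered pair, so each word gets a turn as `short`.
    for short, long_word in permutations(words, 2):
        if len(short) >= len(long_word):
            continue  # only a strictly shorter word can be a proper prefix or suffix
        if long_word.startswith(short) or long_word.endswith(short):
            pairs.append((short, long_word))
    return pairs


# Toy word list (hypothetical; the real dataset has about 2000 entries).
words = ["fish", "selfish", "car", "carpet", "pet"]
print(containment_pairs(words))
```

On the toy list this prints `[('fish', 'selfish'), ('car', 'carpet'), ('pet', 'carpet')]`. Checking every ordered pair is quadratic, but for a couple of thousand words that is only a few million cheap string comparisons.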

Which generated something like this:

This wasn’t really working for me, and I also suddenly realized that joke 1 and joke 2 actually followed the same pattern: overlap between the words (whether whole or partial). So instead I wrote something that combined both.

Part 3: Going to koalabama

Alright, so both for “koalabama” and for “selfish”, the key is that the two words share an overlapping part. The only difference is that for selfish, the word “fish” is contained entirely within the word “selfish”, whereas “koala” and “alabama” each have characters that are not in the other word.
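One way to treat both cases the same, as a rough sketch: look for the longest chunk that ends the first word and starts the second, then glue the words together on that chunk. The name `overlap_merge` and the `min_overlap=2` threshold are assumptions for illustration, not values from the original code:

```python
def overlap_merge(left, right, min_overlap=2):
    """Blend two words on a shared chunk: the end of `left` must match the start of `right`.

    Returns the blended word, or None if the shared chunk is shorter than `min_overlap`.
    """
    max_size = min(len(left), len(right))
    # Try the longest possible overlap first, then shrink.
    for size in range(max_size, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


print(overlap_merge("koala", "alabama"))  # partial overlap on "ala"
print(overlap_merge("selfish", "fish"))   # "fish" sits entirely inside "selfish"
print(overlap_merge("cat", "dog"))        # no shared chunk
```

This prints `koalabama`, `selfish` and `None`. When one word is wholly contained at the edge of the other, the blend is simply the longer word, so the "selfish" pattern falls out as a special case of the "koalabama" pattern.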

Regardless, I rewrote the logic a little bit to something like this:

Which yielded this:

Alright, that’s working! The jokes are pretty lame, but that’s what dad jokes are supposed to be.

What do you call a freezing, chicken-like creature?

Part 4: Going cold turkey

So that's the first two joke types tackled; let's take on the third one. The funny thing about a “cold turkey” is that, aside from being “cold” and a “turkey”, it’s also something else entirely. So what we’re looking for is a set of words which together have another, third meaning.
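A minimal sketch of that search, assuming the combined word is written without a space (like “catfish”) and is itself an entry in the dataset; a spaced phrase like “cold turkey” would need phrase entries, which this sketch does not handle. The function name `find_compounds` and the toy word list are hypothetical:

```python
def find_compounds(words):
    """Return (first, second, compound) triples where first + second is itself a known word."""
    known = set(words)
    triples = []
    for compound in sorted(known):
        # Try every split point; both halves must be words in the dataset.
        for i in range(1, len(compound)):
            first, second = compound[:i], compound[i:]
            if first in known and second in known:
                triples.append((first, second, compound))
    return triples


# Toy word list for illustration only.
words = ["cat", "fish", "catfish", "car", "pet", "carpet"]
print(find_compounds(words))
```

On the toy list this prints `[('car', 'pet', 'carpet'), ('cat', 'fish', 'catfish')]`. Splitting each word and looking up both halves in a set avoids comparing every pair of words against every third word.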

Given our limited dataset, let’s see which words we can combine to form a third word that is also in the data:

Okay, not so many combinations there, but we can create jokes with them, and increase the size of the dataset later. This is the pseudocode that I used to create the next jokes:

Which resulted in:

The catfish one is funny to me, but the others are not. Still, it works pretty well.

Part 5: Forced chuckles

Alright, so putting everything together, let’s generate a bunch of dad jokes:

I mean, they’re not as good as my own jokes, but they’re pretty close ;)
Next steps could include:
— making use of the part-of-speech of a word, such as noun or verb.
— expanding the dataset with more words.
— setting up a simple website that shows random dad jokes.
— applying NLP to turn sentences like “What do you call an automobile and a tame animal people keep in their homes as a companion or for pleasure?” into “What do you call an automobile which people keep in their homes as a companion or for pleasure?”
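For context on that last item: judging by the quoted example, the question seems to be built by joining the two definitions with "and". A minimal sketch of such a template, with the function name `compound_question` being an assumption rather than the repository's actual code:

```python
def compound_question(first_definition, second_definition):
    """Join two definitions into a question, mirroring the example quoted above."""
    return f"What do you call {first_definition} and {second_definition}?"


print(compound_question(
    "an automobile",
    "a tame animal people keep in their homes as a companion or for pleasure",
))
```

Because the definitions are pasted in verbatim, the question reads as two separate things; the NLP step would be what fuses them into a single description.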

You can find the code (and all 1710 generated jokes) here: https://github.com/paulluuk/DadJokes

Paul-luuk Profijt
The Startup

Data Engineer; Competitive Programmer; Far Leftist