The Dad Joke Generator
Automatically generating bad jokes.
(If you’re just here for the generated jokes, scroll down to the bottom!)
I thought: what is super easy to build, yet would still get an approving chuckle if someone found it on my GitHub page? Obviously, a Dad Joke generator.
The first thing I did was think: how exactly do dad jokes work? Let’s make up a few random dad jokes (these are pretty good, right?):
What do you call a greedy animal living in the sea?
A selfish.

Which Australian animal is known for its southern hospitality?
A koalabama.

What do you call a freezing, chicken-like creature?
A cold turkey.
Alright, so let’s try to create three types of jokes that follow this pattern.
Part 1: The dataset
I first tried to create a dataset of words and their definitions. My first instinct was to take a dictionary, but dictionary results are much too technical and, frankly, confusing. Look at this example for fish:
fish
any of various cold-blooded, aquatic vertebrates, having gills, commonly fins, and typically an elongated body covered with scales.
(loosely) any of various other aquatic animals.
the flesh of fishes used as food.
(Informal) a person.
a long strip of wood, iron, etc., used to strengthen a mast, joint, etc.
Alright, so how about Wikipedia?
Fish are gill-bearing aquatic craniate animals that lack limbs with digits
Okay, so then my “selfish” joke would become something like:
What is greedy and are gill-bearing aquatic craniate animals that lack limbs with digits?
A sel-fish
That’s still a bit too technical for me. So where could I find descriptions of words that even a child would understand? Well, by using a children’s dictionary of course!
Fish: an animal that lives in water and has fins for swimming and gills for breathing
That’s just perfect! That would turn my joke into:
What is greedy and an animal that lives in water and has fins for swimming and gills for breathing?
A sel-fish
Alright, that’s good enough. I created a dataset of words by scraping about 2000 words from that children’s dictionary to generate something like this:
length:noun:the distance from one end of a thing to the other.
good:adjective:having qualities that are desired.
dog:noun:a furry animal with four legs, a pointed nose, and a tail.
prisoner:noun:a person who is held in a jail or prison while on trial or after being sentenced for a crime.
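Here is a minimal sketch of how lines in that format could be loaded into Python. The names `Entry` and `parse_entries` are made up for this illustration and are not taken from the original code. The sketch assumes one entry per line and splits on the first two colons only, since a definition might contain a colon itself.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    word: str
    pos: str          # part of speech, e.g. "noun"
    definition: str


def parse_entries(lines: List[str]) -> List[Entry]:
    """Parse 'word:part-of-speech:definition' lines into Entry tuples."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first two colons only; a definition may contain colons.
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip malformed lines
        word, pos, definition = parts
        # Drop the trailing period so the text can be embedded in a question.
        entries.append(Entry(word.lower(), pos, definition.rstrip(".")))
    return entries


sample = [
    "good:adjective:having qualities that are desired.",
    "dog:noun:a furry animal with four legs, a pointed nose, and a tail.",
]
entries = parse_entries(sample)
# The loops in this post iterate over (word, description) pairs.
words = [(e.word, e.definition) for e in entries]
print(words[0])
```

Keeping the part of speech around costs nothing here and would make it easy to restrict jokes to nouns later.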
Part 2: Catching a sel-fish
Okay, so now that we have a dataset, the first type of joke was pretty simple: look at all possible pairs of words and find the pairs where one word starts or ends with the other.
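for word1, desc1 in words:
    for word2, desc2 in words:
        # word1 must sit at the start or end of a different word, e.g. "leg" in "legislature"
        if word1 != word2 and (word2.startswith(word1) or word2.endswith(word1)):
            print(f"What do you call a kind of {word1} that is {desc2}?")
            print(f"A {word2}")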
Which generated something like this:
What do you call a kind of leg that is a group of people within a government that has the power to make or change laws?
A legislature
This wasn’t really working for me, and I also suddenly realized that joke 1 and joke 2 actually followed the same pattern: overlap between the words (whether whole or partial). So instead I wrote something that combined both.
Part 3: Going to koalabama
Alright, so for both “koalabama” and “selfish”, the key is that the two words overlap. The only difference is that the word “fish” is contained entirely within “selfish”, whereas “koala” and “alabama” each have characters that are not in the other word.
Regardless, I rewrote the logic a little bit to something like this:
for word1, desc1 in words:
    for word2, desc2 in words:
        if overlap(word1, word2):
            print(f"What do you call a kind of {word1} that is {desc2}?")
            print(f"a {combine(word1, word2)}")
Which yielded this:
What do you call a kind of energy that is a tiny section of a chromosome?
A genenergy

What do you call a kind of tail that is a small item; a particular?
A detail

What do you call a kind of anniversary that is a polite and honorable man?
A gentlemanniversary

What do you call a kind of era that is a play in which all or most of the words are sung and the music is played by an orchestra?
An opera
Alright, that’s working! The jokes are pretty lame, but that’s what dad jokes are supposed to be.
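The loop in this part leans on two helpers, `overlap` and `combine`, that are not shown in the post. Below is one way they might look. This is a guess and not the original implementation. It assumes that `word2` supplies the start of the blend and `word1` the end (as in gentleman + anniversary), that the shared part must be at least two characters long, and that the longest shared part wins. The helper `overlap_size` and the `min_size` parameter are names invented for this sketch.

```python
def overlap_size(word1: str, word2: str, min_size: int = 2) -> int:
    """Length of the longest ending of word2 that is also a beginning of word1.

    Returns 0 when the shared part is shorter than min_size characters.
    """
    for size in range(min(len(word1), len(word2)), min_size - 1, -1):
        if word2.endswith(word1[:size]):
            return size
    return 0


def overlap(word1: str, word2: str) -> bool:
    """True if the two distinct words can be blended, with word2 in front."""
    return word1 != word2 and overlap_size(word1, word2) > 0


def combine(word1: str, word2: str) -> str:
    """Blend the words by writing the shared part only once."""
    return word2 + word1[overlap_size(word1, word2):]


print(combine("alabama", "koala"))   # koalabama
print(combine("fish", "selfish"))    # selfish
print(overlap("dog", "koala"))       # False
```

Because this version always picks the longest shared part, it will not reproduce every blend listed in the post: gene + energy comes out as "genergy" here and not "genenergy", so the original code presumably chooses the overlap differently.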
Part 4: Going cold turkey
So that's two of our three jokes tackled; let's tackle the third one. The funny thing about a “cold turkey” is that, aside from being “cold” and a “turkey”, it’s also something else entirely. So what we’re looking for is a set of words which together have another, third meaning.
Given our limited dataset, let’s see what words we can combine to find a third word that is also in the data:
cat + fish = catfish
door + way = doorway
life + time = lifetime
share + holder = shareholder
frame + work = framework
cup + board = cupboard
fire + wood = firewood
mess + age = message
pass + age = passage
work + shop = workshop
percent + age = percentage
bed + room = bedroom
car + pet = carpet
rail + way = railway
cover + age = coverage
class + room = classroom
land + lord = landlord
sea + lion = sea lion
sea + son = season
blue + whale = blue whale
birth + day = birthday
rain + forest = rain forest
bath + room = bathroom
week + end = weekend
Okay, not so many combinations there, but we can create jokes with them, and increase the size of the dataset later. This is the pseudocode that I used to create the next jokes:
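# words is a list of (word, description) pairs, so collect the bare words for lookups
known_words = {word for word, _ in words}
for word1, desc1 in words:
    for word2, desc2 in words:
        for mix in [word1 + word2, word1 + "-" + word2, word1 + " " + word2]:
            if mix in known_words:
                # combined_desc is a helper that merges the two descriptions into one phrase
                print(f"What do you call {combined_desc(desc1, desc2)}?")
                print(f"a {mix}")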
Which resulted in:
What do you call a small, furry mammal with whiskers, short ears, and a long tail that lives in water and has fins for swimming and gills for breathing?
A catfish

What do you call an opening through which one enters or leaves a room or building and a road or path leading from one place to another?
A doorway

What do you call the state of being that can never be turned back?
A lifetime
The catfish one is funny to me, but the others are not. Still, it works pretty well.
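The pseudocode in this part calls `combined_desc` without showing it. Here is a self-contained sketch of the whole step with one possible version of that helper. The heuristic is an assumption inferred from the catfish and doorway outputs: if the second description contains " that ", only the clause after it is kept, otherwise the two descriptions are joined with "and". The function `compound_jokes` and the placeholder catfish definition are made up for this example.

```python
def combined_desc(desc1: str, desc2: str) -> str:
    """Merge two definitions into one phrase (a guessed heuristic)."""
    head, sep, rest = desc2.partition(" that ")
    if sep:
        # Keep only the clause after "that": "an animal that lives in water"
        # contributes "that lives in water".
        return f"{desc1} that {rest}"
    return f"{desc1} and {desc2}"


def compound_jokes(words):
    """Yield (question, answer) for word pairs whose joined form is a known word."""
    known = {word for word, _ in words}  # a set gives fast membership tests
    for word1, desc1 in words:
        for word2, desc2 in words:
            for mix in (word1 + word2, word1 + "-" + word2, word1 + " " + word2):
                if mix in known:
                    question = f"What do you call {combined_desc(desc1, desc2)}?"
                    yield question, f"A {mix}"


words = [
    ("cat", "a small, furry mammal with whiskers, short ears, and a long tail"),
    ("fish", "an animal that lives in water and has fins for swimming and gills for breathing"),
    ("catfish", "placeholder definition, not taken from the dataset"),
]

for question, answer in compound_jokes(words):
    print(question)
    print(answer)
```

With these three entries the only match is cat + fish, which reproduces the catfish joke above. A real run would also want the "a"/"an" choice handled, which this sketch skips.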
Part 5: Forced chuckles
Alright, so putting everything together, let’s generate a bunch of dad jokes:
What do you call a kind of infant that is a soft, light gray metal that is one of the chemical elements?
A tinfant (tin+infant)

What do you call a kind of attendance that is a white or yellow oily substance found in some parts of animals or plants?
A fattendance (fat+attendance)

What do you call the color of a clear sky which lives in the water?
A blue whale (blue+whale)

What do you call a kind of airplane that is to put in good condition again after damage has been done; fix?
A repairplane (repair+airplane)

What do you call a kind of mountain that is measure; quantity?
An amountain (amount+mountain)

What do you call the solid part of the earth's surface and a person who rules?
A landlord (land+lord)

What do you call a kind of plaintiff that is an act of complaining?
A complaintiff (complaint+plaintiff)

What do you call an automobile and a tame animal people keep in their homes as a companion or for pleasure?
A carpet (car+pet)

What do you call a kind of intervention that is the season of the year between autumn and spring?
A wintervention (winter+intervention)

What do you call a kind of cancer that is great value; importance?
A significancer (significance+cancer)

What do you call a kind of spectrum that is the state or condition of being thought of with honor or admiration; such admiration itself?
A respectrum (respect+spectrum)
I mean, they’re not as good as my own jokes, but they’re pretty close ;)
Next steps could include:
— making use of the part-of-speech of a word, such as noun or verb.
— expanding the dataset with more words.
— setting up a simple website that shows random dad jokes.
— applying NLP to turn sentences like “What do you call an automobile and a tame animal people keep in their homes as a companion or for pleasure?” into “What do you call an automobile which people keep in their homes as a companion or for pleasure?”
You can find the code (and all 1710 generated jokes) here: https://github.com/paulluuk/DadJokes