The DAP Journey: Having Pun!

It’s a Punderful Life — Analytics of the English Language

Tammi Chng
SMUBIA
9 min read · Jun 17, 2019

--

In this Medium series, BIA's Data Associates look back on their learning journeys. This post features an analytics project on puns by Tammi and Wa Thone, supervised by Gabriel.

Introduction

“If you wear cowboy clothes are you ranch dressing?”

The English Language is chock-full of inconsistencies and ambiguities that we often twist into the best/worst cheesy dad jokes. Over this semester, Team Dappity Dap (creative, we know), a two-person team from SMU’s Business Intelligence Analytics Club, decided to make puns the subject of our project.

Tammi, SIS Y1

“When I first signed up for DAP, I had no idea what I was in for, but I’ve never looked back since. Despite the struggles we had with complicated mathematics (dummy variables, cough cough) and a self-chosen project topic that was way beyond what we thought we could do, we had a great mentor and an awesome community that had our backs. In the end, I’ve learned more from this programme than anything else, and I’m excited to continue my journey with my fellow DA members!”

Wa Thone, SIS Y1

“As a year 1 without prior technical background, I wanted to explore each technical field offered by SMU to help me choose my major in the future. So, through a mixture of chance and a desire to explore the analytics / machine learning field, I found out about DAP and decided to try it out!”

Together, we wanted to explore how machines learn to understand our language. In doing so, we aimed to create an algorithm that could detect puns from a bowl of random sentences. With that, let the journey begin:

The Challenge

Puns come in a wide range of flavours, each relying on different factors for humour and wit. With help from our mentor, we created a data pipeline that would help us divide and conquer:

  1. Identify characteristics of puns
  2. Target each characteristic in isolation (if possible)
  3. Develop features
  4. Create a final algorithm that would be an amalgamation of all the different features.

Identifying Characteristics

Puns generally fall into three main categories. Namely:

Meaning: Words with similar, related or double meanings
“My phone has had to wear glasses ever since it lost its contacts.”

Sound: Words that sound similar but mean different things
“How was Rome split in two? With a pair of Caesars.”

Association: Words that are associated with each other

Already, a big challenge was that puns in either the “Meaning” or “Sound” category also often fell in the “Association” category.

For example, in the pun about Rome splitting in two, the ‘ha-ha’ moment comes from the word “Caesars” and how it not only sounds like “scissors”, which can cut things in two, but also how the Caesars are known for creating Rome as it is today. How would a machine understand these associations without the information you and I have about World History? It can’t.

To simplify our problem, we decided to isolate puns that fell almost solely in either the Meaning or Sound category to start — nevertheless, it was near impossible to completely avoid associations in our puns.

The “Punniest” List

Before we could start, we needed a training and test set. We were unable to find any data set on Kaggle that matched our requirements, so we created our own Excel sheet of puns and non-puns (we used random quotes), harvested via some manual legwork and spiders on the web.

In manually combing through our pun list, we discovered that in most cases puns relied on just a few main words to convey their punniness. Hence, we used NLTK’s tokenizing, stopword removal, and lemmatizing features, as well as regex, to filter and strip down each sentence into its core parts, creating our final train and test set.
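For a rough idea of what that preprocessing looked like, here is a minimal sketch using NLTK's tokenizer, stopword list, and WordNetLemmatizer (the helper name and exact regex are illustrative, not our production code):

```python
import re

from nltk import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# Assumes the usual NLTK data has been downloaded:
# nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet')
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))

def strip_to_core(sentence):
    """Strip a sentence down to its core words: lowercase, drop punctuation,
    remove stopwords, and lemmatize what remains."""
    cleaned = re.sub(r"[^a-z\s]", " ", sentence.lower())  # letters only
    tokens = word_tokenize(cleaned)
    return [lemmatizer.lemmatize(tok) for tok in tokens if tok not in stop_words]

print(strip_to_core("My phone has had to wear glasses ever since it lost its contacts."))
# e.g. ['phone', 'wear', 'glass', 'ever', 'since', 'lost', 'contact']
```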

Many, Many, Many Meanings

One of the most comprehensive and widely used Python libraries dealing with the meanings and similarities of words is NLTK’s WordNet Corpus.

WordNet uses Synsets to group synonyms of words based on their meaning
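For instance, a quick look at what WordNet returns for an ambiguous word (a minimal sketch; output trimmed to the first few senses):

```python
from nltk.corpus import wordnet as wn

# A word like 'glass' belongs to many synsets, one per meaning;
# each synset groups the lemmas (synonyms) that share that meaning.
for syn in wn.synsets('glass')[:3]:
    print(syn.name(), '->', syn.definition())
    print('  lemmas:', [lemma.name() for lemma in syn.lemmas()])
```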

Related Meanings

The first idea we had to analyse Meaning puns was to see if we could make use of WordNet’s related words functions (Synonyms, Hyponyms, etc) to connect related words in a pun.

“What do you call a belt with a watch on it? A waist of time.”

The hope was that we would be able to connect the word “belt” with “waist” and “watch” with “time”. In fact, we hypothesized that puns were more likely to have these types of connections than normal sentences.

We experimented with taking all the related words that WordNet provided versus simply using the synonym function, and found that the results almost always ended up the same: not at all promising. The features we had created weren’t as correlated with whether a sentence was a pun as we had hoped.

Regardless, we pressed on.

We used the “related pair count” in a sentence to create a predictor, playing with how sensitive it was — i.e. if a sentence had X number of pairs of related words, the sentence was a pun.
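A minimal sketch of that predictor, assuming WordNet synonym overlap as the notion of "related" and an illustrative threshold:

```python
from itertools import combinations

from nltk.corpus import wordnet as wn

def synonym_set(word):
    """All lemma names that share a synset with the given word."""
    return {lemma.name().lower() for syn in wn.synsets(word) for lemma in syn.lemmas()}

def related_pair_count(tokens):
    """Count pairs of words in a sentence whose synonym sets overlap."""
    return sum(1 for w1, w2 in combinations(set(tokens), 2)
               if synonym_set(w1) & synonym_set(w2))

def predict_pun(tokens, threshold=2):
    # The sensitivity threshold is illustrative; we tuned ours by hand.
    return related_pair_count(tokens) >= threshold

print(related_pair_count(['belt', 'watch', 'waist', 'time']))
```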

A high number of false positives

It turned out that our method was much better at picking out non-puns than puns, but that was still helpful for our purposes.

Word Embeddings

The next things we tried were GloVe, Word2vec, and WordNet similarity checks to help measure how similar the words in a sentence were.

To oversimplify, GloVe and Word2vec essentially use vectors (e.g. (0, 0, 2)) to represent words and, in doing so, provide some interesting mathematical comparisons between different words. One of the most popular and interesting examples is that the vector for “King” − “Man” + “Woman” ≈ “Queen”.
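The classic example can be reproduced with a few lines of gensim, assuming a pre-trained GloVe model downloaded via gensim's data downloader (the model name here is an assumption; any pre-trained vectors illustrate the same point):

```python
import gensim.downloader as api

# Pre-trained GloVe vectors via gensim's downloader.
glove = api.load('glove-wiki-gigaword-100')

# "king" - "man" + "woman" should land near "queen".
print(glove.most_similar(positive=['king', 'woman'], negative=['man'], topn=3))

# The pairwise similarity between words is what we use downstream.
print(glove.similarity('belt', 'waist'), glove.similarity('belt', 'time'))
```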

Not only are vector positions important, so is the distance between different sets of words.

Our hope was that these word embeddings would be able to provide similarly interesting insight into our problem.

First off, word embeddings can be fickle. It’s difficult to get a comprehensible interpretation of numbers in relation to each other; there are just too many different ways to interpret the same data. Therefore, we focused purely on vector distance as our similarity measure.

Another major issue in this approach was that with the number of meanings each word had in the WordNet corpus (the word ‘glass’ has more than 10), it was difficult to get accurate similarity ratings.

To address this, we tried limiting definitions by the part-of-speech tags provided by NLTK’s pos_tag function, to some effect, but it was not sufficient.
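A sketch of that filtering step, mapping pos_tag's Penn Treebank tags to WordNet POS constants (the wn_pos helper is our own illustrative mapping):

```python
from nltk import pos_tag, word_tokenize
from nltk.corpus import wordnet as wn

def wn_pos(treebank_tag):
    """Map a Penn Treebank tag (from pos_tag) to a WordNet POS constant."""
    return {'J': wn.ADJ, 'V': wn.VERB, 'N': wn.NOUN, 'R': wn.ADV}.get(treebank_tag[0])

tokens = word_tokenize("My phone has had to wear glasses ever since it lost its contacts")
for word, tag in pos_tag(tokens):
    pos = wn_pos(tag)
    senses = wn.synsets(word, pos=pos) if pos else wn.synsets(word)
    print(word, tag, len(senses))  # far fewer senses once the POS is pinned down
```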

Instead, the more effective method was the Lesk algorithm — an algorithm that essentially compared the definitions of two words, finding the number of words in common to decide how similar the two words were.

An example of Lesk in action

Lesk, though a simple algorithm, was much more effective than expected and was able to accurately identify which meaning of a word was being used in each context at least 80% of the time.
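NLTK ships an implementation in nltk.wsd, so a minimal disambiguation looks roughly like this (the chosen sense can still vary with the context given):

```python
from nltk import word_tokenize
from nltk.wsd import lesk

sentence = "My phone has had to wear glasses ever since it lost its contacts."
context = word_tokenize(sentence.lower())

# Lesk picks the synset whose definition overlaps most with the surrounding context.
sense = lesk(context, 'contacts', pos='n')
if sense:
    print(sense.name(), '->', sense.definition())
```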

ping!

With this, we were able to get more accurate similarity ratings on our puns — and better features.

Sounds About Right

To tackle puns that use homophones, which are words with similar sound but different meanings, we thought of finding synonyms / related words of the words in the sentence. We hypothesized that if any of these generated words sounds the same as a word in the sentence, it is a pun.

“The pony had a raspy voice. It was hoarse.”

In this case, “hoarse” means the same as “raspy”, but is also related to “pony” as it sounds like “horse”. Thus, if we generate words related to “pony”, one of them will be “horse”, which sounds similar to “hoarse”, indicating its punniness. Ding ding!
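A minimal sketch of this first idea, using WordNet neighbours as the "related words" and the third-party metaphone package for the sound comparison (both choices are assumptions for illustration):

```python
from nltk.corpus import wordnet as wn
from metaphone import doublemetaphone  # third-party 'metaphone' package, assumed installed

def related_words(word):
    """Synonyms plus immediate hypernyms/hyponyms from WordNet."""
    related = set()
    for syn in wn.synsets(word):
        related.update(lemma.name().lower() for lemma in syn.lemmas())
        for neighbour in syn.hypernyms() + syn.hyponyms():
            related.update(lemma.name().lower() for lemma in neighbour.lemmas())
    return related - {word}

def sounds_alike(a, b):
    """Compare primary Double Metaphone codes."""
    return doublemetaphone(a)[0] == doublemetaphone(b)[0]

# Does anything related to 'pony' sound like 'hoarse'? ('horse' should.)
print(any(sounds_alike(rel, 'hoarse') for rel in related_words('pony')))
```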

However, when we tested this hypothesis on a few puns, we found that the related words generated by WordNet usually did not include the synonym we desired.

“Lions eat their prey fresh and roar.”

In this case, our hypothesis requires one of the related words of “fresh” to be “raw”, which sounds similar to “roar”. However, our program was unable to generate that. Increasing the number of related words generated might have solved this, but it risked unintentional sound matches.

“raw” not found in related words of “fresh”

As such, we decided to jump ship and try another method.

The Second Approach

Instead of finding related words and then matching their sounds, as in the previous method, here we do the reverse. We first find the most related / similar pair of words in the sentence using GloVe.

“Religious lions get down to their knees to prey.”

GloVe returns that “lion” and “prey” are the most similar pair of words. Then, we find similar-sounding words for this pair using Double Metaphone as our phonetic library.

Similar sounding words of ‘prey’
Similar sounding words of ‘lion’

We hypothesized that if there is a significant jump in the similarity of meaning between one word in a sentence and another word’s homophone, it indicates a pun. In this case, the original words of the sentence, “religion” and “prey”, are not similar in meaning, but “religion” and “pray” are. This jump in similarity suggests that the sentence is a pun.

Jump in similarity of words

Moving forward, we computed the distribution of similarity jumps across all pairs of words and used its descriptive statistics as features for training our models.
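Putting the pieces together, here is a hedged sketch of the similarity-jump feature. It assumes the GloVe vectors loaded in the earlier sketch, the third-party metaphone package, and a phonetic index built over GloVe's vocabulary as one (assumed) way of finding similar-sounding words:

```python
from itertools import combinations

import numpy as np
from metaphone import doublemetaphone  # third-party, assumed installed

# Index part of the GloVe vocabulary by primary Double Metaphone code so we can
# look up similar-sounding words. `glove` is the KeyedVectors object loaded in
# the earlier word-embedding sketch.
phonetic_index = {}
for word in glove.index_to_key[:50000]:
    if word.isalpha():
        phonetic_index.setdefault(doublemetaphone(word)[0], set()).add(word)

def sound_alikes(word):
    return phonetic_index.get(doublemetaphone(word)[0], set()) - {word}

def similarity_jump(w1, w2):
    """Largest gain in similarity when w2 is swapped for a similar-sounding word."""
    base = glove.similarity(w1, w2)
    best = max((glove.similarity(w1, h) for h in sound_alikes(w2)
                if h in glove.key_to_index), default=base)
    return best - base

def jump_features(tokens):
    """Descriptive statistics of the jumps over all word pairs in a sentence."""
    jumps = [similarity_jump(a, b) for a, b in combinations(tokens, 2)
             if a in glove.key_to_index and b in glove.key_to_index]
    return [np.mean(jumps), np.std(jumps), np.max(jumps)] if jumps else [0.0, 0.0, 0.0]

print(similarity_jump('religious', 'prey'))  # 'prey' -> 'pray' should produce a jump
```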

Confusion matrix of SVC and LDA models categorising puns and non-puns

It turns out the second approach works well in differentiating puns and non-puns that use homophones. We then moved on to combining the models so that the final classifier works for both types of puns tackled above.

Ensembling

We have 3 models trained for puns that play on meaning (LDA, LR, and SVM) and 2 models trained for puns that use homophones (LDA and SVM). Ensemble learning takes the predictions of these 5 models and combines them to issue the final verdict: whether the sentence is a pun. In our final model, a sentence is classified as a pun when at least 4 out of the 5 models predict that it is.
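A minimal sketch of that majority vote, with the five trained models as placeholders exposing a scikit-learn style predict():

```python
def ensemble_predict(feature_vector, models, votes_needed=4):
    """Classify a sentence as a pun only if at least `votes_needed` of the
    individual models agree. `models` would hold our 3 meaning models and
    2 homophone models (placeholders below)."""
    votes = sum(int(model.predict([feature_vector])[0]) for model in models)
    return votes >= votes_needed

# models = [lda_meaning, lr_meaning, svm_meaning, lda_sound, svm_sound]  # placeholders
# print(ensemble_predict(features_for(sentence), models))
```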

Predictions by the final model for both types of puns

Associations

Alas, by the time we had muddled our way through Sound and Meaning puns, we were out of time to focus purely on solving the Association pun problem. We did, however, have some ideas that we were able to look into on a cursory level.

For one, we attempted using NetworkX. In time, we might have been able to glean more from this powerful library. But for now, we counted the number of nodes (red dots) and edges (lines connecting nodes) in each sentence; perhaps if a sentence had more nodes and edges, it had a higher likelihood of being a pun, or so we theorized.
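A sketch of that counting idea, building the graph from WordNet synonym overlaps (just one assumed way to define "association") and reading off the node and edge counts:

```python
from itertools import combinations

import networkx as nx
from nltk.corpus import wordnet as wn

def sentence_graph(tokens):
    """One node per word; an edge wherever two words' WordNet synonym sets
    overlap (one possible, assumed, notion of 'association')."""
    lemmas = {t: {l.name().lower() for s in wn.synsets(t) for l in s.lemmas()}
              for t in set(tokens)}
    graph = nx.Graph()
    graph.add_nodes_from(lemmas)
    for w1, w2 in combinations(lemmas, 2):
        if lemmas[w1] & lemmas[w2]:
            graph.add_edge(w1, w2)
    return graph

g = sentence_graph(['belt', 'watch', 'waist', 'time'])
print(g.number_of_nodes(), g.number_of_edges())  # node and edge counts as features
```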

A network created using NetworkX

The Final Count(down)

By now, we had about 17 different features that each told a slightly different story about each sentence/pun. By combining them through ensemble learning, we were able to achieve a final accuracy of around 82%!

In the end, while the results we got were above our expectations, there are still a lot of aspects of the pun that we would have liked to explore: namely, a deeper dive into tackling Association puns, perhaps with NetworkX in hand. We also discovered that Recurrent Neural Networks have been used to good effect in the field of pun detection. Hopefully, we will be able to explore these tools in the future!

DAPPITY DAP

The End!
