Meme Generator (MemeGen) Using Deep Learning

MemeGen is a web application that automatically generates memes based on the facial expression in a given image. All the user has to do is upload a clear picture; the application analyzes the expression of the person in the image and produces amusing memes. To use the web application, click here.

Here are a few images generated by MemeGen:

[Images: sample memes generated by MemeGen]

How it works:

  • Undersampling
  • Oversampling
  • Image Augmentation
  • Convolutional Neural Networks
  • Auto ML

Unsurprisingly, a vast image dataset plays its part behind the scenes, and working with image datasets is where CNNs become important. If you are new to the term, keep reading!

Balanced data:

[Image: an animated illustration of what balanced data might look like]

So what happens if the classes aren’t equally represented?

[Image: Bet you don’t want your model in his place!]

Now the question is, how do we overcome this problem?

There are two feasible solutions: oversampling and undersampling. In simple words,

  • If we reduce the number of majority-class samples to match the minority class, the total number of samples shrinks; this process is called undersampling.
  • If we do not want to lose any of the quality data we have collected, we can instead increase the number of samples in the class that has less data, which increases the total number of samples; this is termed oversampling.

In short, oversampling and undersampling are data analysis techniques used to adjust the class distribution of a data set.
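To make the two techniques concrete, here is a minimal sketch of random undersampling and random oversampling in plain Python. It is not MemeGen's actual code: the function names, the file names, and the choice of random duplication for oversampling are assumptions made for illustration only.

```python
import random
from collections import Counter, defaultdict


def _group_by_label(samples, labels):
    """Collect the samples that belong to each class label."""
    groups = defaultdict(list)
    for sample, label in zip(samples, labels):
        groups[label].append(sample)
    return groups


def undersample(samples, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = min(len(items) for items in groups.values())
    out_samples, out_labels = [], []
    for label, items in groups.items():
        for sample in rng.sample(items, target):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def oversample(samples, labels, seed=0):
    """Randomly duplicate samples so every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = max(len(items) for items in groups.values())
    out_samples, out_labels = [], []
    for label, items in groups.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy data: 6 "happy" images but only 2 "sad" ones (hypothetical file names).
images = ["h1.jpg", "h2.jpg", "h3.jpg", "h4.jpg", "h5.jpg", "h6.jpg",
          "s1.jpg", "s2.jpg"]
labels = ["happy"] * 6 + ["sad"] * 2

under_images, under_labels = undersample(images, labels)
over_images, over_labels = oversample(images, labels)

print(Counter(under_labels))  # each class now has 2 samples
print(Counter(over_labels))   # each class now has 6 samples
```

Undersampling throws information away, while oversampling by duplication repeats the same images, which is one reason it is often paired with image augmentation.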

Image Augmentation:

[Image: getting 6 different images out of one]
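As a rough illustration of how one image can become six, the sketch below applies flips and 90-degree rotations to a tiny image stored as a nested list of pixel values. The article does not say which transformations MemeGen uses, so the particular set chosen here (and every function name) is an assumption; real pipelines typically also use crops, zooms, shifts, and brightness changes.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom."""
    return [row[:] for row in image[::-1]]


def rotate_90(image):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image):
    """Return six variants: the original, two flips, and three rotations."""
    r90 = rotate_90(image)
    r180 = rotate_90(r90)
    r270 = rotate_90(r180)
    return [image, flip_horizontal(image), flip_vertical(image), r90, r180, r270]


# A tiny 2x3 grayscale "image" with made-up pixel values.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

variants = augment(image)
for variant in variants:
    print(variant)
```

Each variant keeps the same label as the original image, so the dataset grows without any new photos being collected.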

Now that we have enough data and are ready to build a model, which approach do you think would give us the best output? The combination of deep learning and images immediately rings a bell: Convolutional Neural Networks!

A CNN is a type of Artificial Neural Network (ANN); other types include MLPs, RNNs, shallow neural networks, and sequence-to-sequence models. ANNs are inspired by the way the biological nervous system processes information. An ANN is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve a specific problem. The MemeGen model is based on a CNN, so let’s see in brief what exactly it does.

Convolutional Neural Networks (CNN):

CNN can be understood by clearly associating it with two crucial elements:

1. Convolutional Layer
2. Kernel K

The convolutional layer is the core building block of a CNN, and it helps with feature detection. Kernel K is a set of learnable filters; each filter is spatially small compared to the image but extends through the full depth of the input image.

Here is a very satisfying metaphor we found on the Internet about how CNN works:

If you were a detective and came across a large image or picture in the dark, how would you identify it? You would use a flashlight and scan across the entire image. This is exactly what we do in the convolutional layer. Kernel K, which is a feature detector, is equivalent to the flashlight on the image I; we use it to detect features and create multiple feature maps that help us identify or classify the image. We have multiple feature detectors to help with things like edge detection and identifying different shapes, bends, or colors.
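The flashlight metaphor can be sketched in a few lines of plain Python: a small kernel slides over the image, and at each position the overlapping values are multiplied and summed to produce one entry of the feature map. This is a simplified, single-channel sketch with no padding or stride; the image, the hand-written edge kernel, and the function name are assumptions for illustration (in a real CNN the kernel values are learned).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and return the resulting feature map."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    feature_map = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            # Multiply the kernel with the patch under the "flashlight" and sum.
            total = 0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# A 4x4 image that is dark (0) on the left and bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# A kernel that responds to dark-to-bright transitions from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)  # each row is [0, 2, 0]: the edge is found in the middle column
```

The feature map is zero wherever the image is flat and peaks where the dark and bright regions meet, which is what it means for a kernel to act as a feature detector.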

CNNs have several advantages over conventional image classification methods. One of them is translational invariance, which means that a CNN can typically still identify an object when it is shifted to a different position in the image. Robustness to rotation or slight deformation is not built in to the same degree and is usually learned from varied training data, for example through image augmentation.

This was just a brief overview of CNNs. It seems simple but, in reality, involves a lot of mathematical calculation and an understanding of neural networks. If you want to know more about CNN, click here.

Auto ML:

[Image: depicting how millions of parameters are tried out by models automatically to find the best values]

In our case, we used 3 different datasets to get the best results.

  • First dataset: Japanese faces
  • Second dataset: Pictures of faces of Indian actors
  • Third dataset: Images of faces of famous personalities

Here is an example of how it works:

[Image: example results from models A, B, and C]

A point to notice here is that model A gives good precision for the Sad expression, being correct about 60% of the time, whereas model C gives inadequate precision for Sad faces: it often predicts some other expression and is right only approximately 25% of the time. Hence we assign weights to the models for each class (expression) to get the best possible results.

For example, for the Sad expression, the weight of model A would be 0.8, and those of B and C would be 0.3 and 0.1, respectively.

This ensures that all the existing models play their part appropriately and provide the most accurate results.
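One way such per-class weighting could be combined is sketched below. Only the Sad weights (0.8, 0.3, 0.1) come from the example above; the Happy weights, the model outputs, the normalization step, and the function name are hypothetical, and MemeGen's actual combination rule may differ.

```python
# Per-class weight of each model. The "sad" weights follow the example in the
# text; the "happy" weights are made up for this illustration.
weights = {
    "A": {"happy": 0.3, "sad": 0.8},
    "B": {"happy": 0.3, "sad": 0.3},
    "C": {"happy": 0.4, "sad": 0.1},
}

# Hypothetical probabilities predicted by each model for one input face.
predictions = {
    "A": {"happy": 0.2, "sad": 0.8},
    "B": {"happy": 0.5, "sad": 0.5},
    "C": {"happy": 0.7, "sad": 0.3},
}


def weighted_vote(predictions, weights):
    """Combine model outputs with per-class weights and pick the best class."""
    classes = next(iter(predictions.values())).keys()
    scores = {}
    for cls in classes:
        weighted_sum = sum(weights[m][cls] * predictions[m][cls] for m in predictions)
        weight_total = sum(weights[m][cls] for m in predictions)
        # Normalize so classes with larger total weight are not favored.
        scores[cls] = weighted_sum / weight_total
    best = max(scores, key=scores.get)
    return best, scores


label, scores = weighted_vote(predictions, weights)
print(label)   # the expression with the highest weighted score
print(scores)
```

Here model C leans toward Happy, but because it carries little weight for Sad and model A carries a lot, the combined score favors Sad.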

Other technologies used in MemeGen:

1. Flask as the web framework
2. Chef as the configuration management tool
3. Google Cloud Platform as the cloud service

To learn what these tools are and what they do, please visit our GitHub repository “COVID19 Feedback Application,” where we have explained them in detail.

The full code can be found on GitHub here.

Conclusion:

Thanks to Towards AI Team

Written by Ritheesh Baradwaj Yellenki

Machine Learning | DevOps. I am a technology lover. I love building new applications, starting from things that I have learned and that I come across in my daily life.

Towards AI

Towards AI is the world’s leading multidisciplinary science publication. Towards AI publishes the best of tech, science, and engineering. Read by thought-leaders and decision-makers around the world.
