Project 1: Say Cheese

Published in

Camping with python

8 min read · Jul 9, 2019

Hey buddy, as I said earlier, here is my very first project for you. You can do it yourself and gain some knowledge about how a computer processes an image, and it's also quite a simple task to do.

Well, I have faced a problem while capturing a photograph or taking a selfie: many of us are bad at showing our teeth at the right moment of the shot. I think most of you have faced this problem too, especially boys 😎.

Most smartphones already have the ability to capture a photo when you smile. But we are here to learn and explore, so let's think about this problem from scratch. Can we automate the process of clicking photos? Can we make the camera capture our image when we smile? Yes. So, you have probably got the idea of the project.

Idea: Write a Python program that captures your image when you smile.

Note: “We are not building anything state of the art (but we will try). We are here to learn something and gain experience of working on a project. Many developers will say that it is too easy. Yes, it is, and it will be in the future. The thing is that you are not my audience.”

The focus of this project and future projects is not spoon-feeding. You can find the required code every time you google, but you need to learn how to read and understand documentation, Stack Overflow answers, gists, blogs, etc. I will not explain the whole thing here. You need to google many terms and understand them by yourself. I will also provide the necessary links for reference.

Now, as we proceed with the project, we need to list all the sub-tasks. If you are working on a fairly big project, it may feel impossible at first, but you need to break it down into sub-tasks and then proceed step by step.

So, list down all the things you have in your mind about this (DIY):

Step 1: How to access the camera in Python?

Step 2: How does a digital image work? And the same for video.

Step 3: How to access the frames (which are just images) of the video?

Step 4: How to find the location of faces in the image?

Step 5: How to find the mouth on the face?

Step 6: How to detect whether the person is smiling?

Step 7: How to save the image in the storage?

Let’s do it

At the beginning of every project, I recommend creating a virtual environment for it and working inside that space.

Step 1: How to access the camera in Python? I hope you googled it before scrolling.

Note: Google it yourself first, then proceed with the task.

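There is an awesome library called OpenCV. First, install OpenCV using pip. The opencv-python package is enough for this project. The opencv-contrib-python package bundles the same main modules plus extras, and the two should not be installed side by side in one environment, so pick only one.

(venv)$ pip install opencv-python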

OpenCV has a class called VideoCapture which does the heavy lifting for us. We pass it an integer index such as 0, 1, 2, … to select a camera (0 is usually your webcam), or we pass the path of a video file to process that particular file instead.

# importing the library
import cv2

cap = cv2.VideoCapture(0)  # creating the object

You can now access the camera with the cap object we have just created.

Step 2: How does a digital image work? And the same for video.

This part is theoretical, and I think you can understand it on your own using Google as a tool.

About Computer Vision : [Link]
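To give you a starting point, here is a tiny sketch of how an image looks to the computer: just a grid of numbers. The 2x3 image and the simple averaging used for grayscale are my own simplifications (OpenCV uses a weighted formula for grayscale), so treat this as an illustration only.

```python
import numpy as np

# A tiny colour image: 2 rows (height), 3 columns (width), 3 channels.
# OpenCV keeps the channels in blue, green, red (BGR) order as 8-bit integers.
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 0] = (255, 0, 0)  # top-left pixel: pure blue in BGR
image[1, 2] = (0, 0, 255)  # bottom-right pixel: pure red in BGR

height, width, channels = image.shape
print("height, width, channels:", height, width, channels)

# A grayscale image has no channel axis: one brightness value per pixel.
# Plain averaging is a simplification used only for this illustration.
gray = image.mean(axis=2).astype(np.uint8)
print("gray shape:", gray.shape)

# A video is a sequence of such frames shown quickly one after another.
video = np.stack([image, image, image])  # three identical frames
print("video shape:", video.shape)
```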

Step 3: How to access the frames (which are just images) of the video?

Now comes the fun part where you can see your face using your code. There is a method called read in the VideoCapture class. Each call returns a success flag together with the next frame of the video.

OpenCV also has an imshow function, which displays an image or frame in a window.

Using these two functions, we can make our mirror camera.

Note: Try to write the code on your own.

Run the script and see the result.

Now, here is a real challenge for you. Search each keyword on Google and try to explain it to yourself in your own words. Also, write a brief comment in the code for each.

Reference for line number 13 : [Link]

Step 4: How to find the location of faces in the image?

Congratulations on getting this far in the project; most people don't even start. So, congrats!!!

So far, we have the video feed showing up in a window.

But the question is: how do we find the location of faces in the image? Surfing the internet, you will find many terms like Haar Cascade, HOG (Histogram of Oriented Gradients), CNN (Convolutional Neural Network), the Viola-Jones algorithm, etc.

I cannot go through all these algorithms and methodologies in this article but will try to cover most of them in the future.

Better explained here: [link] by Maël Fabien. Highly recommended.

We will be using the HOG (Histogram of Oriented Gradients) method to detect faces. A HOG-based face detector is available in the library called Dlib.

Paper: https://lear.inrialpes.fr/people/triggs/pubs/Dalal-cvpr05.pdf You can go through this if you are interested in how HOG works in detail.

Recommended Blog : [link] By Adam Geitgey

Some theory: The idea behind HOG is to compute the gradients of the image, use them as a feature vector, and feed that into a classification algorithm.

First, we convert our image to black and white (because you can still see faces without colors). Then we look at every single pixel in the image, one at a time. For every pixel, we look at its neighboring pixels.

The task is to find out how dark the current pixel is compared to its neighbors. We can then draw an arrow showing the direction in which the image is getting darker.

By repeating the process for each pixel in the image, breaking the image into small squares of 16x16 pixels, and counting how many gradients point in each direction within each square, we end up with a simple pattern that captures the basic structure of the image.

Now that we have the pattern for our image, we can compare it with the HOG pattern generated from lots of face images.
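The counting step can be sketched for a single 16x16 square as below. This is a simplified illustration, not what dlib does internally: the name cell_histogram, the choice of 9 direction bins, and leaving out the normalization steps of the paper are all my assumptions.

```python
import numpy as np


def cell_histogram(cell, bins=9):
    """Histogram of gradient directions for one grayscale square.

    Each pixel votes for the bin of its gradient direction (0 to 180
    degrees), and the vote is weighted by how strong the gradient is.
    """
    cell = cell.astype(np.float64)
    gy, gx = np.gradient(cell)        # brightness change along rows, columns
    magnitude = np.hypot(gx, gy)      # how strong the change is
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0  # direction, sign ignored
    hist, _ = np.histogram(
        angle, bins=bins, range=(0.0, 180.0), weights=magnitude
    )
    return hist


if __name__ == "__main__":
    # A square that gets brighter from left to right: all arrows point the
    # same way, so one bin should collect every vote.
    ramp = np.tile(np.arange(16), (16, 1))
    print(cell_histogram(ramp))
```

A real detector computes one such histogram per square, joins them into one long feature vector, and hands that to a classifier.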

Time to code:

Note: Search the internet to find out how to use dlib for face detection.

Install dlib:

(venv)$ pip install dlib

First, we need to collect the coordinates of the faces; then we draw a rectangle over each face so that we can visualize it.

Run this script and see the result.

Step 5: How to find the mouth on the face?

Reference: https://www.pyimagesearch.com/2017/04/10/detect-eyes-nose-lips-jaw-dlib-opencv-python/ Give it a read.

We can find facial landmarks for the nose, lips, eyes, etc. by using the dlib library. The model inside the predictor is trained on the dataset [Link]. Below you can see the 68 coordinates of the facial map.

We need to grab the lip coordinates and choose a measure by which we can decide whether the person is smiling or not. We can extract the lip coordinates using dlib.shape_predictor("shape_predictor_68_face_landmarks.dat"). Now, let's use this in our code.

Download and extract this shape_predictor_68_face_landmarks.dat file : [Link]

Install imutils to convert the predictor's output to a NumPy array.

(venv)$ pip install imutils

First, apply the predictor to the region of interest, which is the face rectangle found in the previous step. Then use imutils.face_utils.shape_to_np() to convert the dlib.full_object_detection object to a NumPy array. The coordinates of the mouth are at indices [48:68] of that array. These are the values we need for the evaluation.

Reminder: Do not copy-paste the code. Search every unfamiliar keyword on Google and try to explain it to yourself in your own words. Also, comment in brief.
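Here is a sketch of that step. To keep it free of extra dependencies, shape_to_array is a hand-written stand-in that does the same job as imutils.face_utils.shape_to_np; it and the names find_mouth, MOUTH_START and MOUTH_END are my own. The predictor is expected to be the object returned by dlib.shape_predictor with the downloaded .dat file.

```python
import numpy as np

MOUTH_START, MOUTH_END = 48, 68  # mouth indices in the 68-point layout


def shape_to_array(shape, num_points=68):
    """Convert a dlib full_object_detection to a (num_points, 2) array."""
    coords = np.zeros((num_points, 2), dtype=int)
    for i in range(num_points):
        point = shape.part(i)  # each part has .x and .y pixel coordinates
        coords[i] = (point.x, point.y)
    return coords


def find_mouth(gray, face_rect, predictor):
    """Return the 20 (x, y) mouth landmarks for one detected face."""
    shape = predictor(gray, face_rect)  # landmarks inside the face rectangle
    landmarks = shape_to_array(shape)
    return landmarks[MOUTH_START:MOUTH_END]
```

Call find_mouth once for every face rectangle returned by the detector, on the same grayscale frame that you gave to the detector.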

Step 6: How to detect whether the person is smiling?

Recommended blog: [Link]

We can use the mouth aspect ratio (MAR): the average height of the mouth divided by its average width.

MAR

We can calculate this quantity and by using some threshold value we can determine whether the person is smiling or not.

To calculate the euclidean distance, we need to install scipy.

(venv)$ pip install scipy

Code: Write a function that accepts the mouth coordinates and returns the MAR value. Then use a threshold value, found by trial and error, to decide whether the person is smiling or not.

Watch the terminal while you smile or keep a neutral face.
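One possible version of such a function is sketched below. Which landmark pairs to measure is my assumption, not necessarily the formula used in this project: I take two vertical distances across the outer lips and a single corner-to-corner width. Only scipy's distance.euclidean is a real library call here.

```python
from scipy.spatial import distance


def mouth_aspect_ratio(mouth):
    """Average mouth height divided by mouth width.

    `mouth` holds the 20 mouth landmarks (indices 48 to 67 of the full
    set), so mouth[0] is landmark 48, mouth[6] is landmark 54, and so on.
    The chosen pairs are an assumption made for this sketch.
    """
    height_left = distance.euclidean(mouth[2], mouth[10])  # landmarks 50, 58
    height_right = distance.euclidean(mouth[4], mouth[8])  # landmarks 52, 56
    width = distance.euclidean(mouth[0], mouth[6])         # landmarks 48, 54
    return (height_left + height_right) / (2.0 * width)
```

Print the value for every frame and note how it changes between a neutral face and a smile; that tells you both the threshold and the direction of the comparison for your own face and camera.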

Step 7: How to save the image in the storage?

Kudos for making it to the last step of the project.

We can detect a smile now, but how do we store the image? And will there be a bottleneck if we try to save images faster than the disk can write them, since a smile is detected in every frame for as long as you keep smiling?

Try to solve this on your own, because this is where you need to apply your mind and do some googling too.

Or just use this, and our project's objective is done.

cv2.imwrite("image.png", img)
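One thing to notice: a fixed file name means every new smile overwrites the previous one. A small sketch of one way around that is to build a file name from the current time. The function name smile_path, the smiles folder and the name pattern are all my own choices.

```python
from datetime import datetime
from pathlib import Path


def smile_path(directory="smiles", now=None):
    """Build a unique-looking PNG path from a timestamp.

    smile_path is a hypothetical helper; pass `now` to get a fixed name.
    """
    now = now or datetime.now()
    # %f adds microseconds, so two smiles in the same second do not clash.
    return Path(directory) / f"smile_{now:%Y%m%d_%H%M%S_%f}.png"


if __name__ == "__main__":
    print(smile_path())
```

Create the folder first (for example with path.parent.mkdir(parents=True, exist_ok=True)) and then call cv2.imwrite(str(path), img). Check its return value too, because imwrite returns False instead of raising an error when the image could not be written.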

Note: You can use different techniques like multithreading and queues for storing images. I will cover the multithreading techniques for asynchronous tasks in future articles.

Reference: [Link] Read this blog and think about how you can solve the problem stated above.
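To make the queue idea a little more concrete, here is a minimal sketch using only Python's standard library. AsyncImageWriter is a hypothetical class of my own, and dropping images when the queue is full is just one possible policy. The write function is passed in from outside, so cv2.imwrite, which takes a path and an image, would fit.

```python
import queue
import threading


class AsyncImageWriter:
    """Save images on a background thread so the camera loop never waits."""

    def __init__(self, write_fn, max_pending=8):
        self._write_fn = write_fn  # called as write_fn(path, image)
        self._queue = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(target=self._worker, daemon=True)
        self._thread.start()

    def _worker(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the worker
                break
            path, image = item
            self._write_fn(path, image)

    def submit(self, path, image):
        """Queue an image; return False (and drop it) if the queue is full."""
        try:
            self._queue.put_nowait((path, image))
            return True
        except queue.Full:
            return False

    def close(self):
        """Write everything still queued, then stop the worker thread."""
        self._queue.put(None)
        self._thread.join()
```

In the camera loop you would create writer = AsyncImageWriter(cv2.imwrite) once, call writer.submit(path, frame.copy()) whenever a smile is detected, and call writer.close() at the end. The bounded queue is what protects you from the bottleneck: when the disk falls behind, extra frames are skipped instead of piling up in memory.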

Code: https://github.com/Akash16s/Say-Cheese

Congratulations on successfully completing the project with me. I hope it helped you in some way. 😄

Thanks for being here with me till the end. I have only just started writing articles, so please comment below about the article; your comments mean a lot to me.

You can follow me on Twitter, GitHub, Medium, LinkedIn.

Don’t forget to follow Camping with python.

If you have any doubts regarding this project or have any other issue or you want to suggest something, you can comment below.

The next article will be up soon. Until then, keep practicing.

Written by Akash Srivastava