Project: Poor Man's Rekognition
Mentor: Johannes Lochter
Organization: CCExtractor Development
Org admin: Carlos Fernandez
May 27 — June 1: complete use-cases 1, 2, and 3 (completed)
June 3 — June 15: complete use-case 4 (completed)
June 17 — June 19: 1st report (completed)
June 20 — July 3: complete use-case 5 (completed)
July 4 — July 6: 2nd report (completed)
July 8 — July 17: complete use-case 6 (completed)
July 18 — July 31: web app (completed)
August 1 — August 3: writing report 3 (completed)
This time period was probably the most difficult part: here I had to write code for facial expression recognition (facial analysis) as well as build a web app carrying out all the above-mentioned use-cases.
USE-CASE 6 (FACIAL EXPRESSION RECOGNITION)
Facial expression recognition is a technique for identifying a person's mood from their facial expression, and that by a machine, when we as humans often can't guess it ourselves. Jokes apart, this could be genuinely useful in many ways. Imagine entering your house and, based on your facial expression (your mood), a song starts playing in the background; or you call your wife/husband and, before she/he picks up, you get a notification about their mood. Sounds awesome, right? But trust me, a machine needs a lot of processing power to do that fast, so don't expect all of this in your daily life any time soon.
Machine learning is all about training a model on data and then running it on a sample to get the result, and that's what I have done here.
Collecting raw data (pictures)
1. Images Source-1: Link
2. Images Source-2: Link
3. Images Source-3: Link
I must say, this needs a lot of manual work. I created 6 sub-directories under one folder, naming each after an expression.
The expressions I decided to go with are Angry, Disgust, Fear, Happy, Sadness, and Surprise.
Facial expression recognition needs a lot of data to train on. I trained with only about 100 images, but even with 100 images the module was working fine.
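For reference, a minimal sketch of how that layout can be created; the folder name images and the lowercase label names are assumptions, chosen to match the --image_dir used in the training command below:

import os

# One sub-directory per expression under "images/"; each folder name becomes a label
for expression in ["angry", "disgust", "fear", "happy", "sadness", "surprise"]:
    os.makedirs(os.path.join("images", expression), exist_ok=True)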
Before cropping, you also need haarcascade_frontalface_alt.xml, which can be found here.
Cropping the images: (crop.py)
This step cleans the images for training purposes. I personally skipped it because my data was already cropped and training-ready.
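Since crop.py itself isn't shown here, this is a minimal sketch of what that step could look like, assuming the directory layout above and a local copy of the cascade file:

import os
import cv2

CASCADE_PATH = "haarcascade_frontalface_alt.xml"  # assumed local path
IMAGE_DIR = "images"

classifier = cv2.CascadeClassifier(CASCADE_PATH)

for label in os.listdir(IMAGE_DIR):
    label_dir = os.path.join(IMAGE_DIR, label)
    for name in os.listdir(label_dir):
        path = os.path.join(label_dir, name)
        img = cv2.imread(path)
        if img is None:
            continue  # skip unreadable/non-image files
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = classifier.detectMultiScale(gray)
        for (x, y, w, h) in faces:
            # Overwrite the original with just the face region
            cv2.imwrite(path, img[y:y + h, x:x + w])
            break  # keep only the first detected face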
Training the data: (retrain.py)
For this purpose, I'm using the MobileNet model, which is quite fast and accurate. To run the training, go to the parent folder, open CMD/Terminal there, and run the following:
python retrain.py --output_graph=retrained_graph.pb --output_labels=retrained_labels.txt --architecture=MobileNet_1.0_224 --image_dir=images
Here retrained_graph.pb and retrained_labels.txt are the trained files that will be used for detection, and images is the directory where all your data is stored.
Now just run this command and everything will be up and running.
Let’s go through Label.py
import cv2
import label_image

size = 4

# Load the Haar cascade xml file
classifier = cv2.CascadeClassifier('F:/dd/Library/etc/haarcascades/haarcascade_frontalface_alt.xml')
im = cv2.imread('C:/Users/Faiz Khan/Desktop/ddd/facial expression/test/3.jpg', 0)
# im = cv2.flip(im, 1, 0)  # Flip to act as a mirror

# Resize the image to speed up detection
mini = cv2.resize(im, (int(im.shape[1] / size), int(im.shape[0] / size)))
# Detect MultiScale / faces
faces = classifier.detectMultiScale(mini)

# Draw rectangles around each face
for f in faces:
    (x, y, w, h) = [v * size for v in f]  # Scale the shape back up
    cv2.rectangle(im, (x, y), (x + w, y + h), (0, 255, 0), 4)
    # Save just the rectangular face in SubRecFaces
    sub_face = im[y:y + h, x:x + w]
    FaceFileName = "test.jpg"  # Saving the current face for testing
    cv2.imwrite(FaceFileName, sub_face)
    # Get the result from the label_image file, i.e., the classification result
    text = label_image.main(FaceFileName)
    text = text.title()  # Title case looks stunning
    font = cv2.FONT_HERSHEY_TRIPLEX
    cv2.putText(im, text, (x, y), font, 1, (255, 0, 0), 2)

# Show the image
cv2.imshow('Expression', im)
key = cv2.waitKey(10000)
You can see here that it's basically detecting the face and labelling it, nothing special. But what about that import label_image? Well, that's where the magic happens.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import argparse
import time

import numpy as np
import tensorflow as tf


def load_graph(model_file):
    graph = tf.Graph()
    graph_def = tf.GraphDef()
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    with graph.as_default():
        tf.import_graph_def(graph_def)
    return graph


def read_tensor_from_image_file(file_name, input_height=299, input_width=299,
                                input_mean=0, input_std=255):
    input_name = "file_reader"
    output_name = "normalized"
    file_reader = tf.read_file(file_name, input_name)
    # Pick the decoder that matches the file extension
    if file_name.endswith(".png"):
        image_reader = tf.image.decode_png(file_reader, channels=3,
                                           name='png_reader')
    elif file_name.endswith(".gif"):
        image_reader = tf.squeeze(tf.image.decode_gif(file_reader,
                                                      name='gif_reader'))
    elif file_name.endswith(".bmp"):
        image_reader = tf.image.decode_bmp(file_reader, name='bmp_reader')
    else:
        image_reader = tf.image.decode_jpeg(file_reader, channels=3,
                                            name='jpeg_reader')
    float_caster = tf.cast(image_reader, tf.float32)
    dims_expander = tf.expand_dims(float_caster, 0)
    resized = tf.image.resize_bilinear(dims_expander, [input_height, input_width])
    normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
    sess = tf.Session()
    result = sess.run(normalized)
    return result


def load_labels(label_file):
    label = []
    proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
    for l in proto_as_ascii_lines:
        label.append(l.rstrip())
    return label


def main(img):
    file_name = img
    model_file = "retrained_graph.pb"
    label_file = "retrained_labels.txt"
    input_height = 224
    input_width = 224
    input_mean = 128
    input_std = 128
    input_layer = "input"
    output_layer = "final_result"

    parser = argparse.ArgumentParser()
    parser.add_argument("--image", help="image to be processed")
    parser.add_argument("--graph", help="graph/model to be executed")
    parser.add_argument("--labels", help="name of file containing labels")
    parser.add_argument("--input_height", type=int, help="input height")
    parser.add_argument("--input_width", type=int, help="input width")
    parser.add_argument("--input_mean", type=int, help="input mean")
    parser.add_argument("--input_std", type=int, help="input std")
    parser.add_argument("--input_layer", help="name of input layer")
    parser.add_argument("--output_layer", help="name of output layer")
    args = parser.parse_args()

    # Command-line flags override the defaults above
    if args.graph:
        model_file = args.graph
    if args.image:
        file_name = args.image
    if args.labels:
        label_file = args.labels
    if args.input_height:
        input_height = args.input_height
    if args.input_width:
        input_width = args.input_width
    if args.input_mean:
        input_mean = args.input_mean
    if args.input_std:
        input_std = args.input_std
    if args.input_layer:
        input_layer = args.input_layer
    if args.output_layer:
        output_layer = args.output_layer

    graph = load_graph(model_file)
    t = read_tensor_from_image_file(file_name,
                                    input_height=input_height,
                                    input_width=input_width,
                                    input_mean=input_mean,
                                    input_std=input_std)

    input_name = "import/" + input_layer
    output_name = "import/" + output_layer
    input_operation = graph.get_operation_by_name(input_name)
    output_operation = graph.get_operation_by_name(output_name)

    with tf.Session(graph=graph) as sess:
        start = time.time()
        results = sess.run(output_operation.outputs[0],
                           {input_operation.outputs[0]: t})
        end = time.time()
    results = np.squeeze(results)

    top_k = results.argsort()[-5:][::-1]
    labels = load_labels(label_file)
    for i in top_k:
        return labels[i]  # return the top prediction as the classification result
It’s the unsung hero.
Now, moving forward.
WEB APP USING FLASK:
1.) Here I have defined my Flask app. I have also defined a static folder. All the files shown to the user are inside the templates folder.
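The screenshot isn't reproduced here, so this is a minimal sketch of that setup, assuming the folder names described (templates/ for pages, static/ for assets):

from flask import Flask, render_template, request

# Templates are looked up in "templates/" by default; static assets live in "static/"
app = Flask(__name__, static_folder='static')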
2.) The code below returns the index.html file inside the templates folder. So if anyone opens "/" (the 127.0.0.1:5000/ page), index.html will open.
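A sketch of that route, under the same assumptions:

@app.route('/')
def index():
    # Serve templates/index.html for "/" (i.e., 127.0.0.1:5000/)
    return render_template('index.html')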
3.) This code executes when someone opens the /imageorvideo.html page. Here, first, I get the parameter of the URL. The URL will look like this: /imageorvideo.html?dowhat=faceoreye
So the dowhat variable will have the value faceoreye. This is how I know which link the user clicked.
If dowhat contains celebrity, then we pass the maleorfemale variable as "yes" to the imageorvideo.html file. The imageorvideo.html file shows the radio buttons in that case.
Otherwise, I return the imageorvideo.html template, which lets users upload an image or video, as sketched below.
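A sketch of that handler; the template variable names are assumptions taken from the description above:

@app.route('/imageorvideo.html')
def imageorvideo():
    # e.g. /imageorvideo.html?dowhat=faceoreye
    dowhat = request.args.get('dowhat')
    if dowhat == 'celebrity':
        # imageorvideo.html shows the male/female radio buttons in this case
        return render_template('imageorvideo.html', dowhat=dowhat, maleorfemale='yes')
    return render_template('imageorvideo.html', dowhat=dowhat)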
4.) In imageorvideo.html I defined a form that asks users to upload the image. The form's action contains the URL of the form plus the dowhat variable, so it takes a value according to the link the user clicked. For example, if someone clicked on the "object" link, the URL will be http://127.0.0.1:5000/object.html and the code under @app.route('/object.html') will get executed.
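The template itself isn't shown, but the form might look roughly like this, written inline as a Python string purely for illustration (the field name filetoupload comes from step 5 below):

# Hypothetical inline version of the upload form in templates/imageorvideo.html
UPLOAD_FORM = """
<form action="/{{ dowhat }}.html" method="post" enctype="multipart/form-data">
    <input type="file" name="filetoupload">
    <input type="submit" value="Upload">
</form>
"""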
5.) I have 7 features, like face or eye, celebrity, object detection, read text, etc., so I made 7 functions, one for each. The code of all those functions is very similar.
Here, if someone opens faceandeye.html, the code described below gets executed.
request.files['filetoupload'] takes the image/video from the form, and I save it using the f.save function.
The extension line takes the file name and extracts its extension. If the extension belongs to an image, we execute the image function; otherwise we execute the video function.
At line 467, the code checks whether someone submitted the form (i.e., uploaded a file). If the request is not a POST, it redirects the user to index.html.
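Putting those pieces together, a minimal sketch of one such handler; image_function and video_function are hypothetical stand-ins for the actual processing helpers:

import os
from flask import redirect

@app.route('/faceandeye.html', methods=['GET', 'POST'])
def faceandeye():
    if request.method != 'POST':
        # No form was submitted, so send the user back to index.html
        return redirect('/')
    f = request.files['filetoupload']
    f.save(os.path.join('static', f.filename))
    # Pick the image or video path based on the uploaded file's extension
    extension = f.filename.rsplit('.', 1)[-1].lower()
    if extension in ('jpg', 'jpeg', 'png', 'bmp'):
        return image_function(f.filename)  # hypothetical image handler
    return video_function(f.filename)  # hypothetical video handler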
The function below the @app.route decorator gets executed, and whatever it returns is displayed to the user.
6.) With this I run the app, with debug set to True. I can change it to False in production.
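That final step presumably looks like the standard Flask entry point:

if __name__ == '__main__':
    # debug=True enables auto-reload and tracebacks; set it to False in production
    app.run(debug=True)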