A simple keras model on my laptop webcam

Jinil C Sasidharan
6 min read · Apr 22, 2018
Photo by Fabian Irsara

If you already know about deep learning and convolutional neural networks and are thinking of doing something practical, I hope this post will be really helpful to you.

Here we will build a simple convolutional neural network using the keras deep learning library to detect “you” whenever you are in front of your laptop.

We will write a small Python program that reads the laptop webcam and shows the video frames on your display. As soon as you come in front of your laptop, the model detects your presence in the video frame and shows the frame in color; when you leave, the frame turns gray.

My environment details:

  • Mac OS High Sierra
  • Python 3.5.6

And please make sure the below python packages are installed on your system.

  • OpenCV (opencv-python)
  • PIL (Pillow)
  • keras (I used the one with tensorflow backend)
  • numpy

Let's start. Any machine learning process mainly involves the steps below.

  • Get the data
  • Build the model
  • Train the model
  • Test the model

Get the data

This is a binary classification problem. So we need 2 classes of data, images with your presence and images without your presence.

class 0 -> you are absent from the image

class 1 -> you are present in the image

We have to place the data in separate directories as given below. Keras needs this layout because flow_from_directory, which we use during training, takes each image's label from the name of its subdirectory.

data/train/0 -> all training images with class 0

data/train/1 -> all training images with class 1

data/valid/0 -> all validation images with class 0

data/valid/1 -> all validation images with class 1
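
As a small convenience, the four directories above can be created up front. This is a minimal sketch using only the standard library; the helper name `make_data_dirs` is made up for illustration and is not part of the original scripts.

```python
import os


def make_data_dirs(root):
    """Create root/{train,valid}/{0,1} and return the created paths."""
    created = []
    for split in ("train", "valid"):
        for label in ("0", "1"):
            path = os.path.join(root, split, label)
            # exist_ok=True makes the call safe to repeat.
            os.makedirs(path, exist_ok=True)
            created.append(path)
    return created
```

Calling `make_data_dirs("data")` from the project folder would produce the layout listed above.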

Usually validation/test data will be 10-20% of the total training data. Here let's generate 4000 training samples (2000 per class) and 400 validation samples (200 per class).

The below python program capture_images.py can do the trick for you.
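
For reference, here is a minimal sketch of what a script like capture_images.py could look like. It is not the author's file (that one is in the repository linked at the end of the post). It assumes OpenCV can open the built-in webcam as device 0, and the function names `output_path` and `capture` as well as the `img_00000.jpg` naming scheme are made up for illustration.

```python
import os
import sys


def output_path(directory, index):
    """Build the file name for the index-th captured frame (hypothetical scheme)."""
    return os.path.join(directory, "img_{:05d}.jpg".format(index))


def capture(directory, count, size=128):
    """Save up to `count` webcam frames, resized to size x size, into `directory`."""
    # Imported here so output_path() can be used without OpenCV installed.
    import cv2

    os.makedirs(directory, exist_ok=True)
    camera = cv2.VideoCapture(0)  # assumption: device 0 is the laptop webcam
    saved = 0
    try:
        while saved < count:
            ok, frame = camera.read()
            if not ok:
                break  # the camera returned no frame
            small = cv2.resize(frame, (size, size))
            cv2.imwrite(output_path(directory, saved), small)
            saved += 1
            cv2.imshow("capture", frame)
            # Stop early when 'q' is pressed in the preview window.
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        camera.release()
        cv2.destroyAllWindows()
    return saved


# Runs only when both command line arguments are given:
#   python3 capture_images.py <directory> <number of images>
if __name__ == "__main__" and len(sys.argv) == 3:
    capture(sys.argv[1], int(sys.argv[2]))
```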

The program resizes each image to 128x128. This is the image size we will use when training the model, so there is no need to save at a higher resolution if you don’t need it.

Save this program on your local system and execute it by passing the directory name and the number of images you need.

python3 capture_images.py data/train/0 2000

This will capture training images for class 0. Your webcam will start running, capture the images and place them in the directory data/train/0. This will run until all 2000 images are captured or until you press ‘q’ on your keyboard. Make sure you are not in front of the camera when running this script. 😃

You can move around carrying your laptop (of course you should be behind the laptop) to get different images.

Sample training data for class 0

python3 capture_images.py data/train/1 2000

This captures the training images for class 1. Make sure you are in front of the camera this time. You can tilt your head, come closer to the camera, move away from it, or take your laptop with you and move around to get images with different backgrounds.

Sample training data for class 1 (Yes, it's me! 😃)

Do the same to get the validation images. Please note that the directory and the number of images are different.

python3 capture_images.py data/valid/0 200
python3 capture_images.py data/valid/1 200

Now we are ready with all the training and validation data. Let's go ahead and build the keras model.

Build the model

You can use Jupyter notebook to build and train the model.

Import all keras modules.

from keras import models
from keras import layers
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator

Create a keras sequential model

size = 128

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(size, size, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D(2, 2))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D(2, 2))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D(2, 2))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

The model I used has 4 convolutional layers, each followed by a max pooling layer, and finally a fully connected layer with 512 activation units.

I have added 1 Dropout layer to reduce overfitting.

All layers except the last one use the ‘relu’ activation function. The last layer has ‘sigmoid’ activation as it is a binary classification.

And please note that input shape of image to the model is 128x128x3
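
To see how the 128x128 input shrinks on its way through the four convolution and pooling blocks, the sizes can be worked out by hand. The sketch below assumes the Keras defaults used in the model above: 3x3 convolutions without padding and stride 1, and 2x2 max pooling with stride 2. The helper names are made up for the example.

```python
def conv_output(size, kernel=3):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Spatial size after max pooling with a pool x pool window and stride pool."""
    return size // pool


size = 128
filters_per_block = [32, 64, 128, 128]
for filters in filters_per_block:
    size = pool_output(conv_output(size))
    print("after conv + pool with", filters, "filters:", size, "x", size)

# The last block leaves a 6 x 6 map with 128 channels.
flattened = size * size * filters_per_block[-1]
print("flattened features:", flattened)  # 4608
```

So the Flatten layer hands 4608 values to the Dropout layer and then to the 512-unit dense layer.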

Compile the model

model.compile(optimizer=optimizers.RMSprop(lr=0.0003), loss='binary_crossentropy', metrics=['acc'])

I used the RMSprop optimizer with learning rate 0.0003, and binary_crossentropy as the loss function since this is a binary classification problem.

I tried with different learning rates (0.01, 0.001, 0.0001 ...) and finally chose 0.0003 which gave me a good accuracy on validation data.

Train the model

During training we don’t have to feed all the images to the network at once; the weights are updated one mini-batch at a time (mini-batch stochastic gradient descent). Keras has a beautiful utility for this, ImageDataGenerator, which is built on Python generators and loads only one batch of images into memory for each pass through the convnet.
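
To illustrate the idea only (this is not how ImageDataGenerator is implemented internally), here is a tiny Python generator that hands out a list of file names one batch at a time. The function name `batches` and the file names are made up for the example.

```python
def batches(items, batch_size):
    """Yield successive slices of `items`, each at most `batch_size` long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


file_names = ["img_{:05d}.jpg".format(i) for i in range(200)]

# Only one batch exists at a time while we iterate over the generator.
sizes = [len(batch) for batch in batches(file_names, 64)]
print(sizes)  # [64, 64, 64, 8]
```

The last batch is smaller because 200 is not a multiple of 64.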

I used data augmentation on training data. You can try without this if you have more training data.

train_datagen = ImageDataGenerator(
rescale=1./255,
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
validation_datagen = ImageDataGenerator(rescale=1./255)

Create a train generator and a validation generator to get batches of images from the respective directories. I used batch size 64, which means 64 images will be passed to the network in a single pass. Class mode is ‘binary’ for binary classification.

train_generator = train_datagen.flow_from_directory('data/train',
target_size=(size,size),batch_size=64, class_mode='binary')
validation_generator = validation_datagen.flow_from_directory('data/valid', target_size=(size,size), batch_size=64, class_mode='binary')

Train the model using these generators.

We have a total of 4000 images in the train folder and our batch size is 64, so steps per epoch should be 63 (4000/64 = 62.5, rounded up) to cover all images. Likewise, validation steps is 7 (400/64 = 6.25, rounded up).
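
The same arithmetic in code, as a minimal sketch; `steps_for` is a hypothetical helper name, not a Keras function.

```python
import math


def steps_for(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


train_steps = steps_for(4000, 64)
valid_steps = steps_for(400, 64)
print(train_steps, valid_steps)  # 63 7
```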

model.fit_generator(train_generator, epochs=5, steps_per_epoch=63, 
validation_data=validation_generator, validation_steps=7, workers=4)

After 5 epochs I got 97.8% accuracy on training data and 99.5% accuracy on validation data.

Now save the model to a file. We will use this later for prediction.

model.save('model.h5')

The above method will save your model into a file named model.h5 in your current working directory. If you want to save this to a different location, give the absolute path for the filename.

model.save('/home/jinilcs/model.h5')

Test the model on your webcam

To use the model you saved in the previous step in another application, load it first:

from keras import models
model = models.load_model('/home/jinilcs/model.h5')

As shown below, we can predict the class using this loaded model. Please note that image_array must have dimensions 1x128x128x3, because the model was trained on 4D tensors (numberOfImages x height x width x channels) and here we are predicting on 1 image.

model.predict(image_array)
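
One way to get an array of that shape is sketched below with NumPy. A random array stands in for a real webcam frame, and the 1/255 scaling is an assumption carried over from the rescale setting of the training generators.

```python
import numpy as np

size = 128

# Stand-in for one RGB frame already resized to 128x128 (values 0-255).
frame = np.random.randint(0, 256, size=(size, size, 3), dtype=np.uint8)

# Apply the same 1/255 rescaling that the generators used during training.
scaled = frame.astype("float32") / 255.0

# Add the leading "number of images" axis: (128, 128, 3) -> (1, 128, 128, 3).
image_array = np.expand_dims(scaled, axis=0)
print(image_array.shape)  # (1, 128, 128, 3)
```

With the sigmoid output layer, model.predict returns one value between 0 and 1 per image; values of 0.5 or more are usually read as class 1.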

The Python program below (more or less similar to the first one we used to capture images) loads the saved model and runs the prediction on each video frame from the webcam. If the prediction is class 0, it shows the frame in gray.

Save it on your system and execute the program. A video window will pop up and you will see the colors changing based on the prediction.
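
A minimal sketch of such a detection loop is given here; it is not the author's detect_me.py, which is in the repository linked at the end of the post. It assumes the model was saved as model.h5 in the working directory, that the webcam is device 0, and that a sigmoid output of 0.5 or more means class 1. The names `is_present` and `run` are made up for illustration. OpenCV delivers frames in BGR order while Keras loads training images as RGB, so the sketch converts each frame before predicting.

```python
SIZE = 128        # must match the input size the model was trained with
THRESHOLD = 0.5   # assumption: sigmoid output at or above this counts as class 1


def is_present(probability, threshold=THRESHOLD):
    """Map the model's sigmoid output to a yes/no decision."""
    return probability >= threshold


def run(model_path="model.h5"):
    """Show the webcam feed in color while class 1 is predicted, gray otherwise."""
    # Imported here so is_present() can be used without a webcam or Keras.
    import cv2
    import numpy as np
    from keras import models

    model = models.load_model(model_path)
    camera = cv2.VideoCapture(0)
    try:
        while True:
            ok, frame = camera.read()
            if not ok:
                break  # the camera returned no frame
            small = cv2.resize(frame, (SIZE, SIZE))
            rgb = cv2.cvtColor(small, cv2.COLOR_BGR2RGB)
            # Scale to [0, 1] and add the batch axis: (1, SIZE, SIZE, 3).
            batch = np.expand_dims(rgb.astype("float32") / 255.0, axis=0)
            probability = float(model.predict(batch)[0][0])
            if not is_present(probability):
                frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            cv2.imshow("detect_me", frame)
            # Stop when 'q' is pressed in the video window.
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        camera.release()
        cv2.destroyAllWindows()
```

Adding a call to `run()` at the bottom of the file turns this into a script that starts the loop when executed.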

python3 detect_me.py

The video frame gets its color as soon as you come in front of your webcam and goes back to gray when you leave. Press ‘q’ to exit. It's interesting, right? 😉

The Jupyter notebook and Python scripts are available at https://github.com/jinilcs/webcam-model

I know this is not a perfect practical application, but I hope I was able to give you some idea on how we can use these models on simple applications like this.

You can extend this and make more interesting applications like,

  • Start an alarm when anyone other than you comes in front of your laptop
  • Multi class model to recognize your family members, etc

I am planning to create a model to monitor my eyes whenever I use my laptop. If I forget to blink for a long time, it should sound an alarm. Let's see how it goes 😃

Thanks for reading; feedback will be highly appreciated. Thank you.

Jinil C Sasidharan

Software Engineer | Technical Lead | Co-founder @IQness. Passionate about crafting innovative software solutions and leading teams to success.