Creating A Thing That Looks At You

This series of posts chronicles my path to Create A Thing That Looks At You.

Using my freshly printed gimbal mount, a Raspberry Pi, a Pi Cam, Dlib's Face Recognition library, two gimbals, two gimbal controllers, and a PWM generator board, the plan is to make a 3D printed object that detects faces and points itself directly towards the detected face. If more than one face is detected, it just picks the most active one (the one whose position is changing most rapidly), so it appears to take an interest in interesting things.
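The "most active face" idea can be sketched in a few lines of Python. This is just an illustration of the selection logic, not code from this build; the names (pick_most_active, centre) are made up, and the boxes use the (top, right, bottom, left) order that face_recognition returns:

```python
def centre(box):
    # face_recognition boxes are (top, right, bottom, left)
    top, right, bottom, left = box
    return ((left + right) / 2, (top + bottom) / 2)

def pick_most_active(prev_boxes, curr_boxes):
    """Pick the current face whose centre moved furthest from any face in the previous frame."""
    if not curr_boxes:
        return None
    if not prev_boxes:
        return curr_boxes[0]
    prev_centres = [centre(b) for b in prev_boxes]

    def activity(box):
        cx, cy = centre(box)
        # Squared distance to the closest face seen last frame
        return min((cx - px) ** 2 + (cy - py) ** 2 for px, py in prev_centres)

    return max(curr_boxes, key=activity)
```

A face that sat still last frame scores near zero, so a face that has just moved (or just appeared) wins.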
We definitely want to minimise the creepy factor here. None of this.
The plan
The plan is to have Dlib find faces in the camera frame, calculate the centre x, y co-ordinates of a selected detected face, and subtract the x, y co-ordinates of the centre of the whole camera image. That gives us the x, y amounts we need to move the pan (x) and tilt (y) gimbals by to place that face in the centre of the camera image. We feed these into a PID (Proportional, Integral, Derivative) controller written in Python, and send its output to the PWM generator board. The board generates PWM signals which feed into the gimbal motor controllers, and these send power signals along three wires each to the gimbals to silently adjust where the camera is pointing.
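The PID step can be sketched in a few lines of Python. This is a generic textbook PID controller, not the one from this project, and the gains are purely illustrative:

```python
class PID:
    """Generic PID: output = kp*error + ki*(accumulated error) + kd*(rate of change of error)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        self.integral += error * dt
        # No derivative on the very first sample
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# One controller per axis; the error is "pixels between face centre and image centre"
pan = PID(kp=0.1, ki=0.0, kd=0.01)
tilt = PID(kp=0.1, ki=0.0, kd=0.01)
pan_correction = pan.update(error=31, dt=0.1)
```

Two independent controllers, one per axis, keep the pan and tilt corrections decoupled.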
Well that’s the plan anyway. Let’s see how far we get.

First step. Get Dlib happening on the Pi 3.
pip install face_recognition
And in Python:
import face_recognition
Did it work first time? Of course not! Let’s go through the errors.
Boost python library not found
No lapack/blas resource found
Looks like we need some more stuff.
sudo apt-get install libboost-all-dev
100+ packages to install, cool.

And we also need some lapack.
sudo apt-get install build-essential cmake libatlas-dev libavcodec-dev libavformat-dev libgtk2.0-dev libjpeg-dev libswscale-dev liblapack-dev
Run it again. The C++ compiler is pinned at 100% CPU, but it's not saying what it's compiling. Oh well, we'll let it go and do its thing.

It wasn't quick. Three hours and two Pi freeze-restarts later, we still don't have Dlib installed.
Ah. It's too much for the poor Pi to compile Dlib. Got it. Reading the instructions for setting up swap space on the Pi, the important part is to increase the swap from 100 MB to 1 GB.
sudo vi /etc/dphys-swapfile
(change CONF_SWAPSIZE=100 to CONF_SWAPSIZE=1024, then save with :wq)
sudo /etc/init.d/dphys-swapfile restart
Update: The above only works on Raspbian. On Ubuntu, we do:
sudo dd if=/dev/zero of=/swapfile bs=1G count=4
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
Hooray! Dlib compiles. We move on to hardware again. Next, we plug the Pi Cam into the Pi.

And now we write a little Python to detect some faces.
import face_recognition
import numpy as np
import picamera

# Start the camera at a low resolution to keep detection fast
camera = picamera.PiCamera()
camera.resolution = (320, 240)
output = np.empty((240, 320, 3), dtype=np.uint8)

# Initialize some variables
face_locations = []

while True:
    # Grab a single frame of video from the camera as a numpy array
    camera.capture(output, format="rgb")

    # Find the faces in the current frame of video
    face_locations = face_recognition.face_locations(output)
    print("Found {} faces in image.".format(len(face_locations)))

    for face_location in face_locations:
        # Print the location of each face in this image
        top, right, bottom, left = face_location
        print("A face is located at pixel location Top: {}, Left: {}, Bottom: {}, Right: {}".format(top, left, bottom, right))

        # Face centre (the midpoint of the bounding box, not its width/height)
        face_x = (left + right) / 2
        face_y = (top + bottom) / 2

        # Image centre
        image_center_x = 320 / 2
        image_center_y = 240 / 2

        # How far should we move?
        movement_x = image_center_x - face_x
        movement_y = image_center_y - face_y

        # Proportional term only, for now
        k_p = 0.1
        p_x = k_p * movement_x
        p_y = k_p * movement_y
        print("Movement: {:.2f}, {:.2f}".format(p_x, p_y))
We install this:
sudo pip install picamera[array]
And give it a run.
Bingo!
Found 1 faces in image.
A face is located at pixel location Top: 82, Left: 124, Bottom: 211, Right: 253
Movement: -2.85, -2.65
Great success. And now we take those movement numbers and move our camera.
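In this build the movement numbers will go through the PWM generator board to the gimbal controllers. As a rough sketch of the idea, here's the kind of mapping from a correction value to a hobby-servo-style pulse width; the scale factor and limits are illustrative guesses, not values from this hardware:

```python
# Standard hobby PWM: pulses from 1000 to 2000 microseconds, 1500 = centred
PULSE_MIN, PULSE_MID, PULSE_MAX = 1000, 1500, 2000
US_PER_UNIT = 10  # microseconds of pulse width per unit of correction (a guess)

def correction_to_pulse(correction):
    """Map a signed correction value to a pulse width in microseconds."""
    pulse = PULSE_MID + correction * US_PER_UNIT
    # Clamp so we never command the controller outside its valid range
    return max(PULSE_MIN, min(PULSE_MAX, pulse))
```

The clamp matters: a face right at the edge of the frame produces a big correction, and we don't want that slamming the gimbal against its limits.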

