SmartCar: Set Rear Mirror Automatically using OpenCV and Python
Hi, I am Amir. Today, in this post, I want to implement an interesting and practical idea with our current knowledge of coding and image processing. Have you ever wondered why you have to adjust the rear-view mirror manually every time you get into the car as the driver, especially when another driver (such as a parent) drove the same car before you and set the angle based on the position of their own head?
In this post, we want to build a system so that whenever a driver with any physical dimensions gets behind the wheel, the rear-view mirror adjusts itself automatically. So hooray :)
At the end of this article, we expect to achieve the following goals:
1- Set up two cameras and perform the stereo calibration process.
2- Implement a real-time face and eye detection algorithm using the cameras and successfully extract the coordinates of the center point between the eyes.
3- Using Python and some knowledge of spatial geometry, implement the algorithm that calculates the right angles for adjusting the rear-view mirror.
4- Send the angles to the servo motors in every frame through the serial interface.
To begin with, it is best to review the process ahead. As the road map above shows, we are going to go through four main steps. We must first perform a successful stereo calibration. Then, using the OpenCV library and some Python coding, we implement an algorithm that detects the driver's face, finds the position of their eyes in the image, and finally calculates, in pixels, the position of the center of the line between the eyes on the image frame. It is then enough to feed these coordinates into the triangulation algorithm, which uses the intrinsic and extrinsic camera parameters to output the three-dimensional coordinates of the target point in the real world. These coordinates are very valuable, because the rest of the project depends on them. In the last step of the software phase of the project, we dive into the world of 3D geometry and try to find the right angles for the mirror using the available Python libraries.
Step 1: Stereo Calibration
Be prepared: we are going to do some practical work. You are going to convert any point in the image to its real-world coordinates. To perform this fascinating magic, you need two (almost) identical cameras. The first step is to fix the two cameras in a rigid frame, similar to what we did in this project (see Figure 2). For this purpose, we sketched a 3D design of the camera frame in SOLIDWORKS and printed it with a 3D printer. Simple and convenient :). Here you can download the 3D file of the designed frame. Note that it is necessary to place the centers of the cameras' sensors as close as possible to the same horizontal line. The horizontal distance between the centers of the cameras is also a parameter that you must determine at this stage. We chose a distance of approximately 2.5 inches, which is roughly the distance between a person's two eyes.
After preparing the camera setup, we can move forward with the first step.
Here you need to print an A4 (or even bigger) checker-board. Mount the paper on a flat wall and start capturing images from both the left and right cameras simultaneously. Try to take pictures from different angles and at different distances. You can see some examples of the images we used in this project in Figure 3.
There are many tutorials on the stereo calibration process on the internet; I will put links to some of them in the reference section. In this project we do not focus on the details of the calibration process, but there are some things to keep in mind. The images we take with the cameras look smooth and clean, but in fact they contain deviations, especially radial distortion, that must be detected and removed. We expect this step to yield the amount of these deviations in the form of distortion matrices. We should also obtain the rotation and translation matrices of the right camera (camera number 2) relative to the left camera, which is the reference. These matrices later help us find the position of an object in real-world space using the concept presented in the Figure below.
The important tip to keep in mind is that each camera pair will have its own matrices after the calibration process, so you have to walk through this journey once with your own cameras :). You can see the complete code we used for stereo calibration in this link.
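In the meantime, here is a minimal sketch of what such a calibration script does, assuming a 9×6 inner-corner checker-board and paired images stored in left/ and right/ folders (both the board size and the folder layout are my assumptions, not necessarily the project's):

```python
import glob
import cv2
import numpy as np

# Inner-corner count of the checker-board (an assumption; match your printed board)
CHECKERBOARD = (9, 6)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

# 3D positions of the board corners in the board's own coordinate system
objp = np.zeros((CHECKERBOARD[0] * CHECKERBOARD[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:CHECKERBOARD[0], 0:CHECKERBOARD[1]].T.reshape(-1, 2)

obj_points, img_points_l, img_points_r = [], [], []
for f_l, f_r in zip(sorted(glob.glob('left/*.png')), sorted(glob.glob('right/*.png'))):
    gray_l = cv2.cvtColor(cv2.imread(f_l), cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(cv2.imread(f_r), cv2.COLOR_BGR2GRAY)
    ret_l, corners_l = cv2.findChessboardCorners(gray_l, CHECKERBOARD, None)
    ret_r, corners_r = cv2.findChessboardCorners(gray_r, CHECKERBOARD, None)
    if ret_l and ret_r:  # keep only pairs where the board is found in both views
        obj_points.append(objp)
        img_points_l.append(cv2.cornerSubPix(gray_l, corners_l, (11, 11), (-1, -1), criteria))
        img_points_r.append(cv2.cornerSubPix(gray_r, corners_r, (11, 11), (-1, -1), criteria))

# Calibrate each camera individually, then the pair: R and T describe the right
# camera relative to the left (reference) camera
_, mtx_l, dist_l, _, _ = cv2.calibrateCamera(obj_points, img_points_l, gray_l.shape[::-1], None, None)
_, mtx_r, dist_r, _, _ = cv2.calibrateCamera(obj_points, img_points_r, gray_r.shape[::-1], None, None)
ret, mtx_l, dist_l, mtx_r, dist_r, R, T, E, F = cv2.stereoCalibrate(
    obj_points, img_points_l, img_points_r, mtx_l, dist_l, mtx_r, dist_r,
    gray_l.shape[::-1], flags=cv2.CALIB_FIX_INTRINSIC)
```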
Here are some golden tips for an accurate calibration:
1- Try to keep a distance of at least two meters when capturing images of the checker-board.
2- If you have designed a fixed frame for the cameras (as I did), make sure that the horizontal and vertical axes of each camera correspond exactly to the real horizontal and vertical axes.
3- If you find that the calculated real-world coordinates are not correct, there is probably a problem with the camera matrices. I suggest you repeat the stereo calibration process with the MATLAB app designed specifically for stereo calibration and compare the resulting matrices. Use this link to get acquainted with this app.
4- Use cameras that do not have auto-focus, because auto-focus changes the focal length while you are taking images of the checker-board and causes serious problems in the final real-world xyz.
5- The checker-board must be rectangular, not square; that is, the number of squares in the rows must not equal the number in the columns.
6- Following the recommendation of the MATLAB app, you should take at least 10 to 20 images, and it is better if the images are in PNG format.
Be sure to follow these small tips, as they will ultimately affect the performance of the system.
Step 2: Face Detection
There are many ways to detect faces, most of which are based on machine learning. The method we used in this project is the Haar Cascade classifier. The main reasons for choosing it were its low computational cost and the fact that it requires no deep network trained on large amounts of data. The Haar Cascade classifier was introduced in 2001 by Paul Viola and Michael Jones in a paper entitled “Rapid Object Detection using a Boosted Cascade of Simple Features” [Reference]. In the following, we examine the details of the face and eye detection algorithm step by step.
First, we need to import the cascade files for detecting the face and eyes into the coding environment (I mostly use Spyder), and then receive the stream of frames from the cameras with the following code.
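A minimal sketch of that setup, assuming the cascade files that ship with opencv-python and camera indices 0 and 1 (the indices depend on your machine):

```python
import cv2

# Haar cascade files bundled with opencv-python (paths are an assumption;
# point them at your own copies if needed)
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_eye.xml')

# Open the left and right cameras (device indices depend on your setup)
capL = cv2.VideoCapture(0)
capR = cv2.VideoCapture(1)
```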
Then, in a while loop, we start reading the frames from the left and right cameras.
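A bare skeleton of that loop might look like this (the q exit key is my assumption):

```python
while True:
    retL, frameL = capL.read()
    retR, frameR = capR.read()
    if not (retL and retR):
        break
    # ... face detection, triangulation and angle calculation go here ...
    if cv2.waitKey(1) & 0xFF == ord('q'):  # exit command at the end of the loop
        break
```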
All subsequent code is executed within this infinite while loop; of course, at the end of the loop we put a command to exit it. On each iteration, we feed the captured images of the two cameras into a function called undistortion. Let's take a look at this function and its output together.
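A sketch of such an undistortion helper, built on the matrices from the calibration step (the exact implementation may differ from the project's):

```python
def undistortion(frame, mtx, dist):
    """Remove lens distortion from a frame using the calibration results."""
    h, w = frame.shape[:2]
    # Refine the camera matrix and get the valid region of interest (ROI)
    new_mtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w, h), 1, (w, h))
    undistorted = cv2.undistort(frame, mtx, dist, None, new_mtx)
    # Crop to the ROI to drop the black regions that appear at the corners
    x, y, rw, rh = roi
    return undistorted[y:y + rh, x:x + rw]

# Inside the loop, for example:
frameL = undistortion(frameL, mtx_l, dist_l)
frameR = undistortion(frameR, mtx_r, dist_r)
```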
As Figure 6 shows, there is no noticeable difference between Figure 5 and this image, which indicates only a small amount of distortion in the cameras' lenses.
The outputs of the cropping step are shown in Figure 7. We use the cv2.getOptimalNewCameraMatrix command to find the region of interest (ROI) and remove the deviations that appear in the corners of the images.
Well, so far so good. Let's go back to the face detection code inside the while loop. First of all, we convert the color space of the frames from RGB to grayscale. We have also created an instance of the cascade classifier class before the while loop. Now we detect the face area using the following code snippet. However, it is possible that no face is found in some frames, so what happens to the rest of the algorithm, which is based entirely on the extracted face area? The answer is simple: the program would raise an exception. To solve this problem, we use a try/except block; if a face is not recognized, the algorithm passes over the current frame in the hope that a face will be found in the next one.
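A sketch of that snippet (the detectMultiScale parameters here are my assumptions):

```python
grayL = cv2.cvtColor(frameL, cv2.COLOR_BGR2GRAY)
grayR = cv2.cvtColor(frameR, cv2.COLOR_BGR2GRAY)
try:
    facesL = face_cascade.detectMultiScale(grayL, scaleFactor=1.3, minNeighbors=5)
    facesR = face_cascade.detectMultiScale(grayR, scaleFactor=1.3, minNeighbors=5)
    # Take the first detected face in each view; indexing an empty result
    # raises an exception when no face is found
    xL, yL, wL, hL = facesL[0]
    xR, yR, wR, hR = facesR[0]
except IndexError:
    continue  # pass this frame and hope for a detection in the next one
```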
Do not forget that we are still inside the while loop. If we inspect one of the output variables of the above code, such as facesL, we find four values x, y, w, h for each detected face, where x, y are the coordinates of the top-left corner of the rectangle containing the face. Take a look at the image below.
Now we can extract the face area from the rest of the image and look for the position of the eyes in the desired area.
Like the face, the eye detection output, stored in the eyesL and eyesR variables, consists of four values named ex, ey, ew, eh (with l and r suffixes for the left and right cameras). Since more than one face may appear in the image, it is necessary to loop over the coordinates of each eye, which are stored as a list. In the next step, we calculate the coordinates of the center of the line between the eyes in the main image (the image before the face region was cropped).
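A minimal sketch, shown for the left camera only (the right camera is handled identically to obtain cxR and cyR):

```python
# Search for eyes only inside the detected face region
roiL = grayL[yL:yL + hL, xL:xL + wL]
eyesL = eye_cascade.detectMultiScale(roiL)

centersL = []
for (ex, ey, ew, eh) in eyesL:
    # Eye center mapped back to the coordinates of the full (uncropped) image
    centersL.append((xL + ex + ew // 2, yL + ey + eh // 2))

# Midpoint of the line between the first two detected eyes, in pixels
if len(centersL) >= 2:
    cxL = (centersL[0][0] + centersL[1][0]) // 2
    cyL = (centersL[0][1] + centersL[1][1]) // 2
```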
Congratulations, so far we have been able to reach the following output.
What we were looking for in step 2 is the coordinates of the red circle between the two eyes. We use this point as a representation of the driver when calculating the best angles for the rear-view mirror. But we need the 3D coordinates of this point in the real world, not pixels in the image. For this purpose, we obtain the xyz coordinates with the following code, using the projection matrices obtained in the stereo calibration step.
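A sketch of that calculation, assuming P1 and P2 are the 3×4 projection matrices of the left and right cameras produced by the calibration (for instance by cv2.stereoRectify):

```python
# Pixel coordinates of the between-the-eyes point in each view, as 2x1 arrays
ptsL = np.array([[cxL], [cyL]], dtype=float)
ptsR = np.array([[cxR], [cyR]], dtype=float)

# Triangulate to homogeneous coordinates, then convert to Cartesian xyz
point_4d = cv2.triangulatePoints(P1, P2, ptsL, ptsR)
driver_xyz = (point_4d[:3] / point_4d[3]).ravel()
```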
There are two main prerequisites for the third step: one is the 3D coordinates of the driver's point, and the other is the coordinates of the center of the rear window. In this project, we measured the coordinates of the center of the rear window manually, relative to the reference coordinate system (the rear-view mirror). With these two parameters in hand, it is time to begin step 3 and immerse ourselves in the world of spatial geometry in Python. So fasten your seat belt.
Step 3: Calculate Mirror’s Angles
To adjust the mirror so that we have the best view of the rear of the car, we usually turn the mirror a little towards ourselves (to the left, in cars where the steering wheel is on the left). In addition, we usually tilt the mirror down a bit. So there are two angles for adjusting the mirror: the heading angle, or Yaw, and the Pitch angle. Figure 10 illustrates the definition of these angles.
Now let's talk a little about the idea of finding the right angles for each specific driver, so that they have the widest possible view through the rear window of the car. Along the way, always keep in mind that the reference origin for calculating the angles is the left camera, which is located above the mirror. We also assume that the distance from the left camera to the center of the rear-view mirror is negligible, so in effect we take the center of the mirror as the origin of the coordinates. To make this concrete, imagine yourself in a car. Consider the center point of the line between your two eyes, and from this point draw a straight line to the center of the rear-view mirror (the coordinate origin). Looking at ourselves from the side and top views, this is the purple line in Figure 10. Now draw a line from the center of the mirror to the center of the rear window. To avoid overcomplicating the subject, let's look at the red lines together in Figure 11.
Now we have to find the bisector plane of these two planes, because the angle between the bisector plane and the x-axis represents the Yaw angle.
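The project's own code is not reproduced here; as an equivalent vector formulation of the same idea, the mirror normal must bisect the directions toward the driver and toward the rear window (the law of reflection). A sketch, assuming OpenCV's camera axes (x right, y down, z forward) with the mirror center at the origin:

```python
import numpy as np

def mirror_angles(driver_xyz, window_xyz):
    """Yaw and Pitch (degrees) of the mirror normal that bisects the driver
    and rear-window directions; signs depend on the assumed axis convention."""
    d = np.asarray(driver_xyz, dtype=float)
    w = np.asarray(window_xyz, dtype=float)
    d /= np.linalg.norm(d)
    w /= np.linalg.norm(w)
    # The normal of an ideal mirror bisects the incoming and outgoing rays
    n = d + w
    n /= np.linalg.norm(n)
    yaw = np.degrees(np.arctan2(n[0], n[2]))   # rotation in the horizontal plane
    pitch = np.degrees(np.arcsin(n[1]))        # elevation out of that plane
    return yaw, pitch

# Hypothetical example values, in the units of the calibration
yaw, pitch = mirror_angles((40, -10, 60), (-5, -15, -150))
```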
The following Figure helps to better understand the above code.
To create the bisector plane of the driver and rear-window planes, we need a line plus a point that does not lie on that line. The target line is actually the intersection of those two planes in Figure 11; it provides two of the three points required to create the new plane. To find the third point, it is enough to calculate the tangent of the Yaw angle. I leave that discovery to you.
Now we create the plane containing the origin, the driver point and the rear window point.
As a final blow, it is enough to calculate the intersection of the bisector plane and the driver-window plane. The output of this intersection is a line, rotated in three-dimensional space by the Yaw angle in the horizontal plane and by the Pitch angle in the vertical plane.
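Here is a minimal sketch of these two geometry steps using sympy's geometry module. All coordinates are hypothetical example values, and the bisector is built from the planes' unit normals rather than from the tangent of the Yaw angle, which should give the same plane up to orientation:

```python
from sympy import Matrix, Plane, Point3D

# Hypothetical coordinates, mirror center at the origin (units follow the calibration)
origin = Point3D(0, 0, 0)
driver = Point3D(40, -10, 60)     # 3D driver point from triangulation (example)
window = Point3D(-5, -15, -150)   # manually measured center of the rear window
up = Point3D(0, 1, 0)             # assumed vertical direction

# Vertical planes through the mirror-driver and mirror-window lines (Figure 11)
driver_plane = Plane(origin, driver, up)
window_plane = Plane(origin, window, up)

# Bisector plane: its normal is the sum of the two unit normals (one of the two
# bisectors; flip the sign if the orientation comes out wrong)
n1 = Matrix(driver_plane.normal_vector).normalized()
n2 = Matrix(window_plane.normal_vector).normalized()
bisector_plane = Plane(origin, normal_vector=tuple(n1 + n2))

# Plane containing the origin, the driver point and the rear-window point
driver_window_plane = Plane(origin, driver, window)

# Their intersection is a Line3D along which the mirror normal must point
mirror_line = bisector_plane.intersection(driver_window_plane)[0]
print(mirror_line.direction_ratio)
```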
Step 4: Send Angles to the Servo Motors via the Serial Port
If you have been with me up to this point, you know that we are still inside the while loop and all of the previous processing is done on every frame. Therefore, it is necessary to send the calculated angles to the servo mechanism through the USB serial interface. To do this, we use the pySerial library. Enter this code before entering the while loop, and do not forget to change the COM number.
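Something like the following (the port name and baud rate are placeholders):

```python
import serial

# Open the link to the servo controller; change 'COM3' to your own port
ser = serial.Serial('COM3', baudrate=9600, timeout=1)
```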
Then, send the angles as strings with the following lines inside the while loop. Keep in mind the data format your servo motors expect.
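For example (the "yaw,pitch" message format is my assumption; match whatever your servo firmware parses):

```python
# Inside the while loop: transmit both angles as a single text line
ser.write(f"{yaw:.1f},{pitch:.1f}\n".encode())
```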
In this project, we used our creativity to move the rear-view mirror in the horizontal and vertical planes :) and designed a two-axis configuration using two servo motors. One servo is responsible for rotating the mirror in the horizontal plane (implementing the Yaw angle) and the other rotates it in the vertical plane (implementing the Pitch angle).
Finally, we will prepare a complete report on the details of the PCB design and implementation very soon, and I will share the link with you.
References and Resources
1- Stereo Vision Library in Python: Link
2- Stereo calibration tutorial on Medium: Link
3- Haar Cascade Classifier Documentation: Link
4- Stereo calibration Documentation — MATLAB: Link