Vision Based Motion Control for Mobile Robots

Today, the automotive industry is changing rapidly as major automakers shift their focus towards developing the driverless vehicles of the future. As a result, automotive technology is converging with mobile robotics in the area of autonomous navigation. Among autonomy sensors, the camera receives the most attention because of the abundance of information it provides: beyond depth, color, shape, and texture detection, camera data enables object identification and classification algorithms. It also helps that significant improvements have been made in image processing algorithms and computational power over the past decade. This increase in computer vision performance and capability has led to growing demand for low-level motion controllers that use vision directly. In this article, a type of vision-based motion controller known as a visual servo control system, or “visual servoing” as it is often called, is described for differential-drive mobile robot applications. The content of this article is adapted from two papers that I published recently. The intent here is to provide a brief overview of visual servoing for those who are interested before they dive into the available literature and publications for detailed documentation.

Visual servoing, in simple terms, is a control system that uses vision as feedback to control motors and servos. There are three types of visual servoing techniques: image based, position based, and hybrid. In addition, there are two primary camera placement configurations, known as eye-in-hand and eye-to-hand. In the eye-in-hand setup, the camera is mounted on the robot, so it sees only the target object and not the robot itself. Conversely, in the eye-to-hand setup the camera observes the scene from a third-person point of view, seeing both the robot and the target object. Plenty of tutorials available online describe the workings of all these techniques; however, they are generally filled with complicated math. My job here is to explain image based and position based visual servoing without going in depth with equations. The following article discusses the regulation of a mobile robot towards a desired pose with an eye-in-hand configuration.

Image Based Visual Servoing

In image based visual servoing, the control system operates directly on the two dimensional image frame without considering three dimensional Cartesian coordinates. The following diagram shows a typical camera model, where an object in three dimensional space is described with respect to the camera frame by (Xc, Yc, Zc). This object is projected onto the two dimensional image plane and is described by a feature point with coordinates (u, v). The main goal of image based visual servoing is to control the trajectory of the feature points on this image plane.

Figure 1: Camera model
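The projection from camera-frame coordinates to image coordinates can be sketched with the standard pinhole model. This is a minimal illustration, not code from the publications; the focal length and principal point values below are hypothetical placeholders.

```python
# Pinhole projection of a 3D point in the camera frame onto the image
# plane, matching the camera model of Figure 1. The intrinsics (focal
# length f, principal point cu/cv) are hypothetical example values.

def project(Xc, Yc, Zc, f=800.0, cu=320.0, cv=240.0):
    """Project a camera-frame point (Xc, Yc, Zc) to pixel coordinates (u, v)."""
    if Zc <= 0:
        raise ValueError("point must lie in front of the camera (Zc > 0)")
    u = f * Xc / Zc + cu  # horizontal pixel coordinate
    v = f * Yc / Zc + cv  # vertical pixel coordinate
    return u, v
```

Note that depth information is lost in this projection, which is why image based visual servoing can work purely with pixel coordinates.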

How does one control feature point trajectory on an image plane? The answer is to move the camera, assuming the object in 3D space is stationary (which we will throughout this article). The controller works by comparing the desired feature point positions to the current feature point positions on the image plane. When the camera is not at its desired position in the world frame, there is a pixel error between the current and desired feature point locations. This error becomes the input to a proportional control law (the usual choice in classical approaches), which outputs the wheel speeds needed to steer the robot/camera in a way that decreases the error. This can be seen in the control system block diagram below.

Figure 2: IBVS control block diagram
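The proportional law described above can be sketched as follows. This is a generic, illustrative IBVS update using the standard point-feature interaction matrix from the visual servoing literature, not the exact controller from the publications; the gain and depth values are assumptions.

```python
import numpy as np

# Illustrative IBVS proportional step: the error between desired and
# current (normalized) feature points drives a 6-DOF camera velocity
# through the pseudo-inverse of the stacked interaction matrix.

def interaction_matrix(x, y, Z):
    """2x6 interaction matrix for one normalized image point (x, y) at depth Z."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])

def ibvs_step(current, desired, Z, gain=0.5):
    """One proportional update: returns a camera velocity twist v = gain * L^+ e."""
    e = (np.asarray(desired) - np.asarray(current)).ravel()  # feature error
    L = np.vstack([interaction_matrix(x, y, Z) for x, y in current])
    return gain * np.linalg.pinv(L) @ e
```

When the current features match the desired ones, the error is zero and the commanded camera velocity vanishes, which is exactly the regulation behavior the block diagram expresses.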

Figure 3 illustrates the result of this control. In this figure, we assume a free-flying camera. The top of Figure 3 shows the initial camera pose with respect to a square object placed in the world frame. The goal is to steer the camera towards the front of the target object, where the desired feature point locations are shown at the bottom of Figure 3. By comparing the desired and initial feature point locations, the controller generates the trajectory needed to move the camera towards the desired pose. Figure 4 shows a physical implementation of this algorithm, where a kinematic model is required to relate camera-frame velocities to wheel velocities since we are no longer dealing with a free-flying camera. You may refer to the publications for a full system model development. It is also worth mentioning that image based visual servoing is the easiest of the visual servoing techniques to implement.

Figure 3: IBVS simulation
Figure 4: IBVS experiment
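As a rough sketch of the kinematic step mentioned above, a differential-drive model converts commanded body velocities into wheel speeds. The wheel radius and track width below are hypothetical values; the full model developed in the publications is more complete.

```python
# Differential-drive kinematics sketch: converts the linear velocity v
# (m/s) and angular velocity omega (rad/s) commanded by the visual
# servo controller into left/right wheel angular speeds (rad/s).
# Wheel radius r and track width L are hypothetical example values.

def wheel_speeds(v, omega, r=0.03, L=0.15):
    """Return (omega_left, omega_right) for body velocities (v, omega)."""
    omega_right = (v + omega * L / 2.0) / r  # right wheel speed
    omega_left = (v - omega * L / 2.0) / r   # left wheel speed
    return omega_left, omega_right
```

For pure forward motion (omega = 0) both wheels spin equally; any heading correction from the controller shows up as a speed difference between the two wheels.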

Position Based Visual Servoing

Position based visual servoing, on the other hand, considers the full three dimensional Cartesian space as opposed to just the two dimensional image plane that image based visual servoing operates on. To do this, a camera and a pose estimation algorithm are used to extract the pose of the visual tag, and hence of the robot itself. This technique requires prior knowledge of the target object, which is often a Quick Response (QR) code (also used in this article). Instead of feature point positions, position based visual servoing compares the current estimated pose to the desired pose. This can be seen in the control block diagram below.

Figure 5: PBVS control block diagram

Corrections are then generated by the controller from the three dimensional pose error to regulate the robot towards its goal. In PBVS, a model of the target object is required prior to implementation, which makes this technique more limited and difficult than image based visual servoing. In addition, the control accuracy depends heavily on the performance of the visual pose estimation algorithm deployed.

Figure 6: PBVS experiment
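As an illustration of the idea, a planar proportional pose regulator might look like the following. This is a generic sketch, not the controller from the publication: the estimated 3D pose is reduced here to (x, y, heading), and the gains are illustrative assumptions.

```python
import math

# Illustrative PBVS-style regulation for a differential-drive robot:
# the estimated pose is compared to the goal pose, and proportional
# gains (hypothetical values) generate the velocity commands.

def pbvs_step(pose, goal, k_rho=0.4, k_alpha=1.2):
    """One control step: returns (v, omega) from current and goal (x, y, theta)."""
    x, y, theta = pose
    xg, yg, _ = goal
    dx, dy = xg - x, yg - y
    rho = math.hypot(dx, dy)                              # distance to goal
    alpha = math.atan2(dy, dx) - theta                    # heading error
    alpha = math.atan2(math.sin(alpha), math.cos(alpha))  # wrap to [-pi, pi]
    v = k_rho * rho        # drive towards the goal
    omega = k_alpha * alpha  # turn towards the goal
    return v, omega
```

In a real PBVS system the pose fed into such a controller comes from the visual pose estimator, so any estimation error translates directly into regulation error, as noted above.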


In conclusion, the two techniques shown here are applied to differential-drive mobile robots using classical approaches. Researchers across the world have looked into ways to improve these techniques, for example by using thermal and panoramic cameras or by implementing more advanced controllers such as model predictive control (MPC). The intent of this article is to provide readers with a high level understanding of visual servoing without using mathematical equations. For more detailed descriptions, you may refer to the publications. If you have any questions, please feel free to contact me!


  1. A. H. Tan, A. Al-Shanoon, H. Lang and M. El-Gindy, “Mobile Robot Regulation with Image Based Visual Servoing,” in ASME IDETC/CIE, Quebec City, 2018.
  2. A. Al-Shanoon, A. H. Tan, H. Lang and Y. Wang, “Mobile Robot Regulation with Position Based Visual Servoing,” in IEEE CCECE, Montreal, 2018.