The Magic behind Camera Calibration

Alexander Duda
Jan 26, 2020


Suppose you are working in the area of machine vision and you have detected something really interesting in a camera image. You have even managed to nail its position down to a fraction of a pixel. Wouldn’t it be great to also get its exact geometric location in relation to the camera?

No problem, you might think. I just do a quick camera calibration with one of the many open-source tools and feed the result into a standard camera model, which gives the exact relation between 3D scene points and 2D image points.

In general, you are right. This so-called camera calibration is the basis for many machine vision tasks, ranging from classification to visual tracking and 3D scene reconstruction. However, you would be surprised how poor most of these calibrations are if the typical process taught by many universities and machine vision labs is followed.
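The relation such a model provides can be sketched in a few lines. The following is a minimal pinhole-model example, assuming a distortion-free lens; the intrinsic values (focal lengths, principal point) are made-up numbers for a hypothetical 640×480 camera:

```python
import numpy as np

# Hypothetical intrinsics of a 640x480 camera (made-up values)
fx, fy = 500.0, 500.0      # focal lengths in pixels
cx, cy = 320.0, 240.0      # principal point
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def project(point_cam):
    """Project a 3D point given in camera coordinates to pixel coordinates."""
    uvw = K @ point_cam
    return uvw[:2] / uvw[2]   # perspective division

# A point 2 m in front of the camera, 0.2 m to the right
u, v = project(np.array([0.2, 0.0, 2.0]))
print(u, v)  # 370.0 240.0
```

Calibration is the process of estimating exactly these parameters (plus lens distortion) for your real camera.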

The Calibration Dance

Let’s look at a common use case. You mounted a camera on your new robot, and now you want to do a camera calibration to take advantage of all the fancy vision algorithms out there.

So, as suggested, you fire up an online calibration tool and start dancing in front of the robot, equipped with a tiny calibration target, until the system tells you the camera is calibrated.

This not only looks pretty awkward for anyone not familiar with the process, but it also introduces tons of issues:

  • The calibration dance jeopardizes image quality and introduces motion blur, poor illumination, and critical configurations, to name only a few.
  • The repeatability is close to zero and depends on the skill of the dancer.
  • The calibration takes longer than necessary and is therefore unlikely to be repeated very often.

The same is true for randomly distributing static calibration targets across the scene until you have the feeling that enough images have been collected, in the hope that all errors magically smooth out.

Resulting images of multiple calibration dances.

The Myth

First of all, the general assumption that many different poses of a calibration target must be collected to get good results is plain wrong.

In a perfect situation, only two poses of a planar target would be sufficient to estimate all model parameters of a camera correctly.

However, because most calibration images are far from perfect and contain at least some systematic error, it is usually better to plan for four substantially different poses. Intel, for example, adopts the same approach, suggesting four poses for an OEM calibration of their Intel RealSense camera.

More poses need only be added if the calibration target is too small to cover the full field of view of the camera. How far the target is from the camera does not matter as long as it is acceptably sharp. The only benefit of using different distances is that they control the spacing of the support points in the image domain, which is equivalent to using calibration targets of various sizes. However, having to push the calibration target into the background is usually a sign of an oversized checkerboard or of poor calibration results. Therefore, do not waste your time on images that are not really required.

Accurate Calibration in 30 Seconds

A better approach is to start with the right pattern spacing. For a standard checkerboard target, a good spacing is around 50 pixels in the image domain, which leaves some margin for perspective distortion. A larger spacing usually excludes more image area than necessary, because no rectangular checkerboard can be fitted there.
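Converting the desired image spacing into a physical square size is a one-line pinhole computation. The focal length and working distance below are placeholder values; substitute your own camera’s numbers:

```python
# Assumed values: substitute the figures for your own camera and setup.
f_px = 800.0        # focal length in pixels (hypothetical)
depth_m = 0.7       # distance of the target from the camera in meters
spacing_px = 50.0   # desired corner spacing in the image domain

# Pinhole model: spacing_px = f_px * square_m / depth_m
square_m = spacing_px * depth_m / f_px
print(square_m * 1000, "mm")  # 43.75 mm squares
```

Printing the board with squares of roughly this size keeps the corner spacing near the 50-pixel target at your working distance.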

The symmetric error of standard lenses allows us to split the image into two regions along the vertical or horizontal axis and to place a differently tilted calibration target in each of these regions.

By rotating the camera once, the four required poses can be generated from only two images, with excellent and repeatable results, in less than 30 seconds.

Everything that is needed for an accurate calibration: two images of the right target.

However, assuming, for example, a field of view of 64°×52° with acceptable sharpness starting at around half a meter, two calibration targets of 0.6 m × 0.4 m each would be required.
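A quick check of these numbers: the scene extent visible at a given distance follows directly from the field of view. With the 64°×52° FOV and half-meter distance from the article, the camera sees roughly the area computed below, and with the tilt and some margin you arrive at plates in the 0.6 m × 0.4 m range:

```python
import math

fov_h, fov_v = 64.0, 52.0   # field of view in degrees (from the article)
depth_m = 0.5               # closest distance with acceptable sharpness

# Scene extent visible at that distance: 2 * d * tan(fov / 2)
width = 2 * depth_m * math.tan(math.radians(fov_h / 2))
height = 2 * depth_m * math.tan(math.radians(fov_v / 2))
print(round(width, 2), round(height, 2))  # ~0.62 x 0.49 m
```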

Therefore, this is only an option if the calibration pattern is printed directly onto stiff plates such as aluminum composite panels (ACP). I strongly recommend this to anyone taking calibration seriously, and it won’t break anyone’s bank account if an online print service is used. Self-printed targets are nice in theory. However, in my more than ten years of experience in machine vision, I have never seen a self-printed target glued to a planar, stiff plate without bubbles.

If only smaller calibration targets are available, a bigger one can be simulated by taking multiple images of the smaller ones. Here, the virtual bigger checkerboard can be used as a template guiding the placement of the smaller ones.

But keep in mind: the less area the calibration target covers in the image, the more the uncertainty of the estimated camera parameters grows. Therefore, use a proper calibration target whenever possible and leave the small ones to others.

The Photo Shoot

“Planning is everything. Even one million bad images won’t fill in the missing information.”

When taking the actual calibration images, it is not a good idea to just hold the target in front of the camera. Think of a photo shoot where you are looking for the perfect shot. Planning is everything. Even one million bad images won’t fill in the missing information required for an accurate calibration. Therefore:

  • Mount the target and camera on a tripod.
    Even an exposure time of 5 ms is enough to introduce significant motion blur.
  • Do not collect images where the camera axis is orthogonal to the calibration target.
    This is an unstable configuration, and noise can have a considerable effect on the calibration result. It is better to always use a tilt of 15–30° between the target normal and the camera axis.
  • Get the illumination right.
    The sensor is nonlinear at both ends of its measurement range. Therefore, it is better to keep sensor values between 128 and 200 for bright image regions. Try to use diffuse illumination and point spotlights away from the target.
  • Do not care so much about the image corners.
    Even if you place the target as close as possible to the image corners, the camera won’t be correctly calibrated all the way to its corners. That would require precise localization of image features in the corners, where surrounding support points are missing. You even get slightly better results for the rest of the image if you just ignore them.
    If you really need a calibration all the way to the corners, you have to use a calibration pattern that requires only a few pixels to signal a support point in the image domain, such as a circular pattern.
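The illumination rule above is easy to automate before accepting an image. Here is a minimal sketch that checks whether the bright regions of an 8-bit grayscale image stay inside the 128–200 band; the function name, the 99th-percentile criterion, and the synthetic test image are my own assumptions, not part of the article:

```python
import numpy as np

def bright_region_ok(gray, lo=128, hi=200):
    """Check that the brightest parts of an 8-bit grayscale image stay in
    the sensor's linear range (here: 99th percentile between lo and hi)."""
    p99 = np.percentile(gray, 99)
    return lo <= p99 <= hi

# Synthetic example: mid-grey background with bright squares at value 180
img = np.full((480, 640), 90, np.uint8)
img[::2, ::2] = 180           # stand-in for the bright checkerboard squares
print(bright_region_ok(img))  # True
```

An overexposed image (bright regions at 255) would fail this check, flagging it for retaking before it silently degrades the calibration.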

Final thoughts

Next time you see someone dancing with a calibration target in front of a camera, you know what to do. Or you can just enjoy the show, which I myself performed way too often before looking behind the magic curtain ;)

If you have any questions or ideas to ask or add, feel free to comment below.

Happy calibration!

In case you want some more background information, feel free to read my paper Accurate Detection and Localization of Checkerboard Corners for Calibration.
