Point Set Registration Methods — Discussion and Implementation

Anupama Rajkumar
The Startup
Published in
12 min readFeb 3, 2021

An Introduction to Point Set Registration using Iterative Closest Point and Trimmed Iterative Closest Point with Implementation

Photo by Ivan Sanford on Unsplash

What is Point Set Registration?

Point set (or cloud) registration¹ is a widely used technique in the field of computer vision, pattern recognition, robotics and image processing. It finds extensive application in many fields like medical image processing, intelligent or autonomous vehicles , robotic manipulation, SLAM, panorama stitching to name a few².

The main purpose of point set registration is to find correspondences and estimate the transformation (rotation and translation) between two or more point sets. In practice, point set registration is a rather challenging task as the available point clouds always have some noise (like outliers) or deformation².

There are many point set registration algorithms that can be used to transform point clouds. These algorithms can be largely categorized into: Pairwise and Groupwise registration algorithms².

Pairwise registration only considers two point sets while groupwise registration performs transformations on more than two point sets simultaneously². In this article, we will discuss about the classic Iterative Closest Point (ICP) and Trimmed Iterative Closest Point (TrICP) methods both of which are pairwise registration method. While there are algorithms that perform affine transformations between clouds (rotation, translation, scaling and skews), in this article we will talk about only rigid transformations i.e. rotation and translation between point clouds.

Iterative Closest Point (ICP)

Paul J. Besl and Neil D. McKay in 1992 published a paper called “A Method for Registration of 3-D Shapes”³ and this paper forms the foundation of the ICP algorithm as we know it. In this section, I will discuss the methodology that they proposed to perform rigid transformation between two point clouds along with code snippets. In the end, I will show some point clouds to demonstrate the results achieved and some limitations of this method.

Given a 3-D data point cloud which describes the data shape corresponding to a model shape, the aim of this algorithm is to estimate the optimal rotation and translation that aligns, or registers, the model shape and data shape minimizing the distance between the shapes and thereby allowing the determination of the equivalence of the shapes via a mean-square distance metric³. This method provides a generic solution to the alignment problem.

The first step of the algorithm is to compute the closest point to a given point. The method proposed in the paper finds Euclidean distance between a point p and a point set A with N points. It picks the point in the point set A with minimum distance to the given point p³. As one can imagine this is a very slow and computationally demanding process and this can pose a challenge when the size of the point cloud is very large. Hence, this approach to finding closest point doesn’t seem very practical.

As an enhancement, instead of using euclidean distance, k-d trees can be used to find the closest points. I would leave out the details of k-d trees as it is beyond the scope of this article. K-d trees can be easily implemented using the open source library Nanoflann⁶. The code snippet to do the same would look like this:

This would create a tree and store all the nearest points, as desired. Once we have corresponding closest point of both the data and point clouds, the next step is to actually calculate the rigid transformations ie. the rotation and translation matrices. This process is called point set registration and there are two ways to do this — using Single Value Decomposition (SVD) as described by Arun, Huang et al. in Least Square Fitting of Two 3-D Point⁴ or using Unit Quaternion as applied in the original paper³. We will discuss the steps and implementation of both the methods:

  1. Least-Squares Fitting using Single Value Decomposition (SVD)⁴

In this section, I will describe the steps proposed by Arun, Huang et al. in their paper Least Square Fitting of two 3-D points⁴ using SVD.

Given two point sets P and Pᶦ are related by :

where, R : Rotation matrix, T : Translation matrix and N is a noise vector, then this algorithm presents a way of finding the least-squares solution of R and T based on SVD of a 3 x 3 matrix.

We want to find R and T to minimize

In order to do this, first we find centroid of both the point clouds by finding their respective means. Once we have found the centroid to these point clouds, we subtract these values with the respective point clouds in order to center the mass-point of each cloud.

If Q and Qᶦ are the centered mass-points of P and Pᶦ respectively we have:

Therefore, we reduce the original least-squares problem to two parts:

a. To find R to minimize the above equation⁴

b. Find translation, T using:⁴

Once we have P, Pᶦ, Q and Qᶦ, we can compute a 3 x 3 covariance matrix matrix as :

This can be implemented as follows

Next, we find SVD of H. Since it is out of the scope of this article, I will skip the discussion about the math behind SVD and assume that the reader has background knowledge of SVD. There are functions in C++ that help in calculating the SVD of H which gives matrices U and Vᵗ. Once we have these matrices, we calculate rotation matrix, R as:

It is important to calculate the determinant of R matrix obtained from the calculation above. If the det(R) is 1, then the rotation matrix is correct and we can proceed with calculation T from the equation given above⁴.

However, in case det(R) is -1, the algorithm fails and leads to reflection instead of rotation. However, this can be remedied by multiplying the last column of V matrix with -1. This solution works only if two of the three singular values of H are equal⁴.

In case det(R) is -1 and none of the singular values of H is 0, this method fails. This usually happens when the point clouds have higher outliers and RANSAC is a better approach in such a case as opposed to least-squares fitting⁴.

That was too much maths but there is a chance that if you work in 3D computer vision or SLAM, SVD is an everyday jargon for you!

Let’s see how do we implement this!

2. Least- Squares Fitting using Unit Quaternion³

This was the method followed by Besl and McKay in their paper³ so as to avoid reflections as described in the previous section. However, the advantage of SVD over unit quaternions is that the former generalizes well to n dimensions and would be an ideal approach in case n > 3.³

The unit quaternion is a four vector

where

To find the transformation matrix, we start by calculating the covariance matrix by following the steps as explained in the previous section.

After we have the covariance matrix, H, we find the anti-symmetric matrix:

From this, we take the cyclic components to compute:

This vector is then used to form the symmetric 4 x 4 matrix Q(H):

Here, I is the 3 x 3 identity matrix. The unit eigenvector corresponding to the maximum of eigenvalue of the matrix Q(H) is selected as the optimal rotation

The translation vector can be determined from R as done in the first method of SVD.

Now that the methods for computing the closest point on a geometric shape to a given point and for computing a least squares registration vector has been outlined³, we can get on with aligning the two point clouds.

We move or register the data shape point cloud P to best align with the model shape point cloud M.

We repeat the following steps until convergence³:

  1. Compute the closest point using the method outlined earlier (k-d tree)
  2. Compute registration (SVD or unit quaternion)
  3. Apply the computed registration to the data point cloud
  4. Terminate the process when the mean square error falls below a preset threshold or if the total number of iterations have been exhausted. In that case, the solution fails to converge and find an optimum registration.

As per the Convergence Theorem of the ICP Algorithm, “it always converges monotonically to a local minimum with respect to the mean-square distance objective function”³.

Experimentally, it is found that in the initial few iterations the convergence is very fast but slows down as it approaches the local minimum. Even at slow pace, usually the solution is obtained in between 30–50 iterations³. In the implementation shared here for the fountain point cloud, a reasonable solution was obtained in 40 iterations.

Let us look at some results now. For a fountain model and data points with some pre-registration yields the following result after applying ICP.

Data point cloud. Image by Author
Model point cloud. Image by Author
Fountain after successful ICP registration. Image by Author
Fountain after successful ICP registration. Image by Author

ICP gives good results as long as the data points are outlier free and that data point set is a subset of the model set. ICP is not robust in itself and would require robustification by using techniques like RANSAC or better sampling methods. In addition, although ICP guarantees convergence, it can be relatively slow.

Trimmed Iterative Closest Point (TrICP)

Trimmed Iterative Closest Point or TrICP addresses the problem of alignment of two roughly pre-registered, partially overlapping 3D point sets in presence of measurement outliers and possibly, shape defects⁵.

TrICP is an extension of ICP proposed by D. Chetverikov (who also taught me sensor fusion at University!) et al in their paper and addresses the critical issue of robustness as the original algorithm assumes outlier-free data and that the data point cloud, P is a subset of model cloud, M⁵.

The new algorithm is based on the use of Least Trimmed Squares (LTS) approach in all phases of the operation. LTS means sorting the square errors and minimizing a certain number of smaller values⁵. What this implies is that instead of taking square distance of all the points in the cloud, only first few points which represents appropriate correspondences between data and model sets are taken.

LMedS is another method that minimizes the median ie. the value in the middle of the sorted sequence. However, we use LTS as it has better convergence rate and a smoother objective function⁵.

Let’s talk about the algorithm now. Given we have two point clouds, data point cloud P with p points and model point cloud M with m points⁵. Usually p is different from m. This means that a large portion of data points do not find correspondence in model set. Let us assume that the minimum guaranteed rate of data points that can be paired is known and is given by a minimum overlap, o. Then the number of data points that can be paired to model point set is given as⁵:

What this means is that after applying overlap parameter, it would be assured that the correspondences between P and M would be ensures and points taken from both the point clouds would be same.

Note that the value of o is unknown and we calculate it’s value using golden section search algorithm⁷ as explained later in the section.

TrICP assumes at P and M have been roughly pre-registered, either manually or automatically⁵. Pre-registration means that the data and the model set already has some established correspondence or overlap.

Usually this value is initial relative rotations up to 30 degs. TrICP is similar to ICP with consistent use of LTS in all major aspects of it’s operation which allows it to cope with outliers, shape defects or just partial overlap; to estimate the optimal transformation at each iteration step⁵.

We repeat the following steps until convergence⁵:

  1. For each point in data set P, find closest point in M and compute the square of individual distance (refer the code snippet for finding closest point earlier)
  2. Sort these distances in ascending order and select the oN least values and calculate their sum S, where o is the overlap parameter
  3. If the stopping conditions are satisfied, exit. Otherwise, continue.
  4. Compute the optimal transformations (R, t) that minimizes S
  5. Transform P according to (R, t) and go to 1

Stopping conditions used in 3 are⁵:

  1. Maximum allowed iterations have been reached in which case the solution doesn’t converge
  2. The trimmed mean square error (MSE) is lower than the threshold
  3. The relative change of the trimmed MSE is sufficiently small indicating that the solution has reached a minima

Let’s touch upon how we calculate the overlap parameter parameter o. When o is unknown, it is set automatically by minimizing the objective function⁵:

Usually:

We try to minimize the objective function using golden section search algorithm⁷. The minimum of the objective function is searched in the range 0.4 and 1. Usually this value can be determined in 5–8 iterations. Please note that when o is 1, TrICP becomes ICP⁵. The code snippet below shows how I determine the overlap parameter.

TrICP’s Convergence Theorem states that the algorithm “always converges monotonically to a local minimum with respect to the trimmed MSE objective function”⁵.

From the experiments, it can be seen that TrICP converges faster than ICP and has advantages when the model and data point sets have different sizes. The result for the same fountain data and model point clouds as shown earlier with TrICP can be seen here:

Fountain after successful TrICP registration. Image by Author
Fountain after successful TrICP registration. Image by Author

Through this article, I wanted to discuss about some basic point set registration algorithms and present a way to implement them. There are open source libraries like PCL which already do this for you but it is always good to know and understand what they do in real.

In addition, it is worth mentioning that this is the simplest possible implementation of the said algorithms. Point registration remains one of the most challenging tasks and there are innumerable variations to make it efficient, fast and robust. This becomes even crucial when the incoming point clouds are real time and has ample outliers for example in a drone or an autonomous vehicle.

You can find the entire implementation of the above discussed algorithms along with some other topics in sensor fusion here . I hope you find this useful and Happy Coding!

References

[1] Point set registration, Wikipedia: https://en.wikipedia.org/wiki/Point_set_registration

[2] Hao Zhu et al. A Review of Point Set Registration: From Pairwise Registration to Groupwise Registration: https://www.semanticscholar.org/paper/A-Review-of-Point-Set-Registration%3A-From-Pairwise-Zhu-Guo/c0fff09ebc59fef099d27ea6375acb861b7fe4e9

[3] Besl and McKay. A method for registration of 3-D shapes: https://ieeexplore.ieee.org/document/121791

[4] Arun, Huang et al. Least Square Fitting of Two 3-D Point Sets:https://ieeexplore.ieee.org/document/4767965

[5] D. Chetverikov, Svirko et al. The Trimmed Iterative Closest Point Algorithm : https://ieeexplore.ieee.org/document/1047997

[6] Nanoflann: https://github.com/jlblancoc/nanoflann

[7] Golden Section Search Algorithm : https://en.wikipedia.org/wiki/Golden-section_search

--

--

Anupama Rajkumar
The Startup

Student for life. Here to share all I’ve done or learnt. Support my writing by subscribing https://anupamarajkumar.medium.com/subscribe