Exploring KD-Tree in Point Cloud

Method to pair up Point Cloud with KD-Tree

Yu Kai Him Otto
4 min read · Jul 18, 2023

Comparing point clouds is a crucial step in processing and analysing point cloud data. However, because individual scans never sample exactly the same physical points, a direct point-by-point comparison is not possible. Therefore, the first and primary step in point cloud analysis is to pair up corresponding points so that meaningful comparisons can be drawn.

Point pairing is a critical task in surface deformation and movement analysis. It not only provides the correspondences needed for deformation analysis, but also lets us estimate, for example, vegetation growth rates from airborne LiDAR data by comparing paired points across scans taken at different times.

Different grid sampling methods

Although grid sampling is an alternative approach to point pairing, it produces coarser data because of the sub-sampling process, which may involve averaging.

On the other hand, point pairing excludes averaging and produces a more in-depth and fine-grained result compared to grid sampling.
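To make the contrast concrete, here is a minimal sketch (not from the original article) of grid sampling with averaging: all points falling into the same voxel are collapsed into their mean, so fine detail is lost. The voxel size of 0.5 is an arbitrary choice for illustration.

```python
import numpy as np

def grid_sample(points, voxel_size=0.5):
    """Average all points that fall into the same voxel (one output point per voxel)."""
    # Integer voxel index for every point
    keys = np.floor(points / voxel_size).astype(np.int64)
    # Group points by voxel and average each group
    _, inverse, counts = np.unique(keys, axis=0, return_inverse=True, return_counts=True)
    sums = np.zeros((len(counts), points.shape[1]))
    np.add.at(sums, inverse, points)  # accumulate each point into its voxel's sum
    return sums / counts[:, None]

cloud = np.random.rand(1000, 3)          # synthetic 1000-point cloud in a unit cube
sampled = grid_sample(cloud, voxel_size=0.5)
# A unit cube with 0.5 m voxels has at most 8 voxels: 1000 points collapse to <= 8
```

Point pairing, by contrast, keeps every original point and only establishes correspondences between them.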

Point Cloud Library. How to use a KdTree to search. Retrieved from https://pcl.readthedocs.io/en/latest/kdtree_search.html

KD-Tree

The KD-Tree is the simplest way to do point pairing and matching. It uses distance-weighted range queries and nearest-neighbour searches to find, for each point in the target data, the best-fitting closest point in the original.

It is a binary tree structure that partitions the point cloud into smaller subsets based on their spatial coordinates. The root node of the KD-Tree represents the entire point cloud. Each internal node represents a subset of the point cloud and is split into two child nodes along a selected coordinate axis. The splitting plane is commonly placed at the median along the axis of greatest spread, which helps to keep the tree balanced.

For nearest-neighbour search, the tree is traversed from the root down to the leaf node whose region contains the query point. At each node, the distance between the query point and the splitting plane is calculated. The algorithm then backtracks, descending into the other child only when the splitting plane lies closer to the query point than the best distance found so far, until the closest point is confirmed.

Sample of the KD-Tree

The k-d tree nearest-neighbour search algorithm finds the closest point to a given query point in a k-d tree data structure.

  1. To find the nearest point, the algorithm starts at the root of the tree and moves down to a leaf node as if it’s inserting the query point into the tree.
  2. It calculates the distance between the query point and the leaf node, and updates a variable to store the distance and the coordinates of the leaf node.
  3. It then moves back up the tree towards the root, checking each node along the way.
  4. If the distance from the query point to a node is less than the current best distance, the algorithm updates the current best with the new distance and node coordinates; and if the node's splitting plane is closer to the query point than the current best distance, the opposite branch may still hold a closer point and is searched as well, otherwise it is skipped.

The algorithm continues traversing the tree until all nodes have been checked, or until it is known that the nearest point has been found. Finally, it returns the coordinates of the nearest point.
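The steps above can be sketched as a minimal pure-Python k-d tree (the class and function names here are illustrative, not from any library; 3-D points, with the split axis cycling by depth):

```python
import numpy as np

class Node:
    def __init__(self, point, axis, left=None, right=None):
        self.point, self.axis, self.left, self.right = point, axis, left, right

def build(points, depth=0):
    """Split at the median along a cycling axis, which keeps the tree balanced."""
    if not points:
        return None
    axis = depth % 3
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return Node(points[mid], axis,
                build(points[:mid], depth + 1),
                build(points[mid + 1:], depth + 1))

def nearest(node, query, best=None):
    """Descend toward the leaf (steps 1-2), then back up (steps 3-4),
    crossing the splitting plane only if it could hide a closer point."""
    if node is None:
        return best
    dist = np.linalg.norm(np.subtract(node.point, query))
    if best is None or dist < best[0]:
        best = (dist, node.point)           # update current best
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)       # search the query's side first
    if abs(diff) < best[0]:                 # plane closer than best: check other side
        best = nearest(far, query, best)
    return best

tree = build([(2, 3, 0), (5, 4, 0), (9, 6, 0), (4, 7, 0), (8, 1, 0), (7, 2, 0)])
dist, point = nearest(tree, (9, 2, 0))
# point -> (8, 1, 0), the closest of the six sample points to (9, 2, 0)
```

In practice a library implementation such as SciPy's `KDTree` should be preferred; this sketch only mirrors the traversal described above.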

Application with Python

SciPy with NumPy:

import numpy as np
from scipy.spatial import KDTree

# Two point clouds as (N, 3) arrays
cloud1 = np.random.rand(100, 3)
cloud2 = np.random.rand(120, 3)

# Build a KD-tree on the cloud being searched
tree2 = KDTree(cloud2)

# For each point in cloud1, find its single nearest neighbour in cloud2
dists, indices = tree2.query(cloud1, k=1)

# Get the paired points from cloud2
neighbors = cloud2[indices]
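A common follow-up step, shown here as a sketch rather than part of the original snippet, is to reject pairs whose nearest-neighbour distance exceeds a threshold, so points with no true counterpart do not distort a deformation analysis. The `max_dist` value is an assumed tolerance that must be tuned for the data.

```python
import numpy as np
from scipy.spatial import KDTree

cloud1 = np.random.rand(100, 3)   # synthetic reference scan
cloud2 = cloud1 + 0.01            # synthetic second scan, shifted slightly
dists, indices = KDTree(cloud2).query(cloud1, k=1)

max_dist = 0.05                   # assumed tolerance; tune for your data
mask = dists <= max_dist
# Each row of `pairs` is (index into cloud1, index into cloud2)
pairs = np.column_stack([np.nonzero(mask)[0], indices[mask]])
```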

Open3D:

import numpy as np
import open3d as o3d

# Create a KD-tree from the first point cloud
pcd_tree = o3d.geometry.KDTreeFlann(pcd1)

# Find the nearest neighbour in the first cloud for each point in the second.
# search_knn_vector_3d accepts a single query point, so we loop over pcd2.
indices = []
for point in pcd2.points:
    k, idx, _ = pcd_tree.search_knn_vector_3d(point, 1)
    indices.append(idx[0])

# Collect the paired points from pcd1 and colour them blue
points1 = np.asarray(pcd1.points)
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points1[indices])
pcd.colors = o3d.utility.Vector3dVector(np.tile([0.0, 0.0, 1.0], (len(indices), 1)))

Problems in the KD-Trees

KD-Trees can sometimes encounter challenges when used to compare complex three-dimensional data.

One such challenge arises when the scanner samples the object from different angles between scans. In these cases, the nearest-neighbour comparisons performed by the KD-Tree may produce wrong pairings, because the search considers only Euclidean distance and ignores surface orientation and occlusion in three-dimensional space: the geometrically nearest point is not necessarily the corresponding point on the surface.

Another issue that can arise with KD-Trees is imbalanced trees. This occurs when some nodes in the tree have many more points than others. This imbalance can lead to poor performance when searching for nearest neighbors.

Non-uniform data distribution is another issue that can cause problems for KD-Trees. When the data points are not uniformly distributed, some regions of the space may have many more points than others. This can lead to an imbalanced tree and poor performance.
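One practical mitigation is to have the library balance the tree at build time. The sketch below uses SciPy's real `balanced_tree` option (when `True`, the default, splits are made at the median of the data rather than the midpoint of the bounding box), at the cost of a slower build; the skewed cloud here is synthetic.

```python
import numpy as np
from scipy.spatial import KDTree

# Heavily non-uniform cloud: a dense cluster plus a few far-away outliers
cluster = np.random.normal(0.0, 0.01, size=(5000, 3))
outliers = np.random.uniform(-100.0, 100.0, size=(50, 3))
cloud = np.vstack([cluster, outliers])

# balanced_tree=True splits at the median, keeping node sizes even
# under skewed distributions; False splits at the bounding-box midpoint.
tree_balanced = KDTree(cloud, balanced_tree=True)
tree_midpoint = KDTree(cloud, balanced_tree=False)

dists, idx = tree_balanced.query(np.zeros((1, 3)), k=1)
```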



Yu Kai Him Otto

Student from Hong Kong, studying in Land Surveying and Geo-informatics, PolyU.