Rotation of Voxels in 3D Space using Python
Transformation of Cartesian Coordinate followed by Interpolation
Introduction
Data augmentation is essential for most machine learning or deep learning tasks. In the field of computer vision and image processing, image rotation is one of the common techniques for data augmentation. Rotation of 2D images is rather simple and can be accomplished by existing tools like OpenCV. However, there is a lack of well-established utilities available for the rotation of the 3D images. This article will discuss how a rotation of 3D data (voxels) can be accomplished by transformation of Cartesian coordinate followed by interpolation.
Data Representation for 3D Objects
Depending on the use case, 3D objects or shapes are often represented in the point cloud, 3D mesh, or voxel. A point cloud is a collective set of points with their respective coordinates (x, y, z) which represent the presence of an object in space. 3D triangular/polygon mesh utilized sets of connecting vertices, forming edges and triangular facets, and results in a triangular mesh that describes the surfaces and edges of an object. Voxel is a 3D equivalent of a pixel in a 2D image (VOlume piXEL). The data is represented by a 3D array where the value of a specific element (voxel) in the array represents some physical properties (color, density) in the space [1]. The point cloud can be found in photogrammetry where multiple images are shot at different angles and resulting in a point cloud. 3D meshes are mostly found in rendering engines in prevalent CAD software or games. Instead of describing vertices or surfaces as found in 3D mesh, voxel focuses on describing the physical properties of a point in the space. Voxels are useful for representing medical data (computed tomography imaging) or physical quantity (density, field) in scientific data.
Both point cloud and 3D triangular mesh encode the properties of an object with sets of coordinates. The rotational operation for 3D volumetric data encoded in these 2 formats can be easily achieved by multiplying a rotational matrix to the coordinates of the points. However, the rotation matrix cannot work directly on voxel data as the data are not representing coordinates. However, the coordinate of a voxel is derived from its relative position in the 3D array, a coordinate system can be constructed and the rotation can be performed by rotating the coordinate system in the opposite direction.
Existing Utilities for 3D Rotation
For 2D rotational transformation, the operation is represented by a 2×2 orthogonal matrix [[cos𝜃,−sin𝜃], [sin𝜃,cos𝜃]] with is the angle of rotation. For Python, well-known packages such as OpenCV [2] and Pillow [3] can perform rotation around an arbitrary point. For the sake of retaining information, both packages support preserving the edge of the image by enlarging the rotated image (figure 2).
For 3D volumetric data consisting of voxels (or 3D array), the rotation operation can only be achieved through ndimage module of the SciPy package [4]. However, the operation is only limited to rotation along 3 principle axes around the center of the array. Any arbitrary rotation in 3D space can be accomplished by a series of rotation around 3 principle axes (Euler angle) followed by translation, nevertheless, this lead to further complication and increase the processing time.
Basic Concept
As mentioned previously, the 3D array of voxels does not carry any information about its coordinates. However, the relative coordinate can be inferred from the location of each voxel in the 3D array. By taking the distance between each neighboring voxel as 1 unit, for an array of size 𝑁𝑥×𝑁𝑦×𝑁𝑧, the coordinate of each voxel is assigned as (𝑖,𝑗,𝑘), with 𝑖∈{0,1,2,…𝑁𝑥−1},𝑗∈{0,1,2,…𝑁𝑦−1},𝑘∈{0,1,2,…𝑁𝑧−1}. As the rotation of the object is impossible, the whole coordinate system is rotated in the opposite direction (figure 3). An object can be viewed as a function 𝐴(𝑥,𝑦) in space. The rotated function in the original coordinate system takes the form 𝐴′(𝑥,𝑦). The un-rotated function in the rotated coordinate system takes the form 𝐴(𝑥′,𝑦′). It can be shown mathematically that 𝐴′(𝑥,𝑦) and 𝐴(𝑥′,𝑦′) have the same expression. The rotational operation can be achieved by interpolating the original functions 𝐴(𝑥,𝑦) using the coordinates (𝑥′,𝑦′) from the rotated coordinate system.
The interpolation can be done by using the RegularGridInterpolator() [5] function from the SciPy package. Extra works need to be done to handle coordinates that lie outside of the region. To simplify things, the values at these ‘outlier’ coordinates are set to zero which is the same approach as rotational operation in OpenCV or Pillow.
Codes and Implementations
The core components of rotation discussed here will be the transformation of the coordinate system and the interpolation. Hence, a transformation operation is defined to carry out these tasks. The function generalTransform() takes in the original 3D array and an invertible 3×3 matrix as the parameter and returns the interpolated results after transformation. For the sake of simplicity, the data consider here is a single channel. The simplified procedures are as below:
(1) Calculate the inverse matrix 𝑀−1 of the transformation matrix 𝑀.
trans_mat_inv = numpy.linalg.inv(transform_matrix)
(2) Construct the coordinate (𝑥𝑖,𝑦𝑖,𝑧𝑖) for all the voxels with mesh grid from NumPy.
Nz, Ny, Nx = image.shape
x = numpy.linspace(0, Nx - 1, Nx)
y = numpy.linspace(0, Ny - 1, Ny)
z = numpy.linspace(0, Nz - 1, Nz)
zz, yy, xx = numpy.meshgrid(z, y, x, indexing='ij')
coor = numpy.array([xx - x_center, yy - y_center, zz - z_center])
(3) Evaluate the new coordinate (𝑥𝑖′,𝑦𝑖′,𝑧𝑖′) by multiplying matrix 𝑀−1 to the original coordinate (𝑥𝑖,𝑦𝑖,𝑧𝑖).
coor_prime = numpy.tensordot(trans_mat_inv, coor, axes=((1), (0)))
xx_prime = coor_prime[0] + x_center
yy_prime = coor_prime[1] + y_center
zz_prime = coor_prime[2] + z_center
(4) Identify the set of points (voxels) that require interpolation, eliminate the points with new coordinates which lie beyond the region bounded by the cuboid of the original volume, i.e. 𝑥′𝑖∉{𝑝|0≤𝑝≤𝑁𝑥−1},𝑦′𝑖∉{𝑝|0≤𝑝≤𝑁𝑦−1},𝑧′𝑖∉{𝑝|0≤𝑝≤𝑁𝑧−1}.
x_valid1 = xx_prime>=0
x_valid2 = xx_prime<=Nx-1
y_valid1 = yy_prime>=0
y_valid2 = yy_prime<=Ny-1
z_valid1 = zz_prime>=0
z_valid2 = zz_prime<=Nz-1
valid_voxel = x_valid1 * x_valid2 * y_valid1 * y_valid2 * z_valid1 * z_valid2
z_valid_idx, y_valid_idx, x_valid_idx = numpy.where(valid_voxel > 0)
(5) Initialize a 3D array with size the same as the original 3D array (use for storing transformed results).
image_transformed = numpy.zeros((Nz, Ny, Nx))
(6) Interpolate using the function scipy.interpolate.RegularGridInterpolator(), return the values to the transformed array according to their respective indices.
data_w_coor = RegularGridInterpolator((z, y, x), image, method=method)
interp_points = numpy.array([zz_prime[z_valid_idx, y_valid_idx, x_valid_idx],
yy_prime[z_valid_idx, y_valid_idx, x_valid_idx],
xx_prime[z_valid_idx, y_valid_idx, x_valid_idx]]).T
interp_result = data_w_coor(interp_points)
image_transformed[z_valid_idx, y_valid_idx, x_valid_idx] = interp_result
The rotation matrix along the arbitrary axis in 3D space is given by Rodrigous’ rotation formula [6]. A rotational operation on the 3D array can be completed by inserting the rotation matrix into the generalized transformation function defined in the previous step. Figure 4 shows an example of 3 objects rotated around given axes.
def getRodriguesMatrix(image, x_center, y_center, z_center, axis, theta):
v_length = numpy.linalg.norm(axis)
if v_length==0:
raise ValueError("length of rotation axis cannot be zero.")
if theta==0.0:
return image
v = numpy.array(axis) / v_length
# rodrigues rotation matrix
W = numpy.array([[0, -v[2], v[1]],
[v[2], 0, -v[0]],
[-v[1], v[0], 0]])
rot3d_mat = numpy.identity(3) + W * numpy.sin(theta) + numpy.dot(W, W) * (1.0 - numpy.cos(theta))
return rot3d_mat
Optimization
As the transformation requires the calculation of new coordinates and interpolation of all the voxels in the 3D array, it is expected that the workload, as well as the processing time, grows linearly with the number of voxels. This part will discuss how to optimize the speed of the whole operation. A simple analysis shows that interpolation alone takes up 85% of the duration. Hence, the optimization will focus on the interpolation part. Since the interpolation is independent among all the voxel, the optimization can easily be done by partitioning the workload and distributing it into parallel threads. Figure 5 shows the speed (duration) comparison before and after parallelization. The parallelization is performed by the Pool class in the Multiprocessing library of Python.
With the parallelization, the speed is almost quadruple. The speed-up does not scale accordingly with the number of threads utilized during the processing. This is probably due to the parallelization at the lower level has been done by the NumPy library. The interpolation involves simple arithmetic such as addition and multiplication over multiple data within the array, it is an excellent task to be optimized using GPU which is specialized in SIMD operations. Figure 6 shows further optimization using GPU with trilinear interpolation [7] implemented using the TensorFlow module. As stated in the previous section, rotation in 3D can also be performed by the ndimage of the SciPy package by using the Euler angles, the duration for SciPy is added for comparison to act as a baseline.
With the acceleration from GPU, within the test range (500×500×500 voxels), the processing speed can be further improved by 20%. The small improvement probably arises from 2 aspects: 1) the duration for tasks other than interpolation (coordinates transformation and indexing) is not optimized, and 2) the cost of communication between CPU and GPU for data transfer.
Apart from speed, distortion introduced during interpolation is another concern when performing the transformation. To evaluate the errors from interpolation, a 3D image with simple solid shapes (sphere, cuboid, cone) undergoes a series of consecutive rotations and finally back to its original orientation (figure 7). Voxel-wise RMSE between the original image and final image is calculated as a metric for comparison.
Table 1 shows the RMSE results from a single series of rotations (6 consecutive rotations per series) for 3D images of different sizes. Figure 10 shows the cumulative error after the kth round of rotation. As the rotation using ndimage from SciPy consists of 3 rotations around principle axes, which is equivalent to performing interpolation thrice, the error is expected to be larger than a single rotation/interpolation in the rotation operation defined in this article.
Conclusion
This article introduces a technique of rotation of 3D data (voxels) by the transformation of Cartesian coordinate followed by simple interpolation. Apart from rotation, any linear invertible transformation like shearing can be achieved by the method discussed above. As compared to the rotation in the ndimage module of the SciPy library, the rotation operation defined above provides a more powerful and intuitive way (rotation along the arbitrary axis) for generic usage as compared to SciPy ndimage which can only rotate along the principle axis. The speed for the discussed rotation operation is on par with existing rotation by SciPy ndimage for small volume (<100×100×100) while 40% faster for a large volume. In the aspect of image quality, the error arises from the rotation operation defined above in 25% lesser than the rotation operation in SciPy ndimage.
Supplements
Source code for visualization using Mayavi
# pip install mayavi
import numpy
from mayavi import mlabdef visualize3D(image):
if not isinstance(image, numpy.ndarray):
raise ValueError("Parameter 'image' must be a numpy array.")
if not len(image.shape) == 3:
raise ValueError("Parameter'image' must be a numpy array of 3 dimension.")
s = image
s = numpy.moveaxis(s, 0, 2)
s = numpy.moveaxis(s, 0, 1)
s = 0.0 - s
color_opt = "gnuplot2"
min_val, max_val = numpy.min(s), numpy.max(s)
surf_levels = [0.001, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 0.99]
fig = mlab.figure("", bgcolor=(1.0, 1.0, 1.0))
src = mlab.pipeline.scalar_field(s)
for surf_lvl in surf_levels:
#mlab.pipeline.iso_surface(src, contours=[s.min() + surf_lvl * s.ptp(), ], colormap=color_opt, opacity=1.0-surf_lvl)
if -surf_lvl<=max_val and -surf_lvl>=min_val:
opc = min(0.8, max(0.2, surf_lvl))
mlab.pipeline.iso_surface(src, contours=[-surf_lvl,], colormap=color_opt, vmin=-1.0, vmax=0.0, opacity=opc)
mlab.gcf()
mlab.orientation_axes(figure=fig, xlabel='X', ylabel='Y', zlabel='Z')
mlab.show()
References
[1] https://towardsdatascience.com/how-to-represent-3d-data-66a0f6376afb
[2] https://www.pyimagesearch.com/2021/01/20/opencv-rotate-image/
[3] https://pillow.readthedocs.io/en/stable/reference/Image.html
[4] https://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.ndimage.interpolation.rotate.html
[6] https://handwiki.org/wiki/Rodrigues%27_rotation_formula
[7] https://en.wikipedia.org/wiki/Trilinear_interpolation
[8] https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.transform.Rotation.html