Converting 2D image coordinates to 3D coordinates using ROS + Intel Realsense D435/Kinect
Japanese version: https://ych.hatenablog.com/entry/2020/05/01/012518
Objective
I want to recognize an object with RGB image of RGB-D camera and get the 3D coordinates of the object.
Problem
At first, I was thinking that this isn’t a hard task to solve by using math. However, to increase accuracy, I need to take into account distortions, FOV, etc.., and it will be a painful task.
Solution
Using rs2_deproject_pixel_to_point
function provided by liberalsense. (https://github.com/IntelRealSense/librealsense/blob/master/include/librealsense2/rsutil.h#L67)
This function converting 2D coordinate to 3D coordinate take into account information provided by sensor_msgs/CameraInfo
.
In this article, I am using Python. (I usepyrealsense2
installed bypip
)
rs2_deproject_pixel_to_point
takes 3 arguments. (intrinsic, pixel, depth)
intrinsic
All values required by pyrealsense2.intrinsics
can be obtained from sensor_msgs/CameraInfo
.
_intrinsics = pyrealsense2.intrinsics()
_intrinsics.width = cameraInfo.width
_intrinsics.height = cameraInfo.height
_intrinsics.ppx = cameraInfo.K[2]
_intrinsics.ppy = cameraInfo.K[5]
_intrinsics.fx = cameraInfo.K[0]
_intrinsics.fy = cameraInfo.K[4]
#_intrinsics.model = cameraInfo.distortion_model
_intrinsics.model = pyrealsense2.distortion.none
_intrinsics.coeffs = [i for i in cameraInfo.D]
This article and this article will help you understand this code.
pixel
2D coordinate in the 2D RGB image. The coordinate system is defined here. The left up is (0,0), right is x
, down is y
.
depth
Depth value obtained from the depth image.
Return value
The 3D coordinates (float[3]). The coordinate system is also defined here. If you want to use these values in rviz, you need to change the coordinate.
result = rs2_deproject_pixel_to_point()
x = result[2]
y = -result[0]
z = -resilt[1]
The complete code is
def convert_depth_to_phys_coord_using_realsense(x, y, depth, cameraInfo): _intrinsics = pyrealsense2.intrinsics()
_intrinsics.width = cameraInfo.width
_intrinsics.height = cameraInfo.height
_intrinsics.ppx = cameraInfo.K[2]
_intrinsics.ppy = cameraInfo.K[5]
_intrinsics.fx = cameraInfo.K[0]
_intrinsics.fy = cameraInfo.K[4]
#_intrinsics.model = cameraInfo.distortion_model
_intrinsics.model = pyrealsense2.distortion.none _intrinsics.coeffs = [i for i in cameraInfo.D] result = pyrealsense2.rs2_deproject_pixel_to_point(_intrinsics, [x, y], depth) #result[0]: right, result[1]: down, result[2]: forward
return result[2], -result[0], -result[1]