How to Use YOLOv8 with a CSI Camera on the Jetson Xavier

Dean.Du
5 min read · Sep 4, 2023

It was not a smooth start. I just wanted to run YOLOv8 on my newly bought Jetson Xavier to take advantage of its powerful GPU, but booting the OS and installing the necessary software was hard work. Luckily, after about two days, I finished the installation and got the Jetson working well.

I recommend installing jtop. The versions of the installed software, especially CUDA, are shown in the screenshot:

L4T works well, as we can see from the profile. Use the commands below to check the parameters of the CSI camera.

$ v4l2-ctl --list-devices
# NVIDIA Tegra Video Input Device (platform:tegra-camrtc-ca):
# /dev/media0

# vi-output, imx219 10-0010 (platform:tegra-capture-vi:2):
# /dev/video0
$ v4l2-ctl --device=/dev/video0 --list-formats-ext

The output of the second command lists the resolutions and frame rates the camera supports. We can pick one of them when developing the code. Here, I chose 1920x1080 at 30 fps for the configuration.
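To pick a mode programmatically, the text printed by `v4l2-ctl --list-formats-ext` can be parsed. The following is a minimal sketch, assuming the usual `Size: Discrete WxH` and `Interval: Discrete ... (N fps)` line layout. The `SAMPLE` text and the function name `parse_modes` are illustrative only; they are not output captured from my camera.

```python
import re

# Illustrative sample only; run
#   v4l2-ctl --device=/dev/video0 --list-formats-ext
# on your own board to get the real text.
SAMPLE = """
    Size: Discrete 3264x2464
        Interval: Discrete 0.048s (21.000 fps)
    Size: Discrete 1920x1080
        Interval: Discrete 0.033s (30.000 fps)
    Size: Discrete 1280x720
        Interval: Discrete 0.017s (60.000 fps)
"""

SIZE_RE = re.compile(r"Size: Discrete (\d+)x(\d+)")
FPS_RE = re.compile(r"\(([\d.]+) fps\)")


def parse_modes(text):
    """Return a list of (width, height, fps) tuples found in the text."""
    modes = []
    size = None  # most recent "Size:" line seen
    for line in text.splitlines():
        m = SIZE_RE.search(line)
        if m:
            size = (int(m.group(1)), int(m.group(2)))
            continue
        m = FPS_RE.search(line)
        if m and size is not None:
            modes.append((size[0], size[1], float(m.group(1))))
    return modes


modes = parse_modes(SAMPLE)
print(modes)
```

Any tuple in `modes` can then be passed as the capture width, height, and frame rate of the pipeline.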

With some help from Google, I easily found documents on deploying the CSI camera and got it working with the following code:

import cv2


# build the GStreamer pipeline string
def gstreamer_pipeline(
    capture_width=1280,  # camera capture width
    capture_height=720,  # camera capture height
    display_width=1280,  # displayed frame width
    display_height=720,  # displayed frame height
    framerate=60,  # capture fps
    flip_method=0,  # nvvidconv flip-method (0 = no rotation)
):
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), "
        "width=(int)%d, height=(int)%d, "
        "format=(string)NV12, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
        "videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
        % (
            capture_width,
            capture_height,
            framerate,
            flip_method,
            display_width,
            display_height,
        )
    )


if __name__ == "__main__":
    capture_width = 1920
    capture_height = 1080

    display_width = 640
    display_height = 640

    framerate = 10  # fps
    flip_method = 0  # direction
    # create the pipeline
    gp = gstreamer_pipeline(
        capture_width,
        capture_height,
        display_width,
        display_height,
        framerate,
        flip_method,
    )
    # open the video stream through the pipeline
    cap = cv2.VideoCapture(gp, cv2.CAP_GSTREAMER)

    if cap.isOpened():
        cv2.namedWindow("CSI Camera", cv2.WINDOW_AUTOSIZE)

        while cv2.getWindowProperty("CSI Camera", 0) >= 0:
            ret_val, img = cap.read()
            if not ret_val:  # stop if no frame was received
                break
            cv2.imshow("CSI Camera", img)

            keyCode = cv2.waitKey(30) & 0xFF
            if keyCode == 27:  # ESC to quit
                break

        cap.release()
        cv2.destroyAllWindows()
    else:
        print("Failed to open the camera")

Before running the code, verify that GStreamer is installed with this command:

$ gst-inspect-1.0 --version

If you get an error, or GStreamer is not installed, the GStreamer documentation may help. After installing it, run the command again to make sure the installation succeeded.
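Having the GStreamer tools installed is not quite enough: `cv2.VideoCapture(..., cv2.CAP_GSTREAMER)` also needs an OpenCV build with GStreamer enabled. Here is a small sketch of both checks. The helper names are my own, and the second check assumes the `GStreamer: YES/NO` row that `cv2.getBuildInformation()` normally prints.

```python
import re
import shutil


def gst_tools_installed():
    """True if the gst-inspect-1.0 executable is on PATH."""
    return shutil.which("gst-inspect-1.0") is not None


def opencv_has_gstreamer(build_info):
    """Parse the text of cv2.getBuildInformation() for the GStreamer row."""
    m = re.search(r"GStreamer:\s*(YES|NO)", build_info)
    return bool(m) and m.group(1) == "YES"


print("gst-inspect-1.0 found:", gst_tools_installed())

try:
    import cv2  # only needed for the second check

    print(
        "OpenCV built with GStreamer:",
        opencv_has_gstreamer(cv2.getBuildInformation()),
    )
except ImportError:
    print("OpenCV is not installed in this environment")
```

If the second check reports `False`, the camera will probably fail to open even though the pipeline string is correct.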

At the same time, make sure the torch package supports the GPU: install the whl package built by NVIDIA, then install the matching version of torchvision. Check the versions carefully.
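A quick way to confirm that the installed wheel really sees the GPU is to ask torch itself. This is only a sketch; `torch_gpu_report` is a hypothetical helper name, and it falls back to a message when torch is missing.

```python
def torch_gpu_report():
    """Return a one-line summary of the installed torch build."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA not available (CPU-only build?)"
    return (
        f"torch {torch.__version__}, CUDA {torch.version.cuda}, "
        f"device: {torch.cuda.get_device_name(0)}"
    )


print(torch_gpu_report())
```

If the report says CUDA is not available, the wheel is most likely a generic CPU build rather than the NVIDIA one for Jetson.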

The OpenCV that is installed initially does not support CUDA. Making it work with CUDA means rebuilding OpenCV from source, which is another complex job. One day I will publish another post describing the process.
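To see whether your OpenCV build can use CUDA, you can ask it how many CUDA devices it sees. A minimal sketch, assuming OpenCV 4.x where `cv2.cuda.getCudaEnabledDeviceCount()` is available; the wrapper name is hypothetical.

```python
def opencv_cuda_device_count():
    """Return the CUDA devices visible to OpenCV, or None if cv2 is missing."""
    try:
        import cv2
    except ImportError:
        return None
    cuda = getattr(cv2, "cuda", None)
    if cuda is None or not hasattr(cuda, "getCudaEnabledDeviceCount"):
        return 0  # this build exposes no CUDA module
    return cuda.getCudaEnabledDeviceCount()


count = opencv_cuda_device_count()
if count is None:
    print("OpenCV is not installed")
else:
    print("CUDA devices visible to OpenCV:", count)
```

A result of 0 is expected for a build without CUDA. As far as I can tell that is fine for the detection code in this post, because OpenCV only captures and draws the frames while the inference runs in torch.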

Run the code above, and a window displaying the video appears. It worked well. At least that's good news!

Everything seemed OK, so I added the YOLOv8 model and the code to perform detection. The code is as follows:

from ultralytics import YOLO
import cv2
import torch

# should print True if torch can use the GPU
print(torch.cuda.is_available())

# load the model weights
model = YOLO("deploy.pt")

# object classes
classNames = [
    "person",
    "bicycle",
    "car",
    "motorbike",
    "aeroplane",
    "bus",
    "train",
    "truck",
    "boat",
    "traffic light",
    "fire hydrant",
    "stop sign",
    "parking meter",
    "bench",
    "bird",
    "cat",
    "dog",
    "horse",
    "sheep",
    "cow",
    "elephant",
    "bear",
    "zebra",
    "giraffe",
    "backpack",
    "umbrella",
    "handbag",
    "tie",
    "suitcase",
    "frisbee",
    "skis",
    "snowboard",
    "sports ball",
    "kite",
    "baseball bat",
    "baseball glove",
    "skateboard",
    "surfboard",
    "tennis racket",
    "bottle",
    "wine glass",
    "cup",
    "fork",
    "knife",
    "spoon",
    "bowl",
    "banana",
    "apple",
    "sandwich",
    "orange",
    "broccoli",
    "carrot",
    "hot dog",
    "pizza",
    "donut",
    "cake",
    "chair",
    "sofa",
    "pottedplant",
    "bed",
    "diningtable",
    "toilet",
    "tvmonitor",
    "laptop",
    "mouse",
    "remote",
    "keyboard",
    "cell phone",
    "microwave",
    "oven",
    "toaster",
    "sink",
    "refrigerator",
    "book",
    "clock",
    "vase",
    "scissors",
    "teddy bear",
    "hair drier",
    "toothbrush",
]


# build the GStreamer pipeline string
def gstreamer_pipeline(
    capture_width=1280,  # captured width
    capture_height=720,  # captured height
    display_width=1280,  # display width
    display_height=720,  # display height
    framerate=10,  # captured fps
    flip_method=0,  # nvvidconv flip-method (0 = no rotation)
):
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), "
        "width=(int)%d, height=(int)%d, "
        "format=(string)NV12, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        "video/x-raw, width=(int)%d, height=(int)%d, format=(string)BGRx ! "
        "videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
        % (
            capture_width,
            capture_height,
            framerate,
            flip_method,
            display_width,
            display_height,
        )
    )


if __name__ == "__main__":
    capture_width = 1920
    capture_height = 1080

    display_width = 640
    display_height = 640

    framerate = 5  # fps
    flip_method = 0  # direction

    # pipeline
    gp = gstreamer_pipeline(
        capture_width,
        capture_height,
        display_width,
        display_height,
        framerate,
        flip_method,
    )

    # open the video stream through the pipeline
    cap = cv2.VideoCapture(gp, cv2.CAP_GSTREAMER)

    if cap.isOpened():
        cv2.namedWindow("CSI Camera", cv2.WINDOW_AUTOSIZE)

        while cv2.getWindowProperty("CSI Camera", 0) >= 0:
            ret_val, img = cap.read()
            if not ret_val:  # stop if no frame was received
                break

            # run detection on the current frame
            results = model(img, stream=True)
            for r in results:
                boxes = r.boxes

                for box in boxes:
                    # bounding box, converted to int pixel coordinates
                    x1, y1, x2, y2 = box.xyxy[0]
                    x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)

                    # draw the box on the frame
                    cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 255), 3)

                    # class name
                    cls = int(box.cls[0])
                    print("Class name -->", classNames[cls])

                    # label text settings
                    org = (x1, y1)
                    font = cv2.FONT_HERSHEY_SIMPLEX
                    fontScale = 1
                    color = (255, 0, 0)
                    thickness = 2

                    cv2.putText(
                        img, classNames[cls], org, font, fontScale, color, thickness
                    )

            cv2.imshow("CSI Camera", img)
            keyCode = cv2.waitKey(30) & 0xFF
            if keyCode == 27:  # ESC to quit
                break

        cap.release()
        cv2.destroyAllWindows()
    else:
        print("Failed to open camera")
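One caveat about the hard-coded `classNames` list: it only matches a model trained on these 80 classes in this order. Since `deploy.pt` could be a custom model, a safer lookup is to use the names stored in the model itself, which Ultralytics exposes as `model.names` (a mapping from class id to name). A small sketch follows; `class_label` is a hypothetical helper and the dictionary merely stands in for `model.names`.

```python
def class_label(names, cls_id):
    """Look up a class name from a dict (like model.names) or a list.

    Falls back to the numeric id as text when the id is unknown.
    """
    if isinstance(names, dict):
        return names.get(cls_id, str(cls_id))
    if 0 <= cls_id < len(names):
        return names[cls_id]
    return str(cls_id)


# Stand-in for model.names; a real model supplies its own mapping.
example_names = {0: "person", 1: "bicycle", 2: "car"}
print(class_label(example_names, 2))
print(class_label(example_names, 99))
```

In the detection loop, `classNames[cls]` would then become `class_label(model.names, cls)`, which avoids an `IndexError` when the model has a different set of classes.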

Oops, an error occurred!

It seemed that GStreamer could not find the plugin and load it. After some searching, I found a similar solution on the NVIDIA forum: just set this environment variable.

$ export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvvidconv.so
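Note that `LD_PRELOAD` is read when a process starts, so setting it inside the already running Python script would not help. If you prefer not to export it in the shell, one option is a small launcher that starts the detection script with the variable set. This is only a sketch: `with_preload` is a hypothetical helper and `detect.py` is a placeholder file name.

```python
import os


def with_preload(lib_path, env=None):
    """Return a copy of env with lib_path placed first in LD_PRELOAD."""
    new_env = dict(os.environ if env is None else env)
    current = new_env.get("LD_PRELOAD", "")
    # keep any other preloaded libraries, without duplicating lib_path
    parts = [p for p in current.split(":") if p and p != lib_path]
    new_env["LD_PRELOAD"] = ":".join([lib_path] + parts)
    return new_env


LIB = "/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstnvvidconv.so"
print("library exists:", os.path.exists(LIB))
print(with_preload(LIB, {"LD_PRELOAD": "/opt/other.so"})["LD_PRELOAD"])

# To launch the script (placeholder file name) with the variable set:
#   import subprocess, sys
#   subprocess.run([sys.executable, "detect.py"], env=with_preload(LIB))
```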

If your path is different, change it accordingly. Then run the code; the result is shown in the following picture:

At last, it worked well. Thank God!
