[Yolov8/Jetson/Deepstream] Benchmark test — Orin Nano 4GB, 8GB, NX, TX2

DeeperAndCheaper
4 min read · Sep 3, 2023


Background

  • To build an intelligent video-analysis product, we needed to determine which Jetson series is appropriate in terms of price and performance.
  • The cheaper the device, as long as it satisfies the FPS we require, the better the choice.
  • This benchmark test was run to figure that out.
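The price-performance selection logic above can be sketched as a small helper. The device names match the test targets, but the FPS and price values here are hypothetical placeholders for illustration, not the measured results:

```python
# Pick the cheapest device whose FPS meets the requirement.
# FPS and price values are hypothetical placeholders, not measurements.
DEVICES = {
    "TX2":           {"fps": 15, "price": 400},
    "Orin Nano 4GB": {"fps": 30, "price": 259},
    "Orin Nano 8GB": {"fps": 40, "price": 399},
    "NX":            {"fps": 32, "price": 459},
}

def pick_device(required_fps, devices=DEVICES):
    """Return the cheapest device that satisfies the FPS requirement, or None."""
    candidates = [(spec["price"], name)
                  for name, spec in devices.items()
                  if spec["fps"] >= required_fps]
    if not candidates:
        return None
    return min(candidates)[1]
```

With the placeholder numbers above, `pick_device(25)` selects "Orin Nano 4GB": both Orin Nanos and NX clear 25 FPS, and the 4GB model is the cheapest of them.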

Benchmark Test Environment

  • Target device (HW): Orin Nano 4GB, Orin Nano 8GB, NX, TX2
  • SW environment: JetPack 5.1 (DeepStream 6.2, TensorRT 8.5.2) for Orin Nano 4GB, Orin Nano 8GB, and NX; JetPack 4.6 (DeepStream 6.0, TensorRT 8.2.1) for TX2
  • Power mode: 15W for Orin Nano 8GB, 10W for Orin Nano 4GB, 20W 6-core for NX, and MAXN for TX2; jetson_clocks enabled on all devices

Benchmark Test Process

  1. Download the model: Yolov8-medium.pt
  2. Export and modify the model: Yolov8-medium.onnx (with dynamic batch and an EfficientNMS_TRT layer)
  3. Convert to a TensorRT engine: Yolov8-medium.engine
  4. Check latency and throughput with trtexec as a function of batch size
  5. Check DeepStream FPS as a function of batch size
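Steps 4 and 5 reduce to simple arithmetic on the numbers trtexec reports (e.g. from `trtexec --loadEngine=... --shapes=...`). A minimal sketch; the latency value below is hypothetical, and per-stream FPS assumes the streams share the engine evenly:

```python
def throughput_ips(batch_size, latency_ms):
    """Images/sec implied by the mean latency of one batch."""
    return batch_size * 1000.0 / latency_ms

def per_stream_fps(total_fps, num_streams):
    """Per-stream FPS when total pipeline FPS is split evenly across sources."""
    return total_fps / num_streams

# Hypothetical example: a batch of 4 finishing in 50 ms gives
# 80 images/sec overall, i.e. 20 FPS per stream with 4 sources.
ips = throughput_ips(4, 50.0)   # 80.0
fps = per_stream_fps(ips, 4)    # 20.0
```

This is why larger batch sizes can raise total throughput even as per-batch latency grows: throughput scales with batch size divided by latency, not with latency alone.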

Benchmark Test Result

(Charts from the original post: Yolov8m FLOPs, latency, throughput, FPS, price, ideal TOPs (or TFLOPs), and a summary.)

Conclusion

  • TX2 was not chosen because its FPS is very low relative to its price.
  • NX performed similarly to the Orin Nano 4GB but was not chosen because it costs more.
  • The Orin Nano 4GB met our performance requirements, but since we may add more DL models to the system later, the Orin Nano 8GB, which leaves spare resources, was judged the better fit.


About the Author

Hello, I’m Deeper&Cheaper.

  • I am a developer and blogger with the goal of integrating AI technology into the lives of everyone, pursuing the mission of “Make More People Use AI.” As the founder of the startup Deeper&Cheaper, operating under the slogan “Go Deeper Make Cheaper,” I am dedicated to exploring AI technology more deeply and presenting ways to use it cost-effectively.
  • The name encapsulates the philosophy that “Cheaper” reflects a focus on affordability to make AI accessible to everyone. However, from my perspective, performance is equally crucial, and thus “Deeper” signifies a passion for delving deep with high performance. Under this philosophy, I have accumulated over three years of experience in various AI fields.
  • With expertise in Computer Vision and Software Development, I possess knowledge and skills in diverse computer vision technologies such as object detection, object tracking, pose estimation, object segmentation, and segment anything. Additionally, I have specialized knowledge in software development and embedded systems.
  • Please don’t hesitate to drop your questions in the comments section.

Appendix

Dockerfile (jetpack 4.6)

# Base image: DeepStream 6.0 samples for Jetson (JetPack 4.6)
FROM nvcr.io/nvidia/deepstream-l4t:6.0-samples

ARG WS_ROOT=/opt/ws
ENV WS_ROOT_PATH=${WS_ROOT}

WORKDIR ${WS_ROOT_PATH}/deepstream_test

ENV DEBIAN_FRONTEND noninteractive
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES all
ENV OPENBLAS_CORETYPE=ARMV8
ENV LD_LIBRARY_PATH /usr/local/cuda-10.2/lib64:/usr/lib/aarch64-linux-gnu/tegra/:$LD_LIBRARY_PATH

RUN apt-get update && apt-get install -y --no-install-recommends \
libsm6 \
libxext6 \
libxrender-dev \
curl \
wget \
software-properties-common \
libcurl4-openssl-dev \
zlib1g-dev \
pkg-config \
libssl-dev \
pbzip2 \
pv \
bzip2 \
unzip \
devscripts \
lintian \
fakeroot \
dh-make \
build-essential \
gcc \
g++ \
gdb \
clang \
cmake \
rsync \
tar \
libgstreamer1.0-dev \
libgstrtspserver-1.0-dev \
libgstreamer-plugins-base1.0-dev \
libgtk2.0-0 \
libtbb2 \
libeigen3-dev \
libxi-dev \
libxrandr-dev \
fonts-freefont-ttf \
protobuf-compiler \
libprotoc-dev \
git \
&& rm -rf /var/lib/apt/lists/*

# Download prebuilt .deb and .whl artifacts listed in wget.txt, then install the .debs
COPY ./docker_build/wget.txt /tmp
RUN wget -L -i /tmp/wget.txt --no-check-certificate
RUN dpkg -i --force-all *.deb && rm -r *.deb

RUN apt-get update && apt-get install -y python3-pip
RUN python3 -m pip install -U pip setuptools

COPY ./docker_build/requirements.txt /tmp
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt
# Install the prebuilt Python wheels fetched via wget.txt
RUN python3 -m pip install *.whl && rm -r *.whl

# Build and install the DeepStream Python bindings (pyds) for DeepStream 6.0
COPY ./docker_build/Jetson_deepstream_python_6.0_install.sh /tmp
RUN sh /tmp/Jetson_deepstream_python_6.0_install.sh
ENV LANG C.UTF-8

COPY ./ ${WS_ROOT_PATH}/deepstream_test

CMD ["/bin/bash"]

Dockerfile (jetpack 5.1)

# Base image: L4T r35.2.1 (matches JetPack 5.x)
FROM nvcr.io/nvidia/l4t-base:r35.2.1

ARG WS_ROOT=/opt/ws
ENV WS_ROOT_PATH=${WS_ROOT}

WORKDIR ${WS_ROOT_PATH}/deepstream_test

ENV DEBIAN_FRONTEND noninteractive
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES all
ENV OPENBLAS_CORETYPE=ARMV8

ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/lib/aarch64-linux-gnu/tegra/
ENV CPATH=$CPATH:/usr/local/cuda/targets/aarch64-linux/include
ENV LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-11.4/targets/aarch64-linux/lib
ENV PATH=$PATH:/usr/local/cuda-11.4/bin:/usr/local/cuda/

RUN apt-get update && \
apt-get install -y --no-install-recommends \
python3-pip \
python3-dev \
python3-matplotlib \
build-essential \
gfortran \
git \
cmake \
curl \
wget \
nano \
libopenblas-dev \
liblapack-dev \
libblas-dev \
libatlas-base-dev \
libhdf5-serial-dev \
hdf5-tools \
libhdf5-dev \
zlib1g-dev \
zip \
libjpeg8-dev \
libopenmpi3 \
openmpi-bin \
openmpi-common \
protobuf-compiler \
libprotoc-dev \
llvm-9 \
llvm-9-dev \
libffi-dev \
libsndfile1 \
libboost-all-dev \
libgtk2.0-0 \
libgtk2.0-common \
libgtk2.0-bin \
libgail-common \
libtbb2 \
libgail18 \
libnpp-11-4 \
libnpp-dev-11-4 \
libcufft-11-4 \
libgstreamer1.0-dev \
libgstreamer-plugins-base1.0-dev \
libyaml-cpp-dev \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean

# Download prebuilt .deb and .whl artifacts listed in wget.txt, then install the .debs
COPY ./docker_build/wget.txt /tmp
RUN wget -L -i /tmp/wget.txt --no-check-certificate
RUN dpkg -i --force-all *.deb && rm -r *.deb

RUN python3 -m pip install -U pip setuptools

COPY ./docker_build/requirements.txt /tmp
RUN pip3 install --no-cache-dir -r /tmp/requirements.txt
# Install the prebuilt Python wheels fetched via wget.txt
RUN python3 -m pip install *.whl && rm -r *.whl

# Build and install the DeepStream Python bindings (pyds) for DeepStream 6.2
COPY ./docker_build/Jetson_deepstream_python_6.2_install.sh /tmp
RUN sh /tmp/Jetson_deepstream_python_6.2_install.sh
ENV LANG C.UTF-8
COPY ./ ${WS_ROOT_PATH}/deepstream_test

CMD ["/bin/bash"]

NOTE: if you need requirements.txt and wget.txt, please contact me.
