Face Recognition with Arcface on TensorRT

楊亮魯
2 min read · Sep 25, 2019


In my previous post, Face Recognition with Arcface on Nvidia Jetson Nano,

I failed to run TensorRT inference on the Jetson Nano because PReLU was not supported in TensorRT 5.1.

But the channel-wise PReLU operator is supported as of TensorRT 6.0!

Prerequisite: make sure you can run the following line:

docker run --rm --gpus all nvcr.io/nvidia/tensorrt:19.09-py3 nvidia-smi

There are two parts in this article:

  • start container, build the arcface TensorRT engine
  • run the inference

Run the Container and Build the Arcface TensorRT Engine

# bash
git clone https://github.com/penolove/insightface.git -b eyeWitnessWrapper-with-tensorrt-example
cd insightface;
# download the arcface model from https://github.com/onnx/models/tree/master/vision/body_analysis/arcface
wget https://s3.amazonaws.com/onnx-model-zoo/arcface/resnet100/resnet100.onnx
# start container
docker run \
--gpus all \
-v $PWD/:/insightface/ \
-ti nvcr.io/nvidia/tensorrt:19.09-py3 /bin/bash
# inside the container, install the prerequisites
cd insightface/
pip install -r requirements.txt
apt-get install -y libsm6 libxrender1 libxext-dev
pip install mxnet-cu101 # only needed for the speed comparison

Now let's convert the downloaded ONNX model into a TensorRT engine, arcface_trt.engine:

# python
import tensorrt as trt

batch_size = 1
TRT_LOGGER = trt.Logger()

def build_engine_onnx(model_file):
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = 1 << 30
        builder.max_batch_size = batch_size
        # Load the ONNX model and parse it in order to populate the TensorRT network.
        with open(model_file, 'rb') as model:
            if not parser.parse(model.read()):
                # surface parser errors instead of failing silently
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
        return builder.build_cuda_engine(network)

# path to the downloaded arcface model
onnx_file_path = './resnet100.onnx'

engine = build_engine_onnx(onnx_file_path)
engine_file_path = './arcface_trt.engine'
with open(engine_file_path, "wb") as f:
    f.write(engine.serialize())

Inference with the TRT Engine and Speed Comparison with MXNet

# inference with the TRT engine
python naive_detector.py --is_trt_engine --model arcface_trt.engine
# inference with the original MXNet model, which can be downloaded from
# https://github.com/deepinsight/insightface/wiki/Model-Zoo
python naive_detector.py
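Whichever backend runs the forward pass, the ArcFace resnet100 model from the ONNX model zoo expects aligned 112×112 RGB face crops as an NCHW float32 tensor. A minimal NumPy preprocessing sketch (the `preprocess` function and the BGR-input assumption are illustrative, not taken from naive_detector.py):

```python
import numpy as np

def preprocess(face_bgr):
    """Convert an aligned 112x112 BGR uint8 crop (as OpenCV loads images)
    into the (1, 3, 112, 112) float32 NCHW tensor ArcFace expects."""
    assert face_bgr.shape == (112, 112, 3)
    rgb = face_bgr[:, :, ::-1]          # BGR -> RGB
    chw = np.transpose(rgb, (2, 0, 1))  # HWC -> CHW
    return np.ascontiguousarray(chw[np.newaxis], dtype=np.float32)

# example with a dummy crop
dummy = np.zeros((112, 112, 3), dtype=np.uint8)
batch = preprocess(dummy)
print(batch.shape, batch.dtype)  # (1, 3, 112, 112) float32
```

For batched inference, stack several preprocessed crops along the first axis before copying them to the device.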

Registered faces:

From left to right, the faces are labeled 1~5.
The left result comes from the original MXNet model; the right one is from the TRT engine.
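Recognition itself then reduces to comparing the embedding of a detected face against the registered embeddings: the closest one by cosine similarity wins. A minimal sketch with toy vectors (the 512-d size matches ArcFace's output; the vectors here are random, just for illustration):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# toy 512-d embeddings
rng = np.random.default_rng(0)
e1 = rng.normal(size=512)
e2 = e1 + 0.1 * rng.normal(size=512)  # slightly perturbed copy of e1
e3 = rng.normal(size=512)             # unrelated embedding

print(cosine_similarity(e1, e2))  # close to 1.0
print(cosine_similarity(e1, e3))  # close to 0.0
```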

Inferring the 5 faces 1000 times on my GTX 1070 takes:

  • TRT engine: 38 s with batch_size = 1
  • TRT engine: 22 s with batch_size = 5
  • MXNet: ~60 s with batch_size = 1
  • MXNet: ~29 s with batch_size = 5
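Assuming each run processes the same 5 faces 1000 times (5000 embeddings total), those timings translate into throughput like this:

```python
# wall-clock seconds from the benchmark above
runs = {
    ("trt", 1): 38.0,
    ("trt", 5): 22.0,
    ("mxnet", 1): 60.0,
    ("mxnet", 5): 29.0,
}
total_faces = 1000 * 5  # 1000 iterations x 5 faces

for (backend, bs), seconds in runs.items():
    print(f"{backend} batch_size={bs}: {total_faces / seconds:.0f} faces/s")
```

So the TRT engine comes out roughly 1.3-1.6x faster than MXNet at the same batch size on this card.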
