Part III - Lego Brick Image Classifier Implementation on Jetson Nano

We now implement the classifier on a Jetson Nano. The NVIDIA Jetson Nano Developer Kit is a small, powerful computer that lets you run multiple neural networks in parallel for applications like image classification, object detection, segmentation, and speech processing, all in an easy-to-use platform that runs in as little as 5 watts.

Preparing Jetson Nano

We already had an SD card image created and running on the board, so it was ready for installing the software needed for development.

First, we installed the required system packages:

$ sudo apt-get install git cmake
$ sudo apt-get install libatlas-base-dev gfortran
$ sudo apt-get install libhdf5-serial-dev hdf5-tools
$ sudo apt-get install python3-dev

After that, we configured the Python environment, starting with pip:

$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python3 get-pip.py
$ rm get-pip.py

Next, we prepared a virtual environment. This is good practice: it keeps all the needed libraries in one environment and avoids conflicts with other applications.

$ sudo pip install virtualenv virtualenvwrapper

We then added the following lines to ~/.bashrc so that the virtual environment commands are available in every shell session:

# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

After reloading the shell configuration (for example with source ~/.bashrc), we can create the Python environment:

$ mkvirtualenv deep_learning -p python3

Finally, we installed NumPy, TensorFlow, and SciPy on the board:

$ pip install numpy
$ pip install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu==1.13.1+nv19.3
$ pip install scipy

Now the board is ready to start running inference models.
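Before moving on, it can be useful to confirm that the environment looks the way the steps above intend. The following is a minimal sketch using only the Python standard library; the helper names in_virtualenv and check_packages are illustrative and not part of any tool used in this post.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base


def check_packages(names):
    """Map each top-level package name to whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
    for name, found in check_packages(["numpy", "tensorflow", "scipy"]).items():
        print(name, "OK" if found else "MISSING")
```

Run inside the deep_learning environment, this should report the environment as active and all three packages as found. It only checks that the packages can be located, not that TensorFlow can actually use the GPU.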

Running inference model

We copied the frozen graph and label files generated in the previous post to the SD card so they could be used on the Jetson Nano.

After that, we tested the model on the image below:

Lego brick 2x3

We ran inference on it with the label_image.py script:

$ python label_image.py \
--graph=output_graph.pb --labels=output_labels.txt \
--input_layer=Placeholder \
--output_layer=final_result \
--image=21652746_cc379e0eea_m.jpg

The test produced the following results:

2x3 0.6287782
1x4 0.36173648
2x2 l 0.0044714455
2x2 0.0029475156
1x2 0.0020642714

As in the previous post, the model works on the test image: it assigns the highest probability to the expected class, 2x3.
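The listing above is a ranking of the model's output scores, highest first. A minimal pure-Python sketch of that ranking step follows, using the scores from this run; the top_k helper is illustrative and is not taken from label_image.py, which may implement the ranking differently.

```python
def top_k(labels, scores, k=5):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Scores copied from the run above, deliberately listed out of order.
labels = ["1x2", "2x2", "2x3", "2x2 l", "1x4"]
scores = [0.0020642714, 0.0029475156, 0.6287782, 0.0044714455, 0.36173648]

for label, score in top_k(labels, scores):
    print(label, score)

# The predicted class is simply the first entry of the ranking.
best_label, best_score = top_k(labels, scores, k=1)[0]
```

Note that the second-ranked class, 1x4, still receives about 36% of the probability, so the model is far from certain on this image even though the top prediction is correct.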

Profiling Results

To use the camera, we added a few lines to the label_image.py script.
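As a rough illustration of what such camera code could look like, here is a minimal sketch. It assumes a CSI Raspberry Pi camera, OpenCV built with GStreamer support, and the nvarguscamerasrc GStreamer element that ships with JetPack. It is not necessarily the code added to the script; the function names, resolutions, and output file name are hypothetical.

```python
def gstreamer_pipeline(capture_width=1280, capture_height=720,
                       display_width=640, display_height=480,
                       framerate=30, flip_method=0):
    """Build a GStreamer pipeline string for a CSI camera on a Jetson board."""
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), "
        f"width=(int){capture_width}, height=(int){capture_height}, "
        f"format=(string)NV12, framerate=(fraction){framerate}/1 ! "
        f"nvvidconv flip-method={flip_method} ! "
        f"video/x-raw, width=(int){display_width}, height=(int){display_height}, "
        "format=(string)BGRx ! videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
    )


def capture_frame(path="capture.jpg"):
    """Grab one frame from the camera and save it as a JPEG file."""
    import cv2  # imported lazily; requires OpenCV with GStreamer support

    camera = cv2.VideoCapture(gstreamer_pipeline(), cv2.CAP_GSTREAMER)
    try:
        ok, frame = camera.read()
        if not ok:
            raise RuntimeError("could not read a frame from the camera")
        cv2.imwrite(path, frame)
    finally:
        camera.release()
    return path
```

The saved file could then be passed to the script through its --image argument, in the same way as the test image earlier.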

With the model and the camera both working, we tested performance using nvprof and examined the results.
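nvprof reports time per GPU kernel and API call. As a coarse complement, end-to-end wall-clock time per inference can be measured from Python itself. The sketch below is an assumption-laden illustration, not part of the original measurements: time_calls is a hypothetical helper, and fake_inference is a placeholder standing in for the real inference call.

```python
import time


def time_calls(fn, repeats=10, warmup=1):
    """Return (mean, best) wall-clock seconds per call of fn()."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed; the first GPU call is often slower
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations), min(durations)


def fake_inference():
    # Placeholder workload standing in for a real inference call.
    return sum(i * i for i in range(10000))


mean_s, best_s = time_calls(fake_inference, repeats=5)
print("mean %.6f s, best %.6f s" % (mean_s, best_s))
```

Timing the same function before and after a change gives a quick sanity check on whether the change helped, while nvprof remains the tool for seeing where the time actually goes.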

Optimizing the code

After analyzing the code, we did not find any optimization that could be applied to the script itself. However, we found that using the ReLU6 function from the TensorFlow library in the frozen graph gave better performance on the Jetson Nano than before. ReLU6 is the activation function used by MobileNetV2 and works well with it.
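For reference, ReLU6 is the ordinary ReLU with its output capped at 6, that is, relu6(x) = min(max(x, 0), 6). TensorFlow provides it as tf.nn.relu6. The plain-Python sketch below only illustrates the function on scalars; it does not show how the frozen graph was modified.

```python
def relu6(x):
    """ReLU6 activation: clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


samples = [-3.0, 0.0, 2.5, 6.0, 9.0]
clipped = [relu6(x) for x in samples]
print(clipped)  # [0.0, 0.0, 2.5, 6.0, 6.0]
```

Because the output range is bounded, activations stay within a small, known interval, which is one reason this function suits low-precision arithmetic on constrained devices.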

Demo

Here is a video showing the project working. We took a photo of a 1x4 Lego brick, and the classifier identified it with high confidence.

Conclusions

  • Retraining an existing network allowed us to create a model with high accuracy, 95%, using limited hardware.
  • MobileNetV2 is a type of CNN optimized for devices with limited hardware, and it ran successfully on the Jetson Nano board for an image classification application.
  • An inference model was deployed to the Jetson Nano board to create an image classifier for 6 categories of Lego bricks, using the Raspberry Pi camera as input. Correct functionality was verified by taking pictures of random Lego pieces.
  • Using embedded systems with limited resources is a challenge for RAM-demanding applications such as a CNN processing an image. A swap file was the workaround used in this application.
  • By using TensorFlow's native ReLU6 function, it was possible to improve the performance of the inference model by reducing the time spent in many API functions and other calls.
  • Getting fast results was difficult because of the limited resources of the Jetson Nano. Limited CPU/GPU power reduces the performance of the CNN; this could be improved by running specialized software instead of a full OS with other applications competing for resources.

References

Nvidia. (2019). NVIDIA Jetson Nano Developer Kit | NVIDIA Developer. Retrieved August 25, 2019, from https://developer.nvidia.com/embedded/jetson-nano-developer-kit

Rosebrock, A. (2019). Getting started with the NVIDIA Jetson Nano — PyImageSearch. Retrieved August 25, 2019, from https://www.pyimagesearch.com/2019/05/06/getting-started-with-the-nvidia-jetson-nano/

Nvidia. (2019). CUDA Toolkit Documentation. Retrieved August 25, 2019, from https://docs.nvidia.com/cuda/profiler-users-guide/index.html

Sheng, T., Feng, C., Zhuo, S., Zhang, X., Shen, L., & Aleksic, M. (2019). A Quantization-Friendly Separable Convolution for MobileNets.
