2015 summary at Sampa lab, University of Washington, Seattle
There was a time when I wondered what to do with FPGAs: they offer so much more freedom than a general-purpose CPU, yet no specific task seemed suited to them. That changed when I talked to Amrita Mazumdar at the UW CSE women's day about using one as an object detection unit. The idea of using FPGAs to cut power consumption while keeping the same latency was so convincing that I came on board.
Phase I Slow initial learning: Viola-Jones algorithm
The goal of this research is to implement a hardware accelerator for a specific computer vision algorithm, with a focus on low power consumption. Three problems arise: which algorithm to use, how to build the accelerator, and which part of the computation we can improve.
The Viola-Jones algorithm, our target computer vision algorithm, was the first fancy term I learned on my first day at the lab. Amrita walked me through the idea and threw me a paper [Viola1], but the paper is rather dense, and it took me a while to find a better introduction in the OpenCV documentation. I had no background in computer vision; in fact, it was not until this winter quarter that I decided to audit the CSE 455 computer vision class to get the bigger picture. But I can understand Viola-Jones now.
First, a classifier is trained with a few hundred sample views of a particular object at a fixed size, along with negative examples. The training had already been done and the trained classifier was handed to me, so I don't know the training process in detail, but I know how to use the result. The trained classifier is stored in an XML file as a cascade of several simpler classifiers called stages, each built from multiple features. The stages are applied in a decision-tree structure, with the features serving as the tests. Each stage is stored as a list of nodes, and each node carries a weight for the decision tree. More details can be found in the header definitions in parse.h. The features this algorithm uses are Haar-like features: simple two-, three-, and four-rectangle patterns of light and dark regions.
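The cascade layout described above can be sketched roughly as follows. This is my own simplified version with hypothetical names; the real definitions live in parse.h:

```cpp
#include <vector>

// Hypothetical, simplified cascade structures (the real ones are in parse.h).
struct Node {                    // one decision-tree node in a stage
    int feature_id;              // which Haar-like feature to evaluate
    double threshold;            // compare the feature value against this
    double left_val, right_val;  // contribution, depending on the comparison
};

struct Stage {                   // one stage = a list of weighted nodes
    std::vector<Node> nodes;
    double stage_threshold;      // the window passes if the sum exceeds this
};

// A window is a detection only if it passes every stage; most negative
// windows are rejected by the first few (cheap) stages. Each node tests
// one Haar-like feature against its threshold.
bool runCascade(const std::vector<Stage>& cascade,
                double (*evalFeature)(int feature_id)) {
    for (const Stage& s : cascade) {
        double sum = 0.0;
        for (const Node& n : s.nodes)
            sum += (evalFeature(n.feature_id) < n.threshold)
                       ? n.left_val : n.right_val;
        if (sum < s.stage_threshold) return false;  // rejected early
    }
    return true;  // survived all stages
}
```

The early-rejection structure is what makes the cascade fast: a cheap first stage throws away most of the image before the expensive later stages ever run.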
These features are famous in the object-recognition world because, with the help of the integral image, any Haar-like feature can be evaluated in constant time, O(1).
Phase II Rapid exponential learning: getting familiar with the FPGA
I started working on the physical FPGA after I finished reading the computer vision code in Nov. 2015. I had some confidence in Verilog, since I had already built a five-stage pipelined CPU, but this time Amrita felt that the complexity of a new Verilog project might overwhelm me, and decided to have me use the state-of-the-art FPGA design tool: Xilinx Vivado High-Level Synthesis (HLS).
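To give a sense of what HLS code looks like, here is a toy example of my own (a sketch, not actual project code): ordinary C/C++ plus vendor pragmas that Vivado HLS turns into a pipelined hardware datapath. A normal software compiler simply ignores the unknown pragma, so the same source runs on a CPU for testing.

```cpp
// Toy Vivado HLS-style kernel: sum the pixels of one 64-pixel image row.
// The pragma asks the HLS compiler to start a new loop iteration every
// clock cycle (initiation interval 1); g++/clang ignore it.
int rowSum(const int pixels[64]) {
    int acc = 0;
    for (int i = 0; i < 64; ++i) {
#pragma HLS PIPELINE II=1
        acc += pixels[i];
    }
    return acc;
}
```

Writing the accelerator at this level, instead of in raw Verilog, is exactly the trade-off Amrita suggested: less control over the generated hardware, but a much gentler learning curve.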
I'm working directly on a Xilinx Zynq-7000 Evaluation Kit. It was once connected to a computer called pinga in the Sampa cluster, reachable through the Sampa VPN:
ssh -X "user-name"@cluster.sampa
ssh -X pinga
We ran into some compatibility issues between the Xilinx Vivado tools and the Sampa cluster, so I also downloaded Xilinx Vivado 2014.2 onto my home computer. An installation tutorial can be found here.
If you can't open the terminal tool, source the settings script:
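The exact command depends on where Vivado was installed; assuming the default Linux install location for Vivado 2014.2, it would look something like:

```shell
# Load the Vivado environment into the current shell (default install
# path assumed; adjust to wherever Vivado was actually installed).
source /opt/Xilinx/Vivado/2014.2/settings64.sh
```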
Then I built a “hello world” Zynq design from scratch, though I haven't uploaded it to the board yet. The tutorial can be found on the same website as the installation guide.
Phase III “the Altitude sickness”
This is the stage where a learner faces a slowdown on the learning curve; I haven't reached that stage yet.
One of my motivations for fully understanding how to use an object detection algorithm and accelerate it on an FPGA is that I want to use it in a robotics competition called RoboMasters, and I believe that after climbing this learning curve I can actually cut the power consumption of the computer vision algorithm on the robot chassis.
[Viola1] Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. IEEE CVPR, 2001. The paper is available online at: https://www.cs.cmu.edu/~efros/courses/LBMV07/Papers/viola-cvpr-01.pdf
[OpenCV] Cascade Classification: http://docs.opencv.org/2.4/modules/objdetect/doc/cascade_classification.html