The sad story of an edge computing device: Why can't I impress Olive (DNN)?

AiOTA LABS
May 24, 2018


image credit: https://blathering.deviantart.com/art/Olive-Oyl-and-Bluto-19089728

Our beloved Popeye (a lean, thin, tiny edge-computing device), with tears in his eyes, is wondering: will Olive (Deep Neural Network, or DNN) ever be mine?

Why does Olive love the muscular, raw-strength Bluto (a big, power-guzzling computing device)?

Well, let me tell you, Mr. Popeye: people have underestimated you. Olive can still be yours once you munch AiOTA Labs' special spinach.

AiOTA Labs always wondered: how can we create this spinach for Mr. Popeye so that Olive embraces him, under the following specification?

  1. The beauty of Olive should be unaltered: the most accurate DNN available in the market.
  2. Spinach for Popeye: the lowest power and a higher FPS than an inaccurate, custom-tailored DNN.

In order to run a DNN, IoT edge-computing chip vendors compromise its accuracy by creating a custom-tailored DNN, so that the network can fit in the available memory.

Hey, this custom-tailored DNN is not the kind of spinach our Popeye will like. In fact, it is not spinach at all: it turns the beautiful Olive into an ugly one. Yuck!

But AiOTA Labs loves Popeye so much that we were hell-bent on getting Olive back for him without altering her stunning beauty.

After two years of research we finally created this magical spinach and named it emDNN. It promises to run state-of-the-art DNNs with the highest accuracy while consuming sub-mW of power, with an order-of-magnitude faster processing time. We took tiny IoT edge-computing devices from the leaders of this market segment: the STM32-F from STMicroelectronics, the GAP8 from GreenWaves Technologies, and the i.MX 6ULL from NXP. These are really tiny devices with extremely limited memory and compute resources, but also extremely low power.

Below is a snapshot of various important parameters while running a highly customized DNN by ARM using CMSIS-NN, which achieves 80% accuracy on CIFAR-10 while promising 5x-6x gains on various parameters. Let me remind readers that, using state-of-the-art DNNs, researchers have achieved ≥93% accuracy on CIFAR-10, while this particular DNN achieved only 80% due to the problem of fitting a high-accuracy network into the available memory.

GAP8 and STM32 information from https://greenwaves-technologies.com/en/gap8-blog/

The results are good, but can we do far better than this while achieving the state-of-the-art accuracy of 93%?

Here is a table showing how the scenario changed once these chips embraced the emDNN software technology.

This is an amazing result, isn't it? The lowest power reported after using emDNN is sub-mW, a reduction of almost 189x, and the highest FPS reported while using emDNN is 1056, an increase of 105x among these chips, all while retaining state-of-the-art accuracy!

So what do you say? Will Popeye get back his beautiful Olive after munching our magical spinach?

So how did we create this spinach? Here is a snapshot of the magical trick.

Is emDNN integrated into all the popular training frameworks? Yes, emDNN is integrated into Caffe, Caffe2, TensorFlow, and PyTorch as a library, for seamless integration.

Can I use known compression techniques like pruning and quantization after emDNN? Yes, the emDNN output can be further compressed by adopting known techniques like quantization and pruning, and the gains are multiplicative!
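To make the "multiplicative gains" point concrete, here is a minimal, generic sketch of the two standard techniques mentioned above: magnitude pruning and symmetric 8-bit quantization applied to a weight matrix. This is purely illustrative textbook compression on a random NumPy array; it does not use or represent emDNN's actual API or algorithm.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights (classic pruning)."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize_int8(weights):
    """Symmetric linear quantization of float32 weights to int8 plus a scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in weight matrix

pruned = magnitude_prune(w, sparsity=0.5)  # ~2x fewer nonzero weights
q, scale = quantize_int8(pruned)           # 4x smaller per weight (32-bit -> 8-bit)
# Combined: roughly 8x on this toy matrix, stacked on top of whatever
# reduction the prior compression step already delivered -- hence "multiplicative".
```

The two steps compose because they shrink different things: pruning reduces how many weights must be stored, while quantization reduces the bits per stored weight.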

For more info, please visit our website www.aiotalabs.com or write to us @ info@aiotalabs.com.
