Neural Network Inference on FPGAs
How to build and implement FPGA applications on AWS from scratch
Author: Daniel Suess, Senior Machine Learning Engineer at Max Kelsen
Deploying your deep learning models directly to edge devices comes with many advantages compared to traditional cloud deployments: eliminating round trips to the cloud reduces latency and reliance on the network connection; since the data never leaves the device, edge inference helps preserve user privacy; and because far fewer cloud resources are needed, edge inference can also reduce ongoing costs.
The proliferation of ML running on edge devices both drives and is driven by the development of specialised hardware accelerators such as GPUs, ASICs, or FPGAs. For an overview of the advantages and disadvantages of each hardware type see this series of posts or this article.
In this post we will look at how to run inference for simple neural networks on FPGA devices. The main focus is on getting familiar with FPGA programming and lowering its traditionally high barrier to entry.
This post walks through the basic development workflow by means of a simple example application running inference on a two-layer fully connected network. By using AWS’s F1-instances and their provided AMI with all the…
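Before diving into the FPGA side, it helps to have a plain software reference for the network we will be accelerating. The sketch below is a minimal NumPy forward pass for a two-layer fully connected network; the layer sizes, the ReLU/softmax choices, and all variable names are assumptions for illustration only, not the exact model used later in the post.

```python
# Minimal NumPy reference for a two-layer fully connected network.
# Dimensions and activations are illustrative assumptions; the point is
# only to pin down the arithmetic that an FPGA kernel would reproduce.
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def two_layer_forward(x, w1, b1, w2, b2):
    """Forward pass: dense -> ReLU -> dense -> softmax."""
    hidden = relu(x @ w1 + b1)           # first fully connected layer
    logits = hidden @ w2 + b2            # second fully connected layer
    exps = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exps / exps.sum(axis=-1, keepdims=True)  # class probabilities

# Example with made-up dimensions (e.g. a 784-dim input and 10 classes).
rng = np.random.default_rng(0)
x  = rng.standard_normal((1, 784)).astype(np.float32)
w1 = rng.standard_normal((784, 128)).astype(np.float32)
b1 = np.zeros(128, dtype=np.float32)
w2 = rng.standard_normal((128, 10)).astype(np.float32)
b2 = np.zeros(10, dtype=np.float32)
print(two_layer_forward(x, w1, b1, w2, b2).shape)   # -> (1, 10)
```

Keeping a reference implementation like this around is useful later on: it gives known-good outputs against which the FPGA kernel's results can be checked.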