High Level Synthesis Made Easier: Deeper Dive

Adam Taylor
Nov 8, 2019

In my last blog looking into Silexica’s SLX FPGA tool, I introduced the tool, its concepts and, of course, the benefits it brings in improving not only performance but also development time. In this example, I continue to see big benefits from using the SLX FPGA tool with Xilinx’s HLS flow. Specifically, I see a 42x improvement in latency from SLX FPGA for only a 5x increase in area (LUTs).

In this blog we are going to look in a little more detail at the development flow, targeting an industrial application algorithm. Commonly, these algorithms are used in industrial control applications such as controlling a set point or converting measurement values into a value which can be further processed. Typical algorithms include:

· Callendar-Van Dusen (CVD): used to convert a reading from a Platinum Resistance Thermometer to a temperature.

· Proportional Integral Derivative (PID): used in control applications to achieve a set point, for example measuring a temperature and driving a heater to achieve the desired temperature.
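As a rough illustration of the PID algorithm described above, here is a minimal single-precision sketch in C. The structure, its field names and the `pid_step` function are hypothetical and not taken from any particular library. Practical controllers usually add output clamping and integral anti-windup, which are left out here.

```c
#include <assert.h>

/* Hypothetical controller state; names are illustrative only. */
typedef struct {
    float kp, ki, kd;   /* proportional, integral and derivative gains */
    float integral;     /* running sum of error * dt */
    float prev_error;   /* error seen on the previous update */
} pid_ctrl_t;

/* One controller update: returns the drive value for this measurement. */
float pid_step(pid_ctrl_t *c, float set_point, float measured, float dt)
{
    assert(dt > 0.0f);  /* the derivative term divides by dt */

    float error = set_point - measured;
    c->integral += error * dt;
    float derivative = (error - c->prev_error) / dt;
    c->prev_error = error;

    return c->kp * error + c->ki * c->integral + c->kd * derivative;
}
```

In a temperature loop, `measured` would be the converted sensor reading and the returned value would drive the heater.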

Both of these algorithms can be implemented in software; however, many industrial control systems require low latency to ensure control and accuracy. As such, accelerating the algorithm using High Level Synthesis is a popular choice. Of course, once the algorithm has been written we need to optimise its performance, which is where SLX FPGA comes in.

Another benefit of accelerating algorithms in programmable logic is that larger batches of data can be processed while still achieving increased performance.

Creation of both the PID and CVD algorithms can be achieved quickly and easily using floating point maths that can then be synthesised using HLS.

Let’s look at the CVD algorithm and its implementation. Mathematically, this can be very complicated:

R=R0 ×(1+a×t+b×t²)

Rearranging for temperature gives

t = (-R0×a + √(R0²×a² - 4×R0×b×(R0-R))) / (2×R0×b)

However, it can be implemented using polynomial approximation in C as shown below.
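Here is a minimal sketch of one way such a polynomial approximation could look. It is not necessarily the implementation measured in this blog. It assumes the IEC 60751 PT100 values for R0, a and b, which the text above does not give, and all function names are hypothetical. With x = R/R0 - 1, y = x/a and k = b/a, it keeps the first three terms of the series that inverts y = t + k×t², namely t ≈ y - k×y² + 2×k²×y³, and so avoids the square root.

```c
#include <assert.h>

/* Assumed IEC 60751 PT100 values; the article does not state them. */
#define CVD_R0 100.0f        /* resistance at 0 degC, in ohms */
#define CVD_A  3.9083e-3f    /* coefficient a */
#define CVD_B  (-5.775e-7f)  /* coefficient b */
#define BATCH  128           /* samples per call, as in the article */

/* Forward equation: R = R0 * (1 + a*t + b*t^2), valid for t >= 0 degC. */
float cvd_resistance(float t)
{
    return CVD_R0 * (1.0f + CVD_A * t + CVD_B * t * t);
}

/* Cubic approximation of t from R, evaluated with Horner's rule:
   t ~= y - k*y^2 + 2*k^2*y^3, with y = (R/R0 - 1)/a and k = b/a. */
float cvd_temperature(float r)
{
    const float k = CVD_B / CVD_A;
    float y = (r / CVD_R0 - 1.0f) / CVD_A;
    return y * (1.0f + y * (-k + y * 2.0f * k * k));
}

/* Hypothetical top-level function: converts one batch of readings. */
void cvd_batch(const float r[BATCH], float t[BATCH])
{
    for (int i = 0; i < BATCH; i++) {
        t[i] = cvd_temperature(r[i]);
    }
}

/* Round-trip check: temperature -> resistance -> temperature. */
void cvd_self_check(void)
{
    float t = cvd_temperature(cvd_resistance(100.0f));
    assert(t > 99.99f && t < 100.01f);
}
```

Under these assumptions the cubic stays within a few thousandths of a degree of the exact solution between 0 and 100 degC. The error grows with the fourth power of y, so a wider range would need more terms.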

Running my CVD implementation through Vivado HLS with no optimisations produces an interval between calculations of 4225 clock cycles and a resource utilisation of approximately 3%.

Un-optimised Implementation

To get started with SLX FPGA, the first thing we need to do is create a new project. We can then start adding in the source and test bench code. As with any development tool, we want to ensure the correctness of the algorithm before we begin optimisation.

Once we have the project name entered, we can select the target platform. This selection is mainly relevant to the SDSoC / VITIS flow. If we want to use a standard HLS flow, we can select this once the project has been created.

With the project created, we will see the configuration settings dialog open for the project. It is through these configuration settings that we control:

· Vivado HLS or VITIS / SDSoC flow

· Project files for compilation and linker flags

· We can also identify the top function, the desired clock frequency and synthesis and estimation models

Configuration of the HLS / SDSoC / VITIS Flow
HLS options

Within the configuration page we can also define the resources that are already occupied in the platform, which is very useful for an SDSoC / SDAccel based flow.

We can also define the size of arrays we wish to partition to improve throughput. Arrays below the complete partitioning limit are fractured, that is, completely partitioned into individual elements.
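To make the effect of complete partitioning concrete, the following sketch shows how it is typically expressed in Vivado HLS source using the `ARRAY_PARTITION` pragma. The function, the array names and the weights are hypothetical, and this is not output generated by SLX FPGA. A standard C compiler ignores the pragmas, so the code can still be checked in software.

```c
#include <assert.h>

#define TAPS 4

/* Weighted sum over a small window. Both arrays are small enough to be
   split into individual registers so all elements can be read at once. */
float weighted_sum(const float x[TAPS])
{
    const float w[TAPS] = {0.5f, 0.25f, 0.125f, 0.125f};
#pragma HLS ARRAY_PARTITION variable=w complete dim=1
#pragma HLS ARRAY_PARTITION variable=x complete dim=1

    float acc = 0.0f;
    for (int i = 0; i < TAPS; i++) {
        /* Full unroll: one multiplier per element of the window. */
#pragma HLS UNROLL
        acc += w[i] * x[i];
    }
    return acc;
}

/* Test bench style check of the functional behaviour. */
void weighted_sum_check(void)
{
    const float x[TAPS] = {2.0f, 4.0f, 8.0f, 8.0f};
    assert(weighted_sum(x) == 4.0f);
}
```

Without partitioning, an array mapped to block RAM offers at most two accesses per clock cycle, which limits how much of the loop can execute in parallel.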

HLS optimisations and configurations

We can also identify which functions are candidates for hardware acceleration by defining the candidate threshold.

For this example, we are going to be using the HLS flow to implement the CVD algorithm. To identify the best HLS optimisations, we are going to use the following processes from the tool bar:

1) Find Parallelism: Identify data-level and pipeline parallelism.

2) HLS Hints: Review the HLS recommendations, which include pipelining and partitioning.

3) HW Optimisation and Processing: Allows optimisation of the HLS suggestions.

4) Generate the HLS aware code: Insert the HLS optimisations into the source code.

5) Synthesise the project: Implement the project with the HLS optimisations.

Clicking on the HLS Hints button will first run Find Parallelism and then display the HLS hints. The hints show several opportunities for optimising the presented source code, including data-level pipelining and partitioning of arrays.

HLS Hints identifying the possible optimisations

With the HLS hints available, we can optimise using the hardware optimisation and partitioning dialog. This enables us to set the number of times the identified loops are unrolled. In doing so we can, of course, trade area for performance to achieve a lower initiation interval.

As the batch size in this example is 128 samples, we can, if we desire and resources allow, change the loop unroll factor to 128.
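As a sketch of what an unroll factor means in the source, the following hypothetical batch function uses the Vivado HLS `UNROLL` and `ARRAY_PARTITION` pragmas. The factor of 16, the function name and the per-sample operation are assumptions for illustration only. This is not the code SLX FPGA generated for the CVD algorithm.

```c
#include <assert.h>

#define BATCH 128

/* Hypothetical per-sample operation standing in for the real algorithm. */
void scale_batch(const float in[BATCH], float out[BATCH], float gain)
{
    /* Cyclic partitioning places 16 consecutive elements in 16 separate
       memories, so the unrolled loop bodies do not compete for ports. */
#pragma HLS ARRAY_PARTITION variable=in cyclic factor=16 dim=1
#pragma HLS ARRAY_PARTITION variable=out cyclic factor=16 dim=1

    for (int i = 0; i < BATCH; i++) {
        /* 16 copies of the loop body; factor=128 would unroll it fully. */
#pragma HLS UNROLL factor=16
        out[i] = gain * in[i];
    }
}

/* Test bench style check: the pragmas must not change the results. */
void scale_batch_check(void)
{
    float in[BATCH];
    float out[BATCH];

    for (int i = 0; i < BATCH; i++) {
        in[i] = (float)i;
    }
    scale_batch(in, out, 2.0f);
    for (int i = 0; i < BATCH; i++) {
        assert(out[i] == 2.0f * (float)i);
    }
}
```

Each step up in the factor replicates the arithmetic hardware, which is the area side of the trade-off described above.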

When the CVD code was implemented using Vivado HLS via SLX FPGA, the performance was significantly improved over the original un-optimised code, producing a latency of 100 clocks compared to the original 4225 clocks, a reduction of roughly 42x. Of course, this implementation does use significantly more logic resources, at approximately 20% of the device.

Utilisation of the optimised HLS

As we can see, running our code through SLX FPGA enables us to quickly and easily implement industrial control algorithms using HLS.

If you are interested in seeing a demo of SLX FPGA, the SLX FPGA team will be attending XDF in The Hague on the 12th & 13th November.

Adam Taylor

Adam Taylor is an expert in the design and development of embedded systems and FPGAs for several end applications. He is the founder of Adiuvo Engineering Ltd.