High-Level Synthesis (HLS) is great for implementing algorithms. However, as we develop our HLS IP there are times when we need to think about how it interfaces with the rest of the system beyond the AXI interfaces, which are our main interfaces.
This can be challenging in HLS, as it often means we need to wait on external signals or wait for a number of clock cycles.
In this blog we are going to look at how we can implement structures in our HLS algorithms which:
Wait for an input signal as a trigger
Wait for a defined number of clock cycles
Generate an output trigger signal
Let’s start with waiting for an input trigger from an external IP block. This could be an external frame sync in an image processing application.
The first thing we need to do is define an input using the arbitrary precision unsigned integer type. As we want a single bit, we can use ap_uint<1>, which is provided by ap_int.h.
We declare this in the parameter list of our HLS function. We can then also declare an HLS pragma which implements the input as an ap_none interface.
#pragma HLS INTERFACE ap_none port=trig_in
This now provides us with a single-bit input into our C function. The next thing we need to do is pause the HLS IP block until we see the trigger.
The simplest way to do this within Vivado HLS is to use the ap_wait_until(X) function, which is defined in ap_utils.h. This will cause the HLS IP to pause until the variable is true; true in this case means any non-zero value.
When we run the C simulation, we need to ensure the trigger variable is set to a non-zero value, otherwise the simulation will pause.
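Putting the trigger steps together, a minimal sketch might look like the following. The function name wait_for_trigger, the data_in argument and the doubling operation are hypothetical and only there for illustration. The stand-in definitions at the top are also an assumption: they exist purely so the sketch compiles on a desktop compiler without the Xilinx headers, and they mirror the behaviour described above rather than the real library code.

```cpp
// Use the Xilinx headers when they are available; otherwise fall back to
// illustrative stand-ins (assumptions, NOT the real library definitions).
#ifdef __has_include
#  if __has_include(<ap_int.h>) && __has_include(<ap_utils.h>)
#    include <ap_int.h>
#    include <ap_utils.h>
#    define HAVE_HLS_HEADERS 1
#  endif
#endif
#ifndef HAVE_HLS_HEADERS
// Stand-in for the arbitrary precision unsigned type (small widths only).
template <int W>
struct ap_uint {
    unsigned v;
    ap_uint(unsigned x = 0) : v(x & ((1u << W) - 1u)) {}
    operator unsigned() const { return v; }
};
// Stand-in: spin until the condition is non-zero.
#define ap_wait_until(cond) do { while (!(cond)) {} } while (0)
#endif

// Hypothetical top-level function: block until trig_in is non-zero,
// then perform some placeholder processing on data_in.
int wait_for_trigger(ap_uint<1> trig_in, int data_in)
{
#pragma HLS INTERFACE ap_none port=trig_in
    ap_wait_until(trig_in);  // pause here until the external trigger is seen
    return data_in * 2;      // placeholder for the real algorithm
}
```

In a test bench the trigger has to be driven non-zero before the call, for example wait_for_trigger(ap_uint<1>(1), 21), so that the wait completes.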
To ensure the implementation of the ap_wait_until() function is correct within our HLS code, we can look at the analysis view.
In the analysis view we should see an operation called _lnXX(wait). Right-clicking on it and selecting Goto Source should cross-probe to the ap_wait_until() function call.
Now that we know how to wait for an external trigger in our HLS code, how can we implement a delay for a specific number of clock cycles?
Typically, writing our own delay function will not be efficient, or will not implement the delay as we intend. As such, the best way to implement a delay of several clock cycles is to use the ap_wait_n(X) function.
This function is also defined in ap_utils.h and will delay for at least X clock cycles. However, resumption of processing might take a few more clock cycles depending upon the implementation.
The delay can be changed on the fly and, if desired, can be controlled using an AXI Lite interface, for example:
#include "ap_utils.h"
void test (int delay)
{
#pragma HLS INTERFACE s_axilite port=delay
    ap_wait_n(delay); // delay for a number of clock cycles provided over the AXI bus
}
One useful application of the ap_wait_n(X) function is to adjust the frame rate to exactly the rate you desire when creating a custom test pattern generator for video applications. In C simulation this delay will be ignored; however, you will notice it in Co-Simulation when inspecting the waveform.
As with ap_wait_until(), we are able to observe the delay in the analysis view. In this case we will see a new loop which implements the counter to check for the correct delay.
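As a rough illustration of the frame rate adjustment, the number of idle cycles to request from ap_wait_n() can be estimated from the clock frequency and the target frame rate. The helper below is a hypothetical sketch, not part of any Xilinx library, and it assumes the generator outputs one pixel per clock with no other per-frame overhead. Because ap_wait_n() delays for at least the requested number of cycles, the resulting frame rate should be treated as approximate and checked in Co-Simulation.

```cpp
#include <cstdint>

// Hypothetical helper: idle cycles to add per frame so that
// (active_cycles + padding) is roughly clk_hz / target_fps.
// Returns 0 if the active period already fills or exceeds the frame budget.
constexpr std::uint32_t frame_padding_cycles(std::uint32_t clk_hz,
                                             std::uint32_t target_fps,
                                             std::uint32_t active_cycles)
{
    return (target_fps == 0 || clk_hz / target_fps <= active_cycles)
               ? 0u
               : clk_hz / target_fps - active_cycles;
}
```

For example, with an assumed 150 MHz clock and a 60 frames per second target, each frame has a budget of 2,500,000 cycles. A 1920 x 1080 frame at one pixel per clock uses 2,073,600 of them, so the helper returns 426,400 cycles of padding, which could then be written to the delay register over AXI Lite.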
The final aspect I want to examine is the creation of an output signal which changes state as the HLS IP core runs. This needs care because the Vivado HLS compiler, like many C compilers, assumes the function is single threaded. This means that, much like in a VHDL process, only the final value of the variable is returned; to generate intermediate outputs we need a little thought.
The first thing we need to do, as for the trigger input, is to create a single-bit variable using the arbitrary precision types.
We declare this in our function parameter list. However, as we want the signal to be an output and it is a scalar, we must declare it as a pointer, in accordance with figure 41 in user guide 902.
To ensure the intermediate values are output from the HLS IP function, we define the signal as volatile. When the HLS compiler runs, the intermediate operations will then be performed and not optimized out.
Hopefully you now understand a little more about some of the more specialist features and functions that you can deploy in your HLS solutions to make them easier to integrate into your overall system.
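A minimal sketch of such an output trigger is shown below. The function name pulse_out and the four-cycle pulse width are hypothetical choices for illustration. As before, the stand-in definitions are an assumption that lets the sketch compile on a desktop compiler without the Xilinx headers; they are not the real library code, and the stand-in ap_wait_n() does nothing, in line with the delay being ignored in C simulation.

```cpp
// Use the Xilinx headers when they are available; otherwise fall back to
// illustrative stand-ins (assumptions, NOT the real library definitions).
#ifdef __has_include
#  if __has_include(<ap_int.h>) && __has_include(<ap_utils.h>)
#    include <ap_int.h>
#    include <ap_utils.h>
#    define HAVE_HLS_HEADERS 1
#  endif
#endif
#ifndef HAVE_HLS_HEADERS
// Stand-in for the arbitrary precision unsigned type, usable through
// a volatile pointer (small widths only).
template <int W>
struct ap_uint {
    unsigned v;
    ap_uint(unsigned x = 0) : v(x & ((1u << W) - 1u)) {}
    void operator=(unsigned x) volatile { v = x & ((1u << W) - 1u); }
    operator unsigned() const volatile { return v; }
};
// Stand-in: no delay on the desktop.
#define ap_wait_n(n) do { (void)(n); } while (0)
#endif

// Hypothetical function: raise the output, hold it for a few cycles,
// then lower it again. volatile keeps both writes in the generated design.
void pulse_out(volatile ap_uint<1> *trig_out)
{
    *trig_out = 1;   // intermediate value: trigger asserted
    ap_wait_n(4);    // hold for at least 4 clock cycles (arbitrary width)
    *trig_out = 0;   // final value: trigger released
}
```

On the desktop the caller only ever observes the final value of 0 after the call returns; the intermediate high period is something to look for in the Co-Simulation waveform.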