01-11-2019 02:28 AM - edited 01-11-2019 02:30 AM
My project consists of a top function that calls a number of sub-functions as a stream under dataflow.
Each sub-function represents a computation block that performs a certain task. I also have a DMA for input and output in order to convert AXI-Lite (memory-mapped) to AXI-Stream and vice versa.
It looks something like this:
TOP : [MM2S --> StreamBlock 1 --> StreamBlock 2 --> ... --> StreamBlock n --> S2MM]
Each time, I have to wait for the data to pass through the whole stream before the next input can be applied.
I would like to speed up the stream by making use of ap_ctrl_chain, so that whenever a block is done processing, a flag signal is sent out.
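For reference, the structure described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Stream` class is a stand-in for `hls::stream` so that the sketch compiles with a plain C++ compiler, and `mm2s`, `block1`, `block2`, `s2mm`, and the frame size `N` are hypothetical names, not taken from the actual project.

```cpp
#include <cstddef>
#include <queue>

// Stand-in for hls::stream<T> (assumption: used here only so the sketch
// builds without the HLS headers; a real design would use <hls_stream.h>).
template <typename T>
class Stream {
public:
    void write(const T &v) { q_.push(v); }
    T read() {
        T v = q_.front();
        q_.pop();
        return v;
    }
    bool empty() const { return q_.empty(); }

private:
    std::queue<T> q_;
};

const std::size_t N = 8;  // hypothetical number of samples per input

// MM2S: copies the memory-mapped input into a stream.
void mm2s(const int *in, Stream<int> &out) {
    for (std::size_t i = 0; i < N; ++i) out.write(in[i]);
}

// Hypothetical computation block 1: adds one to every sample.
void block1(Stream<int> &in, Stream<int> &out) {
    for (std::size_t i = 0; i < N; ++i) out.write(in.read() + 1);
}

// Hypothetical computation block 2: doubles every sample.
void block2(Stream<int> &in, Stream<int> &out) {
    for (std::size_t i = 0; i < N; ++i) out.write(in.read() * 2);
}

// S2MM: drains the final stream back into memory-mapped output.
void s2mm(Stream<int> &in, int *out) {
    for (std::size_t i = 0; i < N; ++i) out[i] = in.read();
}

// Top function: each sub-function is a process in the dataflow region.
// A plain C++ compiler ignores the pragma and runs the calls in order.
void top(const int *in, int *out) {
#pragma HLS dataflow
    Stream<int> s0, s1, s2;
    mm2s(in, s0);
    block1(s0, s1);
    block2(s1, s2);
    s2mm(s2, out);
}
```

Compiled as ordinary C++, the calls simply run one after another, so this only shows how the blocks are wired together, not the timing behaviour of the synthesized hardware.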
Thank you so much !
01-14-2019 09:29 AM
The dataflow directive should be telling the synthesizer to pipeline the sub-functions, and it's my understanding that it does this by inferring the ap_ctrl_chain directive on those sub-functions. So you shouldn't have to do this explicitly.
But there does seem to be some reason why HLS couldn't pipeline the dataflow region. Are there any dataflow warnings in the console? Can you use the dataflow viewer and the information in UG902 to debug why your functions aren't being pipelined?
01-14-2019 12:26 PM
Thank you for your answer !
First of all, I noticed that when I add the directive:
#pragma HLS interface ap_ctrl_chain port=return bundle=CRTL_BUS
the protocol of the following RTL ports changes from ap_ctrl_hs to ap_ctrl_chain for every sub-function (as shown in the synthesis report):
ap_clk, ap_rst, ap_start, start_full_n, ap_ready, ap_done, ap_continue, ap_idle, start_out, start_write.
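When the block-level control signals are mapped to an AXI4-Lite interface, the PS sees these handshake signals as bits of a control register. The sketch below decodes such a register value. It assumes the usual HLS layout at offset 0x00 (ap_start in bit 0, ap_done in bit 1, ap_idle in bit 2, ap_ready in bit 3, and ap_continue in bit 4 for ap_ctrl_chain); the layout should be verified against the register map in the generated `*_hw.h` header of the actual design. The helper names `decode_ctrl` and `ready_for_next_start` are hypothetical.

```cpp
#include <cstdint>

// Assumed bit positions in the HLS-generated control register (offset 0x00).
// Check the generated *_hw.h header of your own IP before relying on them.
constexpr std::uint32_t AP_START    = 1u << 0;  // set by PS, cleared on handshake
constexpr std::uint32_t AP_DONE     = 1u << 1;  // current transaction finished
constexpr std::uint32_t AP_IDLE     = 1u << 2;  // no transaction in flight
constexpr std::uint32_t AP_READY    = 1u << 3;  // core accepted its inputs
constexpr std::uint32_t AP_CONTINUE = 1u << 4;  // ap_ctrl_chain only

// Decoded view of the read-side status bits.
struct CtrlStatus {
    bool start;
    bool done;
    bool idle;
    bool ready;
};

// Splits a raw register value into its individual status flags.
CtrlStatus decode_ctrl(std::uint32_t reg) {
    CtrlStatus s;
    s.start = (reg & AP_START) != 0;
    s.done  = (reg & AP_DONE) != 0;
    s.idle  = (reg & AP_IDLE) != 0;
    s.ready = (reg & AP_READY) != 0;
    return s;
}

// A new start can be written once the previous ap_start has been consumed,
// i.e. once the hardware has cleared the ap_start bit again.
bool ready_for_next_start(std::uint32_t reg) {
    return (reg & AP_START) == 0;
}
```

The point of distinguishing these bits is that ap_ready (inputs consumed) and ap_done (outputs produced) are separate events, so software that waits only for ap_done cannot overlap successive inputs.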
Secondly, the only thing I got in the log is that dataflow is applied to the top function, and that the process function(s) were detected/extracted. Looks like it is applied correctly.
But this means that the sub-functions are pipelined with each other, while the design as a whole is not pipelined across successive inputs. Let me explain with the following example:
A top-level function takes an image as input and does some image processing on it by calling 10 sub-functions. These 10 functions are pipelined with each other, which is good.
But what I would like to do is feed the top-level function with a 2nd image before the 1st image is done processing, because the first part of the top pipeline is already empty (i.e. done processing the 1st image). Say only 4 sub-functions are still working on the 1st image, while the other 6 sub-functions are idle.
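To put rough numbers on the difference, here is a small sketch using the standard pipeline formula. The figures are purely hypothetical: `latency` is the number of cycles one image needs to pass through the whole chain, and `interval` is the number of cycles after which the front of the chain could accept the next image.

```cpp
#include <cstdint>

// Total cycles when each image must fully drain before the next one starts.
std::uint64_t serial_cycles(std::uint64_t n_images, std::uint64_t latency) {
    return n_images * latency;
}

// Total cycles when a new image can be started every `interval` cycles:
// the first image pays the full latency, each later one adds one interval.
std::uint64_t overlapped_cycles(std::uint64_t n_images,
                                std::uint64_t latency,
                                std::uint64_t interval) {
    if (n_images == 0) return 0;
    return latency + (n_images - 1) * interval;
}
```

For example, with 5 images, a latency of 1000 cycles, and an interval of 400 cycles, `serial_cycles(5, 1000)` gives 5000 while `overlapped_cycles(5, 1000, 400)` gives 2600. The real numbers depend on the latency and interval reported by the synthesis tool for the actual design.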
I don't even know how to feed the 2nd image as input before the top-function call from the PS returns.
How can I utilize the idle sub-functions for the next input, and how do I call the PL correctly from the PS?
Sorry for the long explanation, I hope you got my point :)
Thank you so much !