01-18-2019 05:19 AM
Hi, dear HLS elites,
I'm working on the following problem regarding optimization strategy in HLS for large designs. If you have any thoughts or comments, I welcome them, and thank you in advance for your time.
Assume a given design is composed of tens of loops and functions in HLS. When the user runs HLS synthesis, it reports a latency of many millions of cycles. If you inherit such a design from someone else and your task is to optimize its performance in HLS, what is your strategy for jump-starting the optimization?
What I can come up with is to first optimize individual loops and functions with compiler directives wherever I can.
If I'd like to analyze the latency profile of each loop and function, is there a recommended way to master Design Analysis in HLS?
Or what would you do?
All the best,
01-18-2019 06:28 AM
When you're designing the block, you should always be thinking about how long each part is going to take - so when you finish writing the code, you should have a pretty solid idea of what total run-time is expected. For example, a block that does some processing on a 640*480 image at 1 pixel per cycle would be expected to have a run-time of around 307,200 cycles - if you see 50,000,000 cycles then clearly HLS has not built the function as you expected. Since you know how long each part should take, it's easy to identify which parts are unreasonably slow and add appropriate pragmas to fix them.
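To make the budgeting step concrete, here is a minimal sketch in plain C++ (not HLS-specific code). It assumes the usual model for a pipelined loop, where the ideal latency is (trip count - 1) * II + pipeline depth. The helper names expected_cycles and slowdown are hypothetical and only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: ideal cycle count of a pipelined loop.
// trip_count: total iterations, ii: initiation interval,
// depth: pipeline depth (latency of a single iteration).
constexpr std::uint64_t expected_cycles(std::uint64_t trip_count,
                                        std::uint64_t ii,
                                        std::uint64_t depth) {
    return trip_count == 0 ? 0 : (trip_count - 1) * ii + depth;
}

// Ratio of the latency reported by synthesis to the expected latency.
// Values far above 1 suggest HLS did not build what you intended.
inline double slowdown(std::uint64_t reported, std::uint64_t expected) {
    assert(expected > 0);
    return static_cast<double>(reported) / static_cast<double>(expected);
}
```

With the image example, expected_cycles(640 * 480, 1, 1) gives 307,200, and a reported latency of 50,000,000 cycles is a slowdown of more than 100x, which is the kind of gap worth investigating first.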
If you don't know how long you expect the block to take, that's where you need to start. Optimizing individual loops is a great way of wasting time if (a) you don't know which loops are actually running slower than expected, (b) you don't know which loops are most critical to timing, or (c) you're not actually sure that the design should meet your speed requirements even if everything is optimized perfectly.
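One way to decide where to look first, sketched below in plain C++, is to list each loop with its expected and reported latency and sort by the gap. The struct LoopLatency and the function rank_by_excess are hypothetical names, and the reported numbers would have to be copied by hand from the synthesis report.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record: one entry per loop or function in the design.
struct LoopLatency {
    std::string name;
    std::uint64_t expected;  // cycles from your own budget
    std::uint64_t reported;  // cycles from the synthesis report
};

// Cycles lost relative to the budget (negative if faster than expected).
inline std::int64_t excess(const LoopLatency& l) {
    return static_cast<std::int64_t>(l.reported) -
           static_cast<std::int64_t>(l.expected);
}

// Return the loops ordered from largest to smallest excess latency.
inline std::vector<LoopLatency> rank_by_excess(std::vector<LoopLatency> loops) {
    std::stable_sort(loops.begin(), loops.end(),
                     [](const LoopLatency& a, const LoopLatency& b) {
                         return excess(a) > excess(b);
                     });
    return loops;
}
```

The loop at the top of the ranking is the one costing the most cycles against the budget, so it is the one where adding pragmas is most likely to pay off.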
01-20-2019 03:34 AM
Thanks for the comment. Yes, I have the time budget spec for the overall system, which is composed of several blocks being designed in HLS.
What I am thinking about is finding a way in HLS to check whether there are any bubbles between blocks.
For example, suppose there are blocks A, B and C. C needs the outputs of both A and B to generate its own output. If A takes fewer cycles than B to produce its data for C, then B is the critical path in the data path A,B-->C. Within B, the operation itself may be fast enough, but the DSP may be waiting for data to become available, which means there are bubbles between the DSP and the memory block in B.
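The A,B-->C situation can be put into numbers with a small model like the one below. It is a simplified sketch in plain C++, assuming A and B start in the same cycle and C starts only once both have finished (no streaming overlap). The names JoinTiming, join_timing and stall_fraction are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical timing summary for two producers feeding one consumer.
struct JoinTiming {
    std::uint64_t c_start;  // cycle at which C can begin
    std::uint64_t slack_a;  // cycles A's output waits for B
    std::uint64_t slack_b;  // cycles B's output waits for A
    std::uint64_t total;    // end-to-end latency of A,B-->C
};

// Assumes A and B start together and C waits for both to finish.
inline JoinTiming join_timing(std::uint64_t lat_a, std::uint64_t lat_b,
                              std::uint64_t lat_c) {
    const std::uint64_t c_start = std::max(lat_a, lat_b);
    return JoinTiming{c_start, c_start - lat_a, c_start - lat_b,
                      c_start + lat_c};
}

// Fraction of a block's cycles in which its compute unit is idle,
// for example a DSP waiting on memory inside B.
inline double stall_fraction(std::uint64_t total_cycles,
                             std::uint64_t busy_cycles) {
    assert(total_cycles > 0 && busy_cycles <= total_cycles);
    return static_cast<double>(total_cycles - busy_cycles) /
           static_cast<double>(total_cycles);
}
```

The producer with zero slack is the critical one. If that block also shows a high stall fraction, the bubbles inside it (rather than its arithmetic) are what limit the whole path.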
Since I synthesize the whole design at once, I'm wondering how to find such bubbles in the data path in HLS. The only approach I know of is to use the HLS performance debug GUI.