Explorer

Latency figures in post-HLS


Hi, dear HLS experts

 

After HLS synthesis, the report gives latency figures in cycles. How large is the estimation error in your experience?

How accurate are these figures compared with the final latency measured on the FPGA? Which figure can we get that is as close as possible to the real latency on the FPGA?

Do you only check the latency figure reported after HLS, or are there more accurate values we can get from Vivado or HLS?

Thank you

 

 


Accepted Solutions
Xilinx Employee

@nanson If the loop bounds are static, HLS can work out the latency at synthesis time, and the reported figures are mostly accurate.

For variable loop bounds, the latency depends on the simulation data. So if the same data set is used in hardware, I would expect the latency to match. On hardware you also need to account for the interfaces, though, because in simulation the testbench may supply data as soon as the design can accept it.

Thanks,

Nithin
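
To illustrate the static vs. variable bound distinction above, here is a minimal sketch. The function names, the buffer size `N`, and the trip-count values are assumptions made up for the example, not taken from the original poster's design. The `loop_tripcount` pragma only feeds the latency report; it does not change the generated hardware, and an ordinary C++ compiler ignores it.

```cpp
#include <cassert>

const int N = 64;  // assumed buffer size; a static trip count known at synthesis time

// Static bound: the loop always runs N times, so the tool can
// report a fixed cycle count for it.
int sum_static(const int in[N]) {
    int acc = 0;
    for (int i = 0; i < N; i++) {
        acc += in[i];
    }
    return acc;
}

// Variable bound: the trip count depends on the run-time value of n,
// so the synthesis report cannot give a single latency figure.
// The pragma supplies assumed min/max/avg trip counts for reporting only.
int sum_variable(const int *in, int n) {
    assert(n >= 0 && n <= N);  // caller must stay within the buffer
    int acc = 0;
    for (int i = 0; i < n; i++) {
#pragma HLS loop_tripcount min=0 max=64 avg=32
        acc += in[i];
    }
    return acc;
}
```

With a hint like this, the report shows a latency range for the variable-bound loop. The range is only as good as the numbers you put in the pragma, so it should reflect the data you expect on hardware.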

Advisor

In addition to what @nithink has said, there are two big sources of error:

 

- Variable-latency operations (e.g. floating-point). HLS will normally give you a latency range for these.

- Interface delays. If the HLS block is stuck waiting 100000 cycles for data from off-chip RAM (because something else is occupying all the bandwidth), then the latency obviously increases by 100000 cycles. This delay will tend to change every time you run the block.
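
The interface point can be made concrete with a toy cycle-level model. This is only a sketch: `cycles_to_consume` and the `ready` trace are hypothetical names invented for the example, not part of any Xilinx tool or library. It assumes the block consumes one input word in every cycle in which the interface has data ready, and counts how many cycles pass before all words have arrived.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: ready[c] is true when the input interface can
// deliver one word in cycle c. Returns the number of cycles elapsed
// when the last of `words` words is consumed, or -1 if the trace ends
// before that happens.
long cycles_to_consume(const std::vector<bool>& ready, int words) {
    assert(words >= 0);
    if (words == 0) return 0;
    int consumed = 0;
    for (std::size_t c = 0; c < ready.size(); ++c) {
        if (ready[c] && ++consumed == words) {
            return static_cast<long>(c) + 1;  // cycle index is 0-based
        }
    }
    return -1;  // not enough ready cycles in the trace
}
```

With an always-ready trace, four words take four cycles, which is the kind of figure an ideal testbench produces. If the first three cycles are stalls, the same four words take seven cycles. The extra three cycles come entirely from the interface, which is why the measured latency can differ from run to run.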
