Participant
Registered: ‎10-30-2018

Early-stage evaluation of resource utilization

Hi, I've read several papers and documents on early estimation of on-chip resources.

It seems clear that an implementation cannot fully utilize all the resources of the target FPGA platform because of placement and routing issues (is this right?).

Most of the papers "guesstimate," based on their empirical results, that there is about a 10~30% mismatch between the estimate from HLS and the final implementation result in Vivado.

I wonder if there is any description that explains this effect, rather than relying only on "empirical results". And 10~30% is quite a wide range.

For now, I think UG904 may mention the effect of running implementation, maybe?

P.S. I am reading some articles about matrix multiplication. I guess the type of application might affect the mismatch number, too.

1 Reply
Advisor
Registered: ‎04-26-2015

It seems clear that an implementation cannot fully utilize all the resources of the target FPGA platform because of placement and routing issues (is this right?).

There are a lot of reasons. No design is going to map perfectly to an FPGA structure (with effort, you might map one perfectly to a simple CPLD structure) so there are always going to be some wasted resources. Adding timing constraints will limit resource usage because some blocks are simply too far away to be usable. And then on top of that you have the limitations of the place-and-route tools; this is a very challenging problem and there's no way (apart from spending several trillion years doing a brute-force approach) to produce an optimal result.
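To give a rough feel for why brute force is out of the question, here is a minimal sketch that counts placements for a toy problem. It assumes every cell can go on any site and ignores routing entirely; the function names and the rate of one billion placements per second are made up for illustration.

```python
import math


def placement_count(cells: int, sites: int) -> int:
    """Ways to put `cells` distinct cells onto `sites` distinct sites,
    at most one cell per site (an ordered selection without repetition)."""
    return math.perm(sites, cells)


def years_to_enumerate(cells: int, sites: int,
                       placements_per_second: float = 1e9) -> float:
    """Years needed to visit every placement at an assumed, hypothetical rate."""
    seconds = placement_count(cells, sites) / placements_per_second
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    # Even 30 cells on 40 sites is far beyond "several trillion years".
    print(f"{years_to_enumerate(30, 40):.2e} years")
```

A real design has many thousands of cells, so exhaustive search is hopeless long before routing is even considered; practical tools rely on heuristics instead.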

 

Most of the papers "guesstimate," based on their empirical results, that there is about a 10~30% mismatch between the estimate from HLS and the final implementation result in Vivado.

I wonder if there is any description that explains this effect, rather than relying only on "empirical results". And 10~30% is quite a wide range.

Fundamentally, it's the same reason that if I ask you "how much space will it take to build an 8088 in Vivado?" you're probably going to be wrong by at least 30%. It's an estimate, nothing more.

HLS does a bunch of high-level optimizations. It doesn't do any low-level ones (e.g., seeing that an output isn't connected and can therefore be removed, or that an input is tied to a constant and can therefore be simplified), and in many cases that isn't even possible (because the outputs aren't connected until you use the block in Vivado). Its estimate is likely to be along the lines of "add up all the resources for everything we've used, multiply by a fudge factor that gives a decent answer on average, and report that". For some resources like DSP slices and block RAM, that is likely to be very close to correct. For LUTs and FFs, there are more likely to be optimizations that Vivado can do to cut the number down.
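As a rough illustration of that "add it up and apply a fudge factor" idea, here is a minimal sketch. It is not how HLS actually computes its report; the function names, the per-block numbers, and the fudge factor are all hypothetical.

```python
from collections import Counter


def naive_estimate(blocks, fudge=1.0):
    """Sum per-block resource counts, then scale by an assumed fudge factor."""
    total = Counter()
    for usage in blocks:
        total.update(usage)  # Counter.update adds counts key by key
    return {res: round(n * fudge) for res, n in total.items()}


def mismatch_percent(estimate, implemented):
    """Signed mismatch of the estimate relative to the implemented count."""
    return {
        res: 100.0 * (estimate[res] - implemented[res]) / implemented[res]
        for res in estimate
        if implemented.get(res)  # skip resources that are missing or zero
    }


if __name__ == "__main__":
    # Hypothetical per-block usage, not taken from any real report.
    blocks = [
        {"LUT": 100, "FF": 200, "DSP": 2},
        {"LUT": 50, "FF": 80, "DSP": 1},
    ]
    est = naive_estimate(blocks)
    # Pretend post-implementation numbers: DSPs match, LUTs and FFs shrink.
    impl = {"LUT": 120, "FF": 250, "DSP": 3}
    print(est)
    print(mismatch_percent(est, impl))
```

In this made-up case the DSP count matches exactly while the LUT and FF estimates come out high, which is the pattern described above: later optimization trims logic but rarely changes the number of hard blocks.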

 

On top of that, Vivado's own resource usage calculations can be somewhat misleading. Vivado implementation runs until it finds a solution that works (i.e., one that meets timing and resource requirements), then it stops. There is no guarantee that this is the best solution. It's quite possible that your design could be implemented with 1000 LUT6s (and HLS reported that), but Vivado tried implementing it with 3000 LUT4s and it worked, so implementation stopped there. With further work it might well have fitted in 1000 LUT6s, but Vivado doesn't bother to do that work because there's no need. As a result, the HLS estimate looks odd.
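The difference between "first solution that works" and "smallest solution" can be sketched as follows. This is only a toy model of the argument above, not a description of Vivado's actual algorithm; the candidate list, field names, and LUT budget are hypothetical.

```python
def first_feasible(candidates, lut_budget):
    """Return the first candidate that meets timing and fits, then stop."""
    for cand in candidates:
        if cand["meets_timing"] and cand["luts"] <= lut_budget:
            return cand
    return None


def smallest_feasible(candidates, lut_budget):
    """Examine every candidate and return the feasible one using the fewest LUTs."""
    feasible = [
        c for c in candidates
        if c["meets_timing"] and c["luts"] <= lut_budget
    ]
    return min(feasible, key=lambda c: c["luts"], default=None)


if __name__ == "__main__":
    # Hypothetical candidates in the order a tool happens to try them.
    candidates = [
        {"name": "first attempt", "luts": 3000, "meets_timing": True},
        {"name": "tighter packing", "luts": 1000, "meets_timing": True},
    ]
    print(first_feasible(candidates, lut_budget=5000)["luts"])
    print(smallest_feasible(candidates, lut_budget=5000)["luts"])
```

If the reported utilization comes from something like `first_feasible`, a larger number than the HLS estimate does not by itself mean the estimate was wrong.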