
cerilet

Explorer


07-14-2016 09:44 AM


HLS resources and latency increase without any sense

Hello,

I am writing a program that uses double-precision floating-point variables, to be implemented on a Zynq device. The design uses too many resources, and I am trying to shrink it so that it fits in the fabric (reducing the precision is not an option).

The algorithm calculates a Tustin discretization of a doubly-fed induction generator (DFIG), which includes among other operations one 4x4 matrix inversion and two 4x4 matrix multiplications.

I have tested two different 4x4 matrix multiplication implementations in a separate project. Their latency, initiation interval and resource usage are as follows:

__Implementation A:__

Latency = 60

Initiation Interval = 61

BRAM = 0 (0% of resources)

DSP48E = 28 **(12% of resources)**

FF = 2664 (4% of resources)

LUT = 4261 (8% of resources)

__Implementation B:__

Latency = 80

Initiation Interval = 81

BRAM = 2 (1% of resources)

DSP48E = 14 **(6% of resources)**

FF = 1151 (1% of resources)

LUT = 1984 (4% of resources)

I then call these matrix multiplications as functions in the full algorithm.

For the whole program, the resource usage with each matrix multiplication version is as follows:

Using matrix multiplication version A:

Latency = 382

Initiation Interval = 383

BRAM = 16 (5% of resources)

DSP48E = 300 **(136% of resources)**

FF = 27279 (25% of resources)

LUT = 43124 (81% of resources)

Using matrix multiplication version B:

Latency = 391

Initiation Interval = 392

BRAM = 20 (7% of resources)

DSP48E = 314 **(142% of resources)**

FF = 25100 (23% of resources)

LUT = 41491 (77% of resources)

It doesn't make much sense, does it? Can somebody tell me why this is happening, or how I can reduce the number of DSPs used?

Many thanks,

Cerilet

1 Reply

debrajr

Moderator


07-15-2016 12:03 AM


a_ Use #pragma HLS INLINE on the matrix-multiplication function in your complete project.

b_ Use the ALLOCATION directive on the matrix-multiplication function to limit the number of multiplication operations in order to encourage resource sharing. These mul operations are usually mapped to DSP slices.

c_ Use a higher binding effort by setting config_bind in Solution Configurations.
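As a rough illustration of suggestions a_ and b_, here is a minimal sketch. The function names (mat_mul_4x4, top_level) and the way the two multiplications are chained are assumptions for illustration only, not the original poster's code. The operation names dmul and dadd (double-precision multiply and add) and the pragma argument order follow the Vivado HLS documentation of that era; other tool versions use a different syntax, so check the user guide for the release you are using. Suggestion c_ is a solution-level setting rather than source code, so it is not shown here.

```cpp
// Hypothetical names throughout: mat_mul_4x4 and top_level are illustrative.
static const int N = 4;

// 4x4 double-precision matrix multiply: c = a * b.
void mat_mul_4x4(const double a[N][N], const double b[N][N], double c[N][N]) {
// (a_) Dissolve the function boundary so that operators can be shared
//      with the calling function instead of being duplicated per call.
#pragma HLS INLINE
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            double acc = 0.0;
            for (int k = 0; k < N; ++k) {
                acc += a[i][k] * b[k][j];
            }
            c[i][j] = acc;
        }
    }
}

// Illustrative top level that chains two multiplications: out = (a * b) * c.
void top_level(const double a[N][N], const double b[N][N],
               const double c[N][N], double out[N][N]) {
// (b_) Cap the number of double-precision multipliers and adders in this
//      scope. The limits of 1 are an assumption; raise them to trade
//      area for latency.
#pragma HLS ALLOCATION instances=dmul limit=1 operation
#pragma HLS ALLOCATION instances=dadd limit=1 operation
    double tmp[N][N];
    mat_mul_4x4(a, b, tmp);
    mat_mul_4x4(tmp, c, out);
}
```

A standard C++ compiler ignores the HLS pragmas, so the same source can be checked in C simulation before synthesis. Limiting operators this way normally costs latency, so compare the synthesis report before and after.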

Regards,

Debraj

----------------------------------------------------------------------------------------------

Kindly note- Please mark the Answer as "Accept as solution" if information provided is helpful.

Give Kudos to a post which you think is helpful and reply oriented.

----------------------------------------------------------------------------------------------
