Observer angu_sewa
Registered: 07-16-2017

Over utilization of BRAMs in HLS

Hello All,

I am running into over-utilization of BRAMs in HLS. I am working with matrices of dimensions 128x100x360 (as shown in the figures below). I have already converted the floating-point data to a fixed-point type using ap_fixed<18,8>, but the design still needs more BRAMs than are available. How can I overcome this? Please help.

code.JPG
bram.JPG
mem.JPG
1 Reply
Scholar u4223374
Registered: 04-26-2015

Re: Over utilization of BRAMs in HLS

@angu_sewa

 

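One way or another, you need to get rid of the really big matrices; "proj" in particular is responsible for at least 4500 block RAMs. That figure is consistent with a 128x100x360 array of 18-bit values: 4,608,000 elements x 18 bits = 82,944,000 bits, which fills 4500 BRAM18K blocks of 18,432 bits each. The common approaches to this are: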

 

(1a) If the data flow through the matrix is streaming-friendly (i.e. every element is read, in sequence), then you can use the DATAFLOW directive, with the matrix turned into a stream, to remove that matrix completely.

 

(1b) If the data flow is sequential (even if you don't read every single element), then you can push the matrix into off-chip RAM (assuming the board has off-chip RAM). AXI Master accesses to sequential elements will be pretty efficient if you can set up burst accesses.

 

(2) Rewrite the code so it doesn't have to store this matrix at all. For example, if "proj" is generated from two other matrices, then instead of actually storing "proj" you can generate each element when it's required (from the two original matrices). This is horrible for a CPU (as it has to keep wasting time regenerating the same values), but on an FPGA it can be quite efficient.

 

(3) Process smaller sections at a time. For example, change the code so that it reads 32x32 sections of the matrices from RAM, processes them, and writes them back to RAM. This may or may not be possible, depending on the algorithm.
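
As a rough illustration of (1a), here is a minimal sketch in plain C++, so that it compiles without the Xilinx headers. Everything in it is an assumption for illustration only: "Fifo" is a hypothetical stand-in for hls::stream, and "produce", "consume", "top" and the size N are made-up names, not taken from your code. The point is that the consumer reads every element exactly once, in the order it was written, so no full intermediate array is needed.

```cpp
#include <cstddef>
#include <queue>

// Hypothetical software stand-in for hls::stream<T>. In an HLS project you
// would include <hls_stream.h> and use hls::stream<T> instead.
template <typename T>
struct Fifo {
    std::queue<T> q;
    void write(const T& v) { q.push(v); }
    T read() { T v = q.front(); q.pop(); return v; }
    bool empty() const { return q.empty(); }
};

const std::size_t N = 1024;  // illustrative size, not from the original post

// Producer: writes every element exactly once, in order.
void produce(const int* in, Fifo<int>& s) {
    for (std::size_t i = 0; i < N; ++i) {
        s.write(in[i] * 3);  // placeholder computation
    }
}

// Consumer: reads every element exactly once, in the same order.
void consume(Fifo<int>& s, int* out) {
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = s.read() + 1;  // placeholder computation
    }
}

// Top level. In Vivado HLS the DATAFLOW pragma would be placed in this
// function so that produce and consume can run concurrently through the FIFO.
void top(const int* in, int* out) {
    Fifo<int> s;
    produce(in, s);
    consume(s, out);
}
```

In software the producer simply runs to completion before the consumer starts, so the queue briefly holds all N values; the saving only shows up in hardware, where the two stages overlap.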
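
For (2), a small sketch of the idea, again in plain C++. I don't know how "proj" is really computed, so this assumes (purely for illustration) that each element is the product of one value from each of two input vectors "a" and "b", and that the consumer only needs row sums. The function names and sizes are hypothetical.

```cpp
#include <cstddef>

const std::size_t ROWS = 128;  // illustrative sizes
const std::size_t COLS = 100;

// Version A: build the whole intermediate matrix, then consume it.
// This needs ROWS * COLS words of on-chip storage for proj.
void row_sums_stored(const int* a, const int* b, int* out) {
    static int proj[ROWS][COLS];
    for (std::size_t i = 0; i < ROWS; ++i) {
        for (std::size_t j = 0; j < COLS; ++j) {
            proj[i][j] = a[i] * b[j];
        }
    }
    for (std::size_t i = 0; i < ROWS; ++i) {
        int acc = 0;
        for (std::size_t j = 0; j < COLS; ++j) {
            acc += proj[i][j];
        }
        out[i] = acc;
    }
}

// Version B: regenerate each element at the point of use.
// No intermediate matrix is stored at all.
void row_sums_on_the_fly(const int* a, const int* b, int* out) {
    for (std::size_t i = 0; i < ROWS; ++i) {
        int acc = 0;
        for (std::size_t j = 0; j < COLS; ++j) {
            acc += a[i] * b[j];  // same value proj[i][j] would have held
        }
        out[i] = acc;
    }
}
```

Both versions produce the same output; the second trades a multiplier for the storage, which is usually a good trade when BRAM is the scarce resource.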
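
For (3), a sketch of the read/process/write pattern on 32x32 tiles, in plain C++. The matrix size, the function name "process_tiled" and the per-element operation are placeholders I made up; "mem" stands for the off-chip RAM, which in HLS would typically be mapped to an AXI Master interface. The sketch assumes the matrix dimensions are multiples of the tile size.

```cpp
#include <cstddef>

const std::size_t ROWS = 64;   // illustrative sizes, multiples of TILE
const std::size_t COLS = 96;
const std::size_t TILE = 32;

// Processes a ROWS x COLS row-major matrix held in external memory,
// one TILE x TILE block at a time. Only the tile buffer lives on chip.
void process_tiled(int* mem) {
    int buf[TILE][TILE];  // 1024 words instead of the whole matrix

    for (std::size_t r0 = 0; r0 < ROWS; r0 += TILE) {
        for (std::size_t c0 = 0; c0 < COLS; c0 += TILE) {
            // Read the tile. Each tile row is TILE contiguous words,
            // which is the access pattern that suits burst transfers.
            for (std::size_t r = 0; r < TILE; ++r) {
                for (std::size_t c = 0; c < TILE; ++c) {
                    buf[r][c] = mem[(r0 + r) * COLS + (c0 + c)];
                }
            }
            // Process the tile (placeholder computation).
            for (std::size_t r = 0; r < TILE; ++r) {
                for (std::size_t c = 0; c < TILE; ++c) {
                    buf[r][c] = buf[r][c] * 2 + 1;
                }
            }
            // Write the tile back to external memory.
            for (std::size_t r = 0; r < TILE; ++r) {
                for (std::size_t c = 0; c < TILE; ++c) {
                    mem[(r0 + r) * COLS + (c0 + c)] = buf[r][c];
                }
            }
        }
    }
}
```

This only works as written when each output element depends on data inside its own tile; if the algorithm needs neighbouring tiles, the tiles have to overlap or the approach may not apply.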
