Registered: ‎10-15-2018

BRAM Tile and slice utilization exceeds max available on Zynq-xc7z045-ffg900-2 SoC.

Hi,

We are using the Zynq-xc7z045-ffg900-2 for our project.

Our BRAM tile usage is 731.5.

Available = 545.

A few points: we surely cannot bring all of this down using only architectural changes.

What other implementation options do we have?

Is there such a thing as combining some of these modules into one physical BRAM on the FPGA? Does the map/place/route process automatically fit them so that each BRAM is fully used and fractions are not wasted (something similar to fragmentation)? If not, how can I check, enable, or force that?

Also, from what I see, each BRAM on the Zynq 7z045 is 36 KB. Can you please confirm this, and that I am therefore presently short of (36*186.5) KB of BRAM?

I also see that we are using 14245 LUTs as distributed memory out of 70400 available, which leaves about 55k. I understand the map/place/route tool will use these by default. Do I need to specify anything anywhere? Very importantly, as I understand it, 2 of these LUTs can form a 32-bit memory; please confirm whether these values are correct. If so, I can use the remaining 55k to add (55k/2)*(32/8) bytes = 110 KB, which is equivalent to only about 3 BRAMs of 36 KB each. Please confirm whether the figure used (32 bits per 2 LUTs) and this calculation are correct, so that I know this will not suffice for the entire present shortage of (36*186.5) KB. For additional information: our LUT-as-logic usage is within the maximum available.
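The arithmetic in the paragraph above can be checked with a short script. This is only a sketch: the counts are quoted from the post, the 16 bits per LUT figure is the poster's own premise, and the 64 bits per LUT figure is an assumption (a 6-input LUT used as a 64 x 1 RAM) that should be verified against UG474. The shortfall is computed for 36 Kbit blocks, the unit confirmed later in this thread, rather than 36 KB.

```python
# Back-of-envelope check of the distributed-RAM estimate.
# Values marked "from the post" are quoted; the rest are assumptions.

BRAM_USED = 731.5        # from the post
BRAM_AVAILABLE = 545     # from the post
LUTRAM_TOTAL = 70400     # from the post: LUTs usable as distributed memory
LUTRAM_USED = 14245      # from the post


def shortfall_bytes(bits_per_bram):
    """BRAM shortfall in bytes for a given per-block capacity in bits."""
    return (BRAM_USED - BRAM_AVAILABLE) * bits_per_bram / 8


def lutram_bytes(bits_per_lut):
    """Capacity in bytes of the memory-capable LUTs that are still free."""
    return (LUTRAM_TOTAL - LUTRAM_USED) * bits_per_lut / 8


if __name__ == "__main__":
    print("free memory-capable LUTs:", LUTRAM_TOTAL - LUTRAM_USED)
    # Poster's premise: 2 LUTs hold 32 bits, i.e. 16 bits per LUT.
    print("capacity at 16 bits/LUT (bytes):", lutram_bytes(16))
    # Assumption, not from the thread: 64 bits per LUT.
    print("capacity at 64 bits/LUT (bytes):", lutram_bytes(64))
    # One 36 Kbit block holds 36 * 1024 bits.
    print("shortfall at 36 Kbit/BRAM (bytes):", shortfall_bytes(36 * 1024))
```

Under either per-LUT figure the free distributed memory comes out smaller than the shortfall, which matches the poster's own conclusion that LUT RAM alone will not close the gap.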

If this still does not suffice, what else can I check? How can I bring the BRAM usage down?

We also have a utilization issue with slices (SLICEM and SLICEL). I cannot understand what to infer from this. I checked everything else: LUTs as logic and as memory, BRAM, DSP, etc. Does "slice" refer to the other blocks from a top level, so that the slice over-utilization is only a reflection of the BRAM over-utilization?

Thanks and sincerely

Bhawandeep Singh

Moderator
Registered: 08-08-2017

Hi @bhawandeepsingh

It is not feasible to fully eliminate the excess utilization of 186.5 block RAMs, but you can check how much the utilization is reduced by the following BRAM reduction methodologies.

Distributed RAM (DRAM) based implementation

1. If you are using an inference-based implementation, set RAM_STYLE = distributed, which instructs the synthesis tool to use CLB resources.

2. If you are using an XPM-based implementation, use XPM_MEMORY_DPDISTRAM or another XPM_memory primitive and set the Memory_type attribute to "Distribute".

Refer to page 5 of the libraries user guide for details on XPM:

https://www.xilinx.com/support/documentation/sw_manuals/xilinx2018_2/ug974-vivado-ultrascale-libraries.pdf

3. If you have a FIFO in your design that is implemented using block RAM, change the implementation to either a built-in FIFO or distributed RAM.

BRAM based implementation

1. Choose the minimum area algorithm for the implementation.

The minimum area algorithm provides a highly optimized solution, resulting in a minimum number of block RAM primitives used, while reducing output multiplexing.

You can read more on this in the IP product guide:

https://www.xilinx.com/support/documentation/ip_documentation/blk_mem_gen/v8_0/pg058-blk-mem-gen.pdf

URAM based implementation

Some BRAMs can be saved by using URAM resources in the design, but 7 series devices do not have URAM resources.

Hi,

Thanks a lot for the detailed and informative answer.

We are not coding the RAMs and letting the synthesis tool infer them.

We are actually instantiating LogiCORE BRAM modules.

Also, to clarify, I do not want all of them to be implemented in LUTs. I just want all of them to be there: whatever cannot be implemented in BRAM instances should be implemented using distributed LUTs, and vice versa.
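The split described above (keep what fits in BRAM, push the rest into LUTs) can be planned offline before touching the design. The following is a minimal sketch, not a tool feature: the memory list, the names, and the bram18 counts are hypothetical, and the LUT cost model assumes 64 bits per memory-capable LUT for a single-port RAM and ignores output multiplexing.

```python
from math import ceil

BITS_PER_LUT = 64  # assumption: one memory-capable LUT stores 64 x 1 bits


def lut_cost(depth, width):
    """Rough LUT count for a single-port distributed RAM (muxes ignored)."""
    return ceil(depth / BITS_PER_LUT) * width


def pick_for_lutram(memories, bram18_to_free, lut_budget):
    """Greedy choice: move the memories that cost the fewest LUTs per freed block.

    memories: list of dicts with keys name, depth, width, bram18
    Returns (chosen names, 18 Kb blocks freed, LUTs spent).
    """
    ranked = sorted(
        memories,
        key=lambda m: lut_cost(m["depth"], m["width"]) / m["bram18"],
    )
    chosen, freed, spent = [], 0, 0
    for mem in ranked:
        if freed >= bram18_to_free:
            break
        cost = lut_cost(mem["depth"], mem["width"])
        if spent + cost > lut_budget:
            continue  # too expensive, try the next candidate
        chosen.append(mem["name"])
        freed += mem["bram18"]
        spent += cost
    return chosen, freed, spent


if __name__ == "__main__":
    # Hypothetical inventory; replace with the real instance list.
    inventory = [
        {"name": "coef", "depth": 1100, "width": 2, "bram18": 1},
        {"name": "lut_a", "depth": 256, "width": 16, "bram18": 1},
        {"name": "frame", "depth": 8192, "width": 32, "bram18": 16},
    ]
    print(pick_for_lutram(inventory, bram18_to_free=2, lut_budget=200))
```

Small, shallow memories are the cheapest to move; deep ones consume LUTs quickly, so the LUT budget has to reflect what the slices can actually absorb.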

I am not sure whether most of the below solutions apply to the direct instantiation method, except "Algorithm" = "Min area".

Will it impact the "packing" of instantiated RAMs into a single physical BRAM?

Thanks and sincerely

Bhawandeep Singh

Moderator

Hi @bhawandeepsingh

The minimum area algorithm selects primitives such that the overall utilization, counted in RAMB36 blocks (each equivalent to two RAMB18 blocks), is minimized.

An example is given in the product guide (page 42).

I am not sure if most of the below solutions apply to direct instantiating method - except "Algorithm" = "Min area".

-> I would recommend using the XPM-based implementation to get distributed RAM. You can replace some of the BMG (Block Memory Generator) instances in your design with XPM-based ones. Although XPM is also inference-based, its instantiation template is similar to that of a primitive, and the memory is inferred according to the Memory_type attribute. Take a look at the XPM instantiation templates in UG974. You will find the XPM-based implementation more convenient.

Hi @pthakare

Thanks for your very insightful replies.

A few things are still not clear to me, and I think I have reason to try one last time in my current direction; otherwise I will need to make changes for XPM.

Presently, we are instantiating block RAMs. I need some information just to be sure that we should be facing this problem at all. The point is: implementation says we are using more than 545 of the 36 Kbit blocks, but in the RTL the sum of our instances is much less than (36*545) Kbits.

Q1. Please confirm that each of the 545 BRAMs on Zynq 7z045 is of size 36 Kbits. If this is wrong, please help with the correct figure; I must know the size and structure of the available BRAM clearly.

Q2. If each BRAM is 36 Kbits in size, how is each one structured? Are there memories with fixed depth and width such that the total size of each is 36 Kbits, with a fixed number of memories of each configuration? Or are the configurations flexible? For example, what happens when I instantiate a block RAM of width 2 bits and depth 1100 locations? What is the depth-width configuration of the BRAM actually used, and is the depth exactly 1100 or more? If more, does it mean the remaining locations of this physical BRAM go to waste?

Please help with this. Thanks a lot for your help thus far.

Moderator

Hi @bhawandeepsingh

Q1. Please confirm that each of the 545 BRAMs on Zynq 7z045 is of size 36 Kbits. If this is wrong, please help with the correct figure; I must know the size and structure of the available BRAM clearly.

Yes, each BRAM is 36 Kbits in size and can be configured as either two independent 18 Kb RAMs or one 36 Kb RAM. Each 36 Kb block RAM can be configured as 64K x 1 (when cascaded with an adjacent 36 Kb block RAM), 32K x 1, 16K x 2, 8K x 4, 4K x 9, 2K x 18, 1K x 36, or 512 x 72 in simple dual-port mode. Each 18 Kb block RAM can be configured as 16K x 1, 8K x 2, 4K x 4, 2K x 9, 1K x 18, or 512 x 36 in simple dual-port mode.

What happens when I instantiate a block RAM of width 2 bits and depth 1100 locations? What is the depth-width configuration of the BRAM actually used, and is the depth exactly 1100 or more? If more, does it mean the remaining locations of this physical BRAM go to waste?

If you choose the minimum area algorithm, it will use the 8K x 2 configuration (one 18 Kb block) to implement this, so the other independent 18 Kb block of the same 36 Kb BRAM can be used by another instance. There is no way to use the remaining locations (8K - 1100) for other instances.
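The mapping described in this reply can be approximated with a small estimator. It is a naive sketch that tiles a single aspect ratio over the requested array and keeps the first configuration with the fewest primitives; it is not the Block Memory Generator's actual algorithm. The aspect ratios are copied from the list above; the function names are made up for illustration.

```python
from math import ceil

# 18 Kb aspect ratios as (depth, width), in the order listed in the reply.
# The x9, x18 and x36 widths include the parity bits.
CONFIGS_18K = [
    (16384, 1), (8192, 2), (4096, 4), (2048, 9), (1024, 18), (512, 36),
]


def map_to_18k(depth, width):
    """Return (primitive count, (cfg_depth, cfg_width)) for the cheapest tiling.

    Ties are resolved in favour of the configuration listed first.
    """
    best = None
    for cfg_depth, cfg_width in CONFIGS_18K:
        count = ceil(depth / cfg_depth) * ceil(width / cfg_width)
        if best is None or count < best[0]:
            best = (count, (cfg_depth, cfg_width))
    return best


def fill_ratio(depth, width):
    """Fraction of the allocated bits that the array really uses."""
    count, (cfg_depth, cfg_width) = map_to_18k(depth, width)
    return depth * width / (count * cfg_depth * cfg_width)


if __name__ == "__main__":
    count, cfg = map_to_18k(1100, 2)
    print("primitives:", count, "configuration:", cfg)
    print("unused locations:", cfg[0] - 1100)
    print("fill ratio:", fill_ratio(1100, 2))
```

For the 1100 x 2 example this selects one 8K x 2 primitive with 7092 unused locations, which illustrates why the sum of bits in the RTL can be far below the reported tile count.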

 

 

 

 

 


Hi @pthakare

Thanks a lot, I now have a much better understanding of my BRAM utilization numbers and the fragmentation.

I have a question:

1. Is there a way to specify that a distributed memory be initialized with values from a .coe file? If yes, how?

2. If not, i.e. if .coe files can only initialize LogiCORE BRAM modules, then I might not have the luxury of writing Verilog code to initialize each location in each distributed memory. In that case, is it possible to instantiate a LogiCORE BRAM with its associated .coe file, but specify through directives that it be realized using LUTs rather than block RAM?
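The thread does not answer this directly. One possible workaround, assuming the distributed memory is described in Verilog and can be loaded from a plain hex file with the standard $readmemh system task, is to convert the .coe contents offline instead of typing each location by hand. The sketch below assumes the usual .coe layout (a memory_initialization_radix statement, a memory_initialization_vector statement, values separated by commas or whitespace, comment lines starting with a semicolon); the function name is hypothetical.

```python
import re


def coe_to_hex_words(coe_text):
    """Convert the text of a .coe file into a list of hex strings, one per word."""
    # Drop comment lines (assumed to start with ';').
    lines = [ln for ln in coe_text.splitlines()
             if not ln.lstrip().startswith(";")]
    body = " ".join(lines)

    radix_match = re.search(
        r"memory_initialization_radix\s*=\s*(\d+)\s*;", body, re.I)
    vector_match = re.search(
        r"memory_initialization_vector\s*=\s*([^;]*);", body, re.I)
    if not radix_match or not vector_match:
        raise ValueError("missing radix or vector statement")

    radix = int(radix_match.group(1))
    tokens = [t for t in re.split(r"[,\s]+", vector_match.group(1)) if t]
    # Re-emit every word in hexadecimal, one entry per memory location.
    return [format(int(tok, radix), "x") for tok in tokens]


if __name__ == "__main__":
    sample = (
        "; example contents\n"
        "memory_initialization_radix=2;\n"
        "memory_initialization_vector=\n"
        "01, 10,\n"
        "11;\n"
    )
    # One hex word per line is the form $readmemh reads.
    print("\n".join(coe_to_hex_words(sample)))
```

Whether a given IP core or XPM macro accepts such a file directly should be checked in its own documentation.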

Your replies have been very helpful so far. Thanks

Thanks and sincerely

Bhawandeep Singh

Scholar u4223374
Registered: 04-26-2015

I don't see how moving from BRAM to LUTRAM (distributed memory) is going to help. LUTRAM is essentially turning SLICEMs into RAM - and you're already over the slice limit. You might get BRAM utilization down to an acceptable level, but if it just results in 300% slice utilization then you still can't build the design.

It looks like either you need to do some manual packing (eg. I'm pretty sure that Vivado will not put two arrays into a single BRAM_18K, even though technically if each array only needs one port then this is manageable) or move to a larger chip (eg. Zynq UltraScale+).
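The manual packing idea mentioned above (two arrays sharing one dual-port block, each using only one port) can be illustrated with a small behavioural model. This is a sketch of the addressing scheme only, not HDL and not something the tools do automatically; the class name, region layout, and port labels are hypothetical.

```python
class SharedBram:
    """Behavioural model of one dual-port memory shared by two arrays.

    Array "a" is reached only through port A and array "b" only through
    port B; each gets its own address region inside the same storage.
    """

    def __init__(self, depth, width, regions):
        # regions maps a port label to (base address, length).
        self.mask = (1 << width) - 1
        self.mem = [0] * depth
        self.regions = regions
        spans = sorted(regions.values())
        for base, length in spans:
            if base < 0 or base + length > depth:
                raise ValueError("region does not fit in the memory")
        for (b0, l0), (b1, _) in zip(spans, spans[1:]):
            if b0 + l0 > b1:
                raise ValueError("regions overlap")

    def _physical(self, port, index):
        base, length = self.regions[port]
        if not 0 <= index < length:
            raise IndexError(f"index {index} outside array {port!r}")
        return base + index  # fixed offset added to the logical address

    def write(self, port, index, value):
        self.mem[self._physical(port, index)] = value & self.mask

    def read(self, port, index):
        return self.mem[self._physical(port, index)]


if __name__ == "__main__":
    # Two 1100 x 2 arrays placed in one 8K x 2 block.
    bram = SharedBram(8192, 2, {"a": (0, 1100), "b": (2048, 1100)})
    bram.write("a", 5, 1)
    bram.write("b", 5, 2)
    print(bram.read("a", 5), bram.read("b", 5))
```

In hardware the offset would typically be a constant on the upper address bits of each port, so the cost is a wrapper per packed pair rather than extra memory.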

Hi @u4223374

Thanks for the reply. This is confusing. The utilization report shows slice over-utilization, but not LUT over-utilization. In my case, LUT-as-logic utilization is ~90% and LUT-as-DRAM utilization is <25%, but slice utilization is ~117%. How exactly do these numbers fit together? Does slice refer to any and every kind of logical block on the FPGA, including DSPs, BRAMs etc., which means slice over-utilization in my context refers to BRAM over-utilization?

Either way, do you know if I can go ahead with either of the two approaches I asked about in my previous post? I ask because many BRAMs in our project are fragmented. I just want, and need, to be able to avoid using BRAM for those cases.

Thanks and sincerely

Bhawandeep Singh 

Moderator

Hi @bhawandeepsingh

Does slice refer to any and every kind of logical block on the FPGA, including DSPs, BRAMs etc., which means slice over-utilization in my context refers to BRAM over-utilization?

-> A slice contains LUTs, storage elements (flip-flops or latches), multiplexers, and carry logic. DSP and BRAM resources are not part of the configurable logic blocks (a CLB contains SLICEL and SLICEM). CLBs in 7 series devices contain either two SLICELs or one SLICEL and one SLICEM. Only SLICEM supports storing data as distributed RAM.

The CLB details are documented in UG474.

You mentioned that slice utilization in your design is 117%. Basically, it consists of SLICEL (logic) and SLICEM (logic + DRAM) utilization. As @u4223374 mentioned, moving BRAM to distributed RAM will not help if slice utilization is already exceeded.

Scholar u4223374

@bhawandeepsingh That means that you're using 90% of the LUTs for logic (which is already extremely high; you will have trouble getting timing closure at any reasonable clock speed) and another 25% for LUT RAM. This is a total of 115% of the LUTs used, which is clearly too many.

BRAM and DSPs are not included in the slice/LUT utilization counts; the numbers you see are probably because Vivado is actively using LUT RAM to try to reduce BRAM usage, but there simply isn't enough space available.
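One caveat on the 115% sum: it holds only if both percentages are fractions of the same LUT total. In the first post the LUT-as-memory figure is given against 70400 memory-capable LUTs, so it may be worth putting both numbers on one denominator before drawing conclusions. The sketch below does that; the total of 218600 LUTs for the XC7Z045 is an assumption that does not appear in this thread and should be checked against the device datasheet, and the ~90% logic share is taken to be relative to that total.

```python
TOTAL_LUTS = 218_600       # assumption: total LUTs on the XC7Z045, verify in the datasheet
MEM_CAPABLE_LUTS = 70_400  # from the first post
LUTRAM_USED = 14_245       # from the first post
LOGIC_SHARE = 0.90         # "~90%" LUT-as-logic figure from the thread


def combined_lut_share(logic_share, lutram_used, total_luts):
    """Share of all LUTs used when logic and LUT RAM use one denominator."""
    return (logic_share * total_luts + lutram_used) / total_luts


def lutram_share(lutram_used, mem_capable_luts):
    """Share of the memory-capable LUTs that are used as RAM."""
    return lutram_used / mem_capable_luts


if __name__ == "__main__":
    print("LUT RAM share of memory-capable LUTs:",
          round(lutram_share(LUTRAM_USED, MEM_CAPABLE_LUTS), 3))
    print("combined share of all LUTs:",
          round(combined_lut_share(LOGIC_SHARE, LUTRAM_USED, TOTAL_LUTS), 3))
```

Under these assumptions the combined LUT share stays below 100%, which would be consistent with the report flagging slices rather than LUTs; slice usage also depends on how LUTs and registers are packed, so the two figures need not move together.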
