wwlcumt
Explorer
1,085 Views
Registered: ‎07-24-2020

Utilization optimization for System Generator/Model Composer model

Hi @zhendon 

We urgently need technical support for System Generator/Model Composer in Vivado 2020. Please help us resolve the issues described below.

1. We are converting a MATLAB Simulink model implemented with System Generator into VHDL modules, but the result uses too many LUT and DSP resources.

We are replacing an Intel FPGA with a Xilinx one. For the same function, the utilization with Intel's DSPBuilder and Quartus tools (used in our previous projects) is much lower than with System Generator/Model Composer. Please see the difference in the picture below:

t1.jpg

 

 

 

From the picture above, it appears that System Generator used too many LUTs instead of Block RAM. In addition, too many DSPs are used, exceeding the maximum number of DSPs available. How can we optimize our model or configuration to use fewer resources? We hope the utilization can be even lower than Intel's.

 

2. Does Model Composer usually have lower utilization than System Generator for the same model? It seems so from the picture below, which shows the Model Composer utilization for the same function. If so, we want to convert the Model Composer model into VHDL files, but it only has three output types: IP catalog, C++ and System Generator. If we use IP catalog, it generates an IP with an AXI-Lite interface, whereas we would prefer the input and output ports listed in the model rather than AXI-Lite. So how can we get VHDL files that expose the model's input and output ports, or how can we convert the AXI-Lite interface to those ports?

Besides that, we also want to optimize the Model Composer design to use fewer resources than Intel's DSPBuilder.

 

t2.jpg

 

Looking forward to hearing from you soon. Thanks!

0 Kudos
12 Replies
wwlcumt
Explorer
1,053 Views
Registered: ‎07-24-2020

Some more information:

I use Vivado 2020.1 and tried four different synthesis strategies, shown below. The results are similar, and all of them use too many LUT resources.

t3.png

0 Kudos
zhendon
Community Manager
1,016 Views
Registered: ‎08-30-2011

Hi @wwlcumt 

Thank you. 

I have forwarded your questions to the moderator and asked them to engage. 

 

-------------------------------------------------------------------------------------------------
Don’t forget to reply, kudo, and accept as solution.
-------------------------------------------------------------------------------------------------
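If the information provided solves your problem, please mark it as "Accept as Solution".
If you think the post was helpful, please click "Kudos". Thanks!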
-------------------------------------------------------------------------------------------------
0 Kudos
vkanchan
Xilinx Employee
995 Views
Registered: ‎09-18-2018

Hi @wwlcumt ,

Since the report shows zero BRAMs utilized, it appears that LUTs have been used for memory in the design.

I am not sure which blocks are used in the System Generator design, but Sysgen memory blocks let you choose between BRAM and Distributed RAM. Could you please check that the memory options are selected appropriately?

The same applies to DSP slices. The blocks that can map to DSP slices have options to balance between DSP slices and LUT-based multipliers and adders.

So the design needs to be checked for the resource options used in the Sysgen blocks.

0 Kudos
wwlcumt
Explorer
983 Views
Registered: ‎07-24-2020

Hi 

Thanks very much for your reply.

1. How do I choose between BRAM and Distributed RAM? If I want to use BRAM for the "AddSub" block, should I tick the checkbox circled in the picture below?

2. If I want to reduce the DSP utilization of the "Mult" block, should I tick the checkbox circled in the picture below?

Untitled11.png

3. If I use Model Composer, how should I set these options?

Thanks!

0 Kudos
vkanchan
Xilinx Employee
974 Views
Registered: ‎09-18-2018

Hi @wwlcumt ,

The ADDSUB and MULT IPs do not use any memory. They use only LUTs/FFs to implement the arithmetic functions.

If DSP slices are not required for ADD/SUB or Multiply, please implement ADD/SUB using Fabric and uncheck the "use embedded multiplier" option for Mult.

Identify which block or subsystem occupies the most resources from the synthesis report. Alternatively, run a resource utilization report from Sysgen as shown below and check which block is occupying the most LUTs. If there is a memory block such as a FIFO, SPRAM or DPRAM, implement it using BRAM.

resource_analysis.jpg
0 Kudos
wwlcumt
Explorer
966 Views
Registered: ‎07-24-2020

Hi

I don't know why this module uses so many LUTs. The utilization of the module I posted in my last message is shown in the picture below.

Please help me reduce the LUT usage.

Untitled11.png

0 Kudos
vkanchan
Xilinx Employee
942 Views
Registered: ‎09-18-2018

Hi @wwlcumt ,

The ADD/SUB block seems to take more LUTs here because the function is implemented using "Fabric", so the subtractor takes up LUTs in proportion to the width of the operands.

You can avoid this by re-targeting the arithmetic functions to DSP48.

Use DSP48 for the "a+b" and "a-b" blocks. The Mult blocks are already targeting DSP48.

Please also check whether the design uses DSP48 blocks for any non-arithmetic operations, such as constant blocks. These can be moved to LUTs, freeing up DSPs.

0 Kudos
wwlcumt
Explorer
928 Views
Registered: ‎07-24-2020

Hi

I don't have enough DSP48s to use for ADD/SUB.

I want the ADD/SUB block to use more Block RAMs and registers instead of LUTs or DSP48s.

In Quartus and DSPBuilder, the same model uses lots of Block RAMs and registers. You can see this in the picture I posted before.

So I really wonder how to move usage from LUTs to Block RAMs or registers when using System Generator or Model Composer. Thanks!

0 Kudos
vkanchan
Xilinx Employee
907 Views
Registered: ‎09-18-2018

Hi @wwlcumt 

The subsystem "speed up" takes only part of the LUTs and 2 DSP slices. Other subsystems in the design take a lot of DSP slices and LUTs. Adders and MULT IPs do not use BRAM blocks in this case.

If there are memories in those subsystems, consider changing them from Distributed RAM to BRAM and moving a few functions from DSP slices to LUTs.

 

resource_analysis2.jpg
wwlcumt
Explorer
864 Views
Registered: ‎07-24-2020

Hi

Our model has no RAM function; it is mainly used for mathematical operations such as addition, subtraction, multiplication and division. Intel's DSPBuilder can use time sharing to reuse the logic in the model, and it uses a lot of BRAM to implement that time sharing. Can Model Composer, System Generator or another Xilinx tool implement similar time-shared multiplexing with a large number of BRAMs, or is there another way to time-share the add, subtract, multiply and divide logic?

0 Kudos
vkanchan
Xilinx Employee
837 Views
Registered: ‎09-18-2018

Hi @wwlcumt ,

Time sharing of DSP resources is possible. However, it must be designed explicitly. The DSP slices need to run at a higher clock frequency than the input sample frequency. The extra clock cycles can then be used to process multiple samples.

UG579, page 65, describes this technique.
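
As a rough behavioral illustration of the idea (not Xilinx tool output and not taken from UG579), the Python sketch below compares several multiplies done in parallel with the same multiplies folded onto one shared unit, one operand pair per fast clock cycle. The names `SharedMultiplier`, `time_shared_products` and `cycles_per_sample` are hypothetical and exist only for this example.

```python
from typing import List, Sequence, Tuple


def parallel_products(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Reference: one multiplier per operand pair, all in the same cycle."""
    return [x * y for x, y in zip(a, b)]


class SharedMultiplier:
    """Hypothetical model of a single multiplier that handles one
    operand pair per fast clock cycle and counts the cycles it used."""

    def __init__(self) -> None:
        self.cycles = 0

    def mul(self, x: int, y: int) -> int:
        self.cycles += 1  # each multiply occupies the unit for one cycle
        return x * y


def time_shared_products(
    a: Sequence[int], b: Sequence[int], cycles_per_sample: int
) -> Tuple[List[int], int]:
    """Fold all multiplies of one sample period onto one shared unit.

    Fails if the sample period does not contain enough fast clock
    cycles to serialize every multiply.
    """
    if len(a) != len(b):
        raise ValueError("operand lists must have the same length")
    if len(a) > cycles_per_sample:
        raise ValueError("not enough clock cycles per sample to share one unit")
    unit = SharedMultiplier()
    results = [unit.mul(x, y) for x, y in zip(a, b)]  # one pair per cycle
    return results, unit.cycles


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [5, 6, 7, 8]
    shared, used = time_shared_products(a, b, cycles_per_sample=1000)
    print(shared == parallel_products(a, b), used)
```

The results match the parallel version while only one multiplier is modeled; the cost is that the operands and results have to be held somewhere (registers or memory) while they wait for their turn.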

0 Kudos
wwlcumt
Explorer
793 Views
Registered: ‎07-24-2020

Hi, thanks for your reply.

1. How do I set up time sharing in Model Composer? The FPGA frequency is 50 MHz and the sample frequency is 50 kHz.

2. How can LUT/register resources be time-shared? This is very important for us.
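
For reference, the clock figures above give the following cycle budget. This is only a back-of-the-envelope Python sketch: it assumes one operation per shared unit per clock cycle and ignores pipeline latency, data dependencies and operand storage. The operation counts are hypothetical placeholders, not numbers from our model.

```python
import math


def cycles_per_sample(clock_hz: int, sample_hz: int) -> int:
    """Number of whole FPGA clock cycles available per input sample."""
    return clock_hz // sample_hz


def min_shared_units(ops_per_sample: int, cycles: int) -> int:
    """Lower bound on the units needed if each unit does one op per cycle."""
    return math.ceil(ops_per_sample / cycles)


# 50 MHz FPGA clock and 50 kHz sample rate, as stated in this post.
budget = cycles_per_sample(50_000_000, 50_000)

# Hypothetical operation counts per sample, for illustration only.
example_ops = {"multiply": 300, "add_sub": 2500}

if __name__ == "__main__":
    print("cycles per sample:", budget)
    for name, count in example_ops.items():
        print(name, "->", min_shared_units(count, budget), "shared unit(s)")
```

With 1000 clock cycles per sample, a single shared unit could in principle serve up to 1000 operations of the same kind per sample under these assumptions.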

0 Kudos