09-26-2020 09:12 PM
We need urgent technical support for System Generator/Model Composer as used in Vivado 2020. Please help us resolve the issues described below.
1. We are converting a MATLAB Simulink model implemented with System Generator into VHDL modules, but it uses too many LUT and DSP resources.
We are replacing an Intel FPGA with a Xilinx one. For the same function, the utilization with Intel's DSP Builder and Quartus (used in our previous projects) is much lower than with System Generator/Model Composer. Please see the difference in the picture below:
From the picture above, it appears that System Generator uses too many LUTs instead of Block RAM. Besides that, too many DSPs are used, exceeding the maximum number of DSPs available. How can we optimize our model or configuration to use fewer resources? We hope the utilization can be even lower than Intel's.
2. Does Model Composer usually give lower utilization than System Generator for the same model? It seems so in the picture below, which shows the Model Composer utilization for the same function. If so, we want to convert the Model Composer design into VHDL files, but it has only three output types: IP catalog, C++ and System Generator. If we use IP catalog, it generates an IP with an AXI-Lite interface, whereas we would prefer the input and output ports listed in the model rather than AXI-Lite. So how can we get VHDL files with the model's input and output ports, or how can we convert the AXI-Lite interface to those ports?
Besides that, we also want to optimize the Model Composer design to use fewer resources than Intel's DSP Builder does.
Looking forward to hearing from you soon. Thanks!
09-27-2020 05:05 AM
Some more information:
I use Vivado 2020.1 and tried four different synthesis strategies, shown below. But the results are similar and all of them use too many LUT resources.
09-27-2020 07:30 PM
I have forwarded your questions to the moderator and asked them to engage.
09-27-2020 10:54 PM
Hi @wwlcumt ,
Since the number of BRAMs utilized is zero in the report, it appears that LUTs have been used for memory in the design.
I am not sure what blocks are used in the System Generator design, but Sysgen has options to use either BRAM or Distributed RAM. Could you please check whether the memory options are selected appropriately?
The same applies to DSP slices. The blocks that can use DSP slices have options to balance between DSP slices and LUT-based multipliers and adders.
So the design needs to be checked for the resource options used in the Sysgen blocks.
09-27-2020 11:31 PM
Thanks very much for your reply.
1. How do I choose between BRAM and Distributed RAM? If I want to use BRAM for the block "AddSub", should I check the checkbox circled in the picture below?
2. And if I want to reduce the DSP utilization of the block "Mult", should I check the checkbox circled in the picture below?
3. If I use Model Composer, how should I set these options?
09-28-2020 12:37 AM
Hi @wwlcumt ,
The ADDSUB and MULT IP do not use any memory. They use only LUTs/FFs to implement the arithmetic functions.
If DSP slices are not required for ADD/SUB or Multiply, please implement ADD/SUB using Fabric and uncheck the "use embedded multiplier" option for Mult.
Identify which block or subsystem occupies the most resources from the synthesis report. Alternatively, run a resource utilization report from Sysgen as shown below and then check which block is occupying the most LUTs. If there is a memory block such as a FIFO, SPRAM or DPRAM, implement it using BRAM.
09-28-2020 12:52 AM
I don't know why this module uses so many LUTs. The utilization of the module I posted in the last message is shown in the picture below.
Please help me reduce the LUTs.
09-28-2020 03:08 AM - edited 09-28-2020 03:10 AM
Hi @wwlcumt ,
The ADD/SUB blocks seem to take more LUTs here because the function is implemented using "Fabric", so the subtractor takes up LUTs based on the width of the operands.
You can avoid this by re-targeting the arithmetic functions to DSP48.
Use DSP48 for the "a+b" and "a-b" blocks. The Mult blocks are already targeting the DSP48 blocks.
Please also check whether the design uses DSP48 blocks for any non-arithmetic operations, such as constant blocks. These can be moved to LUTs, freeing up DSP slices.
09-28-2020 05:40 AM
I don't have enough DSP48 slices to use for ADD/SUB.
I want the ADD/SUB blocks to use Block RAMs and registers instead of LUTs or DSP48.
In Quartus and DSP Builder, I use the same model and it uses lots of Block RAMs and registers. You can see the picture I posted before.
So I really wonder how to move logic from LUTs to Block RAMs or registers when using System Generator or Model Composer. Thanks!
09-28-2020 08:08 AM
The subsystem "speed up" takes a part of the LUTs and 2 DSP slices. However, other subsystems in the design take a lot of DSP slices and LUTs. The adders and MULT IP do not use BRAM blocks in this case.
If there are memories in those subsystems, consider changing them from Distributed RAM to BRAM blocks and moving a few functions from DSP slices to LUTs.
09-30-2020 12:39 AM
Our model has no RAM functions; it is mainly used for mathematical operations such as addition, subtraction, multiplication and division. Intel's DSP Builder can use time sharing to reuse the logic in the model, and it uses a lot of BRAM to implement the time sharing. Do Model Composer, System Generator or other Xilinx tools offer something similar that implements time-shared multiplexing with a large amount of BRAM, or other methods to time-share the add, subtract, multiply and divide logic?
10-01-2020 12:58 AM - edited 10-01-2020 01:04 AM
Hi @wwlcumt ,
Time sharing of DSP resources is possible. However, it must be designed explicitly. The DSP slices need to run at a higher clock frequency than the input sample frequency. The extra clock cycles can then be used for processing multiple samples.
UG579, page 65 describes this technique.
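As a rough illustration of the idea (not taken from UG579 and not a Sysgen or Model Composer API), the Python sketch below models one shared multiplier that is handed a different operand pair on each clock cycle within a sample period. The function name `time_shared_multiply` and its parameters are hypothetical, and pipeline latency is ignored.

```python
def time_shared_multiply(pairs, clocks_per_sample):
    """Behavioral model of ONE multiplier shared across several products.

    pairs             -- list of (a, b) operand pairs needed per sample period
    clocks_per_sample -- fast clock cycles available in one sample period
    (Both names are hypothetical; pipeline latency is not modelled.)
    """
    if len(pairs) > clocks_per_sample:
        raise ValueError("not enough clock cycles to share a single multiplier")

    results = []
    for cycle in range(clocks_per_sample):
        if cycle < len(pairs):
            a, b = pairs[cycle]      # input mux selects this cycle's operands
            results.append(a * b)    # the single multiplier does the work
        # on the remaining cycles the multiplier is idle
    return results


# Four products computed by one multiplier in an 8-cycle sample period
products = time_shared_multiply([(1, 2), (3, 4), (5, 6), (7, 8)], 8)
```

In hardware the operand list would come from a small memory or mux and the results would be captured into registers, which is the part that has to be built explicitly in the model.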
10-06-2020 10:56 PM
Hi, thanks for your reply.
1. How do I set up time sharing in Model Composer? The FPGA clock frequency is 50 MHz and the sample frequency is 50 kHz.
2. How can LUT/register resources be time-shared? This is very important for us.
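For the rates quoted above, the cycle budget behind time sharing can be worked out with a short Python sketch. The helper names and the figure of 2500 operations per sample are assumptions for illustration only, and the estimate ignores pipeline latency and control overhead.

```python
import math


def cycles_per_sample(clock_hz, sample_hz):
    """Fast clock cycles available within one sample period."""
    return clock_hz // sample_hz


def units_needed(ops_per_sample, clock_hz, sample_hz):
    """Lower bound on shared arithmetic units for a given workload."""
    return math.ceil(ops_per_sample / cycles_per_sample(clock_hz, sample_hz))


# 50 MHz clock and 50 kHz sample rate, as stated in the post
budget = cycles_per_sample(50_000_000, 50_000)

# Hypothetical workload: 2500 multiplications per sample
multipliers = units_needed(2500, 50_000_000, 50_000)
```

With these numbers each shared unit has 1000 clock cycles per sample, so in principle one multiplier or adder could serve up to 1000 operations per sample if the surrounding muxing and storage are designed for it.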