Visitor iryont2
Registered: ‎06-17-2018

BRAM to DSP48A1 - setup path not met


Hello

 

The device I'm currently working with is a Spartan-6 XC6SLX9. My design stores coefficients in a dual-port read-only RAM (BRAM), which XST infers correctly:

 

 

Found 1024x35-bit dual-port Read Only RAM <Mram_coefficients_int> for signal <coefficients_int>.

 

 

Another module is a 32x35 multiply-accumulate unit, inferred as 4x DSP48A1:

 

 

Synthesizing (advanced) Unit <signed_mac>.
The following registers are absorbed into accumulator <pipe_int_6>: 1 register on signal <pipe_int_6>.
Found pipelined multiplier on signal <input1_int[31]_input2_int[34]_MuLt_0_OUT>:
- 6 pipeline level(s) found in a register connected to the multiplier macro output.
Pushing register(s) into the multiplier macro.

 

 

Inferred macros:

 

 

Advanced HDL Synthesis Report

Macro Statistics
# RAMs                                                 : 3
 1024x35-bit dual-port block Read Only RAM             : 1
 64x32-bit dual-port distributed RAM                   : 2
# Multipliers                                          : 4
 35x32-bit registered multiplier                       : 4

 

 

As far as I'm aware, BRAM should operate at up to 280 MHz or so, and the same goes for the DSP48A1 slice. The two are directly connected (dual-port read-only RAM to multiply-accumulate unit). However, I'm having a hard time meeting the 200 MHz clock constraint; it fails by rather small margins:

 

 

 ================================================================================ 
 Timing constraint: PERIOD analysis for net "CLK_200MHz_int" derived from  NET "CLK_50MHz_IBUFG" PERIOD = 20 ns HIGH 50% INPUT_JITTER 0 ns;  divided by 4.00 to 5 nS and duty cycle corrected to HIGH 2.500 nS   
 For more information, see Period Analysis in the Timing Closure User Guide (UG612). 
  48384 paths analyzed, 8548 endpoints analyzed, 21 failing endpoints 
  21 timing errors detected. (21 setup errors, 0 hold errors, 0 component switching limit errors) 
  Minimum period is   5.367ns. 
 -------------------------------------------------------------------------------- 
  
 Paths for end point mac0/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT (DSP48_X0Y0.A7), 1 path 
 -------------------------------------------------------------------------------- 
 Slack (setup path):     -0.367ns (requirement - (data path - clock path skew + uncertainty)) 
   Source:               coeffs_rom_Mram_coefficients_int1 (RAM) 
   Destination:          mac0/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT (DSP) 
   Requirement:          5.000ns 
   Data Path Delay:      5.212ns (Levels of Logic = 0) 
   Clock Path Skew:      0.030ns (0.697 - 0.667) 
   Source Clock:         CLK_200MHz_int_BUFG rising at 0.000ns 
   Destination Clock:    CLK_200MHz_int_BUFG rising at 5.000ns 
   Clock Uncertainty:    0.185ns 
  
   Clock Uncertainty:          0.185ns  ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE 
     Total System Jitter (TSJ):  0.070ns 
     Total Input Jitter (TIJ):   0.000ns 
     Discrete Jitter (DJ):       0.300ns 
     Phase Error (PE):           0.000ns 
  
   Maximum Data Path at Slow Process Corner: coeffs_rom_Mram_coefficients_int1 to mac0/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT 
     Location             Delay type         Delay(ns)  Physical Resource 
                                                        Logical Resource(s) 
     -------------------------------------------------  ------------------- 
     RAMB16_X0Y14.DOB7    Trcko_DOB             2.100   coeffs_rom_Mram_coefficients_int1 
                                                        coeffs_rom_Mram_coefficients_int1 
     DSP48_X0Y0.A7        net (fanout=3)        2.951   coeffs_data_outa_int<7> 
     DSP48_X0Y0.CLK       Tdspdck_A_A1REG       0.161   mac0/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT 
                                                        mac0/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT 
     -------------------------------------------------  --------------------------- 
     Total                                      5.212ns (2.261ns logic, 2.951ns route) 
                                                        (43.4% logic, 56.6% route) 
  
 -------------------------------------------------------------------------------- 
  
 Paths for end point mac0/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT1 (DSP48_X0Y1.A3), 1 path 
 -------------------------------------------------------------------------------- 
 Slack (setup path):     -0.274ns (requirement - (data path - clock path skew + uncertainty)) 
   Source:               coeffs_rom_Mram_coefficients_int2 (RAM) 
   Destination:          mac0/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT1 (DSP) 
   Requirement:          5.000ns 
   Data Path Delay:      5.104ns (Levels of Logic = 0) 
   Clock Path Skew:      0.015ns (0.776 - 0.761) 
   Source Clock:         CLK_200MHz_int_BUFG rising at 0.000ns 
   Destination Clock:    CLK_200MHz_int_BUFG rising at 5.000ns 
   Clock Uncertainty:    0.185ns 
  
   Clock Uncertainty:          0.185ns  ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE 
     Total System Jitter (TSJ):  0.070ns 
     Total Input Jitter (TIJ):   0.000ns 
     Discrete Jitter (DJ):       0.300ns 
     Phase Error (PE):           0.000ns 

 

It is obvious that I cannot do anything about the BRAM delay (2.1 ns), since it is exactly what the Spartan-6 BRAM documentation specifies. However, what about the inputs of the DSP48A1, which take a lot of setup time? Is it possible to reduce that? If so, how?
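
For reference, the arithmetic in the timing report can be reproduced with a few lines of Python. This is only a sketch that plugs the reported numbers into the two formulas the report itself prints; the function and variable names are made up for illustration. It also shows how the 5.212 ns data path splits up: in this report the DSP setup term (Tdspdck_A_A1REG) is 0.161 ns, while the net between the BRAM and the DSP accounts for 2.951 ns.

```python
import math

def clock_uncertainty(tsj, tij, dj, pe):
    """Formula as printed in the report: ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE."""
    return (math.sqrt(tsj ** 2 + tij ** 2) + dj) / 2 + pe

def setup_slack(requirement, data_path, skew, uncertainty):
    """Formula as printed in the report: requirement - (data path - skew + uncertainty)."""
    return requirement - (data_path - skew + uncertainty)

# All values in ns, copied from the report above.
unc = clock_uncertainty(tsj=0.070, tij=0.000, dj=0.300, pe=0.000)

# First failing path (DSP48_X0Y0.A7): Trcko_DOB + net + Tdspdck_A_A1REG.
bram_clk_to_out, net_delay, dsp_setup = 2.100, 2.951, 0.161
data_path = bram_clk_to_out + net_delay + dsp_setup

slack_a7 = setup_slack(5.000, data_path, skew=0.030, uncertainty=unc)
slack_a3 = setup_slack(5.000, 5.104, skew=0.015, uncertainty=unc)

min_period = 5.000 - slack_a7          # worst path sets the minimum period
fmax_mhz = 1000.0 / min_period
route_share = net_delay / data_path

print(f"uncertainty  = {unc:.3f} ns")
print(f"slack (A7)   = {slack_a7:.3f} ns")
print(f"slack (A3)   = {slack_a3:.3f} ns")
print(f"min period   = {min_period:.3f} ns ({fmax_mhz:.1f} MHz)")
print(f"routing share of data path = {route_share:.1%}")
```

The results match the report: 0.185 ns uncertainty, slacks of -0.367 ns and -0.274 ns, and a minimum period of 5.367 ns.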


Accepted Solutions
Moderator
Registered: ‎11-04-2010

Re: BRAM to DSP48A1 - setup path not met

Hi @iryont2,
Consider adding a pipeline register between the BRAM and the DSP.
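
One side effect of the suggestion is worth keeping in mind: an extra register on the coefficient path delays the coefficients by one clock, so the other MAC operand has to be delayed by the same amount or the products pair up wrongly. Below is a minimal cycle-level sketch in Python, not HDL and not the poster's design. The helper names, the coefficient values and the sample values are all hypothetical, and the ROM is assumed to have a one-cycle synchronous read.

```python
from collections import deque

def delayed(stream, cycles, fill=0):
    """Model `cycles` register stages: the output lags the input by that many clocks."""
    pipe = deque([fill] * cycles)
    out = []
    for value in stream:
        pipe.append(value)
        out.append(pipe.popleft())
    return out

def mac(coeff_stream, sample_stream):
    """Multiply-accumulate two streams, pairing values that arrive in the same cycle."""
    acc = 0
    for c, s in zip(coeff_stream, sample_stream):
        acc += c * s
    return acc

coeffs = [3, -1, 4, 1]    # hypothetical ROM contents
samples = [2, 7, 1, 8]    # hypothetical data operand
flush = [0, 0]            # extra cycles so the delayed values drain out

rom_out = delayed(coeffs + flush, 1)       # assumed synchronous BRAM read: 1 cycle
piped = delayed(rom_out, 1)                # the added pipeline register: +1 cycle

aligned = delayed(samples + flush, 2)      # data operand delayed to match
misaligned = delayed(samples + flush, 1)   # extra cycle forgotten

reference = sum(c * s for c, s in zip(coeffs, samples))
print("reference :", reference)
print("aligned   :", mac(piped, aligned))
print("misaligned:", mac(piped, misaligned))
```

Only the aligned case reproduces the reference sum. In the real design this corresponds to adding a matching register stage on the data input or adjusting the control timing accordingly.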
Visitor iryont2
Registered: ‎06-17-2018

Re: BRAM to DSP48A1 - setup path not met


Ahh, yeah, I thought about it before, but now I remember why I didn't go with it: MAP couldn't place the DSP48A1 slices. However, after experimenting with extra cost tables and the starting placer cost table, it mapped correctly and timing is met. Thank you :)

 

Edit1:

 

After adding pipelining to the core, MAP again could not place the DSP48A1 slices. I will experiment with the cost tables and hopefully it will map again. The MAP output follows:

 

Phase 10.8  Global Placement
.......................
..................................................................................................................................................
..................................................................
................................
ERROR:Place:543 - This design does not fit into the number of slices available
   in this device due to the complexity of the design and/or constraints.

   Unplaced instances by type:

     DSP48A1    4 (25.0)

   Please evaluate the following:

   - If there are user-defined constraints or area groups:
     Please look at the "User-defined constraints" section below to determine
     what constraints might be impacting the fitting of this design.
     Evaluate if they can be moved, removed or resized to allow for fitting.
     Verify that they do not overlap or conflict with clock region restrictions.
     See the clock region reports in the MAP log file (*map) for more details
     on clock region usage.

   - If there is difficulty in placing LUTs:
     Try using the MAP LUT Combining Option (map lc area|auto|off).

   - If there is difficulty in placing FFs:
     Evaluate the number and configuration of the control sets in your design.

   The following instances are the last set of instances that failed to place:

   0. Placer RPM "Ppc" (size: 4)
      DSP48A1 mac3/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT1
      DSP48A1 mac3/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT2
      DSP48A1 mac3/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT
      DSP48A1 mac3/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT3

Phase 10.8  Global Placement (Checksum:2c772dda) REAL time: 17 secs 

Phase 11.9  Local Placement Optimization
ERROR:Place:543 - This design does not fit into the number of slices available
   in this device due to the complexity of the design and/or constraints.

   Unplaced instances by type:

     DSP48A1    4 (25.0)

   Please evaluate the following:

   - If there are user-defined constraints or area groups:
     Please look at the "User-defined constraints" section below to determine
     what constraints might be impacting the fitting of this design.
     Evaluate if they can be moved, removed or resized to allow for fitting.
     Verify that they do not overlap or conflict with clock region restrictions.
     See the clock region reports in the MAP log file (*map) for more details
     on clock region usage.

   - If there is difficulty in placing LUTs:
     Try using the MAP LUT Combining Option (map lc area|auto|off).

   - If there is difficulty in placing FFs:
     Evaluate the number and configuration of the control sets in your design.

   The following instances are the last set of instances that failed to place:

   0. Placer RPM "Ppc" (size: 4)
      DSP48A1 mac3/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT1
      DSP48A1 mac3/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT2
      DSP48A1 mac3/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT
      DSP48A1 mac3/Mmult_input1_int[31]_input2_int[34]_MuLt_0_OUT3

ERROR:Place:120 - There were not enough sites to place all selected components.
   Some of these failures can be circumvented by using an alternate algorithm
   (though it may take longer run time). If you would like to enable this
   algorithm please set the environment variable XIL_PAR_ENABLE_LEGALIZER to 1
   and try again 


Phase 11.9  Local Placement Optimization (Checksum:2c772dda) REAL time: 17 secs 

Total REAL time to Placer completion: 17 secs 
Total CPU  time to Placer completion: 18 secs 
ERROR:Pack:1654 - The timing-driven placement phase encountered an error.

Mapping completed.
See MAP report file "core_map.mrp" for details.
Problem encountered during the packing phase.

Design Summary
--------------
Number of errors   :   5
Number of warnings :   0

 

Edit2:

 

Yep, it mapped again after I changed the Starting Placer Cost Table value.
