802 Views
Registered: ‎04-11-2019

Improve design to get better timing

Hello,
I am currently working on a project with the xc7a100tfgg484-1 device.
The project contains DDC, DDR3, Ethernet and some other modules.

Unfortunately, I have some timing problems: WNS ~ -10ns, TNS ~ -1000ns.
These problems sometimes result in device failures.
I think the cause of the problem is a register with high fanout.
It stores data that is then written to the setup registers of all device modules.

I tried to reduce the fanout by duplicating the register.
But that led to timing problems in other places, with about the same WNS and TNS.

Fixing these timing problems creates new problems in other places.
Maybe that is because of the high resource utilization.

Why doesn't the router try to improve the design to reach better WNS and TNS values?
What should I try in order to solve the problem?
Maybe I should select another implementation strategy?

I use Vivado 2015.3.

Resource utilization:
LUT 83%
LUTRAM 76%
FF 50%
BRAM 94%
DSP 86%
IO 61%
BUFG 34%
MMCM 83%
PLL 17%


4 Replies
793 Views
Registered: ‎06-21-2017

You need to either reduce resource usage or use a larger FPGA.  83% LUT usage, 94% BRAM and 86% DSP usage is straining the limits of routability.  A newer version of Vivado may optimize your design more efficiently, but I think you will need to recode to use fewer resources or move to a larger FPGA.
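
To see where the pressure actually is, a few standard Vivado reports can be run on the open implemented (or synthesized) design. This is only a minimal sketch: the report file names and the net and path counts are arbitrary choices, not values taken from this thread.

```tcl
# Run in the Vivado Tcl console with the design open.
# File names and counts below are arbitrary (assumptions).

# Per-resource utilization (LUT, LUTRAM, FF, BRAM, DSP, ...)
report_utilization -file utilization.rpt

# The 25 highest-fanout nets, with load types and timing slack,
# to check whether the suspected setup-data register really dominates
report_high_fanout_nets -timing -load_types -max_nets 25

# Overall WNS/TNS with the 10 worst paths per group
report_timing_summary -max_paths 10 -file timing_summary.rpt
```

The worst paths in the timing summary also show whether the failures are inside one clock domain or between two clocks, which changes what kind of fix is appropriate.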

Teacher
784 Views
Registered: ‎11-14-2011

Without seeing your architecture it's almost impossible to say where you may be able to optimise further.

What clock frequency(ies) are you targeting? If you have multiple domains, is it possible to combine clock domains?

 

Certainly I agree with @bruce_karaffa that a larger device will reduce routing congestion and placement issues. 

----------
"That which we must learn to do, we learn by doing." - Aristotle
743 Views
Registered: ‎04-11-2019

Thanks for the answers.
I tried a newer Vivado version (2018.2). It gave very similar timing results.

In my design I have multiple clock domains:
300 MHz DDR3 clock (23K loads);
135 MHz DDC clock (52K loads);
125 MHz Ethernet TX clock (52K loads);
125 MHz Ethernet RX clock (6K loads).
The DDR3 and DDC clocks are generated by an MMCME2_ADV from the same physical clock.
Unfortunately, I don't see any way to combine clock domains.

Maybe adding proper user constraints would help me improve the existing design. I may only think so because I understand the constraints poorly.

For example, I use the following logic to write the setup (control) registers in the different modules. I write data to all setup registers in the CLK_RX domain (125 MHz Ethernet RX clock) and then read them in another clock domain, for example CLK_DDC (135 MHz clock). To move the data from the CLK_RX domain to the CLK_DDC domain, I use two additional flip-flop stages. This is the classic clock domain crossing synchronization problem. The setup registers are rewritten only at the next device setup (i.e. after a very long time interval). Which path is the false path: from REG_1 to REG_2, from REG_2 to REG_3, or from REG_3 to the downstream logic?

[Attached image: CDC.jpg]

Guide
729 Views
Registered: ‎01-23-2009

Without timing exceptions this is going to lead to large (and false) WNS failures.

Assuming this clock domain crossing circuit (CDCC) is sufficient (and I have no way of saying whether it is; the two flip-flop synchronizer is only appropriate for slow-changing single-bit signals), the path between REG_A and REG_B is not a synchronous path. In Vivado, however, all clocks are assumed to be synchronous by default, so the tools will time the path as if it were. This will likely result in a very small requirement (there is no nice common multiple between 135MHz and any of your other clock frequencies), which shows up as large violations.
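
As a rough illustration of how small that requirement can get, take the CLK_RX (125MHz) to CLK_DDC (135MHz) crossing described earlier. The numbers below assume ideal periods and that both clocks have a rising edge at t = 0; the real timer works with rounded periods and its own edge expansion, so the reported value may differ.

```latex
\begin{align*}
T_{\mathrm{RX}} &= \frac{1}{125\ \mathrm{MHz}} = 8\ \mathrm{ns}, \qquad
T_{\mathrm{DDC}} = \frac{1}{135\ \mathrm{MHz}} = \frac{200}{27}\ \mathrm{ns} \approx 7.407\ \mathrm{ns} \\
\Delta(m,k) &= m\,T_{\mathrm{DDC}} - k\,T_{\mathrm{RX}} = \frac{8\,(25m - 27k)}{27}\ \mathrm{ns} \\
\min_{\Delta > 0} \Delta &= \frac{8}{27}\ \mathrm{ns} \approx 0.296\ \mathrm{ns} \qquad (m = 13,\ k = 12)
\end{align*}
```

Here m and k count capture and launch edges. Because 25 and 27 share no common factor, some pair of edges ends up only about 0.3ns apart, and a setup requirement that small is far too tight for a path between two separate clock trees.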

Again, assuming this CDCC is sufficient, then the minimum properties/constraints are:

  • Marking REG_B and REG_C as a synchronizer chain
    • This can be done in the RTL or the XDC with the ASYNC_REG property
    • RTL (Verilog)
      • (* ASYNC_REG = "TRUE" *) reg REG_B, REG_C;
    • XDC
      • set_property ASYNC_REG TRUE [get_cells {REG_B_reg REG_C_reg}]
  • Putting some exception on the path from REG_A to REG_B
    • Depending on the rest of the system it may be OK to declare the path false
      • set_false_path -from [get_cells REG_A_reg] -to [get_cells REG_B_reg]
    • Or a safer solution is to use set_max_delay -datapath_only
      • The value of the max delay also depends on the system, but it is safe to use the period of the faster of the two clocks (so 7.4ns)
      • set_max_delay -datapath_only -from [get_cells REG_A_reg] -to [get_cells REG_B_reg] 7.4
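
Pulled together, the steps above might look like the following in the Vivado Tcl console. This is a sketch under assumptions: the name patterns `*REG_A_reg*`, `*REG_B_reg*` and `*REG_C_reg*` and the clock names `CLK_RX` and `CLK_DDC` are placeholders for whatever the cells and clocks are actually called in the netlist. The wildcards are meant to pick up every bit of a multi-bit setup register, which is only reasonable if the data really is static when it is sampled.

```tcl
# Placeholder name patterns (assumptions): adjust to the real netlist names.
set reg_a [get_cells -hierarchical -filter {NAME =~ *REG_A_reg*}]
set reg_b [get_cells -hierarchical -filter {NAME =~ *REG_B_reg*}]
set reg_c [get_cells -hierarchical -filter {NAME =~ *REG_C_reg*}]

# Mark both synchronizer stages so the tools treat them as a chain
set_property ASYNC_REG TRUE $reg_b
set_property ASYNC_REG TRUE $reg_c

# Bound the crossing by the period of the faster clock (135MHz, about 7.4ns)
set_max_delay -datapath_only -from $reg_a -to $reg_b 7.4

# Check the result: the crossing should no longer be timed as a
# synchronous path. Clock names here are assumptions as well.
report_timing -from $reg_a -to $reg_b -max_paths 10
report_cdc -from [get_clocks CLK_RX] -to [get_clocks CLK_DDC]
```

Once the constraints work interactively, the `set_property` and `set_max_delay` lines can be moved into the project's XDC file.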

Avrum

 
