Visitor
Registered: 04-07-2020

Vivado awful routing on a critical timing path

Hi,

I'm running Vivado in project mode. The design contains some block design (BD) IPs, such as Ethernet and DDR, on the VCU118 board. In addition, the project includes some RTL files of my own for packet processing.

Design utilization is pretty low at this stage:


+----------------------------+--------+-------+-----------+-------+
| Site Type                  |   Used | Fixed | Available | Util% |
+----------------------------+--------+-------+-----------+-------+
| CLB LUTs                   |  76701 |     0 |   1182240 |  6.49 |
|   LUT as Logic             |  73092 |     0 |   1182240 |  6.18 |
|   LUT as Memory            |   3609 |     0 |    591840 |  0.61 |
|     LUT as Distributed RAM |   2344 |     0 |           |       |
|     LUT as Shift Register  |   1265 |     0 |           |       |
| CLB Registers              | 176088 |     0 |   2364480 |  7.45 |
|   Register as Flip Flop    | 176075 |     0 |   2364480 |  7.45 |
|   Register as Latch        |     12 |     0 |   2364480 | <0.01 |
|   Register as AND/OR       |      1 |     0 |   2364480 | <0.01 |
| CARRY8                     |   3666 |     0 |    147780 |  2.48 |
| F7 Muxes                   |   4179 |     0 |    591120 |  0.71 |
| F8 Muxes                   |   1292 |     0 |    295560 |  0.44 |
| F9 Muxes                   |      0 |     0 |    147780 |  0.00 |
+----------------------------+--------+-------+-----------+-------+

After a valid synthesis and place and route (PnR), I get some timing violations. Reviewing one of the failing paths, I see that the negative slack is caused by an awful route on a net with only one load, running between two adjacent cells. See the attached snapshot.

I have a 3.1 ns path, but this net alone takes more than 2.5 ns to propagate!

Please advise.

Attachment: bad route.png
6 Replies
Voyager
Registered: 06-28-2018

Hi @celare 

Can you share the timing analysis results for that particular path?

Visitor
Registered: 04-07-2020

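This is the report. The problematic net (shown in bold in the original post) is the fo=1 net ending in rd_data_out[200]_i_2__1_n_0, with 2.547 ns of route delay: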

Max Delay Paths
--------------------------------------------------------------------------------------
Slack (VIOLATED) : -1.770ns (required time - arrival time)
Source: design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rddo_pop_ptr_reg[1]/C
(rising edge-triggered cell FDCE clocked by clk_out1_design_1_clk_wiz_app_a_clk_0 {rise@0.000ns fall@1.923ns period=3.846ns})
Destination: design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out_reg[200]/D
(rising edge-triggered cell FDCE clocked by clk_out1_design_1_clk_wiz_app_a_clk_0 {rise@0.000ns fall@1.923ns period=3.846ns})
Path Group: clk_out1_design_1_clk_wiz_app_a_clk_0
Path Type: Setup (Max at Slow Process Corner)
Requirement: 3.846ns (clk_out1_design_1_clk_wiz_app_a_clk_0 rise@3.846ns - clk_out1_design_1_clk_wiz_app_a_clk_0 rise@0.000ns)
Data Path Delay: 5.645ns (logic 0.448ns (7.936%) route 5.197ns (92.064%))
Logic Levels: 3 (LUT3=1 LUT5=1 LUT6=1)
Clock Path Skew: 0.063ns (DCD - SCD + CPR)
Destination Clock Delay (DCD): 5.772ns = ( 9.618 - 3.846 )
Source Clock Delay (SCD): 5.502ns
Clock Pessimism Removal (CPR): -0.208ns
Clock Uncertainty: 0.058ns ((TSJ^2 + DJ^2)^1/2) / 2 + PE
Total System Jitter (TSJ): 0.071ns
Discrete Jitter (DJ): 0.092ns
Phase Error (PE): 0.000ns
Clock Net Delay (Source): 2.825ns (routing 1.848ns, distribution 0.977ns)
Clock Net Delay (Destination): 2.664ns (routing 1.676ns, distribution 0.988ns)

Location Delay type Incr(ns) Path(ns) Netlist Resource(s)
------------------------------------------------------------------- -------------------
(clock clk_out1_design_1_clk_wiz_app_a_clk_0 rise edge)
0.000 0.000 r
G31 0.000 0.000 r default_sysclk1_300_clk_p (IN)
net (fo=0) 0.000 0.000 design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkin1_ibufds/I
HPIOBDIFFINBUF_X0Y204
DIFFINBUF (Prop_DIFFINBUF_HPIOBDIFFINBUF_DIFF_IN_P_O)
0.559 0.559 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkin1_ibufds/DIFFINBUF_INST/O
net (fo=1, routed) 0.050 0.609 design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkin1_ibufds/OUT
G31 IBUFCTRL (Prop_IBUFCTRL_HPIOB_M_I_O)
0.000 0.609 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkin1_ibufds/IBUFCTRL_INST/O
net (fo=1, routed) 0.390 0.999 design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clk_in1_design_1_clk_wiz_cfg_clk_0
PLL_X0Y17 PLLE4_ADV (Prop_PLL_CLKIN_CLKOUT1)
0.033 1.032 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/plle4_adv_inst/CLKOUT1
net (fo=1, routed) 0.278 1.310 design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clk_out2_design_1_clk_wiz_cfg_clk_0
BUFGCE_X0Y194 BUFGCE (Prop_BUFCE_BUFGCE_I_O)
0.028 1.338 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkout2_buf/O
net (fo=2, routed) 1.007 2.345 design_1_i/tsg_uc_top/tsg_uc_bsp_top/app_plls/clk_wiz_app_a_clk/inst/CLK_CORE_DRP_I/clk_inst/clk_in1
PLL_X0Y15 PLLE4_ADV (Prop_PLL_CLKIN_CLKOUT0)
0.033 2.378 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/app_plls/clk_wiz_app_a_clk/inst/CLK_CORE_DRP_I/clk_inst/plle4_adv_inst/CLKOUT0
net (fo=1, routed) 0.271 2.649 design_1_i/tsg_uc_top/tsg_uc_bsp_top/app_plls/clk_wiz_app_a_clk/inst/CLK_CORE_DRP_I/clk_inst/clk_out1_design_1_clk_wiz_app_a_clk_0
BUFGCE_X0Y169 BUFGCE (Prop_BUFCE_BUFGCE_I_O)
0.028 2.677 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/app_plls/clk_wiz_app_a_clk/inst/CLK_CORE_DRP_I/clk_inst/clkout1_buf/O
X1Y13 (CLOCK_ROOT) net (fo=4688, routed) 2.825 5.502 design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/app_a_pll_clk
SLR Crossing[1->2]
SLICE_X32Y802 FDCE r design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rddo_pop_ptr_reg[1]/C
------------------------------------------------------------------- -------------------
SLICE_X32Y802 FDCE (Prop_GFF2_SLICEL_C_Q)
0.081 5.583 r design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rddo_pop_ptr_reg[1]/Q
net (fo=7, routed) 0.263 5.846 design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rddo_pop_ptr[1]
SLICE_X32Y803 LUT3 (Prop_C6LUT_SLICEL_I0_O)
0.096 5.942 r design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out[576]_i_7__1/O
net (fo=578, routed) 2.336 8.278 design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out[576]_i_7__1_n_0
SLICE_X17Y845 LUT5 (Prop_C6LUT_SLICEM_I3_O)
0.123 8.401 r design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out[200]_i_2__1/O
net (fo=1, routed) 2.547 10.948 design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out[200]_i_2__1_n_0
SLICE_X14Y836 LUT6 (Prop_F6LUT_SLICEL_I5_O)
0.148 11.096 r design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out[200]_i_1__1/O
net (fo=1, routed) 0.051 11.147 design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out[200]_i_1__1_n_0
SLICE_X14Y836 FDCE r design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out_reg[200]/D
------------------------------------------------------------------- -------------------

(clock clk_out1_design_1_clk_wiz_app_a_clk_0 rise edge)
3.846 3.846 r
G31 0.000 3.846 r default_sysclk1_300_clk_p (IN)
net (fo=0) 0.000 3.846 design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkin1_ibufds/I
HPIOBDIFFINBUF_X0Y204
DIFFINBUF (Prop_DIFFINBUF_HPIOBDIFFINBUF_DIFF_IN_P_O)
0.462 4.308 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkin1_ibufds/DIFFINBUF_INST/O
net (fo=1, routed) 0.040 4.348 design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkin1_ibufds/OUT
G31 IBUFCTRL (Prop_IBUFCTRL_HPIOB_M_I_O)
0.000 4.348 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkin1_ibufds/IBUFCTRL_INST/O
net (fo=1, routed) 0.339 4.687 design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clk_in1_design_1_clk_wiz_cfg_clk_0
PLL_X0Y17 PLLE4_ADV (Prop_PLL_CLKIN_CLKOUT1)
0.430 5.117 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/plle4_adv_inst/CLKOUT1
net (fo=1, routed) 0.238 5.355 design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clk_out2_design_1_clk_wiz_cfg_clk_0
BUFGCE_X0Y194 BUFGCE (Prop_BUFCE_BUFGCE_I_O)
0.024 5.379 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/clk_wiz_cfg_clk/inst/clkout2_buf/O
net (fo=2, routed) 0.888 6.267 design_1_i/tsg_uc_top/tsg_uc_bsp_top/app_plls/clk_wiz_app_a_clk/inst/CLK_CORE_DRP_I/clk_inst/clk_in1
PLL_X0Y15 PLLE4_ADV (Prop_PLL_CLKIN_CLKOUT0)
0.430 6.697 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/app_plls/clk_wiz_app_a_clk/inst/CLK_CORE_DRP_I/clk_inst/plle4_adv_inst/CLKOUT0
net (fo=1, routed) 0.233 6.930 design_1_i/tsg_uc_top/tsg_uc_bsp_top/app_plls/clk_wiz_app_a_clk/inst/CLK_CORE_DRP_I/clk_inst/clk_out1_design_1_clk_wiz_app_a_clk_0
BUFGCE_X0Y169 BUFGCE (Prop_BUFCE_BUFGCE_I_O)
0.024 6.954 r design_1_i/tsg_uc_top/tsg_uc_bsp_top/app_plls/clk_wiz_app_a_clk/inst/CLK_CORE_DRP_I/clk_inst/clkout1_buf/O
X1Y13 (CLOCK_ROOT) net (fo=4688, routed) 2.664 9.618 design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/app_a_pll_clk
SLR Crossing[1->2]
SLICE_X14Y836 FDCE r design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out_reg[200]/C
clock pessimism -0.208 9.410
clock uncertainty -0.058 9.352
SLICE_X14Y836 FDCE (Setup_FFF_SLICEL_C_D)
0.025 9.377 design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out_reg[200]
-------------------------------------------------------------------
required time 9.377
arrival time -11.147
-------------------------------------------------------------------
slack -1.770
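
The arithmetic behind this report can be re-checked by hand. The short Python sketch below is only an illustration: every number is copied from the report above, the variable names are made up for this example, and it is not part of any Vivado flow. It shows that routing accounts for about 92% of the data path delay, and that the single fo=1 net contributes 2.547 ns of it.

```python
import math

# All values in ns, copied from the timing report above.
source_clock_delay = 5.502

# (kind, delay) for each element of the data path, in report order.
data_path = [
    ("logic", 0.081),  # FDCE clock-to-Q
    ("route", 0.263),  # net, fo=7
    ("logic", 0.096),  # LUT3
    ("route", 2.336),  # net, fo=578
    ("logic", 0.123),  # LUT5
    ("route", 2.547),  # net, fo=1 (the net in question)
    ("logic", 0.148),  # LUT6
    ("route", 0.051),  # net, fo=1
]

logic_delay = sum(delay for kind, delay in data_path if kind == "logic")
route_delay = sum(delay for kind, delay in data_path if kind == "route")
data_path_delay = logic_delay + route_delay
arrival_time = source_clock_delay + data_path_delay

# Required time: destination clock arrival plus the three adjustments
# listed at the end of the report.
destination_clock_arrival = 9.618
clock_pessimism = -0.208
clock_uncertainty = -0.058
setup_adjustment = 0.025
required_time = (destination_clock_arrival + clock_pessimism
                 + clock_uncertainty + setup_adjustment)
slack = required_time - arrival_time

# Clock uncertainty formula from the report: sqrt(TSJ^2 + DJ^2) / 2 + PE
uncertainty_check = math.sqrt(0.071 ** 2 + 0.092 ** 2) / 2 + 0.000

print(f"logic {logic_delay:.3f} ns, route {route_delay:.3f} ns "
      f"({100 * route_delay / data_path_delay:.1f}% of the data path)")
print(f"arrival {arrival_time:.3f} ns, required {required_time:.3f} ns, "
      f"slack {slack:.3f} ns")
print(f"clock uncertainty from formula: {uncertainty_check:.3f} ns")
```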

Moderator
Registered: 04-18-2011

A good next step is to share the timing report for this path after synthesis. Is it meeting, or almost meeting, setup on the max path, and what does the hold path look like? If the path has an unrealistic hold requirement, the tool will try to fix it during implementation by adding extra routing delay, which can break the setup check.

Guide
Registered: 01-23-2009

It's very hard to say what's going on here.

Usually when we see a long route like this, it is due to a bad clock domain crossing path or bad clock structure that is creating a large hold time violation. The tool fixes this by adding routing, and you would see a serpentine route like this.

This path, though, is not a clock domain crossing path - it starts and ends on the same clock and goes through the same clock distribution (admittedly a long one, through two PLLs and two BUFGs, which may not be ideal, but isn't the source of your problem).

However, the signal names contain some interesting keywords, with "async_fifo" showing up in multiple places. So while this path isn't itself a clock domain crossing path, it is somehow involved in clock domain crossing (that's what an async FIFO is). My suspicion is that this is part of a clock domain crossing circuit (CDCC) that is shared between the "synchronous" part of the CDC (source and destination on the same clock) and the asynchronous part of the CDC (source and destination on different clocks), and that there is a problem with the asynchronous part. If this net is on both paths, then the asynchronous path may have an unreasonable hold requirement (i.e. it is missing a constraint), which causes the tool to insert the long route, but you are actually seeing the failure on the synchronous part. I have seen some CDCCs with characteristics like that (often when distributed SelectRAMs are used; this one seems to use flip-flops as the storage cells for the FIFO, so it may have similar characteristics).

One way to confirm this is to find all paths that go through this net. This can be done with a command like

report_timing -max_paths 10000 -nworst 10000 -delay_type min_max -name my_report -through [get_nets design_1_i/tsg_uc_top/tsg_uc_app_top_0/inst/tsg_uc_app/axi_st_channel_async_eth_100g_0_pre/generic_async_fifo/ASYNC_FIFO_BASED_FLOPS.rd_data_out[200]_i_2__1_n_0]

This should generate a report that shows all paths that pass through this net - look for ones that have different source and destination clocks...
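
If the report is saved as text, the scan for mismatched clocks can also be done outside the tool. The sketch below is a minimal, hypothetical Python helper, not a Vivado feature. It assumes the text uses the layout of the report earlier in this thread: a "Source:" or "Destination:" line followed by a line containing "clocked by <clock name>". The function name and the sample clock names are made up for illustration.

```python
import re

# Matches the clock name in lines such as
# "(rising edge-triggered cell FDCE clocked by clk_a {rise@0.000ns ...})"
CLOCK_RE = re.compile(r"clocked by\s+(\S+)")


def find_cross_clock_paths(report_text):
    """Return one dict per path whose source and destination clocks differ."""
    results = []
    current = {}
    pending = None  # which clock field the next "clocked by" line fills
    for raw_line in report_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Source:"):
            current = {"source": line.split(":", 1)[1].strip()}
            pending = "source_clock"
        elif line.startswith("Destination:"):
            current["destination"] = line.split(":", 1)[1].strip()
            pending = "destination_clock"
        elif pending:
            match = CLOCK_RE.search(line)
            if match:
                current[pending] = match.group(1)
                if (pending == "destination_clock"
                        and current.get("source_clock") != current["destination_clock"]):
                    results.append(dict(current))
                pending = None
    return results


# Tiny made-up sample in the same layout as the report in this thread.
sample_report = """
Source: a_reg/C
  (rising edge-triggered cell FDCE clocked by clk_a {rise@0.000ns fall@1.923ns period=3.846ns})
Destination: b_reg/D
  (rising edge-triggered cell FDCE clocked by clk_a {rise@0.000ns fall@1.923ns period=3.846ns})
Source: c_reg/C
  (rising edge-triggered cell FDCE clocked by clk_a {rise@0.000ns fall@1.923ns period=3.846ns})
Destination: d_reg/D
  (rising edge-triggered cell FDCE clocked by clk_b {rise@0.000ns fall@2.000ns period=4.000ns})
"""

for path in find_cross_clock_paths(sample_report):
    print(path["source"], path["source_clock"], "->",
          path["destination"], path["destination_clock"])
```

Lines such as "Source Clock Delay (SCD)" are skipped on purpose, because only lines that start with "Source:" or "Destination:" open a new endpoint.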

Avrum

Visitor
Registered: 04-07-2020

Thank you for the answers.

As for the two cascaded PLLs, this is a known issue. This project runs on the VCU118, which has a limited number of clocks (as far as I've seen), and we plan to stop cascading them once we have our real board with all the input clocks we need.

As for the specific path I sent, there was indeed a problem in the asynchronous constraints, and I saw some wrong hold paths that might have caused this.

But after constraining this FIFO, I still see setup violations caused by an even worse route. Note that this path is not a CDC path; it is just another sampling stage of the read data.

Running "report_timing -max_paths 10000 -nworst 10000 -delay_type min_max -name my_report -through ..." showed 36 setup and 36 hold paths. The worst hold slack was +700 ps. Setup was violated, with a slack of -121 ps. All source and destination clocks are the same one (322 MHz).

Attaching the route path views:

bad route - fanout of 1.PNG
bad route - awful route.PNG
Visitor
Registered: 04-07-2020

Hi,

I've just tried the simple solution of unrouting this specific net and then routing it again.

Instead of a 1.2 ns delay, the router gave the same net a 0.26 ns delay, and I got back 1 ns on a path with a 3.1 ns cycle!

If it is this easy, why doesn't the router do it by itself?
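
To put the recovered delay in proportion, here is a small Python sketch of the arithmetic. It uses only figures quoted in this thread (the 322 MHz clock and the net delays before and after rerouting); the variable names are made up for this example.

```python
clock_mhz = 322.0                 # clock frequency quoted earlier in the thread
period_ns = 1000.0 / clock_mhz    # corresponds to the "3.1 ns cycle"

delay_before_ns = 1.2             # net delay before the unroute/reroute
delay_after_ns = 0.26             # net delay after rerouting
recovered_ns = delay_before_ns - delay_after_ns

print(f"period: {period_ns:.3f} ns")
print(f"recovered: {recovered_ns:.2f} ns "
      f"({100 * recovered_ns / period_ns:.1f}% of the cycle)")
```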