
Using the Methodology Report Part Six: Design is not consistently routable


This blog entry is based on a real customer issue in which a DFX design was not consistently routable and suffered from routing overlaps.

This blog will show some of the debug techniques we used to narrow down the root cause and to fix the issue. 

This is part six of the Using the Methodology Report series. For all entries in the series, see here.

 

Issue Explanation:

In this example, the user's DFX design did not route consistently: on some runs, a number of its nets remained unrouted.

The Tcl command report_route_status was showing the following, with 165 unrouted nets:

hemangd_0-1604063050050.png
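To list the affected nets rather than only count them, something along the following lines can be run in the Tcl console. This is a minimal sketch, assuming a post-route design is open; the report file name is arbitrary, and the ROUTE_STATUS values used in the filter are the ones I expect Vivado to report, so check them against your version.

```tcl
# Assumes a post-route design or checkpoint is open in Vivado.
# Write the route status summary to a file (file name is arbitrary).
report_route_status -file route_status.rpt

# Collect nets that are unrouted, partially routed, or in conflict (overlapping).
# The ROUTE_STATUS values below are assumptions to verify in your Vivado version.
set bad_nets [get_nets -hierarchical -filter \
    {ROUTE_STATUS == UNROUTED || ROUTE_STATUS == PARTIAL || ROUTE_STATUS == CONFLICTS}]
puts "Nets with routing problems: [llength $bad_nets]"

# Print the first few names to see whether they cluster in one module or region.
foreach net [lrange $bad_nets 0 19] {
    puts "  $net"
}
```

If most of the names fall inside the reconfigurable module, that narrows the search to the constraints and clocks that touch that region.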

Root cause analysis:

When we looked at the design, we found very large hold violations of around -4.6 ns on inter-clock paths, as shown below.

However, these violations were not visible in the routed checkpoint. They appeared only in the log at the start of route_design, where timing is still based on estimated routing delays.

Note: To analyze timing with estimated routing delays in more detail, set the interconnect option to estimated when generating the Timing Summary report in the Vivado GUI.

You can check the Timing Summary for a design yourself using the options below:

  • In the Vivado GUI, go to Reports tab -> Timing -> Report Timing Summary
  • Run the Tcl command below:
report_timing_summary -file <filepath>/timingreport.txt

The interconnect setting controls whether net delays are calculated from the estimated route distance between leaf cell pins, taken from the actual routed nets, or excluded from timing analysis. See (UG906) for more information.

dfx7.PNG

 

Alternatively, you can use the Tcl command below to analyze timing with estimated routing delays.

set_delay_model -interconnect estimated

 

dfx1.PNG
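The two steps above (switching to estimated delays, then reporting timing) can be combined into one short sequence. This is a sketch under assumptions: the command name used here is set_delay_model, as I recall it from the Vivado Tcl command reference (UG835), so confirm it with the help command in your version; the report file name is arbitrary.

```tcl
# Assumes a placed or routed design is open in Vivado.
# Use estimated net delays, similar to what the router sees before routing.
set_delay_model -interconnect estimated

# Report hold (min delay) timing only, keeping the 10 worst paths per group.
report_timing_summary -delay_type min -max_paths 10 -file timing_estimated_hold.rpt

# Restore the default so later reports use the actual routed delays again.
set_delay_model -interconnect actual
```

Restoring the delay model at the end matters, because the setting stays in effect for every later timing report in the session.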

 
 

The clock interaction report shows these inter-clock path violations for each pair of clock domains, as shown below.

(To check the clock interaction report in the Vivado GUI, select Reports -> Timing -> Report Clock Interaction)

dfx2.PNG
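The same report can be produced from the Tcl console and followed up with a hold report for one suspicious clock pair. A minimal sketch: clk_a and clk_b are hypothetical clock names standing in for the pair flagged in your report, and the file names are arbitrary.

```tcl
# Write the clock interaction matrix to a file; min_max covers setup and hold.
report_clock_interaction -delay_type min_max -significant_digits 3 \
    -file clock_interaction.rpt

# Drill into one clock pair (clk_a and clk_b are hypothetical names).
report_timing -hold -from [get_clocks clk_a] -to [get_clocks clk_b] \
    -max_paths 5 -file clk_a_to_clk_b_hold.rpt
```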

 

Given the size of these hold violations, we concluded that either the clocking topology had issues or the design was not properly constrained.

Both possibilities needed to be analyzed thoroughly.

The clock path skew on this failing inter-clock hold path (shown below) was very high and looked suspicious.

dfx3.PNG
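To scan many paths for suspicious skew without opening each one in the GUI, the timing path objects can be queried directly. This is a sketch, assuming it is run at a point where the hold violations are visible (for example with estimated interconnect delays); the limit of 20 paths is arbitrary.

```tcl
# Collect the worst failing hold paths (negative slack only).
set paths [get_timing_paths -hold -max_paths 20 -slack_lesser_than 0]

# Print launch clock, capture clock, slack, and clock skew for each path.
foreach p $paths {
    puts [format "%-24s -> %-24s slack=%s skew=%s" \
        [get_property STARTPOINT_CLOCK $p] \
        [get_property ENDPOINT_CLOCK $p] \
        [get_property SLACK $p] \
        [get_property SKEW $p]]
}
```

Paths whose launch and capture clocks differ and whose skew is several nanoseconds are the ones worth checking against the clock interaction report.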

 

By default, Vivado treats all clocks as related and times every path between them as synchronous. As a result, these asynchronous CDC paths were also timed as synchronous, and an unrealistic clock skew was included in the path. In this example it is around 4 ns.

So how did we know that these asynchronous CDCs were not properly constrained?

We got the information from the Clock Pair Classification and Inter-Clock Constraints columns of the clock interaction report (as shown below).

Refer to this blog entry for a better understanding of it.

hemangd_1-1604064044685.png

 

This caused the large hold violations, which caused the router to do a lot of hold fixing, which contributed to routing congestion.

The router always prioritizes fixing hold timing over setup timing, because a design that fails hold timing will never work functionally, while a design that fails setup can still function at a lower frequency.

Routing congestion caused by routing detours can lead to timing failures, but it can also make the design unroutable.

Due to severe congestion, the router is unable to find free resources for routing. This is what happened with this specific case.

The screen captures below show the amount of routing resources the router used to fix the hold violations on these underconstrained CDC paths.

Eventually, this caused the congestion and the unrouted nets in this case.

The following screen capture shows one hold violation where the clock skew is 4 ns.

dfx4.PNG

The image below shows the total routing resources used by the unsafe CDC paths on which hold is violated.

dfx8.PNG
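A view like the one above can be approximated from the Tcl console. This is a sketch under assumptions: clk_a and clk_b are hypothetical names for one unconstrained clock pair, the path limit and file name are arbitrary, and the highlighting step only has a visible effect when Vivado runs in GUI mode.

```tcl
# Assumes a routed or partially routed design is open in Vivado.
# Summarize routing congestion for the design.
report_design_analysis -congestion -file congestion.rpt

# Collect paths crossing between one clock pair (hypothetical clock names).
set cdc_paths [get_timing_paths -hold -max_paths 50 \
    -from [get_clocks clk_a] -to [get_clocks clk_b]]

# GUI mode only: highlight the nets on those paths to see their routing detours.
if {[llength $cdc_paths] > 0} {
    highlight_objects -color red [get_nets -of_objects $cdc_paths]
}
```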

We also checked resource utilization, which was under control and below the threshold level. This again pointed to improper constraints as the root cause.

To check the resource utilization in the Vivado GUI, go to Reports tab -> Report Utilization.

Alternatively, you can run the report_utilization command in the Tcl console.

dfx5.PNG
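For a DFX design it is often useful to look at the utilization of the reconfigurable region separately from the whole device. A minimal sketch, where pblock_rm is a hypothetical name for the Pblock of the reconfigurable partition and the file names are arbitrary:

```tcl
# Utilization of the whole design.
report_utilization -file utilization.rpt

# Utilization inside one Pblock (pblock_rm is a hypothetical name).
report_utilization -pblocks [get_pblocks pblock_rm] -file utilization_rm.rpt
```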

How did the Methodology report help with this case?

By observing this report, we found that there were many methodology warnings present in the design.

The main warnings that affected the QoR of the design and needed to be resolved on a priority basis are listed below.

To open the methodology report in the Vivado GUI, go to Reports tab -> Report Methodology, or run report_methodology in the Tcl console.

The screen capture below shows part of the methodology report with warning messages for TIMING-6, TIMING-7, TIMING-8, TIMING-15, and TIMING-35.

dfx6.PNG
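To focus on just these five checks from the Tcl console, the report can be restricted to them. This is a sketch, assuming an open synthesized or implemented design; the file names are arbitrary.

```tcl
# Full methodology report.
report_methodology -file methodology.rpt

# Restrict the report to the five timing checks discussed in this entry.
set checks [get_methodology_checks \
    {TIMING-6 TIMING-7 TIMING-8 TIMING-15 TIMING-35}]
report_methodology -checks $checks -file methodology_timing.rpt
```

Running the restricted report after each constraint change gives a quick indication of whether the clock relationships are now described correctly.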

The TIMING-6, TIMING-7, TIMING-8, and TIMING-35 warnings indicate that the design is constrained incorrectly and needs to be properly constrained.

To do that, the user needs to refer to the clock interaction report to understand whether the inter-clock paths are safely timed. For more information on the clock interaction report, see (UG906).

The TIMING-15 warning states that there is a large hold violation on inter-clock paths, which needs to be resolved before generating the bitstream.

Because the router always tries to resolve hold violations, and this effort affects routing, it is recommended to properly constrain the design and clean up the inter-clock paths as described in these warning messages.

In the timing summary, we observed very large hold violations of around -3 ns on inter-clock paths.

For more information about these five warning messages and how to resolve them, see Appendix A of (UG906).

Conclusion:

Our analysis showed that if the customer had reviewed and resolved the methodology report warnings at the start of the debug, finding the cause of the unrouted nets would have taken much less time.

After constraints similar to the following were added, these ghost timing violations were resolved:

set_max_delay -datapath_only -from [<valid start point >] -to [<valid end points >] <Minimum clock period>
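As an illustration only, a filled-in version of this template might look like the following. The clock names clk_a and clk_b and the 5.000 ns value are hypothetical and stand in for the real asynchronous clock pair and the smaller of the two clock periods; they are not the customer's actual constraints.

```tcl
# Hypothetical example: clk_a and clk_b are asynchronous to each other,
# and 5.000 ns is assumed to be the smaller of the two clock periods.

# Bound the datapath delay in both directions. With -datapath_only, clock
# skew and jitter are excluded and hold is not checked on these paths.
set_max_delay -datapath_only -from [get_clocks clk_a] -to [get_clocks clk_b] 5.000
set_max_delay -datapath_only -from [get_clocks clk_b] -to [get_clocks clk_a] 5.000

# Alternative (do not combine with the lines above for the same clock pair,
# because clock groups take precedence over set_max_delay):
# set_clock_groups -asynchronous -group [get_clocks clk_a] -group [get_clocks clk_b]
```

Because hold is no longer checked on these crossings, the router stops spending routing resources on detours for them, which is consistent with the violations disappearing in this case. The crossings themselves still need proper synchronization logic in the design.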

For more information on adding proper timing exceptions, see (UG903) and this blog.

Finally, after these modifications were made, the user was able to increase the utilization of the reconfigurable module to 55% FF utilization.