RQS Design closure Suggestion ID "RQS_CLOCK-12"

syedz
Moderator

In our previous blog entries, "Improving QoR with report_qor_suggestions in Vivado" and "Design Closure with RQA and RQS", we saw how Report QoR Suggestions (RQS) helps with design closure through clocking, utilization, congestion, and timing suggestions.

In this entry we cover the “RQS_CLOCK-12” clocking suggestion and how it helps in timing closure.

Prerequisites:

  • Know how to generate suggestions with report_qor_suggestions and how to apply them
  • Basic understanding of the CLOCK_LOW_FANOUT constraint
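As a quick refresher on the first prerequisite, the sketch below shows one way the generate-and-apply flow can look in a non-project Tcl script. The file and checkpoint names are placeholders, and the first part assumes that a placed or routed design is already open.

```tcl
# Run 1: analyze the implemented design and save the suggestions.
# Assumes a placed or routed design is open; file names are placeholders.
report_qor_suggestions -file rqs_report.txt
write_qor_suggestions -force rqs_suggestions.rqs

# Run 2: read the suggestions back before rerunning implementation.
# post_synth.dcp is a placeholder for your own post-synthesis checkpoint.
open_checkpoint post_synth.dcp
read_qor_suggestions rqs_suggestions.rqs
opt_design
place_design
phys_opt_design
route_design
```

Reading the suggestion file before opt_design matters here, because CLOCK_LOW_FANOUT is acted on during opt_design and placement.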

 

RQS_CLOCK-12:

RQS_CLOCK-12 is an automatic, incremental-friendly suggestion generated for UltraScale and UltraScale+ devices.

It uses the CLOCK_LOW_FANOUT property, which is assigned either to a clock net or to a set of flip-flops driven by a global clock buffer, depending on the buffer's load count.

  1. When applied to a clock net, the loads of the global clock buffer driving that net are constrained to a single clock region during placement.
  2. When applied to a set of flip-flops, opt_design replicates the global clock buffer, creating a new buffer in parallel with the existing one. The new buffer drives only these flip-flops, and its loads are constrained to a single clock region.
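To check whether a run actually produced this suggestion, the suggestion objects can be queried after report_qor_suggestions has run. The snippet below is only a sketch: the name pattern is an assumption about how the suggestion objects are named, so compare it against the names shown in your own report.

```tcl
# Run after report_qor_suggestions in the same Vivado session.
# The pattern RQS_CLOCK-12* is an assumption; check the report for exact names.
set clk12 [get_qor_suggestions -quiet RQS_CLOCK-12*]

if {[llength $clk12] > 0} {
    # List every property of the first matching suggestion object.
    report_property [lindex $clk12 0]
} else {
    puts "No RQS_CLOCK-12 suggestion was generated for this design."
}
```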

Now let us see how the RQS_CLOCK-12 suggestion applies CLOCK_LOW_FANOUT to help timing closure of the design by reducing the clock skew.

Consider the two scenarios below. In each, a routed design has excessive clock skew that causes a timing violation on the path from a flip-flop to the control pins (CE/CLR) of a global clock buffer.
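A path of this kind can be inspected by reporting timing to the control pin of the buffer. In the sketch below, the pin path is hypothetical and should be replaced with the hierarchical name of the buffer in your design.

```tcl
# Hypothetical pin path: replace with the CE (or CLR) pin of your clock buffer.
set ctrl_pin [get_pins bufce_i/CE]

# Report the worst setup path ending at the control pin.
# The clock skew for the path is listed in the report header.
report_timing -to $ctrl_pin -delay_type max -max_paths 1
```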

Scenario 1:

RQS_12_1_Before_timing_ROOT.png

 

On this failing timing path, the clock buffer BUFGCE1 (clockout3_buf), the flip-flop and its driver BUFGCE2 (bufce_i) are all placed in the same clock region. BUFGCE1, which drives the flip-flop, has a high fanout (6419), and because of the location of its loads its clock net spans the device, as highlighted.

The tool selects a CLOCK_ROOT location far from the global clock buffer driving it, causing high clock net delay and high clock skew.
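The clock root and fanout of a clock net can be checked from the Tcl console on a placed or routed design. The following is a minimal sketch; the buffer path is an assumption based on the instance name above and needs the full hierarchy from your design.

```tcl
# Hypothetical buffer path: adjust to the hierarchical name in your design.
set clk_net [get_nets -of_objects [get_pins clockout3_buf/O]]

# Clock root assigned by the tool for this net (read-only property).
puts "CLOCK_ROOT: [get_property CLOCK_ROOT $clk_net]"

# Number of leaf load pins on the net, i.e. its fanout.
set loads [get_pins -leaf -of_objects $clk_net -filter {DIRECTION == IN}]
puts "Fanout: [llength $loads]"

# Summary of clock roots for all global clock nets in the design.
report_clock_utilization -clock_roots_only
```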

Resolution for Scenario 1:

Apply CLOCK_LOW_FANOUT to the flip-flop. During opt_design, a new BUFGCE (clkout3_buf_replica) is then replicated from the original BUFGCE1 and drives only this critical flip-flop. The new clock net is constrained to a single clock region, which reduces the clock net delay.

Also, because the source and loads are in the same clock region, CLOCK_LOW_FANOUT forces the clock root to be in the same clock region which helps in reducing the clock skew.

Schematic after CLOCK_LOW_FANOUT is applied on the critical flip-flop:

RQS_12_1_After_timing_ROOT.png

During the BUFG optimization phase of opt_design, you should see a message reporting the global clock buffer that was inserted for the CLOCK_LOW_FANOUT property.

For example:

INFO: [Opt 31-1077] Phase BUFG optimization inserted 1 global clock buffer(s) for CLOCK_LOW_FANOUT.

Syntax:

set_property CLOCK_LOW_FANOUT TRUE [get_cells <flipflops_driven_by_globalclockbuffer>]
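As an illustration of the syntax above, the sketch below sets the property on a single flip-flop and then looks for the replicated buffer after opt_design. The cell name is hypothetical, and the *replica* name filter is an assumption based on the buffer name seen in this example.

```tcl
# Hypothetical cell name for the critical flip-flop; replace with your own.
set crit_ff [get_cells ctrl/ce_gen_reg]
set_property CLOCK_LOW_FANOUT TRUE $crit_ff

# The property must be set before opt_design, which inserts the new buffer.
opt_design

# List global clock buffers whose name suggests they are replicas.
# The *replica* pattern is an assumption; adjust it if your names differ.
get_cells -hierarchical -filter {REF_NAME =~ BUFG* && NAME =~ *replica*}
```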

 

Scenario 2:

RQS_12_2_Before_timing_new_ROOT.png

On this failing timing path, the clock buffer BUFGCE1 (clkout1_BUFG_inst), the flip-flop and its driver BUFGCE2 are again placed in the same clock region. BUFGCE1 drives flip-flops with a low fanout (16), but its loads are spread across multiple clock regions (marked in red). As a result, the tool selects a CLOCK_ROOT in a different clock region from the global clock buffer driving the net, which causes high clock net delay and high clock skew.

Resolution for Scenario 2:

When BUFGCE1 has a low fanout (<2000) but its clock loads are spread across multiple clock regions, apply CLOCK_LOW_FANOUT to the clock net directly driven by BUFGCE1 so that all of its loads are placed in a single clock region. This reduces the clock net delay.

With source and loads now in the same clock region, CLOCK_LOW_FANOUT forces the clock root to be in the same clock region which helps to reduce the clock skew.

Schematic after CLOCK_LOW_FANOUT is applied on the clock net:

RQS_12_2_After_timing_ROOT.png

Syntax:

set_property CLOCK_LOW_FANOUT TRUE [get_nets <net_driven_by_globalclockbuffer>]
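As an illustration of the syntax above, the sketch below takes the net from the output pin of the buffer, so that the property lands on the net segment directly driven by it, and then lists the clock regions occupied by the loads after placement. The buffer path is hypothetical.

```tcl
# Hypothetical buffer path: adjust to the hierarchical name in your design.
set clk_net [get_nets -of_objects [get_pins clkout1_BUFG_inst/O]]
set_property CLOCK_LOW_FANOUT TRUE $clk_net

# The property takes effect during placement, so set it before place_design.
place_design

# Collect the cells loading the net and the sites they were placed on.
set load_pins  [get_pins -leaf -of_objects $clk_net -filter {DIRECTION == IN}]
set load_sites [get_sites -of_objects [get_cells -of_objects $load_pins]]

# Print the distinct clock regions that those sites belong to.
puts [lsort -unique [get_property CLOCK_REGION $load_sites]]
```

If the constraint was honored, the last command should print a single clock region.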

Summary

In this blog entry, we used two example designs to show how the RQS_CLOCK-12 suggestion applies the CLOCK_LOW_FANOUT property, either to flip-flops or to a clock net directly driven by a global clock buffer, to reduce clock skew and help close timing.
