Observer kevinclaycomb
Registered: ‎12-19-2012

Constraining the clock network for DDR interface on Ultrascale+

Hello,

My team is migrating a 7 Series source-synchronous, center-aligned DDR ADC interface to UltraScale+. The 7 Series design used a BUFR/BUFIO clocking scheme with ISERDES and IDELAYs, with IDELAYs only in the data paths and not on the bit clock. We have successfully migrated that to a BUFGCE/BUFGCE_DIV scheme. However, we are having trouble meeting timing with the same FIXED IDELAY value from build to build as we build out the rest of the design and insert or remove logic. We have input delay constraints on the interface that we believe are correct (they worked well on the previous 7 Series design). The interface is 200 MHz DDR, and it can meet timing statically, with margin, given the right IDELAY value.

After a new build failed timing, I compared the two DCPs (previous good and new failing). The clock buffer moves between the two builds but stays inside the same clock region. Despite that, the clock routing delay differs significantly between the two builds, and that difference is what causes the new build to fail. On 7 Series I would not expect two clock buffer locations that close together to show a significant difference in delay. However, what I am reading in the documentation and the forums suggests that the clock network on UltraScale+ is much more dynamic. Indeed, although both placed buffers are in the same clock region, one of them has its clock root in the next region over.
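
A comparison like this can be made by querying the clock root in each checkpoint. The following is only a minimal Tcl sketch, not taken from the design: the net name dclk_bufg is a placeholder for the net driven by the BUFGCE, and CLOCK_ROOT is only populated once the design has been placed.

```tcl
# Run in each opened checkpoint (for example after open_checkpoint <file>.dcp).
# dclk_bufg is a hypothetical name for the net driven by the bit clock buffer.
set root [get_property CLOCK_ROOT [get_nets dclk_bufg]]
puts "clock root of dclk_bufg = $root"

# Summarize the clock roots of all global clocks in the open design.
report_clock_utilization -clock_roots_only
```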

 

In the end, what I need is to constrain the tool so that the clock delay (IOB->BUFG->ISERDES) stays close to the same value build after build, so that a fixed IDELAY value consistently meets timing. My first reaction was to fix the location of the clock buffer, but it looks like that is not recommended (and it is not the entire problem anyway). Is there a way to constrain the CLOCK_ROOT, or to give the tool a target delay to meet when routing the clock? Is this the right way to approach the issue?

Any insight is appreciated,

Kevin

2 Replies
Historian
Registered: ‎01-23-2009

Re: Constraining the clock network for DDR interface on Ultrascale+

Take a look at the CLOCK_DELAY_GROUP attribute.

As with other clocking schemes (like two clocks from the same MMCM), putting different clock nets in the same CLOCK_DELAY_GROUP instructs Vivado that the clocks within the group must be balanced against each other. Specifically, the CLOCK_ROOT of the different clocks must be kept in the same clock region.
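
As a rough sketch of what this could look like in XDC, assume the BUFGCE and BUFGCE_DIV outputs drive nets named dclk_bufg and dclk_div_bufg. Those net names, the group name, and the X2Y3 clock region are all hypothetical. USER_CLOCK_ROOT is a separate net property that requests a specific clock root region; whether it is needed in this design is an assumption, not something confirmed in this thread.

```tcl
# Hypothetical net names: use the nets directly driven by the clock buffers.
# Ask Vivado to balance the bit clock and divided clock against each other.
set_property CLOCK_DELAY_GROUP adc_dclk_grp [get_nets {dclk_bufg dclk_div_bufg}]

# Optional, and an assumption for this design: request a specific clock root
# so that it does not move between builds. X2Y3 is a placeholder region.
set_property USER_CLOCK_ROOT X2Y3 [get_nets {dclk_bufg dclk_div_bufg}]
```

After implementation, the read-only CLOCK_ROOT property on the same nets shows which region the tool actually used.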

 

If you have proper constraints on the input interface and the tool says you pass timing, then the placement of the CLK buffer is not a problem; if the BUFG for CLK were in a "bad" location, your input constraints would fail... Are you sure your input constraints are correct? Take a look at this post on constraining source synchronous DDR input interfaces.

 

One would have thought that the same would be true for the CLK->CLKDIV clock crossing: if the clock skew were too great, it should fail timing. However, in the 7 series, I don't think there was a timing check on this. There was a physical check to ensure that a legal clocking scheme was used (BUFIO/BUFR or MMCM->BUFG/BUFG), but not an actual timing check. If this is done the same way in UltraScale, then bad BUFG/BUFGCE_DIV placement could cause device failures without static timing failures (which seems like a big weakness in the tool), but this would be fixed by the CLOCK_DELAY_GROUP. Are you sure there are no critical warnings regarding these clocks?
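
One possible way to see whether the tool times the CLK->CLKDIV crossing at all is sketched below, for an implemented design open in Vivado. The pin paths bufgce_inst/O and bufgce_div_inst/O are placeholders for the buffer outputs in the actual design. Critical warnings themselves appear in the synthesis and implementation logs; the methodology report is a separate set of checks.

```tcl
# Hypothetical pin paths: replace with the real BUFGCE and BUFGCE_DIV outputs.
set bit_clk [get_clocks -of_objects [get_pins bufgce_inst/O]]
set div_clk [get_clocks -of_objects [get_pins bufgce_div_inst/O]]

# Report setup and hold paths between the two clocks, in both directions.
# If no paths are reported, the tool is not timing this crossing.
report_timing -from $bit_clk -to $div_clk -delay_type min_max
report_timing -from $div_clk -to $bit_clk -delay_type min_max

# Run the methodology checks, which include clocking-related rules.
report_methodology
```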

 

Avrum

Observer kevinclaycomb
Registered: ‎12-19-2012

Re: Constraining the clock network for DDR interface on Ultrascale+

Hi Avrumw - Thanks for the reply.

 

Here are my clock and input_delay constraints. This interface has a single data pair, a frame clock, and a bit clock (dclk). I stared at several of your posts for quite a while when I originally wrote them for the 7 Series design, so thanks for that!

create_clock -period 5.000 -name dclk_A -waveform {0.000 2.500} [get_ports {dclk_p[0]}]

 

set_input_delay -clock dclk_A -max 1.575 [get_ports {{fclk_p[0]} {data_p[0]}}]
set_input_delay -clock dclk_A -min 0.925 [get_ports {{fclk_p[0]} {data_p[0]}}]
set_input_delay -clock dclk_A -clock_fall -max -add_delay 1.575 [get_ports {{fclk_p[0]} {data_p[0]}}]
set_input_delay -clock dclk_A -clock_fall -min -add_delay 0.925 [get_ports {{fclk_p[0]} {data_p[0]}}]
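
As a sanity check on these numbers, the short sketch below works out the data valid window they imply. It is plain Tcl, so it should run in tclsh or the Vivado Tcl console, and the variable names are only illustrative. With a 5.000 ns period each DDR bit lasts 2.500 ns. The data is stable from 1.575 ns after one clock edge until 0.925 ns after the next, which gives a 1.850 ns window with 0.925 ns on either side of the capturing edge, as expected for a center-aligned interface.

```tcl
set period    5.000                    ;# dclk_A period in ns
set bit_time  [expr {$period / 2.0}]   ;# DDR: one bit per half period
set max_delay 1.575                    ;# value given to set_input_delay -max
set min_delay 0.925                    ;# value given to set_input_delay -min

# Data is stable from max_delay after one edge until min_delay after the next.
set before_edge [expr {$bit_time - $max_delay}]   ;# stable time before the capturing edge
set after_edge  $min_delay                         ;# stable time after the capturing edge
set window      [expr {$before_edge + $after_edge}]

puts [format "valid before edge: %.3f ns" $before_edge]
puts [format "valid after edge:  %.3f ns" $after_edge]
puts [format "valid window:      %.3f ns" $window]
```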

 

Just to be clear, these are the constraints that fail when the tool decides things should move. It is not that timing cannot be met statically; the FIXED IDELAY value that was appropriate for the previous place and route is simply no longer appropriate. This doesn't always happen, but it typically does if we add or remove a chunk of logic or insert a ChipScope ILA.

On 7 Series our process for this was as follows:

 

1. Instantiate BUFIO/BUFR clock buffers to drive ISERDES/IDELAY

2. Set up input delay constraints

3. Run through place/route

4. Review timing report and determine the appropriate delay required for frame clock and data to meet timing.

5. Apply a fixed IDELAY value in the HDL

6. Rerun place/route to verify timing closure.
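
The timing review in steps 4 and 6 can be scripted. The sketch below assumes a routed design is open in Vivado and reuses the port names from the constraints above; the report file name is a placeholder.

```tcl
# Report setup and hold paths from the ADC data and frame clock pins.
# adc_input_timing.rpt is a hypothetical output file name.
report_timing -from [get_ports {{fclk_p[0]} {data_p[0]}}] \
    -delay_type min_max -max_paths 4 -file adc_input_timing.rpt
```

The setup and hold slack in this report indicate how far, and in which direction, the fixed IDELAY value needs to shift the data.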

 

This always worked, and once it was set up we never needed to go back and touch these values again, I assume because of the physical routing constraints on the 7 Series. I am now wondering whether this approach is even valid for the UltraScale architecture, since the routing of the clock and divided clock is not as physically constrained. I realize that we could move to a more complicated setup where we adjust the IDELAY with a state machine or use the new BISC/BITSLICE setup, but with this interface being slow enough to capture statically, I was hoping to avoid the extra moving parts and keep it simple.

I will experiment with the CLOCK_DELAY_GROUP attribute and see how it affects things.

Thanks,

 

Kevin
