Observer mbarnard
Registered: 09-30-2008

Trouble Implementing a Simple Source Sync. Input DDR Interface

Hi - 

 

I'm trying to implement a source synchronous edge-aligned input DDR interface with the clock rate at 300 MHz.  I route the clock thru an IBUFDS -> BUFG -> MMCM -> BUFG with the MMCM feedback path also routed thru a BUFG.  The MMCM is set to multiply up and divide down by three so the VCO runs at 900 MHz.

 

The data gets routed thru an IDELAY -> IDDR with the IDDR set to same edge pipelined.

 

In the timing analyzer I am seeing the clock delay from the BUFG to the IDDR vary by over 2 ns between the Min at Slow Process Corner and the Max at Fast Process Corner.  The Min / Max input delays are +/- 100 ps.

 

The 2 ns variance is larger than the 1.67 ns data eye width, and that is before accounting for setup / hold requirements, which make the situation worse.  Based on the timing results, it appears that I can never get this to work without some sort of alignment that runs based on temperature.
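
As a rough check of the numbers in this post, the arithmetic can be sketched as follows. This is a minimal sketch, not a substitute for the timing report: the function names are made up for illustration, and it assumes that the ideal bit period is half the clock period, that the +/- 100 ps input delay spread costs 0.2 ns of the eye, and that the full 2 ns corner-to-corner clock delay variation counts against the eye (a timing tool may remove part of that variation).

```python
def ddr_bit_period_ns(clock_mhz: float) -> float:
    """Ideal DDR bit period: one bit per clock edge, so half the clock period."""
    clock_period_ns = 1000.0 / clock_mhz
    return clock_period_ns / 2.0


def remaining_eye_ns(bit_period_ns: float,
                     clock_delay_variation_ns: float,
                     input_delay_spread_ns: float) -> float:
    """Eye left after subtracting clock delay variation and data arrival spread.

    A negative result means no single static sampling point covers all corners.
    Setup/hold requirements are not subtracted here; they reduce it further.
    """
    return bit_period_ns - clock_delay_variation_ns - input_delay_spread_ns


bit_period = ddr_bit_period_ns(300.0)            # about 1.67 ns
margin = remaining_eye_ns(bit_period, 2.0, 0.2)  # +/- 100 ps -> 0.2 ns spread
print(f"bit period: {bit_period:.2f} ns, remaining eye: {margin:.2f} ns")
```

Under these assumptions the remaining eye comes out negative, which matches the conclusion that static capture cannot close across all corners.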

 

This seems wrong.  I've done this sort of interface before in older, less capable parts and never seen this kind of clock delay variance.  What gives?

 

Mike Barnard

7 Replies
Observer mbarnard
Registered: 09-30-2008

Re: Trouble Implementing a Simple Source Sync. Input DDR Interface

Forgot to add that this is a Kintex UltraScale device:  XCKU060-FFVA1156-2E.

Moderator
Registered: 04-18-2011

Re: Trouble Implementing a Simple Source Sync. Input DDR Interface

First, why do you have a BUFG on the input path of the MMCM? I don't think this will deskew the clock at the I/O logic.
Try IBUFDS -> MMCM and set it up in the wizard to do phase alignment (BUFG in the feedback path and the output driving a BUFG).
Second, tell us what the clock and data alignment should be: is it centre or edge aligned?
Then show us the constraints and tell us the setup and hold slack.
As a rule of thumb, if you don't get a positive number when you add the setup slack and the hold slack, it is going to be difficult to get this to meet timing.
There is a point where the input cannot be captured statically.
First deskew the clock properly and see where you are.
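
The rule of thumb above can be written down as a small check. This is only a sketch under stated assumptions: the function names and the example slack values are hypothetical (they are not from this thread), and it assumes that shifting the sampling point trades setup slack for hold slack one for one, which is an idealization.

```python
def statically_capturable(setup_slack_ns: float, hold_slack_ns: float) -> bool:
    """Rule of thumb: the sum of setup and hold slack must be positive."""
    return setup_slack_ns + hold_slack_ns > 0.0


def balanced_slack_ns(setup_slack_ns: float, hold_slack_ns: float) -> float:
    """Slack on each side if the sampling point were shifted to balance them.

    Assumes an ideal one-for-one trade between setup and hold slack.
    """
    return (setup_slack_ns + hold_slack_ns) / 2.0


# Hypothetical report values, not taken from the thread.
examples = [(-0.40, 0.90), (-0.60, 0.25)]
for setup, hold in examples:
    ok = statically_capturable(setup, hold)
    print(setup, hold, ok, round(balanced_slack_ns(setup, hold), 3))
```

In the first hypothetical case the sum is positive, so moving the clock or data delay could in principle fix the failing setup path; in the second the sum is negative, so no static shift helps.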
Historian
Registered: 01-23-2009

Re: Trouble Implementing a Simple Source Sync. Input DDR Interface

First, you should not have the first BUFG in the clock path - the clock should go

 

IBUFDS -> MMCM/CLKIN -> MMCM/CLKOUTx -> BUFG -> IDDR/C

 

(the IBUFDS must be on a clock capable pin in the same region as the MMCM).

 

The extra BUFG is going to add LOTS of uncertainty.

 

Second, looking at the min to max variation of the clock alone is not enough. The MMCM will cancel out much of the clock path delay, and "pessimism removal" will also take out some of the uncertainty.

 

All that being said, 300MHz DDR in a source synchronous interface is "difficult". The data valid window is 1.67ns (as you mentioned), which may be too small to capture statically - in the 7 series this was too fast for MMCM/BUFG capture, but was just on the edge of what could be done with "ChipSync" capture (BUFIO/BUFR). Take a look at this post on input capture clocking architectures.

 

But UltraScale/UltraScale+ has no direct analogue to the BUFIO. In theory the IBUFDS->BUFGCE/BUFGCE_DIV can be used instead (without an MMCM), but I have no experience with this, so don't know what the practical clock speed limit is.

 

If you can't get the interface to meet timing with these architectures, then you will need to consider dynamic capture...

 

Avrum

Observer mbarnard
Registered: 09-30-2008

Re: Trouble Implementing a Simple Source Sync. Input DDR Interface

Hi - 

 

Thanks for responding to my question.  I appreciate the responses!  Also, thanks for the link.  So to answer some questions (from both responses):

 

1.  The clock is edge aligned with the data.

2.  The data arrives at the input pins within +/- 100 ps of the clock arrival at its input pins.

3.  The clock is on a clock capable I/O pin.

4.  In Figure 3-9 of UG572 (UltraScale Clocking Resources), which shows how to deskew a clock, the clock input goes through an "IBUFG", which I interpreted to be a BUFG.  Good to know that an "IBUFG" is really an IBUF or, in my case, an IBUFDS.

5.  I've actually tried to get this interface to work without the MMCM using just an IBUFDS -> BUFG and I get very similar results, where the clock delay from the output of the final BUFG to the IDDR clock input has a very large variation over the process corners.

 

I will remove the input BUFG but I'm going to guess that it isn't going to make much difference.

 

I guess what I need to know is whether or not this is a normal amount of variation.  If so, then so be it.  I just don't remember seeing nearly this much before.

 

Regards,

 

Mike Barnard

 

Observer mbarnard
Registered: 09-30-2008

Re: Trouble Implementing a Simple Source Sync. Input DDR Interface

OK - 

 

I got rid of the BUFG in front of the MMCM and it made very little difference.

 

I guess that the timing analyzer is telling me the truth, and that there are large differences in clock delay from the BUFG output to the IDDR clock inputs over PVT.

 

So, with no local clocking buffers (BUFIO or BUFR), input DDR interfaces in Ultrascale perform worse than in older part families.

 

Mike Barnard

Voyager
Registered: 04-26-2012

Re: Trouble Implementing a Simple Source Sync. Input DDR Interface

@mbarnard   "So, with no local clocking buffers (BUFIO or BUFR), input DDR interfaces in Ultrascale perform worse than in older part families."

 

 I've seen this as well - without[1] the dedicated I/O clocking paths, you are at the mercy of how the "ASIC class" clock distribution network gets assembled, which tends to be much more variable build-to-build and instance-to-instance, as Xilinx's algorithms do not appear to take I/O clocking constraints into account when placing elements and building the resulting clock tree.

 

 That said, I've had good success with moderate rate ( < 320 MHz SDR, 160 MHz DDR ) designs in Ultrascale by manually constraining the tools to create a clock distribution tree spanning only one clock region, with the clock root in the same region.
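
For comparison, the two rates quoted above correspond to the same per-pin bit period, which is roughly twice that of the 300 MHz DDR interface in the original question. A short sketch of that arithmetic follows; the function name is made up for illustration, and it computes ideal bit periods only, ignoring jitter and skew.

```python
def bit_period_ns(clock_mhz: float, ddr: bool) -> float:
    """Ideal bit period: DDR captures on both clock edges, SDR on one."""
    edges_per_cycle = 2 if ddr else 1
    return 1000.0 / (clock_mhz * edges_per_cycle)


cases = [
    ("320 MHz SDR", 320.0, False),
    ("160 MHz DDR", 160.0, True),
    ("300 MHz DDR", 300.0, True),  # the interface from the original question
]
for label, mhz, ddr in cases:
    print(f"{label}: {bit_period_ns(mhz, ddr):.3f} ns per bit")
```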

 

See the following thread for more info:

https://forums.xilinx.com/t5/Timing-Analysis/implementation-helps-FPGA-I-O-pass-STA/m-p/757816#M11358

"

" The only way I've found to get predictable Ultrascale timing for the equivalent of

" BUFIO=>I/O DDR, BUFR=>FABRIC is to manually force all of the following:

"   - LOC the input clock buffer to the clock region with the I/O DDR flops in question

"   - Force the clock root into that same clock region with USER_CLOCK_ROOT 

"   - create a PBLOCK to force anything on that same clock net into that same clock region

"

 

 The latest tools have added a CLOCK_LOW_FANOUT constraint, see UG949 page 101, that hopefully would make this process simpler than the USER_CLOCK_ROOT + pblock approach I describe above.

 

XAPP1324 is also a good reference for designs using Ultrascale 'component mode' I/O primitives.

 

-Brian

 

[1] "Native mode" has an internal direct strobe path, but I haven't done any designs using it yet

Observer mbarnard
Registered: 09-30-2008

Re: Trouble Implementing a Simple Source Sync. Input DDR Interface

Thank you so much for the info. brimdavis!  I'm glad that I'm not imagining this.  Much obliged.

 

Regards,

 

Mike Barnard
