Observer bgiesing
Registered: ‎05-09-2017

Constraining Clock Domain Crossing for Related Phase Shifted Clocks

We have a situation very similar to that described in:

https://forums.xilinx.com/t5/Timing-Analysis/Direct-data-crossing-between-synchronous-clocks/td-p/1002173

In our case, we are using an UltraScale XCKU060, capturing source-synchronous DDR data from 18 ADCs.  Each ADC drives a data clock (DCO) to the FPGA for capturing its data stream.  We also forward the actual sample clock to the FPGA as a common clock, as we need to reconcile all 18 streams and ensure they remain synchronous and are processed on the same clock edge.

The delay of the sample clock through the ADCs, combined with the differences in routing delays across 18 ADCs, gives DCOs that are phase-shifted from the common clock in the FPGA.  However, the phase shift is not arbitrary and can be bounded, so we figured we could write some constraints to force the tool to analyze the clock domain crossing.  We use an MMCM on the common clock to allow us to place the common clock in the optimal window for the domain crossing.

We have attempted to use the set_clock_latency on the DCO clocks, setting both a min and a max, expecting that to force the tool to analyze the domain crossing (since all clocks are related in Vivado unless constrained otherwise).  It turned out that applying the latency to the DCOs broke the capture of the data itself at the input - which makes sense because apparently the tool has no idea that the input data moves with the clock when the clock latency goes min or max (this is the case for the ADCs). 

Then we applied the set_clock_latency min/max to the common clock.  This "fixed" the input data capture for each of the DCOs, but there is no indication that our min/max latencies are being applied at the clock domain crossing.  Indeed, the only thing applied was the phase shift on the common clock introduced by the MMCM.  Timing closed, but again the variation due to the latency did not appear to be applied.

We have tried multiple other variations, including using -include_generated_clocks directives and various virtual clock constraints per the directions of some of the ARs and forum posts without any success.

Is there any way to get the tool to analyze a clock domain crossing with min/max latency being applied to one of the clocks?

Thank you,

Brian

 

Xilinx Employee
Registered: ‎05-14-2008

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

Can you provide a diagram showing the signal connections on this interface?

Can you also share screenshots of the timing report that show what you saw, or the entire timing report?

-vivian

Observer bgiesing
Registered: ‎05-09-2017

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

A PDF of one of the interfaces is attached.  The "common clock" is called DSP1_ADC_CLK_P/N in the schematic.  It enters the differential buffer then flows to the MMCM and onto a global clock tree.  The MMCM is set to delay the clock 5/8 cycle.

The source-sync clock from the ADC is called RF0_ADC2_DCO_P/N in the schematic, and the data is RF0_ADC2_D_P/N[7:2].  The data is routed to an IDELAY3 to allow for fixed-time delay with voltage-temperature compensation.  The ADC data/clock are routed to an IDDRE1 in same-edge pipelined mode (technically speaking, routed to an ISERDES3 that is set as an IDDRE1). 

We have attempted a number of variations of timing constraints, applying latency to the DCO and applying latency to the common clock, as well as attempting to constrain the input data capture using the clock pin and "virtual clocks".  My first post was somewhat inaccurate.  Whenever we constrain the input delay of the ADC data/DCO interface using the "virtual clock" method, we cannot close timing on the data capture.  When we constrain it using the actual clock pin, we close timing on that capture.  In no case have we been able to see the latency included in the clock domain crossing between the DCO and the common clock.  Here is a set of constraints as examples of what we have tried.  I will run fresh builds and post fresh constraints and timing results, to ensure the timing reports I post reflect specific constraint sets (we have tried so many things, I don't want to post something inconsistent).

Note that set_clock_latency is being used with -early/-late rather than -min/-max.  Is this a potential issue?  Also note that set_clock_latency is being applied to a clock that has been defined with the -waveform option.  Is this a potential issue?  Does -waveform override set_clock_latency?

Here are the constraints for our attempt to apply latency to the common clock:

create_clock -period  4.568 -name RF0_ADC2_DCO_P [get_ports RF0_ADC2_DCO_P]
create_clock -period 4.568 -name DSP1_ADC_CLK_P -waveform {0.000 2.284} [get_ports DSP1_ADC_CLK_P]
set_clock_latency -rise -fall -source -early -1.061 [get_ports DSP1_ADC_CLK_P]
set_clock_latency -rise -fall -source -late   1.061 [get_ports DSP1_ADC_CLK_P]

# Edge-Aligned Double Data Rate Source Synchronous Inputs -- FROM TEMPLATE
set input_adcclk_period 4.568;             # Period of input clock (full-period)
create_clock -name virt_adcclk -period $input_adcclk_period;
set skew_bre            0.280;             # Data invalid before the rising clock edge
set skew_are            0.100;             # Data invalid after the rising clock edge
set skew_bfe            0.280;             # Data invalid before the falling clock edge
set skew_afe            0.100;             # Data invalid after the falling clock edge
set_input_delay -clock RF0_ADC2_DCO_P -max [expr $input_adcclk_period/2 + $skew_afe] [get_ports {RF0_ADC2_D_P[*]}];
set_input_delay -clock RF0_ADC2_DCO_P -min [expr $input_adcclk_period/2 - $skew_bfe] [get_ports {RF0_ADC2_D_P[*]}];
set_input_delay -clock RF0_ADC2_DCO_P -max [expr $input_adcclk_period/2 + $skew_are] [get_ports {RF0_ADC2_D_P[*]}] -clock_fall -add_delay;
set_input_delay -clock RF0_ADC2_DCO_P -min [expr $input_adcclk_period/2 - $skew_bre] [get_ports {RF0_ADC2_D_P[*]}] -clock_fall -add_delay;
 
And here are the constraints for our attempt at setting latency for the DCO rather than the common clock:
 
create_clock -period 4.568 -name DSP1_ADC_CLK_P -waveform {0.000 2.284} [get_ports DSP1_ADC_CLK_P]
create_clock -period  4.568 -name RF0_ADC2_DCO_P -waveform {0.000  2.284} [get_ports RF0_ADC2_DCO_P]
set_clock_latency -rise -fall -source -early -0.991 [get_ports RF0_ADC2_DCO_P]
set_clock_latency -rise -fall -source -late   0.009 [get_ports RF0_ADC2_DCO_P]
# Edge-Aligned Double Data Rate Source Synchronous Inputs -- FROM TEMPLATE
set input_adcclk_period 4.568;             # Period of input clock (full-period)
create_clock -name virt_adcclk_02 -period $input_adcclk_period;
set skew_bre            0.280;             # Data invalid before the rising clock edge
set skew_are            0.100;             # Data invalid after the rising clock edge
set skew_bfe            0.280;             # Data invalid before the falling clock edge
set skew_afe            0.100;             # Data invalid after the falling clock edge
# Input Delay Constraint (Proto)
set_input_delay -clock virt_adcclk_02 -max [expr $input_adcclk_period/2 + $skew_afe] [get_ports {RFDC0_ADC2_D_P[*]}];
set_input_delay -clock virt_adcclk_02 -min [expr $input_adcclk_period/2 - $skew_bfe] [get_ports {RFDC0_ADC2_D_P[*]}];
set_input_delay -clock virt_adcclk_02 -max [expr $input_adcclk_period/2 + $skew_are] [get_ports {RFDC0_ADC2_D_P[*]}] -clock_fall -add_delay;
set_input_delay -clock virt_adcclk_02 -min [expr $input_adcclk_period/2 - $skew_bre] [get_ports {RFDC0_ADC2_D_P[*]}] -clock_fall -add_delay;

 

Thanks,

Brian

Guide avrumw
Registered: ‎01-23-2009

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

I have a couple of ideas...

First, are you sure this is doable? The differences in the paths inside the FPGA will already be pretty significant (since one is going through an MMCM and BUFG with clock deskew, and the other is going to a BUFGCE_DIV directly). This is in addition to the path through the ADC (from the reference clock to the DCO). All of this uncertainty has to be fairly significantly less than the 4.568ns period.

First thing is to revisit the set_clock_latency. There are a couple of things to look at.

It looks like in your attempt to add the latency to the DCO clock you constrain the I/O with a virtual clock. You definitely don't want to do this. If you do so then you are explicitly saying "the virtual clock has no variation, but the DCO clock does, so when you are looking at the timing between them (i.e. the set_input_delay), take the uncertainty into account". This is the exact opposite of what you want - you are trying to convince it that the I/O "moves" with the DCO clock. So specify the I/O with respect to the same clock that you define on the DCO input pin (RF*_ADC*_DCO_P) - so in your second set of constraints, replace virt_adcclk_02 with RF0_ADC2_DCO_P.

Next, while the description in the documentation is fairly vague, I seem to remember that the latency is treated differently if it is attached to the clock vs. if it is attached to the port. When attached to the port, it affects all the clocks downstream from that port, which means the stuff inside the FPGA. When attached to the clock, it is possible that it will apply to all things referenced to that clock, including the set_input_delay commands that use that clock. If so, then this would do what you want - change the latency of the DCO clocks, but not affect the relationship between the data and the DCO clock. This command also has the -clock option, which may also change the behavior - try combinations of this and see if they do what you want. Yours is applied to the port, so it definitely will only affect the inside of the FPGA (and not the set_input_delay commands).

So, try

create_clock -period 4.568 -name DSP1_ADC_CLK_P -waveform {0.000 2.284} [get_ports DSP1_ADC_CLK_P]
create_clock -period  4.568 -name RF0_ADC2_DCO_P -waveform {0.000  2.284} [get_ports RF0_ADC2_DCO_P]
set_clock_latency -rise -fall -source -early -0.991 -clock [get_clocks RF0_ADC2_DCO_P] [get_clocks RF0_ADC2_DCO_P]
set_clock_latency -rise -fall -source -late   0.009 -clock [get_clocks RF0_ADC2_DCO_P] [get_clocks RF0_ADC2_DCO_P]
# Edge-Aligned Double Data Rate Source Synchronous Inputs -- FROM TEMPLATE
set input_adcclk_period 4.568;             # Period of input clock (full-period)
create_clock -name virt_adcclk_02 -period $input_adcclk_period;
set skew_bre            0.280;             # Data invalid before the rising clock edge
set skew_are            0.100;             # Data invalid after the rising clock edge
set skew_bfe            0.280;             # Data invalid before the falling clock edge
set skew_afe            0.100;             # Data invalid after the falling clock edge
# Input Delay Constraint (Proto)
set_input_delay -clock RF0_ADC2_DCO_P -max [expr $input_adcclk_period/2 + $skew_afe] [get_ports {RFDC0_ADC2_D_P[*]}];
set_input_delay -clock RF0_ADC2_DCO_P -min [expr $input_adcclk_period/2 - $skew_bfe] [get_ports {RFDC0_ADC2_D_P[*]}];
set_input_delay -clock RF0_ADC2_DCO_P -max [expr $input_adcclk_period/2 + $skew_are] [get_ports {RFDC0_ADC2_D_P[*]}] -clock_fall -add_delay;
set_input_delay -clock RF0_ADC2_DCO_P -min [expr $input_adcclk_period/2 - $skew_bre] [get_ports {RFDC0_ADC2_D_P[*]}] -clock_fall -add_delay;

Second, it is a poorer solution, but you may be able to do something with the set_clock_uncertainty command. When applied to a single clock it manifests as jitter on the clock. However, you can use the -from and -to options to specify an "inter-clock" uncertainty. This isn't the same as specifying a skew (min and max), but you may be able to make that work.
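
A minimal sketch of the inter-clock form, assuming the clock names from the constraints earlier in this thread. The 0.500 ns value is a placeholder for illustration, not a figure from the design. Because the common clock passes through an MMCM, the clock that actually captures the data is the MMCM output; its name is not given in the thread, so the `-include_generated_clocks` lookup below is an assumption about how to reach it.

```tcl
# Placeholder value in ns; not taken from the design.
set xclk_unc 0.500

# Common clock plus anything generated from it (e.g. the MMCM output clock).
set common_clks [get_clocks -include_generated_clocks DSP1_ADC_CLK_P]

# Extra uncertainty on paths launched by the DCO and captured by the common clock.
set_clock_uncertainty -setup -from [get_clocks RF0_ADC2_DCO_P] -to $common_clks $xclk_unc
set_clock_uncertainty -hold  -from [get_clocks RF0_ADC2_DCO_P] -to $common_clks $xclk_unc
```

Note that a single uncertainty value tightens setup and hold by the same amount, so an asymmetric skew window (for example -0.991 to +0.009 ns) can only be approximated this way.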

Next, if all else fails, you can specify the timing requirements on the paths between the domains directly using the set_max_delay command and the set_min_delay command. Without the -datapath_only flag, the tools will use the requirement while still taking into account the clock insertion delays (so it will take into account the different paths through the clocking resources). So, let's say the ADC (and routing imbalance) have the DCO clock coming 0.5 to 1.0ns later than the clock directly to the FPGA. You could use this information for set_max_delay commands between the clocks. The set_max_delay from REF clock to DCO clock would be (period-1ns) max, and 0.5ns min (I think - you would need to check the sign on this, it could be -0.5). In the other direction (from DCO clock to REF clock) it would be the opposite.
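
As a rough sketch, the numbers in the paragraph above could be written as follows. The 0.5 and 1.0 ns skew bounds are the hypothetical values from that paragraph, the clock names come from the earlier constraints, and the `-include_generated_clocks` lookup for the MMCM output is an assumption. The sign of the min value and the direction of each pair are explicitly unverified, as noted above, so treat this as a starting point to check against a timing report rather than a known-good constraint set.

```tcl
set period   4.568   ;# clock period from the create_clock constraints (ns)
set skew_min 0.500   ;# hypothetical earliest DCO arrival after the reference clock (ns)
set skew_max 1.000   ;# hypothetical latest DCO arrival after the reference clock (ns)

set ref_clks [get_clocks -include_generated_clocks DSP1_ADC_CLK_P]
set dco_clk  [get_clocks RF0_ADC2_DCO_P]

# Reference-clock domain to DCO domain, using the numbers from the paragraph above.
# The sign of the min value is unverified (it may need to be negative).
set_max_delay -from $ref_clks -to $dco_clk [expr {$period - $skew_max}]
set_min_delay -from $ref_clks -to $dco_clk $skew_min
```

The DCO-to-reference direction would need its own set_max_delay/set_min_delay pair. Neither command uses -datapath_only here, so the clock insertion delays remain part of the analysis.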

Finally, this is UltraScale+. In UltraScale+, the PHASESHIFT_MODE of the MMCM defaults to LATENCY (not WAVEFORM, which is what it did for all previous families). For this example, where you are crossing between clocks that do and don't go through the MMCM (and are trying to let the tools account for the skew between the clocks), I am pretty sure you want the MMCM to be in WAVEFORM mode - take a look at this post on PHASESHIFT_MODE. To change it, you need to set the property on the MMCM through the XDC file.

set_property PHASESHIFT_MODE WAVEFORM [get_cells <instance_name_of_mmcm>]
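
Before changing the property, it may help to see what the MMCMs are currently set to. The loop below is a small sketch meant for the Vivado Tcl console on an opened synthesized or implemented design (it is not XDC). The `REF_NAME` filter pattern is an assumption about the primitive name and may need adjusting.

```tcl
# List every MMCM in the design with its current PHASESHIFT_MODE.
# The REF_NAME pattern is an assumption; adjust it to the primitive actually used.
foreach mmcm [get_cells -hierarchical -filter {REF_NAME =~ MMCME*}] {
    puts "$mmcm : [get_property PHASESHIFT_MODE $mmcm]"
}
```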

Let me know what you try and what results you get!

Avrum

Observer bgiesing
Registered: ‎05-09-2017

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

Thanks Avrum -

Firstly: no, we are not sure this is even possible.  Hopefully the tool will be able to tell us.

Secondly: one point of clarification with respect to set_clock_latency; should we be using -early/-late or -min/-max?  In previous answers, it seemed -min/-max was used.  I do not fully understand the implications of one method versus the other.

Once we are straight on that, we will start running some builds with your other suggestions.

Thanks again,

Brian

Observer bgiesing
Registered: ‎05-09-2017

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

First build finished.  Some progress, but still not quite there.  Here are the relevant constraints used:

 

create_clock -period  4.568 -name DSP1_ADC_CLK_P  [get_ports DSP1_ADC_CLK_P]
create_clock -period  4.568 -name RF0_ADC0_DCO_P [get_ports RF0_ADC0_DCO_P]
set_clock_latency -rise -fall -source -early -1.061 -clock [get_clocks RF0_ADC0_DCO_P] [get_clocks RF0_ADC0_DCO_P]
set_clock_latency -rise -fall -source -late  -0.061 -clock [get_clocks RF0_ADC0_DCO_P] [get_clocks RF0_ADC0_DCO_P]

# Edge-Aligned Double Data Rate Source Synchronous Inputs -- FROM TEMPLATE
set input_adcclk_period 4.568;             # Period of input clock (full-period)
set skew_bre            0.280;             # Data invalid before the rising clock edge
set skew_are            0.100;             # Data invalid after the rising clock edge
set skew_bfe            0.280;             # Data invalid before the falling clock edge
set skew_afe            0.100;             # Data invalid after the falling clock edge
# Input Delay Constraint
set_input_delay -clock RF0_ADC0_DCO_P -max [expr $input_adcclk_period/2 + $skew_afe] [get_ports {RF0_ADC0_D_P[*]}];
set_input_delay -clock RF0_ADC0_DCO_P -min [expr $input_adcclk_period/2 - $skew_bfe] [get_ports {RF0_ADC0_D_P[*]}];
set_input_delay -clock RF0_ADC0_DCO_P -max [expr $input_adcclk_period/2 + $skew_are] [get_ports {RF0_ADC0_D_P[*]}] -clock_fall -add_delay;
set_input_delay -clock RF0_ADC0_DCO_P -min [expr $input_adcclk_period/2 - $skew_bre] [get_ports {RF0_ADC0_D_P[*]}] -clock_fall -add_delay;

With those constraints, we failed the data capture at the IO.  It appears the tool applied the "late" clock source latency to the data path, but the "early" clock source latency to the clock path.  Again, in our system these should be linked:

image (31).png

However, what's encouraging is that it does appear the tool properly applied the "late" latency at the clock domain crossing (I need to find the "early" corner too):

image (32).png
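
One way to look for the other corner is to request the hold (min) analysis on the same crossing from the Tcl console. This is a sketch only: the DCO clock name is taken from the constraints in this post, and the lookup of the MMCM output clock through `-include_generated_clocks` is an assumption, since the generated clock's name is not shown in the thread.

```tcl
# MMCM output clocks derived from the common clock (this lookup is an assumption).
set common_clks [get_clocks -include_generated_clocks DSP1_ADC_CLK_P]

# Worst hold paths from the DCO domain into the common-clock domain.
report_timing -from [get_clocks RF0_ADC0_DCO_P] -to $common_clks \
    -delay_type min -max_paths 5
```

Comparing the source clock latency entries in this report with the setup report above would show whether the two latency extremes are each being used.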

 

 

Observer bgiesing
Registered: ‎05-09-2017

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

We replaced the -early/-late options in the set_clock_latency constraints with -min/-max.  We got the same results, and as far as we can tell there is no difference.  The data capture at the I/O fails because of the added skew between clock and data, but the clock-domain crossings are being analyzed and passing.

image (34).png

image (33).png

So, we are still trying to figure out how to properly constrain the data capture at the I/O in such a manner that the data follows the clock.

Observer bgiesing
Registered: ‎05-09-2017

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

Quick note: we are adding the -source_latency_included switch to the set_input_delay constraints.  We will post results in a few hours, once the build completes.

Observer bgiesing
Registered: ‎05-09-2017

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

Adding the -source_latency_included switch to the set_input_delay constraints only did half of what we wanted: it no longer skews the data to one extreme, but it still skews the clock to the other extreme.  Thus, we still fail timing at the I/O data capture:

image (35).png
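
The modified constraints are not shown in this post. Presumably they resemble the sketch below, which reuses the variable names and values from the earlier build and simply adds the switch to each set_input_delay line. The option placement is an assumption.

```tcl
set input_adcclk_period 4.568   ;# period of input clock (full period)
set skew_bre 0.280              ;# data invalid before the rising clock edge
set skew_are 0.100              ;# data invalid after the rising clock edge
set skew_bfe 0.280              ;# data invalid before the falling clock edge
set skew_afe 0.100              ;# data invalid after the falling clock edge

# Same input delays as before, now declared as already including the clock source latency.
set_input_delay -clock RF0_ADC0_DCO_P -source_latency_included -max [expr {$input_adcclk_period/2 + $skew_afe}] [get_ports {RF0_ADC0_D_P[*]}]
set_input_delay -clock RF0_ADC0_DCO_P -source_latency_included -min [expr {$input_adcclk_period/2 - $skew_bfe}] [get_ports {RF0_ADC0_D_P[*]}]
set_input_delay -clock RF0_ADC0_DCO_P -source_latency_included -max [expr {$input_adcclk_period/2 + $skew_are}] [get_ports {RF0_ADC0_D_P[*]}] -clock_fall -add_delay
set_input_delay -clock RF0_ADC0_DCO_P -source_latency_included -min [expr {$input_adcclk_period/2 - $skew_bre}] [get_ports {RF0_ADC0_D_P[*]}] -clock_fall -add_delay
```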

Observer bgiesing
Registered: ‎05-09-2017

Re: Constraining Clock Domain Crossing for Related Phase Shifted Clocks

Still no luck...

This time, we tried to add the set_clock_latency constraint to the common clock, encompassing the skews across all 18 of the incoming DCOs.  The tool says it passed timing, and that the latency constraint was accepted, but there is no evidence of the skew at the domain crossing.  Perhaps the MMCM is hiding it?

image (36).png

image (37).png
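
To check whether the MMCM is hiding the latency, one option is to expand the destination clock path in the report so that the segment from the input port through the MMCM is listed explicitly. A sketch for the Tcl console follows. RF0_ADC0_DCO_P stands in for any one of the 18 DCO clocks, and the `-include_generated_clocks` lookup for the MMCM output clock is an assumption.

```tcl
# MMCM output clocks derived from the common clock (this lookup is an assumption).
set common_clks [get_clocks -include_generated_clocks DSP1_ADC_CLK_P]

# Expand the clock paths back to their master sources so that any source
# latency applied at the DSP1_ADC_CLK_P port would appear as its own entry.
report_timing -from [get_clocks RF0_ADC0_DCO_P] -to $common_clks \
    -delay_type max -path_type full_clock_expanded -max_paths 2
```

If the expanded destination clock path shows no latency entry ahead of the MMCM, that would suggest the constraint is not reaching the generated clock.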
