Scholar ronnywebers
Registered: ‎10-10-2014

XPM_CDC_HANDSHAKE - how to determine DEST_SYNC_FF and SRC_SYNC_FF


In UG953 v2016.4, I discovered some CDC macros; they seem to be new in the latest Vivado version.

Q: I'm wondering how one correctly determines the values for DEST_SYNC_FF and SRC_SYNC_FF, e.g. for a destination clock of 100 MHz and a source clock of 30 MHz (or vice versa).


Accepted Solutions
Historian
Registered: ‎01-23-2009


This is a question that has no hard answer.

The Mean Time Between Failures (MTBF) of the CDC is determined by the number of synchronizer flip-flops, the clock frequencies, the activity rate (how many clock crossing events per second are handled), and the slack on the paths between the synchronizer flip-flops. Your goal is to use the number of flip-flops that results in an MTBF (from all synchronizers and other factors) that is high enough to meet your system requirements.

The best way to do this is to try a value and then use the report_synchronizer_mtbf command in Vivado (which only works for UltraScale and UltraScale+ devices). If the resulting MTBF is high enough, then you are OK.

However, at the frequencies you are looking at, 2 is probably enough and 3 is almost certainly enough.

For this particular CDC (XPM_CDC_HANDSHAKE), the number of flip-flops has a significant impact on the number of events per second that can cross the CDC boundary. For other CDCs, there is a weaker or even non-existent link (the number of FFs simply increases the latency, not the throughput).

You should note that XPM_CDC_HANDSHAKE is the most complicated and slowest (in terms of number of events per second) CDC. The handshake function is rarely necessary when moving data between clock domains. In general, it is preferable to use the simplest CDC that will work in your application; in my experience that is rarely a CDC with handshake...


Avrum

10 Replies

Scholar ronnywebers
Registered: ‎10-10-2014


thanks @avrumw, I was thinking the number of FFs was determined by the difference in frequency - I thought that a pulse on src_send would need to be detectable in the destination clock domain. But that doesn't seem to be the case (?). The src_send pulse is probably latched in the src_clk domain, and later cleared by the dest_clk domain?

so the number of FFs is the actual number of synchronizer FFs, with a higher number of FFs improving MTBF.

but src_clk and dest_clk can have any relationship then? fast to slow, or slow to fast domain?

I agree that the handshake is the most complicated - I need to transfer a 16-bit number from one (fast) domain to the other (slower); there's a write enable pulse of 1 clock cycle in the fast domain, so it goes high for 1 src clock cycle when the value is updated. Guess I could use XPM_CDC_PULSE for that, to transfer the pulse to the other domain. Of course I'll need to latch the data in the src domain first.

Historian
Registered: ‎01-23-2009


"The src_send pulse is probably latched in the src_clk domain, and later cleared by the dest_clk domain?"

(Again, this is the problem with the XPM_CDC macros - you don't really know how they are implemented...)

But I would expect that the handshake is basically the same as the toggle synchronizer I showed, but with the "toggle" synchronized back to the source domain as well. So the toggle is generated in the source domain, crossed to the destination domain (where it generates the outgoing pulse), and then crossed back to the source domain (where it generates the acknowledge).

The data that goes with it is registered in the source domain (to keep it stable). When the toggle pulse arrives in the destination domain, the data is captured in the destination domain, knowing that it is stable (due to the handshaking).

 

"so the number of FFs is the actual number of synchronizer FFs, with a higher number of FFs improving MTBF."

 

Yes

 

"I agree that the handshake is the most complicated - I need to transfer a 16-bit number from one (fast) domain to the other (slower), there's a write enable pulse of 1 clock cycle in the fast domain. So it goes high for 1 src clock cycle when the value is updated. Guess I could use XPM_CDC_PULSE for that, to transfer the pulse to the other domain. Of course I'll need to latch the data in the src domain first."

I presume the CDC_HANDSHAKE will work in this situation, but it will be quite slow - the interface will be limited to one data item every several clocks (a combination of source and destination clocks - probably something like 3 source and 3 destination). If you can live with this throughput then it's fine.
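
As a rough illustration of that throughput limit, the sketch below estimates the transfer rate for the 100 MHz / 30 MHz clocks from the original question. The cost of 3 source plus 3 destination clocks per transfer is only the guess made above, not documented XPM_CDC_HANDSHAKE behaviour, and the function name is made up for this example.

```python
def handshake_transfers_per_second(src_clk_hz, dest_clk_hz,
                                   src_cycles=3, dest_cycles=3):
    """Estimate how many handshake transfers fit into one second.

    Assumes each transfer costs a fixed number of source and destination
    clock cycles (3 + 3 by default, a guess, not a datasheet figure).
    """
    seconds_per_transfer = src_cycles / src_clk_hz + dest_cycles / dest_clk_hz
    return 1.0 / seconds_per_transfer


# Clocks from the original question: 100 MHz on one side, 30 MHz on the other.
rate = handshake_transfers_per_second(100e6, 30e6)
print(f"about {rate / 1e6:.1f} M transfers/s")
```

Under these assumptions the handshake moves roughly one word every 130 ns, well below the rate of the slower clock, which is the gap a clock crossing FIFO closes.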

 

However, at some point it just makes sense to use a clock crossing FIFO, either implemented in the fabric using a RAMB18/36 or distributed RAM, or using the FIFO18/36 directly. This has the advantage of being able to pass one data item per cycle of the slower clock (on average).

 

Avrum

Scholar ronnywebers
Registered: ‎10-10-2014


 

you're right about the FIFO implementation, I'm also using that, e.g. for ADC data; it's fast and easier to use.

 

but for slower stuff (like a cpu IO register) a simple CDC does the job. Thanks for all the tips!

 

Scholar ronnywebers
Registered: ‎10-10-2014


@avrumw, in a recent discussion with a colleague, he reasoned about the number of sync stages needed in a CDC as follows:

1) for signals coming into an FPGA through an input pin, he always clocks the signals using 2 FFs. Depending on the rise time of the signal, you'll get more or less chance of hitting metastability events: if the rise time is slow compared to the internal clock, the chance of creating a metastability event becomes larger. But 2 FFs should do the job.

2) but for 'internal' clock domain crossing, he states that simply 'reclocking' the signal from one domain into the other using a single FF is always sufficient, as the edges of signals inside an FPGA rise (very) fast compared to external input signals, and hence the chance of metastability on internal FPGA signals is negligible. The reclocking into the destination domain would only be necessary if the signal is used in more than one place, to make sure that how the signal is interpreted does not depend on routing delay differences between the 2 or more destinations.

I personally think that - looking at the formula for metastability here :

1) the number of stages depends on both the frequency of the external signal and the internal clock: the higher the frequencies/toggle rates, the greater the chance of hitting a metastability event. Thus you'll need more stages when the frequency (toggle rate) of either signal increases.

2) for internal clocking I think that again, depending on the frequency difference, 2 or more stages might be needed. I think it's true that the chance of metastability events is smaller when edges rise faster, but always using just 1 single stage is not sufficient.

can you give your opinion on this please?

also, referring to the Xilinx technote, are C1 and C2 known for Zynq / 7-series devices?

 

Teacher muzaffer
Registered: ‎03-31-2012


@ronnywebers

 

1) "for signals coming into an FPGA through an input pin, ... But 2 FFs should do the job." NO, or not necessarily.

2) "but for internal clock domain crossing, ... using a single FF is always sufficient." NO.

In either case, there is no absolute guarantee. In the long term, one FF is never a good idea; two is the minimum.

Depending on the source and target clock rates, the toggle rate of the source data, the distance between the source and destination registers, and the desired MTBF, the number of flops needed may go up to 7 according to Xilinx (ask your FAE for the study spreadsheet).

 

Scholar markcurry
Registered: ‎09-16-2009


 

I can't really see the argument for going below 2 flops. I mean, it's one extra FF (per synchronizer). This buys you a large gain - you basically square a small probability of failure, resulting in a very small probability of failure.

 

As bang for the buck goes, it's hard to do better.

 

My 2 cents...

 

Regards,

 

Mark

 

 

Historian
Registered: ‎01-23-2009


"1) for signals coming into an FPGA through an input pin, he always clocks the signals using 2 FFs. Depending on the rise time of the signal, you'll get more or less chance of hitting metastability events: if the rise time is slow compared to the internal clock, the chance of creating a metastability event becomes larger. But 2 FFs should do the job."

"2) but for 'internal' clock domain crossing, he states that simply 'reclocking' the signal from one domain into the other using a single FF is always sufficient, as the edges of signals inside an FPGA rise (very) fast compared to external input signals, and hence the chance of metastability on internal FPGA signals is negligible. The reclocking into the destination domain would only be necessary if the signal is used in more than one place, to make sure that how the signal is interpreted does not depend on routing delay differences between the 2 or more destinations."

 

So, I don't really agree with the above - I will go into some more details below, but I feel it is important to state (pretty unequivocally) that number 2) is WRONG - pretty much dead wrong. If you design this way you are creating inherently failure prone devices. DO NOT DO THIS!

 

The first thing to realize is that there are two things that we need to worry about when talking about metastability

  a) the probability of creating a metastable event

  b) the characteristics of the resolution of the metastable event

These two are not the same thing, and different things affect them. They only combine together when we discuss the mean time between failures (MTBF) of CDCs.

 

The probability of a metastable event (a) is related to

  - the number of transitions per second the asynchronous signal makes

  - the slope of the signal (to a certain extent)

  - the gain characteristics of the sampling flip-flop (which are, in turn, related to the setup/hold requirements of the FF)

 

It is specifically not dependent on the clock frequency of the sampling flip-flop.

 

The characteristics of the resolution of a metastable event (b) depend on

  - time (almost exclusively)

  - the gain characteristics of the flip-flop

 

It is not dependent on the characteristics of the input signal (that created the metastability event in the first place).

 

So looking at (a), it is true that an input pin with a low slope will be (somewhat) more susceptible than an internal signal, but

  - only partially; the input signal actually drives the IBUF which has a fair gain - so even a slow changing input signal will have a more reasonable slope after the IBUF

     - if the slope is really bad, the output of the IBUF may oscillate (bounce), which will increase the probability of the metastable event also

  - even a very fast slope (in fact, theoretically an infinite slope) can still cause a metastable event

      - any time the setup/hold window of the flip-flop is violated, you can go metastable.

 

So it is absolutely untrue that you don't need 2FF synchronizers internally due to the fast slope. Even if you don't buy the argument, this is pretty much a proven fact - failure to use multi-stage synchronizers internally leads to system failures!

 

So, whenever you are dealing with asynchronous signals (be they truly asynchronous - i.e. from an outside source, or synchronous to a different clock) metastability can occur. You have to accept that. The goal is to reduce the probability that this metastable event can cause a system failure.

 

A metastable signal causes system failure due to inconsistency - multiple receivers of the metastable signal can interpret the signal differently - some receivers will see it as a 0 and others as a 1 (and the arrival time of the "value" that ultimately makes the decision is variable). To avoid this, there is really one rule - never let a potentially metastable signal be sampled by more than one load. It is for this reason (and pretty much this reason alone) that we use multi-stage FF chains for synchronization. If the first FF is metastable, then the second FF is the only load that samples it. As long as the second FF doesn't go metastable, we are OK. But the probability of the 2nd FF going metastable has to do with time - how long after the first FF went metastable does the second one sample it? The answer is "one destination clock period". So, depending on the clock period of the sampling domain, one clock period may or may not be enough to result in a low enough probability of the second FF going metastable. If it isn't, then we add a 3rd or 4th. This is purely dependent on the frequency of the sampling clock (and the characteristics of the FFs).
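
To put rough numbers on this, here is a minimal sketch using the commonly used exponential model, in which the chance that a metastable state is still unresolved after time t is exp(-t / tau). The model, the function name, and especially the value of tau are assumptions for illustration only; real values depend on the device and are not given in this thread.

```python
import math


def unresolved_probability(n_ff, dest_clk_hz, tau_s):
    """Chance that a metastable first FF has still not resolved when the
    last FF of an n_ff-stage chain samples.

    Simplification: every stage after the first adds one full destination
    clock period of settling time (routing delay and setup time ignored).
    """
    settling_time_s = (n_ff - 1) / dest_clk_hz
    return math.exp(-settling_time_s / tau_s)


TAU_S = 100e-12  # hypothetical resolution time constant, NOT a device value

for clk_hz in (100e6, 500e6):
    for n_ff in (2, 3, 4):
        p = unresolved_probability(n_ff, clk_hz, TAU_S)
        print(f"{clk_hz / 1e6:.0f} MHz, {n_ff} FFs: {p:.3e}")
```

In this model each extra stage multiplies the probability by the same factor, so going from 2 to 3 FFs squares it, and a faster sampling clock leaves less settling time per stage. That is why the stage count depends on the destination clock frequency.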

 

Finally, the MTBF is computed from the two factors above (a and b) together - how often does the metastable event occur, and, each time it does occur, how likely is it not to resolve before causing a system failure.

 

Avrum

Visitor logicmeister
Registered: ‎09-25-2018


Hi Avrum,

 

I was wondering if you could clarify the comment "It is specifically not dependent on the clock frequency of the sampling flip-flop." in your reply. Maybe we are looking at the problem differently or using different terms.

 

My understanding is that if you are launching data from FF1/Q on Clk1, asynchronously capturing at FF2/D on Clk2 (which has no common base period with Clk1), the destination frequency does affect the probability of failure (and thus contribute to the MTBF).

 

Assume all parameters on the launching side (Clk1, activity rate on the FF1/Q, slope of output signal) and all parameters at FF2 (Tsu, Th, process, voltage, temperature, "Tau" of the silicon, gain, etc...) remain constant, except for the frequency of capture Clk2.

 

If Clk2 = 0 Hz (DC), there is zero probability of a failure, because data is never sampled.

 

If FF2 has Tsu = 1ns and Th = 1ns, the window in which Tsu/Th could be violated is 2ns wide.

 

If the frequency of Clk2 is 500 MHz (2ns), then every sample taken could go metastable (if the input is changing), because there is no way Tsu/Th can ever be met. (Assuming zero Clk-Q and routing delay, this would be an Fmax limit of FF2.)

 

If Clk2 = 250 MHz (4ns), then data changing at FF2/D has a 50% chance (2ns bad out of 4) of violating the Tsu/Th.

 

If Clk2 = 100 MHz (10ns), then data changing at FF2/D has a 20% chance (2ns bad out of 10) of violating the Tsu/Th.

 

etc...
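
The arithmetic in the examples above can be reproduced with a few lines. This is only a sketch of the model described in this post: it assumes data edges land uniformly across the capture clock period, and the function name is made up for the example.

```python
def violation_fraction(clk_hz, tsu_s, th_s):
    """Fraction of the capture clock period covered by the Tsu/Th window."""
    if clk_hz == 0:
        return 0.0  # DC clock: nothing is ever sampled
    window_s = tsu_s + th_s
    period_s = 1.0 / clk_hz
    return min(1.0, window_s / period_s)


# Tsu = Th = 1 ns, as in the examples above.
for clk_mhz in (0, 100, 250, 500):
    frac = violation_fraction(clk_mhz * 1e6, tsu_s=1e-9, th_s=1e-9)
    print(f"{clk_mhz} MHz: {frac:.0%}")
```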

 

The less often you open the window, the less chance you have of letting something bad in. :)

 

 

The frequency of the launching clock does not directly affect the probability of failure, FIT (Failures in Time) or MTBF (Mean Time Between Failures), but the toggle rate (data frequency) of the source does. The less often it changes (e.g., enabled once every 10 clocks), the less chance there is of a failure.

 

By implication, if FF1/Q is not gated with something else that is completely asynchronous or otherwise unclocked, then the data frequency will be less than or equal to the clock frequency it's derived from (typically < Clk1/2 or < Clk1/N where N is some integer).

 

At least that's my understanding. I've been enjoying and benefiting from your posts here for years (under outdated usernames) and look forward to your input on this one.

 

Cheers!

 

Registered: ‎01-22-2015


@logicmeister

 

      The less often you open the window, the less chance you have of letting something bad in. :)

I think you have the essence of it, but… Ran Ginosar in his paper <here> develops a formula for calculating the rate of entering metastability as follows. First, he assumes that data edges (transitions) are uniformly distributed over the logic clock period, TC, and that the interval for violating Tsu/Th has time-width TW. Thus, he argues that the probability of any data transition causing metastability is TW/TC. Finally, he multiplies this by the rate, FD, at which the data is changing (since it may not be changing on every clock cycle). So, his final calculation is:

 

            Rate of entering metastability = FD*TW/TC

 

However, I sometimes wonder about Ginosar’s initial assumption that “data edges are uniformly distributed over the logic clock period”.  In a digital system where metastability is a problem, I argue that data transitions are probably grouped very near the TW window.  So, I think it makes more sense to talk about jitter and how it can push each data edge into or out of the TW window.  From this viewpoint, I think that Ginosar’s formula becomes independent of TC and a new term must be added to make the formula dependent on clock jitter.

 

Said another way, imagine a digital system with no clock jitter and data edges aligned in the center of the TW window. Then, the rate of entering metastability is FD, regardless of the clock period.

 

In his paper, Ginosar goes on to develop the rate of failure for a 2-Flip-Flop (2FF) synchronizer. He does this by multiplying the above formula by the probability that metastability will not settle within the time the 2FF allows. This multiplicative term depends on TC since we expect the 2FF to resolve the metastability in one clock cycle.
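
A minimal sketch of that calculation follows, assuming the settling probability takes the usual exponential form exp(-TC / tau), with one full clock period available for settling. The symbol tau, the function names, and all numeric values are assumptions for illustration; none of them come from a datasheet or from this thread.

```python
import math


def metastability_rate(f_data_hz, t_window_s, t_clk_s):
    """Rate of entering metastability, FD * TW / TC (events per second)."""
    return f_data_hz * t_window_s / t_clk_s


def mtbf_2ff(f_data_hz, t_window_s, t_clk_s, tau_s):
    """MTBF of a 2FF synchronizer, assuming one clock period of settling."""
    p_unresolved = math.exp(-t_clk_s / tau_s)
    failure_rate = metastability_rate(f_data_hz, t_window_s, t_clk_s) * p_unresolved
    return 1.0 / failure_rate


# Hypothetical numbers for illustration only, not taken from any datasheet.
FD = 10e6     # data transitions per second
TW = 20e-12   # width of the Tsu/Th violation window, seconds
TAU = 50e-12  # resolution time constant, seconds

for tc in (10e-9, 2e-9):  # 100 MHz and 500 MHz sampling clocks
    rate = metastability_rate(FD, TW, tc)
    print(f"TC = {tc * 1e9:.0f} ns: rate = {rate:.3g}/s, "
          f"MTBF = {mtbf_2ff(FD, TW, tc, TAU):.3g} s")
```

With these made-up values the exponential term dominates: shortening TC raises the entry rate only linearly but cuts the MTBF by many orders of magnitude.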

 

So, perhaps Avrum was referring to the fact that the probability of entering metastability may not depend on the clock period. However, I think that the probability of resolving metastability does depend on the clock period.

 

Cheers,
Mark