573 Views
Registered: ‎01-22-2015

async clocks conundrum

begin: suggestion

By default, Vivado considers all clocks to be synchronous.  So, when we attempt a direct transfer of data between two clock domains (aka clock-crossing), Vivado will perform timing analysis on the associated inter-clock paths.

When two clocks are known to be asynchronous, we have two choices:

1) We allow Vivado to think the two clocks are synchronous:  Thus, if we forget the two clocks are asynchronous and we attempt a direct clock-crossing, timing analysis may sometimes incorrectly tell us that the direct clock-crossing is possible  - because Vivado doesn’t know that the clocks are asynchronous.

2) We tell Vivado that the two clocks are asynchronous (using the set_clock_groups exception): Thus, if we forget the two clocks are asynchronous and we attempt a direct clock-crossing, timing analysis will incorrectly issue no warning  -  because the set_clock_groups constraint turns off timing analysis for all inter-clock paths between the two clock domains.

So, when two clocks are asynchronous, we are damned if we do (tell Vivado the clocks are asynchronous) and damned if we don’t.
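
As a minimal sketch of the two choices in XDC form (the clock names, port names and periods below are hypothetical and are not taken from any design in this thread):

```tcl
# Hypothetical clocks from two unrelated oscillators (names and periods are assumptions)
create_clock -name clk_a -period 10.000 [get_ports clk_a_in]
create_clock -name clk_b -period 8.000  [get_ports clk_b_in]

# Choice 1: stop here. Vivado times clk_a <-> clk_b paths as if the clocks were related.

# Choice 2: declare the clocks asynchronous. Timing analysis is then skipped on
# every path between the two groups, in both directions.
set_clock_groups -asynchronous -group [get_clocks clk_a] -group [get_clocks clk_b]
```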

I suggest that a new constraint (perhaps set_clocks_async) is needed that would simply identify asynchronous clock pairs.  Further, use of set_clock_groups should be officially discouraged (as Avrum has long suggested).  Using information from the new set_clocks_async constraint, timing analysis would then fail/flag each inter-clock path between async clock domains unless a more “targeted” timing exception (e.g. set_max_delay -datapath_only) has been placed on the path.

end: suggestion

13 Replies
Scholar markcurry
Scholar
569 Views
Registered: ‎09-16-2009

Re: async clocks conundrum

I strongly disagree that set_clock_groups -async should be discouraged.  Avrum and I have gone back and forth in a few of the forum posts regarding this.  It's better for a hallway conversation than posts.  But our group encourages, and strongly suggests, using set_clock_groups -async for clocks that are truly asynchronous.  All our FPGAs have them.

Xilinx has added a set_bus_skew command that supersedes async clock groups (i.e. this constraint is validated even if clocks are tagged as async).  We successfully use these for our CDC paths.

I agree that there should be more literature regarding these various timing strategies. There's a lot of info buried in these forums in multiple places, and elsewhere, but nothing really that brings it all together.  Scoped XDCs are a strong tool too that should be added to the list.

Regards,

Mark

0 Kudos
Teacher drjohnsmith
Teacher
561 Views
Registered: ‎07-09-2009

Re: async clocks conundrum

Being a lazy person, and loving the C++ encapsulation idea:
would it be good if we could instantiate a block that did the clock crossing?
We have macros in some devices.
The instantiated block would then have the constraints all wrapped up by the Xilinx experts...
554 Views
Registered: ‎01-22-2015

Re: async clocks conundrum

@drjohnsmith  Thank you!  Yes, Xilinx IP for this would be welcomed by some.  As you note, the XPM_CDC macros (ref UG953) create the CDC structures for some devices.  Today, I tried using XPM_CDC_ARRAY_SINGLE to create a simple two-flop synchronizer.  Instantiating it into my design was easy and the macro automatically wrote a “False Path” timing exception for the path coming into the synchronizer (although, I can’t find the location of this constraint).  I’m not a big fan of IP/macros – but nice to know they are available.

@markcurry  Thank you!  I’m pretty solidly in Avrum’s camp on this one – but will keep an open mind and try to find/read discussions on Forum that you had with Avrum.  Thanks for introduction to set_bus_skew!  Although most of chapter 6 in UG903 is devoted to this constraint, I confess to still not understanding it fully –  the UG835 discussion of set_bus_skew was more readable – I'll continue studying it.

Mark

Guide avrumw
Guide
547 Views
Registered: ‎01-23-2009

Re: async clocks conundrum

markg@prosensing.com

I doubt that Xilinx is going to add another command in this area.

I do agree with your assessment that neither solution (on its own) is acceptable:

  • Allowing the tool to treat known asynchronous clocks as synchronous, which may result in failing or passing (incorrect) timing analysis
  • Using set_clock_groups -asynchronous, which turns off timing analysis on all paths between the clocks

But there are other tools already in place for dealing with things like this. Xilinx has two commands that help with this analysis:

  • report_clock_interaction (particularly the graphical version of it)
  • report_cdc

While timing analysis doesn't have any concept of "structurally related", these two commands do - they realize that if two clocks can be traced back to a common clock source (either an MMCM/PLL, or even different MMCM/PLLs clocked by the same clock input) then they are structurally related. If they trace back to different clock input ports then they are not.

The report_clock_interaction shows the relationships between all clocks in the design with respect to their constraints. For every pair of clocks it identifies if they are structurally related (i.e. "safe"), and/or have exceptions or not (calling particular attention to clocks that are not "safe" and have no exceptions).

The report_cdc command looks at all paths between clock domains. For every path where the source and destination clock are not structurally related, it looks at the structure of the path (and the paths around it) to determine if this is a known and valid clock domain crossing circuit (CDCC). If it doesn't recognize it as a valid CDCC, then it will issue warnings - and this includes checking that the metastability flip-flops have the ASYNC_REG property attached to them.
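
A minimal sketch of how these two reports might be run from the Vivado Tcl console on an open design. The report file names and the synchronizer cell pattern are assumptions chosen for illustration:

```tcl
# Run on an open synthesized or implemented design. File names are arbitrary.
report_clock_interaction -file clock_interaction.rpt
report_cdc -details -file cdc.rpt

# report_cdc expects ASYNC_REG on the synchronizer flip-flops; it can be set in XDC.
# The name pattern below is a hypothetical synchronizer register name.
set_property ASYNC_REG TRUE [get_cells -hierarchical -filter {NAME =~ *sync_ff_reg*}]
```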

By the way, there is a "trick" that can be used (although I am not sure that I recommend it). The issue with the first case (leaving the tools to analyze the paths between structurally unrelated clocks normally) is that they won't necessarily end up with a timing failure if you have no exceptions (i.e. you missed a path). If you have one clock that comes from a 100MHz oscillator and another one that comes from an unrelated 200MHz oscillator, then the path between them will have a requirement of 5ns and hence will likely pass timing - even though they are illegal since the clocks are unrelated. The "trick" is to not do this. For example, constrain your 200MHz clock to 5ns, but your 100MHz clock to 9.99ns. Now, the paths between them will result in a 0.01ns requirement (or something similar) which will result in a timing failure.
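
The "trick" could be sketched in XDC roughly as follows. The clock and port names are hypothetical; only the two period values come from the description above:

```tcl
# Unrelated 200 MHz oscillator; port and clock names are assumptions.
create_clock -name clk_200 -period 5.000 [get_ports clk_200_in]

# Unrelated 100 MHz oscillator, deliberately constrained to 9.990 ns instead of 10.000 ns.
# With ideal edges, clk_100 rises at 9.990 ns and clk_200 rises at 10.000 ns,
# so the tightest launch-to-capture separation is about 0.010 ns and any
# unconstrained path between the two clocks should fail timing.
create_clock -name clk_100 -period 9.990 [get_ports clk_100_in]
```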

@markcurry ,

We will probably always disagree - at least somewhat - on this.

I agree that what you are doing (using the set_clock_groups -asynchronous and relying on the set_bus_skew command) is at least mostly valid. For many CDCCs, controlling the bus skew is really the thing that is required in order to ensure that the CDCC is functionally valid, and, as you said, the set_bus_skew constraint is not a "timing" constraint, and hence is not overridden by the set_clock_groups -asynchronous command.

But the underlying problem remains - with the set_clock_groups -asynchronous, no other timing constraints can be placed on the clock domain crossing paths. This can cause several problems:

  • If you accidentally miss a CDCC on a clock domain crossing path, then you will get no failing timing report
    • Now, it is worth mentioning that relying on report_timing to identify missing CDCCs is not a sufficient methodology, but it's always good to have a backup
    • The report_clock_interaction will not flag this as a problem since the path has a "false_path" constraint on it (so it won't be marked as "unsafe" with no exceptions)
    • The report_cdc will still identify it as a bad CDCC
  • If you are using an "older" IP that hasn't migrated to the set_bus_skew methodology, and is still relying on set_max_delay -datapath_only (and I don't know if any of these still exist), then these IPs will be underconstrained (and hence can fail)
  • Even if you are using newer IP (for example the FIFO wizard using BRAM based FIFOs with asynchronous clocks), some IP use both constraints - set_bus_skew and set_max_delay -datapath_only
    • These are really doing two different things - the set_bus_skew ensures that the CDCC functions properly
    • The set_max_delay -datapath_only is limiting the latency of the CDCC
      • Without the set_max_delay -datapath_only, your CDCC will function, but the latency is unconstrained. This is true for even the "simple single bit slow changing CDCC" - without the set_max_delay -datapath_only the latency is not constrained
      • This means that the time required to get through the CDCC may be long - some applications may not care, others may. If you have one that does care, and you have the set_clock_groups -asynchronous, then there is no way to limit it

So, if we really want to be "safest" we should (as the Xilinx IP does)

  • Use set_bus_skew with a value of less than one clock period (depending on the CDCC it's one source period, one destination period, or the smaller of the two)
    • This ensures that MUX/CE and Gray Code style clock crossers work
  • Use set_max_delay -datapath_only to limit the latency through the CDCC
    • The value may be different than the set_bus_skew value, but in most cases, you can set them to be the same
  • (and never use the set_clock_groups -asynchronous command!)
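
As a rough sketch of this "safest" combination for a single crossing, assuming a gray-code pointer moving from a 10 ns source clock to an 8 ns destination clock (the register names and values are hypothetical, not taken from any Xilinx IP):

```tcl
# Hypothetical source and first-stage destination registers of the crossing.
set src_regs [get_cells {wr_ptr_gray_reg[*]}]
set dst_regs [get_cells {rd_sync_stage0_reg[*]}]

# Keep all bits of the bus within one period (here the smaller of the two) of each other.
set_bus_skew -from $src_regs -to $dst_regs 8.000

# Bound the latency through the crossing; -datapath_only ignores clock skew.
set_max_delay -datapath_only -from $src_regs -to $dst_regs 8.000
```

Note that the set_max_delay line only has an effect if the two clocks are not also covered by set_clock_groups -asynchronous.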

Avrum

539 Views
Registered: ‎01-22-2015

Re: async clocks conundrum

@avrumw 

Thank you!  -especially for discussion of set_bus_skew.

While timing analysis doesn't have any concept of "structurally related", these two commands do - they realize that if two clocks can be traced back to a common clock source (either an MMCM/PLL, or even different MMCM/PLLs clocked by the same clock input) then they are structurally related.
What is the difference between “structurally related” and “synchronous”? 
Inside the FPGA, what can cause “structurally related” clocks to become asynchronous (or mesochronous)?

0 Kudos
Guide avrumw
Guide
537 Views
Registered: ‎01-23-2009

Re: async clocks conundrum

What is the difference between “structurally related” and “synchronous”?

So, first, I made up the term "structurally related" - I don't think there is a real name for this. I defined it in my previous post - clocks that can be traced back to a common source somewhere inside the FPGA.

(It is theoretically possible to have related clocks coming in on separate pins - i.e. two different clocks from the same external PLL - the tools have no way of describing this).

Structurally related clocks can never be truly asynchronous - they would always be (at worst) mesochronous. A simple example is the same clock going through a BUFG and a BUFR - the outputs of these two buffers would technically be mesochronous (at least at reasonable frequencies - at really low frequencies you might be able to deal with them synchronously).

However, they can be "effectively asynchronous" - if you have an MMCM that (for example) has two outputs, one that is 1x the frequency of the input clock and the other that is 63.875/64 (CLKFBOUT_MULT_F=63.875, CLKOUT0_DIVIDE_F=64) then the effective requirement of this path is 1/512 of the period of the clock. Unless the clock is really slow, this requirement is impossible to meet (i.e. at 100MHz, this would be 0.0195ns - no path can meet that timing). So if your system really had to move data between these clocks, then you would have to treat them as if they were asynchronous.
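
The 0.0195ns figure can be reproduced with a few lines of plain Tcl. This is only a sketch of the arithmetic in the post (a 100MHz clock and a requirement of 1/512 of its period); the variable names are made up for illustration:

```tcl
# Plain Tcl; runs in tclsh or the Vivado Tcl console. Variable names are illustrative.
set period_ns 10.0   ;# 100 MHz clock period
set fraction  512    ;# requirement is 1/512 of the period, per the post
set requirement_ns [expr {$period_ns / $fraction}]
puts [format "%.4f ns" $requirement_ns]   ;# prints "0.0195 ns"
```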

Avrum

0 Kudos
Scholar markcurry
Scholar
523 Views
Registered: ‎09-16-2009

Re: async clocks conundrum

Since we're discussing this again - I've pulled some of my past arguments from the threads, and I'll add some concrete data. The problem with the "never use async clock groups" approach (let's call it false_path/datapath_only) is one of scalability, maintainability, build performance, and likelihood of causing other, more critical, failures.

From one of our current designs - a mature design that has gone through many design reviews, and has had proper CDC formal analysis done (by a formal tool meant for the job)

From my clock_interaction report, I can sum all of my "Ignored" timing paths, due to "Asynchronous Groups". This total is 59469.

That's how many timing paths that are from one async clock to another in this design.

From my "Bus-Skew" reports, I can sum that total number of multi-bit paths that are constrained by bus_skew: 3707
From one of my own scripts, I can count single-bit CDCs instances: 2253

So:
Total async paths: 59469
Total Multibit Bus-Skew CDC: 3707
Total Single-bit CDC : 2253

Percent True CDC (of total async clock paths): (3707+2253)/59469 = 10%
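
For anyone who wants to check the arithmetic, a small plain-Tcl sketch using the three totals quoted above (the variable names are illustrative):

```tcl
# Totals quoted in the post; variable names are made up for this sketch.
set total_async 59469
set bus_skew    3707
set single_bit  2253

set true_cdc [expr {$bus_skew + $single_bit}]
set pct      [expr {100.0 * $true_cdc / $total_async}]
set other    [expr {$total_async - $true_cdc}]
puts [format "%.1f%% true CDC, %d other async paths" $pct $other]
# prints "10.0% true CDC, 53509 other async paths"
```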

The rest of those async clock paths? Semi-static paths. I.e. paths that are "relatively" constant. These can include:
  * Initialization registers we setup at boot time, and never change.
  * Similarly status registers configured at init, and never change, and can be freely sampled.
  * Very "Slow" paths - in our case, we modify/sample video related registers during vertical blanking - where there are 10s of microseconds of stable time before sampling.
  * Other misc "false" paths.

Our multibit bus-skew constraints are set up via scoped XDC files. We strongly encourage all our design team members to instantiate specific CDC modules - each one has a scoped XDC file to use for these timing constraints. To be fair, if using the false_path/datapath_only method, we could accomplish this with scoped XDCs too. We do NOT (cannot actually, as Avrum explains) apply any latency goals across these paths. In a perfect world, I agree with Avrum that it would be nice if latency were checked, but it's just not feasible.

Our design contains 0 multi-cycle paths, and very few explicit false_paths (<100) (usually as a result of some vendor IP).

If we chose to use the false_path/datapath_only method as suggested, we would somehow need to come up with a set of false path rules for the other 90% of those async paths, ~50,000 paths. Even with creative regex, it shouldn't be too hard to imagine that this set of rules would be very difficult to maintain. It would also make the tools run very slowly.

But that's not even the worst part. The worst part is that since these "real" false paths are more random, they're resistant to scoped XDC. One has to rely more on clever regex and other expressions to find all the paths. This becomes quite difficult to maintain. Something minor changes (perhaps hierarchy, or similar) and the false_path misses your intended target. Depending on your implementation setup, this can result in a critical error (synthesis stops), or a waste of time as the tool tries to optimize a path that is false. Or another (similar) error - one forgets to take care of one of those false paths - the tools will waste implementation time (or likely give up) on trying to meet an impossible timing goal.

But it gets EVEN worse.

Since one is relying on regex and other clever tricks, one runs the risk of messing up the rule, and inadvertently tagging a TRUE path as FALSE. This is a VERY CRITICAL FAILURE.
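
To illustrate the kind of rule being described, here is a hedged sketch of a wildcard-based false path. Every name in it (the cfg hierarchy, the register names, the clk_video clock) is hypothetical:

```tcl
# Intended target: boot-time configuration registers that never change after init.
set_false_path -from [get_cells -hierarchical -filter {NAME =~ *cfg/init_*_reg*}] -to [get_clocks clk_video]

# A slightly broader pattern would also match any register under cfg/ whose name
# merely starts with "in" (for example a live data register cfg/in_data_reg[*]),
# silently tagging a true path as false:
# set_false_path -from [get_cells -hierarchical -filter {NAME =~ *cfg/in*_reg*}] -to [get_clocks clk_video]
```

A hierarchy rename has the opposite effect: the first pattern stops matching and the intended false path is no longer covered.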

We tried, in the past using the recommended false_path/datapath_only method - and we hit this latter type of bug twice. Both bugs were found out quite late - on the manufacturing floor during systems test.

In a perfect world - I wish I could add latency goals to my CDC paths. But I cannot in any current methodology. (In all cases my engineering judgement tells me the rules aren't really necessary - we have PLENTY of margin).

The above data points are for ONE of our current designs - 50,000 false path (false_path/datapath_only) rules, or just use async clock groups and have XDC timing files of about 100 lines for each FPGA. The XDC more properly describes designer intent - these two clocks are fully asynchronous - there's no point in having a tool assume a (fake) phase alignment between async clocks, and do a fake analysis.

Our current nightly build set is 55 FPGAs. There's just no way we could even begin to consider maintaining rulesets for (50K*55 FPGAs = too big a number)...

XDC constraints are the ugly wart of digital design - apart from minor syntax checking, there's very little to vet a ruleset as "good". One can't "simulate" an XDC file. There's really nothing available to "formally" verify an XDC file. About the only tool we do have is to reduce the number of XDC rules. From the software world, how many papers have been written regarding "ratio of bugs per line of code"? Less code = less bugs.

Regards,

Mark

521 Views
Registered: ‎01-22-2015

Re: async clocks conundrum

@markcurry  Thanks very much for the alternate view.  I'll need some time to digest it.

 

@avrumw 

A simple example is the same clock going through a BUFG and a BUFR - the outputs of these two buffers would technically be mesochronous…

You once explained to me that improper reset for the counter in the BUFR (when BUFR_DIVIDE > 1) can be a mechanism that produces mesochronous clocks. Are there other mechanisms in your “simple example” that can cause the output clocks to be mesochronous?

 

0 Kudos
508 Views
Registered: ‎01-22-2015

Re: async clocks conundrum

@markcurry 

I’ve had the luck and luxury of never having to wait more than 30 minutes for implementation of my FPGA projects.   The scale of your FPGA projects is simply staggering to me.   

I understand the gist of your comments to be:  Worrying about constraints for 10% of your 59469 async CDCs is manageable.  Worrying about constraints for 100% of your 59469 async CDCs is unmanageable.

I also understand that you use “set_clock_groups -async” for ALL async clock pairs in your project.  Then, for some async CDCs, you use a CDC-module and override or partially override “set_clock_groups -async” with “scoped XDC” constraints?  However, since “set_clock_groups -async” overrides almost all other XDC constraints, what exactly can you do with the “scoped XDC” constraints?   Only set_bus_skew?

You mention “regex”.  Are you referring to the “-regexp” argument sometimes used in Tcl commands?

Do all your developers use an HDL (Verilog or VHDL)?

0 Kudos
Scholar markcurry
Scholar
492 Views
Registered: ‎09-16-2009

Re: async clocks conundrum

Scoped XDC files are available with either methodology we're discussing here.  It's simply an XDC file that applies to a specific module.  For example, our multi-bit synchronizer is a module named "sync_type2".  We have a "scoped" XDC for this module "sync_type2.xdc".  It contains timing constraints that are applied wherever that module is instantiated.  One enables this in Vivado with:

read_xdc -ref sync_type2 sync_type2.xdc

This tells Vivado, for every instance of "sync_type2" it finds in the entire design, to set the "current_instance" to the hierarchical path of that instance, and apply the constraints.
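
As an illustration only, the contents of such a scoped file might look like the sketch below. The register names and the skew value are assumptions, not the actual contents of our sync_type2.xdc:

```tcl
# sync_type2.xdc (hypothetical contents)
# Because the file is scoped, cell names are relative to each sync_type2 instance.
set_bus_skew -from [get_cells {src_data_reg[*]}] -to [get_cells {dst_data_reg[*]}] 4.000
```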

It's a very powerful tool used to enable shared constraint files.  It should be used more often than it is.  In my opinion every IP generated by Xilinx should include a scoped XDC.

Also note more app notes should be written on how to properly generate and use a scoped XDC file - there are some methodology differences.  Specifically, in a scoped XDC file, does one assume a clock is already defined on the lower-level "ports" of the instance, or should a clock be defined within the scoped XDC?  There are ordering issues to address too - which is applied first, the "scoped" XDC or the top-level design XDC?


I also understand that you use “set_clock_groups -async” for ALL async clock pairs in your project.  Then, for some async CDCs, you use a CDC-module and override or partially override “set_clock_groups -async” with “scoped XDC” constraints?  However, since “set_clock_groups -async” overrides almost all other XDC constraints, what exactly can you do with the “scoped XDC” constraints?   Only set_bus_skew?


As discussed, setting async clock groups disables just about every other timing check available.  In fact in the early versions of Vivado, that was it - setting an async clock group disabled ANY other timing analysis between those paths.  We had some discussions on these forums during those days of ways around these problems.  Note one proposed solution (from someone in the ASIC world) discussed defining multiple clocks (on the same driver) in order to work around the problem.  It was complicated, but I thought it could work - but never pursued the idea.  Xilinx didn't want to change the behaviour of the standard SDC in this regard.  A compromise Xilinx made - adding a new command: set_bus_skew.  I forget which version of Vivado it was added to.  Since it was a new XDC constraint, Xilinx could define it as they liked.  So set_bus_skew takes higher priority than set_clock_groups -async. Note that from what I understand however, "set_bus_skew" is an assertion, not a normal timing constraint. Meaning, I believe, that while the bus_skew analysis is done, and reported - it does NOT influence place and route.

I believe the industry is shooting itself in the foot here by not just solving the root cause problem and fixing SDC to address these problems....

"Regex" - Sorry "Regular Expression" - I was speaking in general here, not specifically towards the -regexp option used in TCL commands (although that option is often used).  Any wildcarding necessary in order to find your intended target for a timing exception rule.

Regards,

Mark

Guide avrumw
Guide
488 Views
Registered: ‎01-23-2009

Re: async clocks conundrum

Note that from what I understand however, "set_bus_skew" is an assertion, not a normal timing constraint. Meaning, I believe, that while the bus_skew analysis is done, and reported - it does NOT influence place and route.

This worried me when I read it - if it were true this would significantly reduce the usefulness of this command.

However the description (pasted below) from UG835 seems to pretty clearly state that this isn't the case - in summary it says:

  • The router (odd that it specifically says router and not placer and router) will work to meet the set_bus_skew requirement
  • The set_bus_skew operates on groups of paths instead of on individual paths, and hence is not affected by other path-based constraints (i.e. set_clock_groups)
  • The set_bus_skew "works well with" (i.e. should be used in conjunction with) the set_max_delay -datapath_only (as I said in my previous post, and is what some Xilinx IP does)

Description

Set the bus skew requirement on bus signals that cross clock domains. The bus skew constraint defines the maximum skew spread between the fastest and slowest signals of the bus, and does not consider the overall datapath delay. The Vivado router will try to satisfy the set_bus_skew constraints. Example uses of the bus skew constraint include clock domain crossing for gray-coded pointers, MUX-controlled and MUX-data holding CDC buses.

TIP: Bus skew constraints are not overridden by clock groups, max delay, or false path, because set_bus_skew is a constraint between the signals of a bus, rather than on a particular path.

The set_bus_skew constraint can be combined with the set_max_delay constraint for good results. The set_bus_skew constraint does not care about the absolute datapath delay, but only about the relative arrival times of data at the destination, taking into account source and destination clock skew. You can help set_bus_skew by also using set_max_delay -datapath_only <SRC_CLK>. This constraint helps the Vivado placer to ensure that the source and destination registers are not placed too far apart, so that the router can more easily satisfy the set_bus_skew constraint. Refer to the Vivado Design Suite User Guide: Using Constraints (UG903) for more information.

Avrum

Scholar markcurry
Scholar
484 Views
Registered: ‎09-16-2009

Re: async clocks conundrum

Ok, dug some more to try and remember where I read that information about bus_skew commands and how the tools use it.  The relevant doc is UG903:


The bus skew constraint is not a timing exception; rather, it is a timing assertion. Therefore, it does not interfere with the timing exceptions (set_clock_group, set_false_path, set_max_delay, set_max_delay -datapath_only, and set_multicycle_path) and their precedence.

The bus skew constraint is only optimized by the route_design command. To report the set_bus_skew constraints, use the report_bus_skew command from the command line or Tools -> Timing -> Report Bus Skew from the GUI. The bus skew constraints are not reported inside the Timing Summary report (report_timing_summary).


So, I was wrong in my interpretation - set_bus_skew - does direct implementation (but just to route, not place).
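
Since the bus skew results are not part of the Timing Summary report, they need to be requested separately. A minimal sketch from the Tcl console (the report file name is arbitrary):

```tcl
# Run on a routed design, since bus skew is only optimized by route_design.
report_bus_skew -file bus_skew.rpt
```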

Regards,

Mark

450 Views
Registered: ‎01-22-2015

Re: async clocks conundrum

@markcurry  and all

The following post raises similar concerns about asynchronous/mesochronous clocks.   Any comments?

https://forums.xilinx.com/t5/Timing-Analysis/creating-mesochronous-clocks/m-p/986869

0 Kudos