01-25-2021 07:36 PM - edited 01-25-2021 07:39 PM
I've searched at length for a similar topic, but none seem to match my specific scenario.
I have a design which generates 3 clocks from the same 80MHz Oscillator Pin. Here is a list of the clocks and their functions:
200MHz - "Fast Side" Core clock: Covers oversampling of a 33MHz UART packet-based system, up to and including one port of a True Dual Port BRAM.
100MHz - "Slow Side" Core clock: Covers everything from the opposite port of the TDP BRAM to some slower (10MHz) SPI-like shifting interfaces as Master.
20MHz - Clocks no internal logic and is buffered and sent off chip to support other chips that are slaves to the above SPI-like Master interfaces.
20MHz - Inverted version of the above clock, also sent to different SPI-like slaves (inverted purportedly to help balance power due to switching; I'm not the original designer).
Initially the design had a PLL_BASE generate the 200MHz and 100MHz internal clocks and a separate DCM_SP generate the 20MHz clocks (by dividing the 80 by 2 and using an RTL coded flip flop to provide the two phases of 20MHz).
I thought this was lunacy and have implemented a much cleaner design that creates all four clocks from one PLL_BASE block (created using Clocking Wizard, as the above PLL_BASE and DCM_SP were). But now I get thousands of timing violations where previously there were none. I traced this back to moving the 20MHz clock generation into the PLL_BASE. Apparently, ISE now tries to treat them as synchronous to the 100MHz and 200MHz clocks even though they touch no flip flops and are simply buffered and sent to output pins. (I'm not even sure I need the BUFG since there are no internal loads, but I thought it might give those clocks special low latency routing paths.)
So, two questions:
1. How can I constrain the 20MHz clocks to have no relationship to the 200MHz and 100MHz domains? This is on a Spartan-6 part and using ISE tools. Note that I've temporarily removed the inverted phase 20MHz from the design and still get thousands of timing violations.
2. Again, I inherited this design from another engineer who is no longer at the company, and I noticed that they had commented out some constraints in the constraints file. One was the jitter spec on the input clock. With this included back in the constraints, the original design (before my clock changes) does not meet timing (a couple hundred violations - so close to the edge). Also commented out were some clock group definitions for the 200MHz and 100MHz clocks (pointing through the hierarchy to the clocks in the PLL_BASE) and two TIG directives, one in each direction: from 200 to 100 and from 100 to 200. With those included back in, we pass timing again (original design, pre clock changes). However, we're sort of overriding the ISE tools' implicit definitions of the TIMESPECs as seen in the .BLD log file, and I'm not 100% sure we can claim asynchrony between these two domains. So, to finally ask question 2 in two parts:
a. If the hard boundary between the 200MHz and 100MHz domains really is the True Dual Port BRAM and nothing else (or any single bit control signals have proper CDC logic designed in), can we constrain these two domains to be asynchronous? Note that the data passed between the two is definitely not being dynamically processed (like, say, a FIFO), but is dealt with in discrete "stages": the whole buffer on the A side is filled by the UART packet interface, then in a different phase of the cycle the SPI side pulls data out and parcels it out, sort of like a switch or router. If this is a valid claim to asynchrony, then:
b. What is the proper way in an ISE UCF Constraints file to define the groups and declare them unrelated for timing purposes? One thing I noticed is that the time group only uses the RISING property(I'm new to Xilinx tools, so I may not be relaying that properly).
Thanks for any help that you can provide!
02-04-2021 02:48 PM
First, let's start with the 20MHz clocks. You say these do not clock any internal logic, but are only provided as references for off-chip devices. If that is really the case, we don't need to treat these as clocks at all - they are merely output signals that happen to have a periodic nature. The easiest way to deal with this is probably just to use a counter on the 200MHz domain that controls a single flip-flop (with no feedback) whose output is 5 cycles low and 5 cycles high. It is even easy to generate your second output (again from a dedicated flip-flop) that is low when the first is high and vice versa. These can then be packed into the input/output block with the use of the IOB property. When done this way, these aren't "clocks" from the point of view of the FPGA, but are perfectly acceptable for the downstream device. This will simplify static timing.
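As a rough sketch of the IOB packing mentioned above, the UCF entries might look like the following. The instance names clk20_p_reg and clk20_n_reg are hypothetical; use whatever names your two output flip-flops end up with after synthesis.

```ucf
# Hypothetical instance names for the two dedicated 20MHz output flip-flops.
# IOB = TRUE asks the packer to place each register in the I/O block of the pin it drives.
INST "clk20_p_reg" IOB = TRUE;
INST "clk20_n_reg" IOB = TRUE;
```

If I remember correctly, IOB = FORCE can be used instead when you want an error if a register cannot be packed into the IOB.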
Maybe this will fix the "problem" you are seeing, but what you are describing makes no sense. If the 100MHz and 200MHz clocks always came from the same PLL then the paths between them would have been timed (unless there was a TIG on them) - regardless of whether the 20MHz clocks also came from the same PLL or not; moving the 20MHz clock shouldn't have affected the 200MHz and 100MHz....
Now to the 200MHz and 100Mhz clocks. You say that they communicate only through a DPRAM, so, in theory, they could be treated as asynchronous domains. If you have any other crossings between these domains (other than the DPRAM itself) then you will need proper clock domain crossing circuits (CDCCs) between these domains if you treat them as asynchronous. And it seems that you do, since, if you didn't, then there wouldn't be failures between these domains regardless of how they were generated.
But I have to ask: why bother? Paths between a 100MHz and a 200MHz domain are no harder to meet than paths within the 200MHz domain itself. If the two clocks come from the same PLL/DCM and go through the same kind of clock buffer, they are almost perfectly in phase - a crossing between them is a 5ns timing path, only slightly harder to meet than timing on the 200MHz domain (there is a bit more skew between two clocks than on one clock). If you are having trouble with this crossing, I would wonder why... The DPRAM (if in true dual port mode - i.e. with one clock driving port A and the other driving port B) results in no timing paths between the domains. Any control signals going between them would need to be flop-to-flop if the domains were treated asynchronously (with additional metastability resolution circuits on the receiving end) - just going flop-to-flop synchronously (without the metastability flops) should really not be a problem at 5ns. So it's hard to see how you would be having timing violations between these two domains - you will have to show us what you are seeing (post the detailed timing report for the worst violation).
And as for static timing, if these come from the same PLL/DCM, then they will be treated as synchronous by default - no additional constraints are required.
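For reference, constraining the 80MHz input at the pin could look roughly like this; the PERIOD constraints for the PLL outputs are then derived by the tools. The net name clk80_in and the group and TIMESPEC names are assumptions, not taken from your design.

```ucf
# Hypothetical net name for the 80MHz oscillator input pin.
NET "clk80_in" TNM_NET = "TNM_CLK80_IN";
# 80MHz corresponds to a 12.5 ns period.
TIMESPEC "TS_CLK80_IN" = PERIOD "TNM_CLK80_IN" 12.5 ns HIGH 50%;
```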
If you do decide to treat the 200MHz and 100MHz as asynchronous (and, again, I don't see any reason to do this), then the answer is a FROM TO TIG (this is going to be from memory, so forgive me).
First you need to have Timing NaMes (TNMs, also called timing groups) for the two clocks. Normally these would be created with TNM_NET commands using the top level nets of the clocks:
NET "<CLK100_NET>" TNM_NET = TNM_CLK_100;
NET "<CLK200_NET>" TNM_NET = TNM_CLK_200;
But in the case where these are generated by a PLL (and the input clock to the PLL is constrained properly at the pin of the FPGA), the tools have already automatically created timing groups for these two clocks. You can (and probably should) use these instead. So you need to find the names of the groups automatically created by the tool for these clocks. You will find them in one of the log files from ngdbuild - I think the .bld file. It will show you the complete constraints for the output clocks of the PLL. Each constraint references the Timing NaMe for the corresponding group of flops - I don't remember the format from ISE - I think they are something like TNM_<name_of_pll_instance>_<name_of_output>, but I could be wrong.
Once you have these - either the automatically generated ones or the manually generated ones (and you should use the automatic ones if possible) - then you create a TIG TIMESPEC in each direction:
TIMESPEC TS_200to100_TIG = FROM TNM_<200MHz_name> TO TNM_<100MHz_name> TIG;
TIMESPEC TS_100to200_TIG = FROM TNM_<100MHz_name> TO TNM_<200MHz_name> TIG;
But, again, I don't recommend this... You really need to diagnose why these paths were failing in the first place.
"One was the Jitter spec on the input clock. With this included back into the constraints, the original design(before my clock changes) does not meet timing(a couple hundred violations - so close to the edge)."
DON'T DO THIS! Commenting out the jitter spec is just wishful thinking - the jitter really exists, and ignoring it just to make timing pass is lying to the tools and forcing them to give the answers you want to hear rather than the truth (which is that this design fails timing with proper jitter constraints). But this is also somewhat curious - the PLL is a pretty darned good jitter attenuator, so even if there is fairly significant jitter on the input clock, only a small portion of that would end up on your internal clocks - it is hard to see how this would make a significant difference in timing...
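When the jitter figure goes back in, it can be attached to the input PERIOD constraint with the INPUT_JITTER keyword. A sketch, where the net, group and TIMESPEC names are hypothetical and the 100 ps value is only a placeholder; take the real number from the oscillator datasheet:

```ucf
# Hypothetical names; 12.5 ns is the period of the 80MHz input.
NET "clk80_in" TNM_NET = "TNM_CLK80_IN";
# INPUT_JITTER value below is a placeholder, not a measured or datasheet figure.
TIMESPEC "TS_CLK80_IN" = PERIOD "TNM_CLK80_IN" 12.5 ns HIGH 50% INPUT_JITTER 100 ps;
```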
As for the commented out TIGs - we sort of covered that above. If you insist on treating these domains as asynchronous, then TIGs may be warranted (although I prefer FROM TO DATAPATHONLY), but, again, I would recommend not using the TIGs and instead figuring out why these paths, which should be easy-to-meet synchronous paths, are failing.
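Unlike a TIG, a FROM TO DATAPATHONLY constraint still bounds the data path delay between the two groups; it just leaves clock skew and phase out of the analysis. A sketch, assuming manually created groups on hypothetical net names and a 5 ns budget (one 200MHz period) that you would choose yourself:

```ucf
# Hypothetical top level clock net names; substitute the auto-generated groups if you use those.
NET "clk_100" TNM_NET = TNM_CLK_100;
NET "clk_200" TNM_NET = TNM_CLK_200;
# The 5 ns budget is an assumption, not a value from the design.
TIMESPEC TS_200to100_DPO = FROM TNM_CLK_200 TO TNM_CLK_100 5 ns DATAPATHONLY;
TIMESPEC TS_100to200_DPO = FROM TNM_CLK_100 TO TNM_CLK_200 5 ns DATAPATHONLY;
```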