07-10-2019 08:15 AM
This design includes a large section running on a 250 MHz clock. As expected, this part of the logic presents the most challenges to the P & R tools, so I routinely use Smart Explorer to achieve timing closure. Smart Explorer results show the majority of runs failing timing by around 1 nSec. However, successful results have a worst-case timing slack that falls into one of two distinct groups, one in the range 0 to 0.5 nSec and the other in the range 4.0 to 4.5 nSec. I believe the results in the first group, but the second group seems likely to be the result of the timing analyser including an extra full cycle of the 250 MHz clock. So my question is: can I safely accept solutions in the second group as good timing closure, or should I only allow solutions from the first group to be used?
07-10-2019 09:16 AM
07-10-2019 09:27 AM
So you are saying that SmartExplorer has some runs passing, and in those passing runs, you get positive slack of around either 0-0.5ns or 4-4.5ns?
Clearly, positive slack of anything near 4ns on a (regular) path with a 4ns clock period is impossible - these results are highly suspect...
Of course, if these results are suspect, you need to understand why, in order to ensure that the other results are valid - if your constraints are wrong, then they will likely be wrong for all runs, not just some of them.
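As a rough sanity check on why slack near 4ns is implausible for a 4ns period, here is a minimal Python sketch. It uses the setup slack formula that ISE prints in its timing reports (requirement - (data path - clock path skew + uncertainty)); the function names and the 4.1ns example value are illustrative assumptions, not taken from any report.

```python
def setup_slack(requirement_ns, data_path_ns, skew_ns=0.0, uncertainty_ns=0.0):
    """Setup slack: requirement - (data path - clock path skew + uncertainty)."""
    return requirement_ns - (data_path_ns - skew_ns + uncertainty_ns)


def max_data_path_for_slack(requirement_ns, slack_ns, skew_ns=0.0, uncertainty_ns=0.0):
    """Largest data path delay that would still produce the given slack."""
    return requirement_ns - slack_ns + skew_ns - uncertainty_ns


PERIOD_250MHZ_NS = 4.0  # one period of a 250 MHz clock

# Assumed example: a reported slack of 4.1 ns on a single-cycle 4 ns path,
# with skew and uncertainty ignored for simplicity.
budget = max_data_path_for_slack(PERIOD_250MHZ_NS, 4.1)
print(f"data path budget: {budget:.3f} ns")
```

The budget comes out negative, which no real register-to-register path can meet, so a slack in that range must belong to a path with a longer requirement than 4ns.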
07-11-2019 07:20 AM
Thanks for the prompt reply.
Certainly agree, if the tools say it has failed then it has. The difficulty is that in the 4 nSec cases, it says it has passed.
Whether it would cause an issue depends on where the extra clock cycle occurs, and I'm not sure how to find this out.
I'll think about partitioning, but the high speed part of the logic is a tight chain starting from a set of LVDS input pins and
going through several stages of data reordering before ending up at the inputs of a block of BRAM.
Have you ever encountered anything like this before?
07-11-2019 07:31 AM
07-11-2019 07:34 AM
Thanks for prompt reply.
1) yes, it reports a positive slack of 4nSec+ on some solutions and a positive slack of +0 to +0.3 nSec on others.
2) Any suggestions how to find out why it is doing this?
3) There is only one ucf constraint on this section of logic which sets the clock period, mark-space ratio and jitter.
Are there any other constraints I should be considering?
It almost looks like it is doing a multi-cycle analysis although I've not knowingly specified this - maybe I should have
a constraint to turn it off?
Only other thing to say is that the FPGA does work on both sets of solutions, although I'm not able to confirm if it
does have the necessary margins to allow for temperature, voltage and device variations.
Have you ever seen anything like this?
07-11-2019 07:40 AM
Yes, LVDS inputs via IOB generated by ISE IP tools, several levels of registers, some for data re-ordering, some to provide timing margins.
KEEP constraints to prevent timing margin registers being optimised away.
07-11-2019 07:14 PM
Can you provide both the 0-0.5ns or 4-4.5ns timing reports?
07-12-2019 10:18 AM
It doesn't happen with every run of Smart Explorer - sometimes they're all in one group but as soon as I get a run which has at least one of each, I'll post the .twx files - - assuming that Smart Explorer has saved both of course.
07-14-2019 06:39 PM
If you set "-best_n_runs" option for smartxplorer ("Keep only x best runs" option in GUI), smartxplorer will only keep the x best run results and remove the others.
If so, you probably won't have the 0-0.5ns results.
What's more, I don't think you need to wait for both results in one run.
As long as you get a 0-0.5ns or 4-4.5ns result from any run (with source files and constraints not changed much), the results can be used to investigate whether the 4-4.5ns slack is correct or not.
07-15-2019 06:57 AM
Attached are two twx files, basically from the same run of Smart Explorer. The one for Run3 (4.173 nSec) is directly from the Smart Explorer Results folder but it didn't save the second successful run, Run 6 (0.055 nSec) so I obtained this by setting the Placer Cost Table to 6, forcing a re-implement and saving the resulting .twx file.
07-15-2019 07:11 AM
The .twx files were rejected, so I've renamed them slightly, hopefully will work this time.
> Sorry, still doesn't work - this website says "The attachment's <filename.extension> content type (text/plain) does not match its file extension and has been removed" . I've tried extensions of ".twx" and ".txt" and both of these are rejected - any other suggestions? (my PC is Windows 10 64 bit + Google Chrome if that makes any difference)
ps1 now trying shortening filename - Nope, still doesn't work
ps2 - try zipping
07-15-2019 11:03 PM
I searched the two timing reports for the relevant information. It looks to me that the 4.173ns and 0.055ns slacks are from paths under two different constraints, so it is meaningless to compare the two. Please correct me if I misunderstood anything.
1. For the "4.173ns" run, the 4.173ns slack comes from the constraint below:
TS_PCIe_Interface_inst_xil_pcie_wrapper_s6_pcie_clk_125 = PERIOD TIMEGRP "PCIe_Interface_inst_xil_pcie_wrapper_s6_pcie_clk_125" TS_PCIe_Interface_inst_xil_pcie_wrapper_s6_pcie_gt_refclk_buf HIGH 50%;
The first slack under the same constraint in the 0.055ns run is "4.229ns" which is comparable.
2. For the "0.055ns" run, the 0.055ns slack comes from the constraint below:
TS_PCIe_Interface_inst_xil_pcie_wrapper_s6_pcie_clk_62_5 = PERIOD TIMEGRP "PCIe_Interface_inst_xil_pcie_wrapper_s6_pcie_clk_62_5" TS_PCIe_Interface_inst_xil_pcie_wrapper_s6_pcie_gt_refclk_buf / 0.5 HIGH 50%;
The first slack under the same constraint in the 4.173ns run is "0.044ns" which is comparable.
So either those two are not the correct examples to analyze or you're comparing the wrong objects.
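To put the two constraints side by side, here is a small Python sketch of the periods they imply. It assumes the reference TIMESPEC (the gt_refclk_buf one) has an 8 ns period, as the clk_125 name suggests, and that the "/ 0.5" factor in a derived PERIOD constraint acts on the period; the helper name is hypothetical.

```python
def derived_period_ns(ref_period_ns, op=None, factor=1.0):
    """Period implied by a derived PERIOD constraint of the form
    'TS_ref' or 'TS_ref * factor' or 'TS_ref / factor'.
    Assumption: the factor is applied to the period, not the frequency."""
    if op is None:
        return ref_period_ns
    if op == "*":
        return ref_period_ns * factor
    if op == "/":
        return ref_period_ns / factor
    raise ValueError(f"unsupported operator: {op!r}")


REF_PERIOD_NS = 8.0  # assumed reference clock period (125 MHz)

clk_125_ns = derived_period_ns(REF_PERIOD_NS)             # no factor in the constraint
clk_62_5_ns = derived_period_ns(REF_PERIOD_NS, "/", 0.5)  # "/ 0.5" in the constraint

for name, period in (("clk_125", clk_125_ns), ("clk_62_5", clk_62_5_ns)):
    print(f"{name}: {period:.1f} ns ({1000.0 / period:.1f} MHz)")
```

Under these assumptions neither slack figure belongs to a 4 ns (250 MHz) requirement, so a slack of about 4 ns under the clk_125 constraint does not need an extra clock cycle to explain it.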
07-16-2019 03:55 AM
Thanks for your analysis of the .twx file.
I'm not sure what you mean by comparing the wrong objects - the design incorporates a number of separate clocks, each with their own timing constraints, each of which has to be met. Up to now, I have been assuming that when Smart Explorer reports a successful result, the timing slack it shows would be the worst-case, ie smallest, across the entire design. However, it seems from your analysis that this may not be the case, and instead it is displaying the smallest value for one of the clock timing constraints but not necessarily the absolute smallest value across the entire design. That isn't of great importance, so long as Smart Explorer's definition of timing closure is that all timing constraints are met, and it seems from your analysis that this is probably the case. Would you agree?
Odd that the timing margin for the "PCIe_Interface_inst_xil_pcie_wrapper_s6_pcie_clk_125" is always 4 nSec + delta considering it is a 16 nSec period clock?
07-21-2019 08:10 PM
@hillipsrb Could you attach the screenshots of the smartxplorer results that showed the worst slack of 0.055ns and 4.173ns if you still have them?
Just want to be clear about the issue that is bothering you.
And for "PCIe_Interface_inst_xil_pcie_wrapper_s6_pcie_clk_125", the timing report says it is an 8ns period clock, not 16ns.
Paths for end point PCIe_Interface_inst/xil_pcie_wrapper/s6_pcie/PCIE_A1 (PCIE_X0Y0.PIPERXDATAA10), 1 path
Slack (setup path): 4.173ns (requirement - (data path - clock path skew + uncertainty))
Source: PCIe_Interface_inst/xil_pcie_wrapper/s6_pcie/GT_i/tile0_gtpa1_dual_wrapper_i/gtpa1_dual_i (HSIO)
Destination: PCIe_Interface_inst/xil_pcie_wrapper/s6_pcie/PCIE_A1 (CPU)
Data Path Delay: 3.632ns (Levels of Logic = 0)
Clock Path Skew: -0.107ns (0.464 - 0.571)
Source Clock: PCIe_Interface_inst/xil_pcie_wrapper/s6_pcie/mgt_clk rising at 0.000ns
Destination Clock: PCIe_Interface_inst/xil_pcie_wrapper/s6_pcie/mgt_clk rising at 8.000ns
Clock Uncertainty: 0.088ns
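The figures in this excerpt can be checked against the slack formula printed in the report. A short Python sketch, using only the numbers quoted above (the variable names are mine):

```python
# Values copied from the timing report excerpt above.
source_edge_ns = 0.000        # source clock rising edge
destination_edge_ns = 8.000   # destination clock rising edge
data_path_ns = 3.632          # Data Path Delay
skew_ns = 0.464 - 0.571       # Clock Path Skew, reported as -0.107ns
uncertainty_ns = 0.088        # Clock Uncertainty

# The requirement is the time between the launching and capturing edges.
requirement_ns = destination_edge_ns - source_edge_ns

# Slack = requirement - (data path - clock path skew + uncertainty)
slack_ns = requirement_ns - (data_path_ns - skew_ns + uncertainty_ns)

print(f"requirement = {requirement_ns:.3f} ns")  # requirement = 8.000 ns
print(f"slack       = {slack_ns:.3f} ns")        # slack       = 4.173 ns
```

The result matches the reported 4.173ns with a single-cycle 8ns requirement, so this path is not being given an extra clock period.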
07-23-2019 03:31 AM
Sorry, I don't have the screen shots for that particular run but will post from another similar one.
My concern is based on ISE's description of the displayed slack value as "Worst Case Slack". This description means to me that all the other slack timing values will be larger. If I saw a value in the 100 pSec region, knowing that there is a 250 MHz clock for part of the logic, I would be happy. However, when the value given as the worst case is 4 nSec + around 100 pSec, this seemed suspicious and made me wonder if ISE was doing a multi-cycle timing calculation or something like that - if that was the case it would not be a workable solution for my logic. What added to the concern was that, without exception, the reported timing was always either in the 100 pSec region or the 4 nSec region and never anything in between.
Your analysis of the twx files indicates that the 4 nSec results are from a completely different part of the logic running on a much slower clock where you could reasonably expect bigger timing margins and, for some reason best known to the programmer, it seems that ISE is choosing to report these rather than the much smaller margins on the 250 MHz clock net. If that is the explanation, then I'm happy to just accept it as one of those things.
It is true that if I re-compile the design outside of Smart Explorer but using the mapper placer cost table value obtained from it, then the design always reports timing closure with the margin on the 250 MHz clock being reasonable.
Assuming that Smart Explorer doesn't always display the true worst-case timing margin, is there any straightforward way to get this from ISE?
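One possible approach, sketched here in Python rather than taken from ISE itself: scan a text timing report for its "Slack (...)" lines and keep the smallest value per constraint. This assumes a text report whose slack lines look like the excerpt quoted earlier in the thread and whose constraint sections begin with a "Timing constraint:" header (an assumption about the report layout). The sample text is made up for illustration, with abbreviated constraint names and the two slack values mentioned for the "4.173ns" run.

```python
import re

# Matches lines such as "Slack (setup path):     4.173ns (requirement - ...)".
SLACK_RE = re.compile(r"^\s*Slack\s*\(([^)]*)\):\s*(-?\d+(?:\.\d+)?)\s*ns")
# Assumed section header format: "Timing constraint: <name> = PERIOD ..."
CONSTRAINT_RE = re.compile(r"^\s*Timing constraint:\s*(\S+)")


def worst_slack_per_constraint(report_text):
    """Return {constraint name: smallest slack in ns} for a text timing report."""
    worst = {}
    current = "<unknown constraint>"
    for line in report_text.splitlines():
        header = CONSTRAINT_RE.match(line)
        if header:
            current = header.group(1)
            continue
        slack_line = SLACK_RE.match(line)
        if slack_line:
            slack = float(slack_line.group(2))
            if current not in worst or slack < worst[current]:
                worst[current] = slack
    return worst


# Made-up sample with abbreviated (hypothetical) constraint names.
sample_report = """\
Timing constraint: TS_clk_125 = PERIOD TIMEGRP "clk_125" TS_refclk HIGH 50%;
Slack (setup path):     4.173ns (requirement - (data path - clock path skew + uncertainty))
Timing constraint: TS_clk_62_5 = PERIOD TIMEGRP "clk_62_5" TS_refclk / 0.5 HIGH 50%;
Slack (setup path):     0.044ns (requirement - (data path - clock path skew + uncertainty))
"""

worst = worst_slack_per_constraint(sample_report)
overall_name, overall_slack = min(worst.items(), key=lambda item: item[1])
print(worst)
print(f"overall worst slack: {overall_slack}ns under {overall_name}")
```

Taking the minimum over all constraints gives a single design-wide figure that does not depend on which value the Smart Explorer summary chooses to display.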