fpgalearner
Voyager
Registered: ‎04-11-2016

timing constraints

Hi,

While compiling a design in Vivado, the timing constraints are not met.

See the attachment for one of the failing paths.

How can I overcome this?

xdc_error1.jpg
15 Replies
florentw
Moderator
Registered: ‎11-09-2015

Hi @fpgalearner,

 

Could you attach the timing report for one of the failing paths?

 

Also, could you open the failing path and share a screenshot of the schematic corresponding to this path (press F4)?

 

Thanks,

 

Florent


Florent
Product Application Engineer - Xilinx Technical Support EMEA
**~ Don't forget to reply, give kudos, and accept as solution.~**
fpgalearner
Voyager
Registered: ‎04-11-2016

Hi @florentw,

See the attachment.

I couldn't find where to view the schematic.

xdc_error2.jpg
florentw
Moderator
Registered: ‎11-09-2015

Hi @fpgalearner,

 

To show the schematic, open the path (as in your last screenshot), select it, and press F4.

 

What is the reason for having an IBUF followed by an IBUFDS?

 

Also it seems that your signal is a clock (refclk_p) but the tool doesn't know that this is a clock.

 

Regards,

 

Florent


fpgalearner
Voyager
Registered: ‎04-11-2016

Hi @florentw,

See the attachment for the schematic.

I don't know exactly why there is an IBUF followed by an IBUFDS, as I am trying to modify a reference design.

All I know is that one oscillator (Si5324) gets its reference input from another (LMH1983), and this reference clock is part of that arrangement.

"Also it seems that your signal is a clock (refclk_p) but the tool doesn't know that this is a clock."

Where can I see this and correct it accordingly?

xdc_error3.jpg
avrumw
Guide
Registered: ‎01-23-2009

This is a pretty common problem. The failure is clearly coming from your global reset, which is fanning out to 1085 different end points (probably all of them directly).

 

Even at this relatively low frequency (133 MHz), the tools cannot find a way to place the 1085 different endpoints so that they can all be reached from the one startpoint within the 7.5 ns clock period.

 

The solution is to manage your reset appropriately. One way or another you need to split or reduce the load on your reset. This could be done by moving to a "partial" global reset, and not resetting all flip-flops, or replicating the source flop (either manually or letting the tool do it), or by designing your own pipelined reset tree.

 

Furthermore, getting this merely to pass isn't sufficient. This single sourced reset is clearly affecting how the tools are placing your entire design. So even if you get this to "just pass", you can often have problems with other timing paths since they have all been influenced by the requirement of keeping all these endpoints close to the startpoint.

 

Avrum

avrumw
Guide
Registered: ‎01-23-2009

"What is the reason for having an IBUF followed by an IBUFDS?"

 

@florentw,

 

Note that this isn't an IBUFDS, but an IBUFDS_GTE2; this is the differential input buffer dedicated to the reference clock inputs of the high speed transceiver (the GTP/GTX/GTH/GTY...). In some technologies, the tools actually need an IBUF on both the N and P inputs of these buffers (which doesn't really make sense, but that is what the tools require - this has been "fixed" in UltraScale, where the IBUFDS_GT* acts as an input buffer).

 

So, architecturally, this is correct; if the tools implemented this, then the clock is coming in on the high speed transceiver reference clock, and the buffer structure is legal. However, using the REFCLK input as a fabric clock source is not usually recommended - it doesn't have the same characteristics as a clock coming in from an SRCC/MRCC clock source. So, it raises the question: was this done on purpose (i.e. this clock really is the reference clock to a GT*), or is this a board design accident?

 

Avrum

markcurry
Scholar
Registered: ‎09-16-2009

For what it's worth, I'd just start by pipelining the reset by a few stages, and seeing what the tool does from there.

 

Anything past this (manually creating trees, etc.) starts to require you to go in and basically prevent the synthesis tool/optimizer from doing its job.

So far, this has been enough.  I've always been wary that we'll hit the same problem sometime, but so far it's not been too much of an issue.  We're more likely to hit the issues as we ignore the reckless-at-best "lazy" reset strategies that Xilinx advocates.

 

Regards,

 

Mark

avrumw
Guide
Registered: ‎01-23-2009

"...as we ignore the reckless-at-best "lazy" reset strategies that Xilinx advocates"

 

I understand what you are saying, but I want to be careful here.

 

It is true that Xilinx recommends the removal of global reset structures in favor of the default initialization of the registers at configuration time where possible. It is the "where possible" that is extremely important, and is often not explicitly stated when the "Xilinx recommendation" is repeated.

 

Xilinx's strategy (remove global resets where possible) is a very powerful recommendation.

 

Resets can take a huge amount of routing resources, can horribly disrupt the floorplan of a design, and can dramatically increase the number of hard-to-meet timing paths. All of this makes timing closure harder to obtain and can significantly increase the tool run time. So removing them is a good idea.

 

The problem is the "where possible" is extremely hard to determine. Basically as the designer infers each and every flip-flop in the design, he/she needs to ask "Is it possible to remove the reset on this flip-flop". The answer can be hard to determine - it revolves around the question "Is it theoretically possible for this flip-flop to change state 'shortly' after configuration ends". If the answer is yes, then this flip-flop needs an explicit synchronous reset, if not then it can get by with the configuration initialization.

 

The problem is that if you get even one flip-flop wrong, and remove the reset from one that needs it, you end up with a system that may be unreliable as you power it up - it may fail to come out of "reset" properly. Furthermore, it is almost impossible to determine that you have done so by any conventional means (i.e. simulation won't catch it). Tracing an "unreliably resetting system" back to a particular flip-flop that needs a reset when you didn't provide it is incredibly difficult, as these things will tend to fail 1 in 100 or 1 in 1000 (or more) power-up cycles - good luck catching that and determining the cause. And, of course, Vivado ILA is almost useless here, since it, too, needs resets, so it can't start its capture early enough to catch the cause of the problem...

 

So, when implemented properly Xilinx's recommendation is the way to go. But, doing so is time consuming and risky...

 

Avrum

markcurry
Scholar
Registered: ‎09-16-2009

Well summarized avrumw.  You explain the core issue well.  But it's even worse in many cases. 

 

Your question:

"Is it theoretically possible for this flip-flop to change state 'shortly' after configuration ends".  This belies another bug, in that many times (more often than many presume, IMHO) configuration != reset.  There's very often a separate reset signal that the user logic needs to be sensitive to - One that's unique and separate from FPGA configuration.

 

I once spent 3-4 weeks debugging one of these reset issues.  I had started taking Xilinx's advice of not resetting everything.  And I'm careful with this stuff.  And I missed something.  A long bench debug process followed.  And as I'm sure you're well aware, these things are nasty debug activities - often leading you down false paths in your diagnostics.

 

For us, there's absolutely no way we're doing this anymore (removing resets).  The benefits don't nearly approach the (potential and uncertain) costs.

 

I bang on Xilinx a LOT with their recommendations here.  I'm of the opinion that more could be done in the FPGA architecture to make a 'reset everything' strategy less costly.  But I don't think they're motivated to do this, as their point of view is that it's a customer education problem - we're not designing our logic right.

 

--Mark

avrumw
Guide
Registered: ‎01-23-2009

"This belies another bug, in that many times (more often than many presume, IMHO) configuration != reset."

 

I am not sure I understand this point. Yes, the INIT value is not (necessarily) the same as a reset value, but if you have no reset, then you have only the INIT value. The mechanism of setting the INIT value is pretty well documented; providing an initial value to the underlying reg/logic/signal that ends up inferring the flip-flop (otherwise it is a 0).
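
As a minimal sketch of the INIT mechanism described above: the module and signal names below are hypothetical (they do not come from this thread), and the register deliberately has no reset. The initializer on the declaration is what synthesis would be expected to turn into the flip-flop's INIT value.

```verilog
// Hypothetical example: a free-running counter with no explicit reset.
module init_example (
  input  wire       clk,
  output wire [3:0] count_o
);
  // No reset: the initializer on the declaration sets the value the
  // register holds when configuration ends (the INIT value).
  // Without an initializer the flip-flop would come up as 0.
  reg [3:0] count = 4'd5;

  always @(posedge clk)
    count <= count + 4'd1;

  assign count_o = count;
endmodule
```

Note that this only fixes the value at the end of configuration; it says nothing about what happens between then and the moment the rest of the system is actually running.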

 

The other issue is that the time of exit from "configuration" (really when the GSR deasserts) is different than the deassertion of any global (or partial global) reset. This is, of course, the crux of the problem; you need to ensure that for a flip-flop with no explicit reset, nothing will disrupt the INIT value after configuration ends until the time that you are certain that your synchronous system is truly up and running (anything that uses the partial reset comes out of reset). This is what I mean by "it can't change shortly after configuration completes". If you can guarantee this, then you can do without reset - if you can't then you can't. But this determination is what makes the partial reset strategy complicated...

 

"I'm of the opinion that more could be done in the FPGA architecture to make a 'reset everything' strategy less costly."

 

I am not certain what this might be. In most architectures (but not all) you can freely use global clock networks to carry the global reset to a large number of flip-flops; this reduces the use of general fabric routing, removes any effect on the placement due to reset, and probably reduces all the timing paths associated with resets. In previous generations, this was fairly unattractive since there were so few global clock networks. This is less true in UltraScale (but I don't know for certain how easy it is to use global clock networks for reset in UltraScale).

 

Other than having dedicated reset networks (like the clock networks), there really isn't anything else Xilinx can do - the reset must get to all the endpoints, and it must do so within one clock period; this takes routing resources and introduces timing paths...

 

Avrum

markcurry
Scholar
Registered: ‎09-16-2009


@avrumw wrote:

This belies another bug, in that many times (more often than many presume, IMHO) configuration != reset. 

 

I am not sure I understand this point. Yes, the INIT value is not (necessarily) the same as a reset value, but if you have no reset, then you have only the INIT value. The mechanism of setting the INIT value is pretty well documented; providing an initial value to the underlying reg/logic/signal that ends up inferring the flip-flop (otherwise it is a 0).

 

The other issue is that the time of exit from "configuration" (really when the GSR deasserts) is different than the deassertion of any global (or partial global) reset. This is, of course, the crux of the problem; you need to ensure that for a flip-flop with no explicit reset, nothing will disrupt the INIT value after configuration ends until the time that you are certain that your synchronous system is truly up and running (anything that uses the partial reset comes out of reset). This is what I mean by "it can't change shortly after configuration completes". If you can guarantee this, then you can do without reset - if you can't then you can't. But this determination is what makes the partial reset strategy complicated...

 

 

 


 

Here's another example of why it's even more difficult to rely on the GSR deassertion and configuration only: what happens if your clock is unstable after configuration is complete?

 

I'd consider that as further trouble (upon the already difficult things you've noted) of just relying on configuration values.

 

Why would your clock be unstable?  Well, most of my clocks come from an MMCM, which is not locked when configuration is complete.  So all flops tied to a clock coming from an MMCM have an even harder time holding their configuration values.  Can anyone reliably say that all FFs tied to an MMCM output clock will behave deterministically when the PLL is unlocked?  Will I always have a consistent initialization state when lock is achieved? I'd have trouble arguing that point.

 

Anything using a PCIE_CLK on an x86 desktop: that clock will usually NOT be reliable at cold boot.  The FPGA configuration will be complete BEFORE PCIE_CLK is stable.  (Well, it's a race, and one that doesn't reliably go one way or the other.)  You must explicitly reset with the PCIE_RSTB pin coming from the interface.  (This is the example that bit me a few years ago.)
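
To make the two cases above concrete, here is a rough sketch (not taken from any reference design) of one common way to hold user logic in reset until the MMCM reports lock and an external active-low reset pin is released. All module and port names are hypothetical, and the two-stage depth is just the usual minimum.

```verilog
// Hypothetical reset generator: asserts asynchronously, deasserts
// synchronously to clk_i once both reset sources are inactive.
module reset_gen (
  input  wire clk_i,          // clock produced by the MMCM
  input  wire mmcm_locked_i,  // MMCM LOCKED output
  input  wire ext_rst_n_i,    // active-low board reset (e.g. a PCIe reset pin)
  output wire reset_o         // active-high reset for user logic
);
  // Reset is requested while the MMCM is unlocked OR the pin is low.
  // Assumption: a glitch on this combined signal is acceptable here.
  wire reset_async = ~mmcm_locked_i | ~ext_rst_n_i;

  // Starts asserted at the end of configuration (INIT = 1).
  (* ASYNC_REG = "TRUE" *) reg [1:0] sync_q = 2'b11;

  always @(posedge clk_i or posedge reset_async)
    if (reset_async)
      sync_q <= 2'b11;               // assert immediately
    else
      sync_q <= {sync_q[0], 1'b0};   // shift in 0s to deassert

  assign reset_o = sync_q[1];
endmodule
```

Any flip-flop that takes reset_o is then in a known state regardless of what the clock did while the MMCM was still locking.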

 

What if the SW running on some microprocessor on the board just desires to reset the FPGA, post configuration? 

 

There are just so many things working against you if you don't explicitly reset darn near everything.  The white paper Xilinx likes to point people at pulls out some (completely made up) number, that 95+ percent of the time you just don't need a reset, and that only a small minority of the time should you really reset.  But in reality, I'd argue it's exactly the opposite - it's a small minority of cases where one shouldn't explicitly reset.  Yes, pipeline datapath registers are PROBABLY OK.  There are others, sure.  But as you said, it's difficult to always be certain, and the consequences of getting one wrong can be very troublesome.

 

Regards,

 

Mark

fpgalearner
Voyager
Registered: ‎04-11-2016

Hi @avrumw

 

concerning this:

"The solution is to manage your reset appropriately. One way or another you need to split or reduce the load on your reset. This could be done by moving to a "partial" global reset, and not resetting all flip-flops, or replicating the source flop (either manually or letting the tool do it), or by designing your own pipelined reset tree."

 

I am replacing 2 reset ports from a reference design with 2 internal signals with initial value '0'. The reason is that I have a custom-designed FPGA board that has no reset port but is otherwise similar to the evaluation board.

I tested the reference design on the standard evaluation board, and it works there. All I did was change the pin configuration to match the custom board. For components such as the UART, reset, etc., which are not available on the custom board, I commented out the port and defined it as an internal signal.

Is this the right thing to do in such a case?

I don't know how to do this:

"This could be done by moving to a "partial" global reset, and not resetting all flip-flops, or replicating the source flop (either manually or letting the tool do it), or by designing your own pipelined reset tree"

 

Any hints?

markcurry
Scholar
Registered: ‎09-16-2009


fpgalearner wrote:

Don't know how to do this:

"This could be done by moving to a "partial" global reset, and not resetting all flip-flops, or replicating the source flop (either manually or letting the tool do it), or by designing your own pipelined reset tree"

 

Any hints?


Did you try my hint with just pipelining the reset?  Here's what I mean.  A typical reset synchronizer circuit would look something like (Verilog):

// Reset synchronizer: asynchronous assert, synchronous deassert.
// STAGES is assumed to be a module parameter (at least 2).
reg [STAGES-1:0] reset_i_D;
always @(posedge clk_i or posedge reset_i)
  if (reset_i)
    reset_i_D <= {STAGES{1'b1}};                 // Assert on reset in
  else
    reset_i_D <= {reset_i_D[STAGES-2:0], 1'b0};  // Shift in 0s to deassert

assign reset_o = reset_i_D[STAGES-1];

This code represents a typical circuit to pass the active edge of reset to downstream logic asynchronously, and properly synchronize the inactive edge.  (There are other similar circuits; use the above as an example. You probably have something like this already.)

Typically, it's recommended that "STAGES" be set to at least 2 (for metastability reasons).  I'm suggesting increasing "STAGES" to more than 2 - not for metastability concerns - but to allow the place and route tool more wiggle room.  Most of the time your circuit can handle more delay here just fine.  I'd think increasing this to 4-5 might be worth a shot.

 

I'd also take a look at the max_fanout attributes.  This all hinges on allowing the tool to replicate as necessary some of those pipeline stages, and reducing fanout of each leg, such that the tool has a better chance on meeting timing.
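
A rough sketch of what that could look like in Vivado-flavoured Verilog, assuming the reset has already been synchronized and is simply re-registered a few more times. The module name, the three stages, and the limit of 64 are arbitrary illustrative choices, not values from this thread; max_fanout and shreg_extract are Vivado synthesis attributes, and whether the tool honours them in a given design should be checked in the synthesis report.

```verilog
// Hypothetical pipeline placed after a reset synchronizer.
module reset_pipe (
  input  wire clk_i,
  input  wire reset_sync_i,   // already synchronized to clk_i
  output wire reset_o
);
  // shreg_extract = "no": keep these as real flip-flops rather than an SRL,
  //                       so they can be placed and replicated individually.
  // max_fanout = 64     : ask synthesis to replicate a register whose
  //                       fanout exceeds this limit.
  (* shreg_extract = "no", max_fanout = 64 *)
  reg [2:0] reset_pipe_q = 3'b111;   // start asserted after configuration

  always @(posedge clk_i)
    reset_pipe_q <= {reset_pipe_q[1:0], reset_sync_i};

  assign reset_o = reset_pipe_q[2];
endmodule
```

The cost is three extra clock cycles of reset latency, which most designs can tolerate.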

 

I find this first step much easier than going in and starting to build the tree yourself.  That's certainly an option, but you really need to dig in at that point, and get a lot of details correct.  I'd much rather the tool do this for me.

 

Regards,

 

Mark

 

avrumw
Guide
Registered: ‎01-23-2009

@markcurry

 

"Typically, it's recommended that "STAGES" be set to at least 2 (for metastability reasons).  I'm suggesting increasing "STAGES" to more than 2 - not for metastability concerns - but to allow the place and route tool more wiggle room."

 

We have to be careful with this. The shift register you are using is serving two purposes (call the total number of stages N):

   - the first M stages are for metastability (M is at least 2)

   - the last N-M stages are for pipelining to allow replication

 

From the RTL code point of view, these two look the same, and in your RTL a single shift register is used for both.

 

However as far as the tools are concerned, it is very important that the tools be able to tell the difference between them. For Vivado, this is specifically done with the ASYNC_REG property; it must be placed on the first M stages and must not be placed on the last N-M stages.

 

Without the ASYNC_REG on the first M flip-flops, the tool will not recognize this as a legal synchronizer. Aside from things like failing report_cdc, this can potentially lead to system failure from a number of causes

  - if the tool replicates any one of these flip-flops, then it is an invalid synchronizer

  - if there is too much routing delay between any two of these flip-flops, then it weakens (or invalidates) the quality of the metastability resolution

 

The ASYNC_REG property prevents both of the above.

 

Conversely, if you put the ASYNC_REG on all N registers, they all become part of the metastability resolution chain. This is fine from the point of view of metastability, but will not give the tool any "wiggle room" - a flip-flop with the ASYNC_REG property cannot be replicated.

 

When both types of flip-flops are in a single shift register like this, you can't set the ASYNC_REG on the first M and not the rest in your RTL code (you cannot set a Verilog attribute on some bits of a bus, and not others). You probably can do it in your synthesis XDC file, but it is really better to do it in the RTL...
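
One way to do it in the RTL is to split the single shift register into two separately declared vectors, so the attribute can go on one and not the other. The following is only a sketch under that assumption; the module name, parameter names, and default depths are hypothetical.

```verilog
// Hypothetical split of the shift register: M synchronizer stages
// (ASYNC_REG) followed by N-M pipeline stages (no ASYNC_REG).
module reset_sync_pipe #(
  parameter SYNC_STAGES = 2,   // M:   metastability stages, must be >= 2
  parameter PIPE_STAGES = 3    // N-M: pipelining stages,    must be >= 2
) (
  input  wire clk_i,
  input  wire reset_i,         // asynchronous, active high
  output wire reset_o
);
  // Synchronizer flops: must stay together and must not be replicated.
  (* ASYNC_REG = "TRUE" *)
  reg [SYNC_STAGES-1:0] sync_q = {SYNC_STAGES{1'b1}};

  // Pipeline flops: deliberately left without ASYNC_REG so the tool
  // remains free to replicate and spread them.
  reg [PIPE_STAGES-1:0] pipe_q = {PIPE_STAGES{1'b1}};

  always @(posedge clk_i or posedge reset_i)
    if (reset_i) begin
      sync_q <= {SYNC_STAGES{1'b1}};   // assert everything immediately
      pipe_q <= {PIPE_STAGES{1'b1}};
    end else begin
      sync_q <= {sync_q[SYNC_STAGES-2:0], 1'b0};
      pipe_q <= {pipe_q[PIPE_STAGES-2:0], sync_q[SYNC_STAGES-1]};
    end

  assign reset_o = pipe_q[PIPE_STAGES-1];
endmodule
```

With the defaults, reset_o deasserts SYNC_STAGES + PIPE_STAGES = 5 rising clock edges after reset_i is released.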

 

Finally, it is worth pointing out that the extra pipelining only helps if either

  - the reset synchronizer and all (or at least most of) its loads are in the same module or

  - the design is set to flatten_hierarchy = full or rebuilt (both of which I personally avoid)

 

If neither of these is true, then the tool can't replicate the flip-flops in the chain, since it must preserve the one and only port carrying the reset signal from one module to another.

 

Avrum

muzaffer
Teacher
Registered: ‎03-31-2012

@markcurry 

>> We're more likely to hit the issues as we ignore the reckless-at-best "lazy" reset strategies that Xilinx advocates.

 

My take is that "avoiding work you don't need to do" is not being lazy. I do a lot of DSP-based designs, and the way I look at reset is decided by the partition in the design: the datapath doesn't need reset and the control logic does. There is no point in resetting all of those DSP48s and fabric logic which take an input from (say) the ADC and produce an output to the DAC, given there is a state machine which manages the pipeline and generates the right enables and other control signals for the output. Obviously the control state machine needs a very robust reset implementation, but the datapath needs none of it; data simply flows through it. Only maybe the CE signals for power management are necessary from the state machine.
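
A minimal sketch of that partition, with hypothetical names and widths: the valid flag (control) gets a synchronous reset, while the data register (datapath) has none, because its contents are ignored until the flag is set.

```verilog
// Hypothetical pipeline stage: reset on control, none on data.
module pipe_stage (
  input  wire        clk,
  input  wire        rst,       // synchronous reset, control path only
  input  wire        valid_i,
  input  wire [15:0] data_i,
  output reg         valid_o,
  output reg  [15:0] data_o
);
  // Control: reset so that downstream logic never treats stale data as valid.
  always @(posedge clk)
    if (rst) valid_o <= 1'b0;
    else     valid_o <= valid_i;

  // Datapath: no reset; the value is a don't-care until valid_o is high.
  always @(posedge clk)
    data_o <= data_i;
endmodule
```

Only one flip-flop per stage sits on the reset net here, instead of seventeen.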
