Observer dordije
Observer
499 Views
Registered: ‎09-06-2018

Timing analysis for spartan 6 and external logic


Hi, 


I'm using a Spartan 6 with ISE 14.7 to control an external chip. That chip generates a 250MHz clock that is used to synchronize control signals generated in the FPGA. So the synchronization clock is an input to the FPGA, while the control signals are outputs, as shown in the picture.

Blank Diagram.png

The required setup time for the control signal is about 1.75ns relative to the synchronization clock. There are also propagation delays of the signals on the PCB which should be included in the analysis (assume 1ns each for clock and control). So far I have tried using a DCM on the input clock, but timing closure could not be achieved. The best output delay that can be achieved is around 8ns. I have used an OFFSET OUT AFTER constraint relative to the synchronization clock. Is there another way to constrain this design so that the setup time is met? Is this actually possible using a Spartan 6?


Accepted Solutions
Guide avrumw
Guide
431 Views
Registered: ‎01-23-2009

Re: Timing analysis for spartan 6 and external logic


OK there are at least two different issues here.

The first is the one that @drjohnsmith focussed on - how are you going to bring data from a 125MHz domain to an unrelated 250MHz domain? Even though one clock frequency is a multiple of the other, if the clocks are unrelated (i.e. coming from two different reference sources), then they must be treated as asynchronous domains, and a proper clock domain crossing circuit (CDCC) must be placed between them. NOTE: In ISE, two independent clock inputs are considered unrelated, which means that ISE will NOT perform timing analysis on any path between them without an extra constraint (a FROM/TO). In other words, if you don't explicitly constrain the path between the two domains, there is no constraint on them, and ISE will not flag the path as failing (or anything else).

There is a second issue, though - can you meet synchronous timing on an output interface at 250MHz with an input clock? The answer is definitely "no" with this clock structure, and probably still "no" with most other clock structures.

As you already found out using an IBUFG -> BUFG -> IOB flip-flop (and the clock input must be on a clock capable pin and you definitely should be using an IOB flip-flop), you don't get anywhere near the requirement of 2.25ns (the 4.00ns period minus the 1.75ns setup requirement) - if you add the two 1.0ns board delays, the requirement is down to 0.25ns! Clearly impossible.
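As a rough sketch of the arithmetic above, the output delay budget can be worked out in a few lines of Python. The function and variable names are illustrative only (they are not part of ISE or any Xilinx tool), and the 8ns figure is the best output delay reported in the original question.

```python
def output_delay_budget_ns(period_ns, setup_ns, clock_board_ns, data_board_ns):
    """Return (budget at the FPGA pins, budget after subtracting board delays)."""
    # Time available between the launching clock edge and the latest
    # arrival that still meets the external device's setup time.
    at_pins = period_ns - setup_ns
    # The clock trace (device -> FPGA) and the data trace (FPGA -> device)
    # both eat into that budget.
    with_board = at_pins - clock_board_ns - data_board_ns
    return at_pins, with_board


PERIOD_NS = 4.0     # 250MHz synchronization clock
SETUP_NS = 1.75     # setup requirement of the external device
MEASURED_NS = 8.0   # best output delay reported in the question

at_pins, with_board = output_delay_budget_ns(PERIOD_NS, SETUP_NS, 1.0, 1.0)
meets_timing = MEASURED_NS <= with_board

print(f"budget ignoring board delays:  {at_pins:.2f} ns")
print(f"budget including board delays: {with_board:.2f} ns")
print(f"8 ns output delay fits: {meets_timing}")
```

With the numbers from this thread, the budget shrinks from 2.25ns to 0.25ns once both board delays are included, which is far below the reported 8ns.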

So clearly you need to look at modifying the timing. You say you put a DCM in the path, so you have the ability to manipulate the phase of the internal clock with respect to the external one - with the DCM at this speed you can do pretty much any delay (from 0 to 4ns). This will help, but will it be enough?

The bigger question is the variability - you say your device needs 1.75ns of setup - how much hold does it need? I will assume (for the moment) that it is 0. This means that you need a 1.75ns valid window for your device, leaving 2.25ns for remaining uncertainty. Your board delays, while long, do not vary much with process/voltage/temperature (PVT) - you can assume something like 10% variation - so this 2ns of delay adds about another 0.2ns of variation.

Inside the FPGA, the DCM will cancel out much of the PVT variation of some of the components - the delay of the IBUF, the route to the DCM (which is dedicated), from the DCM to the BUFG (also dedicated), and through the global clock tree. This is cancelled out to a very small amount (theoretically 0, but not actually 0). This leaves the delay of the IOB flip-flop and the OBUF. The IOB flip-flop has a fairly small delay (say around 0.5ns - just guessing), but the OBUF (the actual output driver) has a very large one. The magnitude depends greatly on the I/O standard, the drive strength and the slew rate - based on these parameters it can be anywhere between 3ns (for the fastest ones) to 8ns or so (for the slowest).

A rule of thumb for PVT variation of silicon transistors is 3:1 - so if your output delay is 8ns at the slowest PVT corner, then it will be around 2.67ns at the fastest, for a whopping 5.33ns of PVT uncertainty. This is WAY WAY WAY more than your system can tolerate. So let's use the fastest drive and slew (which may cause some serious overshoot and ringing) of 3ns and add all the uncertainty up:

  • DCM compensated stuff (IBUF, BUFG, global clock): let's say 0.2ns
  • OBUF: 3ns * 2/3 = 2ns
  • Board delay: 2ns * 0.1 = 0.2ns

This adds up to 2.4ns. Your device (we guess) needs a 1.75ns setup/hold window, so together this adds up to 4.15ns - larger than the 4ns of your period. So, no, it is not possible to meet timing this way (even assuming best case).
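The uncertainty budget above can be sketched in Python as follows. This only restates the guessed numbers from this post (the 3:1 PVT rule of thumb, 0.2ns of residual DCM-compensated variation, 10% board variation, and an assumed hold time of 0); the names are hypothetical and none of the values come from a datasheet or a timing report.

```python
PERIOD_NS = 4.0           # 250MHz clock period
VALID_WINDOW_NS = 1.75    # 1.75ns setup + 0ns hold (hold is an assumption)


def pvt_spread_ns(slowest_ns, ratio=3.0):
    """Delay uncertainty between slowest and fastest corner for a ratio:1 rule."""
    fastest_ns = slowest_ns / ratio
    return slowest_ns - fastest_ns


uncertainty_ns = {
    "dcm_compensated": 0.2,         # IBUF, BUFG, global clock (guess)
    "obuf": pvt_spread_ns(3.0),     # fastest I/O standard: 3ns * 2/3
    "board": 2.0 * 0.10,            # 2ns of trace delay, about 10% variation
}

total_uncertainty_ns = sum(uncertainty_ns.values())
required_ns = total_uncertainty_ns + VALID_WINDOW_NS
margin_ns = PERIOD_NS - required_ns

print(f"total uncertainty: {total_uncertainty_ns:.2f} ns")
print(f"uncertainty + valid window: {required_ns:.2f} ns")
print(f"margin against the period: {margin_ns:.2f} ns")
```

A negative margin means no phase setting can guarantee the required valid window, because shifting the clock moves the window but does not widen it.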

There is one other clock topology that can help - take a look at (wow, they don't show this model for any modern FPGA - let's go back to Virtex-5) - UG190 Figure 2-9 - "Board-Level Clock using DDR Register with External Feedback". This kind of clocking model can still be used with modern devices (including the Spartan-6). By doing this (assume there is an OBUF after the DDR - there has to be), the DCM cancels out the delay of the ODDR and OBUF as well as the BUFG and IBUFG. The net result is that if this clock (on the BUFG) clocks another IOB flip-flop, the skew between the output and the fed back clock (which is now "in phase" with the input clock at the pins) is very small. When done this way, the total delay from the input clock to the output data is almost 0, and the PVT variation is almost 0. You can also use the CLK90 (with another BUFG) to get a delay of 1/4 clock with (again) almost no variation. With this topology you can probably meet the timing you need.
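To put a number on the "1/4 clock" delay, here is a minimal Python sketch that converts a phase offset into a time delay at a given clock frequency. The helper name is made up for illustration, and it only does the unit conversion; it says nothing about the residual skew of a real DCM.

```python
def phase_delay_ns(phase_deg, freq_mhz):
    """Convert a clock phase offset in degrees to a delay in nanoseconds."""
    period_ns = 1000.0 / freq_mhz
    # Wrap the phase into [0, 360) so 450 degrees behaves like 90 degrees.
    return period_ns * (phase_deg % 360) / 360.0


clk90_delay_ns = phase_delay_ns(90, 250.0)
print(f"CLK90 at 250MHz delays the output by about {clk90_delay_ns:.2f} ns")
```

At 250MHz a quarter of the 4ns period is 1ns, so the two options described above place the output edge at roughly 0ns or roughly 1ns after the input clock edge at the pins.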

But, there are a whole bunch of issues.

First, there is absolutely no way to generate timing reports for this (or any other) output structure in ISE - ISE can generate a maximum output delay, but not a meaningful minimum, so you cannot analyze the PVT variation of the system (so you have to guess).

Second, this topology needs a specific feedback path on your board (from an output pin back to another clock capable pin in the same bank as the input clock). If the board is already designed, or you are using a development board, then this path doesn't exist (and hence you can't use it).

Third, the internal clock (the output of the BUFG) is essentially phase unknown - so trying to do any synchronous clock crossing to/from that domain is difficult or impossible. But you are already saying that you are using clock domain crossing techniques from your 125MHz domain, so the fact that this internal clock has a different phase doesn't make this any harder (I assume you are using correct clock domain crossing techniques).

Fourth, this internal clock cannot be used for any meaningful input capture (this is really related to the above) - the phase of the clock is so unknown that it is impossible to know where the required setup/hold window of an input captured with this clock will be.

Avrum


9 Replies
Teacher drjohnsmith
Teacher
481 Views
Registered: ‎07-09-2009

Re: Timing analysis for spartan 6 and external logic

Well, that's never going to work.
You have two clocks that are not related to each other,
so it's impossible to do timing analysis.

What you basically have is a clock domain crossing (CDC).

Possible ways around it:
Is the 250 MHz constant? Could the FPGA run off that clock?

Without more details, there is not much more I can say.

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
Observer dordije
Observer
478 Views
Registered: ‎09-06-2018

Re: Timing analysis for spartan 6 and external logic


Hi @drjohnsmith ,

Thank you for your response. The FPGA runs on a 125 MHz clock, and the 250MHz clock is constant. Yes, it is a clock domain crossing from the 125 MHz domain to the 250 MHz external clock.

Kind regards

Teacher drjohnsmith
Teacher
468 Views
Registered: ‎07-09-2009

Re: Timing analysis for spartan 6 and external logic

If 250 MHz is always available,
bring that into a clock pin on the FPGA and use it internally.

You are going to have to ensure your output registers are in the IOB.
Observer dordije
Observer
465 Views
Registered: ‎09-06-2018

Re: Timing analysis for spartan 6 and external logic


That clock is brought in through a pin and buffered. Also, all output registers are in the IOB. However, I was not sure whether a Spartan 6 can actually achieve the timing constraints dictated by the external logic.

Teacher drjohnsmith
Teacher
460 Views
Registered: ‎07-09-2009

Re: Timing analysis for spartan 6 and external logic

250 MHz out should be possible,
but the 1ns board delay worries me. How long a track is that?

What sort of speed is the control signal? Is it a single pulse one 250 MHz clock period wide, or a level?
Observer dordije
Observer
314 Views
Registered: ‎09-06-2018

Re: Timing analysis for spartan 6 and external logic


Hi @avrumw,

Thank you very much for your answer. It explains a lot to me. As I do not have a feedback path on the board, the other clock topology cannot be used in this case. For clock domain crossing, I have been using two-flip-flop synchronizers clocked by the 250MHz synchronization clock. I am aware of metastability issues, but I believed this would be enough to cross from the 125MHz domain to the 250MHz domain.

Back to the device, it needs 1.75ns of setup and 0ns of hold time, as you have already assumed. The picture below shows the relation between the control signal and synchronization clock. 

Untitled.png

It is also stated that "this arrangement requires that the round trip delay from pin 55 to pin 59 is: k × tSYNC_CLK – 1.75ns, where k is an integer and 1.75 ns is the specified minimum setup time. For minimum latency, ensure that k = 1." Is it possible to use k=2 in this topology with DCM?
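For reference, the round trip delay given by the quoted formula can be tabulated for the first few values of k. This is just a sketch of the arithmetic in Python, assuming tSYNC_CLK is the 4ns period of the 250MHz clock; the function name is hypothetical.

```python
def required_round_trip_ns(k, t_sync_clk_ns=4.0, setup_ns=1.75):
    """Round trip delay from the quoted formula: k * tSYNC_CLK - setup."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return k * t_sync_clk_ns - setup_ns


targets_ns = {k: required_round_trip_ns(k) for k in (1, 2)}
for k, delay in targets_ns.items():
    print(f"k = {k}: round trip delay of {delay:.2f} ns")
```

So k = 1 corresponds to 2.25ns and k = 2 to 6.25ns, one full clock period later.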

Kind regards

Teacher drjohnsmith
Teacher
282 Views
Registered: ‎07-09-2009

Re: Timing analysis for spartan 6 and external logic

I am constantly amazed just how difficult chip manufacturers make it to interface FPGAs to their chips. .........

Guide avrumw
Guide
258 Views
Registered: ‎01-23-2009

Re: Timing analysis for spartan 6 and external logic


Is it possible to use k=2 in this topology with DCM?

Based on my previous analysis, the answer is no. Since the sum of the uncertainty plus the required setup/hold time is greater than the clock period, there is no guaranteed 1.75ns wide valid window at any point on this output pin - so regardless of what you do to shift it (backward to k=1 or forward to k=2), it is not wide enough.

Avrum