Sampled Digital Interface

The image below describes what I will call a "Sampled Digital Interface" between an FPGA and an external device.

 

Other Description:

    CLK2-period is 8x CLK1-period

    CLK2 is asynchronous with CLK1

    "2FF-Sync" is a standard 2-Flop synchronizer

 

My FPGA firmware uses a state machine clocked by CLK1:

    state-1) Wait for rising edge of Cs

    state-2) Wait N-cycles of CLK1 and then latch/save the value of Ds, (4 < N < 8)

    state-3) Return to state-1

 

I think that the delay time restriction, 4 < N, is necessary to account for both the CLK2-to-DAT skew and the settling time of the 2FF-Sync?

 

Will this approach allow the FPGA to reliably capture data, DAT, coming from the "External Device"?

 

SmDgIn1.jpg

 

Accepted Solutions
Assuming the uncertainty between CLK2 and DAT is (a fair bit) less than the clock period of CLK1 then this should work (and should work reliably).

 

To figure out the robustness of the interface, you need to look at the width of the data eye of DAT with respect to CLK2, and then subtract out one clock period of CLK1 (and a bit more) - see the analysis below.

 

Draw yourself a little diagram that has the timing relationships of CLK2 and DAT marked out (including the valid/invalid periods). Then place the rising edge of CLK1 at exactly the same time as CLK2, then draw the results assuming that the flip flop CS_reg "just misses" the edge of CLK2, and "just gets" the edge - since you can't know which it will do in this case. Count off the N-cycles of CLK1 in both cases and then determine the minimum margin of the DAT window (after 2 FFs) with respect to the Nth edge of CLK1. Subtract out "a little" for the setup/hold requirement of the flip-flops and a little more for clock skew and clock jitter, and this is the robustness of this capture mechanism.
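The diagram exercise above can also be written as a small worked calculation. This is only a sketch under stated assumptions: the symbols below are not from the original post, and it assumes Cs and Ds pass through identical 2FF synchronizers so that their latency cancels and only the sampling instant of the first-stage flip-flops matters.

```latex
% Assumed symbols (introduced here for illustration):
%   T_1        : period of CLK1
%   t = 0      : rising edge of CLK2 at the FPGA pin
%   [t_a, t_b] : window, relative to t = 0, in which DAT is valid
%   u          : time of the first CLK1 edge that sees CLK2 high, 0 \le u < T_1
%                (u near 0 is "just gets" the edge, u near T_1 is "just misses" it)
%   k          : CLK1 cycles from that edge to the edge that samples the DAT value
%                finally latched (N plus a fixed offset from the edge detector)
\begin{align*}
t_{\text{sample}} &= u + k\,T_1, \qquad 0 \le u < T_1 \\
m_{\text{early}}  &= \min_u t_{\text{sample}} - t_a = k\,T_1 - t_a \\
m_{\text{late}}   &= t_b - \sup_u t_{\text{sample}} = t_b - (k+1)\,T_1 \\
m_{\text{early}} + m_{\text{late}} &= (t_b - t_a) - T_1
\end{align*}
```

The last line is the "data eye minus one clock period of CLK1" figure; N is chosen so that both margins are positive, and the setup/hold, skew and jitter allowances then come out of whichever margin is smaller.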

 

And, by the way, this is done all the time - it is known to work, and 8x is usually more than sufficient (as long as the DAT data window is wide enough). It is commonly referred to as "oversampling" the interface.

 

Standard caveats: Be sure to place ASYNC_REG property on the 2-Flop synchronizers, and possibly consider 3-Flop synchronizers (depending on the frequency of CLK1).
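As a rough sketch of the first caveat, the property can be set in the XDC. The cell names here are hypothetical placeholders for the two flip-flops of each 2FF-Sync, not names taken from the design in this thread; use the register names from your own netlist.

```tcl
# Hypothetical cell names: cs_sync_reg[*] and ds_sync_reg[*] stand in for the
# flip-flops of the CLK2 and DAT synchronizers. Replace with your own names.
set_property ASYNC_REG TRUE [get_cells {cs_sync_reg[*] ds_sync_reg[*]}]
```

The same property can alternatively be attached as an ASYNC_REG attribute on the register declarations in the HDL source.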

 

Avrum

          To figure out the robustness of the interface …… Draw yourself a little diagram…

That was my next question. Your explanation is very clear and makes perfect sense to me.

 

For this oversampled-interface (thanks for correct name), maybe we must tell implementation to keep Ds_reg and Cs_reg close to (or the same distance from) the FPGA ports, DAT and CLK2.  Otherwise, the CLK2-to-DAT skew effectively changes?  So, should we put timing constraints on the oversampled-interface for this purpose – or for other purposes?

 

An alternative to the oversampled-interface is the standard synchronous approach.  That is, make CLK1=CLK2, maybe use the FPGA clock module to deskew CLK1, write set_input_delay constraints, etc.  This synchronous approach is still a little cryptic to me (especially the set_input_delay constraints), while operation and analysis of the asynchronous oversampled-interface seems perfectly clear.  Can you say a little (or a lot) about pros/cons of using the asynchronous vs synchronous approach for digital interfaces that reach outside the FPGA?

For this oversampled-interface (thanks for correct name), maybe we must tell implementation to keep Ds_reg and Cs_reg close to (or the same distance from) the FPGA ports, DAT and CLK2.

 

All (recent) FPGA families have "IOB flip-flops" - these are registers that are actually located in the same cell as the input buffer (in the Input/Output Buffer - IOB). Since they are part of the same cell, there is effectively no routing delay between the input buffer and the IOB flip-flop. For an interface like this you want (really need) to ensure that both Ds_reg and Cs_reg get packed into the IOB. This is done by setting the IOB property on the ports in your XDC:

 

set_property IOB TRUE [get_ports {DAT CLK2}]

 

An alternative to the oversampled-interface is the standard synchronous approach.  That is, make CLK1=CLK2, maybe use the FPGA clock module to deskew CLK1, write set_input_delay constraints, etc.  This synchronous approach is still a little cryptic to me (especially the set_input_delay constraints), while operation and analysis of the asynchronous oversampled-interface seems perfectly clear.  Can you say a little (or a lot) about pros/cons of using the asynchronous vs synchronous approach for digital interfaces that reach outside the FPGA?

 

Wherever possible, you should attempt to do normal synchronous interfaces (rather than oversampling). The most obvious advantage is that you don't lose the interface margin due to the uncertainty of the phase of your oversampling clock (the 1 clock period of CLK1 uncertainty). Furthermore, with proper constraints, the tools give you a simple "pass/fail" result of the static timing analysis of the interface - something you cannot get with the oversampled interface.

 

Furthermore, there are frequencies where oversampling (with conventional I/O) simply won't work - you need to be able to run your internal clock several times faster than the data rate - and there is an inherent limit to the speed of an internal clock (depending on the device this is around 600-800MHz); you can go effectively faster with DDR, but even with that, this places an upper limit on the oversampling technique.
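To put a rough number on that limit, take the 8x ratio used in this thread and the 600-800MHz range above. This is only back-of-the-envelope arithmetic that ignores DDR capture; the symbol names are introduced here for illustration.

```latex
% f_{\text{CLK1,max}} : assumed upper limit of the internal sampling clock
% R = 8               : oversampling ratio used in this thread
\[
f_{\text{CLK2,max}} \approx \frac{f_{\text{CLK1,max}}}{R}
\quad\Rightarrow\quad
\frac{600\ \text{MHz}}{8} = 75\ \text{MHz}
\quad\text{to}\quad
\frac{800\ \text{MHz}}{8} = 100\ \text{MHz}
\]
```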

 

My recommendation is that you take the time to learn the constraint mechanism so that the set_input_delay command makes sense to you - it isn't "cryptic" - it just takes some time to learn properly...

 

Avrum

Avrum - many thanks for your time and answers! 

 

  ...take the time to learn the constraint mechanism so that the set_input_delay command makes sense to you..

You are, of course, right.  I will make the effort.

 

When we use the constraint, set_input_delay, on a path then this allows timing analysis to be run on the path. Timing analysis and implementation are run together, with implementation “doing things” to help the path pass timing analysis.  Is the “doing things” more than just modifications to "place and route"?

 

Mark

When we use the constraint, set_input_delay, on a path then this allows timing analysis to be run on the path.

 

Yes. Without a set_input_delay, the logic on the input is not part of a complete static timing path, and hence is ignored by the timing analysis.
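As a minimal sketch of what such constraints can look like for the synchronous version of this interface (DAT captured on the clock that arrives with it), something along these lines could go in the XDC. The port names are reused from this thread, but the clock name, the 10 ns period and the 1.0/3.0 ns delays are made-up illustration values; the real ones come from the external device's datasheet and the board delays.

```tcl
# Assumed for illustration: the clock arriving on port CLK2 runs at 100 MHz.
create_clock -name clk2 -period 10.000 [get_ports CLK2]

# Assumed for illustration: at the FPGA pins, DAT changes no later than 3.0 ns
# (-max) and no earlier than 1.0 ns (-min) after the rising edge of CLK2.
set_input_delay -clock clk2 -max 3.000 [get_ports DAT]
set_input_delay -clock clk2 -min 1.000 [get_ports DAT]
```

With both the -max and -min values present, the tools can check setup and hold on the path from the DAT pin to its capture flip-flop.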

 

Timing analysis and implementation are run together, with implementation “doing things” to help the path pass timing analysis.  Is the “doing things” more than just modifications to "place and route"?

 

So, normally, yes, and a bit more.

 

Static timing analysis (STA) is an integral part of all the steps in FPGA implementation. Vivado synthesis is "timing aware" and will construct different logic given different constraints - often sacrificing area for speed when necessary as determined by the constraints.

 

After synthesis, STA will help the placer determine where on the die to place components and help the router to choose the best routes for the nets between the components - the tools must preserve the functionality, and attempt to find placement and routing that meet the STA requirements (as determined by the constraints).

 

Depending on the path, though, the tools have more or less freedom in how they can implement a path. When looking at a typical input path, for example, we have generally given the synthesis tool no flexibility; we have directly instantiated or inferred the capture flip-flop, and have specifically constructed the clocking scheme (what kind of clock buffer is used, whether a DCM/MMCM/PLL is used, and if so, how it is driven and programmed). So the synthesized result is independent of constraints.

 

In many cases (in fact, ideally) the same is true of place and route. The "best" way to implement an input interface is to capture the data in an IOB flip-flop, an IDDR, or an ISERDES. These resources are located in the input/output block (IOB), and hence are in a fixed location based on the pin you have chosen for the input.

 

So for these interfaces, the constraints do not have any mechanism of influencing the performance of the interface. As a result, the only thing the STA can do is confirm (or deny) that the interface will operate properly. With proper constraints (that completely describe the characteristics of the clock and the input signals, including jitter, latency, duty cycle imbalance, etc...), if the tool says the interface meets timing, then the interface will work reliably on any part of the specified type, at any legal combination of board voltage and device temperature.

 

So for most I/O interfaces, this is all the constraints do - determine whether the interface will work or not. Of course, this is extremely important to most designers (!), and this is the only way to do it...

 

Avrum