Participant
12,404 Views
Registered: ‎12-06-2011

[Vivado] Constraining an asynchronous I/O interface

Chapter 5 of UG903 explains how to constrain FPGA I/O pins for setup, hold and clock-to-out when talking to another chip that has a synchronous interface. This is straightforward enough.

 

In Vivado, it seems that one is always expected to specify two things: (i) a clock, which defines the period for a synchronous interface, and (ii) board delays plus the setup, hold and clock-to-out of the other device. From these, Vivado can calculate (1) the setup and hold times required of the FPGA inputs and (2) the clock-to-out required of the FPGA outputs. The implication is that every I/O pin is synchronous to something (which can be a virtual clock).

 

My question is: what is the methodology for constraining setup, hold and clock-to-out required of FPGA pins when the other device has an asynchronous interface (e.g. a Flash chip in asynchronous mode), where there is no clock common to the FPGA and the other device?

 

In ISE, one could directly specify the required setup, hold and clock-to-output times of FPGA I/O pins. ISE's approach does not imply that an I/O interface is synchronous (I'm not bashing Vivado, merely noting the difference).

 

Vivado's approach seems to imply that if an I/O interface is asynchronous, you must invent one or more virtual clocks, corresponding to sets of setup, hold & clock-to-out parameters for the various pins on the other device.
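
For concreteness, here is a minimal sketch of the virtual-clock style I am describing (the clock name, port names and numbers are invented purely for illustration):

# Virtual clock standing in for the asynchronous Flash access cycle (made-up 100 ns)
create_clock -name flash_vclk -period 100.000
# Data valid window of the Flash, expressed as input delays (placeholder values)
set_input_delay -clock flash_vclk -max 30.000 [get_ports FLASH_DQ*]
set_input_delay -clock flash_vclk -min 0.000 [get_ports FLASH_DQ*]
# Address setup/hold required by the Flash, expressed as output delays (placeholder values)
set_output_delay -clock flash_vclk -max 10.000 [get_ports FLASH_A*]
set_output_delay -clock flash_vclk -min -5.000 [get_ports FLASH_A*]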

 

The other way of doing it is to pretend that your interface is synchronous. Setup between pins (e.g. between CE# and Address of the Flash chip) can be satisfied using a sequencer that inserts the correct number of cycles of delay (using an internal clock in the FPGA). Then you just need to make sure that your FPGA I/O pins use IOB flip-flops. But I find this approach less rigorous, and you will not in general be able to run at the maximum possible speed, because you have to add some timing margin to cover min/max uncertainties.

 

 

 

Is my understanding correct, or is there some other way of constraining FPGA I/O pins for asynchronous interfaces? Are there Tcl commands other than set_input_delay and set_output_delay for this sort of situation?

 

Thanks in advance for any pointers.

Tom

 

8 Replies
Contributor
3,696 Views
Registered: ‎10-16-2017

I wonder why this question has been left unanswered?

It would be quite interesting to know the proper, recommended way to constrain asynchronous I/O interfaces.

Mentor
3,645 Views
Registered: ‎06-09-2011

@adatatw,

 

First, if you have an asynchronous signal, you only need to constrain the path using set_max_delay / set_min_delay in Vivado. Take a look at the Vivado Tcl Command Reference Guide to see the numerous switches you can use with these commands.
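
For example, something along these lines (the port name, cell name and values are invented here, just to show the form of the commands):

# Bound the route delay from an asynchronous input into its first synchronizer stage
set_max_delay 10.000 -from [get_ports ASYNC_RDY] -to [get_cells rdy_sync_reg0]
set_min_delay 0.000 -from [get_ports ASYNC_RDY] -to [get_cells rdy_sync_reg0]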

 

Be careful: the ISE approach for defining asynchronous signals is the FROM:TO constraint, which is also used for defining multi-cycle paths. Refer to page 60 of UG612:

[Attached image: Asynchronous.jpg - FROM:TO constraint example from UG612]

 

Hope this will help,

Hossein

Guide
3,565 Views
Registered: ‎01-23-2009

So, first...

 

In the case of asynchronous interfaces, I generally prefer to use IOB flip-flops for all outputs, and guarantee the inter-signal timing (ADDR->WE/RE, CE->WE/RE, DATA->WE) using guaranteed mechanisms of generating delay. This can be

  - using different edges of the clock for these events (rising and falling can be used)

  - using different phases of a clock (i.e. use an MMCM to generate a 90 degree phase shifted clock to generate two other possible edges)

  - using the ODELAY (if you are using HPIO)

 

In these cases, constraints aren't necessary (although you can still certainly add them) - the interface is "correct by design".
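
(If you do want the tool to enforce the IOB packing itself, it can be requested explicitly; the cell names below are invented for illustration:)

# Ask Vivado to place these output registers in the IOB so the pad timing is fixed
set_property IOB TRUE [get_cells {addr_out_reg[*] re_out_reg we_out_reg}]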

 

However, if you do want to constrain them, then you can do this in Vivado - it involves a little creative "persuasion" (aka lying) to the tools, but it can be done.

 

Let's take the ADDR -> RE relationship. ADDR has a setup requirement before the asserting edge of RE, and a hold requirement with respect to the deasserting edge of RE (or maybe with respect to the asserting edge of RE - it doesn't really matter).

 

Even though RE is a strobe, is not periodic, and is not truly a clock, from the static timing point of view we can define it as a generated clock. Assuming "my_re_reg" is the clocked object (either a fabric flip-flop or an IOB flip-flop or an ODDR) that generates the RE (connected to the pad), we can do

 

create_generated_clock -name re_clk -source [get_pins my_re_reg/C] -divide_by 1 [get_ports RE]

 

With this generated clock, we can now define the required output delays for ADDR

 

set_output_delay -clock re_clk -max <setup_requirement>  [get_ports ADDR*]

set_output_delay -clock re_clk -min <-hold_requirement> [get_ports ADDR*]

 

[edit: fixed typo above - second line is -min]

 

If we also want to define the relationship to WE, we can do the same thing - create a generated clock

 

create_generated_clock -name we_clk -source [get_pins my_we_reg/C] -divide_by 1 [get_ports WE]

 

And specify the timing requirements with respect to that. Since these would be the second set of timing requirements on the ADDR port, we need to use the -add_delay so that we don't override the previous ones.

 

set_output_delay -clock we_clk -max <setup_requirement>  [get_ports ADDR*] -add_delay

set_output_delay -clock we_clk -min <-hold_requirement> [get_ports ADDR*] -add_delay

 

[edit: fixed typo above - second line is -min]

 

Now your ADDR bus is completely constrained with respect to both the RE and WE timing requirements. It is worth pointing out that this is WAY more than you can do with UCF in ISE... (ISE is completely incapable of analyzing hold timing on output ports...)
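
(As a quick sanity check - assuming the clock names above - you can then look at both the max and min analyses to these ports with something like:)

report_timing -delay_type min_max -to [get_clocks re_clk] -max_paths 4
report_timing -delay_type min_max -to [get_clocks we_clk] -max_paths 4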

 

Avrum

Explorer
3,460 Views
Registered: ‎09-13-2011

@avrumw Your posts are brilliant, and this is exactly what I'm looking for: a way to constrain something that is "correct by design", just checking numbers I know will be good, so that if I one day make a change to the design the tool will warn me if I break something.

I too think IOB packing is important, if not obligatory. In this respect it is good that Vivado actively warns if IOB is set to TRUE and it can't pack the register, unlike ISE, where the attitude was more like "pack what you can, I don't care".

I'm not sure I fully understand the above. You use -max in both of the first set_output_delay constraints, so only max timing analysis, but with a negative number for hold - so it is a hold time for max delay, I presume? Doesn't the second constraint also need -add_delay?

 

Is there a reason not to consider min timing analysis in this asynchronous case? Could min delay timing be added like this:

 

set_output_delay -clock we_clk -max <setup_requirement>  [get_ports ADDR*] -add_delay

set_output_delay -clock we_clk -max <-hold_requirement>  [get_ports ADDR*] -add_delay

set_output_delay -clock we_clk -min <min1_requirement>  [get_ports ADDR*] -add_delay

set_output_delay -clock we_clk -min <-min2_requirement>  [get_ports ADDR*] -add_delay

 

Here only considering we_clk.

What would 'min1_requirement' and 'min2_requirement' be here? I guess min1_requirement would be the hold requirement in min timing analysis, and -min2_requirement would be the setup requirement?

 

Further, I guess I can use -divide_by when constraining outputs if my design, for instance, divides 'main-clock' down to we_clk by some factor, but I also guess it wouldn't be of much use when constraining inputs, since the receiving clock is 'main-clock'?

Thank you so much in advance for any input on this.
Br

tsjorgensen

Explorer
3,454 Views
Registered: ‎09-13-2011

Regarding -divide_by X, I just made a small test. It doesn't seem to be useful on the output constraints either; the timing analysis always checks against the 'main-clock' edge. For instance, if 'main-clock' is 100 MHz and -divide_by is set to 16, the generated output clock will have edges at 0-80-160 ns, but a timing constraint (like set_output_delay) is then just analysed from 150 ns to 160 ns, leaving the same 10 ns as with -divide_by 1.

 

I know the tool has no idea when I output ADDR (or whatever) and therefore has to presume that it can be output on the 'main-clock' edge at 150 ns, rather than at 80 ns (the falling edge of the generated clock), which is what would make design sense. Then of course it only has 10 ns to meet the next rising edge of 'main-clock' at the outside device. Is there any way I can constrain it so that it checks against the edge at 80 ns, even though this doesn't really make sense?

 

Where else can I use the -divide_by option? Where does it make sense?

 

What would be a logical set_output_delay constraint in this scenario (source clock divided by 16, ADDR output on the falling edge of the divided clock)? -max 0 -min 0? Or no constraint at all?

 

Guide
3,450 Views
Registered: ‎01-23-2009

I'm not sure I fully understand the above. You use -max in both of the first set_output_delay constraints, so only max timing analysis, but with a negative number for hold - so it is a hold time for max delay, I presume? Doesn't the second constraint also need -add_delay?

 

Sorry - that was a typo - the second one should be -min; I will edit the original to fix the error.

 

Avrum

Guide
3,447 Views
Registered: ‎01-23-2009

Where else can I use the -divide_by option? Where does it make sense?

 

There are a couple of things going on here.

 

The first is that we are "lying" to the tool - WE and RE aren't real clocks, hence the divide factor for them is largely irrelevant.

 

The second is that if you do specify a divide on these clocks, you are creating a (synchronous) clock crossing path; the tools are using the rules they use to time synchronous paths between different domains; the flip-flops driving address are on the main clock and the outputs are on the generated clock. In any case where you have a path like this, the tools do the proper analysis; even if the destination clock can only change every N cycles, the main clock can change any cycle, including on cycle N-1, so the timing path is still (correctly) constrained to 1 main clock cycle - regardless of the value of N.

 

However (as an example - and one that doesn't apply here), if you define a generated clock with -divide_by N (say you are constraining the output of a clock generated with a BUFGCE and the appropriate enable) and then use that clock to drive both the startpoint and endpoint of a path, then the path will (again correctly) be constrained to N main clock cycles.
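
(For reference, that BUFGCE-style case might be constrained something like this - the instance name and divide factor are invented here:)

# Clock gated by a periodic CE: declare the BUFGCE output as a divided generated clock
create_generated_clock -name slow_clk -source [get_pins bufgce_div_inst/I] -divide_by 16 [get_pins bufgce_div_inst/O]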

 

Avrum  
