siddadd
Observer
11,686 Views
Registered: ‎04-03-2012

Meeting timing across Synchronous clock domains

I have an MMCM that generates a 100 MHz clock. The input clock to the MMCM is running at 250 MHz.  In the Vivado (2015.2) IP Clocking Wizard, I check the Phase Alignment option.

 

I have some data (more than 1 bit wide) crossing from the 250 MHz clock domain (command_array) to the 100 MHz clock domain (client0_state). I am using a dual flip-flop synchronizer to pass the data across the boundary. I am not sure what timing constraints I need to add to close timing. If I use set_max_delay -from -to <ns>, what should the ns value be? Also, should I be using the ASYNC_REG attribute on the registers even though they belong to synchronous clocks?

 

Below is the timing report for the path in question:

 

Max Delay Paths
--------------------------------------------------------------------------------------
Slack (VIOLATED) : -2.180ns (required time - arrival time)
Source: a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/command_array_reg[7][1]/C
(rising edge-triggered cell FDRE clocked by userclk2 {rise@0.000ns fall@2.000ns period=4.000ns})
Destination: a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/LBL_STATE_OUTPUT_ASSIGN[7].client0_state_reg[22]/D
(rising edge-triggered cell FDRE clocked by clk_out1_mmcm_i250_o100_o200 {rise@0.000ns fall@5.000ns period=10.000ns})
Path Group: clk_out1_mmcm_i250_o100_o200
Path Type: Setup (Max at Slow Process Corner)
Requirement: 2.000ns (clk_out1_mmcm_i250_o100_o200 rise@10.000ns - userclk2 rise@8.000ns)
Data Path Delay: 0.589ns (logic 0.204ns (34.610%) route 0.385ns (65.390%))
Logic Levels: 0
Clock Path Skew: -3.327ns (DCD - SCD + CPR)
Destination Clock Delay (DCD): 1.462ns = ( 11.462 - 10.000 )
Source Clock Delay (SCD): 4.789ns = ( 12.789 - 8.000 )
Clock Pessimism Removal (CPR): 0.000ns
Clock Uncertainty: 0.196ns ((TSJ^2 + DJ^2)^1/2) / 2 + PE
Total System Jitter (TSJ): 0.072ns
User System Jitter : 0.051ns
Discrete Jitter (DJ): 0.109ns
Phase Error (PE): 0.131ns
Clock Domain Crossing: Inter clock paths are considered valid unless explicitly excluded by timing constraints such as set_clock_groups or set_false_path.

Location Delay type Incr(ns) Path(ns) Netlist Resource(s)
------------------------------------------------------------------- -------------------
(clock userclk2 rise edge)
8.000 8.000 r
GTHE2_CHANNEL_X1Y35 GTHE2_CHANNEL 0.000 8.000 r b/pcihip0/pcihip0_inst/pcie3_7x_0_i/inst/gt_top.gt_top_i/pipe_wrapper_i/pipe_lane[0].gt_wrapper_i/gth_channel.gthe2_channel_i/TXOUTCLK
net (fo=1, routed) 1.050 9.050 b/pcihip0/pcihip0_inst/pipe_clock_i/CLK_TXOUTCLK
MMCME2_ADV_X1Y8 MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT3)
0.077 9.127 r b/pcihip0/pcihip0_inst/pipe_clock_i/mmcm_i/CLKOUT3
net (fo=1, routed) 1.935 11.062 b/pcihip0/pcihip0_inst/pipe_clock_i/userclk2
BUFGCTRL_X0Y16 BUFG (Prop_bufg_I_O) 0.093 11.155 r b/pcihip0/pcihip0_inst/pipe_clock_i/userclk2_i1.usrclk2_i1/O
net (fo=94986, routed) 1.634 12.789 a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/psl_clk
SLICE_X194Y347 FDRE r a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/command_array_reg[7][1]/C
------------------------------------------------------------------- -------------------
SLICE_X194Y347 FDRE (Prop_fdre_C_Q) 0.204 12.993 r a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/command_array_reg[7][1]/Q
net (fo=3, routed) 0.385 13.378 a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/command_array_reg[7]__0[1]
SLICE_X196Y348 FDRE r a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/LBL_STATE_OUTPUT_ASSIGN[7].client0_state_reg[22]/D
------------------------------------------------------------------- -------------------

(clock clk_out1_mmcm_i250_o100_o200 rise edge)
10.000 10.000 r
BUFGCTRL_X0Y16 BUFG 0.000 10.000 r b/pcihip0/pcihip0_inst/pipe_clock_i/userclk2_i1.usrclk2_i1/O
net (fo=94986, routed) 1.246 11.246 a0/fpga0/i_clock_synth/inst/clk_in1
MMCME2_ADV_X0Y4 MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT0)
-2.642 8.604 r a0/fpga0/i_clock_synth/inst/mmcm_adv_inst/CLKOUT0
net (fo=1, routed) 1.315 9.919 a0/fpga0/i_clock_synth/inst/clk_out1_mmcm_i250_o100_o200
BUFGCTRL_X0Y0 BUFG (Prop_bufg_I_O) 0.083 10.002 r a0/fpga0/i_clock_synth/inst/clkout1_buf/O
net (fo=76415, routed) 1.460 11.462 a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/bbstub_clk_out1
SLICE_X196Y348 FDRE r a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/LBL_STATE_OUTPUT_ASSIGN[7].client0_state_reg[22]/C
clock pessimism 0.000 11.462
clock uncertainty -0.196 11.266
SLICE_X196Y348 FDRE (Setup_fdre_C_D) -0.067 11.199 a0/fpga0/LBL_SLOT0.i_slot0_host_interface/my_bridge/i_bridge/i_command_state/LBL_STATE_OUTPUT_ASSIGN[7].client0_state_reg[22]
-------------------------------------------------------------------
required time 11.199
arrival time -13.378
-------------------------------------------------------------------
slack -2.180
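
The arithmetic behind this report can be re-derived by hand. The Python sketch below reuses only numbers printed in the report; the variable names are illustrative, and rounding the uncertainty to three decimals is an assumption about how the tool prints it.

```python
import math

# All values in ns, copied from the timing report above.
launch_edge = 8.000          # userclk2 rise
capture_edge = 10.000        # clk_out1_mmcm_i250_o100_o200 rise
source_clock_delay = 4.789   # SCD
dest_clock_delay = 1.462     # DCD
data_path_delay = 0.589      # FDRE clock-to-Q plus net delay
setup_time = 0.067           # Setup_fdre_C_D
tsj, dj, pe = 0.072, 0.109, 0.131

# Clock uncertainty as printed in the report: sqrt(TSJ^2 + DJ^2) / 2 + PE.
# Rounding to 3 decimals is an assumption about the report formatting.
uncertainty = round(math.sqrt(tsj**2 + dj**2) / 2 + pe, 3)

clock_skew = dest_clock_delay - source_clock_delay   # CPR is 0.000 here
arrival = launch_edge + source_clock_delay + data_path_delay
required = capture_edge + dest_clock_delay - uncertainty - setup_time
slack = required - arrival

print(f"uncertainty = {uncertainty:.3f} ns")
print(f"clock skew  = {clock_skew:.3f} ns")
print(f"arrival     = {arrival:.3f} ns")
print(f"required    = {required:.3f} ns")
print(f"slack       = {slack:.3f} ns")
```

The result lands within about a picosecond of the reported -2.180 ns; the small difference presumably comes from the tool working with unrounded internal values. Note that the clock skew alone is larger in magnitude than the 2.000 ns requirement, so this path would fail even with zero data path delay.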


Accepted Solutions
arpansur
Moderator
21,517 Views
Registered: ‎07-01-2015

Hi @siddadd,

 

The tool analyzes the worst case, so the requirement is 2 ns.

You can therefore use a multicycle path constraint. See page 96 of the following document for more information on multicycle paths:

http://www.xilinx.com/support/documentation/sw_manuals/xilinx2015_4/ug903-vivado-using-constraints.pdf
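
The 2 ns figure can be reproduced by listing the rising edges of both clocks over their common period and taking the smallest positive launch-to-capture gap. The sketch below is only an illustration: the function name is hypothetical, and it assumes ideal clocks that both rise at t = 0 with no phase shift, using the 4 ns and 10 ns periods from the report.

```python
from math import gcd


def setup_requirement_ps(launch_period_ps, capture_period_ps):
    """Smallest positive launch-to-capture edge separation, in ps.

    Assumes ideal clocks with a rising edge at t = 0 and no phase shift.
    """
    # Edge relationships repeat after the least common multiple of the periods.
    hyperperiod = (launch_period_ps * capture_period_ps
                   // gcd(launch_period_ps, capture_period_ps))
    best = None
    for launch in range(0, hyperperiod, launch_period_ps):
        # First capture edge strictly after this launch edge.
        capture = (launch // capture_period_ps + 1) * capture_period_ps
        gap = capture - launch
        if best is None or gap < best:
            best = gap
    return best


# 250 MHz (4000 ps) launching into 100 MHz (10000 ps).
print(setup_requirement_ps(4000, 10000))  # 2000: launch at 8 ns, capture at 10 ns
```

Integer picoseconds are used so that the edge positions stay exact.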

 

Thanks,
Arpan


4 Replies
muravin
Scholar
11,680 Views
Registered: ‎11-21-2013

Hello @siddadd,

 

There are two ways to address this issue. I'd say FPGA designers are polarized as to which one is correct, but I would suggest you try 1 and then switch to 2 once you get annoyed :)

 

1. The most "correct" way is to use set_max_delay -from -to <ns>, optionally with the -datapath_only option so that the clock insertion delay is not analyzed. Including the insertion delay breaks timing in many cases, e.g. for clocks whose frequencies are not integer ratios, where the clock edges will sooner or later "collide". For clocks with integer ratios, use set_multicycle_path.

 

2. What we do is as follows: by default, all clocks in our designs are completely decoupled from each other using set_false_path. All clock domain crossings go through a dedicated module, a double flip-flop synchronizer with (as you rightly pointed out) an ASYNC_REG attribute on the FFs. For things like buses that require FIFOs, we use built-in BRAM FIFOs or, in rare cases, design our own, which again use the ASYNC_REG attribute on the converted FIFO pointers.

 

Hope this helps.

Vlad

Vladislav Muravin

nagabhar
Xilinx Employee
11,625 Views
Registered: ‎05-07-2015

Hi @siddadd,

 

You can use set_max_delay -datapath_only from the Q output of the source register to the D input of the first destination synchronizer register. You can choose any value for this max delay. Its purpose is to keep the implementation tool from completely ignoring the timing requirement of this net, which is what happens when you use set_false_path.

"Also should I be using the ASYNC_REG attribute on the registers even though they belong to synchronous clocks?"
Applying the ASYNC_REG attribute to the two synchronizing registers (at the destination) tells the tool to place them close to each other, which is essential for a dual-FF synchronizer to improve MTBF.
Ref: page 142 of UG912

Thanks
Bharath
avrumw
Guide
11,186 Views
Registered: ‎01-23-2009

Whoa...

 

All of this discussion has been on how to add an exception to this path. From what I can see, though, this is NOT the correct thing to do.

 

@siddadd said that the data is "more than one bit wide" and that it crosses from the 250MHz domain that is the input to the MMCM to the 100MHz domain that is the output of the MMCM. This does not sound like a path that can/should be solved by a synchronizer (at least, not a simple 2 stage synchronizer).

 

The bus is multi-bit, and there is no mechanism in a 2 stage synchronizer to ensure bus coherency. Adding the exception simply makes the tools report that timing passes, but will not result in a working design!
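
A small model makes the coherency problem concrete. The Python sketch below is an illustration only: it assumes that, on the capture cycle in which the bus changes, every bit that is changing can independently resolve to either its old or its new value, which is what an untimed per-bit synchronizer permits. The function name and the example values are hypothetical.

```python
from itertools import product


def possible_samples(old, new, width):
    """Values a per-bit synchronizer could deliver while the bus changes.

    Assumption: each changing bit independently resolves to old or new.
    """
    changing = [i for i in range(width)
                if ((old >> i) & 1) != ((new >> i) & 1)]
    results = set()
    for choice in product((False, True), repeat=len(changing)):
        value = old
        for bit, take_new in zip(changing, choice):
            if take_new:
                value ^= 1 << bit   # this bit already shows its new value
        results.add(value)
    return results


# All four bits change: any of the 16 codes may appear, only 2 are valid.
samples = possible_samples(0b0111, 0b1000, 4)
print(sorted(f"{v:04b}" for v in samples))

# Only one bit changes: the receiver sees either the old or the new value.
print(sorted(f"{v:04b}" for v in possible_samples(0b0111, 0b0110, 4)))
```

When only a single bit changes at a time, the sampled value is always one of the two legitimate values, which is why single-bit signals survive a simple synchronizer while arbitrary multi-bit data does not.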

 

The problem with the path as originally posted is that there is a HUGE phase difference between the 250MHz input clock and the 100MHz output clock (due to the MMCM) - the phase difference between them is too large to cross synchronously.

 

This needs to be solved a different way. The most obvious way is not to run the 250MHz portion of the domain on the input of the MMCM, but on an output of the MMCM.

 

Take the 250MHz input and then generate two outputs from the MMCM - one running at 250MHz and one running at 100MHz. Put both clocks through the same kind of buffer (probably BUFGs). Now the two clocks are in phase, and there is a true 2ns worst separation between them (minus a bit for clock skew and MMCM output skew). The tools should be able to pass between these domains as long as you put no (or little) logic between the last FF on the 250MHz domain and the first FF on the 100MHz domain. Note - this is not an asynchronous clock crossing path - it is synchronous, and hence needs no synchronization circuit, nor any timing constraints, and it should meet timing.
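
As a rough budget check, the sketch below compares the slack with the skew from the posted report against an idealized case where both clocks come from the same MMCM and the skew cancels. The zero skew is an assumption, and the uncertainty, setup time and data path delay are simply reused from the posted report; a real implementation would show a small residual skew and a somewhat different uncertainty. The function name is hypothetical.

```python
def setup_slack(requirement, clock_skew, uncertainty, setup_time, data_path_delay):
    """Setup slack in ns: requirement plus skew, minus the usual deductions."""
    return requirement + clock_skew - uncertainty - setup_time - data_path_delay


# Values in ns taken from the posted timing report.
common = dict(requirement=2.000, uncertainty=0.196,
              setup_time=0.067, data_path_delay=0.589)

as_reported = setup_slack(clock_skew=-3.327, **common)
in_phase = setup_slack(clock_skew=0.0, **common)   # assumed: skew cancels

print(f"skew from report : {as_reported:.3f} ns")
print(f"in-phase clocks  : {in_phase:.3f} ns")
```

Under these assumptions the same register-to-register path has more than a nanosecond of positive slack, which is consistent with the point that little or no logic should sit between the two flip-flops.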

 

I see that the 250MHz domain comes from a GTX. There should be no problem adding an MMCM between the TXOUTCLK and TXUSRCLK and clocking all the data on that domain.

 

I also see something odd about your 100MHz clock. If it really came from the 250MHz clock from a GTX, then it should have shown up in the timing report. But, your timing report shows the 100MHz clock as if it were a primary clock. This would happen if you did a "create_clock" on the output of the MMCM - you should not do this; the tools will automatically derive the proper generated clocks coming from an MMCM.

 

Of course, I don't have all the details of the system. If, for some reason, you can't change the logic on the 250MHz domain to run from the output of the MMCM, then you will need a different solution. The solution for this system will depend on a whole bunch of things - how often is the data from the 250MHz domain used by the 100MHz domain? How often does the data on the 250MHz domain change? (and probably a few more).

 

But simply slapping an exception on this path is only masking the problem - not fixing it.

 

Avrum