Observer jtmiller.cec
Registered: ‎02-22-2019

Help with source synchronous OSERDES output interface constraints and analysis

I want to drive multiple DACs from an Artix-7 using a source-synchronous OSERDES interface. The part is an Analog Devices DAC that accepts a data clock input (DCI) and uses an internal DLL to shift DCI by 90 degrees, creating a center-aligned capture clock, the data sampling clock (DSC). I'm using a 4:1 OSERDES with CLKDIV at 250MHz and CLK at 500MHz. The datasheet gives the setup/hold specifications relative to DCI, but I want to constrain to DSC since that makes more sense to me. The setup/hold requirements relative to DCI are -288/615ps, respectively. DSC lags DCI by 500ps (90 degrees of the 2ns clock period), so relative to DSC these translate to 212/115ps. Here's my code implementing the constraints:

create_clock -period 4.000 -name clkin [get_ports clkin]
set_input_jitter [get_clocks clkin] 0.001

create_generated_clock -name clk_2x [get_pins i_clocks_and_reset/i_mmcm/CLKOUT0]
create_generated_clock -name clk [get_pins i_clocks_and_reset/i_mmcm/CLKOUT1]

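# Setup/hold relative to DSC, in ns: the DCI values (-0.288 / 0.615)
# moved by the 0.5 ns (90 degree) DLL shift.
set tsu 0.212
set th 0.115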

set DACS 16
set DAC_DATA_PHY_BITS 8

for {set dac 0} {$dac < $DACS} {incr dac} {

    set dac_ports DAC_FRAME\[$dac\]
    for {set i 0} {$i < $DAC_DATA_PHY_BITS} {incr i} {
        lappend dac_ports DAC_DATA\[[expr {$dac*$DAC_DATA_PHY_BITS + $i}]\]
    }
    
    set dsc dac_dsc\[$dac\]
    set dci_pin g_dac\[$dac\].i_dac_phy/i_oserdese2_dci/CLK
    set dci_port DAC_DCI\[$dac\]

    # DSC as seen inside the DAC: the forwarded DCI delayed by 0.5 ns on every edge.
    create_generated_clock -name $dsc -source [get_pins $dci_pin] -edges {1 2 3} -edge_shift {0.5 0.5 0.5} [get_ports $dci_port]
    
    set_output_delay -clock $dsc -max $tsu [get_ports $dac_ports]
    set_output_delay -clock $dsc -max $tsu [get_ports $dac_ports] -clock_fall -add_delay
    set_output_delay -clock $dsc -min $th [get_ports $dac_ports]
    set_output_delay -clock $dsc -min $th [get_ports $dac_ports] -clock_fall -add_delay
    
    set_false_path -setup -rise_from [get_clocks clk_2x] -fall_to [get_clocks $dsc]
    set_false_path -setup -fall_from [get_clocks clk_2x] -rise_to [get_clocks $dsc]
    set_false_path -hold -rise_from [get_clocks clk_2x] -rise_to [get_clocks $dsc]
    set_false_path -hold -fall_from [get_clocks clk_2x] -fall_to [get_clocks $dsc]
}
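
As a cross-check of the tsu and th values used above, the translation from DCI-relative to DSC-relative numbers can be sketched in plain Tcl. This assumes the DLL shift is exactly 90 degrees of the 2ns DCI period; the variable names with a _ps suffix are illustrative only and are not part of the constraints.

```tcl
# Datasheet setup/hold relative to DCI, in picoseconds (integers avoid
# floating-point rounding noise).
set tsu_dci_ps -288
set th_dci_ps   615

# Assumption: the DAC's DLL delays DCI by exactly 90 degrees of the
# 2000 ps DCI period to produce DSC.
set dci_period_ps 2000
set shift_ps [expr {$dci_period_ps / 4}]

# Moving the reference edge later by shift_ps adds to the setup time
# and subtracts from the hold time.
set tsu_dsc_ps [expr {$tsu_dci_ps + $shift_ps}]
set th_dsc_ps  [expr {$th_dci_ps  - $shift_ps}]

# Convert to the nanosecond values used in the constraints.
set tsu [format %.3f [expr {$tsu_dsc_ps / 1000.0}]]
set th  [format %.3f [expr {$th_dsc_ps  / 1000.0}]]
puts "tsu = $tsu ns, th = $th ns"
```

This prints tsu = 0.212 ns, th = 0.115 ns, matching the values set before the loop.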

The worst-case setup analysis looks reasonable (yes it's terribly tight, but given the clock pessimism there's not too much to be done other than shifting DSC slightly):

Max Delay Paths
--------------------------------------------------------------------------------------
Slack (MET) :             0.001ns  (required time - arrival time)
  Source:                 g_dac[3].i_dac_phy/g_data[0].i_oserdese2_data/CLK
                            (rising edge-triggered cell OSERDESE2 clocked by clk_2x  {rise@0.000ns fall@1.000ns period=2.000ns})
  Destination:            DAC_DATA[24]
                            (output port clocked by dac_dsc[3]  {rise@0.500ns fall@1.500ns period=2.000ns})
  Path Group:             dac_dsc[3]
  Path Type:              Max at Slow Process Corner
  Requirement:            0.500ns  (dac_dsc[3] rise@0.500ns - clk_2x rise@0.000ns)
  Data Path Delay:        1.943ns  (logic 1.943ns (100.000%)  route 0.000ns (0.000%))
  Logic Levels:           1  (OBUFDS=1)
  Output Delay:           0.212ns
  Clock Path Skew:        1.709ns (DCD - SCD + CPR)
    Destination Clock Delay (DCD):    4.384ns = ( 4.884 - 0.500 )
    Source Clock Delay      (SCD):    2.900ns
    Clock Pessimism Removal (CPR):    0.225ns
  Clock Uncertainty:      0.053ns  ((TSJ^2 + DJ^2)^1/2) / 2 + PE
    Total System Jitter     (TSJ):    0.071ns
    Discrete Jitter          (DJ):    0.078ns
    Phase Error              (PE):    0.000ns

    Location             Delay type                Incr(ns)  Path(ns)    Netlist Resource(s)
  -------------------------------------------------------------------    -------------------
                         (clock clk_2x rise edge)
                                                      0.000     0.000 r
    N21                                               0.000     0.000 r  clkin (IN)
                         net (fo=0)                   0.000     0.000    i_clocks_and_reset/clkin
    N21                  IBUFDS (Prop_ibufds_I_O)     0.866     0.866 r  i_clocks_and_reset/i_bufds_clkin/O
                         net (fo=3, unplaced)         0.584     1.450    i_clocks_and_reset/clk_250
                         MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT0)
                                                      0.078     1.528 r  i_clocks_and_reset/i_mmcm_rf_clk/CLKOUT0
                         net (fo=1, unplaced)         0.646     2.174    i_clocks_and_reset/clk_500_mmcm
                         BUFG (Prop_bufg_I_O)         0.081     2.255 r  i_clocks_and_reset/i_bufg_clk_500/O
                         net (fo=80, unplaced)        0.646     2.900    g_dac[3].i_dac_phy/clk_2x
    OLOGIC_X0Y82         OSERDESE2                                    r  g_dac[3].i_dac_phy/g_data[0].i_oserdese2_data/CLK
  -------------------------------------------------------------------    -------------------
    OLOGIC_X0Y82         OSERDESE2 (Prop_oserdese2_CLK_OQ)
                                                      0.418     3.318 r  g_dac[3].i_dac_phy/g_data[0].i_oserdese2_data/OQ
                         net (fo=1, estimated)        0.000     3.318    g_dac[3].i_dac_phy/data_0
    AB24                 OBUFDS (Prop_obufds_I_O)     1.525     4.843 r  g_dac[3].i_dac_phy/g_data[0].i_obufds_data/O
                         net (fo=0)                   0.000     4.843    DAC_DATA[24]
    AB24                                                              r  DAC_DATA[24] (OUT)
  -------------------------------------------------------------------    -------------------

                         (clock dac_dsc[3] rise edge)
                                                      0.500     0.500 r
    N21                                               0.000     0.500 r  clkin (IN)
                         net (fo=0)                   0.000     0.500    i_clocks_and_reset/clkin
    N21                  IBUFDS (Prop_ibufds_I_O)     0.826     1.326 r  i_clocks_and_reset/i_bufds_clkin/O
                         net (fo=3, unplaced)         0.439     1.765    i_clocks_and_reset/clk_250
                         MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT0)
                                                      0.074     1.839 r  i_clocks_and_reset/i_mmcm_rf_clk/CLKOUT0
                         net (fo=1, unplaced)         0.613     2.453    i_clocks_and_reset/clk_500_mmcm
                         BUFG (Prop_bufg_I_O)         0.077     2.530 r  i_clocks_and_reset/i_bufg_clk_500/O
                         net (fo=80, unplaced)        0.613     3.143    g_dac[3].i_dac_phy/clk_2x
    OLOGIC_X0Y74         OSERDESE2 (Prop_oserdese2_CLK_OQ)
                                                      0.397     3.540 r  g_dac[3].i_dac_phy/i_oserdese2_dci/OQ
                         net (fo=1, estimated)        0.000     3.540    g_dac[3].i_dac_phy/dci
    U21                  OBUFDS (Prop_obufds_I_O)     1.344     4.884 r  g_dac[3].i_dac_phy/i_obufds_dci/O
                         net (fo=0)                   0.000     4.884    DAC_DCI[3]
    U21                                                               r  DAC_DCI[3] (OUT)
                         clock pessimism              0.225     5.109
                         clock uncertainty           -0.053     5.057
                         output delay                -0.212     4.845
  -------------------------------------------------------------------
                         required time                          4.845
                         arrival time                          -4.843
  -------------------------------------------------------------------
                         slack                                  0.001

However, the hold path analysis seems to drop the FPGA portion of the capture clock completely:

Min Delay Paths
--------------------------------------------------------------------------------------
Slack (MET) :             2.621ns  (arrival time - required time)
  Source:                 g_dac[3].i_dac_phy/g_data[6].i_oserdese2_data/CLK
                            (rising edge-triggered cell OSERDESE2 clocked by clk_2x  {rise@0.000ns fall@1.000ns period=2.000ns})
  Destination:            DAC_DATA[30]
                            (output port clocked by dac_dsc[3]  {rise@0.500ns fall@1.500ns period=2.000ns})
  Path Group:             dac_dsc[3]
  Path Type:              Min at Fast Process Corner
  Requirement:            -0.500ns  (dac_dsc[3] fall@1.500ns - clk_2x rise@2.000ns)
  Data Path Delay:        0.865ns  (logic 0.865ns (100.000%)  route 0.000ns (0.000%))
  Logic Levels:           1  (OBUFDS=1)
  Output Delay:           0.115ns
  Clock Path Skew:        -1.193ns (DCD - SCD - CPR)
    Destination Clock Delay (DCD):    0.000ns = ( 1.500 - 1.500 )
    Source Clock Delay      (SCD):    1.193ns = ( 3.193 - 2.000 )
    Clock Pessimism Removal (CPR):    -0.000ns
  Clock Uncertainty:      0.053ns  ((TSJ^2 + DJ^2)^1/2) / 2 + PE
    Total System Jitter     (TSJ):    0.071ns
    Discrete Jitter          (DJ):    0.078ns
    Phase Error              (PE):    0.000ns

    Location             Delay type                Incr(ns)  Path(ns)    Netlist Resource(s)
  -------------------------------------------------------------------    -------------------
                         (clock clk_2x rise edge)
                                                      2.000     2.000 r
    N21                                               0.000     2.000 r  clkin (IN)
                         net (fo=0)                   0.000     2.000    i_clocks_and_reset/clkin
    N21                  IBUFDS (Prop_ibufds_I_O)     0.362     2.362 r  i_clocks_and_reset/i_bufds_clkin/O
                         net (fo=3, unplaced)         0.114     2.476    i_clocks_and_reset/clk_250
                         MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT0)
                                                      0.051     2.527 r  i_clocks_and_reset/i_mmcm_rf_clk/CLKOUT0
                         net (fo=1, unplaced)         0.320     2.847    i_clocks_and_reset/clk_500_mmcm
                         BUFG (Prop_bufg_I_O)         0.026     2.873 r  i_clocks_and_reset/i_bufg_clk_500/O
                         net (fo=80, unplaced)        0.320     3.193    g_dac[3].i_dac_phy/clk_2x
    OLOGIC_X0Y62         OSERDESE2                                    r  g_dac[3].i_dac_phy/g_data[6].i_oserdese2_data/CLK
  -------------------------------------------------------------------    -------------------
    OLOGIC_X0Y62         OSERDESE2 (Prop_oserdese2_CLK_OQ)
                                                      0.177     3.370 r  g_dac[3].i_dac_phy/g_data[6].i_oserdese2_data/OQ
                         net (fo=1, estimated)        0.000     3.370    g_dac[3].i_dac_phy/data_6
    V18                  OBUFDS (Prop_obufds_I_O)     0.688     4.058 r  g_dac[3].i_dac_phy/g_data[6].i_obufds_data/O
                         net (fo=0)                   0.000     4.058    DAC_DATA[30]
    V18                                                               r  DAC_DATA[30] (OUT)
  -------------------------------------------------------------------    -------------------

                         (clock dac_dsc[3] fall edge)
                                                      1.500     1.500 f
                         clock pessimism              0.000     1.500
                         clock uncertainty            0.053     1.553
                         output delay                -0.115     1.438
  -------------------------------------------------------------------
                         required time                         -1.438
                         arrival time                           4.058
  -------------------------------------------------------------------
                         slack                                  2.621
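
The required time in this report can be reproduced from the ideal clock edge alone, which is what prompts the second question below. A small Tcl sketch using the report's own numbers in picoseconds (the variable names are illustrative):

```tcl
# Values copied from the Min Delay report above, in picoseconds.
set capture_edge_ps   1500 ;# dac_dsc[3] fall edge
set dest_clk_delay_ps 0    ;# DCD as reported: no FPGA clock path included
set uncertainty_ps    53
set output_delay_ps   115  ;# value passed to set_output_delay -min
set arrival_ps        4058

# Hold check as laid out in the report:
# required = capture edge + clock delay + uncertainty - output delay
set required_ps [expr {$capture_edge_ps + $dest_clk_delay_ps + $uncertainty_ps - $output_delay_ps}]
set slack_ps    [expr {$arrival_ps - $required_ps}]
puts "required = $required_ps ps, slack = $slack_ps ps"
```

This gives a required time of 1438ps and a slack of 2620ps, within 1ps of the reported 2.621ns (the report rounds each term). In the setup report the corresponding destination clock delay is 4.384ns rather than 0.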

So I guess my questions are - 

1. Have I created the constraints correctly?

2. What happened to the FPGA's portion of the capture clock for hold analysis?

3. Do you have any suggestions for improvement?

5 Replies
Observer jtmiller.cec
Registered: ‎02-22-2019

Re: Help with source synchronous OSERDES output interface constraints and analysis

*bump*

Moderator
Registered: ‎03-16-2017

Re: Help with source synchronous OSERDES output interface constraints and analysis

Hi @jtmiller.cec , 

To create a center-aligned capture clock, you can use the MMCM directly. Create the 90-degree phase shift in the MMCM itself rather than changing the setup and hold values at the FPGA output port.

Use that clock and apply the output delay constraints to it based on the setup/hold values given in the DAC datasheet.

This will change the slack values you are currently seeing.

 

Regards,
hemangd

Don't forget to give kudos and mark it as accepted solution if your issue gets resolved.
Observer jtmiller.cec
Registered: ‎02-22-2019

Re: Help with source synchronous OSERDES output interface constraints and analysis

Thanks for the reply hemangd. 

I can certainly create a shifted clock using the MMCM. I'll try that method and re-run the timing to see what I get.

However, the DAC datasheet makes it seem like the DAC's internal DLL is the preferred method for clocking the data (nowhere is disabling the DLL even mentioned, although there is a SPI register to do so).

I'm more interested in why the hold analysis doesn't account for the internal FPGA path of the generated clock. Do you have an explanation for this?

Moderator
Registered: ‎03-16-2017

Re: Help with source synchronous OSERDES output interface constraints and analysis

Hi @jtmiller.cec ,

>>I'm more interested in why the hold analysis doesn't account for the internal FPGA path of the generated clock. Do you have an explanation for this?

To answer this, I would need more information about your design, so I have sent a private message to your email ID.

Regards,
hemangd

Guide avrumw
Registered: ‎01-23-2009

Re: Help with source synchronous OSERDES output interface constraints and analysis

A couple of things here.

First, this is an output interface, and, with the timing specified it is almost perfectly edge aligned - the skew allowed is -385ps to +288 ps in a 1000ps data window.

I can tell you right away that the "best" way to do this is the simplest - simply drive your data from an OSERDES and drive your forwarded clock from either a similarly configured OSERDES or an ODDR (which are really the same thing) using the same internal clock at 500MHz. The source clock buffer will have little effect on the actual timing of the interface, but you might get a slightly better jitter characteristic using a BUFIO - possibly using the "High Performance Clock" from the MMCM (you didn't specify where the clock was coming from). But even using a BUFG would be fine (although it might have some more duty cycle distortion).

The interface is not "perfectly centered" - there is just under 100ps imbalance between the leading and trailing edge, although that may be more of a duty cycle adjustment thing. But assuming this is correct, your ideal clock would be 50ps earlier with respect to your data - this is really tiny. Any attempt to use a second clock to generate this phase shift will only make things worse - you will right off the bat pay the tSTATPHAOFFSET of +/-120ps between two clocks from the same MMCM - much more uncertainty than the 50ps you can save in "perfectly" matching the specifications (and in addition, you will now be dealing with separate clock trees, so more variation from that). I suppose it might be possible to use the ODELAY on both the output data and output clock, and adjust the tap of the data to be one tap larger than the clock, but even that may cost more uncertainty than would be gained by trying to cancel out the asymmetry.
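
The skew window and the imbalance quoted above follow from the datasheet numbers in the original post. A minimal Tcl sketch of that arithmetic, assuming a 1000ps bit period (500MHz DDR) and using illustrative variable names:

```tcl
# Datasheet setup/hold relative to DCI (from the original post), in ps.
set tsu_ps -288
set th_ps   615
set ui_ps  1000   ;# one data bit period, assumed from the 500 MHz DDR clock

# Data may change as late as -tsu after a DCI edge, and as early as
# (ui - th) before it, because the previous bit must be held for th.
set late_ps  [expr {-$tsu_ps}]
set early_ps [expr {$ui_ps - $th_ps}]

# Difference between the two margins, and half of it as the offset
# that would balance them.
set imbalance_ps   [expr {$early_ps - $late_ps}]
set ideal_shift_ps [expr {$imbalance_ps / 2.0}]
puts "allowed skew: -$early_ps ps to +$late_ps ps"
puts "imbalance: $imbalance_ps ps, balancing offset: $ideal_shift_ps ps"
```

This reproduces the -385ps to +288ps window, the 97ps imbalance, and an offset of 48.5ps, i.e. the "just under 100ps" and "50ps" figures.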

Given this, all the resources are locked - your clock is on a dedicated network, and all the outputs are using the OSERDES. The timing of this interface is "fixed".

Now, I rarely recommend this, but in this case...

Constraints don't matter. Even if you constrain this properly, the tool will simply tell you if it thinks this interface will meet the specifications or it won't. If it doesn't, there is nothing you can do to make it better.

And it won't...

This is the one case where I have found the tools to be extremely pessimistic. When you ask for the simplest source synchronous output interface, the resulting timing analysis generates something like +/-600ps of skew between the forwarded clock and forwarded data. This is purely due to the mechanism the tool uses to calculate on-chip variation. So if you constrain it properly, the tools will simply tell you the interface fails.

So, what do you do? This is the one and only place where I don't trust the timing results of the tool. The on-die variation is too broad for a case like this - is it really likely that two adjacent (or nearby) OBUFs will vary in performance by +/-600ps (since the OBUF is the vast majority of this variation)?  Conventional wisdom (and the rule of thumb used by the FPGA community before we had Vivado, which can really do this kind of analysis) was no - +/-100ps was reasonable for adjacent pins, and a little more for pins that were further apart, but still in the same I/O bank (and you should really have each interface in the same bank as its clock).

So now you need to make a judgement call - do you proceed with this design in spite of the fact that the tool says it fails, or do you not? I can't make that call for you, but...

As for the problem with the constraints, you are right - the tool is analyzing the hold incorrectly and I can't tell you why. However, your constraints are very complicated... The way you are defining them:

  • You cannot use managed constraints (since you are using loops)
  • You have a number of constraints with generated names including defining clock names that look like they are a "bus" - i.e. dac_dsc[0], dac_dsc[1]...
  • You are defining different constraints for different bits of the same bus
  • You are using a generated clock with -edge_shift

None of this looks incorrect or illegal, but it is a fair bit more complex than most constraints, and something here may be messing up the constraints system.

My suggestion, even though it is more verbose and possibly less intuitive is:

  • Get rid of the for loop - copy and paste for each of the interfaces
    • Alternatively you might try and use some kind of scoped constraints if each of your interfaces is generated in its own instance of one module
    • While the restrictions are a pain, it is always better to avoid using unmanaged constraints...
  • Get rid of the edge shift and define the relationship to the DCI directly
  • Use simpler clock names (without the square brackets)
  • Split the DAC_DATA bus into separate busses for each interface (rather than one concatenated bus for all interfaces)
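
As a rough sketch of what the unrolled, DCI-relative form might look like for one interface (DAC 3): this follows the edge-aligned, skew-based DDR output pattern from the Vivado language templates as I recall it, with the -385ps/+288ps window from above. The names dac3_dci, dac3_ports, early_skew and late_skew are hypothetical, and the pin and port paths are copied from the original post, so they may need adjusting to the real netlist.

```tcl
# Sketch only, not a verified constraint set.
# Forwarded clock as seen at the DCI pin of DAC 3 (no -edge_shift).
create_generated_clock -name dac3_dci \
    -source [get_pins {g_dac[3].i_dac_phy/i_oserdese2_dci/CLK}] \
    -divide_by 1 [get_ports {DAC_DCI[3]}]

set dci_period 2.000
set early_skew 0.385   ;# data may change up to 385 ps before a DCI edge
set late_skew  0.288   ;# data may change up to 288 ps after a DCI edge

# Ports of DAC 3 listed explicitly instead of being built in a loop.
set dac3_ports [get_ports {DAC_FRAME[3] DAC_DATA[24] DAC_DATA[25] DAC_DATA[26] DAC_DATA[27] DAC_DATA[28] DAC_DATA[29] DAC_DATA[30] DAC_DATA[31]}]

# Setup is checked against the next DCI edge (half a period later),
# hold against the edge aligned with the launch.
set_output_delay -clock dac3_dci -max [expr {$dci_period / 2 - $late_skew}] $dac3_ports
set_output_delay -clock dac3_dci -min $early_skew $dac3_ports
set_output_delay -clock dac3_dci -max [expr {$dci_period / 2 - $late_skew}] $dac3_ports -clock_fall -add_delay
set_output_delay -clock dac3_dci -min $early_skew $dac3_ports -clock_fall -add_delay
```

For comparison, the center-aligned templates, as far as I recall, pass the hold time negated (-min -$thd), so the -min $th used with the dac_dsc clocks may be worth checking against the hold report. Either way, this only changes what the tool reports; it does not change the actual timing of the interface.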

And even with this, all you are going to prove is that the tools will tell you the interface fails...

Avrum
