Observer pstootman
Registered: ‎03-29-2015

Implementation details for DCM -ve phase shift solution to "hold timing + non-GCLK" problem ?

Hi all,
 
Thanks for your time.
 
I have a situation involving a new functional requirement for an older PCB design (ie the hardware is already in the field and unfortunately can’t be changed). We would like an ADC to provide samples in parallel format at 100Msps to a Spartan 3 (XC3S4000FG676, –5 speed), but the data valid clock from the ADC is not connected to one of the FPGA global clock input pins; it is connected to a general purpose IO pin.
 
It is a dual 14-bit ADC, so I have 28 data lines and 1 ‘data valid’ clock signal being driven by ADC into FPGA.
 
I’ve read quite a few of the posts here related to fixing hold timing problems and non-GCLK issues.
 
I read about the idea of using a DCM as a phase-shifter to create a phase-shifted version of the external clock signal, and then clocking the input data lines using this new shifted clock, ie moving the sampling clock back earlier into the middle of the data valid period (centre of the ‘eye’).
I have spare DCMs, so this seems promising.
 
I’m looking for more detailed example or advice on how to implement this DCM solution; I think I can figure out the DCM instantiation and phase-shifting DCM options.
But I’m unsure about;
- do I need to add any Xilinx elements in my VHDL code between external clock input pin and the DCM CLKIN port (eg IBUF, IBUFG, etc)
-I should use the CLK0 output of DCM as the new sampling clock?
-I should connect CLK0 also to the CLKFB input of DCM ?

-what specific timing constraints to put in my UCF to make a robust solution?

  -- eg do I still keep the CLOCK_DEDICATED_ROUTE option I’ve been using?

  -- eg do I still use OFFSET IN VALID BEFORE style constraint as I’m doing now (does this constraint propagate through the DCM, with ISE automatically taking into account phase shifting options of the DCM when it subsequently analyses the setup/hold at the final flip flops where data lines will be registered) ?

 
More specific info for my scenario, if useful;
- The ADC sampling clock is 100MHz, generated in FPGA (by a DCM) and provided from FPGA to the ADC; so the ‘data valid’ signal returning from ADC into FPGA is also 100MHz (but not aligned with 100MHz clock inside FPGA).
- I have some logic inside FPGA to synchronise the ADC data inputs to an FPGA internal clock before subsequent processing.
- I guess due to using external data valid signal as a clock for flip flops in this synchronisation logic, ISE sees this and somehow routes external clock onto global clock network which adds delay to the clock path.
- In essence, I can’t meet hold time requirements, I guess because data arrives too early at flip flop compared to the clock.
- At ADC output pins, ADC data lines are valid 5.7ns before the ‘data valid signal’ rising edge, and for 1.4ns after this edge. Ignoring PCB skew (pretty similar track lengths for all the lines), I guess this should be ‘true enough’ at FPGA pins.
- So, I need to pull back the sampling rising edge from 5.7ns after data valid, to (5.7+1.4)/2=3.55ns after data valid  (back into middle of ‘eye’) ?
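
As a rough check of the arithmetic in the last bullet, here is a pin-level sketch. It assumes the ADC datasheet figures above hold at the FPGA pins and ignores the clock and data path delays inside the FPGA (which the timing report includes), so the resulting PHASE_SHIFT value is only a starting estimate. The symbols t_su and t_h are notation introduced here, and the conversion of 256 steps per 10ns period assumes the XAPP462 formula quoted in the code comments later in this thread.

```latex
\begin{align*}
t_{\text{eye}} &= t_{\text{su}} + t_{\text{h}} = 5.7\,\text{ns} + 1.4\,\text{ns} = 7.1\,\text{ns} \\
t_{\text{centre}} &= t_{\text{eye}} / 2 = 3.55\,\text{ns (after data becomes valid)} \\
\Delta t &= t_{\text{centre}} - t_{\text{su}} = 3.55\,\text{ns} - 5.7\,\text{ns} = -2.15\,\text{ns} \\
\text{PHASE\_SHIFT} &\approx 256 \times \frac{-2.15\,\text{ns}}{10\,\text{ns}} \approx -55
\end{align*}
```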

 

Or alternatively, is there some simple way to make delay for clock from pad to flip flop lower if number of destination flip flops is low? I think the external data valid input is only being used as the clock for about 69 flip flops (just enough for the circuit which synchronises external input data to internal FPGA clock domain); so does ISE really need to delay it by putting it through a BUFGMUX ? I probably don't need global routing for this clock...
 
Below are my existing constraints and attached extract from the timing report showing existing hold problems. [I haven’t added the new phase-shifting DCM yet, as above not 100% sure how to do the details of it].
 
Many thanks,
Pete
 
---------------------------------------------------

 

#Single sampling clock provided from FPGA to ADC is used for both I and Q channel
INST "RXADC_CLKP" TNM = ADC_CLOCK_OUTPUT;  
 
#Single 'data valid' clock provided from ADC to FPGA (desire rising edge of this signal to clock data into FPGA)
INST "RXADC_DATA_VALID" TNM = ADC_CLOCK_INPUT;
 
# Define output clock period (100MHz), this is clock provided by FPGA to ADC
NET "RXADC_CLKP" TNM_NET = RXADC_CLKP_TNM;
TIMESPEC TS_RXADC_CLKP = PERIOD "RXADC_CLKP_TNM" 9.8 ns HIGH 50%;
 
# Define input clock period (100MHz), this is data valid clock provided by ADC to FPGA
# RXADC_DATA_VALID pin is not on a GCLK pin
NET "RXADC_DATA_VALID" CLOCK_DEDICATED_ROUTE = FALSE;
NET "RXADC_DATA_VALID" TNM_NET = RXADC_DATA_VALID_TNM;
TIMESPEC TS_RXADC_DATA_VALID = PERIOD "RXADC_DATA_VALID_TNM" 9.8 ns HIGH 50%;
 
# Define input data to input clock relationship
TIMEGRP "ADC_DATA_IN" OFFSET = IN 5.7 ns VALID 7.1ns BEFORE "RXADC_DATA_VALID";
 
Hold_Violation_Example.JPG
Instructor
Registered: ‎08-14-2007

Re: Implementation details for DCM -ve phase shift solution to "hold timing + non-GCLK" problem ?

- do I need to add any Xilinx elements in my VHDL code between external clock input pin and the DCM CLKIN port (eg IBUF, IBUFG, etc)
 
You can instantiate an IBUF.  An IBUFG is just an IBUF that requires a global clock capable pin, so you don't want that.  You could also leave it out and XST will infer the IBUF for you.  Generally the reason to instantiate the IBUF is if you need to run the clock input to more than one clock resource.  For example both the DCM and directly to a BUFG.  This is useful if you want to generate logic to reset the DCM if it loses lock.  That logic needs a free-running clock other than the DCM's outputs, which will stop while the DCM is in reset.
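
A minimal sketch of the kind of reset logic described above, assuming a free-running clock is available (for example an internal FPGA clock that keeps running while this DCM is in reset). The entity name, port names, and both cycle counts are hypothetical placeholders, not values from this thread or from the data sheet; check the Spartan-3 documentation for the minimum RST pulse width and a sensible lock timeout.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical watchdog: all names and cycle counts are illustrative assumptions.
entity dcm_reset_watchdog is
   generic
   (
      -- Free-running clock cycles to wait for LOCKED before pulsing reset
      LOCK_TIMEOUT_CYCLES : natural := 100000;
      -- Length of the reset pulse in free-running clock cycles
      RESET_PULSE_CYCLES  : natural := 16
   );
   port
   (
      FREE_CLK   : in  std_logic;  -- free-running clock, NOT an output of the DCM being reset
      DCM_LOCKED : in  std_logic;  -- LOCKED output of the DCM (asynchronous to FREE_CLK)
      DCM_RST    : out std_logic   -- drives the RST input of the DCM
   );
end entity;

architecture rtl of dcm_reset_watchdog is
   signal locked_meta : std_logic := '0';
   signal locked_sync : std_logic := '0';
   signal resetting   : std_logic := '1';  -- start with one reset pulse after configuration
   signal counter     : natural range 0 to LOCK_TIMEOUT_CYCLES := 0;
begin
   -- The shared counter only has room for the pulse if this holds
   assert RESET_PULSE_CYCLES <= LOCK_TIMEOUT_CYCLES
      report "RESET_PULSE_CYCLES must not exceed LOCK_TIMEOUT_CYCLES"
      severity failure;

   process (FREE_CLK)
   begin
      if rising_edge(FREE_CLK) then
         -- Two-stage synchroniser for the asynchronous LOCKED signal
         locked_meta <= DCM_LOCKED;
         locked_sync <= locked_meta;

         if resetting = '1' then
            -- Hold RST high for RESET_PULSE_CYCLES, then release it
            if counter >= RESET_PULSE_CYCLES then
               resetting <= '0';
               counter   <= 0;
            else
               counter <= counter + 1;
            end if;
         elsif locked_sync = '1' then
            -- DCM is locked: keep the timeout counter cleared
            counter <= 0;
         elsif counter >= LOCK_TIMEOUT_CYCLES then
            -- Not locked for too long: start another reset pulse
            resetting <= '1';
            counter   <= 0;
         else
            counter <= counter + 1;
         end if;
      end if;
   end process;

   DCM_RST <= resetting;
end architecture;
```

If the input clock can stop for long periods, this sketch simply keeps retrying a reset every LOCK_TIMEOUT_CYCLES until LOCKED comes back.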
 
-I should use the CLK0 output of DCM as the new sampling clock?
 
My guess is yes, but you should look at the timing reports to make sure.  If you still have a hold time violation you might need to use the CLK90 output falling edge to get 270 degrees, instead.
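
A minimal sketch of the CLK90 falling-edge idea, in case it helps. The entity, port, and signal names are hypothetical, and the 28-bit width is taken from the dual 14-bit ADC described in the question. It assumes the DCM keeps its CLK0 output fed back to CLKFB through its own BUFG, so this capture clock needs a second global buffer.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;

-- Hypothetical capture block: names are illustrative, not from the original design.
entity adc_capture_270 is
   port
   (
      DCM_CLK90    : in  std_logic;                      -- unbuffered CLK90 output of the DCM
      ADC_DATA     : in  std_logic_vector(27 downto 0);  -- 2 x 14-bit ADC data lines
      ADC_DATA_REG : out std_logic_vector(27 downto 0)   -- data registered at 270 degrees
   );
end entity;

architecture rtl of adc_capture_270 is
   signal clk90_buffered : std_logic;
begin
   -- Put CLK90 on the global clock network
   BUFG_CLK90: BUFG
      port map
      (
         I => DCM_CLK90,
         O => clk90_buffered
      );

   -- Falling edge of CLK90 is 90 + 180 = 270 degrees relative to CLK0
   process (clk90_buffered)
   begin
      if falling_edge(clk90_buffered) then
         ADC_DATA_REG <= ADC_DATA;
      end if;
   end process;
end architecture;
```

The DCM primitive also has a CLK270 port, which could be clocked on its rising edge instead; which option gives better results in a given design is something to confirm in the timing report.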
 
-I should connect CLK0 also to the CLKFB input of DCM ?

 

The CLK0 output will go to a global clock buffer (BUFG or BUFGMUX).  The output of that BUFG goes back to the CLKFB to remove the delay of the clock buffer.  That way you should be able to use the (buffered) CLK0 for your input sampling and get the better hold time you need.

 

-what specific timing constraints to put in my UCF to make a robust solution?

  -- eg do I still keep the CLOCK_DEDICATED_ROUTE option I’ve been using?

  -- eg do I still use OFFSET IN VALID BEFORE style constraint as I’m doing now (does this constraint propagate through the DCM, with ISE

 

You pretty much keep the constraints you have.  You'll still need the CLOCK_DEDICATED_ROUTE for the DCM since the pin is not global clock capable.  The OFFSET IN constraints will propagate through the DCM and any phase shifting will be taken into account.

 

Or alternatively, is there some simple way to make delay for clock from pad to flip flop lower if number of destination flip flops is low?

 

Not in the original Spartan 3.  Spartan 3A parts have some other clocking options as well as input delay (delay the data input to the flip-flop D to fix hold time rather than advancing the clock).  I think the DCM will be your best bet.

 

-- Gabor
Observer pstootman
Registered: ‎03-29-2015

Re: Implementation details for DCM -ve phase shift solution to "hold timing + non-GCLK" problem ?

Thanks Gabor, very helpful.

Today I coded a solution following your advice, and the timing constraint is now met.
Constraint_Summary.JPG
It's not exactly clear to me if I've managed to put the clock 'in the middle' of the data valid 'eye', but the setup/hold times are met.

Below is my VHDL code in case useful to others.

Constraints are unchanged, so still exactly as in original post.

What seems interesting is that when using PHASE_SHIFT of 0, slack is reported as -7.748ns and ISE declares the constraint not met;

Timing_PS0.JPG

When using PHASE_SHIFT -255, slack reported as 2.013ns, and ISE declares constraint met;

Timing_PS-255.JPG

But doesn't ISE (14.7) know from the PERIOD constraint that this is a repetitive clock with a 9.8ns period, so 'rising at 0.000ns' is basically the same clock alignment as 'rising at -9.761ns'? So shouldn't both 0 and -255 cause ISE to say the constraints pass?
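
For reference, the numbers quoted above can be lined up as follows. This is only arithmetic on the reported values, using the 9.8ns period from the PERIOD constraint as T; the symbols are notation introduced here. The change in reported slack equals the size of the shift, which suggests (an inference, not something confirmed in this thread) that the analysis applies the phase shift as a literal time offset to one clock edge rather than reducing it modulo the period.

```latex
\begin{align*}
\Delta t_{-255} &= \tfrac{-255}{256} \times 9.8\,\text{ns} \approx -9.76\,\text{ns} \\
\text{slack}_{-255} - \text{slack}_{0} &= 2.013\,\text{ns} - (-7.748\,\text{ns}) = 9.761\,\text{ns} \\
\Delta t_{-255} + T &\approx -9.76\,\text{ns} + 9.8\,\text{ns} \approx +0.04\,\text{ns}
\end{align*}
```

The last line is the same shift expressed modulo one period: as a repetitive waveform, a clock advanced by 255/256 of a period looks the same as one delayed by 1/256 of a period.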

Is -255 (very end of delay line?) safe to use if there is some jitter in external clock ?

Also, I'm wondering if I should use 10ns (nominal value, 100MHz) in the PERIOD constraints instead of 9.8ns (my very conservative estimate of the minimum period, in case the actual clock frequency is a bit faster than 100MHz). ISE seems to use this period number when calculating the 'clock arrival' term in the timing analysis. It's only about 0.2ns difference in the reported setup/hold slack values, but I like to enter the constraints as correctly as possible.

Tomorrow, I will test my new build on the hardware.

Thanks,
Pete

 

----------------------------------------------------------------------

-- Instantiate IBUF between input RXADC_DATA_VALID pin and DCM input
-- RXADC_DATA_VALID is not on a global clock input pad (pin K3 is not a GCKx pin) (unfortunately).
IBUF_RXADC_DATA_VALID: IBUF
   port map
   (
      I => RXADC_DATA_VALID,
      O => RXADC_DATA_VALID_BUFFERED
   );

----------------------------------------------------------------------

-- Instantiate BUFG between DCM output and global clock network
-- (ie put DCM output through the global clock network driver buffer)
-- Feedback input for DCM (RXADC_DATA_VALID_DCM_OUTPUT_BUFFERED) comes from the global clock network
-- ADC data samples are now sampled on the rising edge of RXADC_DATA_VALID_DCM_OUTPUT_BUFFERED
BUFG_RXADC_DATA_VALID: BUFG
   port map
   (
      I => RXADC_DATA_VALID_DCM_OUTPUT,
      O => RXADC_DATA_VALID_DCM_OUTPUT_BUFFERED
   );

----------------------------------------------------------------------

-- Instantiation of component 'DCM'
-- Generate phase-shifted version of RXADC_DATA_VALID signal
DCM_RXADC_DATA_VALID: DCM
   generic map
   (
      --We are using the DLL, with CLK0 fed back (via BUFG) to CLKFB input
      CLK_FEEDBACK            => "1X",

      --NA because CLKDV output is not used
      --Set to lowest legal integer value as default
      CLKDV_DIVIDE            => 2.0,

      --NA because CLKFX outputs are not used
      --Set to lowest legal integer value as default
      CLKFX_DIVIDE            => 1,

      --NA because CLKFX outputs are not used
      --Set to lowest legal integer value as default
      CLKFX_MULTIPLY          => 2,

      --Don't divide input clock by 2 immediately on entry to DCM
      CLKIN_DIVIDE_BY_2       => FALSE,

      --Input clock is 100MHz (nominal)
      CLKIN_PERIOD            => 10.000,

      --We will use fixed phase shift, because we want constant relationship
      --between external clock and phase-shifted internal clock
      CLKOUT_PHASE_SHIFT      => "FIXED",

      --Clock and the data we want to sample with it are coming from the
      --same device (ADC)
      DESKEW_ADJUST           => "SOURCE_SYNCHRONOUS",

      DFS_FREQUENCY_MODE      => "LOW",
      DLL_FREQUENCY_MODE      => "LOW",

      --Make a nice 50% clock inside FPGA
      --(the logic which retimes input ADC data lines to internal FPGA clock
      --domain uses falling edge)
      DUTY_CYCLE_CORRECTION   => TRUE,

      --Default JF value for DLL low frequency mode is xC080,
      --if we don't specify a custom value here, default will be used
      --FACTORY_JF              => x"C080",

      --Refer Xilinx document xapp462.pdf
      --For XC3S4000 -5 (and -4) speed grades, FINE_SHIFT_RANGE is 10ns
      --CLKIN clock period is 10ns (nominal), so thus;
      --SHIFT_DELAY_RATIO is 10/10 = 1
      --PHASE_SHIFT_LIMITS = +/-255
      --Phase shift (ns) = (PHASE_SHIFT/256) * 10ns
      --Advance (make earlier) the input clock by -255/256*10ns = -9.961ns
      --For PHASE_SHIFT = 0, 'clock rising at' in timing report is at 0.000ns
      --For PHASE_SHIFT = -255, 'clock rising at' in timing report is at -9.761ns
      --(because we used 9.8ns in period constraint in case clock a bit faster
      --than nominal 10ns, so -255/256*9.8 = -9.761),
      --and this constraint;
      --TIMEGRP "ADC_DATA_IN" OFFSET = IN 5.7 ns VALID 7.1ns BEFORE "RXADC_DATA_VALID"
      --is now met
      --Not sure yet why 0 fails and -255 passes, as clock period is 10ns
      PHASE_SHIFT             => -255,

      --During FPGA configuration, don't wait for this DCM to be locked before asserting DONE signal
      --(else if ADC chip is in sleep mode, FPGA won't start)
      STARTUP_WAIT            => FALSE
   )
   port map
   (
      --Generated CLK0 (after BUFG)
      CLKFB                   => RXADC_DATA_VALID_DCM_OUTPUT_BUFFERED,

      --External input clock (after IBUF)
      CLKIN                   => RXADC_DATA_VALID_BUFFERED,

      DSSEN                   => '0',

      --Must tie these PS inputs low, because we are using phase shift,
      --but not using dynamic phase shift
      PSCLK                   => '0',
      PSEN                    => '0',
      PSINCDEC                => '0',

      -- Never reset (todo review, not sure if this is good idea yet)
      RST                     => '0',

      CLKDV                   => open,
      CLKFX                   => open,
      CLKFX180                => open,

      -- DCM output which is fed to BUFG
      -- (then output of BUFG clocks the ADC data lines)
      CLK0                    => RXADC_DATA_VALID_DCM_OUTPUT,

      CLK2X                   => open,
      CLK2X180                => open,
      CLK90                   => open,
      CLK180                  => open,
      CLK270                  => open,

      --Locked indication (todo review, if this is useful to use or not)
      LOCKED                  => RXADC_DATA_VALID_DCM_LOCKED,

      PSDONE                  => open,
      STATUS                  => open
   );

 

 
