prophet36 (Visitor)

Clocking external devices using DCM-generated clock


Hi,

 

Firstly, I'm not pretending to be a professional FPGA developer: for me it's just a hobby, so don't laugh.  :-)


I have a simple single-word SDRAM-controller[1], which works OK on my Spartan-6 LX9 board[2] when clocked at 48MHz: an incoming 48MHz clock runs into a DCM which matches it, and the output of the DCM clocks all the internal logic and, via an ODDR2 block[3], the SDRAM itself. The controller works fine for simple applications (e.g. writing data to the SDRAM over USB & reading it back again), but more complex applications run into difficulties because the address & control lines driven by the FPGA are not registered: as the application complexity increases, the tools find it harder and harder to meet the timing constraints, and data corruption ensues.

 

So, I was thinking about an alternative approach: use a DCM as a 150% clock-multiplier to generate a 72MHz clock, and use that to drive a new controller whose outputs are registered and packed into the IOBs, as sketched below. The increased clock frequency means I can achieve roughly the same read latency as the original design, in spite of the extra pipeline stage and the extra NOPs needed by the SDRAM at 72MHz.
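
Concretely, this is roughly what I mean by registering an output and packing it into the IOB (just an illustrative sketch with made-up names, not my actual controller):


library ieee;
use ieee.std_logic_1164.all;

entity iob_reg_sketch is
    port(
        clk72       : in  std_logic;
        addr_next   : in  std_logic_vector(11 downto 0);
        ramAddr_out : out std_logic_vector(11 downto 0)
    );
end entity;

architecture rtl of iob_reg_sketch is
    signal addr_reg : std_logic_vector(11 downto 0) := (others => '0');
    -- Ask XST to pack the output register into the IOBs, so every
    -- address bit gets the same, minimal clock-to-pad delay:
    attribute IOB : string;
    attribute IOB of addr_reg : signal is "TRUE";
begin
    process(clk72)
    begin
        if rising_edge(clk72) then
            addr_reg <= addr_next;  -- the extra pipeline stage
        end if;
    end process;

    ramAddr_out <= addr_reg;
end architecture;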

 

So I looked at the timing-report[4], and saw that the now-registered (IOB-packed) outputs lag the internal clock by between 2.5ns and 7.5ns. But I also noticed that the external clock output to the SDRAM lags by roughly the same amount, so the first thing that occurred to me was simply to invert the clock being output to the SDRAM. But then problems arise when reading data back from the SDRAM: its data becomes stable at most 6ns after a clock rising-edge, and is held valid until at least 3ns after the following rising-edge.

 

So I have a to-scale timing-diagram which looks like this (14ns clock period, 0.5ns per division):

 

http://i.imgur.com/BkAzoJ5.png

 

Using the "FAST" process-corner timings, the SDRAM's data-out is valid at the falling-edge of the FPGA's internal clock, but is probably no longer valid at the following rising-edge.

 

Using the "SLOW" process-corner timings, the SDRAM's data-out is probably still settling down at the falling-edge of the FPGA's internal clock, but it is valid at the following rising-edge.

 

So, what do designers normally do in this scenario? Naively, looking at the timings, adding a fixed phase delay of 1.5ns to the inverted clock before sending it out to the SDRAM would guarantee that the SDRAM's data is valid at the FPGA on the rising-edge of the internal clock, irrespective of the process-corner (assuming the FPGA can cope with a zero hold time on the incoming data).
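
If a fixed delay like that is sensible, I guess it could come from a second DCM_SP in fixed phase-shift mode, feeding the ODDR2 that drives the SDRAM clock. Something like this untested sketch (the entity and the numbers are my own guesses; PHASE_SHIFT is in 1/256ths of the CLKIN period, so 1.5ns out of the ~13.9ns period at 72MHz is about 28):


library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;

entity clk_shift_sketch is
    port(
        clk72         : in  std_logic;  -- 72MHz from the first DCM
        clk72_shifted : out std_logic   -- same frequency, shifted by ~1.5ns
    );
end entity;

architecture rtl of clk_shift_sketch is
    signal clk0_unbuf : std_logic;
    signal clk0_buf   : std_logic;
begin
    shift_dcm: DCM_SP
        generic map(
            CLKIN_PERIOD       => 13.889,
            CLKOUT_PHASE_SHIFT => "FIXED",
            PHASE_SHIFT        => 28       -- ~1.5ns = 28/256 of 13.889ns
        )
        port map(
            CLKIN    => clk72,
            CLKFB    => clk0_buf,  -- deskew feedback from the BUFG below
            CLK0     => clk0_unbuf,
            RST      => '0',
            PSCLK    => '0',
            PSEN     => '0',
            PSINCDEC => '0'
        );

    fb_bufg: BUFG
        port map(
            I => clk0_unbuf,
            O => clk0_buf
        );

    clk72_shifted <= clk0_buf;
end architecture;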

 

Any suggestions or other guidance gratefully received.

 

Chris

 

[1]https://github.com/makestuff/mem-ctrl/blob/20140311/vhdl/mem_ctrl_rtl.vhdl#L221

[2]https://github.com/makestuff/lx9/tree/r3

[3]https://github.com/makestuff/readback/blob/20140311/templates/fx2min/vhdl/top_level.vhdl#L132

[4]https://gist.github.com/makestuff/9722787#file-timingreport-L109

 

6 Replies

Gabor (Instructor) [Accepted Solution]

Re: Clocking external devices using DCM-generated clock


A DCM can create clocks with output phases 90 degrees apart.  You can use this property to generate two clocks at 0 degrees and 90 degrees.  Then the bulk of your logic runs on the 0 degree clock, but the ODDR that sends a clock to the memory device runs from the 90 degree clock.  In addition it effectively inverts this clock (if C0 and C1 use rising and falling edges respectively, then D0 and D1 are connected to '0' and '1' respectively).  So it effectively runs at 270 degrees, i.e. 90 degrees ahead of the data-output switching times, when the output registers are placed in the IOBs.  This provides less hold time (but still quite a bit - 1/4 clock period) but more setup time (3/4 clock period).
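
In VHDL the clock-forwarding ODDR2 would look roughly like this (a sketch only; clk090 and clk270 are my names for the DCM's 90 and 270 degree outputs, each through its own BUFG):


library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;

entity fwd_clk_sketch is
    port(
        clk090     : in  std_logic;  -- DCM 90-degree output (buffered)
        clk270     : in  std_logic;  -- its complement (DCM 270-degree output)
        ramClk_out : out std_logic
    );
end entity;

architecture rtl of fwd_clk_sketch is
begin
    -- D0/D1 are swapped relative to a straight clock copy, so the pin
    -- toggles like an inverted 90-degree clock, i.e. a 270-degree clock:
    clk_fwd: ODDR2
        port map(
            D0 => '0',       -- driven out on rising edges of clk090
            D1 => '1',       -- driven out on rising edges of clk270
            C0 => clk090,
            C1 => clk270,
            Q  => ramClk_out
        );
end architecture;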

 

It's been a while since I last made a single-data-rate SDRAM controller, and usually those were system-synchronous, i.e. the clock to the DRAM chips was driven by the same source as the clock to the FPGA - it wasn't driven by the FPGA.  However I did do one where the clock was driven by the FPGA (a Spartan-2E), and in that case I drove out the clock on a clock-capable pin and used feedback from the same pin to drive the internal clock of the FPGA, giving me the same timing as in a system-synchronous environment.  You don't need a DCM for this approach.

 

If you already have a board layout and the clock pin driving the SDRAM is not clock-capable, all is not lost.  If you have at least one unused clock-capable pin, you can use it to create the required feedback structure to generate the FPGA's internal clock.  Use a global clock to drive ODDR2s to both the feedback clock pin (the unused clock-capable IO) and the clock to the SDRAM, then use the input from the feedback clock pin to drive the internal clocks.  At 72 MHz you don't really need a DCM in this path; just take the input from the clock feedback and send it to a BUFG.  The only reason you'd need the DCM is to generate the 72 MHz clock you're driving out.

-- Gabor
prophet36 (Visitor)

Re: Clocking external devices using DCM-generated clock


Hi Gabor,

What an excellent answer, thank you. You're actually suggesting two different approaches, correct? One solution with the DCM generating two clocks with 0° and 90° phase-shift, and another (presumably preferred) solution using the feedback clock pin.

The ramClk_out signal is driven on pin 58 of the tq144 package, which is IO_L14P_D11_2, which is (I assume) not clock-capable. This is annoying because an earlier revision of the PCB had it on IO_L31P_GCLK31_D14_2, which (I assume) is clock-capable. I thought the GCLK pins were only used for input clocks, not for output clocks, but evidently they are useful for feedback purposes too. And unfortunately I have no spare GCLK pins. But for future reference, you're suggesting I configure the DCM with external feedback from the GCLK pin I actually drive with the DCM? And presumably that will then show up in the timing report's clock-to-pad section as 0ns?

Meanwhile, I'll try the dual-clock-with-phase-shift solution.

Chris

Gabor (Instructor)

Re: Clocking external devices using DCM-generated clock


Actually my second suggestion only used the DCM as a frequency generator to bump 48 MHz up to 72 MHz.  The phase is not important, because the clock from the feedback pin has the same timing (at the pin) as the clock to the SDRAM.  The delay through the feedback input buffer and global clock buffer then provides plenty of hold time.  You could use a DCM in this path to reduce that delay if it is too much, but I don't think it's necessary at these frequencies.  The Spartan-2E project where I last used this technique ran at 108 MHz and used no DLL (Spartan-2 has no DCM, only a CLKDLL).

 

If your design has other inputs that come in synchronous to the 48 MHz input clock, then using the DCM with phase shift may prove the better choice of the two methods.

-- Gabor
prophet36 (Visitor)

Re: Clocking external devices using DCM-generated clock


OK, now I'm confused. Let's make it concrete:


library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;

entity top_level is
    port(
        clk48_in    : in  std_logic;
        ramClk_out  : out std_logic;
        ramAddr_out : out std_logic_vector(11 downto 0);
        :
    );
end entity;

architecture rtl of top_level is
    signal sysClk000 : std_logic;
    signal sysClk180 : std_logic;
begin
    -- Generate a 72MHz clock from the incoming 48MHz clock
    cg2: entity work.cg2
        port map(
            CLK_IN1  => clk48_in,
            CLK_OUT1 => sysClk000,
            CLK_OUT2 => sysClk180,
            :
        );

    -- Drive ramClk_out
    clk_drv: ODDR2
        port map(
            D0 => '1',
            D1 => '0',
            C0 => sysClk000,
            C1 => sysClk180,
            Q  => ramClk_out
        );
    :
end architecture;

So let's say I could go back in time and fix my PCB so that ramClk_out is mapped to IO_L31P_GCLK31_D14_2. If I understand you correctly, you're suggesting using feedback from the ramClk_out pin itself, such that sysClk000 is in phase with the SDRAM clock at the FPGA's ramClk_out pin. So if I register ramAddr_out and pack it into the IOBs, and run trce -a top_level_map.ncd, I should get a timing report that quotes a 0ns delay on ramClk_out and a 2.5~7.5ns delay on ramAddr_out (both with respect to sysClk000). Is that right?


So the question is, how is it possible to configure a DCM_SP instance with external clock feedback coming from the same pin that it's driving? All the external feedback scenarios in UG382 appear to be using a second pin on the FPGA and (presumably) a track on the PCB wiring the ramClk_out pin back to a ramClk_fb input.


Chris

Gabor (Instructor)

Re: Clocking external devices using DCM-generated clock


The "feedback clock" I mentioned does not go to the DCM.  My understanding was that the 72 MHz did not need to be in phase with any particular external event.  If that's not the case, then it might make sense to run it back to the DCM.

 

In any case you need the output of the ODDR2 to feed an IOBUF.  The IOBUF's "O" pin, which comes from the input buffer (making the terminology a bit nonsensical), then feeds a BUFG to run the SDRAM controller.  This would be true regardless of whether you also used it to provide feedback to the DCM.  The output of that BUFG then has a phase slightly delayed from the board-level clock, thereby providing hold time.  The only advantage of using external feedback to the DCM would be to correct the phase of the 72 MHz board-level clock to match the rising edge of the incoming 48 MHz clock.  It would make absolutely no difference to the interface between the FPGA and the SDRAM.

-- Gabor
prophet36 (Visitor)

Re: Clocking external devices using DCM-generated clock


OK, now I understand. There were a couple of surprising things: the pin with the IOBUF must be declared inout, and the IOBUF's tristate input must be driven by a second ODDR2, even though both its D0 and D1 inputs are tied to '0' (i.e. the buffer always drives the pin). And sure enough, it only works if ramClk_out drives a GCLK pin.


Here's the code just in case someone else needs it:


library ieee;
use ieee.std_logic_1164.all;
library unisim;
use unisim.vcomponents.all;

entity top_level is
    port(
        clk48_in    : in    std_logic;
        ramClk_out  : inout std_logic;
        locked_out  : out   std_logic;
        ramAddr_out : out   std_logic_vector(11 downto 0);
        :
    );
end entity;

architecture rtl of top_level is
    signal clk72_000    : std_logic;
    signal clk72_180    : std_logic;
    signal clk72        : std_logic;
    signal tri          : std_logic;
    signal intClk       : std_logic;
    signal sysClk       : std_logic;
    signal ramAddr      : std_logic_vector(11 downto 0) := x"000";
    signal ramAddr_next : std_logic_vector(11 downto 0);
begin
    -- Infer registers
    process(sysClk)
    begin
        if ( rising_edge(sysClk) ) then
            ramAddr <= ramAddr_next;
        end if;
    end process;

    -- Generate a 72MHz clock from the incoming 48MHz clock
    clk_mul: entity work.clk_mul
        port map(
            CLK_IN1  => clk48_in,
            CLK_OUT1 => clk72_000,
            CLK_OUT2 => clk72_180,
            LOCKED   => locked_out
        );

    -- Explicitly instantiate a DDR block to drive the IOBUF output
    clk_drv: ODDR2
        port map(
            D0 => '1',
            D1 => '0',
            C0 => clk72_000,
            C1 => clk72_180,
            Q  => clk72
        );

    -- Need another ODDR2 for the IOBUF's tristate control
    tri_drv: ODDR2
        port map(
            D0 => '0',
            D1 => '0',
            C0 => clk72_000,
            C1 => clk72_180,
            Q  => tri
        );

    -- Explicitly instantiate an IOBUF, to route the clock back in:
    iobuf_inst: IOBUF
        port map(
            I  => clk72,
            O  => intClk,
            IO => ramClk_out,
            T  => tri
        );

    -- Finally buffer it to drive the internal logic:
    bufg_inst: BUFG
        port map(
            I => intClk,
            O => sysClk
        );

    ramAddr_out <= ramAddr;
    :

The timing report now correctly quotes the IOB-packed ramAddr_out signals relative to the ramClk_out pin:


Clock ramClk_out to Pad
---------------+-----------------+------------+
               |Max (slowest) clk|  Process   |
Destination    |  (edge) to PAD  |   Corner   |
---------------+-----------------+------------+
ramAddr_out<0> |        10.053(R)|      SLOW  |
ramAddr_out<1> |        10.053(R)|      SLOW  |
ramAddr_out<2> |        10.053(R)|      SLOW  |
ramAddr_out<3> |        10.053(R)|      SLOW  |
:

Thanks Gabor!
