MicroZed Chronicles: Clocking Techniques of Old

xtech-blogs
Xilinx Employee

Editor’s Note: This content is republished from the MicroZed Chronicles, with permission from the author.

 

Modern FPGA devices really spoil us. They include PLLs, DCMs, DSP blocks and a range of interfaces that significantly ease our development work. Recently, though, I faced a situation where the PLL could not be used for safety reasons. This left me scratching my head about how to generate a 20 MHz clock from a 50 MHz source (a divide by 2.5) without a PLL and without changing the oscillator frequency, since in this case the 50 MHz clock is supplied by another satellite payload.

Looking back through the notes I've gathered over the years, I remembered that the great Peter Alfke published several circuits that could be used in this situation to provide non-integer division and clock multiplication. These techniques often come in handy, so I thought it would be good to revisit a couple of them and show how they perform in modern silicon (the Artix-7 FPGA on a Basys 3 board).

Let’s start by looking at a simple clock doubler circuit. This circuit uses an XNOR gate to generate a pulse on each edge of the input clock. Doing so requires a delay on one of the XNOR gate's inputs, which is provided by the delay register. This design will, of course, not output a waveform with a 50:50 duty cycle.

398_Fig1.jpg

With this implemented on a Basys 3 board and a 10 MHz clock supplied to the circuit, we would expect a 20 MHz output, which is routed to a Pmod output pin. Careful selection of the drive strength, slew rate, pull-down and termination settings produced a reasonable waveform.

398_Fig2.png

In addition to the clock multiplication circuit, Peter also presented several division circuits: divide by 1.5, 2.5, 3 and 5. All of these are based on synchronous counters followed by a latch circuit that generates the output.

All of these can be implemented easily in VHDL or Verilog. For the following examples, I used a 10 MHz clock to demonstrate the division circuits. The first implementation is a divide by 1.5, followed by divide by 2.5, 3 and 5. The divide-by-3 and divide-by-5 circuits also provide a 50:50 duty cycle.

We should expect to see frequencies like the following when working with a 10 MHz input clock:

 

Division Factor    Output Frequency
1.5                6.66 MHz
2.5                4 MHz
3                  3.33 MHz
5                  2 MHz
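As a quick check of the arithmetic behind the table above, each output frequency is simply the input frequency divided by the division factor. The symbols f_in, f_out and N are introduced here purely for illustration and do not come from the original tech notes.

```latex
f_{\text{out}} = \frac{f_{\text{in}}}{N}
\qquad\Longrightarrow\qquad
\frac{10\ \text{MHz}}{1.5} \approx 6.67\ \text{MHz},\quad
\frac{10\ \text{MHz}}{2.5} = 4\ \text{MHz},\quad
\frac{10\ \text{MHz}}{3} \approx 3.33\ \text{MHz},\quad
\frac{10\ \text{MHz}}{5} = 2\ \text{MHz}
```

The same relation covers the case that prompted this article: 50 MHz / 2.5 = 20 MHz. The first table entry is 6.666... MHz, shown truncated as 6.66 MHz.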

 

The RTL for the multiplier and the dividers, as implemented, is shown below.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity top is port(
    clk     : in  std_logic;  -- input clock (10 MHz in this example)
    clk_1d5 : out std_logic;  -- clk divided by 1.5
    clk_2d5 : out std_logic;  -- clk divided by 2.5
    clk_3   : out std_logic;  -- clk divided by 3
    clk_5   : out std_logic;  -- clk divided by 5
    clk_op  : out std_logic   -- doubled clock
);
end top;

architecture Behavioral of top is

    signal inv         : std_logic := '0';
    signal delay       : std_logic := '0';
    signal clk_double  : std_logic := '0';
    signal clk_2d5_cnt : unsigned(2 downto 0) := (others => '0');
    signal clk_1d5_cnt : unsigned(1 downto 0) := (others => '0');
    signal clk_3_cnt   : unsigned(1 downto 0) := (others => '0');
    signal clk_5_cnt   : unsigned(2 downto 0) := (others => '0');

begin

    -- Clock doubler: the delay register toggles on every output pulse,
    -- which flips inv and ends the pulse until the next edge of clk.
    clk_delay : process(clk_double)
    begin
        if rising_edge(clk_double) then
            delay <= inv;
        end if;
    end process;

    inv        <= not delay;
    clk_double <= clk XNOR inv;
    clk_op     <= clk_double;

    -- Divide by 1.5: modulo-3 counter (0, 1, 2)
    delay_1_5 : process(clk)
    begin
        if rising_edge(clk) then
            if clk_1d5_cnt = 2 then
                clk_1d5_cnt <= (others => '0');
            else
                clk_1d5_cnt <= clk_1d5_cnt + 1;
            end if;
        end if;
    end process;

    -- Decode of count and clock level: high for two half-periods of clk, low for one
    clk_1d5 <= '1' when clk_1d5_cnt = 0 and clk = '1' else
               '1' when clk_1d5_cnt = 1 and clk = '1' else
               '1' when clk_1d5_cnt = 1 and clk = '0' else
               '1' when clk_1d5_cnt = 2 and clk = '0' else
               '0';

    -- Divide by 2.5: modulo-5 counter (0 to 4)
    delay_2_5 : process(clk)
    begin
        if rising_edge(clk) then
            if clk_2d5_cnt = 4 then
                clk_2d5_cnt <= (others => '0');
            else
                clk_2d5_cnt <= clk_2d5_cnt + 1;
            end if;
        end if;
    end process;

    -- Decode: high for two half-periods of clk, low for three
    clk_2d5 <= '1' when clk_2d5_cnt = 1 and clk = '1' else
               '1' when clk_2d5_cnt = 1 and clk = '0' else
               '1' when clk_2d5_cnt = 3 and clk = '0' else
               '1' when clk_2d5_cnt = 4 and clk = '1' else
               '0';

    -- Divide by 3: modulo-3 counter (0, 1, 2)
    delay_3 : process(clk)
    begin
        if rising_edge(clk) then
            if clk_3_cnt = 2 then
                clk_3_cnt <= (others => '0');
            else
                clk_3_cnt <= clk_3_cnt + 1;
            end if;
        end if;
    end process;

    -- Decode: high for three half-periods of clk, low for three (50:50 duty cycle)
    clk_3 <= '1' when clk_3_cnt = 1 and clk = '1' else
             '1' when clk_3_cnt = 1 and clk = '0' else
             '1' when clk_3_cnt = 2 and clk = '1' else
             '0';

    -- Divide by 5: modulo-5 counter (0 to 4)
    delay_5 : process(clk)
    begin
        if rising_edge(clk) then
            if clk_5_cnt = 4 then
                clk_5_cnt <= (others => '0');
            else
                clk_5_cnt <= clk_5_cnt + 1;
            end if;
        end if;
    end process;

    -- Decode: high for five half-periods of clk, low for five (50:50 duty cycle)
    clk_5 <= '1' when clk_5_cnt = 0 and clk = '1' else
             '1' when clk_5_cnt = 0 and clk = '0' else
             '1' when clk_5_cnt = 1 and clk = '1' else
             '1' when clk_5_cnt = 1 and clk = '0' else
             '1' when clk_5_cnt = 2 and clk = '1' else
             '0';

end Behavioral;
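Before going to hardware, the divider outputs can be inspected in simulation. The following is a minimal testbench sketch, not part of the original design: the names top_tb, CLK_PERIOD and running are assumptions made for illustration, and it only drives the top entity above with a 10 MHz clock so the outputs can be viewed in a waveform window.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical testbench for the "top" entity shown above.
entity top_tb is
end top_tb;

architecture sim of top_tb is

    constant CLK_PERIOD : time := 100 ns;  -- 10 MHz, as used in the article

    signal running : boolean   := true;    -- set to false to stop the clock
    signal clk     : std_logic := '0';
    signal clk_1d5 : std_logic;
    signal clk_2d5 : std_logic;
    signal clk_3   : std_logic;
    signal clk_5   : std_logic;
    signal clk_op  : std_logic;

begin

    -- Free-running clock that stops toggling once running goes false,
    -- letting the simulation end when no events remain.
    clk <= not clk after CLK_PERIOD / 2 when running else '0';

    dut : entity work.top
        port map (
            clk     => clk,
            clk_1d5 => clk_1d5,
            clk_2d5 => clk_2d5,
            clk_3   => clk_3,
            clk_5   => clk_5,
            clk_op  => clk_op
        );

    -- Run for 30 input clock cycles, which covers several periods of
    -- every divided output, then stop the clock.
    stop_sim : process
    begin
        wait for 30 * CLK_PERIOD;
        running <= false;
        wait;
    end process;

end sim;
```

Two caveats apply when reading the result. In a zero-delay RTL simulation the doubler's pulse width collapses to a few delta cycles, so clk_op is better judged in a post-implementation timing simulation or on the scope. The divider decodes can also show delta-cycle glitches at clock edges, because clk changes one delta before the counter does.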

 

Running the design on the Basys 3 hardware and observing the outputs with a scope shows the following waveforms.

Division by 1.5 to generate a 6.66 MHz clock

 

Division by 2.5 to generate a 4 MHz clock

 

Division by 3 to generate a 3.33 MHz clock

 

Division by 5 to generate a 2 MHz clock

These circuits work well. There will, of course, be some drift with temperature and voltage changes, especially in the multiplier circuit, whose pulse width is set by logic and routing delays.

That said, a PLL or DCM, if available, should be your first choice for implementing clock multiplication and division.

The techniques that Peter demonstrated so well in his tech notes helped many developers in the earlier days of FPGA development, and they may still come in handy for some applications. Such elegant techniques deserve remembering from time to time.

 

 

 

4 Comments
drjohnsmith
Teacher

How do P&R and synthesis respond to the code?

A few bits worry me:

 

a) The output is potentially glitchy, as it's a gate decode

b) The output is on a logic line, not a clock line, so it cannot easily be used inside the FPGA as a clock at all

c) The clk input is being used as a clock into the registers, i.e. on the high speed **bleep** network, AND being used as a local gate input, i.e. slow network. I'd imagine the tools are going to shout about this. How did you turn off the warnings?

d) How did you constrain this so it does not end up with the gating on one side of the chip and the registers on the other? Imagine a larger FPGA: there could be ns of delay between the two "clocks".

 

In conclusion,

   this is a good trick, but it has serious constraints on what it does and how you should use it that the user needs to be aware of.

  I'm worried, in these days of cut and paste and no thinking, about how this sort of design could propagate without a caveats emporium.

 

 

 

 

xtech-blogs
Xilinx Employee

@drjohnsmith 

Thank you for reading the blog. The author of this blog is happy to talk you through this offline. Please send an email to him (adam@Adiuvoengineering.com).

The following technical support options are also available to Xilinx customers:

  • Technical information is available online 24 hours a day from the Support website
  • Technical Support staff are available to respond to your questions in the Community Forums

 

drjohnsmith
Teacher

As a super user for quite some time,

    I am open to an offline discussion with them if they want to message us,

 

ronnywebers
Advisor

@drjohnsmith I had about the same questions. I think you can even see such 'combinatorial glitch' on some of the scope pictures (?)

These were indeed techniques in the early days of FPGA/CPLD, where there were just no PLLs available.

I could see some use for this to drive some really slow logic on the 'outside' (?). On the 'inside' of a modern FPGA, I'd expect this to be (very) tricky without proper constraining ... I'm not sure if this is even possible? What happens if you use such 'clock output' from a counter to drive other logic? That would be interesting to explain, I never even tried it, but curious to see more about this, and how this could lead to problems if you're not aware what's happening under the hood. That could be a great 'part 2' for this blog.

It would also be great to show the 'proper ways' of doing the same with modern FPGAs. I think there are 2 ways: one with a PLL/MMCM if available, the other by using a source clock that is a factor higher in freq (i.e. 10 times or more) than the frequencies you need to generate, and then just using plain counters / fsm and a 'tick' at the right moment?
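The second approach mentioned above, a counter producing a 'tick', could look something like the minimal sketch below. It is only an illustration under stated assumptions: the entity name tick_gen, the generic DIVIDE and the port names are hypothetical and are not taken from the blog or from any Xilinx IP.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical clock-enable generator: everything stays on one clock,
-- and "tick" marks the cycles on which slower logic should update.
entity tick_gen is
    generic (
        DIVIDE : positive := 10  -- assumed ratio: one tick every DIVIDE cycles of clk
    );
    port (
        clk  : in  std_logic;
        tick : out std_logic := '0'  -- high for exactly one clk period
    );
end tick_gen;

architecture rtl of tick_gen is
    signal count : natural range 0 to DIVIDE - 1 := 0;
begin

    process (clk)
    begin
        if rising_edge(clk) then
            if count = DIVIDE - 1 then
                count <= 0;
                tick  <= '1';   -- registered, so free of decode glitches
            else
                count <= count + 1;
                tick  <= '0';
            end if;
        end if;
    end process;

end rtl;
```

Downstream logic would then remain clocked by clk and wrap its register updates in `if tick = '1' then ... end if;` inside the usual `rising_edge(clk)` process. Because tick is used as an enable rather than as a clock, no derived clock has to be routed or constrained.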