Expert Contributor
avrumw
Posts: 456
Registered: ‎01-23-2009
0

Re: CLOCK vs. CLOCK ENABLE

The two approaches are basically similar - we are "enabling" one clock pulse out of every 250.

 

In post #4, the clock enable is distributed to every FF that runs at this frequency, consuming routing resources and potentially creating static timing paths (as the enable has to fan out to all FFs on this domain within one 50MHz clock period).

 

Using the BUFGCE, the decimated clock is distributed as a clock on the global clock network. No additional routing is needed (other than the single route to the CE input of the BUFGCE). In addition, there will be a reduction in consumed power since the "high fanout" clock network is running at 200KHz, rather than 50MHz.

 

The disadvantage is that a 2nd BUFG is used, one for the 50MHz domain and one for the decimated domain. There are a finite number of BUFG on a device (the number varies by family). As I mentioned, these two domains are skew balanced, so signals can pass between them synchronously.

 

A similar solution can be implemented using a BUFHCE in Virtex-6 and all 7 Series devices if the number of FFs on the decimated domain is small enough to fit into one clock region. This has the advantage of not using an additional BUFG.

 

Avrum

Expert Contributor
eteam00
Posts: 7,505
Registered: ‎07-21-2009
0

Re: CLOCK vs. CLOCK ENABLE

[ Edited ]

avrumw wrote:

In post #4, the clock enable is distributed to every FF that runs at this frequency, consuming routing resources and potentially creating static timing paths (as the enable has to fan out to all FFs on this domain within one 50MHz clock period).

 

  • Timing problems (if any) with distributing the clock enable are usually solved with either manual or automatic logic replication (i.e. generating multiple copies of the clock enable signal).
  • For dansci's purposes, it seems that the same logic must be capable of 50MHz operation as well as 200KHz operation.
  • For other designs, there are timing attribute facilities for proper static timing analysis of logic that uses a high-frequency clock but is designed to operate at a lower frequency with a clock enable.  Example (derived from UG612 v13.4 page 78):

NET clk_200K_en TNM = slow_exception;
NET clk_50MHz   TNM = normal;
TIMESPEC TS01 = PERIOD normal 20 ns;
TIMESPEC TS02 = FROM slow_exception TO slow_exception TS01*250;

 

avrumw wrote:

Using the BUFGCE, the decimated clock is distributed as a clock on the global clock network. No additional routing is needed (other than the single route to the CE input of the BUFGCE). In addition, there will be a reduction in consumed power since the "high fanout" clock network is running at 200KHz, rather than 50MHz.

 

You neglected to mention the power consumption penalty incurred by the additional clock distribution network for the 200KHz clock.  Even though the frequency is low, the very capacitive clock network must be driven at very high power levels to maintain the same low clock skew requirements of high-frequency clocks.  Clock skew tolerance does not scale with clock frequency.  Some of the clock buffer/distribution power is dynamic, some is static.

 

avrumw wrote:

The disadvantage is that a 2nd BUFG is used, one for the 50MHz domain and one for the decimated domain. There are a finite number of BUFG on a device (the number varies by family). As I mentioned, these two domains are skew balanced, so signals can pass between them synchronously.

 

Agreed.  As you described in your earlier posts, the same 50MHz clock signal must drive both the BUFG (for the ungated buffered 50MHz clock) and the BUFGCE (for the 200KHz rate gated buffered clock) to maintain timing alignment between the two clock domains.

 

-- Bob Elkind

SIGNATURE:
README for newbies is here: http://forums.xilinx.com/t5/New-Users-Forum/README-first-Help-for-new-users/td-p/219369

Summary:
1. Read the manual or user guide. Have you read the manual? Can you find the manual?
2. Search the forums (and search the web) for similar topics.
3. Do not post the same question on multiple forums.
4. Do not post a new topic or question on someone else's thread, start a new thread!
5. Students: Copying code is not the same as learning to design.
6. "It does not work" is not a question which can be answered. Provide useful details (with webpage, datasheet links, please).
7. You are not charged extra fees for comments in your code.
8. I am not paid for forum posts. If I write a good post, then I have been good for nothing.
Expert Contributor
bassman59
Posts: 4,671
Registered: ‎02-25-2008
0

Re: CLOCK vs. CLOCK ENABLE


avrumw wrote:

The two approaches are basically similar - we are "enabling" one clock pulse out of every 250.

 

In post #4, the clock enable is distributed to every FF that runs at this frequency, consuming routing resources and potentially creating static timing paths (as the enable has to fan out to all FFs on this domain within one 50MHz clock period).

 

Using the BUFGCE, the decimated clock is distributed as a clock on the global clock network. No additional routing is needed (other than the single route to the CE input of the BUFGCE). In addition, there will be a reduction in consumed power since the "high fanout" clock network is running at 200KHz, rather than 50MHz.


The original meaning of "decimation" was that a Roman general had one out of every ten soldiers in a unit killed, as a way to instill order through fear. It doesn't mean "completely wipe out." Now, I know the DSP community has adopted the term "decimation" for its own nefarious use.

 

Having said that -- there's a real danger of dividing a high-frequency clock down using flip-flops (in a counter, shift register, whatever) and then driving that new clock onto a global net, and it's this:

 

That divided clock has some clock-to-out delay with respect to the high-frequency clock from which it is derived. So if you're trying to clock stuff generated in the high-frequency clock domain with that divided clock, you may not meet timing and your design can fail in odd ways.

 

Ask me how I know this.


----------------------------------------------------------------
Yes, I do this for a living.
Expert Contributor
avrumw
Posts: 456
Registered: ‎01-23-2009

Re: CLOCK vs. CLOCK ENABLE

 


bassman59 wrote:

 

Having said that -- there's a real danger of dividing a high-frequency clock down using flip-flops (in a counter, shift register, whatever) and then driving that new clock onto a global net, and it's this:

 

 

 

Absolutely! Doing the above is definitely not recommended. This is where the power of the BUFGCE comes in - we are NOT driving a generated clock onto a global network, but we are generating a gated clock using the clock buffer. This is done specifically to avoid the problem you mention; the gated clock from the BUFGCE and an ungated clock on a BUFG have very little skew, and hence you can cross between these two domains synchronously (assuming the inputs of the BUFG and the BUFGCE are driven by the same clock).

 

Avrum

 

Expert Contributor
eteam00
Posts: 7,505
Registered: ‎07-21-2009
0

Re: CLOCK vs. CLOCK ENABLE

[ Edited ]

bassman59 wrote:

Having said that -- there's a real danger of dividing a high-frequency clock down using flip-flops (in a counter, shift register, whatever) and then driving that new clock onto a global net, and it's this:

 

That divided clock has some clock-to-out delay with respect to the high-frequency clock from which it is derived. So if you're trying to clock stuff generated in the high-frequency clock domain with that divided clock, you may not meet timing and your design can fail in odd ways.

 

Agreed.

 

Any of the following three approaches can work reliably:

  • Use of a single high-frequency clock to generate multiple lower-frequency clock enables (my preference)
  • Use of BUFGCE to generate and distribute a gated second clock (the approach of which Avrum is particularly fond).
  • Use a single high-frequency clock to generate multiple, closely aligned lower-frequency clocks (see Gabor's post #4)

There are design situations for which one approach or the other may be particularly well-suited (or poorly-suited).  For many designs, these three solutions are roughly equally capable.
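
A minimal sketch of the first approach (a clock enable on the single high-frequency clock), for comparison with the BUFGCE variant. It is not taken from any post in this thread; the entity name `enabled_reg`, the port names and the 8-bit width are assumptions made for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a register that runs on the 50 MHz clock
-- but only updates when the clock enable pulse is high.
entity enabled_reg is
  port (
    clk : in  std_logic;                     -- single high-frequency clock
    en  : in  std_logic;                     -- one-cycle enable pulse (e.g. 1 in 250)
    d   : in  std_logic_vector(7 downto 0);
    q   : out std_logic_vector(7 downto 0)
  );
end entity enabled_reg;

architecture rtl of enabled_reg is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        q <= d;  -- typically maps onto the flip-flop's CE pin
      end if;
    end if;
  end process;
end architecture rtl;
```

Every flip-flop stays in one clock domain here, so the cost is the routing and fan-out of `en` rather than a second global clock buffer.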

 

-- Bob Elkind

Expert Contributor
bassman59
Posts: 4,671
Registered: ‎02-25-2008
0

Re: CLOCK vs. CLOCK ENABLE



Oh, wait -- now I see ... 

 

The global clock is the input I to the BUFGCE. The divider output drives the CE input on the BUFGCE. The output of the BUFGCE is now gated by the divider but still very low skew with reference to the global clock after a BUFG.

 

Excellently clever!

 

(Filed away for future use.)


Visitor
dansci
Posts: 16
Registered: ‎04-27-2012
0

Re: CLOCK vs. CLOCK ENABLE

Sorry, I was on holiday. Nice to see so much interest. I have read through the whole discussion and would like to try avrumw's solution first. I would be glad if you could check this:

 

    BufgClk: BUFG
    port map (
       I => clk,
       O => clkOutBufg);

    BufgceClk: BUFGCE
    port map (
       I  => clk,
       CE => ce,
       O  => clkOutBufgce);

    clock_divider: process (clkOutBufg, divide)
    begin
        if (rising_edge(clkOutBufg)) then

            if (divide = '1') then  -- here ce is asserted once every 200 clocks
                if (cnt < div) then
                    cnt <= cnt+1;
                    ce <= '0';
                else
                    cnt <= (others=>'0');
                    ce <= '1';
                end if;
            elsif (divide = '0') then -- here ce is always '1', so the clock has the original 50 MHz frequency
                ce <= '1';
            end if;

        end if;
    end process clock_divider;

I have never used primitives before, but I hope this is the correct way. The 'divide' signal is controlled by the main state machine, which changes it when I need to switch to the higher frequency. Should I use 'clkOutBufg' instead of 'clk' everywhere else in my project, or is it only for that counter process?

Visitor
dansci
Posts: 16
Registered: ‎04-27-2012
0

Re: CLOCK vs. CLOCK ENABLE

I forgot about constraints. Now I get these two errors:

 

ERROR:Place:1018 - A clock IOB / clock component pair have been found that are not placed at an optimal clock IOB /
   clock site pair. The clock component <clk_IBUFG_BUFG> is placed at site <BUFGMUX7>. The IO component <clk> is placed
   at site <PAD123>.  This will not allow the use of the fast path between the IO and the Clock buffer. If this sub
   optimal condition is acceptable for this design, you may use the CLOCK_DEDICATED_ROUTE constraint in the .ucf file to
   demote this message to a WARNING and allow your design to continue. However, the use of this override is highly
   discouraged as it may lead to very poor timing results. It is recommended that this error condition be corrected in
   the design. A list of all the COMP.PINs used in this clock placement rule is listed below. These examples can be used
   directly in the .ucf file to override this clock rule.
   < NET "clk" CLOCK_DEDICATED_ROUTE = FALSE; >
ERROR:Place:1018 - A clock IOB / clock component pair have been found that are not placed at an optimal clock IOB /
   clock site pair. The clock component <BufgceClk/BUFGMUX> is placed at site <BUFGMUX4>. The IO component <clk> is
   placed at site <PAD123>.  This will not allow the use of the fast path between the IO and the Clock buffer. If this
   sub optimal condition is acceptable for this design, you may use the CLOCK_DEDICATED_ROUTE constraint in the .ucf
   file to demote this message to a WARNING and allow your design to continue. However, the use of this override is
   highly discouraged as it may lead to very poor timing results. It is recommended that this error condition be
   corrected in the design. A list of all the COMP.PINs used in this clock placement rule is listed below. These
   examples can be used directly in the .ucf file to override this clock rule.
   < NET "clk" CLOCK_DEDICATED_ROUTE = FALSE; >

When I set NET "clk" CLOCK_DEDICATED_ROUTE = FALSE; I get two similar warnings instead of errors.

Expert Contributor
avrumw
Posts: 456
Registered: ‎01-23-2009
0

Re: CLOCK vs. CLOCK ENABLE

The instantiations of the primitives look reasonable. There may be a couple of VHDL errors in the generation of ce (but please note that I am not a VHDL expert): the signal "divide" should not be in the sensitivity list of the process, and I think there is an off-by-one error (this will count from 0 to div, which is div+1 counts).
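
The two points above (sensitivity list and count length) could be addressed roughly as in the sketch below. It is only an illustration under stated assumptions: the entity name `ce_divider`, the generic `DIV` and the use of an integer counter are hypothetical choices, and resetting the counter while `divide = '0'` is an addition that the posted code does not have. The `clk` port is meant to be driven by the BUFG output (`clkOutBufg` in the posted code), and `ce` would drive the CE pin of the BUFGCE.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: generates the CE pulse for the BUFGCE.
entity ce_divider is
  generic (
    DIV : positive := 250  -- assumed ratio: 50 MHz / 250 = 200 kHz
  );
  port (
    clk    : in  std_logic;  -- ungated, buffered clock (BUFG output)
    divide : in  std_logic;  -- '1' = divided rate, '0' = full rate
    ce     : out std_logic
  );
end entity ce_divider;

architecture rtl of ce_divider is
  -- Counts 0 .. DIV-1, i.e. exactly DIV clock cycles per period.
  signal cnt : natural range 0 to DIV - 1 := 0;
begin
  -- Only the clock is in the sensitivity list: this is a purely
  -- synchronous process, so 'divide' does not belong there.
  clock_divider : process (clk)
  begin
    if rising_edge(clk) then
      if divide = '1' then
        if cnt = DIV - 1 then
          cnt <= 0;
          ce  <= '1';  -- one enabled clock pulse out of every DIV
        else
          cnt <= cnt + 1;
          ce  <= '0';
        end if;
      else
        cnt <= 0;      -- assumption: restart the count at full rate
        ce  <= '1';    -- every clock pulse passes through the BUFGCE
      end if;
    end if;
  end process clock_divider;
end architecture rtl;
```

Comparing against `DIV - 1` rather than `DIV` is what removes the extra count.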

 

As for the errors/warnings you are getting - both are a problem. They are indicating that the tool cannot use a dedicated route to the inputs of both BUFGs, and it is essential that they do.

 

You have not indicated what technology you are using, but in the Spartan families (3, 3E, 3A, 3AN, and 6), there are some pretty stringent limitations to driving BUFGMUX inputs. However, it should be possible to create the clock structure we need by using a pair of adjacent BUFGMUX with one using the I0 input and one using the I1. I would have thought the tool could figure this out for itself, but either

  - it can't (and we need to help it)

  - the input clock in question ("clk") is not coming from a GCLK pin (a clock capable pin) - I don't know what "PAD123" is, since I don't know what device you are using

  - there are LOC constraints on the BUFG/BUFGCE that are preventing the proper sharing

 

For this to work, the clock input ("clk") must either be coming directly from a GCLK pin (a clock capable pin) or from a DCM/MMCM/PLL output and the  two BUFGs (BufgClk and BufgceClk) must be arranged so as to be able to share the inputs.

 

Coming from DCM/PLL, there are fewer restrictions - depending on the technology, the BUFGs in the same quadrant can share a DCM/PLL input.

 

Coming directly from a GCLK pin, the only sharing that can occur is between the I0 input of one BUFG and the I1 input of the adjacent one. The tool should be able to map either the BUFG or BUFGCE to the I1 input, but if it can't you may need to help it - you should be able to replace the BUFG instantiation with

 

    BufgClk: BUFGMUX
    port map (
       O  => clkOutBufg,
       I0 => '0',
       I1 => clk,
       S  => '1');

 

Now the "clk" input is used in the I0 of the BUFGCE and the I1 of the BUFG (now a BUFGMUX). These should be able to be shared in an adjacent pair of BUFGMUXs - see figure 1-3 in UG382 v1.4 for Spartan-6 and figure 2-11 in UG331 v1.8 for Spartan-3/3E/3A.

 

You shouldn't need to go as far as specifying LOC constraints for the BUFGMUX/BUFGCE to place them in an adjacent pair, but if you do, then find the correct ones that can be driven by your "clk" input, and put the LOC constraints in your UCF.

 

Avrum

Expert Contributor
rcingham
Posts: 2,010
Registered: ‎09-09-2010
0

Re: CLOCK vs. CLOCK ENABLE

"The signal "divide" should not be in the sensitivity list of the process"

Correct.
However, this will affect only simulation.

------------------------------------------
"If it don't work in simulation, it won't work on the board."