aleksazr
Contributor
13,220 Views
Registered: ‎09-05-2008

Fast counter (how to multicycle)


In the files linked below there are 4 counters:

 

c1  4-bit counter, max speed 445 MHz

c2  28-bit counter, 208 MHz

c3  32-bit counter, 198 MHz

c4  4-bit counter with carry out enabling 28-bit counter, max speed 200 MHz

 

With the c4 version I was hoping to get 445 MHz on 4-bit counter and no problems

with a larger 28-bit counter because it works 16 times slower (28 MHz)

 

I've constrained the 28-bit counter with this:

TIMESPEC TS_COUNTER = FROM "COUNTER_FAST" TO "COUNTER_SLOW" TS_CLK / 16;

 

Can someone help me constrain it properly, so that it can work as fast as a 4-bit counter?

 

The counter outputs will eventually be written into BRAM.

 

Also, if I register the carry out, why don't I see it in the list of FFs in the Constraints Editor?

There are only cnt1 and cnt2 FFs, no carry out FF.

 

Btw, c4 is my first attempt at creating multicycle logic.

 

download project files

 

1 Solution

Accepted Solutions
gszakacs
Professor
20,385 Views
Registered: ‎08-14-2007

I was able to build the 250 MHz, S3A version using the following changed line in the UCF:

 

TIMESPEC TS_COUNTER = FROM "COUNTER_SLOW" TO "COUNTER_SLOW" TS_CLK / 16;

Also in the synthesis options I had to change the hierarchy separator from the default "/" to "_" in order to match the instance names in your UCF.  I verified that the TS_COUNTER spec was actually used:

 

================================================================================
Timing constraint: TS_COUNTER = MAXDELAY FROM TIMEGRP "COUNTER_SLOW" TO TIMEGRP
"COUNTER_SLOW"         TS_CLK / 16;
For more information, see From:To (Multicycle) Analysis in the Timing Closure User Guide (UG612).

 406 paths analyzed, 68 endpoints analyzed, 0 failing endpoints
 0 timing errors detected. (0 setup errors, 0 hold errors)
 Maximum delay is   4.879ns.
--------------------------------------------------------------------------------

By the way, when I have a similar timing problem, where I want a multicycle path for elements with a common clock enable (in this case the clock enable is "carry") I usually use the TNM_NET method to define the timing group like:

 

NET "carry" TNM_NET = COUNTER_SLOW;

This has the advantage of picking up any flops that you might not have intentionally created but that run on the same clock enable, including replicated flops and auto-generated flops with no name from the source code. In the case of your simple counter there are no such flops (the timing report shows the same number of endpoints and paths). However, this single line replaces 28 lines in your UCF.

 

Do you understand why the timing spec for the slow counter must be from COUNTER_SLOW to COUNTER_SLOW and not from COUNTER_FAST to COUNTER_SLOW?
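
One way to see the answer is a small cycle-level model. The sketch below is Python rather than HDL, and it assumes the registered-carry version of the design (carry set when cnt1 = 14); the function names are made up for illustration. It measures how many clock cycles pass between successive changes of each register.

```python
# Cycle-level model of the split counter: a free-running 4-bit prescaler (cnt1)
# and a 28-bit counter (cnt2) that only updates when the registered carry is 1.
# Assumption: carry is a flip-flop that decodes cnt1 == 14.

def simulate(cycles):
    cnt1, carry, cnt2 = 0, 0, 0
    history = []
    for _ in range(cycles):
        history.append((cnt1, carry, cnt2))
        # All three registers update on the same clock edge from the old values.
        next_cnt1 = (cnt1 + 1) % 16
        next_carry = 1 if cnt1 == 14 else 0
        next_cnt2 = (cnt2 + 1) % (1 << 28) if carry else cnt2
        cnt1, carry, cnt2 = next_cnt1, next_carry, next_cnt2
    return history


def change_gaps(values):
    """Clock cycles between successive changes of a register."""
    change_cycles = [i for i in range(1, len(values)) if values[i] != values[i - 1]]
    return [b - a for a, b in zip(change_cycles, change_cycles[1:])]


history = simulate(200)
cnt1_gaps = change_gaps([h[0] for h in history])
cnt2_gaps = change_gaps([h[2] for h in history])
print(min(cnt1_gaps), min(cnt2_gaps))
```

In this model cnt2 holds each value for 16 clocks, so paths that start and end in cnt2 have 16 periods to settle. Anything launched from cnt1 changes every clock, and the carry flop's output has to reach the cnt2 clock enables within a single cycle, so those paths stay under the normal period constraint.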

-- Gabor

10 Replies
aleksazr
Contributor
13,212 Views
Registered: ‎09-05-2008

I've changed

TIMESPEC TS_COUNTER = FROM "COUNTER_FAST" TO "COUNTER_SLOW" TS_CLK / 16;

to

TIMESPEC TS_COUNTER = FROM "COUNTER_SLOW" TO "COUNTER_SLOW" TS_CLK / 16;

 

and that seems to do the trick, although it works at 340 MHz rather than 445 MHz.

 

Is that the correct way? If it is, why is it not working at full 445 MHz?

htsvn
Xilinx Employee
13,205 Views
Registered: ‎08-02-2007

Hi

 

Attach the files to the forum post. The syntax of your constraints seems to be fine.

 

--Hem

----------------------------------------------------------------------------------------------
Kindly note- Please mark the Answer as "Accept as solution" if information provided is helpful.

Give Kudos to a post which you think is helpful and reply oriented.
----------------------------------------------------------------------------------------------
aleksazr
Contributor
13,201 Views
Registered: ‎09-05-2008

Here are the files.

htsvn
Xilinx Employee
13,193 Views
Registered: ‎08-02-2007

Hi

 

The multi-200MHz design seems to fail at an internal frequency of 250 MHz because there are too many levels of logic.

 

Since cnt_2 is a 28-bit counter, refer to pages 223 and 224 of this document: http://www.xilinx.com/support/documentation/sw_manuals/xilinx13_3/ug612.pdf

 

It describes the coding technique needed to reduce timing violations in a wider counter.

 

--Hem

aleksazr
Contributor
13,181 Views
Registered: ‎09-05-2008

That document shows how to split a larger counter into two smaller ones (unfortunately in Verilog),

which I have already done - I've split a 32-bit counter into a 4-bit and a 28-bit counter.

 

Is that not enough? Should I split the 28-bit counter also?

A 4-bit counter can work at 445 MHz and a 28-bit counter is working at a 16 times slower rate,

so I presumed the whole thing will work at 445 MHz, but it works at 340 MHz.

 

I have tried these two constraints,

TIMESPEC TS_COUNTER = FROM "COUNTER_FAST" TO "COUNTER_SLOW" TS_CLK / 16;

TIMESPEC TS_COUNTER = FROM "COUNTER_SLOW" TO "COUNTER_SLOW" TS_CLK / 16;

the latter one giving better results, so I presume that is the correct one,

but I need a confirmation if that is all I need to constrain.

 

1. Is that properly constrained?

 

2. Why does the whole thing work at 340 MHz and not at 445 MHz?

445 MHz / 16 ≈ 28 MHz, so the 28-bit counter should have no problems, and the 4-bit counter hasn't changed...

or am I missing something?

 

3. Why don't I see "carry" FF in Constraints Editor? (if I change the "multi - 200 MHz max\_source\tester.vhd")

 

gszakacs
Professor
13,166 Views
Registered: ‎08-14-2007

In your multi-counter code, you use a combinatorial carry for the longer counter.  This will essentially synthesize exactly as if you made one long counter:

 

carry <= '1' when cnt1=15 else '0';

process (CLK) begin
    if rising_edge(CLK) then
        cnt1 <= cnt1 +1;

--        if cnt1=15 then        -- in this version, why can't I see carry FF
--            carry <= '1';    -- in the list of FFs in the Constraints Editor?
--        else
--            carry <= '0';
--        end if;

        if carry='1' then
            cnt2 <= cnt2 +1;
        end if;
    end if;
end process;

You were on the right track with the commented-out code (i.e. the carry should come from a flip-flop - not gates).  However in order to have the carry when the 4-bit counter equals 15, you need to decode state 14 due to the additional flop delay like:

 

process (CLK) begin
    if rising_edge(CLK) then
        cnt1 <= cnt1 +1;

        if cnt1=14 then        -- decode 14 so the registered carry is high while cnt1=15
            carry <= '1';
        else
            carry <= '0';
        end if;

        if carry='1' then
            cnt2 <= cnt2 +1;
        end if;
    end if;
end process;
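
The one-cycle shift introduced by the carry flop can be checked with a quick model. This is only a sketch in Python, not part of the original project; `carry_phase` is a hypothetical helper that reports which cnt1 values are present while the registered carry is high.

```python
def carry_phase(decode_value, cycles=64):
    """Return the set of cnt1 values during which the registered carry is high."""
    cnt1, carry = 0, 0
    seen = set()
    for _ in range(cycles):
        if carry:
            seen.add(cnt1)
        # Both registers update on the same clock edge from the old values.
        cnt1, carry = (cnt1 + 1) % 16, int(cnt1 == decode_value)
    return seen


print(carry_phase(15))  # registered decode of 15
print(carry_phase(14))  # registered decode of 14
```

Decoding 15 in a flop puts the carry one cycle late, during cnt1 = 0. Decoding 14 lines it up with cnt1 = 15, so cnt2 increments on the same edge that wraps cnt1 to 0.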

 

At this point you should be able to set a multicycle constraint on the second counter.  If you have a period constraint for the original clock (e.g. 2.247 ns), then you need to multiply (not divide) that constraint by 16.  If the original clock has a frequency constraint (e.g. 445 MHz), then divide it by 16 as you posted.  Note that the "PERIOD" constraint type can be specified either way, as a frequency or as a period.  This confuses the issue when you generate constraints based on the existing PERIOD constraint due to the need to multiply or divide depending on the original constraint syntax.  Personally I always specify periods rather than frequencies to match the name of the constraint.  Then multicycle constraints are always a multiple of the original period.
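
To make the multiply-versus-divide point concrete, here is the arithmetic for the numbers in this thread as a short Python sketch. The variable names are illustrative only.

```python
FAST_CLOCK_MHZ = 445.0  # target rate of the 4-bit prescaler, from the thread
RATIO = 16              # cnt2 is enabled once every 16 fast clocks

fast_period_ns = 1000.0 / FAST_CLOCK_MHZ

# Multicycle requirement expressed as a period: multiply.
slow_period_ns = fast_period_ns * RATIO

# The same requirement expressed as a frequency: divide.
slow_freq_mhz = FAST_CLOCK_MHZ / RATIO

# Dividing a period by 16 would tighten the constraint instead of relaxing it.
wrong_period_ns = fast_period_ns / RATIO

print(round(fast_period_ns, 3), round(slow_period_ns, 2), round(slow_freq_mhz, 2))
```

Both forms describe the same requirement, roughly 36 ns or 27.8 MHz. Which operator is correct depends only on whether TS_CLK was written as a period or as a frequency.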

-- Gabor
aleksazr
Contributor
13,153 Views
Registered: ‎09-05-2008

 

Thanks for pointing out that I should set carry when cnt1 is 14.

 

My current UCF file is

INST "cnt2_0" TNM = TS_SLOW_CLOCK;
INST "cnt2_1" TNM = TS_SLOW_CLOCK;

.

.

INST "cnt2_26" TNM = TS_SLOW_CLOCK;
INST "cnt2_27" TNM = TS_SLOW_CLOCK;
TIMESPEC TS_ = FROM "TS_SLOW_CLOCK" TO "TS_SLOW_CLOCK" TS_CLK / 16;


My first attempt was

TIMESPEC TS_ = FROM "TS_FAST_CLOCK" TO "TS_SLOW_CLOCK" TS_CLK / 16;

(the complete UCF file is attached in my previous posts)

 

Both versions make sense to me... please advise.

 

 

 

 

3. Why don't I see "carry" FF in Constraints Editor? (if I change the "multi - 200 MHz max\_source\tester.vhd")

 

This was my bad, sorry. I haven't actually created the FF, LOL, the lines were commented out.

 

 

avrumw
Guide
13,129 Views
Registered: ‎01-23-2009

By the way, when I have a similar timing problem, where I want a multicycle path for elements with a common clock enable (in this case the clock enable is "carry") I usually use the TNM_NET method to define the timing group like:

 

NET "carry" TNM_NET = COUNTER_SLOW;

 

Just a warning on this...

 

The format is clear - any clocked element that is combinatorially reachable from the net "carry" is put in the group COUNTER_SLOW.

 

But let's look at a possible situation:

 

always @(posedge clk)
begin
  if (counter == MAX_CNT)
    counter <= 0;
  else
    counter <= counter + 1'b1;
end

assign carry = (counter == MAX_CNT);

always @(posedge clk)
begin
  if (carry)
  begin
    // all stuff in here is multicycle
  end
end

From your code, you want any FF that is enabled with "carry" to be part of the group. But let's look at what synthesis can do with this.

 

The synthesis engine could end up sharing the term (counter == MAX_CNT) that is used in the continuous assign to "carry" as well as the (counter == MAX_CNT) that is used to reload the counter. Now, there is only one net, and it is called "carry".

 

Now the TNM will find all FFs that are combinatorially reachable from the net "carry", which includes the flip-flops that make up the counter "counter". Now your counter FFs are part of the group, and hence part of the multicycle path - this is a constraint error.

 

It doesn't even matter if the "counter" FFs are in a sub-module and the "carry" net is grabbed at the top level of the design - the net "carry" is still the same net, and the "counter" FFs will be in the group.
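
The forward trace can be illustrated with a toy netlist model. This is a Python sketch of the idea only, not how the Xilinx tools are implemented; the node names (`payload_ff`, `counter_ff`, `counter_next_mux`, `reload_cmp`) are invented for the example.

```python
from collections import deque


def tnm_group(start_net, fanout, flops):
    """Collect flops reached from start_net, tracing through logic but stopping at flops."""
    group, seen, queue = set(), {start_net}, deque([start_net])
    while queue:
        node = queue.popleft()
        for load in fanout.get(node, []):
            if load in flops:
                group.add(load)       # a clocked element ends the trace
            elif load not in seen:
                seen.add(load)
                queue.append(load)    # combinational node: keep tracing forward
    return group


flops = {"counter_ff", "payload_ff"}

# Case 1: synthesis keeps two separate comparators.
separate = {
    "carry": ["payload_ff"],             # clock enable of the multicycle logic
    "reload_cmp": ["counter_next_mux"],  # private comparator for the counter reload
    "counter_next_mux": ["counter_ff"],
}

# Case 2: synthesis shares one comparator, so "carry" also drives the reload mux.
shared = {
    "carry": ["payload_ff", "counter_next_mux"],
    "counter_next_mux": ["counter_ff"],
}

print(sorted(tnm_group("carry", separate, flops)))
print(sorted(tnm_group("carry", shared, flops)))
```

With separate comparators only the enabled logic lands in the group. Once the comparator is shared, the counter flops are pulled in as well, even though they change every clock.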

 

As a result, I never use the "NET xxx TNM = group_name;" format on anything other than a clock - when used on a data signal, there is always the possibility that the net can be used or re-used in a way that you can't predict.

 

And (by the way) if you think this is made up - it isn't... I have seen the Xilinx tools do exactly what I have described above...

 

Avrum

gszakacs
Professor
7,768 Views
Registered: ‎08-14-2007

Yes, you can get into trouble with TNM_NET, but the tools do exactly what you tell them.  In this case, I have changed the original design so that the "carry" signal is the output of a flop detecting state 14 (on during state 15) of the prescaler.  This signal goes nowhere but the clock enable of the 28-bit counter.

 

There are other problems with TNM_NET including when you use:

 

always @ (posedge clk) clk_ena <= !clk_ena;

 

In this case you have a clock enable that depends on itself, which is clearly not a multicycle path.
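
A short model shows why. The Python sketch below assumes a hypothetical register `data` that is enabled by clk_ena, and compares how often each register changes.

```python
def min_change_gap(values):
    """Smallest number of clock cycles between successive changes of a register."""
    changes = [i for i in range(1, len(values)) if values[i] != values[i - 1]]
    return min(b - a for a, b in zip(changes, changes[1:]))


clk_ena, data = 0, 0
ena_trace, data_trace = [], []
for _ in range(32):
    ena_trace.append(clk_ena)
    data_trace.append(data)
    # Same-edge update: clk_ena toggles every clock, data counts only when enabled.
    clk_ena, data = 1 - clk_ena, (data + 1 if clk_ena else data)

print(min_change_gap(ena_trace), min_change_gap(data_trace))
```

The enabled register changes every second clock, but clk_ena itself changes every clock. A TNM_NET on clk_ena would also reach the clk_ena flop through its own inverter, so a two-cycle spec on that group would wrongly relax the clk_ena to clk_ena path.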

 

So yes you have to be careful.  But in a large design where you're using clock enables instead of multiple related clock domains, it's not so hard to stay out of trouble by managing the clock enable signals.  And in those same cases, you'd spend a long time creating instance-name based timing groups.

-- Gabor