Adventurer
Adventurer
4,405 Views
Registered: ‎11-26-2016

32bit counter @250MHz timing issues

Hello,

 

A part of a design contains a 32-bit counter that keeps track of transferred AXIS data bytes.

The problem is that the design runs at 250 MHz and fails timing in the logic part of the counter.

What possibilities do I have to fix the issue? I read about DSPs and splitting up the counter, but I thought the tool would handle this best.

The next-state logic is straightforward:

 

  datastream_byte_cnt_next <= TDATA_BYTE_WIDTH when en_datastream_byte_cnt = '0' else
                              datastream_byte_cnt + cnt_bits(s_axis_s_tkeep) when tvalid_tready_tlast = '1' else
                              datastream_byte_cnt + TDATA_BYTE_WIDTH         when tvalid_tready = '1' else
                              datastream_byte_cnt;

The register:

 

  register_p : process (s_aclk) is
  begin
    if rising_edge(s_aclk) then
      datastream_byte_cnt <= datastream_byte_cnt_next;
    end if;
  end process register_p;

Within an FSM, a comparison against a limit is performed:

 

if datastream_byte_cnt >= datastream_byte_cnt_limit then

 

 

Thank you in advance.

32bit_timing.PNG
timing_failed.PNG

Accepted Solutions
Mentor hgleamon1
Mentor
6,949 Views
Registered: ‎11-14-2011

Re: 32bit counter @250MHz timing issues

My understanding of HDL in general is that you need to write code that the synthesiser can match to templates in order to build the logic required.

 

If you want a counter, you should simply describe that function as simply as possible so that the synthesiser understands it is a counter (and you can check the synthesis report to see how the synthesiser has understood your code) and knows how to build it efficiently. I have noticed (with older versions of ISE, anyway) that combining a state machine with an adder/counter produces less efficient hardware than separating the two.

 

If you need more complex logic to control the counter - resetting, loading, up/down, etc. then that should be separated out from the counter to allow the counter logic to remain as streamlined as possible, in order to meet timing for that construction.

 

Naturally, there are many different ways to do the same thing in VHDL (or Verilog) but, in my opinion, your code doesn't immediately infer a counter and possibly that's why the implementation struggled at higher frequencies. I could, of course, be wrong but when I have timing problems I usually try to fix it by simplifying the critical path logic.

 

Anyway, good that you fixed your issue.

----------
"That which we must learn to do, we learn by doing." - Aristotle
6 Replies
Scholar embedded
Scholar
4,391 Views
Registered: ‎06-09-2011

Re: 32bit counter @250MHz timing issues

@so-lli1,

You should try to keep the logic path as short as possible. You could simply put the logic inside a clocked process and register the sums before the mux. Take a look at the code below as a hint:

process(Clk)
begin
  if rising_edge(Clk) then
    if Rst = '1' then
      datastream_byte_cnt_next <= (others => '0');
      a <= (others => '0');
      b <= (others => '0');
    else
      -- Precompute both candidate sums in their own registers so that the
      -- adders and the selection mux sit in separate clock cycles.
      -- Note that a and b then lag datastream_byte_cnt by one cycle.
      a <= datastream_byte_cnt + cnt_bits(s_axis_s_tkeep);
      b <= datastream_byte_cnt + TDATA_BYTE_WIDTH;
      if en_datastream_byte_cnt = '0' then
        datastream_byte_cnt_next <= TDATA_BYTE_WIDTH;
      elsif tvalid_tready_tlast = '1' then
        datastream_byte_cnt_next <= a;
      elsif tvalid_tready = '1' then
        datastream_byte_cnt_next <= b;
      else
        datastream_byte_cnt_next <= datastream_byte_cnt;
      end if;
    end if;
  end if;
end process;

This way the tool will add some flip-flops and shorten the logic path.

Or you can simply put the mux inside the process and see if it improves timing. In VHDL-93 this needs a different syntax (if/elsif); in VHDL-200x the same conditional assignment can be used inside a process.

Another hint to get better performance: make your condition signals (en_datastream_byte_cnt, tvalid_tready_tlast, tvalid_tready) unique, and decide according to the status of all of them in every case. This will help the tool synthesize your code better.
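
One possible reading of that last hint, as a rough sketch only: concatenate the three condition flags and decode every combination explicitly in a case statement, keeping the priority of the original expression. The entity name, the port types (unsigned) and the last_bytes input (standing in for cnt_bits(s_axis_s_tkeep)) are assumptions made for illustration, not taken from the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical wrapper entity; names and widths are assumptions.
entity next_count_select is
  generic (
    TDATA_BYTE_WIDTH : positive := 8
  );
  port (
    en_datastream_byte_cnt   : in  std_logic;
    tvalid_tready_tlast      : in  std_logic;
    tvalid_tready            : in  std_logic;
    last_bytes               : in  unsigned(7 downto 0);  -- stands in for cnt_bits(s_axis_s_tkeep)
    datastream_byte_cnt      : in  unsigned(31 downto 0);
    datastream_byte_cnt_next : out unsigned(31 downto 0)
  );
end entity next_count_select;

architecture rtl of next_count_select is
  signal sel : std_logic_vector(2 downto 0);
begin
  -- One select vector, so every combination is decided in exactly one place.
  sel <= en_datastream_byte_cnt & tvalid_tready_tlast & tvalid_tready;

  process (sel, datastream_byte_cnt, last_bytes) is
  begin
    case sel is
      when "110" | "111" =>  -- enabled, last beat: add the tkeep byte count
        datastream_byte_cnt_next <= datastream_byte_cnt + last_bytes;
      when "101" =>          -- enabled, normal beat: add a full word
        datastream_byte_cnt_next <= datastream_byte_cnt + TDATA_BYTE_WIDTH;
      when "100" =>          -- enabled, no transfer: hold
        datastream_byte_cnt_next <= datastream_byte_cnt;
      when others =>         -- disabled: preload with one word
        datastream_byte_cnt_next <= to_unsigned(TDATA_BYTE_WIDTH, 32);
    end case;
  end process;
end architecture rtl;
```

This is functionally the same as the priority chain in the original post; whether it changes the synthesized result depends on the tool.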

 

Hope this will help,

Hossein

Voyager
Voyager
4,352 Views
Registered: ‎06-24-2013

Re: 32bit counter @250MHz timing issues

Hey @so-lli1,

 

Using DSP units for fast counters is definitely an option too and they can go up to the maximum clock speed of the unit, but it doesn't help you much if the condition to be counted is complex.
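
A minimal sketch of asking the tool to put the accumulator into a DSP slice, assuming Vivado synthesis, which to my knowledge accepts a use_dsp attribute on a signal (older tool versions used use_dsp48). The entity and port names are made up for illustration, and whether the tool honours the request depends on the device and tool version, so check the utilization report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; names and widths are assumptions.
entity dsp_byte_counter is
  port (
    clk   : in  std_logic;
    clear : in  std_logic;             -- synchronous clear
    en    : in  std_logic;             -- add 'step' when high
    step  : in  unsigned(7 downto 0);  -- bytes to add this cycle
    count : out unsigned(31 downto 0)
  );
end entity dsp_byte_counter;

architecture rtl of dsp_byte_counter is
  signal count_r : unsigned(31 downto 0) := (others => '0');

  -- Vivado synthesis attribute (assumption: Vivado is the tool in use).
  -- It requests that the adder feeding count_r is mapped to a DSP slice.
  attribute use_dsp : string;
  attribute use_dsp of count_r : signal is "yes";
begin
  process (clk) is
  begin
    if rising_edge(clk) then
      if clear = '1' then
        count_r <= (others => '0');
      elsif en = '1' then
        count_r <= count_r + step;
      end if;
    end if;
  end process;

  count <= count_r;
end architecture rtl;
```

The logic that produces step and en stays in fabric, so a complex condition in front of the adder can still be the critical path.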

Luckily you can almost always solve this problem by pipelining, unless you need the counter result for some kind of feedback (which seems not to be the case here).
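
A sketch of what pipelining could look like here, under stated assumptions: the increment is selected and registered in a first stage, and a second stage only accumulates it, so the tkeep bit count and the 32-bit adder no longer share one clock period. The entity name, the count_ones helper (a stand-in for the cnt_bits function, which is not shown in the thread), the separate tlast port and the clear input are all hypothetical. Unlike the original code, this version clears to zero instead of preloading TDATA_BYTE_WIDTH, and the count is valid one cycle later.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; names and widths are assumptions.
entity pipelined_byte_counter is
  generic (
    TDATA_BYTE_WIDTH : positive := 8   -- must fit in the 8-bit step register
  );
  port (
    s_aclk        : in  std_logic;
    clear         : in  std_logic;     -- synchronous clear
    tvalid_tready : in  std_logic;     -- a beat is transferred this cycle
    tlast         : in  std_logic;
    tkeep         : in  std_logic_vector(TDATA_BYTE_WIDTH - 1 downto 0);
    byte_cnt      : out unsigned(31 downto 0)
  );
end entity pipelined_byte_counter;

architecture rtl of pipelined_byte_counter is
  -- Stand-in for cnt_bits: counts the '1' bits of a vector.
  function count_ones (v : std_logic_vector) return natural is
    variable n : natural := 0;
  begin
    for i in v'range loop
      if v(i) = '1' then
        n := n + 1;
      end if;
    end loop;
    return n;
  end function count_ones;

  signal step_r : unsigned(7 downto 0)  := (others => '0');
  signal cnt_r  : unsigned(31 downto 0) := (others => '0');
begin
  process (s_aclk) is
  begin
    if rising_edge(s_aclk) then
      if clear = '1' then
        step_r <= (others => '0');
        cnt_r  <= (others => '0');
      else
        -- Stage 1: choose and register the increment for this beat.
        if tvalid_tready = '1' and tlast = '1' then
          step_r <= to_unsigned(count_ones(tkeep), step_r'length);
        elsif tvalid_tready = '1' then
          step_r <= to_unsigned(TDATA_BYTE_WIDTH, step_r'length);
        else
          step_r <= (others => '0');
        end if;
        -- Stage 2: accumulate the increment registered one cycle earlier.
        cnt_r <= cnt_r + step_r;
      end if;
    end if;
  end process;

  byte_cnt <= cnt_r;
end architecture rtl;
```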

 

Hope this helps,

Herbert

-------------- Yes, I do this for fun!
Adventurer
Adventurer
4,344 Views
Registered: ‎11-26-2016

Re: 32bit counter @250MHz timing issues

First of all, thank you for your responses!

 

@embedded I will give it a try and see if it helps.

 

@hpoetzl Is there any example out there of an implementation using a DSP? I tried to take a look at the Binary Counter IP from Xilinx, since according to the docs it makes use of DSPs, but it is locked...

Mentor hgleamon1
Mentor
4,275 Views
Registered: ‎11-14-2011

Re: 32bit counter @250MHz timing issues

I have read this thread a few times over and, I must say, I'm a bit confused.

 

I don't see a counter; I see a two process state machine and then some other state machine making a decision.

 

I would be tempted to hard code the counter (or adder, if it doesn't increment in single units) completely separately and simply load it when your other state machine decides it is time to load.

 

So, if you are keeping track of bytes transferred, your counter/adder is in a separate process:

 

P_BYTE_TRACKER : process(s_aclk) is
begin
  if rising_edge(s_aclk) then
    if (reset_condition = '1') then
      byte_count <= 0;
    elsif (load_condition = '1') then
      byte_count <= byte_count + 1; -- I guess you are incrementing in single units
                                    -- but maybe some other value is more appropriate
    end if;
  end if;
end process P_BYTE_TRACKER;

 

Then you can determine your synchronous logic to assert load_condition in a different process (could be from a controlling state machine) and your comparison in a synchronous process (could be the same controlling state machine or a different one).
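
As a rough illustration of the synchronous comparison mentioned above, the sketch below registers the result of the >= test so that the controlling state machine only sees a single flip-flop output. The entity name, the unsigned port types and the limit_reached flag are assumptions made for the example; the flag is valid one clock after the count changes.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; names and widths are assumptions.
entity limit_compare is
  port (
    s_aclk           : in  std_logic;
    byte_count       : in  unsigned(31 downto 0);
    byte_count_limit : in  unsigned(31 downto 0);
    limit_reached    : out std_logic
  );
end entity limit_compare;

architecture rtl of limit_compare is
begin
  P_COMPARE : process (s_aclk) is
  begin
    if rising_edge(s_aclk) then
      -- Registering the comparison keeps the 32-bit comparator out of
      -- the state machine's next-state logic.
      if byte_count >= byte_count_limit then
        limit_reached <= '1';
      else
        limit_reached <= '0';
      end if;
    end if;
  end process P_COMPARE;
end architecture rtl;
```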

 

Essentially, you need to pipeline the logic to your counter/adder (as others have mentioned). Separating it out to a different, synchronous, process seems the most logical method of implementing this, in my opinion.

----------
"That which we must learn to do, we learn by doing." - Aristotle
Adventurer
Adventurer
4,140 Views
Registered: ‎11-26-2016

Re: 32bit counter @250MHz timing issues

@hgleamon1

You are right, it is more an adder than a counter.

However, I don't see how putting the logic in a separate process would make any difference to the code I presented (yes, I know what pipelining is and why timing failed).

I simply split up the register and the next-state logic. In both cases the tool ended up with the same logic structure.

 

I worked around the issue by removing the tkeep input byte count part of the adder, since the comparison is >= anyway. This was enough for the tool to meet the timing.

 

if datastream_byte_cnt >= datastream_byte_cnt_limit then

Removed this:

 

datastream_byte_cnt + cnt_bits(s_axis_s_tkeep) when tvalid_tready_tlast = '1' else
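
With that branch removed, the next-state logic presumably reduces to something like the sketch below. The entity wrapper and the unsigned port types are assumptions added only to make the fragment self-contained. It also assumes that tvalid_tready is asserted on the last beat as well, so the last beat is counted as a full word and the count can overshoot the true byte count by up to TDATA_BYTE_WIDTH - 1.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical wrapper entity; names and widths are assumptions.
entity simplified_next_count is
  generic (
    TDATA_BYTE_WIDTH : positive := 8
  );
  port (
    en_datastream_byte_cnt   : in  std_logic;
    tvalid_tready            : in  std_logic;
    datastream_byte_cnt      : in  unsigned(31 downto 0);
    datastream_byte_cnt_next : out unsigned(31 downto 0)
  );
end entity simplified_next_count;

architecture rtl of simplified_next_count is
begin
  -- Only a constant is added now, so the tkeep bit count is no longer
  -- in front of the 32-bit adder.
  datastream_byte_cnt_next <=
    to_unsigned(TDATA_BYTE_WIDTH, 32)      when en_datastream_byte_cnt = '0' else
    datastream_byte_cnt + TDATA_BYTE_WIDTH when tvalid_tready = '1' else
    datastream_byte_cnt;
end architecture rtl;
```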

Thank you for your support!
