cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
ronnywebers
Advisor
Advisor
15,067 Views
Registered: ‎10-10-2014

FSM coding - 1 vs 2 vs 3 process style - which one is preferred

Jump to solution

There seem to be 3 kinds of 'templates' for FSM, using 1, 2 or 3 processes. 

 

UG901 - Synthesis, HDL coding techniques (chapter 3), FSM components, shows a 3 process FSM like this :

 

fsm 3 process.png

 

the Verilog and VHDL example a few pages further shows a single process FSM.

 

I've read in some forum posts that 2 and 3 process coding style is inherently more buggy, and that 1 process coding is preferred. But without any further explanation why. Is this a correct statement? As UG901 only shows a single process FSM, I'm tempted to believe this. But many books use 2 or 3 process FSMs.

 

** kudo if the answer was helpful. Accept as solution if your question is answered **
0 Kudos
1 Solution

Accepted Solutions
avrumw
Expert
Expert
21,482 Views
Registered: ‎01-23-2009

First, I want to state that this is all just "style".

Done properly, all three types of implementations can be equally correct and (more or less) equally efficient in terms of the final result.

So it really is style - which is easier.

I (and others) will argue that the one process state machine is the easiest. Some of the reasons have been listed here - no need to deal with complicated sensitivity lists, not having two versions of your state variable (the current and the next), not being able to infer accidental latches (although this is a bug not really specific to FSM coding style). But for me, mostly, it is just less typing (hence less opportunity for bugs)!

The one process state machine also has the advantage that it can directly code for the CE of the flip-flop - inside the clocked process an "elseless if" maps directly to the clock enable. if the "next_state" logic is separate you need to explicitly (either at the top or in each if/else branch) explicitly code "next_state <= state" which codes for a multiplexer with feedback. The tool can almost certainly move back and forth between these two (MUX with feedback vs. CE) during optimization, but I find the concept of the "else do nothing" more in line with what we are trying to do in a state machine.

Now, as for the question about outputs, we do have two competing coding styles. The 1 process state machine cannot do combinatorial outputs. Thus, it is impossible to have anything in the "output function". This limits the one process state machine to be "Simple Moore" - a Moore machine with the output logic being identically equal to the current state. This meshes well with the concept that "all module outputs come from flip-flops", when the state machine outputs have to leave a module.

While this looks like a big restriction, it can be minimized - any Moore machine can essentially be converted to a Simple Moore machine by choosing the state bits carefully (or having more state bits). Furthermore, Mealy machines can be converted to Moore machines, although this can increase the length of the critical path from the inputs.

So, we end up with the statement that Mealy machines can't be the outputs of a module - and this is a restriction that I wholeheartedly agree with - you should never have a combinatorial path through a module!

As for the idea of putting flip-flops after the "output function",  that would certainly meet the "all outputs come from flip-flops" rule. But this combined thing is a "state machine followed by flip-flops" - it is not "just a state machine". With the extra flip-flops the outputs reflect what happened two clocks ago, not on the previous clock - so, by definition, this is no longer truly a "state machine".

And, by the way, there is nothing that says that you always have to do the same thing. While I am a strong proponent of the 1 process state machine, I occasionally (OK, rarely) use a two process state machine - mostly when I need to convert a Mealy machine to a Moore machine (using the "next_state" logic as part of generating the outputs through the "output function".

Avrum

View solution in original post

18 Replies
drjohnsmith
Teacher
Teacher
15,044 Views
Registered: ‎07-09-2009

with one big process , our less likely / its impossible to make un intended latches,

 

as everything is under the clock 

 

two and three process state machines tend to have big sensitivity lists in VHDL,

     ( Unless you use the new all option !! ) 

 

company policy is the other driving force.

 

We used to use two process blocks 20 odd years ago, but now all single process.

 

Unless your doing asynchronous state machines, 

     then all bets are off...

 

 

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
nagabhar
Xilinx Employee
Xilinx Employee
15,043 Views
Registered: ‎05-07-2015

HI @ronnywebers

 

>>ive read in some forum posts that 2 and 3 process coding style is inherently more buggy,
I don't see why this should be the case. 

Writing a 2 process FSM is just a convenience to separate out the sequential  part  of the FSM  and describe the  next logic  in a  combination-logic process.
the sequential process will simply have curr_state <= next_state statement.

UG901 providing  a single sequential process FSM  example should not be interpreted as a suggestion to prefer the single process approach.





Thanks
Bharath
--------------------------------------------------​--------------------------------------------
Please mark the Answer as "Accept as solution" if information provided addresses your query/concern.
Give Kudos to a post which you think is helpful.
--------------------------------------------------​-------------------------------------------
ronnywebers
Advisor
Advisor
15,029 Views
Registered: ‎10-10-2014

thanks @nagabhar@drjohnsmith - additional question if I may : if process 3 (output) generates signals going to the outside world, like i.e. SPI interface, wouldn't an output register be necessary? Or a process 3 made a clocked process instead of combinatorial?

** kudo if the answer was helpful. Accept as solution if your question is answered **
0 Kudos
avrumw
Expert
Expert
21,483 Views
Registered: ‎01-23-2009

First, I want to state that this is all just "style".

Done properly, all three types of implementations can be equally correct and (more or less) equally efficient in terms of the final result.

So it really is style - which is easier.

I (and others) will argue that the one process state machine is the easiest. Some of the reasons have been listed here - no need to deal with complicated sensitivity lists, not having two versions of your state variable (the current and the next), not being able to infer accidental latches (although this is a bug not really specific to FSM coding style). But for me, mostly, it is just less typing (hence less opportunity for bugs)!

The one process state machine also has the advantage that it can directly code for the CE of the flip-flop - inside the clocked process an "elseless if" maps directly to the clock enable. if the "next_state" logic is separate you need to explicitly (either at the top or in each if/else branch) explicitly code "next_state <= state" which codes for a multiplexer with feedback. The tool can almost certainly move back and forth between these two (MUX with feedback vs. CE) during optimization, but I find the concept of the "else do nothing" more in line with what we are trying to do in a state machine.

Now, as for the question about outputs, we do have two competing coding styles. The 1 process state machine cannot do combinatorial outputs. Thus, it is impossible to have anything in the "output function". This limits the one process state machine to be "Simple Moore" - a Moore machine with the output logic being identically equal to the current state. This meshes well with the concept that "all module outputs come from flip-flops", when the state machine outputs have to leave a module.

While this looks like a big restriction, it can be minimized - any Moore machine can essentially be converted to a Simple Moore machine by choosing the state bits carefully (or having more state bits). Furthermore, Mealy machines can be converted to Moore machines, although this can increase the length of the critical path from the inputs.

So, we end up with the statement that Mealy machines can't be the outputs of a module - and this is a restriction that I wholeheartedly agree with - you should never have a combinatorial path through a module!

As for the idea of putting flip-flops after the "output function",  that would certainly meet the "all outputs come from flip-flops" rule. But this combined thing is a "state machine followed by flip-flops" - it is not "just a state machine". With the extra flip-flops the outputs reflect what happened two clocks ago, not on the previous clock - so, by definition, this is no longer truly a "state machine".

And, by the way, there is nothing that says that you always have to do the same thing. While I am a strong proponent of the 1 process state machine, I occasionally (OK, rarely) use a two process state machine - mostly when I need to convert a Mealy machine to a Moore machine (using the "next_state" logic as part of generating the outputs through the "output function".

Avrum

View solution in original post

drjohnsmith
Teacher
Teacher
15,007 Views
Registered: ‎07-09-2009

output registers,

 

I'd typically re register the output of the FSM , 

     in which case the register might be put into the IOB, 

        

Vivado will just work till it meets your timing constraints, 

    so as to is it in the IOB or not, is not 100 % to do with the FSM.

 

Vivado could register duplicate, have one inside the FPGA , and a duplicate in the IOB,

   

remember , TCL can fix anything.

 

 

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
ronnywebers
Advisor
Advisor
14,969 Views
Registered: ‎10-10-2014

thanks @avrumw

 

I've rewritten my 3 process FSM to a single process FSM, and it looks indeed much cleaner as you said.

 

I'm wondering if I can get it even more cleaner, so could you please review my questions below. Based on your feedback I will cleanup the code further in a next reply.

 

 

	
	-- signal declarations
	...   
	signal s_fifo_wren      : std_logic := '0';

    constant kTRIGGER_LEVEL : std_logic_vector(13 downto 0) := "00000000100101"; 
    
    type trigger_state_type is (idle, waitingfortrigger, triggered);
    signal trigger_state : trigger_state_type;
    
    signal s_holdoff_count                  : unsigned(7 downto 0);
    signal s_signal_above_triggerlevel      : std_logic;
    signal s_signal_above_triggerlevel_p1   : std_logic;
    signal s_rising_edge_tick               : std_logic;
    
    signal s_log_event_tick                 : std_logic;
	...

begin

	-- *** 2-FF CDC for FSM enable bit (reg_io_0, bit 1) ***
	-- the reg_io_0(1) bit comes from a CPU register in another (at least 2x faster) clock domain
	-- we need to sync this bit into the CLOCK_IO domain first, before using it in the FSM 	
	process(CLOCK_IO)
	begin
		if rising_edge(CLOCK_IO) then
		    if RESET_IO = '1' then						-- synchronously reset the CDC regs
		        s_enable_sync_1 <= '0';
		        s_enable_sync_2 <= '0';
		    else
		        s_enable_sync_1 <= reg_io_0(1);			-- bit coming from CLOCK_CPU domain
		        s_enable_sync_2 <= s_enable_sync_1;		-- 2-FF synchronizer
		    end if;
		end if;
	end process; 

	...

    -- *** oscilloscope 'rising edge trigger' ***
    
    -- first detect if the ADC input signal goes above trigger level (hard-coded constant for this example)
    s_signal_above_triggerlevel <= '1' when reg_adc_data_i > kTRIGGER_LEVEL else '0';
    
    -- generate a 'rising edge tick' when ADC input crosses this trigger level from low to high
    s_signal_above_triggerlevel_p1 <= s_signal_above_triggerlevel when rising_edge(CLOCK_IO);
    s_rising_edge_tick <= s_signal_above_triggerlevel and not s_signal_above_triggerlevel_p1;
    
    -- *** FSM *** 
	-- function : 		
	-- on a s_rising_edge_tick, generate a s_log_event_tick
	-- then wait a specified amount of clock cycles (s_holdoff_count) before accepting a new trigger

    fsm_th_trig : process(CLOCK_IO)
    begin
        if rising_edge(CLOCK_IO) then
            if RESET_IO = '1' then      -- synchronous reset
                trigger_state <= idle;
                s_holdoff_count <= (others => '0');
                s_log_event_tick <= '0';
            else
                case trigger_state is 
                    when idle =>
                        s_log_event_tick <= '0';
                        if s_enable_sync_2 = '1' then
                            trigger_state <= waitingfortrigger;
                        else
                            trigger_state <= idle;
                        end if;
                        
                    when waitingfortrigger =>
                        if s_enable_sync_2 = '0' then
                            trigger_state <= idle;
                            s_log_event_tick <= '0';
                        else
                            if s_rising_edge_tick = '1' then
                                s_holdoff_count <= X"40";       -- preload holdoff_count, hardcoded for this example
                                s_log_event_tick <= '1';        -- generate single log tick on rising edge
                                trigger_state <= triggered;
                            else
                                trigger_state <= waitingfortrigger;
                                s_log_event_tick <= '0';
                            end if;
                        end if; 
                        
                    when triggered =>
                        s_log_event_tick <= '0';                    
                        if s_holdoff_count = X"00" then
                            trigger_state <= waitingfortrigger; -- stay here until holdoff_count expires
                        else
                            s_holdoff_count <= s_holdoff_count - 1;
                            trigger_state <= triggered;
                        end if;
                        
                    when others =>
                        s_log_event_tick <= '0';                           
                        trigger_state <= idle;
                end case;
            end if;
        end if;
    end process fsm_th_trig;
    
	-- connect signals to the event log FIFO
    s_fifo_wren <= s_log_event_tick;	-- s_log_event_tick = output of the FSM
	s_fifo_data <= s_event_time;		-- s_event_time is some counter running on CLOCK_IO	

 

I've put some explanation in comments, I hope this is clear enough

 

Q1 : if I understand your comment 'inside the clocked process an "elseless if" maps directly to the clock enable' means that I can rewrite for example the idle state like this :

 

when idle =>
    s_log_event_tick <= '0';
    if s_enable_sync_2 = '1' then
        trigger_state <= waitingfortrigger;
    end if;

 so the else is not needed, just like in a regular clocked counter where the count <= count + 1 does not need an else case with count <= count

 

Q2 : can I drop the 'when others', or is it safer to leave it there?

 

Q3 : can I put a default value of '0' for s_log_event_tick  before the line 'case trigger state is', such that I only need to write s_log_event_tick <= '1' in the waitingfortrigger state?

 

 

** kudo if the answer was helpful. Accept as solution if your question is answered **
0 Kudos
avrumw
Expert
Expert
14,952 Views
Registered: ‎01-23-2009

I am not a VHDL expert so I am a bit hesitant to answer these (I use Verilog/SystemVerilog), but...

 

Q1: Yes

 

Q2: It is probably a good idea to keep the "when others".

 

In terms of behavior, this doesn't do anything - without it, you are implicitly stating "If I am in a state other than the ones I have listed here, then do nothing". Since you are enumerating all the legal states, then this means "If I am in an illegal state then stay here" - since you shouldn't ever be in an illegal state, then this doesn't matter.

 

However, failures happen (see "Single Event Upsets"), and it is generally preferable for a failed state machine to return to one of the legal states.

 

Of course this all gets far more complex when the synthesis tool starts recoding the state vectors...

 

Q3: This is certainly legal in Verilog/SystemVerilog, and I am pretty certain it is also legal in VHDL.

 

Avrum

drjohnsmith
Teacher
Teacher
14,944 Views
Registered: ‎07-09-2009

you will need the when others , especially if you simulate.

 

VHDL , has 9 logic levels,

     the state machine 'only' defines 0 or 1 levels, so the tools complain about the missing states unless you include the others.

 

The joys of a tight defined language, I love it, but others hate it.

 

If you have the logic space in the FPGA,

     a style that I see in companies is to deifne a oops state ,

           and the 'others' sate jumps to the oops state

 

 and the oops jumps to the reset state.

 

then you have the option of defining an error flag in the oops state for the run time system,

     

     when oops  =>

          trigger_state <= idle;

   

     when others =>

           trigger_state <= oops;

 

 

again in some companies standards are the constant,  put a B in the front to 110 % guarantee its binary.

 

constant kTRIGGER_LEVEL : std_logic_vector(13 downto 0) := B"00000000100101";

 

              

            

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
ronnywebers
Advisor
Advisor
14,865 Views
Registered: ‎10-10-2014

thanks @drjohnsmith@avrumw

 

I've updated part of the code with your suggestions, this becomes much more readable - guess more optimal is not possible? (I left out the oops state, but thanks for the great tip @drjohnsmith)

 

Q: about the counter : is this common practice to code it into the FSM, or is it better to leave counters out of the FSM, and use enable signals etcetera?

 

fsm_th_trig : process(CLOCK_IO)
begin
	if rising_edge(CLOCK_IO) then
		if RESET_IO = '1' then      -- synchronous reset
			trigger_state <= idle;
			s_holdoff_count <= (others => '0');
			s_log_event_tick <= '0';
		else
			-- default signal values
			s_log_event_tick <= '0';
			
			-- states
			case trigger_state is 
				when idle =>
					if s_enable_sync_2 = '1' then
						trigger_state <= waitingfortrigger;
					end if;
					
				when waitingfortrigger =>
					if s_enable_sync_2 = '0' then
						trigger_state <= idle;
					else
						if s_rising_edge_tick = '1' then
							s_holdoff_count <= X"40";       -- preload holdoff_count, hardcoded for this example
							s_log_event_tick <= '1';        -- generate single log tick on rising edge
							trigger_state <= triggered;
						end if;
					end if; 
					
				when triggered =>                
					if s_holdoff_count = X"00" then
						trigger_state <= waitingfortrigger; -- stay here until holdoff_count expires
					else
						s_holdoff_count <= s_holdoff_count - 1;
					end if;
					
				when others =>                        
					trigger_state <= idle;
			end case;
		end if;
	end if;
end process fsm_th_trig;

 

** kudo if the answer was helpful. Accept as solution if your question is answered **
0 Kudos
drjohnsmith
Teacher
Teacher
10,677 Views
Registered: ‎07-09-2009

looks very professional,

 

as for counter in sm or outside with an enable..

 

Easier to read in the sm, 

    and most tools most of the time understand what you want and give a enabled counter,

 

depends how critical / close to the edge you are on speed / size 

 

For this one, I'd probably go the way you have,

 

but , despite what some might say, its down to taste as much as anything

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
avrumw
Expert
Expert
10,670 Views
Registered: ‎01-23-2009

I highly prefer keeping the counter in the same process as the state machine.

 

Trying to keep the counter separate has some disadvantages...

 

If you try and use Moore outputs for communication between the two, then you end up with one clock of latency. This can be compensated for by reducing the count value by one, but creates corner cases (what if your counter reload value is only 1?)

 

If you try and use Mealy outputs for the communication, then you end up repeating case conditions in both your state machine and counter, For example, your counter increment condition in the separate counter would be (forgive the language change...)

 

 

if ((trigger_state ==  waitingfortrigger) && !s_enable_sync2 && s_rising_edge_tick) 
   s_holdoff_count <= 8'h40
else if ((trigger_state == triggered) && (s_holdoff_count != 0))
   s_holdoff_count <= s_holdoff_count - 1'b1;

which ends up replicating a whole whack of the conditions that already exist (and hence are decoded for) in the state machine. The tool can probably share this logic, but it quickly becomes difficult to maintain this code if it needs to change to fix bugs (as the conditions exist in 2 places).

 

Conversely, I don't see any disadvantage to keeping them in the same process. The tools are capable of separating the "state machine" part of the process from the "counter" part of the process and still do all the optimizations on the state machine.

 

Avrum

ronnywebers
Advisor
Advisor
2,036 Views
Registered: ‎10-10-2014

@avrumw  I hope I can get back to this post once more, it has been extremely helpful to me so far, my HDL life changed after I started coding 1 process FSMs

However (cause I ran into a strange issue in another FSM) : I'm currently wondering if I need to assign a value to a signal or output value in each state, i.e. suppose I want to read the 's_holdoff_count' from outside the FSM (i.e. connect it to an AXI register to monitor the value, just as an example) : what does the compiler do with 's_holdoff_count' in the states or if/else cases where I don't explicitly provide it with a value, or don't write something like s_holdoff_count <= s_holdoff_count (so it keeps it's current value)?

What does the compiler do in such situation? Does it asume the 's_holdoff_count <= s_holdoff_count' in these cases (so 'do nothing' with the value), or can it assign '0', or?

Or should I put 's_holdoff_count <= s_holdoff_count' in the 'default signal' section right before the 'case' statement starts?

** kudo if the answer was helpful. Accept as solution if your question is answered **
0 Kudos
drjohnsmith
Teacher
Teacher
2,028 Views
Registered: ‎07-09-2009

Suggest yo start a new thread , so that this does not become lost.

for reference, 

try this free book

http://freerangefactory.org/

 

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
richardhead
Scholar
Scholar
2,019 Views
Registered: ‎08-01-2012

@ronnywebers 

It might be easier with a code example. In VHDL, in a clocked process, a signal assignment to itself is not required as this is the normal behaviour of a register - it holds its value unless otherwise assigned. Assigning to itself will make no functional difference to the code, just add superfluous lines of code. No signal is ever assigned a default value out of thin air - all assignments are made somewhere.

so the following are functionally and hardware wise identical:

 

type state_t is (s1, s2);
signal state : state_t;

process(clk)
begin
  if rising_edge(clk) then
    case state is
      when s1 =>
        state <= s2;
      when s2 => 
        if change_state then
          state <= s1;
        end if;
    end case;
  end if;
end process;
        
type state_t is (s1, s2);
signal state : state_t;

process(clk)
begin
  if rising_edge(clk) then
    state <= state;  -- not needed - this will be infered because its a register.
    case state is
      when s1 =>
        state <= s2;
      when s2 => 
        if change_state then
          state <= s1;
        end if;
    end case;
  end if;
end process;

 

 

The only difference is in simulation you'll get extra 'transaction attribute toggles (as this attribute toggles on every signal assignment). But thats simulation only.

Btw, if you ever need to output the state, there is a quick way to get it without manually encoding it:

 

output_state : integer;

output_state <= state_t'pos(state);

 

This outputs the state as an integer. It doesnt affect the state encoding (which will default to one hot if you leave it at default with more than 3 states) but is a nice easy thing to connect to status registers if needed, and no need to create and debug extra signals. You just need to know the position of the states.

avrumw
Expert
Expert
1,947 Views
Registered: ‎01-23-2009

so the following are functionally and hardware wise identical:

I would argue that they are not quite identical. Certainly they are functionally identical, but is the hardware really identical?

If you look at a state machine, for each bit, there is a "next state" generator in front of the flip-flop - a bubble of logic in front of the flip-flop. This determines the next state based on the current state and the inputs; in the most general form this can be viewed as a giant MUX that selects one of the states under the appropriate condition. When you assign the state to explicitly hold the same state, you are adding an input condition to the MUX. Literally this means "under these conditions assign the next state to this value, which happens to be the current state" - a feedback of the Q back to the D (through an input combination of the MUX).

Conversely if you have an else-less if in a branch of the state machine (in a one process state machine coding style), this reads "do not change the state under this condition". When a flip-flop does not need to update, the tools can use the CE (disable the CE) to hold the same value. 

So the "state <= state" codes for a MUX, the non-assignment of state codes for a CE.

There are advantages and disadvantages (from a hardware implementation) for both; the moving logic to the CE from the D can speed up the design - less logic in the combinatorial logic on the D input. Conversely, using CEs can mean different control sets, which can lead to poorer slice utilization.

The synthesis tool can convert back and forth between these two implementations; they are functionally identical and only differ in optimization, and the synthesis does lots of optimization. However, I suspect that the tool can more easily convert a CE to a MUX if there is a control set problem, but less easily identify one of several MUX inputs as a "keep the same value" and move it to the CE.

In fact it is for this reason that I generally do not prefer the FSM format that has an explicit "next_state" combinatorial process - by definition, you must have a "next_state <= state" (or assign all branches of the process) which codes for a MUX; there is no explicit way of coding for a CE in this FSM coding style.

Avrum

 

ronnywebers
Advisor
Advisor
1,841 Views
Registered: ‎10-10-2014

Thanks a lot @avrumw  and @richardhead for your insights, I never thought so 'low level' about coding style and the resulting hardware (like one version compiles to CE, the other one to a MUX).

I'm not sure if this also answers my actual question, guess so, but just to make sure I expanded a bit on the simple code piece (left out the definitions to KISS). I also added a 'when others' state, maybe this is not really needed anymore with modern compilers?

Version 1 is how I currently write my code

Version 2 is where I have my doubts about : should I add the ' stored_value <= stored_value;' or not to the default section before the else case, just like I do for the 'done_tick'? Or is there a difference here, namely I want the 'done_tick' to be '0' in all states, except for s3 where it generates a single tick. While the 'stored_value' is an inferred register, which will only update in s2 if the condition 'change_state' is '1'?

Version 3 is something that I find a lot in code I inherited from a previous collegue, which I call the 'I-don't-really-trust-the-compiler-so-I-code-everything-in-every-state'-version. I guess that technically there's nothing wrong with version 3, but it's hard to read and complete overkill / redundant lines of code

 

-- version 1
type state_t is (s1, s2, s3);
signal state : state_t;

process(clk)
begin
  if rising_edge(clk) then
    -- default assignment
    done_tick <= '0'

    case state is
      when s1 =>
        state <= s2;
      when s2 => 
        if change_state then
          stored_value <= data_from_bram;
          state <= s3;
        end if;
      when s3 =>
        done_tick <= '1';
        state <= s1;
      when others => 
        state <= s1;  -- safeguard
    end case;
  end if;
end process;

 

 or

 

-- version 2
type state_t is (s1, s2, s3);
signal state : state_t;

process(clk)
begin
  if rising_edge(clk) then
    -- default assignment
    done_tick <= '0'
    stored_value <= stored_value;

    case state is
      when s1 =>
        state <= s2;
      when s2 => 
        if change_state then
          stored_value <= data_from_bram;
          state <= s3;
        end if;
      when s3 =>
        done_tick <= '1';
        state <= s1;
      when others => 
        state <= s1;  -- safeguard
    end case;
  end if;
end process;

 

or 

 

 

-- version 3
type state_t is (s1, s2, s3);
signal state : state_t;

process(clk)
begin
  if rising_edge(clk) then
    case state is
      when s1 =>
        stored_value <= stored_value;
        done_tick <= '0';
        state <= s2;
      when s2 => 
        if change_state then
          stored_value <= data_from_bram;
          done_tick <= '0';
          state <= s3;
        else
          stored_value <= stored_value;
          done_tick <= '0';
          state <= s2;
        end if;
      when s3 =>
        stored_value <= stored_value;
        done_tick <= '1';
        state <= s1;
      when others => 
        stored_value <= stored_value;
        done_tick <= '0';
        state <= s1;  -- safeguard
    end case;
  end if;
end process;

 

 

I could even add a fourth one that I've seen in legacy code, with in the else statement an 'else NULL', but I stay far away from that as I don't see the use for NULL

So if I'm correct, functionally all 3 are the same, but the compiler will code version 2 and 3 to a MUX, and version one to CE?

Maybe @drjohnsmith was right to start a new thread (?) however I have already pointed a few colleagues to this post, as it's very educational when it comes to single process FSM's, and I think this completes the last bits in the post. And last but not least, I recently watched the Xilinx on-demand video training of VHDL, where they seem to recommend 2 or 3 process FSMs over single process FSM's... kind of strange, since I switched to single process FSMs after this post, my coding life became so much easier...

** kudo if the answer was helpful. Accept as solution if your question is answered **
0 Kudos
richardhead
Scholar
Scholar
1,832 Views
Registered: ‎08-01-2012

@ronnywebers 

Version 3 can be good if you are trying to reduce control sets. in V3, the outputs will simply be a decoded version of the state. The others will use clock enables on the output regs to control the output based on the state - and this may make routing harder. I generally use V1 and dont really worry too much about clock enables - for 200Mhz designs on an ultrascale resources are usually not much of an issue.

I do note though that "when => others" may mask errors in your code. In a case statement you are required to provide a case for all possible values. If you are using a custom state type, if you add a new state, and forget to actually add a behaviour for that state, if there is no when => others state you will get a syntax error. A when others case will just take the "others" option for that state. I have never tested, but I do not know if a synthesisor will actually use the when others for actual illegal states  - I suspect it actually wont - and the when others may actually be a false comfort blanket. In the default for a 1-hot state machine I beleive it will only use the state bits it actually cares about and not check the state bits it doesnt need. If you are writing code where illegal states must be accounted for, you will likely not be default state encodings.

I have seen posts arguing and showing that 2/3 processes can be "faster", but in 15 years, Ive not had a problem with a single process style.

maps-mpls
Mentor
Mentor
1,671 Views
Registered: ‎06-20-2017

In a poor imitation of E.B. White, but with the intent to add to the already solved discussion, this is what I claim:

  1.  In the old days, synthesizers would choke on 1 process state machines.
  2.  Many VHDL authors recommended 2 and 3 process state machines.  It was good advice before 1995.
  3.   It also helped students who were learning about Mealy and Moore state machines (academic classifications) as part of their engineering education.
  4.  Synthesizers got better.
  5.  People who wrote VHDL books keep following and recommending old rules of thumb that--though once good advice--were made obsolete by better synthesizers.
  6.   In general, a single process state machine is far more readable and maintainable than a 2 or 3 process state machine.  No scrolling up and down in your file.  You can literally read it straight through one time and construct a state transition table (in your mind or on paper).
  7.  Separating combinational  and sequential logic is the synthesizer's day job.  The designer should stick to designing and not attempt to micromanage the synthesizer by coding up 2 or 3 process state machines.
  8.   Mealy v. Moore are just labels and classifications used in academic settings.  I care about latency, timing, and functionality in the real world.  What particular type of state machine the synthesizer implements is no matter.  It can be Mealy, More, or registered Mealy—it is of no concern to me as long as it meets requirements and makes timing.
  9.  For timing considerations, outputs should be registered anyway, including in state machines, unless there is some reason not to (e.g., same cycle acknowledge on the RDY signal in AXI interfaces have to be combinational).
  10.  When using an enumerated data type in VHDL, and after fully enumerating every possible value in a case statement, the "when others =>" clause has no effect in simulation.  That is, you cannot then write a test bench that will execute any of the lines under the "when others =>" clause, as code coverage tools (or a little thinking) can confirm.  
  11.  And the "when others =>" clause in a fully enumerated case statment usually doesn’t have an effect in synthesis--unless your synthesizer user guide specifically states that it will look at the “when others =>” (default for verilog) for purposes of generating a safe state machine.   And if the synthesizer does support it for safe state machines, it makes verification of pre-synthesis RTL with a fully enumerated case statement, shall we say, "challenging."  It is hard to simulate a scenario that is logically impossible in the pre-synthesis RTL.
  12.   In simulation it will be impossible to execute any of your statements in the “when others =>” clause if you have a fully enumerated state machine.  The state variable is only allowed to take a finite number of possible enumerated values, and the case statement by being fully enumerated already covers every single possible value.  That is, logically, there are no others.
  13.   Consequently, any company standard that requires a “when others =>” in a fully enumerated case statement may be considered to be promulgating a silly and misguided holdover requirement maintained by those who confuse user enumerated types used in a fully enumerated case statement with standards based enumerated types (like std_logic_vector) where it is almost always impractical to fully enumerate every possible combination (e.g., a two bit std_logic_vector would require 81 entries before rendering the "when others =>" superfluous.  Likewise, a three bit std_logic_vector would require 729 entries in a case statement.)
  14. It is possible for a synthesizer to pay attention to the “when others =>” clause in a fully enumerated case statement for purposes of creating a safe state machine, but you will have to look at your synthesis user guide for the synthesizer you are using.  In my experience, different synthesizers support safe state machines with different constructs and different attribute syntax.  It is not IEEE driven (well, I haven’t looked at the latest standard, which will probably be supported by mainstream EDA tools sometime in the 2030s), and there are probably other vendor defined attributes you will need to add to communicate to the synthesizer that you want a safe FSM.  The synthesis user guide for the synthesizer you're using is the governing document for safe FSMs.  Different synthesizers support safe state machine synthesis options using different design entry techniques (different attributes or constraints, for example).
  15. If you need combinational  outputs (mealy), and only if you need them (e.g., for same cycle acknowledge mentioned above), you can do that with a concurrent assignment based on inputs and current state outside of the process.  I usually do this right after the FSM process as a concurrent assignment, to avoid unnecessary typing of a seperate process and tiresome maintenence of a combinational sensitivity list.  (Yes, VHDL-2008 has an ALL option to fix sensitivity list maintenance, but I'm not using VHDL-2008 until it is consistently supported in synthesis and simulation).   Unregistered Mealy outputs should be driven by functional specifications, and if not required, to beat a dead horse, outputs should be registered.  If there is no functional requirement for a combinational  output (combination of inputs and current state) then you should register your outputs for static timing analysis purposes.
  16. If you include datapath assignments in the state machine, be very careful.  I usually as a matter of preference have data path assignments occur in states that persist for one clock cycle.  E.g.,

 

 

 

 

  when SOME_LD_STATE => 
    data_path <= iData;
    my_state  <= SOME_OTHER_STATE;

 

 

 

 

  1. When combined with auto FSM encoding setting in the synthesizer, this will often create a "CE" from the state bit if the synthesizer opts for one-hot, though this may not be necessary.  (E.g, the LUTs are underutilized in this scenario).
  2.   Also always use auto encoded state machines unless there is a really good reason to force a certain encoding (e.g., to comply with a communications standard, but even then, there are other ways to comply with state encoding for communications purposes).  If you follow the other guidelines regarding levels of logic, registered outputs, etc., the synthesizer in my experience can be trusted to pick the best FSM encoding.
  3. This also has the benefit of providing an opportunity for a multi-cycle path constraint to data_path, depending on the rest of the design.
  4. If you have a synchronous reset in a synchronous process (FSM or otherwise), make sure everything on the left-hand side of an assignment in the “else” part of the reset conditional is also on the left-hand side of a literal assignment in the “if” part of the reset conditional, or you will infer "CE" functionality (data enable, also referred to as "clock enable" by Xilinx).   This can affect slice utilization and levels of logic unnecessarily, though more recent FPGA architectures mitigate this issue.  This issue is particularly challenging since state machines have a tendency to grow as requirements evolve, and designers are not always careful to apply reset to new registers added to the state machine process after it was originally coded.  Failure to be diligent here can lead to "CE Inference" and sometimes "CE inference" leads to failure to meet timing symptoms in other parts of your design, making troubleshooting challenging.  Consequently, design reviews should always make a point to verify "CE Inference" avoidance for synchronous resets, especially in state machines.  (Though US/US+ architecture mitigates the problem, "CE inference" with a synchronous reset is probably not what the designer intended, and should be avoided anyway.)
  5. As always, avoid reset altogether if it is not necessary, and just rely on the INIT functionality.  This will reduce congestion, eliminate risk of "CE inference" with synchronous resets, and improve slice utilization.

You are free and encouraged to do your own experiments and come to your own conclusions.  Synthesizers evolve over time, and the IEEE gives more latitude for how synthesizers work than they do for simulators.  So even the advice here may evolve over time.

 

EDITED To fix errors.

*** Destination: Rapid design and development cycles *** Unappreciated answers get deleted, unappreciative OPs get put on ignored list ***