05-28-2016 03:16 PM
email@example.com recently raised the question (see posting) of how to generate a binary-encoded state number for a large FSM which, for plain performance reasons, should be synthesized with one_hot encoding.
Such a state number is for example useful for debug and/or monitoring purposes.
I wanted such a state number output in one of my designs for a 113 state FSM, and wanted an implementation which is portable and gives a predictable state number. I chose to add a mapper process like
proc_snum : process (R_STATE)
  variable isnum : slv7 := (others=>'0');
begin
  isnum := (others=>'0');
  case R_STATE is
    when s_000 => isnum := slv(to_unsigned(  0,7));
    when s_001 => isnum := slv(to_unsigned(  1,7));
    ... snip
    when s_118 => isnum := slv(to_unsigned(118,7));
    when s_119 => isnum := slv(to_unsigned(119,7));
    when others => isnum := "1111111";
  end case;
  SNUM <= isnum;
end process proc_snum;
This works fine with ISE, but in Vivado it interfered with the FSM extraction logic. The FSM was recognized, but then not re-encoded as one_hot. See this thread for the full story. The problem occurs only in the full design; a simplified reproducer works as expected (see this posting).
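For reference, a minimal sketch of how a one_hot encoding can be requested explicitly in Vivado with the fsm_encoding synthesis attribute. The entity, the three-state type and the port names are hypothetical and only there to make the fragment complete. The thread does not say whether the original design sets this attribute or relies on the tool's automatic choice, and it is not known whether it would change the behaviour described above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity enc_demo is
  port (
    CLK  : in  std_logic;
    DONE : out std_logic
  );
end entity enc_demo;

architecture syn of enc_demo is
  -- hypothetical 3-state machine, stands in for the real state type
  type state_type is (s_a, s_b, s_c);
  signal R_STATE : state_type := s_a;

  -- Vivado synthesis attribute: request one-hot encoding for R_STATE
  attribute fsm_encoding : string;
  attribute fsm_encoding of R_STATE : signal is "one_hot";
begin

  proc_state : process (CLK)
  begin
    if rising_edge(CLK) then
      case R_STATE is
        when s_a => R_STATE <= s_b;
        when s_b => R_STATE <= s_c;
        when s_c => R_STATE <= s_a;
      end case;
    end if;
  end process proc_state;

  DONE <= '1' when R_STATE = s_c else '0';

end architecture syn;
```

The encoding that synthesis actually chose is listed in the FSM section of the synthesis log, so that is the place to check the result.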
So I tried a different way to express the state number generation. Since the state variable is an enumerated type one can use the 'pos attribute to determine the serial number of the state, like
proc_snum : process (R_STATE)
begin
  SNUM <= slv(to_unsigned(state_type'pos(R_STATE),7));
end process proc_snum;
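A self-contained sketch of the 'pos approach, for anyone who wants to try it without the attached project. The entity name, the four-state type and the ports are made up for illustration; std_logic_vector is used directly in place of the slv/slv7 shorthands from the snippets above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity snum_demo is
  port (
    CLK   : in  std_logic;
    RESET : in  std_logic;
    GO    : in  std_logic;
    SNUM  : out std_logic_vector(1 downto 0)
  );
end entity snum_demo;

architecture syn of snum_demo is
  -- hypothetical 4-state type; the real design has many more states
  type state_type is (s_idle, s_fetch, s_exec, s_done);
  signal R_STATE : state_type := s_idle;
begin

  proc_state : process (CLK)
  begin
    if rising_edge(CLK) then
      if RESET = '1' then
        R_STATE <= s_idle;
      else
        case R_STATE is
          when s_idle  =>
            if GO = '1' then
              R_STATE <= s_fetch;
            end if;
          when s_fetch => R_STATE <= s_exec;
          when s_exec  => R_STATE <= s_done;
          when s_done  => R_STATE <= s_idle;
        end case;
      end if;
    end if;
  end process proc_state;

  -- state_type'pos gives the declaration position of the current state:
  -- s_idle = 0, s_fetch = 1, s_exec = 2, s_done = 3
  SNUM <= std_logic_vector(to_unsigned(state_type'pos(R_STATE), SNUM'length));

end architecture syn;
```

The state number follows the declaration order of the enumeration literals, so it stays predictable regardless of the encoding the synthesis tool picks for the state register.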
This is well defined in VHDL; the question was whether it is synthesizable. It turns out it is. The code of a simple 'proof-of-principle' design is attached as a ready-to-use Vivado 2016.1 project as big2p_snum2.tar. The synthesized FSM is one_hot encoded. The SNUM generation logic seems OK. SNUM(0) for example, the least significant bit, should be '1' for all odd numbered states. A look at the schematic shows that the logic behind it is indeed an OR of all the odd numbered state flops (see snapshot20.png).
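The expected logic can be written down explicitly as a one-hot to binary encoder: output bit b is the OR of all state flops whose number has bit b set. The package below is only an illustration of that structure. The names onehot_enc and onehot2bin are made up, and with an enumerated state type the one-hot flops are not visible in the source, so this is not a drop-in replacement for the 'pos expression.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package onehot_enc is
  -- Encode a one-hot vector (exactly one bit set) as a binary number.
  -- Output bit b is the OR of all input bits whose index has bit b set.
  function onehot2bin (onehot : std_logic_vector; nbits : positive)
    return std_logic_vector;
end package onehot_enc;

package body onehot_enc is

  function onehot2bin (onehot : std_logic_vector; nbits : positive)
    return std_logic_vector is
    -- normalize the index range to (length-1 downto 0)
    alias    oh  : std_logic_vector(onehot'length-1 downto 0) is onehot;
    variable res : std_logic_vector(nbits-1 downto 0) := (others => '0');
  begin
    for i in oh'range loop
      for b in res'range loop
        -- bit b of the index i is set: flop i contributes to output bit b
        if (i / 2**b) mod 2 = 1 then
          res(b) := res(b) or oh(i);
        end if;
      end loop;
    end loop;
    return res;
  end function onehot2bin;

end package body onehot_enc;
```

For bit 0 the inner condition selects exactly the odd indices, which matches the wide OR seen in the schematic for SNUM(0).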
So I've added this simple logic to the original code. Again the code is attached, see example_snum2.tar. It synthesizes, and the encoding is one_hot. However, this time the state number generation logic is done differently. The output DM_STAT_SE[snum] is now connected to R_STATE_reg (see snapshot21.png). It turns out that two sets of state registers are generated in parallel:
R_STATE_reg with a binary encoding
FSM_onehot_R_STATE_reg with one_hot encoding
and that the whole transition logic is created twice.
The logic driving R_STATE_reg is very deep (see snapshot22.png), which is what one gets with a binary encoding, and exactly what one wants to avoid with a one_hot encoding.
Why synthesis makes such a seemingly crazy decision is hard to understand. The wide OR needed and generated for big2p_snum2.tar is only three logic levels deep.
Bottom line is that I'm still looking for a good way to generate such a state number in Vivado.
Any help or hint very welcome.
With best regards, Walter
05-30-2016 01:12 AM
I can't really offer advice (at this stage) but I am curious as to why you have generated such a massive state machine? Do you really need 113 states in a single machine?
When I did my electronic engineering degree, without studying HDLs, FSMs were naturally implemented in physical hardware - producing such a big machine was almost impossible to hand-wire into breadboards so complex logic would be broken down into smaller FSMs with single bit handshaking between two or more machines to control the flow of logic.
I can't comment on how XST or Vivado truly extract FSMs but with massive machines, possibly including complex logic or many, many transitions, I can understand how the tools don't quite produce the results you want.
06-05-2016 07:27 AM
The state machine is the core of a historic CISC CPU. Using a fairly large FSM is not uncommon for this; you'll find quite a few projects on OpenCores which follow this approach.
I raise this in the forum to point out that such a seemingly simple task creates havoc with the FSM extraction and optimization logic of Vivado:
In simple reproducer cases the result is OK, in the real case it is not.
06-06-2016 11:54 PM
To use a fairly large FSM is not uncommon for this, you'll find quite a few projects in opencores which follow this approach.
That doesn't mean that it is good or even good practice to do that. I'm sure there are some fine OpenCores projects about, but I've also seen some downright dodgy ones, too. I will accept that finding this approach in several places carries some weight (unless they were written by the same engineer).
Anyway, I digress.
I see you wrote that it extracts and enumerates correctly in ISE (any particular version?) but not Vivado. I have never used Vivado in anger so my contribution ends here, I'm afraid (short of reiterating my suggestion of simplification).
I await some input from Xilinx, though.
06-12-2016 09:38 AM
The cases I referred to were FPGA implementations of historic CISC CPUs, done by different people. The original implementations of these CPUs typically used microcode executed by a sequencer which controlled the data path logic. That's just another way to say that a state machine was used. Simply re-building these old designs 1-to-1 on an FPGA is not a good solution. There are legal issues (architecture patents expire, the copyright on the microcode doesn't), and technical issues (concepts like multi-phase clocks or clock cycle stretching don't map well to FPGAs). So for cases which had a modest number of micro-states the most straightforward solution is a 'case..when' FSM. For more complex cases (the cut is most likely at about 100 states) it's better or even necessary to build a micro-engine and micro-code, which however results in quite a bit more tooling and implementation effort.
So much for why larger FSMs sometimes pop up; there may be other good reasons too.
If somebody has input for the original question I'd truly appreciate this.
With best regards, Walter