jkirwan
Visitor
Registered: ‎08-11-2020

Verilog semantics question -- Is there a semantic difference between these two "equivalent" ways of writing the double-dabble algorithm?

I'm only just starting to teach myself Verilog. (I'm a hobbyist. Nothing more.) I had lots of experience wire-wrapping CPUs from old 7400 series parts around 1974 or so, but only recently have I become interested in revisiting those times with FPGAs. I just bought a Digilent board for this purpose and I'm only now becoming familiar with the WebPACK version of Vivado. Verilog is entirely new to me. (I've been studying it for only a few hours so far.)

I've had some good success already, and the code I've written works fine. (I've tested it and observed the behavior on the Digilent board, which does what I expected and wanted.) But I have a question that I've boiled down to a simple pair of different ways of expressing the same thing (in my mind, anyway), and I'd like to know if there is an important semantic difference. (The Implementation's "Report Utilization" tells me there's no difference in how it gets implemented on the XC7A35T device, so I can't tell from that report.)

I'm implementing the add-3 adjustment step of the "double-dabble" algorithm here in two different ways. Both synthesize okay and I can run the result on the FPGA board I have with equal results. Here are the two pieces of code to compare:

module doubledabble ( binin, adj3out );

   input binin;
   output adj3out;

   wire [3:0] binin;
   reg [3:0] adj3out;

   always @(*) begin
      adj3out = (binin > 4 ? binin + 3 : binin );
   end

endmodule

and then,

module doubledabble ( binin, adj3out );

   input binin;
   output adj3out;

   wire [3:0] binin;
   wire [3:0] adj3out;
   
   assign adj3out = (binin > 4 ? binin + 3 : binin );

endmodule

Both appear to me to work equally well and produce the exact same results when looking at the number of LUTs and slices used when I combine this into a more complex module. But I'd like to know if there is any semantic difference, broadly speaking.
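For readers who want a software reference to check either version against, here is a minimal Python sketch of the value both modules compute. The names `adj3` and `table` are made up for this illustration; the 4-bit mask reflects the sum being truncated when it is assigned to the 4-bit `adj3out`.

```python
def adj3(binin):
    """Behavioral model of the doubledabble cell: add 3 if the 4-bit input is > 4."""
    binin &= 0xF                  # the input port is 4 bits wide
    if binin > 4:
        return (binin + 3) & 0xF  # the sum is truncated to the 4-bit output
    return binin

# Full truth table for the 16 possible inputs, handy for comparing with a simulation.
table = [adj3(n) for n in range(16)]
```

Inputs 0 through 4 pass through unchanged and 5 through 9 map to 8 through 12. Inputs above 9 never occur in a valid BCD digit, so their wrapped results do not matter in practice.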

Thanks in advance for any constructive thoughts,

Jon

2 Replies
steven_bellock
Contributor
Registered: ‎10-25-2018

For synthesis those two examples are equivalent, and for simulation they are 99.99 % equivalent, differing mainly in how the value at adj3out gets scheduled.

As a side note, since 2001 Verilog has supported ANSI C-style port declarations.

module doubledabble (input wire [3:0] binin, output reg [3:0] adj3out);

jkirwan


@steven_bellock 

Thanks so much for the quick and direct answer. I was hoping my developing mental models weren't too far afield and you've helped provide me with just a little more confidence in what I think I'm learning, right now. I'm only just developing mental concepts for the language and even tiny misunderstandings now could cause me a great deal of wasted re-education time later. So it really is very much appreciated!

I have one additional question that now comes to mind, now that you've helped confirm some thoughts I had.

I've had an experience I cannot well fathom in terms of how the design is mapped into the slices. The results of the test make little sense to me. (In both cases, I'm using the WebPACK -- free -- version.) The following helps to illustrate my quandary.

Suppose I implement the same behavioral result with the following snippet. (It's in the same module format, which if you request I'd be happy to include, but I just wanted to narrow the focus and hope you'll trust me enough that the surrounding code is equivalent.)

   repeat(13) begin
      if ( z[19:16] > 4 ) z[19:16] = z[19:16] + 3;
      if ( z[23:20] > 4 ) z[23:20] = z[23:20] + 3;
      if ( z[27:24] > 4 ) z[27:24] = z[27:24] + 3;
      if ( z[31:28] > 4 ) z[31:28] = z[31:28] + 3;
      if ( z[35:32] > 4 ) z[35:32] = z[35:32] + 3;
      z[35:1] = z[34:0]; // shift by 1
   end

Now, in the above case there are a few things of note. One is that I'm not referencing the doubledabble module. Instead, I'm explicitly identifying the specific bit fields involved. Another is that, in effect, I'm instantiating doubledabble five times each loop, times thirteen loops, for a total of 65 instantiations of that "mux + adder + comparator" group. There's also a shift operation here, worth noting for the moment as well. This achieves exactly the same "work to be done" as what I wanted before. (The bit array is somewhat larger, and the input source is positioned within it before the above loop starts. But while the output bits are also "elsewhere", it's the same total algorithm in the end.)
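The loop can also be modelled in software to confirm what it should produce. The sketch below is only an illustration under stated assumptions: the post does not say where the input sits in `z`, so the model assumes a 36-bit `z` with the 16-bit input pre-loaded at `z[18:3]` (three shifts already done, since the first three shifts can never need an adjustment) and the five BCD digits read back from `z[35:16]`. The function name `bin16_to_bcd` is hypothetical.

```python
MASK36 = (1 << 36) - 1

def bin16_to_bcd(value):
    """Model of the repeat(13) loop: adjust five BCD digits, then shift left by 1."""
    z = (value & 0xFFFF) << 3          # ASSUMPTION: input pre-positioned at z[18:3]
    for _ in range(13):
        # The five "if ( z[hi:lo] > 4 ) z[hi:lo] = z[hi:lo] + 3;" statements.
        for lo in (16, 20, 24, 28, 32):
            digit = (z >> lo) & 0xF
            if digit > 4:
                z = (z & ~(0xF << lo)) | (((digit + 3) & 0xF) << lo)
        # z[35:1] = z[34:0]; bit 0 keeps its old value, as in the Verilog.
        z = ((z << 1) & MASK36) | (z & 1)
    return z >> 16                     # ASSUMPTION: BCD result sits in z[35:16]

# Printing the result in hex shows the decimal digits directly.
print(hex(bin16_to_bcd(1974)))
```

Because each BCD digit occupies one hex nibble, the hex form of the return value reads the same as the decimal form of the input.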

In the following case, I write:

   genvar i;
      
   assign tmpout[0]= binin[0];
   doubledabble ( .binin( { 1'b0, binin[15:13] } ),
                  .adj3out( tmp[11] ) );
   generate for ( i= 11; i > 0; i= i - 1 ) begin
      doubledabble ( .binin( { tmp[i][2:0], binin[i+1] } ),
                     .adj3out( tmp[i-1] ) );      
   end
   endgenerate
   doubledabble ( .binin( { tmp[0][2:0], binin[1] } ),
                  .adj3out( tmpout[4:1] ) );
   doubledabble ( .binin( { 1'b0, tmp[11][3], tmp[10][3], tmp[9][3] } ),
                  .adj3out( tmp[20] ) );      
   generate for ( i= 20; i > 12; i= i - 1 ) begin
      doubledabble ( .binin( { tmp[i][2:0], tmp[i-12][3] } ),
                     .adj3out( tmp[i-1] ) );      
   end
   endgenerate
   doubledabble ( .binin( { tmp[12][2:0], tmp[0][3] } ),
                  .adj3out( tmpout[8:5] ) );
   doubledabble ( .binin( { 1'b0, tmp[20][3], tmp[19][3], tmp[18][3] } ),
                  .adj3out( tmp[26] ) );      
   generate for ( i= 26; i > 21; i = i - 1 ) begin
      doubledabble ( .binin( { tmp[i][2:0], tmp[i-9][3] } ),
                     .adj3out( tmp[i-1] ) );      
   end
   endgenerate
   doubledabble ( .binin( { tmp[21][2:0], tmp[12][3] } ),
                  .adj3out( tmpout[12:9] ) );
   doubledabble ( .binin( { tmp[26][3], tmp[25][3], tmp[24][3], tmp[23][3] } ),
                  .adj3out( tmp[27] ) );
   doubledabble ( .binin( { tmp[27][2:0], tmp[22][3] } ),
                  .adj3out( tmp[28] ) );
   doubledabble ( .binin( { tmp[28][2:0], tmp[21][3] } ),
                  .adj3out( tmpout[16:13] ) );
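   // The earlier carry is the more significant bit of the top digit,
   // matching the MSB-first order used for the other columns.
   assign tmpout[18]= tmp[27][3];
   assign tmpout[17]= tmp[28][3];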
 
   always @(*) begin
      bcdout= tmpout;
   end

The above achieves the same result. Both work fine and I have no problems, either way, in terms of the implemented and downloaded result onto the board. (The layout and floorplanning is different, however, and the center of my question, perhaps.)
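The cell network can be traced in software the same way. The following is a rough Python sketch, not a definitive translation: it mirrors the `tmp[]` wiring of the listing one instance at a time, and it takes the earlier carry `tmp[27][3]` as the more significant of the two top bits (bit 18 of the output), which is the ordering that gives the correct ten-thousands digit. The names `adj3`, `msb`, `low3`, and `bcd_cell_array` are invented for the illustration.

```python
def adj3(n):
    """One doubledabble cell on a 4-bit value."""
    return (n + 3) & 0xF if n > 4 else n

def msb(x):
    return (x >> 3) & 1      # models tmp[i][3]

def low3(x):
    return x & 0x7           # models tmp[i][2:0]

def bcd_cell_array(value):
    b = [(value >> i) & 1 for i in range(16)]    # b[i] models binin[i]
    tmp = [0] * 29
    # Ones column: 13 cells.
    tmp[11] = adj3((b[15] << 2) | (b[14] << 1) | b[13])
    for i in range(11, 0, -1):
        tmp[i - 1] = adj3((low3(tmp[i]) << 1) | b[i + 1])
    ones = adj3((low3(tmp[0]) << 1) | b[1])
    # Tens column: 10 cells fed by the carries of the ones column.
    tmp[20] = adj3((msb(tmp[11]) << 2) | (msb(tmp[10]) << 1) | msb(tmp[9]))
    for i in range(20, 12, -1):
        tmp[i - 1] = adj3((low3(tmp[i]) << 1) | msb(tmp[i - 12]))
    tens = adj3((low3(tmp[12]) << 1) | msb(tmp[0]))
    # Hundreds column: 7 cells.
    tmp[26] = adj3((msb(tmp[20]) << 2) | (msb(tmp[19]) << 1) | msb(tmp[18]))
    for i in range(26, 21, -1):
        tmp[i - 1] = adj3((low3(tmp[i]) << 1) | msb(tmp[i - 9]))
    hundreds = adj3((low3(tmp[21]) << 1) | msb(tmp[12]))
    # Thousands column: 3 cells, the first taking four carries directly.
    tmp[27] = adj3((msb(tmp[26]) << 3) | (msb(tmp[25]) << 2)
                   | (msb(tmp[24]) << 1) | msb(tmp[23]))
    tmp[28] = adj3((low3(tmp[27]) << 1) | msb(tmp[22]))
    thousands = adj3((low3(tmp[28]) << 1) | msb(tmp[21]))
    # Assemble the 19-bit result (ASSUMPTION: tmp[27][3] is bit 18, tmp[28][3] is bit 17).
    return (b[0] | (ones << 1) | (tens << 5) | (hundreds << 9)
            | (thousands << 13) | (msb(tmp[28]) << 17) | (msb(tmp[27]) << 18))
```

Values in the 20000 to 59999 range are the useful test cases here, because they are the ones where the two top carries differ from each other.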

The following two pictures come from the synthesis results. I used the RTL Analysis "Open Elaborated Design" view to pull out the information shown. The first image corresponds to the "repeat(13)" method, and while it appears to me that it might take up more resources, it actually takes up fewer. The second image corresponds to the "generate" method. It also comes from the RTL Analysis elaborated design display, using the latter code above. Same behavior, two different ways of writing Verilog. The schematic resulting from the second case even looks simpler and uses far fewer adds, compares, and muxes. Yet it takes more resources.

Any thoughts about why? (I'm just learning, so I'm sure it will be obvious once I get a better handle on the slices, their limitations and strengths. But I'm struggling through this, right now, and would love a "boost up" on it.)

Finally, I'm including a last diagram which shows yet another way to implement the double-dabble algorithm in simple combinatorial logic using 7400 series SSI devices. (The "HA" shown there is just a half-adder using two gates, an XOR and an AND, of course.)

Thanks in advance. If you think I should ask this separately, I'll gladly do that. I'm just hoping you might have an immediate clue for me, is all.

Thanks, again

Jon

Attachments: xilinx 011.png, xilinx 012.png, EESE432.png