chandu_sathi
Adventurer
663 Views
Registered: ‎05-08-2019

Issue with loading an input operand into SRL32 shift registers?

Hello all,

I am trying to implement a shift register using SRL32s on a VCU108, in Verilog with Vivado. However, the SRL32 does not seem to support loading an input operand into the shift register: with the parallel load in place, the design consumes many more LUTs. Without the load, starting from all-zero contents, it works fine and consumes very few LUTs. Please help me find an optimal way to load the input operand while consuming fewer LUTs. I have attached my code.

module shift_s(
data_ina,
load_s,
clk, 
shift_s, 
s_in, 
s_out
 );
 input [1023:0] data_ina;
 input load_s;
 input clk;
 input shift_s;
 input [31:0] s_in;
 output [31:0] s_out;

 reg [1023:0] s;

 always @ (posedge clk)
 begin
    if(load_s)
    begin
        s<=data_ina;

    end
    else if (shift_s)
    begin
         s<={s_in,s[1023:32]};
    end
 end
 assign s_out = s[31:0];
endmodule

 

2 Replies
steven_bellock
Contributor
573 Views
Registered: ‎10-25-2018

You have implemented priority logic for the load_s and shift_s signals: if both load_s and shift_s are asserted in the same cycle, load_s gets priority. If you can guarantee that the two signals are never asserted at the same time, you can use the SystemVerilog unique0 keyword, which, if Vivado does not have a bug, should reduce the LUT count a bit.

unique0 if (load_s) begin
   s <= data_ina;
end
else if (shift_s) begin
   s <= {s_in, s[1023:32]};
end
bitjockey
Adventurer
490 Views
Registered: ‎03-21-2011

First, it might be a little easier to read as a 2D array: this is really a 32-element-deep, dword-wide shift register. In hardware that is 32 parallel shift registers, each 32 bits long, one for each bit of s[31:0].
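To illustrate the 2D view, here is a minimal sketch of the original module rewritten as an array of 32 words. The module name shift_s_2d is made up for illustration; the ports and behavior are meant to match the posted code, including the parallel load, so it is not expected to reduce LUT usage on its own.

```verilog
// Sketch only: same behavior as the posted shift_s, written as a
// 32-deep array of 32-bit words. The name shift_s_2d is hypothetical.
module shift_s_2d (
    input  wire          clk,
    input  wire          load_s,
    input  wire          shift_s,
    input  wire [1023:0] data_ina,
    input  wire [31:0]   s_in,
    output wire [31:0]   s_out
);
    reg [31:0] s [0:31];   // s[0] is the word at the output end
    integer i;

    always @(posedge clk) begin
        if (load_s) begin
            // Parallel load: word i takes bits [32*i+31 : 32*i] of data_ina
            for (i = 0; i < 32; i = i + 1)
                s[i] <= data_ina[32*i +: 32];
        end else if (shift_s) begin
            // Move every word one step toward the output;
            // the new word enters at the far end
            for (i = 0; i < 31; i = i + 1)
                s[i] <= s[i+1];
            s[31] <= s_in;
        end
    end

    assign s_out = s[0];
endmodule
```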

However, the main problem is that an SRL32 cannot have all 32 of its bits loaded in a single cycle. You get random-access read via the A[4:0] address bits, but writing is serial (technically even during bitstream configuration, under the hood). You won't see a 32-bit bus going into the element in its description: https://www.xilinx.com/support/documentation/user_guides/ug574-ultrascale-clb.pdf (pp. 30-31)

32 SRL32s means you can set at most 32 of those bits per cycle (one per SRL), not all 1K. The reason the zero fill at configuration works is that the zeros are shifted in by the config logic as the bitstream is shifted into the chip. You could probably (if it's like RAMs?) have a non-zero default fill at boot, but that's it.

The other option, if you have 32 idle cycles, is to shift in the new fill before resuming normal operation at the output. You would need some output logic that knows to ignore the output while the register is only partially filled.
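A rough sketch of that serial-load idea, under some assumptions: the module name shift_s_serial and the fill_start and s_valid ports are invented for illustration, and whether Vivado actually maps this onto SRLs depends on the tool settings, so check the synthesis report. There is deliberately no reset or parallel load on the data array, since either one would force flip-flops.

```verilog
// Sketch only: serial fill with a "valid" flag. Names such as
// shift_s_serial, fill_start and s_valid are hypothetical.
module shift_s_serial (
    input  wire        clk,
    input  wire        fill_start, // pulse to mark the start of a new fill
    input  wire        shift_s,    // shift one 32-bit word per asserted cycle
    input  wire [31:0] s_in,
    output wire [31:0] s_out,
    output wire        s_valid     // high once 32 words have been shifted in
);
    reg [31:0] s [0:31];           // no reset, no parallel load
    reg [5:0]  fill_cnt = 6'd0;    // counts words shifted in since fill_start
    integer i;

    always @(posedge clk) begin
        if (shift_s) begin
            for (i = 0; i < 31; i = i + 1)
                s[i] <= s[i+1];
            s[31] <= s_in;
        end

        if (fill_start)
            fill_cnt <= 6'd0;              // contents are stale again
        else if (shift_s && fill_cnt != 6'd32)
            fill_cnt <= fill_cnt + 6'd1;   // saturate at 32
    end

    assign s_out   = s[0];
    assign s_valid = (fill_cnt == 6'd32);
endmodule
```

After 32 shifts following fill_start, the first word shifted in has reached s[0], so downstream logic can simply ignore s_out while s_valid is low.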

I don't know your full application. However, you could do this with 64 SRL32s and keep the output running continuously, provided you have 32 cycles available for loading beforehand. You ping-pong between an in-use "bank" of 32 SRLs and a "being pre-loaded" bank of 32, and when preloading is done you select the newly loaded bank and mux it to the output. That doubles your current logic, but it doesn't need 1K flip-flops and muxes.
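Something like the following is what I have in mind for the ping-pong arrangement. Treat it as a sketch: the module name and the load_shift, load_in and swap ports are my own invention, and it assumes the surrounding logic asserts swap only after 32 words have been shifted into the idle bank.

```verilog
// Sketch only: two banks of 32 x 32-bit shift registers. One bank drives
// the output while the other is filled serially. All names other than
// clk, shift_s, s_in and s_out are hypothetical.
module shift_s_pingpong (
    input  wire        clk,
    input  wire        load_shift, // shift one word into the idle bank
    input  wire [31:0] load_in,    // preload data for the idle bank
    input  wire        swap,       // exchange the active and idle banks
    input  wire        shift_s,    // shift the active bank
    input  wire [31:0] s_in,
    output wire [31:0] s_out
);
    reg [31:0] bank0 [0:31];
    reg [31:0] bank1 [0:31];
    reg        active = 1'b0;      // 0: bank0 drives s_out, 1: bank1 does
    integer i;

    // Each bank sees either the normal shift path or the preload path
    wire        en0  = active ? load_shift : shift_s;
    wire [31:0] din0 = active ? load_in    : s_in;
    wire        en1  = active ? shift_s    : load_shift;
    wire [31:0] din1 = active ? s_in       : load_in;

    always @(posedge clk) begin
        if (en0) begin
            for (i = 0; i < 31; i = i + 1)
                bank0[i] <= bank0[i+1];
            bank0[31] <= din0;
        end
        if (en1) begin
            for (i = 0; i < 31; i = i + 1)
                bank1[i] <= bank1[i+1];
            bank1[31] <= din1;
        end
        if (swap)
            active <= ~active;
    end

    assign s_out = active ? bank1[0] : bank0[0];
endmodule
```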
