Agner
Observer
412 Views
Registered: ‎05-08-2020

Cannot prevent distributed RAM

I am making a softcore with 64 kB of RAM. Vivado is using distributed RAM for this rather than block RAM, which is rather wasteful. I am getting this warning:

SYNTH #1 Warning The instance data_cache_inst/dataram_reg_0_255_0_0 is implemented as distributed LUT RAM for the following reason: The timing constraints suggest that the chosen mapping will yield a better timing.

This constraint is not helping:

 

set_property RAM_STYLE BLOCK [get_cells -hierarchical dataram*]

 

Lowering the clock frequency does not help either.

The device is an Artix-7 100T on a Nexys board.

Code for the RAM module:

`include "defines.vh"


// read/write data cache, 2**`DATA_ADDR_WIDTH bytes
// organized as 2**(`DATA_ADDR_WIDTH-3) lines of 64 bits, e.g. 8192*64 bits = 64kB for `DATA_ADDR_WIDTH = 16
module data_cache (
    input clock, // clock
    input clock_enable,                 // clock enable. Used when single-stepping
    input [`COMMON_ADDR_WIDTH-1:0] read_write_addr, // Address for reading and writing from/to ram
    // The lower 3 bits of read_write_addr indicate a byte within an 8 bytes line
    input read_enable,  // read enable
    input [1:0] read_data_size,  // 8, 16, 32, or 64 bits read
    input [7:0] write_enable,            // write enable for each byte separately 
    input [63:0] write_data_in, // Data in. Always 64 bits. Any part of the write bus can be used when the data size is less than 64 bits
    output reg [`RB1:0] read_data_out // Data out
);

//`define DATA_ADDR_WIDTH 16

// read/write data ram
reg [63:0] dataram [0:(2**(`DATA_ADDR_WIDTH-3))-1]; // 64kB RAM

// split read/write address into double-word index, and byte index
logic [`DATA_ADDR_WIDTH-4:0] address_hi; 
logic [2:0] address_lo; 
logic address_valid;

always_comb begin
    address_hi = read_write_addr[`DATA_ADDR_WIDTH-1:3]; // index to 64-bit lines
    address_lo = read_write_addr[2:0];                  // index to byte within line
    address_valid = read_write_addr[`COMMON_ADDR_WIDTH-1:`DATA_ADDR_WIDTH] == 0; // exclude code addresses
end 


// Data write:
/* This version gives problems addressing the ram in the constraints file:
genvar i; // byte index
generate 
for (i = 0; i < 8; i++) begin
    always_ff @(posedge clock) if (clock_enable & address_valid) begin
        // write data to RAM. Each byte enabled separately
        if (write_enable[i]) begin
            dataram[address_hi][(i*8)+:8] <= write_data_in[(i*8)+:8];
        end    
    end
end
endgenerate*/

always_ff @(posedge clock) if (clock_enable & address_valid) begin
    // write data to RAM. Each byte enabled separately
    if (write_enable[0]) dataram[address_hi][ 7: 0] <= write_data_in[ 7: 0];
    if (write_enable[1]) dataram[address_hi][15: 8] <= write_data_in[15: 8];
    if (write_enable[2]) dataram[address_hi][23:16] <= write_data_in[23:16];
    if (write_enable[3]) dataram[address_hi][31:24] <= write_data_in[31:24];
    if (write_enable[4]) dataram[address_hi][39:32] <= write_data_in[39:32];
    if (write_enable[5]) dataram[address_hi][47:40] <= write_data_in[47:40];
    if (write_enable[6]) dataram[address_hi][55:48] <= write_data_in[55:48];
    if (write_enable[7]) dataram[address_hi][63:56] <= write_data_in[63:56];
end


// data read. Must have natural alignment
always_ff @(posedge clock) if (clock_enable & address_valid) begin    
    if (read_enable) begin
        if (read_data_size == 0) begin // 8 bits
            case (address_lo)
            0: read_data_out <= dataram[address_hi][7:0];
            1: read_data_out <= dataram[address_hi][15:8];
            2: read_data_out <= dataram[address_hi][23:16];
            3: read_data_out <= dataram[address_hi][31:24];
            4: read_data_out <= dataram[address_hi][39:32];
            5: read_data_out <= dataram[address_hi][47:40];
            6: read_data_out <= dataram[address_hi][55:48];
            7: read_data_out <= dataram[address_hi][63:56];
            endcase
        end else if (read_data_size == 1) begin // 16 bits
            case (address_lo[2:1])
            0: read_data_out <= dataram[address_hi][15:0];
            1: read_data_out <= dataram[address_hi][31:16];
            2: read_data_out <= dataram[address_hi][47:32];
            3: read_data_out <= dataram[address_hi][63:48];
            endcase        
        end else if (read_data_size == 2) begin // 32 bits
            case (address_lo[2])
            0: read_data_out <= dataram[address_hi][31:0];
            1: read_data_out <= dataram[address_hi][63:32];
            endcase        
        end else begin // 64 bits
            read_data_out <= dataram[address_hi];
        end
    end
end

endmodule
1 Solution

Accepted Solutions
seamusbleu
Voyager
365 Views
Registered: ‎08-12-2008

Follow the Vivado inference template. My guess is that you've described a RAM with a single address (thus a single-port RAM) but haven't described how to handle read versus write selection, and that may be what is causing your problem.

Tools->language templates->verilog->synthesis constructs->coding examples->ram->blockram->single port->byte-wide write->(read or write) first mode
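For readers without Vivado at hand, the following is a minimal sketch of what a single-port RAM with byte-wide write enables in read-first mode typically looks like. It is not copied from the template: the module name, port names, and parameters are all made up for illustration, and whether a given Vivado version maps it to block RAM is an assumption to check in the synthesis report.

```systemverilog
// Sketch only: single-port RAM, byte-wide write enables, read-first.
// All names here are hypothetical, not taken from the Vivado template.
module bytewrite_sp_ram_sketch #(
    parameter int NUM_COL    = 8,   // byte lanes per word
    parameter int COL_WIDTH  = 8,   // bits per lane
    parameter int ADDR_WIDTH = 13   // 8192 words of 64 bits = 64 kB
) (
    input  logic                         clk,
    input  logic                         en,    // port enable
    input  logic [NUM_COL-1:0]           we,    // one write enable per lane
    input  logic [ADDR_WIDTH-1:0]        addr,  // single address for read and write
    input  logic [NUM_COL*COL_WIDTH-1:0] din,
    output logic [NUM_COL*COL_WIDTH-1:0] dout
);
    logic [NUM_COL*COL_WIDTH-1:0] mem [0:(2**ADDR_WIDTH)-1];

    always_ff @(posedge clk) begin
        if (en) begin
            // Write only the lanes whose enable bit is set
            for (int i = 0; i < NUM_COL; i++) begin
                if (we[i])
                    mem[addr][i*COL_WIDTH +: COL_WIDTH] <= din[i*COL_WIDTH +: COL_WIDTH];
            end
            // Read-first: nonblocking assignments mean dout receives
            // the contents of the word before this cycle's write
            dout <= mem[addr];
        end
    end
endmodule
```

The point of the pattern is that the full word is read into a plain register under the same enable as the write, with no further logic between the memory array and that register.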


4 Replies
bruce_karaffa
Scholar
390 Views
Registered: ‎06-21-2017

I think that this cannot be mapped to BRAM because of the way the code is written, specifically the way clock enable and address valid are used in the code.  One way to find out if this can be mapped to BRAM is to instantiate the BRAM and write the code to use the instantiated primitives.  If you can't do this, the synthesizer can't either.

dgisselq
Scholar
379 Views
Registered: ‎05-21-2015

@Agner,

You should be fine if you just adjust your read logic:

always_ff @(posedge clock)
if (clock_enable && address_valid && read_enable)
   read_data_out <= dataram[address_hi];

Your big nested if statement might be what's causing the problem.

I do remember needing, in the past, to break RAM with byte enables into separate RAMs in order to get the tool to support it.  I can't remember if Vivado was one of those tools.  I thought Vivado did better than that, but I'm not sure if I tried it against a 64-bit word vs a 32-bit word.
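The workaround of splitting a byte-enabled RAM into separate RAMs could look roughly like the sketch below, which builds a 64-bit word out of eight independent 8-bit memories. The module and signal names are hypothetical, and nothing in this thread confirms that Vivado needs this form for a 64-bit word; it only illustrates the idea.

```systemverilog
// Sketch only: one 8-bit-wide RAM per byte lane instead of a single
// 64-bit RAM with byte enables. All names are hypothetical.
module byte_lane_ram_sketch #(
    parameter int ADDR_WIDTH = 13   // 8192 words of 64 bits = 64 kB
) (
    input  logic                  clk,
    input  logic                  en,    // port enable
    input  logic [7:0]            we,    // one write enable per byte lane
    input  logic [ADDR_WIDTH-1:0] addr,
    input  logic [63:0]           din,
    output logic [63:0]           dout
);
    for (genvar i = 0; i < 8; i++) begin : lane
        // Each lane has its own memory array and a single write enable
        logic [7:0] mem [0:(2**ADDR_WIDTH)-1];

        always_ff @(posedge clk) if (en) begin
            if (we[i]) mem[addr] <= din[i*8 +: 8];
            dout[i*8 +: 8] <= mem[addr];   // read-first within the lane
        end
    end
endmodule
```

Each lane then looks to the synthesizer like an ordinary RAM with one write enable, which is the simplest case for inference.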

Dan

Agner
Observer
268 Views
Registered: ‎05-08-2020

Thank you for your help. It turned out that the problem was the multiplexer (case (...)) before the output register (read_data_out). Putting the multiplexer after the output register solved the problem.
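The final code was not posted, so the following is only a sketch of what moving the multiplexer behind the output register might look like. It assumes a 64-bit output and a standalone module; the names line_q, offset_q, size_q and the module name are invented for illustration. The whole 64-bit line is registered first, together with the byte offset and read size, and the selection is done combinationally afterwards.

```systemverilog
// Sketch only: register the full 64-bit line, then select bytes after
// the register. Names ending in _q and the module name are hypothetical.
module data_cache_mux_after_reg_sketch #(
    parameter int ADDR_WIDTH = 16   // byte address bits: 2**16 bytes = 64 kB
) (
    input  logic                  clock,
    input  logic                  clock_enable,
    input  logic [ADDR_WIDTH-1:0] addr,            // byte address
    input  logic                  read_enable,
    input  logic [1:0]            read_data_size,  // 0: 8, 1: 16, 2: 32, 3: 64 bits
    input  logic [7:0]            write_enable,    // one enable per byte
    input  logic [63:0]           write_data_in,
    output logic [63:0]           read_data_out
);
    logic [63:0] dataram [0:(2**(ADDR_WIDTH-3))-1];
    logic [63:0] line_q;     // registered RAM output, no mux in front of it
    logic [2:0]  offset_q;   // byte offset, delayed to match line_q
    logic [1:0]  size_q;     // read size, delayed to match line_q

    always_ff @(posedge clock) if (clock_enable) begin
        for (int i = 0; i < 8; i++)
            if (write_enable[i])
                dataram[addr[ADDR_WIDTH-1:3]][i*8 +: 8] <= write_data_in[i*8 +: 8];
        if (read_enable) begin
            line_q   <= dataram[addr[ADDR_WIDTH-1:3]];
            offset_q <= addr[2:0];
            size_q   <= read_data_size;
        end
    end

    // Multiplexer after the output register. Narrow results are
    // zero-extended by the assignment. Reads must be naturally aligned.
    always_comb begin
        case (size_q)
        2'd0:    read_data_out = line_q[{offset_q, 3'b000} +: 8];
        2'd1:    read_data_out = line_q[{offset_q[2:1], 4'b0000} +: 16];
        2'd2:    read_data_out = line_q[{offset_q[2], 5'b00000} +: 32];
        default: read_data_out = line_q;
        endcase
    end
endmodule
```

The read latency in clock cycles is the same as before, but the multiplexer delay now falls after the register, so the consumer of read_data_out sees a longer combinational path.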
