03-17-2021 06:20 PM
I am trying to understand distributed RAM inference in the UltraScale architecture. I wrote basic RTL for a distributed RAM; the main part of the code is below:
    localparam WIDTH = 8;
    localparam DEPTH = 64;

    (* ram_style = "distributed" *) reg [WIDTH-1:0] mem [0:DEPTH-1];

    always @(posedge clk) begin
        if (wen) mem[waddr] <= wdata;
    end

    assign rdata = mem[raddr];
Here, waddr and raddr are 6-bit wide and wdata and rdata are 8-bit wide.
I was expecting Vivado (2019.2) to synthesize this code as one RAM64M8 that maps to one SLICEM. However, the synthesis schematic shows TWO RAM64M8 instances in the design. Surprisingly, the output of the H-LUT in one of the SLICEMs is unconnected.
I assume there may be a reason why Vivado does it this way. Is it because I used separate addresses for write and read (especially the H-LUT)?
If I change the read data assignment to `assign rdata = mem[waddr]`, I get 8 RAM64X1S instances after synthesis. That looks okay, but I don't understand why I don't get one RAM64M8. Maybe it's because I am using only one read port out of the 8 ports? I don't know.
I am hoping to get some clarity on this, thanks.
03-17-2021 07:44 PM
First, the RAM64M8 is just a wrapper - there is no such thing on the die. You should expand these modules to see what is actually inside them.
A single LUT can be a 64x1 SINGLE PORT RAM. When you have "assign rdata=mem[waddr]" you are synthesizing a single ported RAM (only one address), and hence it should map to one RAM64X1S per bit - so the resources of that are correct - 8 LUTs.
When you use "assign rdata=mem[raddr]", now you are asking for a DUAL PORT RAM (two addresses). For each 64x1 dual port RAM you need two LUTs - not one.
I suspect this is what you are seeing, but it is obscured by the RAM64M8. If you open that module, you should find 8 RAM64X1S (or equivalent) instances in each one, giving the 16 LUTs you would expect for a 64x8 dual-port RAM.
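To make the single-port versus dual-port distinction concrete, here is a minimal sketch of the two cases as complete modules. The module names (`dist_ram_sp`, `dist_ram_dp`), the `AW` parameter, and the port lists are made up for illustration; only the memory declaration, the write process, and the read assignment come from the original post. How a given Vivado version maps each one may differ, so check the expanded schematic rather than taking this as definitive.

```verilog
// Single-port case: one address drives both the synchronous write and the
// asynchronous read. Names are illustrative, not from the original design.
module dist_ram_sp #(
    parameter WIDTH = 8,
    parameter DEPTH = 64,
    parameter AW    = 6              // address width, assumed log2(DEPTH)
) (
    input  wire             clk,
    input  wire             wen,
    input  wire [AW-1:0]    addr,
    input  wire [WIDTH-1:0] wdata,
    output wire [WIDTH-1:0] rdata
);
    (* ram_style = "distributed" *) reg [WIDTH-1:0] mem [0:DEPTH-1];

    always @(posedge clk) begin
        if (wen) mem[addr] <= wdata;
    end

    assign rdata = mem[addr];        // read shares the write address
endmodule

// Dual-port case: the read has its own address, independent of the write.
module dist_ram_dp #(
    parameter WIDTH = 8,
    parameter DEPTH = 64,
    parameter AW    = 6              // address width, assumed log2(DEPTH)
) (
    input  wire             clk,
    input  wire             wen,
    input  wire [AW-1:0]    waddr,
    input  wire [AW-1:0]    raddr,
    input  wire [WIDTH-1:0] wdata,
    output wire [WIDTH-1:0] rdata
);
    (* ram_style = "distributed" *) reg [WIDTH-1:0] mem [0:DEPTH-1];

    always @(posedge clk) begin
        if (wen) mem[waddr] <= wdata;
    end

    assign rdata = mem[raddr];       // second address makes this dual-port
endmodule
```

Synthesizing each module on its own and comparing the LUT counts in the utilization report is a quick way to check the 8 versus 16 LUT figures discussed above.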
03-18-2021 03:55 AM
Thanks Avrum. I expanded the modules and found that there were indeed 16 LUTs.
Also I think the same LUT configuration can support TWO asynchronous read ports? If I add the line `assign rdata2 = mem[waddr]` to the code in my original post, it still uses 16 LUTs. But now I have two read ports.
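A sketch of that variant as a complete module, assuming the same declarations as the original post. The module name `dist_ram_2r` and the port list are hypothetical; the two read assignments are the ones described above. The shape resembles the RAM64X1D primitive, which has one output read at the write address (SPO) and one read at a separate address (DPO).

```verilog
// One synchronous write port and two asynchronous read outputs:
// rdata  is read at the independent read address,
// rdata2 is read at the write address.
// Module and port names are illustrative only.
module dist_ram_2r #(
    parameter WIDTH = 8,
    parameter DEPTH = 64,
    parameter AW    = 6              // address width, assumed log2(DEPTH)
) (
    input  wire             clk,
    input  wire             wen,
    input  wire [AW-1:0]    waddr,
    input  wire [AW-1:0]    raddr,
    input  wire [WIDTH-1:0] wdata,
    output wire [WIDTH-1:0] rdata,
    output wire [WIDTH-1:0] rdata2
);
    (* ram_style = "distributed" *) reg [WIDTH-1:0] mem [0:DEPTH-1];

    always @(posedge clk) begin
        if (wen) mem[waddr] <= wdata;
    end

    assign rdata  = mem[raddr];      // read at the separate read address
    assign rdata2 = mem[waddr];      // read at the write address
endmodule
```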
Since SLICEM LUTs have separate read and write ports, I was expecting a SIMPLE dual-port RAM as mentioned in UG574 (page 26). However, that page also says "H-LUT is always effectively a single-port memory" no matter the configuration. Does that mean a SIMPLE dual-port configuration is not achievable using SLICEM LUTs?