07-11-2013 03:40 PM
Is it possible to instantiate a block memory directly from Verilog? Specifically, rather than creating a specific block RAM cell with IP Catalog or using TCL commands, can I just insert the "blk_mem_gen_v8_0" instantiation with all the correct parameters in my Verilog file? (see bottom of post for example)
It seems like this should be possible as I've tried instantiating the IP Catalog generated Verilog file which just has the "blk_mem_gen_v8_0" instantiation. However, when I do it from within my own code I get the error below:
[Synth 8-439] module 'blk_mem_gen_v8_0' not found
Thanks for your help!
blk_mem_gen_v8_0 #(
.C_FAMILY("kintex7"),
.C_XDEVICEFAMILY("kintex7"),
.C_ELABORATION_DIR("./"),
.C_INTERFACE_TYPE(0),
.C_AXI_TYPE(1),
.C_AXI_SLAVE_TYPE(0),
.C_HAS_AXI_ID(0),
.C_AXI_ID_WIDTH(4),
.C_MEM_TYPE(1),
.C_BYTE_SIZE(9),
.C_ALGORITHM(1),
.C_PRIM_TYPE(1),
.C_LOAD_INIT_FILE(0),
.C_INIT_FILE_NAME("no_coe_file_loaded"),
.C_INIT_FILE("dprf256X24.mem"),
.C_USE_DEFAULT_DATA(0),
.C_DEFAULT_DATA("0"),
.C_RST_TYPE("SYNC"),
.C_HAS_RSTA(0),
.C_RST_PRIORITY_A("CE"),
.C_RSTRAM_A(0),
.C_INITA_VAL("0"),
.C_HAS_ENA(0),
.C_HAS_REGCEA(0),
.C_USE_BYTE_WEA(0),
.C_WEA_WIDTH(1),
.C_WRITE_MODE_A("WRITE_FIRST"),
.C_WRITE_WIDTH_A(24),
.C_READ_WIDTH_A(24),
.C_WRITE_DEPTH_A(256),
.C_READ_DEPTH_A(256),
.C_ADDRA_WIDTH(8),
.C_HAS_RSTB(0),
.C_RST_PRIORITY_B("CE"),
.C_RSTRAM_B(0),
.C_INITB_VAL("0"),
.C_HAS_ENB(1),
.C_HAS_REGCEB(0),
.C_USE_BYTE_WEB(0),
.C_WEB_WIDTH(1),
.C_WRITE_MODE_B("WRITE_FIRST"),
.C_WRITE_WIDTH_B(24),
.C_READ_WIDTH_B(24),
.C_WRITE_DEPTH_B(256),
.C_READ_DEPTH_B(256),
.C_ADDRB_WIDTH(8),
.C_HAS_MEM_OUTPUT_REGS_A(0),
.C_HAS_MEM_OUTPUT_REGS_B(0),
.C_HAS_MUX_OUTPUT_REGS_A(0),
.C_HAS_MUX_OUTPUT_REGS_B(0),
.C_MUX_PIPELINE_STAGES(0),
.C_HAS_SOFTECC_INPUT_REGS_A(0),
.C_HAS_SOFTECC_OUTPUT_REGS_B(0),
.C_USE_SOFTECC(0),
.C_USE_ECC(0),
.C_HAS_INJECTERR(0),
.C_SIM_COLLISION_CHECK("ALL"),
.C_COMMON_CLK(1),
.C_ENABLE_32BIT_ADDRESS(0),
.C_DISABLE_WARN_BHV_COLL(0),
.C_DISABLE_WARN_BHV_RANGE(0),
.C_USE_BRAM_BLOCK(0)
) I_dprf256X240 (
.clka(clk),
.rsta(1'B0),
.ena(1'B0),
.regcea(1'B0),
.wea(wea),
.addra(wr_addr),
.dina(wr_data),
.douta(),
.clkb(clk),
.rstb(1'B0),
.enb(enb),
.regceb(1'B0),
.web(1'B0),
.addrb(rd_addr),
.dinb(24'B0),
.doutb(rd_data),
.injectsbiterr(1'B0),
.injectdbiterr(1'B0),
.sbiterr(),
.dbiterr(),
.rdaddrecc(),
.s_aclk(1'B0),
.s_aresetn(1'B0),
.s_axi_awid(4'B0),
.s_axi_awaddr(32'B0),
.s_axi_awlen(8'B0),
.s_axi_awsize(3'B0),
.s_axi_awburst(2'B0),
.s_axi_awvalid(1'B0),
.s_axi_awready(),
.s_axi_wdata(24'B0),
.s_axi_wstrb(1'B0),
.s_axi_wlast(1'B0),
.s_axi_wvalid(1'B0),
.s_axi_wready(),
.s_axi_bid(),
.s_axi_bresp(),
.s_axi_bvalid(),
.s_axi_bready(1'B0),
.s_axi_arid(4'B0),
.s_axi_araddr(32'B0),
.s_axi_arlen(8'B0),
.s_axi_arsize(3'B0),
.s_axi_arburst(2'B0),
.s_axi_arvalid(1'B0),
.s_axi_arready(),
.s_axi_rid(),
.s_axi_rdata(),
.s_axi_rresp(),
.s_axi_rlast(),
.s_axi_rvalid(),
.s_axi_rready(1'B0),
.s_axi_injectsbiterr(1'B0),
.s_axi_injectdbiterr(1'B0),
.s_axi_sbiterr(),
.s_axi_dbiterr(),
.s_axi_rdaddrecc()
);
07-11-2013 06:17 PM
It's probably easier to just infer block RAM if you don't want to use a CoreGen IP,
a macro, or a primitive. The code you see in the Verilog file generated by CoreGen is only
for simulation. If you look closely at the module, all of the internals (everything between
the last port and the endmodule statement) are within a translate_off pragma. The module
that gets instantiated with all those parameters is not synthesizable; it exists only
in a simulation library. For synthesis, CoreGen creates a structural model, synthesizes it,
and produces a .ngc file, which is what you end up with in your project. The .ngc file is a
precompiled netlist in a Xilinx format, similar in purpose to an EDIF netlist. Without it, the
Verilog file would be an empty black box for synthesis.
But for most use cases, you don't really need CoreGen for BRAM because there are perfectly
good templates in the XST manual for inferring it. In addition, inferred BRAM is easier to initialize,
since you can use simple Verilog code in an initial block including $readmemh if you want to
initialize the memory from a hex file.
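As a rough illustration of the inference approach described above, here is a minimal sketch of a simple dual-port RAM sized like the 256 x 24 memory in the original post. The module name, port names, and the INIT_FILE parameter are assumptions made for this example and are not taken from the XST manual or any Xilinx template. Whether a given tool maps it to block RAM rather than distributed RAM depends on the tool and its settings.

```verilog
// Sketch of an inferred simple dual-port RAM (one write port, one read port,
// common clock). All names here are illustrative, not from a Xilinx template.
module sdp_ram_inferred #(
    parameter DATA_WIDTH = 24,
    parameter ADDR_WIDTH = 8,
    parameter INIT_FILE  = ""   // hypothetical hex file name; empty means no init
) (
    input  wire                  clk,
    input  wire                  we,
    input  wire [ADDR_WIDTH-1:0] wr_addr,
    input  wire [DATA_WIDTH-1:0] wr_data,
    input  wire                  rd_en,
    input  wire [ADDR_WIDTH-1:0] rd_addr,
    output reg  [DATA_WIDTH-1:0] rd_data
);
    // 2**ADDR_WIDTH words of DATA_WIDTH bits each
    reg [DATA_WIDTH-1:0] mem [0:(1<<ADDR_WIDTH)-1];

    // Optional initialization from a hex file, one word per line
    initial begin
        if (INIT_FILE != "")
            $readmemh(INIT_FILE, mem);
    end

    // Synchronous write and synchronous (registered) read. The registered
    // read is what lets the synthesizer consider block RAM.
    always @(posedge clk) begin
        if (we)
            mem[wr_addr] <= wr_data;
        if (rd_en)
            rd_data <= mem[rd_addr];
    end
endmodule
```

With this coding style, a read and a write to the same address in the same cycle return the old contents, because the non-blocking write has not yet taken effect when the read samples the array.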
07-12-2013 05:41 PM
Thank you for your response, Gabor. I'm using Vivado and have been searching the Vivado documentation without finding a template. Can someone point me to the right Vivado document, or post the Vivado Verilog template?
Thank you,
Dave
07-12-2013 09:01 PM
Hi,
In the Vivado GUI, go to Window --> Language Templates. There you can find the instantiation template for BRAM as well as sample code for inference (under coding examples).
Thanks,
deepika.
07-15-2013 10:10 AM
Thank you very much vermalad! I found the templates within Vivado and also in UG953.
I did find one minor error in both sources. The Verilog template for BRAM_TDP_MACRO has the following line in it which must be removed to get past an error. The VHDL template does not have this line. As I said, the errant parameter is in both locations, the UG953 documentation as well as the Verilog template within Vivado.
.INIT_FF(256'h0000000000000000000000000000000000000000000000000000000000000000),
[Synth 8-3438] module 'BRAM_TDP_MACRO' declared at '/cad/xilinx/Vivado/2013.2/data/verilog/src/unimacro/BRAM_TDP_MACRO.v:28' does not have any parameter 'INIT_FF' used as named parameter override
Thanks again,
Dave
07-15-2013 12:23 PM
The correct way to connect the write enable signals isn't completely clear to me for the BRAM_SDP_MACRO. The template has both WE and WREN:
.WE(WE), // Input write enable, width defined by write port depth
.WREN(WREN) // 1-bit input write port enable
So presumably if I want a single write enable for the whole word, I would connect it to WREN. If I want byte enables I would connect the multi-bit signal to WE. What do I connect to the other port? At the moment I am connecting single bit write enables to both ports but am not sure what to do with WREN with multi-bit write enables.
Thanks,
Dave
07-15-2013 12:45 PM
I'm pretty sure that the two ports are ANDed together, i.e. if you have a 4-bit wide WE signal
then asserting it to 4'b1100 would only write the upper half of the total port width and only if
WREN is also asserted at the same time. If you want to be sure, you can create a simple
test bench that instantiates one of these macros and try it out via simulation.
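To make the suspected behavior concrete, here is a small behavioral sketch in which a one-bit port enable gates a set of per-byte write enables. It models the ANDing described above as an assumption and has not been checked against the BRAM_SDP_MACRO simulation model. The module and port names are made up for the example.

```verilog
// Behavioral sketch: byte n is written only when wren AND we[n] are both high.
// This mirrors the behavior guessed at in the reply; it is not the macro itself.
module sdp_ram_byte_we #(
    parameter NUM_BYTES  = 4,
    parameter ADDR_WIDTH = 10
) (
    input  wire                   clk,
    input  wire                   wren,     // 1-bit write port enable
    input  wire [NUM_BYTES-1:0]   we,       // per-byte write enables
    input  wire [ADDR_WIDTH-1:0]  wr_addr,
    input  wire [NUM_BYTES*8-1:0] wr_data,
    input  wire                   rd_en,
    input  wire [ADDR_WIDTH-1:0]  rd_addr,
    output reg  [NUM_BYTES*8-1:0] rd_data
);
    reg [NUM_BYTES*8-1:0] mem [0:(1<<ADDR_WIDTH)-1];
    integer i;

    always @(posedge clk) begin
        if (wren) begin
            // Update only the bytes whose enable bit is set
            for (i = 0; i < NUM_BYTES; i = i + 1)
                if (we[i])
                    mem[wr_addr][i*8 +: 8] <= wr_data[i*8 +: 8];
        end
        if (rd_en)
            rd_data <= mem[rd_addr];
    end
endmodule
```

In this model, driving we to 4'b1100 with wren high updates bits [31:16] of the addressed word and leaves bits [15:0] unchanged, while driving wren low blocks the write regardless of we.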
07-15-2013 06:15 PM
It looks like you are right. It would be great if the documentation explicitly explained how to do it, though.
I have encountered another issue. If I generate the memory with IP Catalog or with TCL commands (create_ip), or instantiate it with the blk_mem_gen_v8_0 template, I don't get any warnings about port mismatches when I connect it up. If I use the BRAM_SDP_MACRO, I get a warning when I try to create an 8-bit wide memory with 256 entries.
When I simulate using 8-bit address buses everything is fine in the first case (anything but BRAM_SDP_MACRO). If I use an 8-bit address bus with the BRAM_SDP_MACRO, I get simulation failures as the data does not seem to be getting written into the RAM. When I pay attention to the warnings and use an 11-bit address (with top 3 bits 0) then it works fine. But now I'm wondering if I'm really getting a 2048 entry deep RAM and just wasting the majority of it. The working instantiation (with 3 extra bits) is below. What is the correct way to do this?
Thanks,
Dave
BRAM_SDP_MACRO #(
.BRAM_SIZE("18Kb"), // Target BRAM, "18Kb" or "36Kb"
.DEVICE("7SERIES"), // Target device: "7SERIES"
.WRITE_WIDTH(8), // Valid values are 1-72 (37-72 only valid when BRAM_SIZE="36Kb")
.READ_WIDTH(8), // Valid values are 1-72 (37-72 only valid when BRAM_SIZE="36Kb")
.DO_REG(0), // Optional output register (0 or 1)
.INIT_FILE ("NONE"),
.SIM_COLLISION_CHECK ("ALL"), // Collision check enable "ALL", "WARNING_ONLY",
// "GENERATE_X_ONLY" or "NONE"
.SRVAL(72'h000000000000000000), // Set/Reset value for port output
.INIT(72'h000000000000000000), // Initial values on output port
.WRITE_MODE("READ_FIRST"), // Specify "READ_FIRST" for same clock or synchronous clocks
// Specify "WRITE_FIRST" for asynchronous clocks on ports
.INIT_00(256'h0000000000000000000000000000000000000000000000000000000000000000),
...
.INITP_0F(256'h0000000000000000000000000000000000000000000000000000000000000000)
) I_dprf256X8 (
.DO(dat_o[7:0]), // Output read data port, width defined by READ_WIDTH parameter
.DI(dat_i[7:0]), // Input write data port, width defined by WRITE_WIDTH parameter
.RDADDR({3'b0,rd_addr[7:0]}), // Input read address, width defined by read port depth
.RDCLK(clk), // 1-bit input read clock
.RDEN(rd_en), // 1-bit input read port enable
.REGCE(1'b0), // 1-bit input read output register enable
.RST(1'b0), // 1-bit input reset
.WE(wr_en), // Input write enable, width defined by write port depth
.WRADDR({3'b0,wr_addr[7:0]}), // Input write address, width defined by write port depth
.WRCLK(clk), // 1-bit input write clock
.WREN(1'b1) // 1-bit input write port enable
);
07-15-2013 06:33 PM
The minimum size of a block RAM in newer Xilinx FPGAs is 16 Kbits, or 18 Kbits if you're able to
use the "parity" bits (requires a minimum port size of 9). So in effect any block RAM macro that
would give you fewer bits is "throwing the rest away" because there's no other way to get to
those bits once you've used up both address ports. So if you want to instantiate the macros,
then you should respect the minimum memory size of 16 Kbits and zero out the unused
address bits. Core Generator allows you to define smaller memories and hooks them up
with wrappers so you don't see the wasted address bits, but in the end you're wasting bits
any way you look at it. This would also be true if you instantiated the memory primitives
instead of the macros.
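For the 256 x 8 case in the previous post, the arithmetic works out as follows: an 18Kb block at an 8-bit port width presents 2^11 = 2048 addresses, or 16,384 data bits, while 256 x 8 uses only 2,048 of them, one eighth of the block. A small helper like the one below can do the zero-padding in one place instead of at every port connection. The module name and parameters are hypothetical and chosen only for this sketch.

```verilog
// Sketch: zero-extend a narrow user address to the width the macro expects.
// Assumes MACRO_AW > USER_AW; a zero-count replication is not legal Verilog-2001,
// so this helper should not be used when the two widths are equal.
module addr_zero_pad #(
    parameter USER_AW  = 8,    // address bits the design actually uses
    parameter MACRO_AW = 11    // address bits required by the macro port
) (
    input  wire [USER_AW-1:0]  addr_in,
    output wire [MACRO_AW-1:0] addr_out
);
    // Upper (MACRO_AW - USER_AW) bits are tied to zero
    assign addr_out = {{(MACRO_AW-USER_AW){1'b0}}, addr_in};
endmodule
```

With the default parameters this produces the same result as writing {3'b0, wr_addr[7:0]} directly at the macro port.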
Personally I find that it makes sense to infer memory instead of instantiating it. For one thing,
there's no question about how the write enables work - you have the source code in front
of you. For another, the synthesizer has the ability to decide what memory (block or
distributed) best uses the remaining resources of the FPGA. Also the behavioral simulation
runs from your own source code, making it simpler, faster, and more likely to do what you wanted.
Finally, although I've never ported a design from Xilinx to another vendor, it does make the code
more portable - assuming the other vendor's tools are able to synthesize RAM from the same
behavioral description. From a portability standpoint, it really makes sense to infer anything
you can.
You also ran into the initialization vector bug. For an inferred memory, if you want to initialize it
you can write simple code in an initial block, either looping through the memory to set it to
some standard value, or reading it from an external file using $readmemh or $readmemb.
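As a sketch of the loop-based option, the fragment below fills an inferred memory with a constant in an initial block. The module name, ports, and the INIT_VALUE parameter are assumptions for illustration. Support for initial blocks on inferred memories varies between synthesis tools, so check what yours accepts.

```verilog
// Sketch: single-port inferred RAM whose contents start at a constant value.
module ram_init_loop #(
    parameter DATA_WIDTH = 8,
    parameter ADDR_WIDTH = 8,
    parameter [DATA_WIDTH-1:0] INIT_VALUE = {DATA_WIDTH{1'b0}}  // hypothetical
) (
    input  wire                  clk,
    input  wire                  we,
    input  wire [ADDR_WIDTH-1:0] addr,
    input  wire [DATA_WIDTH-1:0] din,
    output reg  [DATA_WIDTH-1:0] dout
);
    reg [DATA_WIDTH-1:0] mem [0:(1<<ADDR_WIDTH)-1];
    integer i;

    // Walk every location once at time zero and set it to INIT_VALUE
    initial begin
        for (i = 0; i < (1<<ADDR_WIDTH); i = i + 1)
            mem[i] = INIT_VALUE;
    end

    always @(posedge clk) begin
        if (we)
            mem[addr] <= din;
        dout <= mem[addr];   // registered read returns the old contents on a write
    end
endmodule
```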
You can argue that instantiating the memories will show you exactly what you're getting
when you synthesize the code, and I'll give you that, but inferring the memory can end up
with a more readable source, and possibly lead to reduced design errors that come from
misunderstanding the fine points of the memory primitives (like byte write enables).
07-16-2013 10:13 AM
Thank you Gabor, that makes sense. I had tried inferring memory previously but Vivado took 5-10 times longer to run. I have a total of nearly 1MB of memories on my device which seems to be at the edge of what Vivado can handle (or our server!).
07-17-2013 05:29 PM
Has anyone noticed Vivado run time differences using the Verilog template versus creating memories from the IP Catalog or through the create_ip TCL command? It seems that Vivado is taking far longer now to generate a BIT file. I'm wondering if inferred memories would be just as fast (i.e. not very fast at all).
09-05-2013 03:38 PM
After further investigation it appears that Vivado runtime isn't affected much unless the memories are fairly large. My runtime went from 86 minutes to 90 minutes after swapping a macro-defined memory with an inferred 16KB memory. Attempting to infer four 128KB memories brought the machine to its knees, however, so I'm sticking with create_ip for those larger memories.