Explorer
Registered: 11-22-2016

Is there a way to infer simple dual port block RAMs in READ_FIRST mode?

I am stealing the title from a previously "solved" question, because my problem is identical with one small change.  I have an inferred dual port BRAM which I would like to operate at 500 MHz (on a Virtex-7 device with speed grade -2); according to DS183 I should be able to operate at up to 600 MHz (F_MAX_BRAM_WF_NC), however I do not seem to have enough (or any?) control over how the block ram is inferred.

 

I have included the complete project below.  When synthesised with Vivado it fails timing, and the failing margin implies a target BRAM frequency of 477.33 MHz, corresponding to speed parameter F_MAX_BRAM_RF_DELAYED_WRITE, described as "When in SDP RF mode and there is possibility of overlap between port A and port B addresses".

 

Looking at the implemented design, in particular at the inferred BRAM, I see WRITE_MODE_A set to READ_FIRST, but WRITE_MODE_B (on a read-only port!) set to WRITE_FIRST, and I presume that this is the limiting factor.

 

I use the following code fragment to infer my BRAM:

 

    process (clk_i) begin
        if rising_edge(clk_i) then
            if write_strobe_i = '1' then
                memory(to_integer(write_addr_i)) <= write_data_i;
            end if;
            read_data <= memory(to_integer(read_addr_i));
            read_data_o <= read_data;
        end if;
    end process;

 

Unfortunately, I have been unable to find any detailed information on how to write the kind of BRAM I need; the most recent documentation I can find seems to be for ISE, and it suggests the use of shared variables ... which Vivado does not support!

 

 

Edit.  Well, that's rather sad.  When I attempted to post this message, I got the following message:

 

 

 

I'm afraid I'm going to have to dump everything inline instead

 

top.vhd

-------

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity top is
port (
    clk_i : in std_logic;
    
    delay_i : in unsigned(3 downto 0);
    data_i : in std_logic_vector(3 downto 0);
    data_o : out std_logic_vector(3 downto 0)
);
end top;

architecture Behavioral of top is
    signal data_in : std_logic_vector(data_i'RANGE);
    signal data_out : std_logic_vector(data_o'RANGE);
begin

delay_inst : entity work.long_delay generic map (
    WIDTH => data_i'LENGTH
) port map (
    clk_i => clk_i,
    delay_i => delay_i,
    data_i => data_in,
    data_o => data_out
);

process (clk_i) begin
    if rising_edge(clk_i) then
        data_in <= data_i;
        data_o <= data_out;
    end if;
end process;

end Behavioral;

 

long_delay.vhd

--------------

-- Programmable long delay.  This delay uses block ram.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity long_delay is
    generic (
        WIDTH : natural
    );
    port (
        clk_i : in std_logic;

        delay_i : in unsigned;
        data_i : in std_logic_vector(WIDTH-1 downto 0);
        data_o : out std_logic_vector(WIDTH-1 downto 0)
    );
end;

architecture long_delay of long_delay is
    constant ADDR_BITS : natural := delay_i'LENGTH;
    subtype address_t is unsigned(ADDR_BITS-1 downto 0);

    signal write_addr : address_t := (others => '0');
    signal read_addr : address_t;

begin
    memory_inst : entity work.block_memory generic map (
        ADDR_BITS => ADDR_BITS,
        DATA_BITS => WIDTH
    ) port map (
        clk_i => clk_i,
        read_addr_i => read_addr,
        read_data_o => data_o,
        write_strobe_i => '1',
        write_addr_i => write_addr,
        write_data_i => data_i
    );

    process (clk_i) begin
        if rising_edge(clk_i) then
            write_addr <= write_addr + 1;
            read_addr <= write_addr - delay_i;
        end if;
    end process;
end;

 

block_memory.vhd

----------------

-- Memory mapped into Block RAM.  We double buffer the read data to ensure that
-- the BRAM is fully registered.
--
-- The delay from read_addr_i to read_data_o is 2 clock ticks:
--
-- clk_i            /       /       /       /       /
-- read_addr_i    --X  A    X----------------------------
-- read_data      ----------X M[A]  X--------------------
-- read_data_o    ------------------X M[A]  X------------
--
-- The relationship with data written into the same location is shown by the
-- figure below:
--
-- clk_i            /       /       /       /       /
-- write_strobe_i __/^^^^^^^\____________________________
-- write_addr_i   --X  A    X----------------------------
-- write_data_i   --X  D    X----------------------------
-- memory[A]      ----------X  D
-- read_addr_i    ----------X  A    X--------------------
-- read_data      ------------------X  D    X------------
-- read_data_o    --------------------------X  D    X----
--
-- This shows that the written data at any address must be read one tick later.

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity block_memory is
    generic (
        ADDR_BITS : natural;
        DATA_BITS : natural
    );
    port (
        clk_i : in std_logic;

        -- Read interface
        read_addr_i : in unsigned(ADDR_BITS-1 downto 0);
        read_data_o : out std_logic_vector(DATA_BITS-1 downto 0);

        -- Write interface
        write_strobe_i : in std_logic;
        write_addr_i : in unsigned(ADDR_BITS-1 downto 0);
        write_data_i : in std_logic_vector(DATA_BITS-1 downto 0)
    );
end;

architecture block_memory of block_memory is
    -- Block RAM
    subtype data_t is std_logic_vector(DATA_BITS-1 downto 0);
    type memory_t is array(0 to 2**ADDR_BITS-1) of data_t;
    signal memory : memory_t := (others => (others => '0'));
    attribute ram_style : string;
    attribute ram_style of memory : signal is "BLOCK";

    signal read_data : std_logic_vector(DATA_BITS-1 downto 0);

begin
    process (clk_i) begin
        if rising_edge(clk_i) then
            if write_strobe_i = '1' then
                memory(to_integer(write_addr_i)) <= write_data_i;
            end if;
            read_data <= memory(to_integer(read_addr_i));
            read_data_o <= read_data;
        end if;
    end process;
end;
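
The timing relationship described in the block_memory.vhd header comment (data written at an address can be read back when the read address is presented one tick later, and then appears on read_data_o two ticks after that) can be checked in simulation. What follows is a minimal testbench sketch and not part of the original project: the entity name block_memory_tb, the 4-bit address and 8-bit data widths, and the test values are assumptions chosen for illustration, and it assumes block_memory.vhd above has been compiled into library work.

```vhdl
-- Hypothetical testbench, not part of the posted project.
-- Assumes block_memory.vhd has been compiled into library work.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity block_memory_tb is
end;

architecture sim of block_memory_tb is
    -- Small illustrative sizes (assumptions, not taken from the project).
    constant ADDR_BITS : natural := 4;
    constant DATA_BITS : natural := 8;

    signal clk : std_logic := '0';
    signal done : boolean := false;
    signal read_addr : unsigned(ADDR_BITS-1 downto 0) := (others => '0');
    signal read_data : std_logic_vector(DATA_BITS-1 downto 0);
    signal write_strobe : std_logic := '0';
    signal write_addr : unsigned(ADDR_BITS-1 downto 0) := (others => '0');
    signal write_data : std_logic_vector(DATA_BITS-1 downto 0)
        := (others => '0');
begin
    dut : entity work.block_memory generic map (
        ADDR_BITS => ADDR_BITS,
        DATA_BITS => DATA_BITS
    ) port map (
        clk_i => clk,
        read_addr_i => read_addr,
        read_data_o => read_data,
        write_strobe_i => write_strobe,
        write_addr_i => write_addr,
        write_data_i => write_data
    );

    -- 2 ns clock period, matching clocks.xdc; stops once the test is done.
    clock : process begin
        while not done loop
            clk <= '0'; wait for 1 ns;
            clk <= '1'; wait for 1 ns;
        end loop;
        wait;
    end process;

    stimulus : process begin
        -- Present a write of D = x"5A" to address A = 3.
        write_strobe <= '1';
        write_addr <= to_unsigned(3, ADDR_BITS);
        write_data <= x"5A";
        wait until rising_edge(clk);    -- memory[3] is written on this edge

        -- One tick later, present the same address on the read port.
        write_strobe <= '0';
        read_addr <= to_unsigned(3, ADDR_BITS);
        wait until rising_edge(clk);    -- internal read_data register loads
        wait until rising_edge(clk);    -- read_data_o register loads

        -- Sample away from the clock edge so the output has settled.
        wait until falling_edge(clk);
        assert read_data = x"5A"
            report "read-back after write failed" severity failure;
        report "read-back after write OK" severity note;

        done <= true;
        wait;
    end process;
end;
```

This only exercises the behavioural model; it says nothing about which WRITE_MODE Vivado chooses for the inferred primitive.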

 

clocks.xdc

----------

create_clock -period 2.0 -name CLK [get_ports clk_i]

 

15 Replies

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

I have a feeling I'm getting a bit confused about which is the fastest mode of operation for the BRAM (after all, I don't care what happens when there is an address conflict).  Also, let me correct a mistake in my original posting: for my -2 device, the F_MAX_BRAM_WF_NC parameter is 543.77 MHz, not 600 MHz.

 

According to my reading of DS183 I don't want to be in RF (Read First) mode, and I've just looked at page 12 of WP231 to confirm this, so I've edited the subject title to reflect this.

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

Ok.  With help from my expert colleague (Isa Uzun, many thanks!) I have a workaround: simply add this line to the constraints file:

 

set_property WRITE_MODE_A WRITE_FIRST [get_cells delay_inst/memory_inst/memory_reg]

 

This is an acceptable hack, but I have to say that if this is the only valid solution then Vivado needs a bit more work on its synthesis inference!

Xilinx Employee
Registered: 08-01-2008

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

You can use the Block Memory Generator core (PG058), or use the language templates as reference code.
Thanks and Regards
Balkrishan
--------------------------------------------------------------------------------------------
Please mark the post as an answer "Accept as solution" in case it helped resolve your query.
Give kudos to a post in case it guided you to the solution.

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

> language templates for reference code

Can you provide any relevant links for this? I have done quite a lot of research on this topic, and I'm convinced that Vivado cannot be reliably persuaded to infer the correct behaviour based purely on reference code.

My example is sufficiently simple that I'm sure a working template would be easy to present.

P.S. Why was I unable to attach my files to my original posting?

Moderator
Registered: 07-21-2014

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

@araneidae

 

You can find examples at the link below:

https://www.xilinx.com/support/documentation/sw_manuals/xilinx2016_3/ug901-vivado-synthesis.pdf

 

Also, if the memory code reads the data before writing into the memory, then the tool should be able to infer a BRAM in READ_FIRST mode.

For example:

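process(clkA)
begin
    if rising_edge(clkA) then
        if enA = '1' then
            -- The read is coded ahead of the write: the READ_FIRST pattern.
            readA <= my_ram(conv_integer(addrA));
            if weA = '1' then
                my_ram(conv_integer(addrA)) <= diA;
            end if;
        end if;
        -- Second register stage on the read data.
        regA <= readA;
    end if;
end process;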


Thanks,
Anusheel
-----------------------------------------------------------------------------------------------
Search for documents/answer records related to your device and tool before posting query on forums.
Search related forums and make sure your query is not repeated.

Please mark the post as an answer "Accept as solution" in case it helps to resolve your query.
Helpful answer -> Give Kudos
-----------------------------------------------------------------------------------------------

 

 

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

I wrote a long and carefully written reply, which the web server discarded with the message "authentication token mismatch". Lovely.

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

Ok, a brief version of my lost reply.  If I synthesise the relevant example from UG901, simple_dual_one_clock.vhd (pages 112-113), the result is that the write port is configured in READ_FIRST mode.

It turns out that it is possible to merge simple_dual_one_clock.vhd and rams_02.vhd (pp 110-111), and it looks as if this produces the desired result, but I'll need to do more checking.  I don't really regard "use language templates for reference code" as a very satisfying response given the complexity of what's actually required.  I'll post a complete working example shortly.

Xilinx Employee
Registered: 08-01-2008

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

You can use port A in any of the modes supported by the BRAM, i.e. NO_CHANGE, READ_FIRST or WRITE_FIRST; however, port B only supports WRITE_FIRST mode.

You may use the Block Memory Generator core:
https://www.xilinx.com/support/documentation/ip_documentation/blk_mem_gen/v8_0/pg058-blk-mem-gen.pdf
Thanks and Regards
Balkrishan

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

Ok, a combination of the two examples in UG901, together with an extra layer of hierarchy (apparently part of the dance Vivado requires), seems to do the trick:

 

-- Simple Dual-Port Block RAM with One Clock
-- Correct Modelization with a Shared Variable
-- File:simple_dual_one_clock.vhd merged with rams_02.vhd
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_unsigned.all;

entity simple_dual_one_clock is
    port(
        clk : in std_logic;
        ena : in std_logic;
        enb : in std_logic;
        wea : in std_logic;
        addra : in std_logic_vector(9 downto 0);
        addrb : in std_logic_vector(9 downto 0);
        dia : in std_logic_vector(15 downto 0);
        doa : out std_logic_vector(15 downto 0);
        dob : out std_logic_vector(15 downto 0)
    );
end simple_dual_one_clock;

architecture syn of simple_dual_one_clock is
    type ram_type is array (1023 downto 0) of std_logic_vector(15 downto 0);
    shared variable RAM : ram_type;
    
begin
    process(clk)
    begin
        if clk'event and clk = '1' then
            if ena = '1' then
                if wea = '1' then
                    RAM(conv_integer(addra)) := dia;
                    doa <= dia;
                else
                    doa <= RAM(conv_integer(addra));
                end if;
            end if;
        end if;
    end process;
       
    process(clk)
    begin
        if clk'event and clk = '1' then
            if enb = '1' then
                dob <= RAM(conv_integer(addrb));
            end if;
        end if;
    end process;
end syn;

 

It looks as if leaving doa open is ok:

 

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

entity top is
port (
        clk : in std_logic;
        addra : in std_logic_vector(9 downto 0);
        addrb : in std_logic_vector(9 downto 0);
        dia : in std_logic_vector(15 downto 0);
        doa : out std_logic_vector(15 downto 0);
        dob : out std_logic_vector(15 downto 0)
);
end top;

architecture Behavioral of top is
begin

    inst : entity work.simple_dual_one_clock port map (
        clk => clk,
        ena => '1',
        enb => '1',
        wea => '1',
        addra => addra,
        addrb => addrb,
        dia => dia,
        doa => open,
        dob => dob
    );
end Behavioral;

 

This synthesises with write mode set to WRITE_FIRST and is placed with write mode set to NO_CHANGE, which seems to be enough to meet timing as required.

 

I expect I can safely omit the ena and enb ports, and it looks as if the use of shared variable is just needed to help with modelling; I guess this is as complete an answer as this question permits.

Moderator
Registered: 07-21-2014

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

@araneidae

 

Yes, you need to follow the language template completely; if the RTL has WRITE_FIRST and READ_FIRST logic, then the tool should be able to infer the same in the generated netlist.

Please close this thread by marking a suitable answer as the accepted solution.

 

Thanks,
Anusheel

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

Hmm.  I'm not altogether convinced we're there yet, as the solution I've posted is alarmingly brittle.  As you hint, apparently trivial changes to the templates (for example, merging the two processes into one) have dramatic effects on the synthesis, and of course my solution is not a template solution, it's a hacked merge.

 

Simply saying "completely follow language template" when the templates are incomplete (and archaic, I think conv_integer and std_logic_unsigned have been deprecated for some time) is less than helpful at best, and actually rather misleading.

Moderator
Registered: 07-21-2014

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

Editing typo:

Yes, you do not need to follow the language template completely; if the RTL has WRITE_FIRST and READ_FIRST logic, then the tool should be able to infer the same in the generated netlist.

Please close this thread by marking a suitable answer as the accepted solution.

 

Thanks,
Anusheel

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

Brittle indeed.  My project is set to VHDL 2008.  The code I've written here now generates the error:

 

ERROR: [Synth 8-4747] shared variables must be of a protected type [/scratch/tmp/LMBF/build/project_2/project_2.srcs/sources_1/new/simple_dual_one_clock.vhd:22]

 

> Please close this thread by marking suitable answer as accepted solution.

 

Do you think this is adequately solved?  I will wait until my target project has completed synthesis...  Your anxiety to mark half baked "solutions" as solved bothers me.  Sorry.  I'm trying to stay polite.

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

Ok.  I'll mark message #10 as a solution, with the proviso that it must be synthesised with FILE_TYPE set to VHDL, not "VHDL 2008".
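
For context on the Synth 8-4747 error: from VHDL-2002 onwards the language requires shared variables to be of a protected type, which is why the posted code is only accepted as plain VHDL. One possible way around this is to model the RAM as a signal instead. The following is an untested sketch along those lines, keeping the two-process structure and the write-first read-back on port A from the merged example, and dropping ena and enb. The entity name simple_dual_one_clock_sig, the generics, and the use of numeric_std are assumptions made here for illustration. Whether Vivado infers the same write modes from this variant is not established anywhere in this thread, so WRITE_MODE_A and WRITE_MODE_B would need to be checked in the synthesised and implemented design.

```vhdl
-- Sketch only: a signal-based variant of the merged UG901 example.
-- The entity name and generics are hypothetical; the inferred write
-- mode has not been verified.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity simple_dual_one_clock_sig is
    generic (
        ADDR_BITS : natural := 10;
        DATA_BITS : natural := 16
    );
    port (
        clk : in std_logic;
        wea : in std_logic;
        addra : in unsigned(ADDR_BITS-1 downto 0);
        addrb : in unsigned(ADDR_BITS-1 downto 0);
        dia : in std_logic_vector(DATA_BITS-1 downto 0);
        doa : out std_logic_vector(DATA_BITS-1 downto 0);
        dob : out std_logic_vector(DATA_BITS-1 downto 0)
    );
end;

architecture syn of simple_dual_one_clock_sig is
    type ram_type is array (0 to 2**ADDR_BITS-1)
        of std_logic_vector(DATA_BITS-1 downto 0);
    -- A signal rather than a shared variable, so no protected type
    -- is needed and the file is legal in both VHDL-93 and VHDL-2008.
    signal ram : ram_type := (others => (others => '0'));
begin
    -- Port A: write port, with the write-first read-back on doa.
    process (clk) begin
        if rising_edge(clk) then
            if wea = '1' then
                ram(to_integer(addra)) <= dia;
                doa <= dia;
            else
                doa <= ram(to_integer(addra));
            end if;
        end if;
    end process;

    -- Port B: read-only port.
    process (clk) begin
        if rising_edge(clk) then
            dob <= ram(to_integer(addrb));
        end if;
    end process;
end;
```

One behavioural difference to be aware of: because the memory is a signal, a port B read of the address being written in the same cycle returns the old contents in simulation, whereas with a shared variable the result depends on process execution order. For a design that never reads and writes the same address in the same cycle this should not matter.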

Explorer
Registered: 11-22-2016

Re: Is there a way to infer simple dual port block RAMs in WRITE_FIRST mode?

Is there any way to revert the "solved" mark?  This problem is NOT solved.

 

If the width of the required bus requires more than one block ram to be generated, and there's any level of hierarchy collapse, then it would appear that synthesis optimisations defeat the "code pattern" and READ_FIRST is set again.

 

As I already commented, the solution is as brittle as hell, and breaks at the first breath.  I would appreciate acknowledgement from Xilinx that there is an issue here; I'm guessing an actual long term fix is rather too much to hope for, but an acknowledgement of the frankly disappointing quality of the Vivado tool would go quite a long way.
