How fast is reading and writing with MIG?
07-05-2012 05:59 AM
Hi All,
I am doing a project on an Atlys board in which I interface the Spartan-6 FPGA with the on-board DDR2 SDRAM (MT47H64M16-25E) via the Xilinx MIG IP core. I have set the operating frequency of the MIG to 333.333 MHz. The on-board oscillator runs at 100 MHz.
I have implemented a small controller that further simplifies the interface with the MIG. I used it to write and read all 128 MB of available space successfully. The problem is the speed.
Important info: I am not making use of the memory's ability to do double data rate. The MIG has a data width of 32 bits whereas the IC has a data width of 16 bits. I only write and read 16 bits at a time, so I always mask half the data width; my application requires this. I am only using a burst length of 1.
When I operate the controller at 100 MHz (MIG at 333.333 MHz), I get a write speed of 133 MB/s, whereas the read speed is 39 MB/s. When I change the controller frequency to 200 MHz (MIG still at 333.333 MHz), the write speed goes up to 267 MB/s, whereas the read speed only goes up to 45.87 MB/s.
I know that I am not using the MIG to its full potential. I should use a longer burst length for reads, and I should be reading 32-bit words instead. But doesn't the read speed seem a bit too low to you? Also, wouldn't you expect the read speed to be equal to or possibly faster than the write speed (for electronic reasons)?
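As a rough sanity check on the figures above, the measured rates can be turned into an average time per 16-bit transaction. This is only a back-of-envelope sketch: it assumes 1 MB = 10^6 bytes and exactly 2 useful bytes per transaction, neither of which is stated in the post, and the function names are made up for illustration.

```python
# Back-of-envelope: time per 16-bit transaction implied by the measured rates.
# Assumptions (not stated in the post): 1 MB = 10**6 bytes, and every
# transaction moves exactly 2 useful bytes.

BYTES_PER_TRANSACTION = 2


def ns_per_transaction(rate_mb_s: float) -> float:
    """Average time per transaction in nanoseconds for a given MB/s rate."""
    transactions_per_s = rate_mb_s * 1e6 / BYTES_PER_TRANSACTION
    return 1e9 / transactions_per_s


def cycles_per_transaction(rate_mb_s: float, clock_mhz: float) -> float:
    """The same interval expressed in controller clock cycles."""
    return ns_per_transaction(rate_mb_s) * clock_mhz / 1e3


# (operation, controller clock in MHz, measured rate in MB/s) from the post
measurements = [
    ("write", 100.0, 133.0),
    ("read", 100.0, 39.0),
    ("write", 200.0, 267.0),
    ("read", 200.0, 45.87),
]

for kind, clock_mhz, rate in measurements:
    print(f"{kind:5s} @ {clock_mhz:.0f} MHz: "
          f"{ns_per_transaction(rate):5.1f} ns, "
          f"{cycles_per_transaction(rate, clock_mhz):4.1f} cycles")
```

Taken at face value, the write time per transaction halves when the controller clock doubles, while the read time only drops from roughly 51 ns to roughly 44 ns. That pattern would be consistent with reads being dominated by a latency that does not scale with the controller clock.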
Thanks for your time. Best regards,
bouvett
By the way, I have attached the controller code below.
----------------------------------------------------------------------------------
-- Company:
-- Engineer:
--
-- Create Date:    03:24:50 06/12/2012
-- Design Name:
-- Module Name:    DDR2_SDRAM_CONTROLLER - Behavioral
-- Project Name:
-- Target Devices:
-- Tool versions:
-- Description:    This module instantiates a simple interface between the host and
--                 the MIG module generated via the MIG wizard by Xilinx.
--
-- Dependencies:
--
-- Revision:
-- Revision 0.01 - File Created
-- Additional Comments:
--
----------------------------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.all;
library work;
use work.DDR2_SDRAM_PACKAGE.all;

entity DDR2_SDRAM_CONTROLLER is
    generic(
        read_burst_count : integer   := 1;   --read command burst length
        read_cmd         : std_logic := '1';
        write_cmd        : std_logic := '0'
    );
    Port (
        --Controller ports
        clock_src       : in  STD_LOGIC;                      --single ended clock source input
        mstr_rst        : in  STD_LOGIC;                      --master reset of the module
        cmd             : in  STD_LOGIC;                      --specifies the job that is required. '0' to write, '1' to read
        location        : in  STD_LOGIC_VECTOR (29 downto 0); --specifies location where to write the item
        data_to_rd      : out STD_LOGIC_VECTOR (15 downto 0); --is the port holding the data read from the requested
                                                              --location in the memory ic
        data_to_wrt     : in  STD_LOGIC_VECTOR (15 downto 0); --is the port holding the data to be written to the
                                                              --requested location in the memory ic
        trigger_prcs    : in  STD_LOGIC;                      --trigger the controller to start processing the new command
        prcs_ready      : out STD_LOGIC;                      --normally low; this signal is set for one clock cycle to
                                                              --show that the process has been completed
        error_flag      : out STD_LOGIC;                      --when this goes high an error would have occurred
        read_data_to_rd : out STD_LOGIC;                      --when this pin is asserted, the client should read the
                                                              --data_to_rd port. This is used only if the
                                                              --read_burst_count is greater than 1.
        ack_trigger     : out STD_LOGIC;                      --This pin is used to acknowledge the client that the
                                                              --trigger has been received.
        clock_locked    : in  std_logic;                      --When this pin is high it means that the clock is good to use

        --MIG user interface ports
        c3_sys_rst_i        : out std_logic;
        c3_calib_done       : in  std_logic;
        c3_clk0             : in  std_logic;
        c3_rst0             : in  std_logic;
        c3_p0_cmd_en        : out std_logic;
        c3_p0_cmd_instr     : out std_logic_vector(2 downto 0);
        c3_p0_cmd_bl        : out std_logic_vector(5 downto 0);
        c3_p0_cmd_byte_addr : out std_logic_vector(29 downto 0);
        c3_p0_cmd_empty     : in  std_logic;
        c3_p0_cmd_full      : in  std_logic;
        c3_p0_wr_en         : out std_logic;
        c3_p0_wr_mask       : out std_logic_vector(C3_P0_MASK_SIZE-1 downto 0);
        c3_p0_wr_data       : out STD_LOGIC_VECTOR(C3_P0_DATA_PORT_SIZE-1 downto 0);
        c3_p0_wr_full       : in  std_logic;
        c3_p0_wr_empty      : in  std_logic;
        c3_p0_wr_count      : in  std_logic_vector(6 downto 0);
        c3_p0_wr_underrun   : in  std_logic;
        c3_p0_wr_error      : in  std_logic;
        c3_p0_rd_en         : out std_logic;
        c3_p0_rd_data       : in  std_logic_vector(C3_P0_DATA_PORT_SIZE - 1 downto 0);
        c3_p0_rd_full       : in  std_logic;
        c3_p0_rd_empty      : in  std_logic;
        c3_p0_rd_count      : in  std_logic_vector(6 downto 0);
        c3_p0_rd_overflow   : in  std_logic;
        c3_p0_rd_error      : in  std_logic
    );
end DDR2_SDRAM_CONTROLLER;

architecture Behavioral of DDR2_SDRAM_CONTROLLER is

    type states is (calibration_check, trigger_check, write_start, write_data_buffer_check,
                    command_buffer_check_wr, procedure_finished, read_start,
                    command_buffer_check_rd, read_buffer_check, catch);

    signal current_state   : states;
    signal location_2_lsb  : std_logic_vector (1 downto 0);
    signal hey             : std_logic := '0';
    signal test_pin_signal : std_logic := '0';

begin

    process(clock_src, mstr_rst)
        variable read_counter : unsigned (4 downto 0); --maximum value 16
    begin
        --asynchronous reset
        if(mstr_rst = '1') then
            --reset settings
            current_state <= calibration_check;
            c3_p0_cmd_en  <= '0';
            c3_p0_wr_en   <= '0';
            c3_p0_rd_en   <= '0';
            error_flag    <= '0';
            prcs_ready    <= '0';
        elsif(clock_src'event and clock_src='1') then
            case current_state is

                when calibration_check =>
                    --check whether calibration is ready
                    if(c3_calib_done = '1' and clock_locked = '1') then
                        current_state <= trigger_check;
                    end if;

                when trigger_check =>
                    --next check whether a process has been triggered
                    --also check what command has been entered
                    if(trigger_prcs = '1') then
                        if(cmd = '1') then --read command?
                            current_state <= read_start;  --start read state
                        else --write command
                            current_state <= write_start; --start write state
                        end if;
                        ack_trigger <= '1';
                    end if;

                when write_start =>
                    --This is the write start node
                    --set write and command parameters
                    ack_trigger <= '0';
                    --Take care of port alignment
                    if(location_2_lsb = "00") then
                        c3_p0_wr_data (31 downto 16) <= (others => '0');
                        c3_p0_wr_data (15 downto 0)  <= data_to_wrt;
                        c3_p0_wr_mask <= "1100";
                    else --location_2_lsb = "10"
                        c3_p0_wr_data (31 downto 16) <= data_to_wrt;
                        c3_p0_wr_data (15 downto 0)  <= (others => '0');
                        c3_p0_wr_mask <= "0011";
                    end if;
                    c3_p0_cmd_bl        <= "000000";
                    c3_p0_cmd_instr     <= "000"; --set MIG to write
                    c3_p0_cmd_byte_addr <= location(29 downto 2) & "00";
                    c3_p0_wr_en         <= '1';   --enable data write buffer
                    current_state       <= write_data_buffer_check;

                when write_data_buffer_check =>
                    --Disable the write buffer enable pin
                    --after 1 clock cycle.
                    --If command buffer is not full, enable
                    --it.
                    c3_p0_wr_en <= '0'; --disable data write buffer
                    if(c3_p0_cmd_full = '0') then
                        c3_p0_cmd_en  <= '1';
                        current_state <= command_buffer_check_wr;
                    end if;

                when command_buffer_check_wr =>
                    --Disable the command buffer enable pin
                    --after 1 clock cycle.
                    c3_p0_cmd_en <= '0'; --disable command buffer
                    --signal that procedure is ready
                    prcs_ready    <= '1';
                    current_state <= procedure_finished;

                when procedure_finished =>
                    --Procedure finished hence de-assert flag pin
                    prcs_ready    <= '0';
                    current_state <= trigger_check;

                when read_start =>
                    --Start read procedure
                    ack_trigger     <= '0';
                    c3_p0_cmd_instr <= "001"; --set MIG to read
                    --Take care of address alignment
                    --In this test we are assuming that the user supplies 2 byte data. Since the user
                    --interface is 32 bits, then according to table 4-2 of UG388, the 2 LSB must be set
                    --to 0. Hence the 2 LSBs (location_2_lsb) can either be "00" or "10".
                    --Therefore, what should be done is that the 2 LSBs of c3_p0_cmd_byte_addr are always
                    --set to 0 and then use the value of location_2_lsb to determine whether the higher or
                    --lower 2 bytes of c3_p0_rd_data should be output to data_to_rd.
                    c3_p0_cmd_byte_addr <= location(29 downto 2) & "00";
                    --Set burst length
                    --command fifo is 4 data words deep, whereas the data fifo is 64 data words deep
                    --hence maximum burst length which can exactly fit the fifos is (64/4 = 16).
                    -- !! Make sure burst length is not more than 16 !!
                    c3_p0_cmd_bl <= std_logic_vector(to_unsigned(read_burst_count-1,6));
                    if(c3_p0_cmd_full = '0') then
                        c3_p0_cmd_en  <= '1'; --pass command to MIG
                        current_state <= command_buffer_check_rd;
                        read_counter  := to_unsigned(read_burst_count,5);
                    end if;

                when command_buffer_check_rd =>
                    c3_p0_cmd_en <= '0';
                    if(c3_p0_rd_empty = '0') then
                        c3_p0_rd_en     <= '1';
                        read_data_to_rd <= '1';
                        --Take care of port alignment
                        if(location_2_lsb = "00") then
                            data_to_rd <= c3_p0_rd_data(15 downto 0);
                        else --location_2_lsb = "10"
                            data_to_rd <= c3_p0_rd_data(31 downto 16);
                        end if;
                        current_state <= read_buffer_check;
                    end if;

                when read_buffer_check =>
                    c3_p0_rd_en     <= '0';
                    read_data_to_rd <= '0';
                    read_counter    := read_counter - 1;
                    if(read_counter = 0) then
                        prcs_ready    <= '1';
                        current_state <= procedure_finished;
                    else
                        current_state <= command_buffer_check_rd;
                    end if;

                when catch =>
                    test_pin_signal <= '1';

                when others =>
                    NULL;

            end case;
        end if;
    end process;

    --test_pin <= test_pin_signal;
    location_2_lsb <= location(1 downto 0);

end architecture Behavioral;
Re: How fast is reading and writing with MIG?
07-05-2012 09:25 AM
b,
Welcome to the world of DRAM memories. Yes, it takes quite a bit of optimization to push such memories to even 50% efficiency (that is, reading or writing at even half the rate they are clocked at).
This is not new, and definitely not something that 'only' happens in FPGA implementations: it happens whenever these devices are used. One has to enable a row, wait, and then access. When accessing, it is best to access as many locations as possible (read in 256 bytes at a time, or write many bytes at a time) before you change the row address.
Significant software techniques and added hardware are often used to increase the efficiency: re-ordering all operations in a queue so that row address changes are minimized, performing all the reads at once, etc.
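To put rough numbers on the "enable a row, wait, then access" cost, here is a minimal sketch of how the row overhead is amortized over longer accesses. The timing values (`T_RP_NS`, `T_RCD_NS`, `CAS_LATENCY_CLOCKS`) are assumptions chosen to be plausible for a DDR2 part, not figures from this thread, and the model treats 333.333 MHz as the memory clock and ignores refresh, the minimum DRAM burst length and bus turnaround.

```python
# Illustrative model only: the timing values below are assumptions chosen to be
# plausible for a DDR2 part, not figures taken from this thread or a datasheet.

MEM_CLOCK_MHZ = 333.333      # MIG frequency from the original post
BUS_BYTES = 2                # 16-bit memory device
T_RP_NS = 12.5               # assumed precharge time
T_RCD_NS = 12.5              # assumed activate-to-read delay
CAS_LATENCY_CLOCKS = 5       # assumed CAS latency

CLOCK_NS = 1e3 / MEM_CLOCK_MHZ
PEAK_MB_S = MEM_CLOCK_MHZ * 2 * BUS_BYTES   # two transfers per clock


def efficiency(bytes_per_row_visit: int) -> float:
    """Fraction of time spent moving data when each row visit pays full overhead."""
    transfer_ns = bytes_per_row_visit / (2 * BUS_BYTES) * CLOCK_NS
    overhead_ns = T_RP_NS + T_RCD_NS + CAS_LATENCY_CLOCKS * CLOCK_NS
    return transfer_ns / (transfer_ns + overhead_ns)


for n in (2, 8, 64, 256, 1024):
    eff = efficiency(n)
    print(f"{n:5d} bytes per row visit: {eff:6.1%} -> {eff * PEAK_MB_S:7.1f} MB/s")
```

Under these assumptions a 2-byte access spends almost all of its time on overhead, while a 256-byte access spends most of it transferring data, which is the reason for reading or writing many bytes before changing the row address.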
http://www.xilinx.com/txpatches/pub/documentation/
Principal Engineer
Xilinx San Jose
Re: How fast is reading and writing with MIG?
07-05-2012 01:50 PM
Actually the Spartan 6 MCB does a pretty good job of merging multiple small read or write
commands when you access the memory sequentially. Without going through your code,
my best guess is that when writing, your commands are issued while the command queue
is not full - this gives the MCB a "lookahead" at the next write command and allows it to
burst them together.
When reading, if you always wait for returned data or an empty command queue before issuing
the next read, then you will never get close to the memory's maximum bandwidth. However if
you either use a longer burst (make the user interface burst as long or longer than the memory
chip burst size) or queue up multiple read commands, then you can gain some speed.
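A toy model may help show why longer bursts or queued read commands raise the read rate when a fixed latency dominates. This is only a sketch: `LATENCY_NS` and `PEAK_MB_S` are assumed values, and the model simply charges one latency per batch of queued reads, which is a simplification of what the MCB actually does. The 4-entry command FIFO and the 32-bit port width are taken from the posted controller code.

```python
# Toy model of latency-bound reads. LATENCY_NS and PEAK_MB_S are assumptions
# for illustration; the FIFO depth and port width come from the posted code.

PORT_BYTES = 4          # 32-bit user port
CMD_FIFO_DEPTH = 4      # command FIFO depth per the controller's comment
LATENCY_NS = 40.0       # assumed fixed round-trip latency per batch of reads
PEAK_MB_S = 1333.3      # assumed peak: 16-bit DDR bus at 333.333 MHz


def read_rate_mb_s(burst_words: int, commands_queued: int = 1) -> float:
    """Effective read rate when one latency is paid per batch of queued reads."""
    queued = min(commands_queued, CMD_FIFO_DEPTH)
    batch_bytes = burst_words * PORT_BYTES * queued
    transfer_ns = batch_bytes / PEAK_MB_S * 1e3   # time on the bus at peak rate
    return batch_bytes / (LATENCY_NS + transfer_ns) * 1e3


for burst_words in (1, 4, 16):
    for queued in (1, 4):
        rate = read_rate_mb_s(burst_words, queued)
        print(f"burst {burst_words:2d} words, {queued} queued: {rate:7.1f} MB/s")
```

Waiting for each single-word read to return before issuing the next one corresponds to the first row of the output; every other combination moves more data per latency paid.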
-- Gabor
Spartan-6 MCB performance
07-05-2012 08:38 PM
Actually the Spartan 6 MCB does a pretty good job of merging multiple small read or write commands when you access the memory sequentially...
... However if you either use a longer burst (make the user interface burst as long or longer than the memory chip burst size) or queue up multiple read commands, then you can gain some speed.
This recent thread includes some very interesting (and, to me, surprising) Spartan-6 MCB performance trial results. In particular, read posts #7 and #9.
A comprehensive white paper on Spartan-6 MCB performance would be very interesting to Spartan-6 customers. UG388 has no useful information for understanding how to maximise effective performance from the MCB.
Some examples:
- For consecutive read (or write) operations, is there an optimal transaction burst length (cmd_BL)? Does MCB "overhead" incur one or more "dead" cycles between back-to-back operations?
- When user access patterns effectively refresh the DRAM (for example: video buffer fills or fetches), does MCB skip refreshes for rows which have recently been accessed (and refreshed)? When user access patterns are organised to also refresh the memory, can (redundant) MCB refresh activity be disabled?
- Can refresh be performed entirely during opportune "dead" times -- such as video blanking intervals -- to ensure that refresh activity does not interfere with memory access during other times? Does MCB simply schedule a refresh every tREFI? Or does MCB understand that once all rows have been refreshed in a certain period (typically 64 ms for commercial DDR2), no further refresh cycles are needed in that period?
For some unknown reason this sort of useful applications information, which is known by design, was not deemed worthy of inclusion in UG388.
-- Bob Elkind
README for newbies is here: http://forums.xilinx.com/t5/New-Users-Forum/README-first-Help-for-new-users/td-p/219369
Summary:
1. Read the manual or user guide. Have you read the manual? Can you find the manual?
2. Search the forums (and search the web) for similar topics.
3. Do not post the same question on multiple forums.
4. Do not post a new topic or question on someone else's thread, start a new thread!
5. Students: Copying code is not the same as learning to design.
6. "It does not work" is not a question which can be answered. Provide useful details (with webpage, datasheet links, please).
7. You are not charged extra fees for comments in your code.
8. I am not paid for forum posts. If I write a good post, then I have been good for nothing.
Re: Spartan-6 MCB performance
07-06-2012 02:51 AM
hi all,
Thanks for your replies. I will read them.
But just a quick query: does it make sense that the read data rate is so low compared to the write data rate? Doesn't it make more sense (electronically) that the read is faster?
thanks for your time.
bouvett
Re: Spartan-6 MCB performance
07-06-2012 03:13 AM
But (if my experience with Virtex-4/5 controllers is anything to go by) there is a lot of latency with a read, which will reduce any realistic effective rate measurement.
------------------------------------------
"If it don't work in simulation, it won't work on the board."
Re: Spartan-6 MCB performance
07-06-2012 05:05 AM
For video applications, which typically use very long sequential reads or writes, it is
possible to get very close to the theoretical maximum bandwidth of the memory.
As for skipping refresh, this is not an option for DDR memories according to the JEDEC
standard which guarantees a refresh at regular intervals in order for the memory to
update the internal DLL while most of the interface pins are idle. It is not clear that
many DRAM chips require this (Micron explicitly states they don't), but I'd be surprised
if the MCB designers would ignore the requirement.
-- Gabor
Re: Spartan-6 MCB performance
07-06-2012 01:42 PM
As for skipping refresh, this is not an option for DDR memories according to the JEDEC standard which guarantees a refresh at regular intervals in order for the memory to update the internal DLL while most of the interface pins are idle.
First, a semantic nit: I believe you intended to write that the JEDEC standard requires (rather than guarantees) refresh at regular intervals...
The "skipping" to which I refer is directed at the MCB logic, and not the memory device. Re-phrased:
Is the MCB design clever enough to realise when refreshes are redundant and unnecessary (and can be skipped), or will the MCB issue a refresh transaction every tREFI whether needed or not?
There are proven methods to explicitly schedule refresh activity when it will have the least impact on overall system performance. Is the Spartan-6 MCB supportive of such practices? To my understanding, this is a useful but unanswered question.
-- Bob Elkind
Re: Spartan-6 MCB performance
07-06-2012 01:49 PM
The "skipping" to which I refer is directed at the MCB logic, and not the memory device. Re-phrased:
Is the MCB design clever enough to realise when refreshes are redundant and unnecessary (and can be skipped), or will the MCB issue a refresh transaction every tREFI whether needed or not?
My point was that according to JEDEC, refreshes cannot be skipped - not because a row will lose its data,
but because the memory device needs periodic updates to the DLL circuit. This was not the case in the old single-
data-rate SDRAM's which had no local DLL. This requirement may have been removed for newer varieties of
DDR (DDR2 DDR3) but it is certainly in the original DDR memory JEDEC spec.
-- Gabor
Re: Spartan-6 MCB performance
07-06-2012 02:22 PM
My point was that according to JEDEC, refreshes cannot be skipped - not because a row will lose its data, but because the memory device needs periodic updates to the DLL circuit. This was not the case in the old single-data-rate SDRAM's which had no local DLL. This requirement may have been removed for newer varieties of DDR (DDR2 DDR3) but it is certainly in the original DDR memory JEDEC spec.
From a recent Micron 1Gbit DDR2 device datasheet:
The refresh period is 64ms (commercial) or 32ms (industrial and automotive). This equates to an average refresh rate of 7.8125μs (commercial) or 3.90625μs (industrial and automotive). To ensure all rows of all banks are properly refreshed, 8,192 REFRESH commands must be issued every 64ms (commercial) or 32ms (industrial and automotive).
In other words, the memory requires 8K refreshes every 64 ms, which is a much more flexible requirement than a single refresh every 7.8125 µs. The unanswered question stands: Does Spartan-6 MCB permit some or all of the flexibility allowed by the Micron memory device?
From a several-years-old Micron 1Gbit DDR3 device datasheet (notice the highlighted differences!):
The refresh period is 64ms when TC is less than or equal to 85°C. This equates to an average refresh rate of 7.8125μs. However, nine REFRESH commands should be asserted at least once every 70.3μs. When TC is greater than +85°C, the refresh period is 32ms. Although JEDEC specifies tREFI as a MAX, Micron allows REFRESH commands to be burst provided that the maximum refresh period is not violated.
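The refresh figures in the two datasheet excerpts can be checked with a few lines of arithmetic. This is a sketch rather than anything from the datasheets themselves: the 8,192-command count and the 64 ms and 32 ms periods come from the quotes, while `T_RFC_NS` is an assumed refresh cycle time that should be checked against the actual device datasheet.

```python
from fractions import Fraction

REFRESH_COMMANDS = 8192   # REFRESH commands per refresh period, per the quote


def average_interval_us(period_ms: int) -> Fraction:
    """Average spacing between REFRESH commands, in microseconds."""
    return Fraction(period_ms * 1000, REFRESH_COMMANDS)


commercial = average_interval_us(64)    # 64 ms refresh period
industrial = average_interval_us(32)    # 32 ms refresh period
nine_intervals = 9 * commercial         # the DDR3 "nine REFRESH commands" window

# Assumed value, not from this thread: time the device is busy per REFRESH.
T_RFC_NS = 127.5
overhead = T_RFC_NS / (float(commercial) * 1000)

print(f"commercial tREFI : {float(commercial)} us")
print(f"industrial tREFI : {float(industrial)} us")
print(f"9 x tREFI        : {float(nine_intervals)} us")
print(f"refresh overhead : {overhead:.2%} of bus time (with assumed tRFC)")
```

With the assumed tRFC, evenly spaced refreshes would take on the order of one to two percent of the available time, so the question of bursting them into idle periods is more about avoiding interruptions at inconvenient moments than about recovering a large share of bandwidth.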
-- Bob Elkind