Visitor pieterhuyghe
Visitor
9,453 Views
Registered: ‎11-16-2007

Spartan6 -> MCB Performance

Hi all,

 

For a design where I use a Spartan-6 LX100T, I have a problem with the maximum bandwidth. At FPGA start-up we write all the data to the RAM; after that we only have to read. The RAM clock is 200 MHz, but because they are DDR2 devices they effectively run at 400 MHz. I've generated a MIG v3.8 module with coregen; the cmd interface runs at 100 MHz and the data of course at 200 MHz. We read data from these RAMs and put it in a FIFO, which we read out at 150 MHz because we need a constant stream of data. The problem is that the interface to the MCB seems to be too slow. The picture below should make my problem clearer. Does anybody know how I can shorten the time between cmden and the arrival of actual data? Or is it not possible to produce a constant 150 MHz data stream with the MCB block in a Spartan-6?

ReadActionMIGv3_8_Spartan6LX100T.png
18 Replies
Explorer
Explorer
9,435 Views
Registered: ‎08-12-2011

Re: Spartan6 -> MCB Performance

Hi Pieter,

 

The architecture of consumer SDRAM (SDRAM, DDR2, DDR3) means that an SDRAM controller can be optimised in various ways:

1. Optimise for low latency of small random accesses.  This unfortunately leads to low bandwidth.

2. Optimise for high bandwidth.  This unfortunately leads to high latency.

3. Provide a balance of moderate bandwidth and moderate latency.

4. Optimise for a specific application - can give excellent results for memory masters with very specific access patterns.

 

I'm not very familiar with the S6 MCB, but I guess it falls into category 3.  If your application falls into category 4 then you can probably get better results with an optimised soft DRAM controller.  That may sound crazy, but it's true - a soft controller (i.e. implemented with ISERDES, OSERDES, LUTs and FFs) can outperform a hard macro like the MCB for specific types of memory access.

 

Ask yourself a few questions:

A. Am I happy to sacrifice bandwidth to achieve lower latency ? (or vice-versa)

B. Are my memory accesses extremely predictable - e.g. access address known far in advance, access size always the same, accesses are always Read-Modify-Write, each access is always in a different bank than the previous one, ... ?

In short, if there's anything atypical but consistent about your access pattern then an optimised soft controller may significantly outperform the MCB.

 

In a previous job I was able to reduce the latency of a DDR2 controller attached to a DDR2-333 x16 DRAM from 22 clock cycles down to 13 by dumping the general purpose controller and hand crafting an application optimised one.

 

I'm not quite sure from your post, but it sounds like you're asking for both sustained bandwidth and low latency.  This is generally not achievable, but if you can post your requirements in more detail I can tell you whether you're asking for the impossible or not.  The crucial information is:

* DRAM width - 4bit / 8bit / 16bit ?

* CAS latency - CL3 or CL4 ?

* required sustained bandwidth ?

* Access pattern - random addresses or sequential or ... ?

* Required transfer size - 32 bit / 64 bit / 128 bit / ... ?

* required latency ?

 

Best regards,

Stephen

 

Stephen Ecob

Silicon On Inspiration

Sydney Australia

www.sioi.com.au

 

Xilinx Employee
Xilinx Employee
9,424 Views
Registered: ‎10-23-2007

Re: Spartan6 -> MCB Performance

What is your external memory width - 4, 8, or 16 bits?  And what size internal port for the MCB did you choose?  32, 64, or 128 bit?

Visitor pieterhuyghe
Visitor
9,401 Views
Registered: ‎11-16-2007

Re: Spartan6 -> MCB Performance

Thanks for the replies, my feedback can be found below:

 

@eschabor: I know that we can use a soft DRAM controller; this was done in a Virtex-5. But we thought that because the MCB is a hardware controller it would be faster than a soft controller. If I read your explanation correctly this is not necessarily true, and a soft controller could actually be faster than the MCB blocks.

We're using DDR2 with a 16-bit interface and the interface in the MCB module is 32-bit, so this should be perfect for optimal performance.

 

@jspaldings: Like I said in the reply to eschabor we're using 16bit DDR2 and the MCB interface is 32bit. So should be perfect I think.

 

I'll now check whether I can run the DDR2 at a faster speed, because we fitted DDR2-800 on the boards but at the moment we're only using half of it.

Explorer
Explorer
9,388 Views
Registered: ‎08-12-2011

Re: Spartan6 -> MCB Performance

In most situations the MCB is the best solution - it is only in some specific use cases that soft DRAM controllers are better.  If you want the absolute lowest possible latency, or if you want to stream data at 98% of maximum bandwidth, then a soft controller is a better solution.

 

"16bit DDR2 and the MCB interface is 32bit. So should be perfect I think."

No, DDR2 has a minimum burst of 4, so the minimum efficient transfer size is 64 bits.  The use of 32 bit transfers immediately throws away 50% of your bandwidth.
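
The 50% figure is simple arithmetic. Here is a minimal sketch, assuming a x16 device, a DRAM burst length of 4, and that every request is served by whole DRAM bursts with the unused beats discarded. The function name is purely illustrative and is not part of any Xilinx tool.

```python
import math

def burst_efficiency(transfer_bits, dram_width_bits=16, dram_burst_len=4):
    """Fraction of DRAM data beats that carry requested data.

    Assumption: the DRAM always moves whole bursts, so a request smaller
    than one burst still occupies the data bus for the full burst.
    """
    bits_per_burst = dram_width_bits * dram_burst_len    # 16 * 4 = 64 bits
    bursts = math.ceil(transfer_bits / bits_per_burst)   # whole bursts only
    return transfer_bits / (bursts * bits_per_burst)

for size in (32, 64, 128):
    print(f"{size:3d}-bit transfer: {burst_efficiency(size):.0%} of the burst is useful")
```

Under these assumptions a 32-bit transfer uses half of a 64-bit burst, while 64-bit and 128-bit transfers use every beat.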

 

"I'll now look to check if I can run the DDR2 at a faster speed, because we stuffed DDR2 800 on the boards, but now we're only using half of it."

That would double your bandwidth and leave your latency around the same.

 

If you post your detailed requirements then we can provide more specific answers.

1. Required latency ?

2. Required bandwidth ?

3. Address predictability - are the addresses highly predictable or highly random ?

 

 

Visitor areslee
Visitor
9,368 Views
Registered: ‎06-29-2010

Re: Spartan6 -> MCB Performance

Hi, I have a similar problem. My Spartan-6 is connected to a 16-bit DDR2 running at 300 MHz. The read and write addresses are all highly predictable and in different banks. I want to read 16 32-bit words and write 16 32-bit words within 40 clocks. Can this be achieved using a simple soft controller?

Explorer
Explorer
9,366 Views
Registered: ‎08-12-2011

Re: Spartan6 -> MCB Performance

Yes, that is possible.  You could certainly do it with a simple soft controller and I'd guess that you could also do it with the Spartan 6 MCB.

 

A couple of caveats:

* Memory refresh will mean that occasionally an access will take significantly longer (roughly an extra 40 clocks).  Refreshes happen every 8 microseconds or so

* If you have back to back accesses that are in the same bank but different pages then you'll have an extra delay of around 20 clock cycles
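
A rough clock budget for this request can be sketched as follows. The assumptions are mine: the "40 clocks" are DRAM clocks at 300 MHz, a x16 DDR device moves 32 bits per clock, and the refresh figures are the approximate ones quoted above (about 8 microseconds between refreshes, about 40 extra clocks each).

```python
DRAM_CLOCK_MHZ = 300                      # from the question
DRAM_WIDTH_BITS = 16                      # from the question
BITS_PER_CLOCK = DRAM_WIDTH_BITS * 2      # DDR: two data beats per clock

def data_clocks(words, word_bits=32):
    """Clocks during which the data bus is busy moving `words` words."""
    return words * word_bits // BITS_PER_CLOCK

busy = data_clocks(16) + data_clocks(16)  # 16 words read + 16 words written
budget = 40                               # assumed to be DRAM clocks
slack = budget - busy                     # left for activate, CAS latency, turnaround

# Microseconds times MHz gives clocks: roughly one refresh every 2400 clocks.
refresh_interval_clocks = 8 * DRAM_CLOCK_MHZ
refresh_overhead = 40 / refresh_interval_clocks

print(f"data bus busy for {busy} of {budget} clocks, {slack} clocks of slack")
print(f"refresh costs about {refresh_overhead:.1%} of the available clocks")
```

With these numbers the data alone occupies 32 of the 40 clocks, so the remaining 8 clocks have to absorb command overhead and the read/write bus turnaround.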

 

Visitor pieterhuyghe
Visitor
9,351 Views
Registered: ‎11-16-2007

Re: Spartan6 -> MCB Performance

So we have two DDR2 chips and the MIG has 8 ports (four 32-bit read ports and four 32-bit write ports). Four of them are used for an OSD (this has a low refresh rate and doesn't use much bandwidth). Of the other four ports we mostly use the two read ports; as I mentioned earlier, we write to the DDR2 once, after that we only read, and we want a continuous stream at 150 MHz.

The addresses are highly predictable; they are always counting up and there are no random switches.

I also saw that I in fact have some bidirectional ports left on the MCB, so maybe I could use two of those 32-bit ports as read-only ports. That would give me four read ports to make a continuous stream at 150 MHz, and I would get rid of the latency because I could read from two different FIFOs (there would then be four 32-bit read ports, and I want to have a 64-bit interface). But I still think I would have to increase the DDR2 clock to get enough bandwidth. The advantage of the extra read ports would be that I would need to increase the clock less. Correct?

Explorer
Explorer
9,337 Views
Registered: ‎08-12-2011

Re: Spartan6 -> MCB Performance

Two 16-bit DRAMs operating at DDR400 transfer rates give you a maximum theoretical bandwidth of 1.6 GB/s.  You say that "The addresses are highly predictable, there are always counting up and no random switches", so a real-world bandwidth of 70-80% of the theoretical maximum should certainly be possible, i.e. > 1 GB/s.

 

You want to have a "continuous stream at 150MHz".  I'm guessing that you mean 32 bits at 150 MHz, which means 600 MB/s. That should certainly be doable given the available peak bandwidth and predictable addressing.
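
The two figures above can be reproduced with a few lines. This is only a sketch: the 32-bit stream width is the guess stated above and has not been confirmed by Pieter, and the helper name is made up for illustration.

```python
def peak_bandwidth_mb_s(num_drams, width_bits, transfers_mt_s):
    """Theoretical peak: devices x bytes per transfer x mega-transfers per second."""
    return num_drams * (width_bits // 8) * transfers_mt_s

peak = peak_bandwidth_mb_s(num_drams=2, width_bits=16, transfers_mt_s=400)
stream = (32 // 8) * 150        # assumed 32-bit words at 150 MHz
utilisation = stream / peak

print(f"peak {peak} MB/s, stream {stream} MB/s, utilisation {utilisation:.1%}")
```

A required utilisation below 40% leaves a wide margin against the 70-80% that sequential accesses should be able to reach.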

 

You ask about the effect of using more of the MCB ports.  It shouldn't make much difference - the DRAM bandwidth is shared across the MCB ports, so using fewer or more of them doesn't change the latency or bandwidth of the DRAM.

 

I'm ignorant of the MCB; it may be that you can attain your goal by modifying the configuration of your MCB.  I'd be very surprised if the MCB couldn't handle 600 MB/s of contiguous accesses from 1.6 GB/s peak bandwidth.

 

As far as soft controllers go: yes, I'm sure it can be done - as long as I've interpreted your statements correctly (150MHz means 600MB/s etc). 

 

Regards,
Stephen

 

 

Instructor
Instructor
9,333 Views
Registered: ‎07-21-2009

Re: Spartan6 -> MCB Performance

Stephen,

 

I would analyse the bandwidth balance a bit differently -- of course Pieter must be the judge, and Pieter has not provided all useful detail.

 

Here are some alternate assumptions and derivations:

 

Typically, a video buffer must record and output video concurrently (unless it is a still frame grabber).  So video rate to and from the DRAM buffer is 300 MPixels/sec rather than 150 MPixels/sec.  This roughly corresponds to the pixel data rate for 1080P/60 video.

 

Typical switching suite (rather than editing suite) video is 3 channels per pixel (Y/U/V or G/B/R), at 8 bits per channel.  In other words, 3 bytes per pixel.  Now the aggregate video buffer bandwidth is 900 MBytes/sec. [note, for an editing suite device, a 4th channel -- alpha channel -- is a requirement].

 

150MHz is a very reasonable fabric clock rate for Spartan-6 devices.  At 150MHz fabric clock frequency and 900 MB/sec sustained video data bandwidth, a 4-byte (32-bit) port for each of two MCBs would be needed.  In other words, the fabric can feed data to each MCB at up to 600 MBytes/sec, for a total bandwidth of 1.2 GBytes/sec.

 

Each DRAM must be able to provide sustained data bandwidth of 450 MBytes/sec.  At DDR2-300, peak bandwidth is 600 MBytes/sec for each DRAM.  Considering the sequential access pattern of video, 25% bandwidth de-rating (from 600 MB/sec to 450MB/sec) is very conservative.  It shouldn't hurt that the fabric video clock (150MHz) is the same frequency as the DRAM clock, which is exactly half the frequency of the MCB clock.  Note that customary minimum clock frequency for DDR2 devices is 125MHz, so the 150MHz DRAM clock is well within DRAM device spec.

 

Here is a quick summary:

 

  • Video pixel:  3 bytes
  • Video buffer store bandwidth:  150 Mpixels/sec, 450 MBytes/sec
  • Video buffer readout bandwidth:  150 Mpixels/sec, 450 MBytes/sec
  • MCB (each of two) video port read/write peak bandwidth:  600 MBytes/sec (150 MHz fabric clock, 4 byte port width)
  • MCB (total both MCBs) video port read/write peak bandwidth:  1.2 GBytes/sec
  • DRAM clock:  150MHz
  • DRAM (each of two) peak read/write bandwidth: 600 MBytes/sec
  • DRAM (total both DRAMs) peak read/write bandwidth:  1.2 GBytes/sec

 

Overall, this seems fairly well balanced and conservative.
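
The budget in the summary can be checked with a short calculation. Every input below comes from the assumptions listed above (3 bytes per pixel, 150 Mpixels/sec in each direction, two MCBs each with a 4-byte port and one x16 DRAM at a 150 MHz clock); none of them has been confirmed by Pieter.

```python
BYTES_PER_PIXEL = 3
PIXEL_RATE_MPIX = 150        # Mpixels/sec, each direction
FABRIC_CLOCK_MHZ = 150
PORT_BYTES = 4               # 32-bit MCB user port
NUM_MCBS = 2                 # one DRAM per MCB
DRAM_WIDTH_BYTES = 2         # x16 device
DRAM_CLOCK_MHZ = 150         # DDR: two transfers per clock

store = PIXEL_RATE_MPIX * BYTES_PER_PIXEL            # MB/s written
readout = PIXEL_RATE_MPIX * BYTES_PER_PIXEL          # MB/s read
required_per_dram = (store + readout) / NUM_MCBS     # MB/s per MCB/DRAM pair

port_peak = FABRIC_CLOCK_MHZ * PORT_BYTES            # MB/s per MCB port
dram_peak = DRAM_CLOCK_MHZ * 2 * DRAM_WIDTH_BYTES    # MB/s per DRAM
derating = 1 - required_per_dram / dram_peak

print(f"required per DRAM: {required_per_dram:.0f} MB/s")
print(f"port peak: {port_peak} MB/s, DRAM peak: {dram_peak} MB/s")
print(f"allowed derating: {derating:.0%}")
```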

 

In a perfect world, odd video lines are stored in one MCB/DRAM and even video lines are stored in the other MCB/DRAM.  While an even video line is fetched from one MCB/DRAM, an odd video line is written to the other MCB/DRAM.  This simplifies the video line access pattern, avoiding interference (addressing, DQ bus turnaround) between the two access ports which would otherwise degrade bandwidth efficiency.  All 'miscellaneous' video buffer accesses (e.g. OSD) should be fitted in the horizontal video blanking intervals, avoiding interference with the active video line accesses.
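
The odd/even split can be sketched as a simple mapping. This is an illustration only: the MCB numbering is arbitrary, and it assumes the line being written is always one line ahead of the line being read, so the two are guaranteed to have opposite parity.

```python
def mcb_for_line(line):
    """Even video lines live in MCB 0, odd video lines in MCB 1."""
    return line % 2

def line_schedule(first_read_line, num_lines):
    """Pair each line being read with the line being written at the same time."""
    plan = []
    for i in range(num_lines):
        read_line = first_read_line + i
        write_line = read_line + 1      # assumed offset of one line
        plan.append((read_line, mcb_for_line(read_line),
                     write_line, mcb_for_line(write_line)))
    return plan

for read_line, read_mcb, write_line, write_mcb in line_schedule(0, 4):
    print(f"read line {read_line} from MCB {read_mcb}, "
          f"write line {write_line} to MCB {write_mcb}")
```

With any odd offset between the read and write lines, the two streams never land on the same MCB during the same line time.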

 

Having said all this, I would make one last change.  I would ratchet the DRAM clock from 150MHz up to 300MHz.  Here are the reasons why:

 

  • Even -2 speed grade Spartan-6 devices are rated for (better than) DDR2-600 performance (i.e. 300MHz DRAM clock).
  • There is nothing gained by running the DRAM at a lower clock frequency.
  • DRAM power consumption is essentially the same at either 150MHz or 300MHz, because cycle bandwidth is unchanged.
  • Slightly shorter access latency due to higher frequency MCB clock.

None of the 'reasons' listed above are very compelling.  The reason for doubling the DRAM clock frequency is, quite simply, nothing more than 'because we can'.

 

There...  I've told you (and Pieter) everything I know... and more.  Keep in mind that there are some major design assumptions which have been made -- but not confirmed -- in each of our schemes.

 

-- Bob Elkind

SIGNATURE:
README for newbies is here: http://forums.xilinx.com/t5/New-Users-Forum/README-first-Help-for-new-users/td-p/219369

Summary:
1. Read the manual or user guide. Have you read the manual? Can you find the manual?
2. Search the forums (and search the web) for similar topics.
3. Do not post the same question on multiple forums.
4. Do not post a new topic or question on someone else's thread, start a new thread!
5. Students: Copying code is not the same as learning to design.
6 "It does not work" is not a question which can be answered. Provide useful details (with webpage, datasheet links, please).
7. You are not charged extra fees for comments in your code.
8. I am not paid for forum posts. If I write a good post, then I have been good for nothing.
Explorer
Explorer
7,280 Views
Registered: ‎08-12-2011

Re: Spartan6 -> MCB Performance

Bob, your analysis is good - but I think there's two of us providing answers and no further questions from Pieter!

Stephen

 

Instructor
Instructor
7,277 Views
Registered: ‎07-21-2009

Re: Spartan6 -> MCB Performance


@eschabor wrote:

Bob, your analysis is good - but I think there's two of us providing answers and no further questions from Pieter!

Stephen

 


Stephen, I concur with your analysis of the analyses!

 

-- Bob Elkind

Participant rifo
Participant
7,241 Views
Registered: ‎09-20-2010

Re: Spartan6 -> MCB Performance


@eschabor wrote:

"16bit DDR2 and the MCB interface is 32bit. So should be perfect I think."

No, DDR2 has a minimum burst of 4, so the minimum efficient transfer size is 64 bits.  The use of 32 bit transfers immediately throws away 50% of your bandwidth.



Hello,

 

Shouldn't the internal mechanics of the MCB handle this so that bandwidth is not wasted? For example if I completely  fill a MCB port Fifo (configured in 32 bit) do I automatically lose half of my bandwidth?

 

thanks a lot for your help

rifo

Instructor
Instructor
7,238 Views
Registered: ‎07-21-2009

Re: Spartan6 -> MCB Performance

@rifo

 

Shouldn't the internal mechanics of the MCB handle this so that bandwidth is not wasted? For example if I completely  fill a MCB port Fifo (configured in 32 bit) do I automatically lose half of my bandwidth?

 

For example, Stephen (eschabor) is saying:  'For maximum efficiency, request data transactions which are a multiple of the fundamental DRAM transaction (e.g. 64 bits in the case of 16-bit DDR2 with DRAM burst length of 4).'

 

Even if you fill an MCB port FIFO, you must tell the MCB how much of the FIFO data to use for each memory transaction from one of the MCB user (fabric) ports.  You explicitly convey this information by way of the pX_cmd_bl[5:0] user-interface MCB signal port.  See UG388, search for 'cmd_bl' for additional information.

 

Make sure you understand the difference between user-port 'burst length' (how many FIFO words are to be read or written) and DRAM device-level 'burst length'.  These are two very different attributes which sound deceptively alike.  One is described in UG388 as a part of the MCB interface.  The other is described in the DRAM device datasheet.
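
To make the distinction concrete, here is a small sketch relating the two burst lengths. It assumes a 32-bit user port, a x16 DRAM with a device burst length of 4, and that the 6-bit pX_cmd_bl field encodes the number of user words minus one (1 to 64 words). That encoding is an assumption here and should be checked against UG388. The function names are hypothetical.

```python
import math

def cmd_bl_value(user_words):
    """Value for the 6-bit command burst-length field (assumed: words - 1)."""
    if not 1 <= user_words <= 64:
        raise ValueError("user burst must be 1..64 words")
    return user_words - 1

def dram_bursts(user_words, port_width_bits=32, dram_width_bits=16, dram_burst_len=4):
    """Number of device-level bursts needed to carry one user command."""
    bits = user_words * port_width_bits
    return math.ceil(bits / (dram_width_bits * dram_burst_len))

for words in (1, 2, 16, 64):
    print(f"user burst of {words:2d} words: cmd_bl = {cmd_bl_value(words):2d}, "
          f"DRAM bursts = {dram_bursts(words)}")
```

Under these assumptions a single 32-bit word still costs one full DRAM burst, whereas 16 words map onto exactly 8 DRAM bursts with no wasted beats.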

 

-- Bob Elkind

Participant rifo
Participant
7,228 Views
Registered: ‎09-20-2010

Re: Spartan6 -> MCB Performance

thanks a lot Bob,

 

I misunderstood what Stephen was saying but it's clear now.

7,037 Views
Registered: ‎05-15-2012

Re: Spartan6 -> MCB Performance

Hi all,

 

I am testing a Spartan6 design using x16 DDR2 SDRAM and it appears to have very poor performance.

 

I'm issuing write commands with bursts of 16 x 32-bit data words and read commands of 16 x 32-bit words. My data access pattern is sequential, i.e. I'm using the SDRAM as a really deep FIFO, so reads and writes are to different pages.

 

The sustained data throughput is 43% of peak data throughput. i.e. for 400MHz DDR I can not read and write data at 100MHz x 32 bits.
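
How demanding this is depends on what "400MHz DDR" means, which the post does not say. The sketch below works out the required fraction of peak bandwidth under both readings (400 MHz clock, i.e. 800 MT/s, or 400 MT/s), assuming a x16 device and that a deep FIFO writes and reads every word once. The function name is illustrative only.

```python
def required_fraction(transfers_mt_s, dram_width_bits=16, port_mhz=100, port_bits=32):
    """Share of peak DRAM bandwidth needed to write and read the stream."""
    peak = transfers_mt_s * dram_width_bits // 8    # MB/s
    demand = 2 * port_mhz * port_bits // 8          # MB/s, write plus read
    return demand / peak

for label, rate in (("400 MHz clock (800 MT/s)", 800), ("400 MT/s", 400)):
    print(f"{label}: needs {required_fraction(rate):.0%} of peak")
```

Under the first reading the stream needs 50% of peak, which is just above the 43% that was measured. Under the second it would need 100%, which no controller could sustain.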

 

This seems to be very poor performance.

 

Does anyone have any idea if this is typical performance for the MCB? Do I have to change my interface to use the 64bit or 128bit interfaces to the MCB?  Is there any other way to improve this?

 

 

Best Regards, Thomas D.

--


Adventurer
Adventurer
6,987 Views
Registered: ‎06-26-2008

Re: Spartan6 -> MCB Performance

I'd like to see if anyone out there has some measurable data on this.

 

For example, on a single DDR2 16bit width chip, using a 32 bit write MCB port.  If I fill the fifo of the MCB write port and send a command to the MCB with a user burst length of 64 (the max length), then I automatically lose half of my bandwidth on that transaction based solely on my configured port width?  It sounds like a terrible design if this is true!  I would think that since the transaction is on a continuous stretch of address, the MCB controller would optimize it so that the DRAM bursts would not waste bandwidth.

 

Did I misunderstand something, or was the optimization implied in the response?

I am having trouble achieving write bandwidth (read is perfectly fine).  Has anyone seen an improvement in performance by increasing the port clocks (i.e. p0_wr_clk, p1_rd_clk, etc.)?

 

Thanks,

-J

Instructor
Instructor
6,985 Views
Registered: ‎07-21-2009

Re: Spartan6 -> MCB Performance

J,

 

Please start a new thread for this discussion subject.

 

-- Bob Elkind
