Advisor ronnywebers
Advisor
422 Views
Registered: ‎10-10-2014

does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

I plan to develop an IP that uses lookup tables in DDR. 

I'm not a DDR expert at all, but I understand that it has a high throughput, but has its limitations when being used as random access memory.

I think I understand that there's a variable latency involved when accessing the DDR, i.e. whether it needs to switch banks or not, whether another access is ongoing, and so on.

The other question I have is whether a refresh in the DDR would 'block' the PL (or any other master) from accessing the DDR, or does it happen 'transparently' or in the background such that no master suffers from a refresh? Let's say the PL wants to access a region in the DDR that is at that moment being refreshed, does that 'stall' the access until the refresh is completed? If so, how long does a typical refresh take, or where is this controlled/set?

** kudo if the answer was helpful. Accept as solution if your question is answered **

10 Replies
Scholar dgisselq
Scholar
404 Views
Registered: ‎05-21-2015

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

@ronnywebers,

SDRAM is organized into rows, columns, and banks.  Each bank has a charging location where a given row can be read into and charged.  Once it is in this charging location the information in that row can be read or written.  A column address is used to do this.

A general SDRAM access, once the protocol has been handled, looks like 1) a PRECHARGE command to remove whatever row is in the charging position, 2) an ACTIVATE command to move a row from the memory to the charging location, followed by either a 3) READ or WRITE command to actually move the data.

How the memory core handles these accesses is up to the controller.  For example, once a row has been activated, it can stay active for a while without needing to be deactivated.  Hence, you can issue multiple READs or WRITEs to a row that has been activated.  Similarly, you can have a row activated on each of the eight banks, and it is up to the controller again to choose which row will be activated, and when it will be deactivated (PRECHARGEd).
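The PRECHARGE / ACTIVATE / READ-or-WRITE sequence described above can be sketched as a tiny row-buffer model. This is only an illustration of the idea, not how any real controller is written: the class names `Bank` and `Sdram` and the method `access` are made up for this example, and all timing is ignored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Bank:
    # Row currently held in this bank's "charging location" (None = precharged).
    open_row: Optional[int] = None


@dataclass
class Sdram:
    # Eight banks, as mentioned in the post above.
    banks: Dict[int, Bank] = field(
        default_factory=lambda: {b: Bank() for b in range(8)}
    )

    def access(self, bank: int, row: int, op: str = "READ") -> List[str]:
        """Return the command sequence needed for one access (hypothetical model)."""
        b = self.banks[bank]
        cmds: List[str] = []
        if b.open_row != row:
            if b.open_row is not None:
                cmds.append("PRECHARGE")  # remove the row that is currently open
            cmds.append("ACTIVATE")       # bring the requested row in
            b.open_row = row
        cmds.append(op)                   # READ or WRITE to the open row
        return cmds


mem = Sdram()
first = mem.access(bank=0, row=5)              # idle bank
hit = mem.access(bank=0, row=5)                # same row is still open
miss = mem.access(bank=0, row=9, op="WRITE")   # different row in the same bank
other = mem.access(bank=1, row=9)              # a different bank is still idle
print(first, hit, miss, other)
```

An access to a row that is already open needs only the READ or WRITE, which is why keeping rows open helps sequential traffic and does little for fully random traffic.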

A common AXI misperception of SDRAM memory is that both READ and WRITE can take place at the same time.  This is not the case.  The memory data wires are shared between READ and WRITE, so one direction must stall for the other to take place.

A certain number of REFRESH commands also need to be sent to the memory every second.  Typically every row has to be refreshed within 64 ms, which for DDR3 works out to 8192 REFRESH commands per 64 ms window, or roughly one every 7.8 µs on average.  The bigger the memory die, the longer each REFRESH takes.  To send a REFRESH command, the controller must first issue a PRECHARGE-ALL banks command.  Once issued, no READs or WRITEs are possible.  Once the PRECHARGE-ALL command is complete, the REFRESH command may be issued.  The number of cycles between the two is memory dependent.  Similarly, once a given REFRESH command has been issued, the memory cannot be accessed for several cycles while the REFRESH is taking place.  Once complete, the memory is returned to service and the controller may now issue an ACTIVATE row command, followed by the READ or WRITE that is requested from the bus.

There's a lot of room for configurability in here.  For example, some memory chips allow you to send multiple REFRESH cycles at a time.  So, after issuing the REFRESH, if nothing is requesting the access, the controller may (after a minimum time period) issue another REFRESH.  Typically, I've seen memory controllers that will allow up to 16 early REFRESH cycles.  In other words, if the controller sends 16 REFRESH cycles, then it can go 16 REFRESH intervals without issuing another REFRESH command.

Another possible location for configurability regards how long a row is kept activated.  It doesn't really make sense, for example, to PRECHARGE (deactivate) a row that is in use, allowing back to back READ commands.  However in a fully random access scenario, this would mean that the row would need to be PRECHARGEd and a new row ACTIVATEd between every READ/WRITE request.  The alternative would be to issue a PRECHARGE command any time the bus becomes idle, and then issue an ACTIVATE followed by the READ/WRITE command upon any access.
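To make the trade-off between the two policies concrete, here is a rough sketch that counts cycles to first data for a series of reads to a single bank. The cycle counts `T_RP`, `T_RCD` and `T_CL` are illustrative placeholders rather than values from a datasheet, the function `read_latency` is hypothetical, and the "closed" policy assumes the PRECHARGE is always hidden in idle time between requests.

```python
# Illustrative cycle counts only (assumption, not taken from any datasheet).
T_RP = 11    # PRECHARGE to ACTIVATE
T_RCD = 11   # ACTIVATE to READ
T_CL = 11    # READ to first data


def read_latency(rows, policy):
    """Sum of cycles to first data for reads to one bank, under a page policy."""
    open_row = None
    total = 0
    for row in rows:
        if policy == "open":
            if open_row == row:
                total += T_CL                  # row hit: READ only
            elif open_row is None:
                total += T_RCD + T_CL          # idle bank: ACTIVATE, READ
            else:
                total += T_RP + T_RCD + T_CL   # conflict: PRECHARGE, ACTIVATE, READ
            open_row = row
        elif policy == "closed":
            # Bank was precharged while the bus was idle, so only ACTIVATE + READ.
            total += T_RCD + T_CL
        else:
            raise ValueError(f"unknown policy: {policy}")
    return total


same_row = [3, 3, 3, 3]      # back-to-back reads to one row
random_rows = [3, 7, 1, 4]   # every read lands in a different row

for name, rows in (("same row", same_row), ("random rows", random_rows)):
    print(name, read_latency(rows, "open"), read_latency(rows, "closed"))
```

Under these assumptions the open-row policy wins for repeated accesses to one row, while precharging on idle wins when every access lands in a different row.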

Similarly, it takes fewer clock cycles to WRITE to the device than it takes to READ from it, since during a WRITE (once the ACTIVATE has taken place), the data may be placed on the data wires immediately.  This also means, though, that WRITEs must complete before READs may be issued, so as to avoid contention on the bus.  READs on the other hand may be issued immediately after WRITE commands.  On the data lines themselves, once the WRITE completes, the wires must become idle before the READ can take place.

Part of the trick of a DDRx SDRAM controller lies in the timing to and from the chip.  The controller needs to send a clock to the chip clocking all operations.  This clock is required to be delayed 90 degrees from the rest of the logic.  There's also an enable line, so a power conscious controller can disable the clock, although it takes a clock or two to re-enable it before commands can be issued again.  The second part of the timing lies in the DQS wires.  These are bidirectional.  During a WRITE, the controller drives them so that they transition in the middle of each bit.  During a READ, the memory drives them so that they transition at the beginning and end of every bit, aligned with the data edges.  The challenge of the controller is to somehow lock a PLL to these DQS wires, since they contain the actual delay from the chip.  (Did I mention that there was one DQS wire for every 8 data wires?)

The Xilinx controller, as I understand, requires a minimum number of READ commands from the memory in order to keep a PLL locked to the various DQS streams.  This periodic READ command will also take your controller off line, assuming you have not been issuing your own READ commands.

The whole thing is a rather complex protocol, but hopefully this helps you understand the choices that take place within the controller a bit better.

Dan 

Moderator
Moderator
366 Views
Registered: ‎11-28-2016

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

Hello @ronnywebers ,

The PL DDR3/DDR4 controller will Precharge all the banks prior to issuing the refresh command and the back pressuring behavior depends on the state of the DDR controller at that point.  If the controller is fully queued up with commands when the scheduled refresh must occur then you'll see the app_rdy deassert if using the native interface.  If using the AXI option then you'll see the AXI xREADY signals deassert.  If there are still command slots available you may be able to issue a few read commands before you'll see the controller backpressure.  Overall this is something you can simulate with the Example Design.  Keep in mind the refresh period tRFC is determined by the die density of your memory.
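The backpressure behaviour described above can be pictured with a very rough cycle model. It is a toy sketch and not the behaviour of the actual Xilinx controller: the queue depth of 4, the one-command-per-cycle rates, the refresh window and the function name `stalled_cycles` are all assumptions made for illustration. The `ready` flag plays the role of `app_rdy` (or the AXI xREADY signals).

```python
def stalled_cycles(total_cycles, refresh_window, queue_depth=4):
    """Return the cycles in which the master is held off (toy model)."""
    queued = 0
    stalled = []
    for cycle in range(total_cycles):
        ready = queued < queue_depth   # stands in for app_rdy / xREADY
        if ready:
            queued += 1                # master issues one command this cycle
        else:
            stalled.append(cycle)      # master has to wait
        # The controller retires one command per cycle, except during refresh.
        if cycle not in refresh_window and queued > 0:
            queued -= 1
    return stalled


# Refresh occupies cycles 4..11; the queue absorbs the first few commands.
print(stalled_cycles(16, range(4, 12)))                  # [8, 9, 10, 11, 12]
# A deeper (hypothetical) queue hides the same refresh from the master.
print(stalled_cycles(16, range(4, 12), queue_depth=16))  # []
```

In this model the master keeps issuing commands for a few cycles after the refresh starts and only stalls once the queue is full, which mirrors the "few read commands" mentioned above.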

Advisor ronnywebers
Advisor
345 Views
Registered: ‎10-10-2014

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

thanks @dgisselq for the very detailed explanation - looks like there are many factors to be considered indeed ...

as @ryana suggests, simulation would probably give me a better idea ... however ... when I read this, I saw that my original question is missing one detail: I'd like to use the DDR3/DDR4 memory connected to the PS of my Zynq / Zynq US+, and read/write data in DDR through the AXI ports. My current hardware board only has DDR memory connected to the PS. (But it's still interesting for me to check the PL DDR controller example.)

So sorry for that - I should rephrase my question maybe : let's say DDR3 on my Zynq board runs at 1,066 MT/s (two Micron 256M x 16-bit DDR3L memory components creating a 256M x 32-bit interface, totaling 1 GB of random access memory)

If my PL code runs at 100MHz, that is 10x slower than the DDR, and loads small blocks (like 64 bytes, 256 bytes, 1k), where the block start address will be random between block reads, will it suffer a lot from the fact that it's DDR? This is probably much more difficult to simulate, and better to measure with an ILA? It will also depend a lot on the activity of the dual A9 controllers etc. 

Guess my question is hard to answer, and that some kind of profiling is my only real option? 

 

Mentor watari
Mentor
339 Views
Registered: ‎06-16-2013

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

Hi @ronnywebers 

 

Why don't you use "self refresh" to improve the bandwidth issue and reduce power consumption?

I guess that using the "self refresh" command in your design is the best way...

 

Best regards,

Advisor ronnywebers
Advisor
333 Views
Registered: ‎10-10-2014

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

@watari , not sure how to do that ... I actually use a Picozed board for this purpose, and I use the default config that comes with the board... It has 1 GB of DDR3 connected to the PS (see previous post), not sure if they configured self-refresh.

I think even with self-refresh enabled: when I 'accidentally' request data from an address that is in the process of being refreshed, my (AXI) read request will be stalled I guess... just wondering how long this would (typically) be.

In essence, I want to use a part of the PS DDR memory as a LUT from my PL. My PL would have an AXI master interface, and get data through that interface from the DDR. (PS will initialize the LUT on power-up). Of course the PS will also use the DDR for its own firmware, as the cache will be too small to run from. In my PL I receive external events, let's say every 100 us. Upon such an event, I need to get a block of data (somewhere between 64 bytes and 1k bytes) from the PS DDR. Each subsequent block can start at a random address, so I cannot 'cache' a larger block in the meantime to get more real-time behaviour.
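As a back-of-the-envelope check of the numbers in this thread (1,066 MT/s on a 32-bit interface, up to 1k bytes every 100 us), the required bandwidth can be compared with the theoretical peak. This sketch ignores all protocol overhead, refresh, and traffic from the processors, so it is only an upper-bound style estimate, and the variable names are made up here.

```python
# Figures taken from the posts in this thread.
TRANSFER_RATE = 1_066e6   # transfers per second (1,066 MT/s)
BUS_BYTES = 4             # 256M x 32-bit interface
EVENT_PERIOD_S = 100e-6   # one external event every 100 us
BLOCK_BYTES = 1024        # largest block mentioned (1k bytes)

peak_bw = TRANSFER_RATE * BUS_BYTES          # theoretical peak, bytes per second
needed_bw = BLOCK_BYTES / EVENT_PERIOD_S     # bytes per second actually required
utilisation = needed_bw / peak_bw            # fraction of the theoretical peak
burst_time_ns = BLOCK_BYTES / peak_bw * 1e9  # pure data-transfer time for one block

print(f"peak:      {peak_bw / 1e6:.0f} MB/s")
print(f"needed:    {needed_bw / 1e6:.2f} MB/s")
print(f"fraction:  {utilisation:.2%}")
print(f"1k burst:  {burst_time_ns:.0f} ns of data transfer")
```

Under these assumptions the lookup traffic is a fraction of a percent of the peak, so the open question is latency per access rather than total bandwidth.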

Moderator
Moderator
318 Views
Registered: ‎11-28-2016

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

Hello @ronnywebers ,

Thanks for providing the additional details.  The question is how time sensitive is your workload and how well your system can 'survive' a refresh period when the memory is not accessible.  tRFC for a 4Gbit DDR3 is 260ns, add in some extra time for the precharges and activates, and you're looking at around 320ns when you may not be able to access the memory.  From an overall efficiency standpoint random addressing like you describe is the worst case scenario and that's made worse by having short accesses like 64-bytes.  Still it sounds like your total bandwidth requirement for this application is low and in total there will be plenty of memory bandwidth available but your access pattern isn't ideal.  Unless you have some unreasonably strict latency requirements I don't see any issues with your total memory bandwidth.  If you want to simulate this then the best path is the Zynq UltraScale+ Verification IP in DS941:
https://www.xilinx.com/support/documentation/ip_documentation/zynq_ultra_ps_e_vip/v1_0/ds941-zynq-ultra-ps-e-vip.pdf
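To put the 260 ns and 320 ns figures in perspective, the short calculation below converts them into PL clock cycles and into a fraction of time lost to refresh. The tRFC and stall values and the 100 MHz PL clock come from this thread; the 64 ms refresh window and 8192 REFRESH commands per window are assumed here as the usual DDR3 values at normal temperature, so check the datasheet of the actual part.

```python
T_RFC_NS = 260.0          # tRFC for a 4 Gbit DDR3, from the reply above
STALL_NS = 320.0          # estimate above, including precharges and activates
PL_CLOCK_MHZ = 100.0      # PL clock mentioned earlier in the thread

# Assumptions (not stated in the reply): standard DDR3 refresh scheme.
REFRESH_WINDOW_MS = 64.0
REFRESH_COMMANDS = 8192

# Average interval between REFRESH commands, in ns.
t_refi_ns = REFRESH_WINDOW_MS * 1e6 / REFRESH_COMMANDS
# Worst-case refresh stall expressed in PL clock cycles.
stall_pl_cycles = STALL_NS * PL_CLOCK_MHZ / 1e3
# Share of time the memory is unavailable because of refresh.
refresh_fraction = STALL_NS / t_refi_ns

print(f"refresh interval: {t_refi_ns:.1f} ns")
print(f"stall:            {stall_pl_cycles:.0f} PL cycles at 100 MHz")
print(f"time in refresh:  {refresh_fraction:.1%}")
```

With these numbers a request that collides with a refresh waits on the order of a few tens of PL clock cycles, which is small compared with a 100 us event period.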

Hello @watari ,

Self Refresh is a low power state for the DDR memory in which you can't perform any accesses.  The idea is to save power when you have a system that intermittently needs to use the DRAM.  Self-Refresh saves the memory contents while in a low power state and then can quickly return to an active state to service the needs of the system.  This is a solution for a different problem than the one we're looking at here.  The PL side has a User Refresh option in which you can schedule your own refreshes but that doesn't exist in the PS.

View solution in original post

Advisor ronnywebers
Advisor
302 Views
Registered: ‎10-10-2014

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

thanks @ryana  for the detailed explanation! indeed my requirements are rather 'low' at the moment.

I'll try to simulate it with the VIP anyway, it will give me a better understanding I guess of the way this works.

 

Advisor ronnywebers
Advisor
294 Views
Registered: ‎10-10-2014

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

@ryana can I check one last thing with you? If you prefer, I can create a new post for this. But the question is short:

To access the DDR3 from the PL, can I create a custom IP (Vivado -> Tools -> Custom IP) with an AXI master interface, which I then connect to for example the Zynq's AXI HP or GP slave ports? Will this allow me to access the DDR? I have a colleague who told me I would need an AXI CDMA for this, and control this from the PL ... but I don't see why a simple AXI master IP would not be able to do the same, as a DMA is also acting as an AXI master ... or am I wrong?

Moderator
Moderator
289 Views
Registered: ‎11-28-2016

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

Hello @ronnywebers ,

Basically yes.  You'll have to configure the Zynq MPSoC block in your block design and make sure you enable an AXI Slave Port for your custom master, of course configure the DDR controller, and then assign your memory map.  You can check out UG1085 for some more details in the PS-PL AXI Interfaces Section.
https://www.xilinx.com/support/documentation/user_guides/ug1085-zynq-ultrascale-trm.pdf

Mentor watari
Mentor
271 Views
Registered: ‎06-16-2013

Re: does a DDR3 or DDR4 refresh cycle stall a PL master for a while when it wants to access the same memory region that is being refreshed?

Hi @ryana , @ronnywebers 

 

Of course I know that the primary purpose is to reduce power consumption, and @ryana mentioned the behaviour (a quick return to an active state on the DRAM).

However, self refresh with partial array self refresh (PASR) reduces the area that is refreshed.

So it reduces the penalty for accessing the DRAM.

 

Also, I know that almost no DRAM controller supports self refresh for the purpose I explained.

If you design your own DRAM controller, this might give some extra benefit.

 

Best regards,