hbucher
Scholar
Scholar
2,380 Views
Registered: ‎03-22-2016

Debug DDR3 PS-side


I am trying to visualize a core starvation situation to make a video.

It looks like the Zynq PS-side DDR bus cannot be probed with the System ILA IP.

Is there any alternative? 

Thank you.

 

vitorian.com --- We do this for fun. Always give kudos. Accept as solution if your question was answered.
I will not answer to personal messages - use the forums instead.
1 Solution

Accepted Solutions
gmarinkovic
Explorer
Explorer
2,798 Views
Registered: ‎01-09-2012

@hbucher

 

Sorry I misinterpreted your sentence: "I am trying to visualize a core starvation situation to make a video."

I thought the starvation could also be shown as a reduction in memory accesses, as measured by the AXI Performance Monitor. But I agree with you: the DDR SDRAM controller is not debuggable by means of Xilinx tools... at least I'm not aware of any such possibility. For such a problem I would use an external logic analyzer. What I was referring to is what Xilinx makes available and what is useful for the MPSoC, for example the AXI Performance Monitors I mentioned, which give you a transaction rate at a particular interface. In my case it looks as follows:

|----------------------------------------------------------------------|
|                     AXI Performance Monitor                          |
|----------------------------------------------------------------------|
|Slot                |Write Byte Cnt |Read Byte Cnt |Total RW Byte Cnt |
|----------------------------------------------------------------------|
|DDR Slot1 (APU)     |       8644816 |    118538688 |        127183504 |
|DDR Slot2 (APU)     |             0 |            0 |                0 |
|DDR Slot3 (HP0)     |     236500864 |    230866944 |        467367808 |
|DDR Slot4 (HP1/2)   |     470689664 |    460473984 |        931163648 |
|DDR Slot5 (HP3)     |     236904704 |    230987392 |        467892096 |
|----------------------------------------------------------------------|
|Total               |     952740048 |   1040867008 |       1993607056 |
|----------------------------------------------------------------------|
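
To compare the ports at a glance, the byte counts above can be turned into per-slot shares of the total traffic. The following is only a small sketch: the numbers are copied from the table, while the dictionary layout and the function name `traffic_shares` are made up for illustration and are not part of any Xilinx tool.

```python
# Byte counts (write, read) copied from the AXI Performance Monitor table.
slots = {
    "DDR Slot1 (APU)":   (8644816, 118538688),
    "DDR Slot2 (APU)":   (0, 0),
    "DDR Slot3 (HP0)":   (236500864, 230866944),
    "DDR Slot4 (HP1/2)": (470689664, 460473984),
    "DDR Slot5 (HP3)":   (236904704, 230987392),
}


def traffic_shares(slots):
    """Return {slot name: fraction of the total read+write bytes}."""
    totals = {name: w + r for name, (w, r) in slots.items()}
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}


shares = traffic_shares(slots)
for name, share in shares.items():
    print(f"{name:<18} {share:6.1%}")
```

Slot4, which serves two HP ports, carries a little under half of all bytes. A port whose share drops when the others are switched on is the one being throttled.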

 

Then, of course, you also need to know what to expect from your memory. The expected bandwidth for the device I was using is about:

64 [bit] / 8 [bit/byte] * 2 [DDR multiplier] * 8 [transfers] * 1067 [MHz] / (15 [clock cycles RAS to CAS delay] + 15 [clock cycles CAS latency] + 8 [clock cycles data access] + 15 [clock cycles precharge time]) = 2576.9 Mbyte/s
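
The same estimate can be written as a few lines of Python so the timing values are easy to vary. This is a sketch of the formula as given above, in which every burst pays the full activate, CAS, data, and precharge cost. The function and parameter names are invented here for illustration, and the defaults are the values from the post.

```python
def estimated_bandwidth_mbyte_s(bus_bits=64, ddr_multiplier=2, burst_transfers=8,
                                clock_mhz=1067, t_rcd=15, t_cl=15,
                                t_data=8, t_rp=15):
    """Estimate in MByte/s, following the formula in the post term by term."""
    # Numerator of the post's formula: 64 / 8 * 2 * 8
    bytes_per_access = bus_bits / 8 * ddr_multiplier * burst_transfers
    # Clock cycles charged per access: RAS-to-CAS + CAS latency + data + precharge
    cycles_per_access = t_rcd + t_cl + t_data + t_rp
    return bytes_per_access * clock_mhz / cycles_per_access


print(round(estimated_bandwidth_mbyte_s(), 1))  # 2576.9
```

Changing one of the timing parameters shows how strongly the estimate depends on the per-access overhead.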

 

Now I can see the throttling of a particular port caused by the limited bandwidth of the DDR SDRAM part. I thought this is what you call starving. You would see the same with the ILA, if you could use it: how the acknowledgements get delayed, or how the DDR SDRAM controller sends new commands to the memory to access another row/column/bank.

Sure, the throttling is shown only indirectly, but that is the closest I can get to the required measurement with what Xilinx provides. It is especially useful since I can easily change the QoS settings and the addresses used to access the DDR SDRAM.

 

Cheers

Goran

5 Replies
gmarinkovic
Explorer
Explorer
2,343 Views
Registered: ‎01-09-2012

@hbucher

 

What an honor to share some insights with Henrique Bucher. :-)

 

I assume you would like to get as close as possible to the maximum performance your system can reach, including the AXI bus features and the DDR SDRAM settings. This is what we called performance tests. We ran them on the ZCU102 board to test the MPSoC, as described in my post from 02-19-2018 01:09 AM: https://forums.xilinx.com/t5/Embedded-Boot-and-Configuration/ZYNQMP-configuration-for-access-PS-DDR-from-PL/m-p/832245#M247 My task was to see whether this platform is suitable for my high-speed electronics, what is feasible to achieve in such a system... and where the pitfalls are.

With several AXI Traffic Generators, which I can start concurrently (or use to vary the load by deselecting some of the ATGs), you can flood the DDR4 SDRAM controller, check the performance effect of choosing different addresses in the SDRAM (precharge/activate issues), and also keep the processor busy (volatile read/modify/write operations). In addition, you can start playing with the AXI QoS. This feature was intended to distinguish transactions and order them accordingly, which can prevent starvation if used properly (used badly, it will cause starvation). Of course the Zynq UltraScale+ MPSoC has restrictions, and you will soon see some of them: for example, the PS uses AXI3 (the PL uses AXI4), and there are some details of the QoS implementation. There you will end up with

https://forums.xilinx.com/t5/Embedded-Processor-System-Design/Zynq-UltraScale-MPSoC-DDR-arbitration-between-PL-and-A53/m-p/800844 This led to an SR; Xilinx answered my colleague that it would be solved in a later Vivado version... let's see.

But there are also some nice features in the MPSoC, like the AXI Performance Monitors, which are very helpful for identifying the bottlenecks of such systems. So there are plenty of techniques to play around with and to learn how to avoid potential problems.

 

Have fun and cheers

Goran

hbucher
Scholar
Scholar
2,331 Views
Registered: ‎03-22-2016

@gmarinkovic Goran, thanks for your thoughts.

The question is how to visualize the DDR bus activity, not how to achieve the best performance.

That bus is not debuggable.

hbucher
Scholar
Scholar
2,321 Views
Registered: ‎03-22-2016

@gmarinkovic I see. The external logic analyzer is a good idea; I will investigate.

On the PL side it is easy to debug, but this particular board (the one in the video) has no PL-side DDR.

I was hoping to get a nice screenshot; it would make it much easier for folks to understand the issue.

gmarinkovic
Explorer
Explorer
2,314 Views
Registered: ‎01-09-2012

@hbucher I see. However, you may also show folks that DDR SDRAM addresses can be accessed more or less cleverly... so starvation will hit them sooner or later, depending on how cleverly the code (and the accesses it causes) is written. Just a well-meant suggestion... since we all know that hitting a wall hurts... the real question is: where is the wall, and how do I avoid hitting it? :-)
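
To make the "more or less clever" access pattern concrete, here is a toy model that counts how often consecutive accesses land in a different DRAM row, since each row change costs a precharge and an activate. Everything in it is an assumption for illustration: the 8 KiB row size, the 64-byte access size, the single-bank view, and the function name `count_row_changes`. A real controller interleaves banks and maps addresses differently.

```python
def count_row_changes(addresses, row_bytes=8192):
    """Count how often consecutive accesses hit a different row.

    row_bytes is an assumed row (page) size; bank and rank interleaving
    in a real controller is ignored by this toy model.
    """
    changes = 0
    previous_row = None
    for addr in addresses:
        row = addr // row_bytes
        if previous_row is not None and row != previous_row:
            changes += 1
        previous_row = row
    return changes


ACCESSES = 1024
BURST_BYTES = 64  # assumed number of bytes moved per access

# Walk memory linearly: many accesses share one row before moving on.
sequential = [i * BURST_BYTES for i in range(ACCESSES)]
# Jump a full (assumed) row each time: every access opens a new row.
strided = [i * 8192 for i in range(ACCESSES)]

print(count_row_changes(sequential))  # 7
print(count_row_changes(strided))     # 1023
```

Under these assumptions the linear walk changes rows 7 times in 1024 accesses, while the strided walk changes rows on every access. That is the kind of difference that moves the wall closer or farther away.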