Contributor
3,284 Views
Registered: ‎03-13-2017

Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

 

I am using a Zynq MPSoC xczu3eg-1.

 

On the PL side I have an AXI Datamover IP block connected to the HP0 AXI port of the Zynq processing system. The HP0 bus is configured for a 64-bit data width and is clocked at 200 MHz. When the Datamover transfers 64 kB of data from DDR, everything works except that the throughput is much lower than expected. As shown in the ILA waveforms below, I see a continuous pattern of two data beats followed by 10 idle clock cycles. The system's DDR is DDR3 with a 32-bit bus running at 533 MHz (DDR3-1066), so it should be able to provide far more throughput than I am seeing.
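For reference, a quick back-of-the-envelope calculation (using only the numbers above) shows how far the observed pattern falls below both the AXI port limit and the raw DDR bandwidth:

```python
# Back-of-the-envelope throughput check using the figures from the post above.
CLK_HZ = 200e6      # HP0 AXI clock
BEAT_BYTES = 8      # 64-bit data width

# Observed on the ILA: 2 data beats, then ~10 idle cycles (12-cycle period).
observed_bps = CLK_HZ * (2 * BEAT_BYTES) / 12

# Theoretical ceilings, for comparison.
hp0_peak_bps = CLK_HZ * BEAT_BYTES   # HP0 AXI port limit
ddr_peak_bps = 1066e6 * 4            # DDR3-1066, 32-bit bus

print(f"observed : {observed_bps / 1e6:7.1f} MB/s")
print(f"HP0 peak : {hp0_peak_bps / 1e6:7.1f} MB/s")
print(f"DDR peak : {ddr_peak_bps / 1e6:7.1f} MB/s")
```

That is roughly 267 MB/s observed against a 1600 MB/s port ceiling, i.e. about one sixth of what the HP0 port alone could carry.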

 

Any ideas on what could be limiting the throughput from DDR on the HP0 AXI port?

 

HP0_port_read.png

 

 

10 Replies
Voyager
3,261 Views
Registered: ‎06-24-2013

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

Hey @rjbohnert

 

I don't know the Datamover IP, but the trace looks like you are reading one byte at a time.

If that is the case, chances are good that this creates a major bandwidth bottleneck.

 

Not sure it helps,

Herbert

-------------- Yes, I do this for fun!
Contributor
3,251 Views
Registered: ‎03-13-2017

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

Hi @hpoetzl

 

Thanks for the suggestion, but the data is being read 8 bytes at a time, as expected. It is not possible to see this from the screenshot I provided, so I have added some annotations to the image. The problem is the large gaps between the pairs of data beats.

 

HP0_port_read_anotated.png

Xilinx Employee
3,182 Views
Registered: ‎03-27-2013

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

Hi rjbohnert,

Would you please share a snapshot of the IPI connections so that we can see the masters and slaves involved in this transaction?

You can also expand the axi_interconnect IP and check whether helper IPs such as a data width converter or protocol converter are involved.

Best Regards,
Jason
-----------------------------------------------------------------------------------------------
Please mark the Answer as "Accept as solution" if the information provided is helpful.

Give Kudos to a post which you think is helpful and reply oriented.
-----------------------------------------------------------------------------------------------
Contributor
3,174 Views
Registered: ‎03-13-2017

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

Hi Jason,

 

Please see the attached snapshot of the IPI connection. You can see that the interconnect is just a pass-through.

 

The S00_AXI side of the interconnect connects to a custom IP block, which I cannot share here. However, internally the AXI signals are directly connected to an AXI Datamover IP instance for which I have attached the .xci file.

 

The ILA snapshot in my previous post was captured from the System ILA shown in the IPI connection image. This System ILA is connected right at the boundary of the Zynq system, and the capture shows that the data is being throttled by the RVALID signal, which comes directly from the Zynq. Therefore, it seems that something outside of the PL must be causing the throttling and low throughput.
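For anyone reproducing this: the size of each DataMover transfer is set by the BTT field of the command word fed to its command stream. A rough packing sketch is below; the field offsets are my assumption from my reading of PG022 for a 32-bit-address, 23-bit-BTT configuration, so verify them against your IP's actual parameters before relying on this.

```python
# Sketch of an MM2S command word for the AXI DataMover.
# Field layout ASSUMED from PG022 (32-bit address, 23-bit BTT) -- verify
# against the configured IP before use.
def datamover_cmd(addr, btt, tag=0, incr=True, eof=True):
    """Pack a DataMover command as one integer."""
    assert btt < (1 << 23), "BTT field is 23 bits"
    cmd = btt                          # [22:0]  bytes to transfer (BTT)
    cmd |= int(incr) << 23             # [23]    1 = INCR burst, 0 = FIXED
    # [29:24] DSA left at 0 (no realignment), [31] DRR left at 0
    cmd |= int(eof) << 30              # [30]    EOF
    cmd |= (addr & 0xFFFFFFFF) << 32   # [63:32] start address
    cmd |= (tag & 0xF) << 64           # [67:64] command tag
    return cmd

# e.g. request the whole 64 kB starting at a hypothetical buffer address:
cmd = datamover_cmd(0x10000000, 64 * 1024)
```

A single command with a large BTT lets the DataMover issue long AXI bursts, whereas many small commands force one short burst each.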

IPI_connection.png
Moderator
3,157 Views
Registered: ‎11-28-2016

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

Hello @rjbohnert,

 

Are you using the memory interface in the PS or is this 32-bit DDR3 interface configured in the PL?

Contributor
3,149 Views
Registered: ‎03-13-2017

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

Hi @ryana,

 

I am using the PS memory interface. The AXI Datamover IP in the PL is connected to the S_AXI_HP0_FPD port of the Zynq system. The S_AXI_HP0_FPD port is clocked at 200 MHz and the PS DDR3 memory is running at 533 MHz.

 

Moderator
3,148 Views
Registered: ‎11-28-2016

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

Xilinx Employee
3,135 Views
Registered: ‎03-27-2013

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

Hi rjbohnert,

 

Are there heavy AXI transactions on the AXI HPC0 port that you enabled in this design?

 

It is also a little strange that in the ILA snapshot I can see signals named ps8_0_axi_periph....

Most of the time, an axi_interconnect IP with this name is used to connect the CPU master to peripheral slaves.

Best Regards,
Jason
-----------------------------------------------------------------------------------------------
Please mark the Answer as "Accept as solution" if the information provided is helpful.

Give Kudos to a post which you think is helpful and reply oriented.
-----------------------------------------------------------------------------------------------
Contributor
3,120 Views
Registered: ‎03-13-2017

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

 

I see this issue even when there are no AXI transactions enabled on the HPC0 port. I even rebuilt the design with the HPC0 port disconnected, just to be sure, and I still see the issue.

 

I am not sure why the ILA signals are named as they are.

 

Could this be related to the QoS functions of the Zynq MP? I have tried maxing out the RDQoS value for the HP0 port, but it did not help. Do you have any suggestions for other DDR controller or interconnect settings that I could try adjusting?
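In case it helps anyone compare, the RDQoS value can also be inspected and changed at runtime from Linux via the AFIFM registers. The address below is my assumption from the ZynqMP register reference (UG1087: AFIFM block for S_AXI_HP0_FPD at 0xFD380000, RDQOS at offset 0x8), so double-check it for your device before poking anything:

```shell
# Read and then max out the RDQoS value for S_AXI_HP0_FPD.
# ASSUMED register address (UG1087) -- verify before use.
devmem 0xFD380008          # read current AFIFM RDQOS value
devmem 0xFD380008 32 0xF   # write the maximum read QoS (0xF)
```

This does the same thing as setting RDQoS in the Vivado PS configuration, just without a rebuild.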

Visitor gaochangw
790 Views
Registered: ‎06-07-2017

Re: Zynq MPSoC: Limited Throughput from DDR on HP AXI Port

In my personal opinion, a 10-cycle latency is quite normal when you do single-beat accesses to DRAM. To better utilize the DRAM bandwidth you have to use burst reads/writes; usually a larger burst size is better.
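To put numbers on that, a toy model with a fixed per-transaction overhead (the ~10 idle cycles from the original ILA capture; purely illustrative, real DDR controller behavior is more complex) shows how bus efficiency scales with burst length:

```python
# Fraction of bus cycles carrying data when every transaction pays a
# fixed overhead. Overhead value is illustrative, taken from the ~10
# idle cycles seen on the ILA in this thread.
OVERHEAD_CYCLES = 10

def efficiency(burst_beats):
    """Data beats divided by total cycles for one transaction."""
    return burst_beats / (burst_beats + OVERHEAD_CYCLES)

for beats in (1, 2, 16, 64, 256):
    print(f"{beats:3d}-beat bursts: {efficiency(beats):6.1%}")
```

With 2-beat transactions the bus is busy only about 17% of the time, which matches the roughly 267 MB/s observed; 64-beat bursts would already reach about 86% of the port's peak under this model.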
