elien1222
Visitor

PCIe Endpoint DMA access fails on ZCU102 PS Root Port system

Hi all!

I have a problem regarding ZCU102 PS PCIe Root Port.

I'm currently working on a standalone FW project that uses the PS PCIe RP on the ZCU102 evaluation board.
The ZCU102 board (RP) has a PCIe endpoint device attached to it, and the EP device contains a PCIe DMA engine.
My goal is to have the PCIe EP's DMA access a buffer region in the ZCU102's PS DDR.

I configured the PS PCIe block as shown below.

 

[Screenshot: 20201215_144954.png — PS PCIe block configuration]

Using this HW, I was able to detect the PCIe device under PetaLinux and perform EP DMA successfully, so I'm confident there is no HW issue.

But the problem arose with standalone FW.

The BSP's pciepsu_v1_2 driver made EP enumeration easy, and accessing the EP's BAR region was not difficult.
However, when a DMA transfer is triggered through the EP's BAR registers, the EP DMA fails to access the ZCU102's memory.
More specifically, a 'master abort' error occurs while the EP DMA tries to read the memory.

The DDR in this ZCU102 design is mapped at address 0x8_0000_0000 (the high DDR region), and part of this memory is used as the DMA buffer.

[Screenshot: 20201214_171854.png — DDR address map]

For testing, I made a simple example in which the EP DMA reads the contents of src_buf and writes them back to dst_buf.
However, the EP DMA was unsuccessful: it did not fetch any correct data from the RP, and I got a 'master abort' while the EP was reading the contents of the RP's src_buf.

The conceptual ZCU102 FW test code flows roughly as follows.


#include "xpciepsu.h"
#include "xpciepsu_common.h"

XPciePsu pci_dev;

main(void)
{
    u32 *src_buf, *dst_buf;
    u32 size = 0x1000;

    PcieInitRootComplex(&pci_dev, XPAR_PSU_PCIE_DEVICE_ID);
    find_ep_device(&pci_dev, PCIE_EP_DEVICE_ID);
    pci_enable_device_mem(&pci_dev);
    pci_set_master(&pci_dev);

    src_buf = 0x800000000UL;
    dst_buf = 0x800001000UL;
    memset(src_buf, 0xF, size);
    memset(dst_buf, 0x0, size);

    issue_ep_dma_request(&pci_dev, src_buf, dst_buf, size);

    while (1) {
        if (*dst_buf)
            break;
    }
}
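A side note on cache handling (this is an assumption on my side and probably unrelated to the master abort itself, but it matters once the DMA runs): on the A53 the DMA buffers are cached, so the source data has to be flushed to DDR before the EP reads it, and the destination range has to be invalidated before the CPU checks it. A minimal sketch using the standalone BSP routines from xil_cache.h, with the same buffer and helper names as in the test code above:

    #include "xil_cache.h"   /* Xil_DCacheFlushRange / Xil_DCacheInvalidateRange */

    /* Push the CPU-written source data out to DDR so the EP's DMA reads
     * the real contents, and drop any stale lines over the destination. */
    Xil_DCacheFlushRange((INTPTR)src_buf, size);
    Xil_DCacheInvalidateRange((INTPTR)dst_buf, size);

    issue_ep_dma_request(&pci_dev, src_buf, dst_buf, size);

    /* Invalidate before every check so the poll reads DDR,
     * not a stale cache line. */
    while (1) {
        Xil_DCacheInvalidateRange((INTPTR)dst_buf, size);
        if (*(volatile u32 *)dst_buf)
            break;
    }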


For your reference, here are some comments on this test code.

1. PcieInitRootComplex(), which comes from the ZCU102 PCIe RP enumeration example code, calls the BSP function XPciePsu_CfgInitialize(), which in turn calls XPciePsu_BridgeInit().

2. XPciePsu_BridgeInit() in xpciepsu.c appears to be logically identical to nwl_pcie_bridge_init() in pcie-xilinx-nwl.c from the PetaLinux kernel source.

3. In XPciePsu_BridgeInit(), subtractive decode is enabled (I_ISUB_CONTROL.ingress_sub_enable). Because of that, it seems there is no need to set up ingress address translation, so I left all AXIPCIE_INGRESSx registers at zero (screenshot below; a quick test sketch follows it).

[Screenshot: 20201214_173501.png — AXIPCIE_INGRESS registers]
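One quick experiment I'm planning (just a guess on my part, not a confirmed fix): since the master abort only shows up on accesses to 0x8_0000_0000, it might be worth checking whether the bridge's ingress side accepts the high DDR region at all when the AXIPCIE_INGRESSx registers are left at zero. A simple way to test is to move the DMA buffers into low DDR (below 2 GB) and rerun the same test. The addresses below are just example locations in low DDR outside the FW image, and the helper names are the ones from the test code above:

    /* Hypothetical check: same EP DMA test, but with buffers in the
     * 32-bit DDR region instead of 0x8_0000_0000. */
    u32 *low_src = (u32 *)0x10000000UL;  /* example low-DDR addresses */
    u32 *low_dst = (u32 *)0x10001000UL;

    memset(low_src, 0xF, size);
    memset(low_dst, 0x0, size);
    Xil_DCacheFlushRange((INTPTR)low_src, size);
    Xil_DCacheFlushRange((INTPTR)low_dst, size);

    issue_ep_dma_request(&pci_dev, low_src, low_dst, size);

If this low-DDR version works while the 0x8_0000_0000 version still master-aborts, that would point at the ingress/aperture configuration of the bridge rather than at the EP's DMA engine.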

 

 

Are there any additional settings necessary to make EP DMA work normally?

Thanks in advance for your help...

 

Best regards,
John

 

6 Replies
fshenmi
Contributor

Hi,

I ran into the same problem as in your case. My EP device is an M.2 NVMe SSD.

https://forums.xilinx.com/t5/PCIe-and-CPM/PSU-PCIe-master-AXI-failed-to-access-PS-DDR-memory-space/td-p/1224279

Have you resolved this issue? If so, may I know what additional tricks I need to apply?

 

Thanks!

 

pvenugo
Moderator

@fshenmi ,

We have released an answer record on PS PCIe as RP with an NVMe SSD as EP.

If that is what you are looking for, please refer to Xilinx_Answer_76169_ZCU102_PS_PCIe_NVMe.pdf

Regards

Praveen


garethc
Moderator

Hi @elien1222 

We have AR 71493, which details PetaLinux image generation and a system example design with the ZCU102 PS PCIe as RP and a ZC706 as EP.
https://www.xilinx.com/support/answers/71493.html

 

 

Thanks,

Gareth


fshenmi
Contributor

Hi Gareth,

None of these Xilinx solutions work for me, since I'm not using PetaLinux.

The same synthesis and implementation run does work under PetaLinux in my case, but the SSD performance is way too poor.

I can only achieve < 600 MB/s write/read throughput even with four dd tasks running simultaneously. This is similar to the PL XDMA when using the PetaLinux NVMe driver.

But on bare metal/RTOS I could achieve line rate on XDMA, up to 3.3 GB/s with a single thread. 1.5 GB/s for Gen2 x4 is good enough for me, so I'm porting this driver to the PSU PCIe side, and that's when I ran into this RMA error.

 

pvenugo
Moderator

Performance depends on various factors on both the hardware and software sides.

NVMe performance depends on the type of flash being used, any firmware in the data path on the EP, and the overall architecture deployed by the EP vendor.


On the software front, how are you measuring performance? Is it multi-threaded? Since the data has to reach the user-space application, there will be a performance impact from context switching between the application and the kernel NVMe stack. (CPU speeds could be low, unlike x86.)

The type of interrupts being exercised also affects performance.
In Root Port mode we support only MSI, so all interrupts are routed to a single CPU. (Only with MSI-X are interrupts routed to different CPUs, which would provide better performance.)


Could you try other NVMe SSDs as well, such as Samsung? Maybe you can also try multiple performance-testing utilities to see how much difference they make.

Would you point me to the bare-metal application you used for the XDMA + NVMe test design?

 

Regards

Praveen


fshenmi
Contributor

The performance limit is a pure software issue on Zynq MPSoC, not a hardware one.

600 MB/s is the cap for both PL XDMA and PSU GTR under the Linux driver stack. Even with four simultaneous write/read threads, this is still the case.

Thus, for performance, I have to migrate to a FreeRTOS environment. Please check Shane Colton's write-up on his bare-metal NVMe SSD driver.

Both the Samsung 970 and the XPG 8200 could achieve the manufacturer's claimed I/O speed under bare metal. I have no doubt other SSDs will be able to achieve such speeds using a raw driver stack.

 
