Observer redgatorsmp
Registered: 07-01-2019

zcu102 2018.2 pre-built petalinux image xapp859 ml506 I am not actually moving data probably missing a twiddle


I have a ZCU102 booting the pre-built PetaLinux 2018.2 image from an SD card. Plugged into this board I have an ML506 with a bitstream built from XAPP859.

I have had the ml506/xapp859 plugged into another x86 Linux system. I had to write my own user-level driver for this app, but I do three embedded drivers before breakfast, the docs did not lie, and that isn't a problem. I have been able to run the bitstream and get the performance indicated in the whitepaper that accompanies the example.

With the ML506 plugged into the ZCU102, Linux 4.14 probes the PCIe bus and sees the ML506 bitstream. I'm using an unused portion of CMA until I can get the baseline to build and either modify the device tree to expose a reserved area or build a user-level interface to CMA.

 

root@xilinx-zcu102-2018_2:~# lspci
00:00.0 PCI bridge: Xilinx Corporation Device d021
01:00.0 Memory controller: Xilinx Corporation Default PCIe endpoint ID (rev 01)

01:00.0 Memory controller: Xilinx Corporation Default PCIe endpoint ID (rev 01)
Subsystem: Xilinx Corporation Default PCIe endpoint ID
Flags: bus master, fast devsel, latency 0, IRQ 255
Memory at e0000000 (32-bit, non-prefetchable) [size=128]
Capabilities: [40] Power Management version 3
Capabilities: [48] MSI: Enable- Count=1/1 Maskable+ 64bit+
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [100] Device Serial Number 00-00-00-00-00-00-00-00

root@xilinx-zcu102-2018_2:~# dmesg | egrep cma
[ 0.000000] cma: Reserved 256 MiB at 0x000000006fc00000
[ 0.000000] Memory: 3743624K/4193280K available (9980K kernel code, 644K rwdata, 3128K rodata, 512K init, 2168K bss, 187512K reserved, 262144K cma-reserved)

So the user-level driver I have uses the PCIe address 0xe0000000 and the memory address of 0x6fc00000.
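Both addresses are physical, so a user-level driver has to map them into its own address space before it can touch them. The following is a minimal sketch of that step, not the driver from this post. It assumes the mapping goes through a file descriptor opened on a physical-memory device such as /dev/mem, which the post does not state. The helper names page_base, map_length, and map_phys are hypothetical; only the two addresses and the 128-byte BAR size come from the output above. The point it illustrates is that mmap() wants a page-aligned offset, so a 128-byte BAR still costs a whole page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Values copied from the lspci and dmesg output above. */
#define BAR0_PHYS 0xe0000000ULL /* ML506 BAR0 */
#define BAR0_SIZE 128u          /* BAR0 size in bytes */
#define CMA_PHYS  0x6fc00000ULL /* start of the CMA region */

/* Round a physical address down to the start of its page. */
static uint64_t page_base(uint64_t phys, uint64_t page_size)
{
    assert((page_size & (page_size - 1)) == 0); /* must be a power of two */
    return phys & ~(page_size - 1);
}

/* Number of bytes to map so that [phys, phys + len) is covered by whole pages. */
static size_t map_length(uint64_t phys, size_t len, uint64_t page_size)
{
    uint64_t start = page_base(phys, page_size);
    uint64_t end = (phys + len + page_size - 1) & ~(page_size - 1);
    return (size_t)(end - start);
}

/* Hypothetical helper: map a physical range through an already-open fd.
 * Returns a pointer to the byte at `phys`, or NULL if mmap() fails. */
static volatile void *map_phys(int fd, uint64_t phys, size_t len)
{
    uint64_t page = (uint64_t)sysconf(_SC_PAGESIZE);
    uint64_t base = page_base(phys, page);
    void *p = mmap(NULL, map_length(phys, len, page),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, (off_t)base);
    if (p == MAP_FAILED)
        return NULL;
    return (volatile uint8_t *)p + (phys - base);
}
```

A caller would use something like map_phys(fd, BAR0_PHYS, BAR0_SIZE) for the endpoint registers and map_phys(fd, CMA_PHYS, buffer_len) for the DMA buffer, where buffer_len is whatever slice of the CMA region the driver claims.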

With the ML506 plugged into the ZCU102, I got the XAPP859 design cycling after I discovered a couple of ARM/PCIe idiosyncrasies:

1) From https://www.xilinx.com/Attachment/Xilinx_Answer_71210_PS_PL_PCIe_Drivers_Debug_Guide.pdf the ML506 endpoint needs BAR (memory-space) access and bus mastering enabled in its command register for DMA transfers

setpci -s 01:00.0 COMMAND=0x7

2) From the same guide, enable bus mastering on the root port

setpci -s 00:00.0 COMMAND=0x4
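To make the two setpci values less opaque, here is a small sketch that decodes the low three bits of the PCI Command register (config offset 0x04). The bit positions are the standard ones; the macro and function names are made up for this example and are not the Linux kernel's own definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Low bits of the 16-bit PCI Command register (names are local to this sketch). */
#define CMD_IO_SPACE   (1u << 0) /* respond to I/O space accesses */
#define CMD_MEM_SPACE  (1u << 1) /* respond to memory space accesses (BARs) */
#define CMD_BUS_MASTER (1u << 2) /* allow the device to initiate transactions (DMA) */

/* Write a short description of a COMMAND value into buf. */
static int describe_command(uint16_t cmd, char *buf, size_t len)
{
    return snprintf(buf, len, "io=%d mem=%d master=%d",
                    !!(cmd & CMD_IO_SPACE),
                    !!(cmd & CMD_MEM_SPACE),
                    !!(cmd & CMD_BUS_MASTER));
}
```

Decoded this way, 0x7 on the endpoint turns on I/O space, memory space, and bus mastering, while 0x4 on the root port sets bus mastering only. Note that COMMAND=0x4 writes the whole register, so it also clears bits 0 and 1; setpci accepts a value:mask form (for example COMMAND=0x4:0x4) if only the bus-master bit should change.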

So it's cycling without bus errors, but there is no actual data being transferred (at least, the memory I'm writing is not being updated with the expected patterns). I'm assuming there is at least one more twiddle to make memory slaved to the PCIe interface writable.

What have I missed?

If I can't get there from a user-level driver, then I would like to know that also.  


Accepted Solutions
Observer redgatorsmp

Re: zcu102 2018.2 pre-built petalinux image xapp859 ml506 I am not actually moving data probably missing a twiddle


After wrestling through other material, including adding a reserved area to the ZCU102 device tree, I finally found that the data was being transferred but was not visible because of caching issues.

There is some debate whether a user-level driver can ever work with the caching behavior on the Zynq.

Making the following changes fixed my particular user-level driver on the memory block I was using:

open(): add the O_DSYNC flag. I'm dubious of the necessity, but another example I looked at set O_SYNC, which is a file-system flag. O_SYNC worked, but O_DSYNC makes more sense at the data-transfer level as a minimal requirement. The same example suggested the MAP_LOCKED flag on mmap(), but this should have been unnecessary on a reserved area and showed that it wasn't.

There is no user-level cache management facility in Petalinux - probably a nice kernel module for me to write...

After making these changes, the data was visible in the write buffer, and I was getting very good performance numbers (205 MB/s read and 215 MB/s write, vs. 185 MB/s read and 215 MB/s write on the Intel x86 host).
