Visitor flo_xilinx
Visitor
150 Views
Registered: ‎12-19-2018

devmem slow read from shared-dma-pool


Hi,

I'm using a scatter-gather DMA (DMA SG) connected to the Zynq via ACP, writing to a reserved memory region in DDR (128 MB in size).
This works fine, including cache coherency, with the default drivers for the DMA and shared-dma-pool.

From user space I try to push the data out over the network.
Pseudo-code:

open /dev/mem
map the complete reserved region
send via a UDP socket in 8k chunks, reading directly from the mapped region
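
For readers who want something concrete, the steps above could be sketched in C roughly as follows. This is only a minimal sketch, not the code from my application: the function names map_reserved_region and send_in_chunks and the constants are made up for illustration, the base address and size are taken from the DTS entry further down, and error handling is reduced to return codes.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define REGION_BASE 0x18000000UL /* assumed: base from the DTS reg property */
#define REGION_SIZE 0x08000000UL /* assumed: 128 MiB from the DTS reg property */
#define CHUNK_SIZE  8192u        /* 8k chunks, as in the pseudo-code */

/* Map the whole reserved region read-only through /dev/mem.
 * Returns MAP_FAILED on error. Needs root on the target. */
void *map_reserved_region(void)
{
    int fd = open("/dev/mem", O_RDONLY);
    if (fd < 0)
        return MAP_FAILED;
    void *p = mmap(NULL, REGION_SIZE, PROT_READ, MAP_SHARED, fd,
                   (off_t)REGION_BASE);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p;
}

/* Send len bytes starting at base as datagrams of at most chunk bytes.
 * dst may be NULL for a connected socket.
 * Returns the number of datagrams sent, or -1 on error. */
long send_in_chunks(int sock, const struct sockaddr *dst, socklen_t dst_len,
                    const uint8_t *base, size_t len, size_t chunk)
{
    long sent = 0;

    if (chunk == 0)
        return -1;
    for (size_t off = 0; off < len; off += chunk) {
        size_t n = (len - off < chunk) ? len - off : chunk;
        /* sendto() reads straight from the mapped region here */
        if (sendto(sock, base + off, n, 0, dst, dst_len) != (ssize_t)n)
            return -1;
        sent++;
    }
    return sent;
}
```

On the target, the pointer returned by map_reserved_region() would be passed as base together with a UDP socket and the destination address.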

The average speed on the network is 17-20 MiB/s, which is unfortunately not enough for my application. This is nearly the same speed I can achieve with an uncached memory region (set aside with the 'mem=' argument in U-Boot).

Question: what is the bottleneck here? Is slow /dev/mem read speed the price for coherency? Any hints on how to speed this up?

Tryouts:

- When the data to be sent is first copied to an intermediate buffer with memcpy and sent from there, up to 30 MiB/s on average can be achieved.

- When reading from memory that is only reserved (just omitting the no-map keyword in the DTS, which disables shared-dma-pool), up to 80 MiB/s can be reached.
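
To separate the memory read speed from the network speed, the read side can be timed on its own. The following is a rough sketch under my own assumptions: measure_copy_mibps and STAGE_SIZE are hypothetical names, and the loop simply copies a region through an 8 KiB staging buffer (like the memcpy tryout above) and reports MiB/s.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define STAGE_SIZE 8192u /* assumed staging buffer size, same as the UDP chunk */

/* Copy len bytes from src through a small staging buffer, chunk by chunk,
 * and return the observed throughput in MiB/s, or -1.0 on error. */
double measure_copy_mibps(const void *src, size_t len)
{
    static uint8_t stage[STAGE_SIZE];
    volatile uint8_t sink = 0;
    struct timespec t0, t1;

    if (len == 0 || clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;

    for (size_t off = 0; off < len; off += STAGE_SIZE) {
        size_t n = (len - off < STAGE_SIZE) ? len - off : STAGE_SIZE;
        memcpy(stage, (const uint8_t *)src + off, n);
        sink ^= stage[n - 1]; /* use the data so the copy is not trivially dropped */
    }
    (void)sink;

    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1.0;

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (secs <= 0.0)
        return -1.0;
    return ((double)len / (1024.0 * 1024.0)) / secs;
}
```

Calling this once on the /dev/mem mapping and once on an ordinary heap buffer of the same size should show whether the mapping itself is the slow part.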

 

- My setup -

Dts entry:

reserved-memory {
	ranges;
	dma_buf@18000000 {
		compatible = "shared-dma-pool";
		device_type = "reserved_memory";
		linux,dma-default;
		no-map;
		reg = <0x18000000 0x8000000>;
	};
};

Partial boot log:

Reserved memory: created DMA memory pool at 0x18000000, size 128 MiB
OF: reserved mem: initialized node dma_buf@18000000, compatible id shared-dma-pool

Vivado 2018.3 on ZedBoard
(PetaLinux 2018.3, kernel version 4.14.0-xilinx-v2018.3)

Thanks in advance for any help!

1 Solution

Accepted Solutions
Visitor flo_xilinx
Visitor
88 Views
Registered: ‎12-19-2018

Re: devmem slow read from shared-dma-pool


A working solution for my setup with DMA-SG and ACP:

Reserve the memory without 'no-map' in the DTS.
The shared-dma-pool mapping will then fail with:

pr_err("Reserved memory: regions without no-map are not yet supported\n");

Nevertheless, the region seems to be cached; at least the access from user mode is fast enough for me now. It might be possible to drop the remaining, now unnecessary properties from the DTS node.

dma_buf@18000000 {
	compatible = "shared-dma-pool";
	device_type = "reserved_memory";
	linux,cma-default;
	reg = <0x18000000 0x8000000>;
};

With the default AXI settings (axuser tied high, awcache='11') I still got invalid cached reads (the DMA only writes; user space reads via /dev/mem).

After configuring awcache to '0111' (Write-back No-allocate), I no longer observed invalid reads from user space. awprot was set to '00'.
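
As a reading aid for the awcache value, here is a small sketch that splits a 4-bit AxCACHE value into its individual bits. It assumes the AXI3 bit naming (bufferable, cacheable, read-allocate, write-allocate); the struct and function names are my own and do not come from any Xilinx header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Individual AxCACHE[3:0] bits, using the AXI3 names (assumption). */
struct axcache_bits {
    bool bufferable;     /* bit 0 */
    bool cacheable;      /* bit 1 (AXI4 renames this bit "modifiable") */
    bool read_allocate;  /* bit 2 */
    bool write_allocate; /* bit 3 */
};

/* Split the low four bits of v into the AxCACHE fields. */
struct axcache_bits decode_axcache(uint8_t v)
{
    struct axcache_bits b;

    b.bufferable     = (v & 0x1u) != 0;
    b.cacheable      = (v & 0x2u) != 0;
    b.read_allocate  = (v & 0x4u) != 0;
    b.write_allocate = (v & 0x8u) != 0;
    return b;
}
```

For the value '0111' used above, decode_axcache(0x7) has the bufferable, cacheable and read-allocate bits set and the write-allocate bit clear.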

2 Replies
Visitor flo_xilinx
Visitor
123 Views
Registered: ‎12-19-2018

Re: devmem slow read from shared-dma-pool


I dug into the kernel code, and it looks like the reserved memory is mapped as non-cacheable.
This would explain the slow read speed.

[Q]:
But does this mean that I should not use ACP together with reserved memory?
How do I benefit from the coherency that ACP provides? How should I allocate the memory in DDR?


-- some code snippets and comments for shared-dma-pool --

dma-coherent.c
dma_init_coherent_memory:
mem_base = memremap(phys_addr, size, MEMREMAP_WC);

* MEMREMAP_WC - establish a writecombine mapping, whereby writes may
* be coalesced together (e.g. in the CPU's write buffers), but is otherwise
* uncached. Attempts to map System RAM with this mapping type will fail.


[MT_DEVICE_WC] = {	/* ioremap_wc */
	.prot_pte	= PROT_PTE_DEVICE | L_PTE_MT_DEV_WC,
	.prot_l1	= PMD_TYPE_TABLE,
	.prot_sect	= PROT_SECT_DEVICE,
	.domain		= DOMAIN_IO,
},

DEV_WC Bufferable Normal memory / non-cacheable
