Anonymous

Using ACP from Linux on Zynq Ultrascale+

I've been trying to use the ACP from Linux on the Zynq UltraScale+, but so far I haven't got it to work fully. The ACP interface works fine in a bare-metal design, so I believe the problem lies in software and in how I use the kernel. I have looked at other forum posts about similar issues on the Zynq 7000, but nothing in them has fixed it for me. The problem looks like a caching issue. Details of the problem:

 

I have written a custom module to send data directly to the ACP interface on the Zynq (no Xilinx DMA block in between). The AWCACHE and ARCACHE lines are both set to 0b1111. I have tried setting the AWUSER and ARUSER lines to 0b00, 0b01 and 0b10, but this hasn't made any difference. My bare-metal (BM) design works with AWUSER and ARUSER set to 0b00.

 

In order to use the module with Linux, I have written a basic kernel driver. In its init function it allocates memory to a buffer: 

kbuf = __get_free_pages(GFP_KERNEL | GFP_DMA, order);

In its mmap function it remaps the physical address of the buffer: 

remap_pfn_range(vma, vma->vm_start, __pa(kbuf) >> PAGE_SHIFT,
                  vma->vm_end - vma->vm_start, vma->vm_page_prot);
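The mmap handler above remaps whatever length the caller asks for. A minimal sketch of the bounds check such a handler would normally want, written as plain user-space C arithmetic rather than kernel code. The names `alloc_bytes` and `mapping_fits` are hypothetical, and the sketch assumes the buffer is exactly the 2^order pages returned by the allocation:

```c
#include <stddef.h>

/* Bytes provided by an allocation of 2^order pages. */
static size_t alloc_bytes(unsigned int order, size_t page_size)
{
    return page_size << order;
}

/* Returns 1 if a mapping of `len` bytes starting `pgoff_pages` pages into
 * the buffer stays inside the allocation, 0 otherwise.
 * Assumes pgoff_pages * page_size does not overflow size_t. */
static int mapping_fits(size_t len, size_t pgoff_pages,
                        unsigned int order, size_t page_size)
{
    size_t total = alloc_bytes(order, page_size);
    size_t start = pgoff_pages * page_size;

    return len != 0 && start < total && len <= total - start;
}
```

In a driver, `len` would correspond to `vma->vm_end - vma->vm_start` and `pgoff_pages` to `vma->vm_pgoff`. For example, a 4096-byte request against an order-0 allocation of 4096-byte pages fits, while an 8192-byte request does not.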

In my userspace application, I open the device and mmap it:

fd = open("/dev/acp", O_RDWR | O_SYNC);
buff = (volatile unsigned long int *)mmap(0, 4096, PROT_EXEC|PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

I pass the physical address to my module, start a transfer (128 bytes) and poll the buffer to check when it gets there:

buff[last_word] = 0;
i = 0;
start_transfer();
while (buff[last_word] == 0 && i++ < 100000) {
    usleep(10);
}
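The polling step can also be wrapped in a small helper that reports whether the word changed or the loop timed out, which makes the two outcomes easier to tell apart in test output. This is only a sketch: `wait_for_nonzero` is a hypothetical name, and it assumes the buffer holds 64-bit words as in the dump below.

```c
#include <stdint.h>
#include <unistd.h>

/* Poll one 64-bit word until it becomes non-zero or max_polls attempts
 * have been made. Sleeps sleep_us microseconds between attempts (0 = no sleep).
 * Returns 1 if the word became non-zero, 0 on timeout. */
static int wait_for_nonzero(const volatile uint64_t *word,
                            long max_polls, unsigned int sleep_us)
{
    for (long i = 0; i < max_polls; i++) {
        if (*word != 0)
            return 1;
        if (sleep_us != 0)
            usleep(sleep_us);
    }
    /* One final check after the last sleep. */
    return *word != 0;
}
```

With the mapping from the post, a call along the lines of `wait_for_nonzero(&buff[last_word], 100000, 10)` would stand in for the loop above, assuming `buff` is treated as an array of 64-bit words.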

When this times out, I print the contents of the receive buffer. The contents vary, but normally the buffer shows the first 64 bytes of the transfer, followed by junk for the next 64 bytes, and a zero in the last word. For example, when printed as 64-bit words (data sent was 1, 2, 3, ...):

Data:
1
2
3
4
5
6
7
8
0
50000000001c8
198
0
50000000001c8
1a0
0
0

Sometimes I'll see all the data but there will still be a zero in the last word (1, 2, ... , f, 0). This leads me to think that something is going wrong with the caching, which is done in 64 byte lines. The ACP should give cache coherency (it does in my BM test), but maybe something is not set up correctly in the Linux test.
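The link between the dump and the 64-byte line size can be checked with a little arithmetic. The sketch below assumes the buffer starts on a cache-line boundary (a page-aligned allocation does) and uses hypothetical helper names:

```c
#include <stddef.h>

#define CACHE_LINE_BYTES 64u /* line size quoted in the post */

/* Index of the cache line, relative to the buffer start, that holds
 * the given byte offset. */
static size_t line_of_offset(size_t byte_offset)
{
    return byte_offset / CACHE_LINE_BYTES;
}

/* Number of cache lines touched by `len` bytes starting at `start`. */
static size_t lines_spanned(size_t start, size_t len)
{
    if (len == 0)
        return 0;
    return line_of_offset(start + len - 1) - line_of_offset(start) + 1;
}
```

Under these assumptions a 128-byte transfer spans two lines: 64-bit words 1 to 8 (byte offsets 0 to 63) fall in line 0 and words 9 to 16 (offsets 64 to 127) fall in line 1, which matches the boundary between good and stale data in the dump.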

 

I have checked the ACP transaction with an ILA block and the Xilinx JTAG debugger, and all the data is transferred and acknowledged with no errors. From my BM test, I trust that the data is getting to the right place.

 

In another test, I set up the kernel driver as before, but with an interrupt that fires when the ACP transfer completes (a line on the FPGA goes high when the last BRESP is received for the transfer, indicating that the data has transferred and that there is "global observability for that write" (http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0500g/BABIAJAG.html)). When the interrupt fires, the kernel reads the last word of the transfer, which is always zero:

printk(KERN_INFO "ACP: last word in buf: %x\n", readl((void*)(kbuf + 15 * 4)));
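One thing worth ruling out before blaming the cache is the offset used in the read above. The post does not say whether this test used the same 128-byte transfer of 64-bit words as the earlier dump, so the following is a sketch under that assumption, with hypothetical helper names and a little-endian layout (the usual AArch64 Linux configuration):

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the last word in a transfer.
 * Requires 0 < word_bytes <= transfer_bytes. */
static size_t last_word_offset(size_t transfer_bytes, size_t word_bytes)
{
    return transfer_bytes - word_bytes;
}

/* Store a 64-bit value in little-endian byte order. */
static void store64_le(unsigned char *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* Load a 32-bit little-endian value, similar to a 32-bit read on the target. */
static uint32_t load32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

If the buffer holds the 64-bit values 1 to 16, the last word sits at byte offset 120, while offset 15 * 4 = 60 is the upper half of the eighth word and reads as zero even when the data has arrived correctly. If the interrupt test used a different transfer size or word width, this does not apply.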

 

Can anyone offer any ideas of what else to try? One thing would be to try clearing the cache to check if the cache is the problem, but I haven't found a method for this yet. Maybe it's a kernel config issue? Any other ideas or input would be welcome!

 

Thanks,

Arthur

3 Replies
Anonymous

Update:

In another test, after writing data to my ACP buffer in the DDR, I then got the FPGA to read that data back via the same ACP interface. All the data read back by the FPGA was correct, but still the data read by the Linux driver appeared stale. I don't know if the FPGA read the data straight from the L2 cache or the DDR, so this could mean one of two things:

 

1) the data was read back from the L2 cache, meaning that the mechanism for getting data from the L2 cache into the L1 cache is not working, so the ARM is seeing stale data in its L1 cache, or

 

2) the data was read back from the DDR, meaning that it is either the L1 or the L2 behaviour or both that is not set up correctly.

 

I think the first seems more likely but I don't have any real evidence. I think this confirms that it is a cache coherency issue though. Does anyone have an idea of what might be going wrong, or what configuration/initialisation may not be getting done?

 

Thanks,

Arthur

Contributor
Registered: ‎05-05-2015

Hello,

 

I am wondering if you found a solution for this problem. I posted a similar problem related to ACP and Zynq Ultrascale:

 

https://forums.xilinx.com/t5/SDSoC-Development-Environment/zcu102-ACP-port/td-p/756724

 

I am not sure whether there is a workaround for using the ACP port on the Zynq UltraScale+.

 

Thanks,

 

 

Observer
Registered: ‎08-09-2017

@Anonymous, were you ever able to fix this issue? I'm seeing pretty much the exact same symptoms. I've tried shrinking the RAM available to Linux to 1GB and allocating memory for the FPGA at 0x400000000+ (above the 1GB). I've tried the ACP, HP, and HPC ports with various kinds of soft flushing in software, and they all have their own issues that sound closely related to your initial post. I see the SMMU starting up properly in dmesg.
