j_ney
Contributor

Communication ACP Linux user space driver


Hey!

I've designed an application on the UltraScale+ where I transfer data between the PS and the PL through PS-side DRAM, which I access with the mmap function in Linux.

Currently, I use the HP ports to transfer the data. One bottleneck of this application is the time the software needs to access the DRAM and read the value. Although it only reads 64 bits, the DRAM access from software takes a significant amount of time compared to the runtime of the hardware core. 

So I thought about using the ACP port instead of the HP port, so the software does not need to access the DRAM but can read the cached value.

Is my understanding correct that this could theoretically improve the performance?

My problem is that when using the ACP port I get wrong results in software. The only thing I changed in the hardware design is HP to ACP. My guess is that the software still reads from DRAM while the PL writes into the cache, so the software sees a stale value.

To read from memory I use the /dev/mem device with the mmap function. 
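
For reference, the access path currently looks roughly like this (a simplified sketch; the physical address is just a placeholder for my reserved region):

/* Simplified sketch: map a physical DRAM region through /dev/mem and read
 * the 64-bit result word written by the PL. RESERVED_PHYS is a placeholder
 * for the address of the reserved region. */
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define RESERVED_PHYS 0x70000000UL  /* placeholder physical address */
#define MAP_SIZE      0x1000UL

int main(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open /dev/mem"); return 1; }

    volatile uint64_t *buf = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, RESERVED_PHYS);
    if (buf == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    uint64_t result = buf[0];  /* the 64-bit value produced by the hardware core */
    printf("result = 0x%016" PRIx64 "\n", result);

    munmap((void *)buf, MAP_SIZE);
    close(fd);
    return 0;
}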

Is my understanding of the problem correct, and if so, how can I force the software to read from the cache?

Do I need to change anything else in the hardware or the Linux kernel besides the port?

 

Thanks in advance!

 

2 Replies

Sounds like you are missing a piece of the puzzle.

How do you reserve the memory region on the PS side that is used to exchange the data?

Either you use contiguous memory allocation (CMA) in the kernel, which can be a pain, or you use an abstracting driver (the udmabuf driver, for example) that gives you user-space access to the physical addresses of the CMA-reserved memory.

If you do what I assume you are doing now, just carving some memory out of your Linux system and mmap-ing it through /dev/mem, it won't be cached / is not cacheable by the PS side and is therefore much slower (I saw a >10x slowdown).

If you do neither of the above and just pick a random address that is managed by Linux and write to it from the PL side (e.g. via DMA), it is very likely that you will corrupt your system.
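
Roughly, with udmabuf the user-space side looks like the sketch below. This assumes one buffer instance named udmabuf0; the device node and the sysfs path (/sys/class/u-dma-buf/... vs. /sys/class/udmabuf/...) depend on the driver version you build, so treat the paths as placeholders.

/* Sketch (udmabuf assumed loaded with one instance "udmabuf0"):
 * read the buffer's physical address from sysfs, hand it to the PL DMA,
 * and mmap the buffer through the udmabuf device instead of /dev/mem. */
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    unsigned long phys_addr = 0;
    FILE *f = fopen("/sys/class/u-dma-buf/udmabuf0/phys_addr", "r");  /* path may differ */
    if (!f || fscanf(f, "%lx", &phys_addr) != 1) { perror("phys_addr"); return 1; }
    fclose(f);

    int fd = open("/dev/udmabuf0", O_RDWR);
    if (fd < 0) { perror("open /dev/udmabuf0"); return 1; }

    /* Map the first page of the buffer (must not exceed the configured buffer size). */
    volatile uint64_t *buf = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    printf("physical address for the PL side: 0x%lx\n", phys_addr);
    printf("first 64-bit word: 0x%016" PRIx64 "\n", buf[0]);

    munmap((void *)buf, 0x1000);
    close(fd);
    return 0;
}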


j_ney
Contributor

juergen.kratochwill@grapho-metronic.com 

Thanks for your answer.

If you do what I assume you are doing now, just carving some memory out of your Linux system and mmap-ing it through /dev/mem, it won't be cached / is not cacheable by the PS side and is therefore much slower (I saw a >10x slowdown).


That's what I do. But I found out that most of the timing overhead in my application came from opening and closing the /dev/mem device every time I read. Now I open and close it just once and it works fine. The DRAM access itself is probably negligible for me.
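
In case it helps anyone else, the change is essentially this (sketch only, the address is a placeholder):

/* Sketch of the fix: open /dev/mem and mmap once at startup, then reuse the
 * mapping for every read instead of re-opening the device each time. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define RESERVED_PHYS 0x70000000UL  /* placeholder for the reserved region */

static volatile uint64_t *shared_buf;

static int init_mapping(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) return -1;
    shared_buf = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, RESERVED_PHYS);
    close(fd);                  /* the mapping stays valid after closing the fd */
    return (shared_buf == MAP_FAILED) ? -1 : 0;
}

static uint64_t read_result(void)
{
    return shared_buf[0];       /* no open/mmap/close per read any more */
}

int main(void)
{
    if (init_mapping() != 0) { perror("init_mapping"); return 1; }
    printf("result = 0x%016lx\n", (unsigned long)read_result());
    return 0;
}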

Your answer will probably still be helpful if someone else runs into the caching problem, so I'll mark it as the solution.