Improve UDP Throughput

I am porting an embedded application from FreeRTOS to PetaLinux. The target is a custom-made board with a 1G PHY and a Zynq-7020 chip. The app repeatedly generates about 16 KB of data and sends it through the PHY to a host computer as fragmented UDP messages at a fairly high rate. This all works very well with FreeRTOS and lwIP, achieving up to 800 Mbps sustained throughput. With PetaLinux I am getting about 400 Mbps, which seems respectable judging by other posts I have read, but I need to squeeze some more speed out of it.
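
For a sense of scale, here is a small sketch of the fragmentation arithmetic. It assumes IPv4 with a 20-byte header and no options, and the helper name udp_fragment_count is made up for illustration. The MTU is also an assumption, since the post does not state one.

```c
#include <assert.h>
#include <stddef.h>

#define IPV4_HEADER_LEN 20u  /* assumes no IP options */
#define UDP_HEADER_LEN   8u

/* Number of IPv4 fragments needed to carry one UDP datagram whose payload is
 * payload_len bytes, on a link with the given MTU. */
size_t udp_fragment_count(size_t payload_len, size_t mtu)
{
    assert(mtu >= IPV4_HEADER_LEN + 8u);

    /* Every fragment except the last must carry a multiple of 8 bytes. */
    size_t per_fragment = ((mtu - IPV4_HEADER_LEN) / 8u) * 8u;

    /* The UDP header is part of the IP payload and rides in the first fragment. */
    size_t total = payload_len + UDP_HEADER_LEN;

    return (total + per_fragment - 1u) / per_fragment;
}
```

With a 16384-byte payload this gives 12 fragments at an MTU of 1500 and 2 fragments at an MTU of 9000. At 800 Mbps that is roughly 6,100 such datagrams per second, so per-datagram overhead on the CPU side matters.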

The FPGA builds the data into one of two ping-pong buffers in on-chip memory (OCM) and then interrupts the ARM CPU. On interrupt, the ARM app reads a register that indicates which ping-pong buffer is complete, adds some header information to the OCM buffer, and then calls sendto to send the buffer to the host.
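
The send step described above could be sketched roughly as follows. The header layout (struct frame_header) and the function name send_frame are hypothetical, since the post does not give the real header format. Here buf stands in for whichever ping-pong buffer the register reports as complete.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical 8-byte header; the real layout is not given in the post. */
struct frame_header {
    uint32_t sequence;     /* network byte order */
    uint32_t payload_len;  /* network byte order */
};

/* Write the header at the start of buf, then send header plus payload as one
 * UDP datagram. buf must hold sizeof(struct frame_header) + payload_len bytes,
 * with the payload already in place after the header area. */
ssize_t send_frame(int sock, const struct sockaddr_in *dst,
                   uint8_t *buf, uint32_t payload_len, uint32_t sequence)
{
    struct frame_header hdr = { htonl(sequence), htonl(payload_len) };

    assert(buf != NULL && dst != NULL);
    memcpy(buf, &hdr, sizeof hdr);

    /* sendto copies the whole buffer from user space into kernel socket
     * memory; this is the copy that a zero-copy path would try to avoid. */
    return sendto(sock, buf, sizeof hdr + payload_len, 0,
                  (const struct sockaddr *)dst, sizeof *dst);
}
```

The kernel then fragments the datagram to fit the link MTU, so the application makes one sendto call per buffer regardless of its size.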

I believe the slowdown is at the user-kernel boundary. I have seen posts that suggest using vmsplice/splice to move data from memory to the socket through a pipe, but I have not been able to get this to work. I think the problem is that the user-level code cannot mmap the OCM memory correctly, so the vmsplice call fails with EFAULT (bad address). I have mmap-ed the OCM memory via /dev/mem, and I think that might be one place where I am going wrong.
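
For reference, the vmsplice/splice sequence looks roughly like the sketch below when the source is ordinary process memory. The names udp_connect and splice_send are made up for illustration. One possible explanation for the EFAULT, offered as an assumption rather than a confirmed diagnosis: vmsplice has to pin the user pages behind the buffer, and the kernel generally refuses to do that for raw physical mappings of the kind /dev/mem creates.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Open a UDP socket connected to dst. splice has no destination-address
 * argument, so the socket must be connected beforehand. */
int udp_connect(const struct sockaddr_in *dst)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s >= 0 && connect(s, (const struct sockaddr *)dst, sizeof *dst) < 0) {
        close(s);
        return -1;
    }
    return s;
}

/* Queue len bytes of buf into the pipe by reference, then move them from the
 * pipe into the connected socket. Returns the byte count handed to the
 * socket, or -1 on error. len must fit within the pipe capacity. */
ssize_t splice_send(int sock, int pipe_rd, int pipe_wr, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };

    assert(buf != NULL && len > 0);

    /* vmsplice pins the pages behind buf; a mapping the kernel cannot pin
     * is where an EFAULT would surface. */
    ssize_t queued = vmsplice(pipe_wr, &iov, 1, 0);
    if (queued < 0)
        return -1;

    /* One splice call is intended to produce one datagram. */
    return splice(pipe_rd, NULL, sock, NULL, (size_t)queued, 0);
}
```

Because the pipe holds references to the pages rather than a copy, the buffer should not be overwritten until the kernel has finished transmitting it, which matters for a two-buffer ping-pong scheme.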

I have also tried creating a kernel module that ioremaps the OCM memory and provides it to the user application, but I have not been able to get the user app to access the OCM buffer properly (the kernel module does appear to get the buffer from the OCM driver). I need to be able to modify the buffer in OCM at the user level and then have it sent to a socket without the kernel making an extra copy of it (that is, get closer to zero-copy).
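
On the user side, mapping the buffer looks much the same whether the node is /dev/mem or a character device exported by a custom module that implements mmap. The sketch below uses a hypothetical helper, map_region. The device path and offset are left as parameters because the post does not give the OCM address or the module's node name.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map len bytes starting at offset within the given node (for example
 * /dev/mem, or a node exported by a custom driver). offset must be a
 * multiple of the page size. Returns NULL on failure. */
void *map_region(const char *path, off_t offset, size_t len)
{
    assert(path != NULL && len > 0);

    /* O_SYNC is commonly used with /dev/mem to request an uncached mapping. */
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);

    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

A mapping obtained this way is fine for reading and modifying the buffer from user space. Whether the kernel can then send from it without copying is a separate question, and it likely depends on how the driver backs the mapping.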

Is what I am trying to do even possible in PetaLinux? Can someone please point me in a direction that might work? I am fairly new to PetaLinux, so I might be missing something trivial.
