Visitor
9,212 Views
Registered: ‎12-22-2011

LWIP echo server example drops/delays TX packets on Zedboard ?

I'm trying to get started with SDK 2016 on Zynq 7k (Zedboard).

 

I have created an application project from the "LWIP echo server" template (both the standalone and FreeRTOS versions, in both 2016.1 and 2016.2) and cannot get it to work properly. DHCP often fails with the default timeout value, and ping response times are "weird":

 

64 bytes from 10.3.228.143: icmp_seq=680 ttl=255 time=0.096 ms
64 bytes from 10.3.228.143: icmp_seq=681 ttl=255 time=1000 ms
64 bytes from 10.3.228.143: icmp_seq=682 ttl=255 time=0.150 ms
64 bytes from 10.3.228.143: icmp_seq=683 ttl=255 time=1000 ms
64 bytes from 10.3.228.143: icmp_seq=684 ttl=255 time=0.182 ms
64 bytes from 10.3.228.143: icmp_seq=685 ttl=255 time=0.110 ms
64 bytes from 10.3.228.143: icmp_seq=686 ttl=255 time=1000 ms
64 bytes from 10.3.228.143: icmp_seq=687 ttl=255 time=0.134 ms
64 bytes from 10.3.228.143: icmp_seq=688 ttl=255 time=1000 ms
64 bytes from 10.3.228.143: icmp_seq=689 ttl=255 time=0.164 ms
64 bytes from 10.3.228.143: icmp_seq=690 ttl=255 time=0.155 ms

 

By looking at Wireshark dumps and tracing the code, I have determined that some (25% - 50%) of TX packets are not actually transmitted when requested; instead they seem to sit forgotten in a queue somewhere until the next TX packet comes along. When this happens, it seems that emacps_send_handler() never gets called for the "lost" packet. No error messages are printed, even with full LWIP debugging enabled.

 

I know the hardware setup works OK, networking is running fine with Linux on the board.

6 Replies
Visitor
9,207 Views
Registered: ‎12-22-2011

In response to myself, in case anyone else runs into the same problem.

 

After digging into the code and documentation a little deeper, I believe I found the cause: the "xemacpsif" driver doesn't flush DMA descriptors and therefore semi-randomly fails to transmit packets. The sample code works perfectly fine with the data cache disabled.
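
The suspected failure mode can be pictured with a toy model in plain C. This is not driver code: it models a single TX descriptor word seen through a write-back cache, and every name in it (toy_desc_t, TOY_TX_USED and the toy_* functions) is hypothetical. It assumes the descriptor sits in memory that behaves as cacheable, which is the poster's hypothesis rather than a confirmed fact.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names come from the Xilinx driver. */
#define TOY_TX_USED 0x80000000u  /* set: software owns the descriptor, DMA skips it */

typedef struct {
    uint32_t dram;    /* copy in memory, which is what the DMA engine reads */
    uint32_t cached;  /* copy in the data cache, which is what the CPU touches */
    bool     dirty;   /* cached copy has not been written back yet */
} toy_desc_t;

/* CPU hands the descriptor to the DMA engine by clearing the "used" bit.
 * With a write-back cache the change stays in the cache for now. */
static void toy_cpu_queue_packet(toy_desc_t *d)
{
    d->cached &= ~TOY_TX_USED;
    d->dirty = true;
}

/* Write the cached copy back, as an explicit flush or a later eviction would. */
static void toy_cache_flush(toy_desc_t *d)
{
    if (d->dirty) {
        d->dram = d->cached;
        d->dirty = false;
    }
}

/* DMA engine looks at memory only; true if it would transmit the packet. */
static bool toy_dma_would_transmit(const toy_desc_t *d)
{
    return (d->dram & TOY_TX_USED) == 0;
}
```

In this model a packet queued with toy_cpu_queue_packet() stays invisible to toy_dma_would_transmit() until toy_cache_flush() runs, which would look like a packet waiting in a queue until some later activity pushes it out.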

 

 

Visitor
9,188 Views
Registered: ‎12-22-2011

Another update: I managed to get LWIP working with the cache enabled.

 

Found that the EMAC driver does (claim to) set up an uncached region for the DMA descriptors, setting the TLB attributes to NORM_NONCACHE (0x11DE2). By changing this to STRONG_ORDERED (0xC02) everything seems to work reliably.
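
To see what the two constants actually encode, here is a minimal host-side sketch that pulls the relevant fields out of an attribute word. It assumes the values are ARMv7-A short-descriptor section attributes and that TEX remap is disabled; the helper names (attr_tex, attr_memory_type and so on) are made up for this example and are not part of the BSP.

```c
#include <stdint.h>

typedef enum {
    MEM_STRONGLY_ORDERED,
    MEM_SHAREABLE_DEVICE,
    MEM_NORMAL_NONCACHEABLE,
    MEM_OTHER  /* cacheable, or an encoding this sketch does not cover */
} mem_type_t;

/* Field positions in an ARMv7-A short-descriptor section entry. */
static unsigned attr_b(uint32_t a)   { return (a >> 2) & 0x1u; }   /* B bit */
static unsigned attr_c(uint32_t a)   { return (a >> 3) & 0x1u; }   /* C bit */
static unsigned attr_ap(uint32_t a)  { return (a >> 10) & 0x3u; }  /* AP[1:0] */
static unsigned attr_tex(uint32_t a) { return (a >> 12) & 0x7u; }  /* TEX[2:0] */
static unsigned attr_s(uint32_t a)   { return (a >> 16) & 0x1u; }  /* S bit */

/* Memory type for the TEX/C/B encodings that matter here (no TEX remap). */
static mem_type_t attr_memory_type(uint32_t a)
{
    unsigned tex = attr_tex(a), c = attr_c(a), b = attr_b(a);

    if (tex == 0 && c == 0 && b == 0) return MEM_STRONGLY_ORDERED;
    if (tex == 0 && c == 0 && b == 1) return MEM_SHAREABLE_DEVICE;
    if (tex == 1 && c == 0 && b == 0) return MEM_NORMAL_NONCACHEABLE;
    return MEM_OTHER;
}
```

Under those assumptions 0x11DE2 decodes as Normal, non-cacheable memory (TEX=1, C=0, B=0, S=1) and 0xC02 as Strongly-ordered (TEX=0, C=0, B=0). Neither is cacheable, so the difference lies in the memory type: the architecture lets accesses to Normal memory be buffered and reordered, while Strongly-ordered accesses are performed in program order. That may be why the change helps, though nothing in this thread confirms the exact mechanism.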

 

I got this value from the FreeRTOS (v9.0) Zynq demo from freertos.org, which I assume is based on older (working?) Xilinx code.

 

Observer
8,193 Views
Registered: ‎01-05-2012

I had the same problem on a MyIR Z-turn board. After a ton of experimentation, I had worked around the problem by imposing a short delay in emacps_sgsend after XEmacPs_BdClearTxUsed, but that was just a hack. I figured it had something to do with the cache, but that's as far as I got.

 

Then I saw your post, erigusaab. Thanks.

 

It is possible to effect the required change without making a local copy of the BSP source files. I added this code after calling xemac_add in my application:

{
	// Work around delayed Ethernet transmission: remap the driver's
	// DMA descriptor area (bd_space, defined in the BSP) as strongly ordered.
	extern u8_t bd_space[0x100000];
	Xil_SetTlbAttributes((s32_t)bd_space, STRONG_ORDERED);
}

And at the top of the file:

#include "xil_mmu.h"

Now ping reliably responds within a millisecond or so.
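
The workaround passes only the start address of bd_space, which makes sense if the attribute change applies to a whole 1 MB section: 0x100000 bytes is exactly the ARMv7-A section size. The sketch below shows that arithmetic on the host. That the BSP maps memory in 1 MB sections and that bd_space is aligned to 1 MB are assumptions here, and the helper names and the example address 0x00200000 are hypothetical.

```c
#include <stdint.h>

#define SECTION_SIZE 0x100000u  /* 1 MB, the ARMv7-A section size */

/* Index of the 1 MB section that contains addr. */
static uint32_t section_index(uint32_t addr)
{
    return addr / SECTION_SIZE;
}

/* Section entry: base address in the top 12 bits, attributes below. */
static uint32_t section_entry(uint32_t addr, uint32_t attrib)
{
    return (addr & 0xFFF00000u) | attrib;
}

/* True if the region is exactly one aligned section, so a single
 * attribute change covers all of it and nothing else. */
static int covers_exactly_one_section(uint32_t addr, uint32_t len)
{
    return (addr % SECTION_SIZE) == 0 && len == SECTION_SIZE;
}
```

If the buffer were smaller than a section or not aligned to one, changing the attributes this way would also affect whatever else shares that megabyte, so it is worth checking the address of bd_space in the linker map before relying on the workaround.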

 

I found this ARM reference describing the cache types. Quote:

Strongly-ordered and Device memory types are used for communicating with input and output devices and memory-mapped peripherals. They are not looked-up in any cache.

 

That sure sounds like what we want here. I'd say Xilinx should make the change in the BSP code.

 

 

Scholar
8,156 Views
Registered: ‎06-14-2012

Thanks for reporting this. We have filed a bug report for this (CR# 954380), and it will be fixed in 2016.3.

 

Regards

Sikta

Visitor
6,762 Views
Registered: ‎06-01-2015

Thanks for the solution to this issue.

 

The STRONG_ORDERED change seems to fix both the ping issue and other related issues that I have seen: weird byte swaps and, periodically, several bytes of incorrect data.

 

It's a shame it took so long for Xilinx to even care about this and sort of promise a fix in 2016.3, as this issue can have a big impact on the quality of Zynq-based solutions.

 

 

 

Visitor
5,314 Views
Registered: ‎05-05-2014

Thanks for this solution.
Xilinx should test its libraries better.
I'm sure the problems started when they added LWIP drivers for Zynq UltraScale+. Too many new processors.
Big negative point for the Xilinx SW developers; they should check this forum: 1 year, 4 versions, and it is still happening.