Visitor derekyu
Registered: ‎10-03-2016

ZYNQ platform UDP Packet corruption when optimization is ON

I am seeing significant corruption of incoming UDP packets (about 40 out of 8,000 packets, each 528 bytes) on my custom Zynq board. It happens on an isolated network with only two devices connected back to back, where I would expect no packet loss at all.

After I enabled the lwIP UDP checksum check option, these packets were discarded. It looks like the packets are being corrupted inside the lwIP stack. However, when I tried to debug the lwIP stack by changing the compile flag to -O0 in the BSP build, the problem disappeared.

This behavior is reproducible: there is no packet corruption when I build with -O0, but packets are corrupted with the other optimization levels (-O1 through -O3).

Any suggestions on how to resolve this issue?

12 Replies

Scholar hbucher
Registered: ‎03-22-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

@derekyu

What makes you think that the Zynq device is fast enough to handle the traffic?

The master GP ports are 128 MB/s and the HP ports just 256 MB/s each. There are many choke points. Is this 10G or 1G? Is this Linux or bare metal?

Also, if this is being DMA'd through the slave HP port, beware that the HP port does not take the cache coherency of the PS into account, so you might end up with corrupted data. You have to invalidate the cache before reading your buffer.

This can be solved by routing the data through the ACP slave port, but that will potentially make the PS slower because of cache misses, as expected.

There's a ton of info here (for Linux):

http://www.wiki.xilinx.com/Zynq-7000+AP+SoC+-+Performance+-+Ethernet+Packet+Inspection+-+Linux+-+Redirecting+Packets+to+PL+and+Cache+Tech+Tip

Bare metal:

https://forums.xilinx.com/t5/Embedded-Processor-System-Design/Slave-port-HP0-on-Zynq-problem/td-p/265968

vitorian.com --- We do this for fun. Always give kudos. Accept as solution if your question was answered.
I will not answer to personal messages - use the forums instead.
Visitor derekyu
Registered: ‎10-03-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

There is no UDP corruption when I turn compiler optimization off (-O0) when building the BSP. If the board can handle the traffic with unoptimized code, I would expect the optimized code (-O2) to handle it without problems.

I am running bare metal with a 1G network connection.

I do not explicitly call Xil_ICacheEnable() or Xil_DCacheEnable() in my code. Unless it is implicitly enabled somewhere, the cache is not enabled.

Scholar hbucher
Registered: ‎03-22-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

@derekyu Yes, the cache is implicitly enabled on Zynq. That is why the ACP port exists in the first place.

A non-optimized build might be allowing enough time for cache coherency, who knows.

When you are dealing with the HP ports, clearing the cache is mandatory on Zynq. My advice: just do it and see what happens.

 

Visitor derekyu
Registered: ‎10-03-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

If the cache needs to be cleared, I would expect the Xilinx SDK lwIP contrib code to do this already (I see the call to Xil_DCacheInvalidateRange() in setup_rx_bds() of xemacpsif_dma.c).

I would have no idea how to find out which cache lines need to be cleared.

Scholar hbucher
Registered: ‎03-22-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

@derekyu

I doubt lwip would be aware where the buffers are being read from as you can point them anywhere.

In fact, if you are using the ACP, the last thing you want to do is invalidate the cache.

 

Xil_DCacheInvalidateRange( ptr, size )

where (ptr,size) is the memory location you are about to read/write.
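As a rough sketch of how such a (ptr, size) pair might be prepared: the Cortex-A9 L1 data cache uses 32-byte lines, so a range is often rounded out to whole lines before it is invalidated, since a buffer that shares a line with unrelated data is a common source of trouble. The helper `cache_align_range` and the `CACHE_LINE` constant below are made up for illustration; only `Xil_DCacheInvalidateRange()` comes from this thread, and the call is shown in a comment so the snippet also builds off-target.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE 32u  /* Cortex-A9 L1 data cache line size in bytes */

/* Hypothetical helper type: a range widened to whole cache lines. */
struct cache_range {
    uintptr_t start;  /* rounded down to a line boundary */
    uint32_t  len;    /* multiple of CACHE_LINE */
};

/* Widen [addr, addr + len) so that it starts and ends on line boundaries. */
static struct cache_range cache_align_range(uintptr_t addr, uint32_t len)
{
    struct cache_range r;
    uintptr_t mask = ~(uintptr_t)(CACHE_LINE - 1u);
    uintptr_t end;

    assert(len > 0);
    r.start = addr & mask;
    end = (addr + len + CACHE_LINE - 1u) & mask;
    r.len = (uint32_t)(end - r.start);
    return r;
}

/* On the target, one would then do something along these lines before the
 * CPU reads a buffer that the DMA has just written:
 *
 *   struct cache_range r = cache_align_range((uintptr_t)buf, size);
 *   Xil_DCacheInvalidateRange(r.start, r.len);
 */
```

For a 528-byte packet that starts 4 bytes into a line, the widened range covers 544 bytes, i.e. 17 whole lines.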

 

Visitor derekyu
Registered: ‎10-03-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

The following description comes from emacps_v3_1 Documentation:

 

Alignment & Data Cache Restrictions

...........

Both cache invalidate/flush are taken care of in driver code.

 

This is what I meant when I said I expect the SDK to take care of invalidating the cache.

Scholar hbucher
Registered: ‎03-22-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

@derekyu

Why don't you just put this line there and see how it goes?

 

Explorer
Registered: ‎08-21-2013

Re: ZYNQ platform UDP Packet corruption when optimization is ON

What is the nature of the corruption? If you compare the bytes you sent to the bytes you received, are there random errors or always certain bytes?
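One way to answer that question is to diff a known transmitted payload against what arrives. The sketch below is only an illustration; `diff_buffers` and `struct diff_summary` are hypothetical names, not part of lwIP or the Xilinx BSP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the differences between two equal-length buffers. */
struct diff_summary {
    size_t count;  /* number of differing bytes */
    size_t first;  /* offset of the first differing byte (valid if count > 0) */
    size_t last;   /* offset of the last differing byte (valid if count > 0) */
};

/* Compare what was sent with what was received, byte by byte. */
static struct diff_summary diff_buffers(const uint8_t *sent,
                                        const uint8_t *received,
                                        size_t len)
{
    struct diff_summary d = {0, 0, 0};
    size_t i;

    assert(sent != NULL && received != NULL);
    for (i = 0; i < len; i++) {
        if (sent[i] != received[i]) {
            if (d.count == 0)
                d.first = i;
            d.last = i;
            d.count++;
        }
    }
    return d;
}
```

If the damaged region tends to start on a 32-byte boundary of the receive buffer and spans whole 32-byte chunks, that would point toward stale cache lines rather than random bit errors.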

Scholar ericv
Registered: ‎04-13-2015

Re: ZYNQ platform UDP Packet corruption when optimization is ON

While developing our lwIP drivers, I have many times seen -O0 work and the other levels fail.

Every time, the problem was not lwIP but the lower-level interface and driver.

Each case was unique.

 

If it helps, here are a few things you could check:

- On the Zynq, the DMA descriptors cannot be in cached memory, as they are contiguous blocks of 16 bytes (half a cache line). At -O0 the problem may not be visible even if they are in cached memory.

- Buffer copying: the GCC library memmove() / memcpy() behave differently between -O0 and the other levels (try using lwIP's own copy function).

- At -O1 the code is already faster than at -O0: check the throughput and add packet pacing to bring it down, to see if it becomes OK.

- Running out of buffers while the low level carries on and doesn't wait for an available buffer (increase the number of DMA buffers, or block when no buffer is available).
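For the packet pacing check in the list above, the sender needs a minimum gap between packets for a chosen target rate. The following is only a sketch of that arithmetic; `pacing_interval_us` is a hypothetical helper, and it counts payload bytes only, ignoring Ethernet/IP/UDP header overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: minimum time between packet starts, in microseconds,
 * so that payload throughput stays at or below target_bps (bits per second).
 * The result is rounded up so the target rate is never exceeded. */
static uint32_t pacing_interval_us(uint32_t payload_bytes, uint32_t target_bps)
{
    uint64_t bit_microseconds;

    assert(target_bps > 0);
    bit_microseconds = (uint64_t)payload_bytes * 8u * 1000000u;
    return (uint32_t)((bit_microseconds + target_bps - 1u) / target_bps);
}
```

With the 528-byte packets from this thread, a 10 Mbit/s target gives a gap of 423 microseconds between packets. Lowering the target until the corruption disappears would give a rough idea of whether timing plays a role.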

 

I can't provide specific insight into the BSP EMAC driver. A few years ago, looking at it, I was a bit baffled by its complexity, when it's quite a simple set of operations to perform.

 

Regards

Visitor derekyu
Registered: ‎10-03-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

Thanks for the comments. Here is some more information I gathered:

1. The packet corruption was somewhat random, with no pattern to when it happened. Sometimes there was no corruption; sometimes the corruption ratio was about 50 packets per 10,000. There was no particular pattern in the corrupted data either.

2. I realized that hardware checksum offload was enabled. However, checksum errors showed up when I enabled the lwIP software UDP checksum check. This implies the packet was received OK and was corrupted after it was received and before lwIP checked the checksum. So the likely culprit is the interaction between the DMA and the cache controller.

3. I also tried disabling the data cache, and there was no UDP packet corruption afterward. The performance, however, was even worse than with optimization turned off.

4. The software path from receiving the packet to checking the checksum in the lwIP stack is managed by the Xilinx SDK. I looked into xemacpsif_dma.c and other files in the Xilinx-contributed netif folder. So far I have found no candidate location to insert a data cache invalidate.
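The argument in point 2 rests on the software checksum seeing different bytes than the hardware did. As a minimal sketch of the ones' complement sum (RFC 1071) that the UDP checksum is built on: `inet_checksum` is a hypothetical name and not lwIP's own routine, and a real UDP check also covers a pseudo-header that is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones' complement Internet checksum (RFC 1071) over a byte buffer.
 * Bytes are combined into big-endian 16-bit words; an odd trailing byte
 * is padded with zero. Intended for packet-sized buffers (up to 64 KiB). */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1u)
        sum += (uint32_t)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Running the function over data that already contains its own checksum yields 0 when nothing has changed, so any byte that differs between what the MAC verified and what the CPU later reads from a stale cache line shows up as a non-zero result.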

Scholar hbucher
Registered: ‎03-22-2016

Re: ZYNQ platform UDP Packet corruption when optimization is ON

@derekyu Try to hook up to the ACP instead of the HP ports and see what happens.

Scholar ericv
Registered: ‎04-13-2015

Re: ZYNQ platform UDP Packet corruption when optimization is ON

@derekyu

In the lwIP port / driver directory, look for a function named low_level_input().

That's the standard function name used in lwIP to I/F between the stack and the EMAC driver.

There's certainly a copying done from the driver buffer to lwIP's pbuf (the pbuf is supplied by the stack/caller).

This is where you would need to apply a cache invalidate (not a flush) on the driver buffer before it is copied.
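A rough sketch of that ordering, assuming a copy-based receive path as described above (the actual Xilinx port may be structured differently). Everything here is hypothetical scaffolding: `struct rx_seg` is a simplified stand-in for an lwIP pbuf chain, `invalidate_fn` stands in for a routine such as Xil_DCacheInvalidateRange(), and `record_invalidate` is a host-side stub so the snippet can be tried off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the BSP's cache invalidate routine. */
typedef void (*invalidate_fn)(uintptr_t addr, uint32_t len);

/* Hypothetical, simplified version of one segment of a pbuf chain. */
struct rx_seg {
    struct rx_seg *next;
    uint8_t *payload;
    uint16_t len;
};

/* Host-side stub: only records the last invalidate request. */
static uintptr_t last_inval_addr;
static uint32_t last_inval_len;

static void record_invalidate(uintptr_t addr, uint32_t len)
{
    last_inval_addr = addr;
    last_inval_len = len;
}

/* Copy one received frame from the driver's DMA buffer into the chain.
 * Returns the number of bytes copied. */
static size_t copy_frame_to_chain(const uint8_t *dma_buf, size_t frame_len,
                                  struct rx_seg *chain, invalidate_fn inval)
{
    size_t copied = 0;
    struct rx_seg *q;

    assert(dma_buf != NULL && inval != NULL);

    /* Invalidate (not flush) first, so the CPU reads what the DMA wrote
     * to memory instead of stale cached contents. */
    inval((uintptr_t)dma_buf, (uint32_t)frame_len);

    for (q = chain; q != NULL && copied < frame_len; q = q->next) {
        size_t n = frame_len - copied;
        if (n > q->len)
            n = q->len;
        memcpy(q->payload, dma_buf + copied, n);
        copied += n;
    }
    return copied;
}
```

On the target, the stub would be replaced by a thin wrapper around the real invalidate call, and the receive buffers would normally be cache-line aligned so that the invalidate does not touch neighbouring data.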

 

Regards