Visitor ajk
Registered: ‎12-22-2016

lwip 141 v1.5 udp intermittent packet loss, old packet reception!

Hello,

I have a problem with the following setups:

Zynq 7000, SDK 2016.2, FreeRTOS823, lwip141 V1.5, UDP

Zynq 7000, SDK 2016.2, FreeRTOS823, lwip202 V1.0, UDP

Short description

Zynq with PS Ethernet. The PHY is a Micrel KSZ9031RNXCATR.

The same app runs on both Zynq boards.

Zynq7000_A <===> Switch <===> Zynq7000_B
Client_A                    ------>                Server_B
Server_A                   <-----                 Client_B
Every second, clients A and B send packets to each other (almost simultaneously):
e.g. A to B
                   P1 ... ca. 60 µs ... P2 ... ca. 20 µs ... P3 (here, "P" denotes packet)
This communication works perfectly for about 7 hours.
After 7 hours plus some minutes (roughly 20 or more), two types of error occur on both sides:
1. A receives older packets from B (and B receives older packets from A).
2. A does not receive a packet from B at all (and B does not receive a packet from A at all).
 
The received older packets are always exactly 127 packets old!
The errors occur intermittently for about 8 minutes.
After those 8 minutes, the exchange becomes normal again.
The error does not repeat after another 7 hours!
 
Changing the frequency of communication (for example, sending packets 500 times per second) does not make the error happen earlier. It always starts after 7 hours and x minutes, and always lasts about 8 minutes.
 
I could not find anything by switching on all of the lwIP debug output.
Any ideas for solving this problem would be appreciated.
pbufs, DMA ring buffers ... ???
If you need more details about this problem, I will gladly provide them.

Kind Regards,

ajk

// This is my data packet exchanged with UDP.
typedef struct CommData{
	//	message ID
	u16	msgId;
	//	device ID
	u16	msRId;
	//	Count of trigger
	int     count;
	int     data_1;
	int     data_2;
} COMMDATA;
Both Zynq boards are triggered simultaneously and send the data almost simultaneously. So, every second, the "count" received from the other side should match the receiver's own "count".
I log the counters with xil_printf via UART and Tera Term.
Here is simplified code for understanding the problem.
// Using my own wrappers for lwIP calls.
// readUdpNet translates to recvfrom(..), which translates to lwip_recvfrom(..)

while( rc == NetSuccess ){
	rc = readUdpNet( udpServer, &recvBuf, sizeof( recvBuf ), &numBytes );

	if( rc == NetSuccess ){
		// Got a message. Process it.
		// pBuf is assumed to point to recvBuf (declaration omitted in this simplified code).
		xil_printf( "Own count %d\r\n", gCount );
		if( pBuf->count <= prevRecvBuf.count ){
			// Old message
			xil_printf( "OM %d %d %d %d\r\n",
					pBuf->msRId,
					pBuf->data_1,
					pBuf->count,
					prevRecvBuf.count );
		}
		prevRecvBuf = *pBuf;
		xil_printf( "MOk %d %d\r\n",
		    pBuf->msRId, pBuf->count );
		// Give semaphore to the higher-priority task waiting for this data.
..... } }
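To make the check above easier to reason about, here is a minimal sketch that separates "next packet", "gap", "duplicate" and "old packet" by the signed distance between the two counters. The names PktClass, count_delta and classify_count are hypothetical and not part of my application or of lwIP; the sketch only assumes that "count" increases by one per packet.

```c
#include <stdint.h>

/* Hypothetical classification of a received counter value
 * relative to the last accepted one. */
typedef enum { PKT_NEXT, PKT_GAP, PKT_DUPLICATE, PKT_OLD } PktClass;

/* Signed distance from prev to cur; a negative result means cur is older.
 * The subtraction is done in unsigned arithmetic to avoid signed overflow. */
static int32_t count_delta(int32_t prev, int32_t cur)
{
    return (int32_t)((uint32_t)cur - (uint32_t)prev);
}

/* Classify the received counter against the previous one. */
static PktClass classify_count(int32_t prev, int32_t cur)
{
    int32_t d = count_delta(prev, cur);

    if (d == 1) return PKT_NEXT;       /* expected successor */
    if (d > 1)  return PKT_GAP;        /* one or more packets missing */
    if (d == 0) return PKT_DUPLICATE;  /* same counter seen again */
    return PKT_OLD;                    /* counter went backwards */
}
```

Logging the delta itself (instead of only "OM") would show directly whether the offset of an old packet is always the same value.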
Output up to the occurrence of the error
Own count 26779           up to now OK.
MOk 1 26779
Own count 26780           up to now OK.
MOk 1 26780
Start of the error after 7 hrs x minutes
Own count 26781
OM 1 1356 26653 26780       the next packet was an old one (diff. is 127)!!!
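For reference, the counter values in this log can be converted into elapsed time. This is only a sketch under the assumption that the counter starts at 0 and advances exactly once per second (as in the one-packet-per-second test); Elapsed and count_to_elapsed are hypothetical names used for illustration.

```c
/* Elapsed time split into hours, minutes and seconds. */
typedef struct {
    unsigned hours;
    unsigned minutes;
    unsigned seconds;
} Elapsed;

/* Convert a trigger count into elapsed time.
 * Assumption: one trigger per second, counting from 0. */
static Elapsed count_to_elapsed(unsigned count)
{
    Elapsed e;

    e.hours   = count / 3600u;
    e.minutes = (count % 3600u) / 60u;
    e.seconds = count % 60u;
    return e;
}
```

Under that assumption, own count 26781 corresponds to 7 h 26 min 21 s after start, and the old packet with count 26653 was originally sent 127 s earlier, at 7 h 24 min 13 s.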
 
At other times within the 8 minutes the other type of error:
Own count 26883
Data recv timed out!     (written by task waiting for data)
 
 