05-02-2019 12:46 PM
I’ve done some digging in the Xilinx LWIP port and found a definite bug. In cases where tons of RAM is available, you’d probably be able to avoid this issue, but we don’t have that luxury.
When a packet comes in through the DMA (emacps) the packets are enqueued in a packet queue (PQ).
New packet buffers are popped from the packet buffer pool.
These packet buffers are then given to the DMA through the BD Ring.
The issue comes when there aren’t any packet buffers to give to the BD ring. When packets are popped off the BD ring, they are only checked whether they are “new” packets, meaning they have been given to the BD ring, the DMA has filled the packet, and given it back.
This line of code indicates that it will try to pop all of the “new” buffers, regardless of whether they have been popped before.
The RingPtr->HwTail was certainly intended to be used for this purpose – and it does work when done that way. As it currently is implemented, RingPtr->HwTail is entirely ignored. Even the comment above the indicates HwTail and HwCnt should be used, however they aren’t used at all.
The thing that might suffer is when frames get split over partial buffers. I haven’t come across this condition, but suspect it might have to do with jumbo frame support. There are likely issues in that implementation as well.
The other issue that jumps out is right here where they’re always calling printf from within an interrupt handler. If anything, it should use LWIP_DEBUGF. Removing printf entirely from the Xilinx contrib of LWIP would likely make it much smaller. (“many kilobytes of code and that is unacceptable in most embedded systems”)
This issue is very reproduceable, and causes the entire ethernet RX stack to crash. (Double frees, garbage processed through the LWIP stack, corrupt BD ring, pbuf pool never freed, etc.)