Observer kerryhj
7,172 Views
Registered: 07-02-2008

LL_TEMAC Core Dropping Packets

I'm working on a project where I am integrating a processor subsystem into a non-EDK FPGA design.  The system boots from flash into U-Boot, and that part is working fine.  On a build that contains only the EDK design, everything seems to work correctly: I can download files rapidly over Ethernet without issue (for example, the ~1MB Linux kernel takes about 1 second).

 

However, when I use the other build, which is integrated with the rest of the FPGA design, I run into issues.  Ethernet seems to work (I can ping, etc.), U-Boot boots up, and I can read memory in U-Boot with no problems.  But when I try to TFTP the same files as in the other design, it is incredibly slow: after about 40 seconds the transfer times out, the Ethernet driver reinitializes, and it tries again, repeating the same cycle indefinitely until I hit CTRL+C.

 

Between the two builds, some additional peripherals were added to the EDK design, and the clocking was moved from a DCM internal to the EDK project to DCMs located in the external FPGA design.  I've verified that all the clocks are running at the correct relative speeds by comparing counters incrementing in the different clock domains, and the fact that Ethernet works at all seems to indicate that the 125MHz clock is correct.  Barring that, is it possible that some kind of weird bus contention is making my Ethernet MAC behave totally differently in an identical software environment (the exact same build of U-Boot is used for both FPGA builds)?  Has anyone else seen an issue like this, and what caused it?
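
A minimal sketch of that relative-rate check done offline on sampled counter values; the clock-domain names and counts below are made-up placeholders, not values from the actual design:

```python
# Minimal sketch: counters free-running in each clock domain are sampled over
# the same wall-clock window, and the counts should sit in the ratio of the
# nominal clock frequencies. Domain names and sampled values are placeholders.
NOMINAL_HZ = {"bus_clk": 100e6, "gtx_clk": 125e6}
SAMPLED    = {"bus_clk": 100_003_112, "gtx_clk": 125_004_871}

ref = "bus_clk"
for name, hz in NOMINAL_HZ.items():
    expected = hz / NOMINAL_HZ[ref]
    measured = SAMPLED[name] / SAMPLED[ref]
    error_ppm = abs(measured / expected - 1) * 1e6
    print(f"{name}: measured ratio {measured:.6f}, expected {expected:.6f}, "
          f"off by {error_ppm:.0f} ppm")
```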

EDIT: The problem has been isolated to somewhere within the LL_TEMAC core; see the fourth reply.  Also updated the thread title.
Message Edited by kerryhj on 03-02-2009 01:08 PM
1 Solution

Accepted Solutions
Observer kerryhj
8,366 Views
Registered: 07-02-2008

I finally fixed this by moving the generation of some of the derived clocks from the external design to inside the processor block.  I'm still not sure why this fixed the problem, but it does work now, so I'm not going to worry about it overmuch.

6 Replies
Explorer
7,164 Views
Registered: 08-14-2007

Re: What could cause the ll_temac to work fine in one build and very slowly/poorly in another?

Well, without knowing the exact specifics of any part of your design one can only speculate, but my first thought would be that something in the other part of the design is slowing it down. I know, obvious, huh?

 

If you're using the MPMC and the other portion of the design is also requesting memory accesses, that can lead to contention on the memory bus and 'slow' things down. I wouldn't expect that much of a slowdown, but it's altogether possible.

 

From a purely diagnostic standpoint, have you run Wireshark on your TFTP system to see what it's actually seeing with respect to the network traffic?
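
If it helps, here is a minimal sketch of the kind of post-processing you could do on such a capture, assuming it is saved from Wireshark as tftp.pcap and the scapy package is available; it counts distinct TFTP DATA blocks and flags any that were retransmitted:

```python
# Count TFTP DATA blocks in a Wireshark capture and flag retransmissions.
# Assumes a capture file named tftp.pcap and the scapy package.
from collections import Counter

from scapy.all import rdpcap, UDP, Raw

blocks = Counter()
for pkt in rdpcap("tftp.pcap"):
    if UDP not in pkt or Raw not in pkt:
        continue
    payload = bytes(pkt[Raw].load)
    # RFC 1350: a DATA packet starts with opcode 3, then a 16-bit block number.
    if len(payload) >= 4 and payload[:2] == b"\x00\x03":
        blocks[int.from_bytes(payload[2:4], "big")] += 1

print("distinct DATA blocks:", len(blocks))
print("retransmitted blocks:", sorted(b for b, n in blocks.items() if n > 1))
```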

Xilinx Employee
7,157 Views
Registered: 01-18-2008

Re: What could cause the ll_temac to work fine in one build and very slowly/poorly in another?

It is possible that the I/Os are placed at inefficient locations. I'm not sure if you are building this on a Spartan or Virtex. If it is Spartan, you should check the DCMs that are being used; on Virtex you could check the IDELAY locations. I'd try locking the Ethernet portion of the design to specific locations and see if that helps.

 

You should also file a webcase.

Observer kerryhj
7,140 Views
Registered: 07-02-2008

Re: What could cause the ll_temac to work fine in one build and very slowly/poorly in another?

The I/Os are all located at the same places on both designs.  I'll look into the IDELAY thing though, since that might be something I didn't transfer over when I integrated the EDK project into the rest of the FPGA design.

 

I am using the MPMC to access external memory in both cases, but there aren't any new MPMC-connected peripherals between the two designs, so I doubt that is the problem.

 

EDIT:  I just verified that all the IDELAYs are in the same places on both designs, so I don't think this is the problem either.

 

EDIT2: Further testing and network captures using Wireshark show that some of the Ethernet frames U-Boot thinks it sent are never actually transmitted out the Ethernet port.  In the working case, U-Boot debug output lists "Frame Sent" 28 times and Wireshark captures 28 frames.  In the broken case, U-Boot lists "Frame Sent" 37 times, but Wireshark only captures 29 frames.  The "slowness" is caused by the TFTP server not receiving the acknowledgement packet, waiting 5 seconds, and then sending the frame again.  With roughly a quarter of the frames getting lost, all those 5-second waits make the connection go a lot slower.
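
Back-of-the-envelope arithmetic on those numbers (purely illustrative, using the 5-second retry and the frame counts above) shows why the transfer stalls for tens of seconds:

```python
# Rough estimate of the added delay from lost frames; purely illustrative.
frames_uboot_sent = 37      # "Frame Sent" messages in the broken case
frames_on_wire    = 29      # frames actually seen by Wireshark
retry_timeout_s   = 5.0     # server resends after 5 s without an ACK

lost = frames_uboot_sent - frames_on_wire      # frames that never left the FPGA
extra_delay_s = lost * retry_timeout_s         # assuming one timeout per lost frame

print(f"lost frames: {lost}, added delay: ~{extra_delay_s:.0f} s")
# -> lost frames: 8, added delay: ~40 s, which roughly matches the ~40-second
#    stall before the transfer times out described in the original post.
```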


Message Edited by kerryhj on 02-26-2009 02:26 PM
Observer kerryhj
7,058 Views
Registered: 07-02-2008

Re: What could cause the ll_temac to work fine in one build and very slowly/poorly in another?

Further debugging on this:

 

I've used ChipScope on the interface between the MPMC and the LL_TEMAC and found that all the packets are present on that interface (I capture the same number of packets as U-Boot claims to have sent in the debug messages).  I have also compared the ChipScope captures of a packet that did make it out with one that didn't (via a diff), and the only differences between the two were the identification field in the IPv4 header (64 in the lost packet, 65 in the one that made it through) and, consequently, the header checksum.
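
For reference, here is a minimal sketch of that kind of comparison done offline in Python; the frame data, byte offsets, and helper names are illustrative assumptions rather than part of the actual ChipScope flow:

```python
# Diff two captured frames byte by byte and recompute the IPv4 header checksum
# to confirm the checksum delta is explained by the identification field alone.

def byte_diff(a: bytes, b: bytes):
    """Return the offsets at which two equal-length frames differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

def ipv4_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words with the checksum field zeroed."""
    header = header[:10] + b"\x00\x00" + header[12:]    # zero checksum bytes 10-11
    total = 0
    for i in range(0, len(header), 2):
        total += int.from_bytes(header[i:i + 2], "big")
        total = (total & 0xFFFF) + (total >> 16)         # fold the carry back in
    return (~total) & 0xFFFF

# Hypothetical usage: for a plain untagged Ethernet frame, the IPv4 ID sits at
# frame offsets 18-19 and the header checksum at offsets 24-25, so a diff that
# reports only those offsets matches what the captures showed.
# lost_frame = bytes.fromhex("...")   # frame that never reached the wire
# sent_frame = bytes.fromhex("...")   # frame that did
# print(byte_diff(lost_frame, sent_frame))
# print(hex(ipv4_checksum(sent_frame[14:34])))   # 20-byte header after Ethernet header
```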

 

The packets must therefore be getting lost somewhere inside the LL_TEMAC core, or in the interface to the PHY (which seems unlikely, given that about 3/4 of the packets are sent successfully).

 

I'm going to try and dig deeper with chipscope and see what comes up.

Xilinx Employee
7,021 Views
Registered: 08-01-2007

Re: What could cause the ll_temac to work fine in one build and very slowly/poorly in another?

Kerry,

 

What is your target device?

 

One good data point is to look at the placement of the LL_TEMAC in the failing design in FPGA Editor or PlanAhead and compare it against the working one. It is possible that, owing to the addition of the new peripherals and the timing constraints associated with them, the TEMAC logic was spread across the FPGA, resulting in extraneous delays.

 
