06-02-2008 08:30 AM
We have switched to ISE 10.1 as our design is now using a Virtex 5.
The debugging tools (either under SDK, or with GDB run from the command line) have turned awfully slow.
Even downloading the .elf file is slow: under XMD it is just a snap, with the SDK it takes almost 30 seconds.
Stepping is just as slow.
Is there something wrong in my config I should be aware of?
05-14-2009 08:35 AM
I've read this thread with interest and hope to find a solution for the very, very slow GDB GUI.
To be sure we are talking about the same thing: I mean the debugger running under Eclipse, started from 'Xilinx ISE Design Suite 10.1->EDK->Xilinx Platform Studio SDK' under Windows XP.
I've debugged a project with the Microblaze and also the PowerPC of a Virtex4.
Debugging the PowerPC is extremely slow (microblaze is a bit better, but still unsatisfactory). The most annoying thing is not the download time, which is also awful; stepping or running to breakpoints is the worst. It takes tens of seconds, which is not acceptable in my opinion. Try Visual Studio from MS, and you have fun :-)).
The hints given so far, with entries in command windows or home directories or similar, don't help when they are for other operating systems or it is not clear how to use them.
And as far as I see they all refer to download issues.
And closing the variable windows isn't really a good choice. Why should I step through a program when I can't see a result in a variable, memory or expression window?
That makes no sense.
I hope that someone from Xilinx will answer and give 'real' hints or workarounds, or at least tell us that they've screwed up the debugger.
06-07-2009 03:49 AM
I've made some optimizations to the edk 11.1 mb-gdb. Especially the locals and disassembly views are updated much faster.
Also remember to update your .gdbinit file (in c:\Documents and Settings\yourname):
set download-write-size 4096
set remote memory-write-packet-size 16384
set remote memory-write-packet-size fixed
set remote memory-read-packet-size 4096
set remote memory-read-packet-size fixed
There is a tactical patch for XMD; see http://www.xilinx.com/support/answers/32621.htm
06-25-2009 05:17 AM
I only visit this forum occasionally, and by the time I did, RapidShare had removed your zip file. Could you be so kind as to upload it again? I will look more often from now on.
06-25-2009 09:16 AM
@rehnmaak - Could you provide details on exactly what was changed? I haven't tried it as yet. If the optimizations are general enough, then we can try to get that into the shipped versions of gdb.
The download-write-size, etc options are all on by default in the 11.x toolchain, so the .gdbinit is not required anymore.
06-25-2009 09:50 AM
For downloading, the maximum block size was limited to around 400 bytes per chunk. I changed this to 16 kbytes. It does not actually improve download speed very much, but it was an unnecessary limitation.
The other annoying thing is that the disassembly view took forever because only 4 bytes were transferred per chunk. I implemented a read-ahead buffer that reads 1024 bytes per chunk. This made disassembly of large objects around 100 times faster.
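To illustrate the read-ahead idea, here is a minimal Python sketch. It is not the actual patch (that lives in gdb's C sources); the class name, `read_target` and the fake memory are made up for the example, and only the 1024-byte block size comes from the description above.

```python
class ReadAheadReader:
    """Serve small reads from one larger prefetched block (illustrative only)."""

    def __init__(self, read_target, block_size=1024):
        self.read_target = read_target  # hypothetical callable: (addr, length) -> bytes
        self.block_size = block_size
        self.base = None                # start address of the buffered block
        self.buf = b""
        self.transfers = 0              # number of round trips to the target

    def read(self, addr, length):
        end = addr + length
        inside = (
            self.base is not None
            and addr >= self.base
            and end <= self.base + len(self.buf)
        )
        if not inside:
            # Miss: fetch a whole block starting at the requested address.
            self.buf = self.read_target(addr, max(self.block_size, length))
            self.base = addr
            self.transfers += 1
        off = addr - self.base
        return self.buf[off:off + length]


# Fake target memory standing in for the real debug link.
memory = bytes(range(256)) * 16  # 4096 bytes


def fake_read_target(addr, length):
    return memory[addr:addr + length]


reader = ReadAheadReader(fake_read_target)
# Disassembly-style access pattern: sequential 4-byte reads over 2 KB.
words = [reader.read(addr, 4) for addr in range(0, 2048, 4)]
print(len(words), "reads served with", reader.transfers, "target transfers")
```

The point is only that sequential 4-byte requests collapse into a handful of larger transfers, so the per-transfer latency is paid far less often.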
Viewing locals has always been very slow with microblaze and gdb. It is so slow, especially when large structures are on the stack, that the locals view is almost impossible to use. Many people instead add variables of interest to the watch window. Because locals are not always read sequentially, I implemented a cache (256 entries of 256 bytes each). Every time gdb gets a signal (such as a stop or breakpoint), the cache is invalidated. This improvement makes single-line stepping much faster.
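A rough Python sketch of such a cache, for illustration only. The 256 x 256 geometry and the invalidate-on-stop rule are from the description above; the class name, `read_target`, the dictionary layout and the simple oldest-first eviction are my assumptions, not how the patch is actually written.

```python
LINE_SIZE = 256   # bytes per cache line
MAX_LINES = 256   # number of cache lines


class TargetReadCache:
    """Line-based read cache for target memory (illustrative only)."""

    def __init__(self, read_target):
        self.read_target = read_target  # hypothetical callable: (addr, length) -> bytes
        self.lines = {}                 # line base address -> bytes
        self.misses = 0

    def invalidate(self):
        # Call whenever the target has run (stop, breakpoint, step):
        # memory may have changed, so nothing cached can be trusted.
        self.lines.clear()

    def _line(self, base):
        if base not in self.lines:
            if len(self.lines) >= MAX_LINES:
                # Assumed policy: evict the oldest inserted line.
                self.lines.pop(next(iter(self.lines)))
            self.lines[base] = self.read_target(base, LINE_SIZE)
            self.misses += 1
        return self.lines[base]

    def read(self, addr, length):
        out = bytearray()
        while length > 0:
            base = addr - (addr % LINE_SIZE)
            off = addr - base
            chunk = self._line(base)[off:off + length]
            if not chunk:
                break  # target returned a short line; stop instead of looping
            out += chunk
            addr += len(chunk)
            length -= len(chunk)
        return bytes(out)


# Fake target memory standing in for the real debug link.
memory = bytes(i % 251 for i in range(8192))


def fake_read_target(addr, length):
    return memory[addr:addr + length]


cache = TargetReadCache(fake_read_target)
a = cache.read(0x100, 8)    # miss: loads line 0x100
b = cache.read(0x1F8, 16)   # spans lines 0x100 (hit) and 0x200 (miss)
c = cache.read(0x104, 4)    # hit
misses_before_stop = cache.misses
cache.invalidate()          # target stopped at a breakpoint
d = cache.read(0x100, 8)    # miss again after invalidation
```

Non-sequential reads of nearby locals mostly land in lines that are already loaded, which is where the speed-up during stepping would come from.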
06-25-2009 11:04 AM
Any possibility you would be willing to upload the mb-gdb-6.5.0-lightning-speed-090607.zip file again? The link to download it is not working now. Thanks very much.
06-25-2009 11:08 AM
06-25-2009 12:02 PM
06-25-2009 10:54 PM
I have the file now. Thanks very much.
I tried debugging this evening with the 11.2 tools and it has the same speed problems. Does anyone from Xilinx want to comment on plans to fix this speed problem for the masses?
I will try rehnmaak's stuff soon and let you know how it goes. By the way, I am running the tools on a 4.3 GHz system with 64-bit SUSE Enterprise Desktop and 8 GB of RAM, so it can't have anything to do with my PC.
On the bright side, several problems that we were having with our PPC440 design in 10.1.03 went away when we upgraded to the new IP in version 11.2. So there are definitely some improvements to the IP that have helped us. :-)
06-26-2009 01:43 PM
Thanks for pointing out that it is only for microblaze. Unfortunately I am doing a dual PPC440 design in a V5 FXT200 part. So I guess I can't use your patch. Any suggestions on what I can do for a PPC440 design in the 11.2 tools? Thanks.
06-26-2009 01:47 PM
Any possibility that Xilinx will follow rehnmaak's example for microblaze and come out with a similar patch for PPC440 in the 11.2 tools? This speed thing is a real bummer for your customers. It also slows down how long it takes us to get our products to market, and hence order lots of big expensive chips from you guys :-)
06-26-2009 04:17 PM
The only change in the 11.1 toolchain is that the download speed issue is resolved. The speed while stepping through programs is still slow, and I think it is quite noticeable for PPC.
I do apologize for the delay in the fix. We will try to get this fixed asap - either in 11.3, or 12.1 in the worst case. Ideally we should be able to get it fixed in an AR or something before 11.3. I will update this thread when we have a resolution.
06-26-2009 07:02 PM
That is great news that it is being worked on. I will look forward to a fix. Hopefully it will be fixed for 64 bit Linux as well as Windows.
I was thinking about throwing together a quick mb design using Base System Builder to try rehnmaak's patch in 11.2 and give some feedback. But I then realized the patch is for Windows, so I don't think I am in a position to give feedback. Sorry about that. I was using Windows in the past but have moved to Linux because the Xilinx tool support for 64 bits seems to be much better there. When we place and route our FX200T design, a 32-bit OS won't cut it because of the memory requirements.
vsiva, hopefully any AR fixes would include support for 64 bit Linux? Thanks.
06-29-2009 03:54 PM
I have now applied the patches to the PowerPC gdb as well as to the Linux versions of gdb. I have not tested the Linux versions much beyond checking that they do communicate with the target. The PowerPC gdb is also slower than the microblaze one at single-line stepping, because on PowerPC it has to examine the stack frame before it can set breakpoints when "stepping over calls".
I did some investigation into how gdb does single-line stepping. It seems to step one machine instruction at a time until it reaches a call instruction, where it puts a breakpoint just after the call. I don't know why it doesn't just put a breakpoint at the next line. I'll have a look at the 6.8 version and see if it is any different.
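A toy Python model of that observed behaviour, just to show why a single source-line step costs so many round trips. This is not gdb code; the `Insn` type, the addresses and the six-instruction line are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Insn:
    addr: int             # instruction address
    line: int             # source line the instruction belongs to
    is_call: bool = False


def step_over_line(program, index):
    """Model one 'step over' of the source line at program[index].

    Plain instructions are single-stepped. A call gets a temporary
    breakpoint just after it and the target is resumed until that is hit.
    Either way the target stops once per instruction of the line.
    """
    line = program[index].line
    single_steps = 0
    temp_breakpoints = 0
    while index < len(program) and program[index].line == line:
        if program[index].is_call:
            temp_breakpoints += 1
        else:
            single_steps += 1
        index += 1
    return index, single_steps, temp_breakpoints


# Hypothetical source line 10 compiled to six instructions, one of them a call.
program = [
    Insn(0x100, 10), Insn(0x104, 10), Insn(0x108, 10, is_call=True),
    Insn(0x10C, 10), Insn(0x110, 10), Insn(0x114, 10),
    Insn(0x118, 11),
]

next_index, single_steps, temp_breakpoints = step_over_line(program, 0)
stops = single_steps + temp_breakpoints
print("target stops for one line step:", stops)
```

A single breakpoint at the next line would be one stop instead of one per instruction. Presumably gdb steps instruction by instruction because a branch inside the line need not land on the next line, but that is my guess, not something I have confirmed in the sources.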
I also had trouble with tk/tcl when running under linux. I had to comment a few lines in:
listbox.tcl lines 182-184
text.tcl lines 461-463
There are no instructions for backing up and replacing the files on Linux, so you are on your own. As far as I understand, you just put the files in the /bin folder....
...and don't forget to leave feedback when you try them out.
06-30-2009 02:31 PM
I tried out your PPC linux patch for gdb and there is a definite improvement. Still not fast enough sometimes, but it is much more useable than before. The delay is on the order of 5-7 seconds rather than the 10-15 seconds it used to be. It also seems to depend on the complexity of the statement being executed. Thanks very much for making the effort!
I hope Xilinx looks at this patch and integrates it into their next release....anyone from Xilinx care to respond?
BTW I am using gnat (http://libre.adacore.com/libre/tools/gps/) on Linux as my debugger frontend. Way better than the Xilinx SDK in my opinion.
06-30-2009 02:42 PM
Can you provide some kind of test code that I can try? I'm getting response times on the order of 0.5 seconds, although that is on NT; it shouldn't be any different on Linux.
Do you by chance have the ability to try the NT version?
06-30-2009 02:56 PM
0.5 second responses?! That would be great....
I am debugging code in EDK/sw/ThirdParty/sw_services/lwip130_v1_00_b/src/contrib/ports/xilinx/netif/xlltemacif_hw.c
Sorry, I don't think I will be able to try the NT version right now.
One thing I noted is that in the gdb console I get a message like:
Software Breakpoint 14 Hit, Processor Stopped at 0x20004f1c
which I believe indicates that gdb is about to stop and give me a prompt back. But it always takes a few seconds before I do get the prompt. Not sure why it takes that long. I am running on a 3 GHz dual core machine with 6GB RAM, so I don't think it is a resource problem.
06-30-2009 03:15 PM
What happens after the program is stopped by a breakpoint is that gdb reads all the locals and updates the watch window. Because gdb is very inefficient at untangling the locals, I wrote a cache with a 256-byte cache line.
Anyway, you could type "ver" (without the quotes) in xmd and it will show you all the communication that goes on between gdb and the target. If you could copy the output for one "step" that is especially slow (make sure you strip out everything else) and post it, I will have a look at it. Or you could send a personal message so we don't clutter the forum.
06-30-2009 03:43 PM
I don't have a watch window open and I am not displaying any variables, so that should not account for the delay.
I am attaching the "ver" output from xmd from executing 1 line of code:
options |= XTE_MULTICAST_OPTION;
It took 7 seconds to get a prompt back.
I have another log that took 17 seconds and will send it if you are interested.
I also ran strace on xmd and gdb while they were waiting and found that xmd was simply taking a long time to get any data from the card. Once it got the data it would promptly send it to gdb. The data is sent in chunks of 1500 bytes. So this may be an issue with communication across the USB cable?
06-30-2009 03:53 PM
That small amount of data should take a lot less than 7 seconds.
I saw that you had a retry when reading registers. Is that where it stalls? Do you get that every time?
06-30-2009 04:13 PM
There are occasional retries, but they go by very fast, i.e. the retry doesn't seem to be related to any timeout, but is instead a response to whatever data was received:
Received < '-' Retry Ack, Sending Data Again...
The actual delays happen in this state:
XMD(440): Read registers
After the delay there is a burst of " Sending > $0001...." (with or without a retry) ending in "Received < m2....".
So once a data transfer starts it seems to happen pretty fast, even with a retry. But getting it to start is where the delay seems to come in.
06-30-2009 04:21 PM
OK, it seems that, as you said, the problem is getting the data out of the chip. I didn't have these problems because I was running xmd on my NT machine and gdb on Linux in a virtual box. I'll try installing EDK on my Linux box and see what happens.
But I think Xilinx has got to help with this one....
08-11-2009 12:12 PM
08-25-2009 01:26 PM
Not sure if anyone is following this thread anymore, but I found a fix for this problem: use the old Jungo windrvr instead of libusb.
libusb (which 11.2 uses by default, and was available in 10.1) is much slower than the Jungo windrvr for accessing the USB cable. Some slowdown is to be expected since the code runs in userspace, but it doesn't explain the numbers I saw while programming an 8MB bitstream to flash:
Windows: 9 minutes
RHEL (windrvr): 8 minutes
RHEL (libusb): 20 minutes
Ubuntu (libusb): 50 minutes!
The bundled version of windrvr does not compile on Ubuntu so I had to download the latest version from jungo.com and build it. With it the programming time on Ubuntu went down to 8 minutes.
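To put those times in perspective, here is a quick Python calculation of the effective throughput for each case. It assumes "8MB" means 8 x 1024 x 1024 bytes and that the whole time is spent transferring data, which ignores erase and verify phases, so treat the results as rough.

```python
FLASH_BYTES = 8 * 1024 * 1024  # assumption: "8MB" taken as 8 MiB

# Programming times in minutes, as reported above.
minutes = {
    "Windows": 9,
    "RHEL (windrvr)": 8,
    "RHEL (libusb)": 20,
    "Ubuntu (libusb)": 50,
    "Ubuntu (windrvr, rebuilt)": 8,
}


def kib_per_second(total_bytes, mins):
    """Effective rate in KiB/s for total_bytes moved in mins minutes."""
    return total_bytes / (mins * 60) / 1024


rates = {name: kib_per_second(FLASH_BYTES, m) for name, m in minutes.items()}

for name, rate in rates.items():
    print(f"{name:26s} {rate:5.1f} KiB/s")
```

Even the fast cases work out to well under 20 KiB/s, and the slow Ubuntu case is more than six times slower again, so the driver really is the bottleneck.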
Interestingly this also helped in the debugger slowdown issue. With windrvr I now get instant responses from the debugger both on RHEL and Ubuntu.