Explorer
Explorer
17,658 Views
Registered: ‎12-21-2009

map multi-threading

Hi all,

 

This message is directed at the ISE Design Suite developers. I'm doing a lot of on-chip debugging on an xc5vlx330t kit. As you know, PAR is multi-threaded with up to 4 processors, which is very efficient: with device utilization up to 90%, PAR takes about 20 minutes on an AMD Phenom X4 processor. The real problem is the MAP phase, which takes about 90 minutes even with fairly relaxed timing constraints. MAP is multi-threaded with up to 2 processors; I hope that in the next ISE release MAP will be multi-threaded with up to 4 or 6 processors. The developers could try to match the processors available on the market: the AMD Phenom II X6 has been released with a larger level 2 cache and 6 cores. I hope that ISE can at least make full use of the best desktop machines on the market, not just servers. The runtime is already long enough because of the single-threaded XST phase.

 

Another point: please try to parallelize TRCE and BitGen if possible.

 

 

Regards,

Amr

12 Replies
Scholar markcurry
Scholar
17,641 Views
Registered: ‎09-16-2009

Re: map multi-threading

I wonder: have you done any comparisons of implementation times with and without the multi-threading switch for MAP and PAR? I did, and didn't see ANY appreciable difference in implementation time. What differences there were, were in the noise. We ended up leaving the multi-threading switches off, saving the other processor for "other things".

 

Just wondering if you or others have benchmarked it.

 

Darned Amdahl's law and all that...
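
For reference, Amdahl's law says that if only a fraction p of the runtime can be spread over n threads, the best possible overall speedup is 1 / ((1 - p) + p / n). A minimal Python sketch follows; the fractions in the loop are hypothetical illustrations, not measurements from any design in this thread.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal overall speedup when only `parallel_fraction` of the runtime
    scales across `workers` threads (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical parallel fractions, chosen only to show the trend.
for fraction in (0.1, 0.5, 0.9):
    speedup = amdahl_speedup(fraction, 2)
    print(f"parallel fraction {fraction:.0%}: 2 threads -> {speedup:.2f}x")
```

With 2 threads the ceiling is 2x, and it is only approached when nearly all of the runtime sits in the threaded phase. Real runs also pay threading overhead, which this formula ignores.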

 

--Mark

 

Xilinx Employee
Xilinx Employee
17,635 Views
Registered: ‎07-01-2008

Re: map multi-threading

Currently the only phases that are multi-threaded are the global placement phase of MAP and the routing phase of PAR. You shouldn't expect runtime differences anywhere else.

0 Kudos
Scholar markcurry
Scholar
17,632 Views
Registered: ‎09-16-2009

Re: map multi-threading

 

Yes, I understand that -mt is only used in PAR and MAP. I only benchmarked the "MAP" and "PAR" phases.

And like I said, I saw no significant difference in run times with and without multi-threading. The benchmarks were done on ISE 12.2 with Virtex-6, Virtex-5, and Spartan-6 designs.

 

--Mark

 

Xilinx Employee
Xilinx Employee
17,629 Views
Registered: ‎07-01-2008

Re: map multi-threading

The global placement phase (10.8) may or may not be a major portion of the overall MAP run time. For an easy design you likely won't see much difference in total run time. For a hard design you should see a difference. The same goes for PAR. Even though all of the routing phases are multi-threaded, the easier the design is, the smaller the routing time is compared to the overhead.

 

Phase 10.8  Global Placement
......................................................................................................................................................................................
.......................................................................................................................................................................................
....................................................................................................................................................................
..............................................
Phase 10.8  Global Placement (Checksum:891e2d45) REAL time: 7 hrs 48 mins 6 secs
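
To judge how much of a MAP run is spent in global placement, the "REAL time" stamps on completed-phase lines like the one above can be pulled out of the report. The Python sketch below assumes the line format shown in this excerpt. The function names are made up for illustration, and it makes no assumption about whether the stamp is per-phase or cumulative since the start of MAP, so check that against your own report before subtracting values.

```python
import re

UNIT_SECONDS = {"hr": 3600, "min": 60, "sec": 1}
TOKEN_RE = re.compile(r"(\d+)\s*(hr|min|sec)s?")
# Matches lines such as:
# Phase 10.8  Global Placement (Checksum:891e2d45) REAL time: 7 hrs 48 mins 6 secs
PHASE_RE = re.compile(
    r"^Phase\s+(\S+)\s+(.*?)\s+\(Checksum:\w+\)\s+REAL time:\s*(.+)$"
)


def duration_seconds(text):
    """Convert text like '7 hrs 48 mins 6 secs' to seconds."""
    return sum(int(n) * UNIT_SECONDS[unit] for n, unit in TOKEN_RE.findall(text))


def phase_times(report_text):
    """Return (phase id, phase name, REAL-time stamp in seconds) per phase line."""
    results = []
    for line in report_text.splitlines():
        match = PHASE_RE.match(line.strip())
        if match:
            phase_id, name, stamp = match.groups()
            results.append((phase_id, name, duration_seconds(stamp)))
    return results


sample = (
    "Phase 10.8  Global Placement (Checksum:891e2d45) "
    "REAL time: 7 hrs 48 mins 6 secs"
)
print(phase_times(sample))
```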

0 Kudos
Explorer
Explorer
17,622 Views
Registered: ‎12-21-2009

Re: map multi-threading

Mark, I did not run a direct benchmark, but I did something similar. I once had a timing closure problem and had to run SmartXplorer, which is NOT multi-threaded. Yes, you can modify the MAP and PAR options via the -mo and -po switches, but this directs SmartXplorer to use only one strategy in all iterations, changing just the placer cost table. Anyway, the difference in runtime between the single-threaded SmartXplorer and running the flow from my script with the -mt switch was significant. I am talking about a very large design: when a large FPGA like the xc5vlx330t is utilized to about 60% to 90% or more, it really is a large design to route. In this case the multi-threaded router is very effective and efficient; the problem lies in the MAP runtime, which is very, very long! At the very least, the global placement phase should be multi-threaded with up to 4 processors or, as I said, enough to match recent processor releases.

I have read that Altera's Quartus II software uses up to 16 processors in the fitting phase (the fitter is the equivalent of PAR in ISE).

0 Kudos
Scholar samcossais
Scholar
17,468 Views
Registered: ‎12-07-2009

Re: map multi-threading

bwade >

Thank you very much for your clear answer. This is important information, because MAP is by far the longest phase of my FPGA implementation runtime.

 

I am using ISE 13.2 and my device is a Virtex-6 LX130T.

 

In my experience, MAP took longer with multi-threading enabled. I haven't benchmarked it recently; I did at the beginning of my project, and whereas PAR was much faster with multi-threading enabled, MAP was a bit slower (maybe 5 or 10%, I don't remember exactly). At the time of that benchmark, MAP took about an hour.

 

Right now my FPGA resource usage is very high (I do think it is well optimized, though) and MAP takes 6 or 7 hours (without using any time-consuming options, as timing closure is always OK).

 

I'm going to benchmark it again and see what happens in the global placement phase.

0 Kudos
Scholar samcossais
Scholar
17,464 Views
Registered: ‎12-07-2009

Re: map multi-threading

I am currently running the implementation of my device (see above) on 2 workstations.

 

- Core i3 550 dual core (4 logical cores with Hyper-Threading) with 8 GB of memory, running Win7 64-bit. Since my ISE project does not use more than 3 GB of memory for implementation, I used the 32-bit version of ISE, as it is slightly faster.

- Linux 64-bit, dual quad-core processors with 32 GB of memory (8 logical cores seen).

 

On both workstations the implementation is currently still in "Phase 10.8 Global Placement" (and has been for about an hour), and I'm checking CPU usage with Windows Task Manager and Linux System Monitor.

 

It appears that:

- on Windows, several cores are used, but the sum over all cores is never higher than 25% of the overall CPU capacity

  -> which means multi-threading is not working (the CPU has 4 logical cores)

- on Linux, only one thread is used -> which means multi-threading is not working
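
The reasoning behind these two observations is simple arithmetic: a total CPU load of 25% on a machine with 4 logical cores corresponds to about one fully busy core, however the scheduler spreads it across cores. A tiny Python sketch, where the helper name is made up for illustration:

```python
def busy_core_equivalents(total_cpu_percent, logical_cores):
    """Convert an overall CPU usage percentage into 'fully busy cores'."""
    return total_cpu_percent / 100.0 * logical_cores


# Windows workstation: at most 25% total usage on 4 logical cores.
print(busy_core_equivalents(25, 4))
# For comparison, two fully busy threads on the same machine would show about 50%.
print(busy_core_equivalents(50, 4))
```

A reading near 1.0 is consistent with a single active thread, although it cannot by itself rule out two threads that take turns waiting on each other.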

 

My FPGA is quite full, with 170 BRAMs, 158 DSPs, the PCIe hard block, MIG for DDR3, a lot of registers and LUTs, etc. (I used to have a high-speed streaming interface using 16 GTXs, but it is currently disabled.)

 

Is my project not compatible with mapper multi-threading, or is the option itself not actually running multi-threaded?

0 Kudos
Scholar samcossais
Scholar
17,463 Views
Registered: ‎12-07-2009

Re: map multi-threading

By the way, Smart Guide is disabled, and when mapper starts, I get this message :

------------------------------------------------------------------------

Started : "Map".
Running map...
Command Line: map -filter "iseconfig/filter.filter" -intstyle ise -p xc6vlx130t-ff1156-2 -w -logic_opt off -ol high -xe n -t 1 -xt 0 -register_duplication off -r 4 -global_opt off -mt 2 -detail -ir off -ignore_keep_hierarchy -pr b -lc off -power off -o rx_top_map.ncd rx_top.ngd rx_top.pcf
Using target part "6vlx130tff1156-2".
INFO:Map:284 - Map is running with the multi-threading option on. Map currently
   supports the use of up to 2 processors. Based on the the user options and
   machine load, Map will use 2 processors during this run.

------------------------------------------------------------------------
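
When comparing runs, it helps to confirm from the log which -mt value each run actually used. The Python sketch below pulls it from a "Command Line:" entry like the one above. It assumes the log format shown in this post, and the function name is made up for illustration.

```python
import shlex


def mt_setting(log_text, tool="map"):
    """Return the value that follows -mt on the tool's logged command line,
    or None if the tool's command line or the -mt switch is not found."""
    prefix = "Command Line:"
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        # posix=False keeps backslashes intact, which matters for Windows paths.
        args = shlex.split(line[len(prefix):], posix=False)
        if not args or args[0] != tool:
            continue
        if "-mt" in args:
            index = args.index("-mt")
            return args[index + 1] if index + 1 < len(args) else None
        return None
    return None


sample_log = (
    'Command Line: map -filter "iseconfig/filter.filter" -intstyle ise '
    "-p xc6vlx130t-ff1156-2 -w -logic_opt off -ol high -xe n -t 1 -xt 0 "
    "-register_duplication off -r 4 -global_opt off -mt 2 -detail -ir off "
    "-ignore_keep_hierarchy -pr b -lc off -power off "
    "-o rx_top_map.ncd rx_top.ngd rx_top.pcf"
)
print(mt_setting(sample_log))
```

The switch on the command line only shows what was requested; the INFO:Map:284 message is what reports how many processors MAP decided to use.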

0 Kudos
Visitor vijju
Visitor
15,830 Views
Registered: ‎08-03-2012

Re: map multi-threading

I just found out that MAP and PAR have a multi-threading option. I was hoping that enabling multi-threading would reduce my compile time, but after enabling it I see no change in my compile time. It's the same as before (40 mins for MAP and 20 mins for PAR). My design targets a Virtex-6 and I am running it on a processor with 4 cores.

0 Kudos
Scholar samcossais
Scholar
9,697 Views
Registered: ‎12-07-2009

Re: map multi-threading

vijju> I didn't check for ISE 14, but basically MAP takes slightly longer with multi-threading enabled while PAR is slightly shorter (multi-threading is effectively used during some phases of PAR). This is for the Windows version (64-bit). With certain settings it won't be enabled, though (you get a message if it's not).

 

So you should enable multi-threading in PAR.

0 Kudos
Newbie bcalin1984
Newbie
9,406 Views
Registered: ‎08-27-2013

Re: map multi-threading

I am using ISE Project Navigator P.68d, which comes with ISE Design Suite 14.6.
I have just run the MAP and PAR stages with the -mt option on and off, on the same design and the same PC. The machine I am using is a Toshiba P770 laptop with a Core i7-2670QM quad-core processor @ 2.2 GHz and 8 GB of RAM; the OS is Win8 Pro, 64-bit edition.
In -mt mode, MAP offered no option higher than 2 threads, and PAR no higher than 4 threads.
Keep in mind that my CPU has Turbo capability and may therefore go up to 3.1 GHz when running single-threaded.

When running multi-threaded it seems to stay at 2.2 GHz.
When running single-threaded it seems to stay between 2.8 and 3.0 GHz (checked with CPU-Z).

The single thread results are:
==============================

Peak Memory Usage:  1861 MB
Total REAL time to MAP completion:  23 mins 18 secs
Total CPU time to MAP completion:   22 mins 55 secs

Total REAL time to PAR completion: 5 mins 30 secs
Total CPU time to PAR completion: 5 mins 45 secs
Peak Memory Usage:  1707 MB


The -mt results are:
====================

Peak Memory Usage:  1878 MB
Total REAL time to MAP completion:  29 mins 29 secs
Total CPU time to MAP completion (all processors):   29 mins 56 secs

Total REAL time to PAR completion: 5 mins 46 secs
Total CPU time to PAR completion: 6 mins 2 secs
Peak Memory Usage:  1707 MB

The main difference comes from the placer: about 6 minutes longer when using -mt!
This difference might be explained by the difference in clock frequency (see Turbo above):
23.3 minutes / 29.5 minutes = 0.79, which is in the same range as 2.2 GHz / 3.1 GHz = 0.71.
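
The comparison can be redone from the figures reported in this post. The short Python sketch below uses only the times and clock speeds given above; the variable names are illustrative.

```python
def to_minutes(mins, secs):
    """Convert a 'N mins M secs' figure to fractional minutes."""
    return mins + secs / 60.0


# "Total REAL time to MAP completion" from the two runs above.
single_thread_map = to_minutes(23, 18)
multi_thread_map = to_minutes(29, 29)

runtime_ratio = single_thread_map / multi_thread_map
nominal_clock_ratio = 2.2 / 3.1                  # base clock / maximum Turbo clock
observed_clock_ratios = (2.2 / 3.0, 2.2 / 2.8)   # base clock / clocks seen in CPU-Z

print(f"runtime ratio (single / -mt): {runtime_ratio:.2f}")
print(f"clock ratio, nominal Turbo:   {nominal_clock_ratio:.2f}")
print(
    "clock ratio, observed range:  "
    f"{observed_clock_ratios[0]:.2f} to {observed_clock_ratios[1]:.2f}"
)
```

This only shows that the numbers are compatible with the Turbo explanation. It does not separate the clock effect from any threading overhead in MAP.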

PS: I explicitly checked the command-line arguments in the logs of both runs to confirm the -mt settings.


0 Kudos
Scholar samcossais
Scholar
9,402 Views
Registered: ‎12-07-2009

Re: map multi-threading

Well, no surprise whatsoever. Even without Turbo, MAP takes longer with -mt on, and PAR is only 10-15% shorter in that case. So with Turbo you're indeed likely to be slower.

I don't expect these figures to improve in ISE. I want to check how Vivado does, though.

0 Kudos