11-12-2009 06:13 PM
I found that the default max fanout value for Virtex-5 is very large (100,000, as mentioned in the XST documentation), which is quite different from other FPGA families.
In my Virtex-5 design, I tried setting the fanout limit to a small number (say, 32) and to a large number (10,000), and the post-P&R timing results are quite different. With the large fanout limit, performance is about 5~10% better than with the smaller one, which is contrary to typical ASIC behaviour.
Is there a reason behind that? Thanks, all.
11-12-2009 11:22 PM
Did you actually create a net that feeds >1000 inputs and perform your test with it?
That can happen for clock-enable or synchronous-reset nets in large designs, but it is unlikely elsewhere.
For nets with <100 inputs, the additional hardware created when you set the fanout limit to 32 may distort your results.
And I think the same would happen in ASICs too. There's always a tradeoff between the load of a high-fanout line and
the additional delays that come from the extra hardware needed to keep the fanout low.
Please give some details about your test approach and results.
11-13-2009 09:14 AM
Generally speaking, from Virtex-II onward the interconnect is fully buffered, so loading is largely irrelevant (it does not affect performance as strongly as in previous families).
Unless you are fighting a problem of a few tens of picoseconds, I would not worry about loading at all.
Xilinx San Jose
11-15-2009 07:45 PM
I think you've misunderstood my question. What I mean is using one logic gate (or flop) to drive N loads. If I set the fanout limit to 32 in XST and N > 32, XST will add buffer(s) between the gates.
I did assume there are buffers in the switch boxes, so fanout itself should no longer be a problem on Virtex-5. But that can't explain my design's performance improvement. What I actually suspect is this: there are too many wires in my design, and if I "manually" add buffers to a high-fanout wire, it divides that wire into smaller segments. The total amount of wiring then increases and the design becomes more congested. Is this a possible reason? (I could only prove this by checking the "wire utilization ratio", which unfortunately is not available from the tools.)
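For readers trying the same experiment: the fanout limit being discussed can be set in XST either globally (the `-max_fanout` command-line option) or per signal with the `MAX_FANOUT` synthesis attribute. A minimal VHDL sketch follows; the signal name `wide_enable` is just an illustrative placeholder, not from the original design.

```vhdl
-- Sketch: limiting the fanout of one high-fanout net in XST.
-- "wide_enable" is a hypothetical clock-enable signal driving many loads.
architecture rtl of example is
  signal wide_enable : std_logic;

  -- XST-specific synthesis attribute; tells XST to replicate the driver
  -- (or insert buffers) so no copy of this net drives more than 32 loads.
  attribute max_fanout : string;
  attribute max_fanout of wide_enable : signal is "32";
begin
  -- ... logic driven by wide_enable ...
end architecture rtl;
```

The global equivalent is `-max_fanout 32` in the XST script or the Synthesis Options in Project Navigator. As the replies below note, on a fully buffered FPGA interconnect this mostly inserts extra LUT/register hops rather than helping, so the default large limit is usually the right choice.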
11-15-2009 08:07 PM
Modern FPGA architectures are "fully buffered", as Austin pointed out earlier in this thread. What this means is that at every connection there is a buffer driving the next segment. With these buffers present, only the individual segments behave like an RC delay, instead of the entire net acting as one lumped RC as you are familiar with from your previous ASIC experience.
When you reduce the fanout limit in an FPGA, you are inserting LUTs into the net instead. Routing through a LUT adds component delay and reduces performance. On top of that component delay, there is no guarantee that synthesis made the right choice about where to split the original high-fanout net, and the placement/routing of the connections may no longer be optimal.
Leave the high fanout nets alone and let the tools and architecture do the work they were designed to do.