07-04-2019 03:55 PM
I'm kinda new to FPGA design. Previously, the projects I worked on mostly just required the design to be functionally correct (at a fairly reasonable frequency). But now I'm facing the challenge of making the design work at as high a frequency as possible.
The first thing I could think of is changing the strategy under Project Manager -> Settings -> Synthesis -> Strategy to Flow_PerfOptimized_high, and Implementation -> Strategy to Performance_ExtraTimingOpt. Is that the right way of thinking?
Then I guess I probably need some more aggressive constraints for the design. It might sound funny, but for the project I'm working on, so far I haven't given any constraints except for the clock definition. I guess I can add set_input_delay and set_output_delay constraints; other than that, I'm not sure if there is anything more I can change.
Could you generously share some experience on performance optimization? Thank you so much!
07-04-2019 04:03 PM
The most effective way of getting a faster design is to change the source code: minimise the logic between flops to no more than 1-2 LUTs. This way you can get good results while keeping the tool effort low.
But why the effort to get a fast Fmax? Usually you're constrained in the design by existing clocks etc. Why the new focus on speed? Usually you have a target frequency to hit...
07-04-2019 05:36 PM - edited 07-04-2019 05:44 PM
… facing the challenge of making the design work at as high a frequency as possible.
So, the first question should be, what is the maximum operating frequency, Fmax, of your current design?
If your project doesn’t use FPGA IO, then you find the current Fmax like this: keep raising the output clock frequency of your MMCM/PLL, running synthesis/implementation each time, until the design fails timing analysis. This method is time-consuming, but it is a method that no one can find fault with. I know of no shortcuts to this method. If you are using the Clocking Wizard to set up the MMCM/PLL, then the clock constraints are automatically updated for you. Some other constraints (e.g. set_max_delay -datapath_only) may require manual updating with each new clock frequency that you try. Trying different synthesis/implementation strategies will probably help only a little.
After you find the current Fmax, you can probably reach a higher Fmax by doing the difficult stuff (pipelining, parallel processing, etc.) and the other things that Richard mentions. Again, trying different synthesis/implementation strategies will probably help only a little.
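A minimal sketch of the sweep described above, assuming each synthesis/implementation run can be wrapped in a function that reports whether timing was met. The names `sweep_fmax`, `meets_timing` and `fake_build` are hypothetical, and `fake_build` is only a stand-in for a real build that would take a long time per call.

```python
from typing import Callable, Optional


def sweep_fmax(meets_timing: Callable[[float], bool],
               start_mhz: float, step_mhz: float,
               limit_mhz: float) -> Optional[float]:
    """Raise the clock frequency step by step until timing fails.

    Returns the highest swept frequency that met timing, or None if
    even the starting frequency failed.
    """
    best = None
    freq = start_mhz
    while freq <= limit_mhz:
        if not meets_timing(freq):
            break  # first failing frequency ends the sweep
        best = freq
        freq += step_mhz
    return best


# Hypothetical stand-in for a full synthesis/implementation run.
# A real version would rebuild the design at freq_mhz and check the
# timing report; here we simply pretend timing closes up to 300 MHz.
def fake_build(freq_mhz: float) -> bool:
    return freq_mhz <= 300.0


print(sweep_fmax(fake_build, 200.0, 25.0, 400.0))  # 300.0
```

The result is only as fine-grained as the step size, so a second sweep with a smaller step around the last passing frequency may be worth the extra runs.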
If your project does use FPGA IO, then focus on the IO first to get it running as fast as possible – since it is likely to be the limiting factor for Fmax. This can be challenging! If this is what you are trying to do, then we can talk more about it.
<Here> is a similar thread for you to read.
07-06-2019 08:22 PM
07-06-2019 08:36 PM
Thanks a lot for your reply. I actually don't need to change the current design I'm working on. The goal is to research what effect timing errors would have on a target design. So my experiment would be to intentionally raise the frequency to allow some timing errors to happen. But before that, I need to make sure the design is synthesized and placed-and-routed well enough to meet industry standards, which is why I raised this question. As far as I have tried, the highest frequency that synthesis and P&R can achieve without timing violations is around 300MHz, while even at 350MHz my design still seems to produce correct output. (My custom design itself doesn't have IO; it uses an AXI interconnect to talk to the Zynq SoC at a lower frequency of 100MHz.)
So it seems like there is quite a lot of margin left even when synthesis and P&R are done aggressively. I took a look at the slack distribution, and there are actually not too many paths that are close to the critical path. I understand this might be because of the design itself, but I'm just wondering if there are any other configs I can change without modifying the design. I think set_max_delay might be a good direction to try? I took a look at it, and it seems like it can override setup and hold delays. But I don't know if this is a standard move for people in industry.
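To put the 300MHz and 350MHz figures in terms of clock period, here is a small sketch of the arithmetic. It also includes the commonly used estimate Fmax = 1 / (T - WNS), where T is the constrained period and WNS is the worst negative slack from the timing report. The function names are hypothetical, and the WNS-based number is only an estimate, since the tools tend to stop optimizing once the constraint is met.

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def fmax_estimate_mhz(target_period_ns: float, wns_ns: float) -> float:
    """Estimate Fmax in MHz as 1 / (T - WNS).

    wns_ns is positive when timing is met with margin and negative
    when the design fails timing.
    """
    return 1000.0 / (target_period_ns - wns_ns)


t_closed = period_ns(300.0)  # about 3.333 ns, where timing closes
t_works = period_ns(350.0)   # about 2.857 ns, where the board still works
print(round(t_closed - t_works, 3))  # 0.476
```

So the gap between what timing analysis signs off and what appears to work on the board is a little under half a nanosecond per cycle.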
07-07-2019 12:59 AM
07-07-2019 03:16 AM
… just wondering if there are any other configs I can change without modifying the design. … override setup and hold delays.
Don’t go there! Trying to micro-manage implementation is not “standard” procedure and is bound to get you in trouble.
Many of the ideas given to you by dror_m come under the category of synthesis/implementation strategies, which will help but not (I suspect) a lot for increasing Fmax. However, dror_m does bring up the importance of clock-domain crossings (CDCs). CDCs can be synchronous or asynchronous and can have associated timing constraints that depend on the frequency, Fmax. So, be sure to update these constraints as you change/search for Fmax.
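As a rough illustration of keeping a frequency-dependent constraint in step with the clock, the sketch below regenerates a set_max_delay -datapath_only line for each frequency tried. The clock names `clk_axi` and `clk_core` are hypothetical, and using one destination-clock period as the delay is an assumption made for this example, not something specified in this thread; the right value depends on how the crossing is built.

```python
def max_delay_constraint(src_clk: str, dst_clk: str,
                         dst_freq_mhz: float) -> str:
    """Build an XDC set_max_delay line for a clock-domain crossing.

    Assumption: the allowed datapath delay is one period of the
    destination clock, expressed in nanoseconds.
    """
    delay_ns = 1000.0 / dst_freq_mhz
    return (f"set_max_delay -datapath_only "
            f"-from [get_clocks {src_clk}] "
            f"-to [get_clocks {dst_clk}] {delay_ns:.3f}")


# Regenerate the constraint for each frequency in the search.
for freq in (300.0, 350.0):
    print(max_delay_constraint("clk_axi", "clk_core", freq))
```

Generating the line from the frequency, rather than editing it by hand, makes it harder to leave a stale value behind when the clock changes.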
Also, if you are using Xilinx IP, then doing the following should help achieve a higher Fmax. First, when using wizards to set up the IP, you may be given algorithm setting options that say something like “minimum area” or “low power” or “performance” – you should choose “performance”. Also, do not use OOC synthesis for the IP, since OOC synthesis will be unaware of the Fmax that you are using. Vivado synthesis is what Xilinx calls "timing aware". So, doing the right thing in synthesis will help achieve a higher Fmax - but not a lot (I suspect). Vivado implementation is really where "all things timing analysis" are done.
Finally, be aware of jitter for the base clock entering the FPGA, which you probably specified (or left at default value) when you used the Clocking Wizard to create clocks (Fmax) for your project. You should use the same jitter value throughout your research. However, if you are going to fiddle with jitter, now (at start of your work) is the time.