06-07-2017 05:12 AM
I have two RTL processors written in VHDL that are identical apart from their pipeline depth.
The first has a three-stage pipeline, divided as follows: IF => ID => EX-WB (instruction fetch, decode, execute, and writeback).
The second has a four-stage pipeline, divided as follows: IF => ID => EX => WB.
Synthesizing the four-stage design in Vivado takes about 3 minutes and produces a netlist that requires 4,585 LUTs (lookup tables).
Synthesizing the three-stage design in Vivado takes about 30 minutes and produces a netlist that requires 38,886 LUTs.
Can anyone explain why there is such a big difference, and why Vivado optimizes one design so well and handles the other in a completely different way?
06-07-2017 05:27 AM
Hard to tell without seeing the code and the report. My guess would be that in the three-stage design, you're violating one of the rules for inferring either block RAM or LUT RAM (e.g. block RAM requires a synchronous read, with the address registered). The result is that Vivado is turning your program memory or RAM into a giant table of registers, whereas in the four-stage design it's being mapped vastly more efficiently into block RAM or LUT RAM.
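To illustrate the kind of rule I mean, here is a minimal sketch of a memory written so that the read is synchronous. I haven't seen your code, so the entity name, ports, and generics (sync_ram, ADDR_WIDTH, DATA_WIDTH and so on) are all made up for the example and are not taken from your design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port memory. All names and sizes here are
-- illustrative assumptions, not taken from the original poster's design.
entity sync_ram is
  generic (
    ADDR_WIDTH : natural := 10;
    DATA_WIDTH : natural := 32
  );
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity sync_ram;

architecture rtl of sync_ram is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Synchronous write.
      if we = '1' then
        ram(to_integer(unsigned(addr))) <= din;
      end if;
      -- Synchronous read: the data is available one clock cycle after
      -- the address is presented.
      dout <= ram(to_integer(unsigned(addr)));
    end if;
  end process;

  -- For contrast, a combinational (asynchronous) read would be written
  -- outside the clocked process, replacing the registered read above:
  --   dout <= ram(to_integer(unsigned(addr)));
  -- That form cannot use block RAM; at best it maps to LUT RAM.
end architecture rtl;
```

If merging EX and WB into one stage means the memory output is now needed in the same cycle as the address, the read becomes combinational and block RAM is no longer an option. Resetting the whole array or writing it from several processes can go further and force the memory into individual flip-flops, which would fit a LUT count like yours. Again, that's a guess until the reports or code show what the memories were mapped to.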
06-07-2017 08:17 AM
@u4223374 Thank you for your quick reply.
I apologize, but as of now, I am not authorized to share the code.
I have attached the synthesis reports for both cores below, one for the three-stage design and one for the four-stage design.