Retiming in Vivado Synthesis

Xilinx Employee

RETIMING DESCRIPTION

Retiming is a sequential optimization technique to move registers across combinatorial logic to improve the design performance without affecting the input/output behavior of the circuit. The circuit shown in Figure 1 has a critical path with a 6-input adder. The path highlighted in red is the path that limits the performance of the whole circuit.

Figure 1: Example of a register-to-register path with 6-input adder logic

The performance of the circuit shown here can be improved by retiming the registers on the adder output into the combinatorial logic of the circuit.

The overall latency of the circuit is 4 clock cycles. Figure 2 shows one way to move the registers in order to shorten the critical path. Moving the output registers into the cone of logic is called backward retiming. When this is done, the critical path is reduced to a 2-input adder.

Figure 2: Example of a register-to-register path with a 2-input adder by applying backward retiming

One other thing to note about the above examples is that the number of registers has changed.

Figure 1 has 9 register buses; Figure 2 has 12. The count grows because when backward retiming moves a register from the output of a gate to its inputs, every input of that gate must get its own register.
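The claim that retiming leaves the input/output behavior unchanged can be checked with a small cycle-by-cycle model. The sketch below is only an illustration: it assumes one input register stage plus three output stages for the unretimed circuit (latency 4), and one plausible arrangement of 2-input adders for the retimed circuit. The function names and the exact register placement are assumptions, not taken from the figures. Under this assumed arrangement the register bus counts come out as 6 + 3 = 9 and 6 + 3 + 2 + 1 = 12, which agrees with the counts quoted above.

```python
import random

def simulate_unretimed(samples):
    """6-input adder followed by three output register stages."""
    in_reg = (0, 0, 0, 0, 0, 0)   # 6 input register buses
    pipe = [0, 0, 0]              # 3 output register buses
    outputs = []
    for sample in samples:
        outputs.append(pipe[2])   # value visible before this clock edge
        # Clock edge: every register loads from the old state.
        pipe = [sum(in_reg), pipe[0], pipe[1]]
        in_reg = tuple(sample)
    return outputs

def simulate_retimed(samples):
    """Same function, with the output registers pushed back into the adder."""
    in_reg = (0, 0, 0, 0, 0, 0)   # 6 input register buses
    stage1 = [0, 0, 0]            # a+b, c+d, e+f
    stage2 = [0, 0]               # (a+b)+(c+d), and e+f delayed by one cycle
    stage3 = 0                    # final sum
    outputs = []
    for sample in samples:
        outputs.append(stage3)
        a, b, c, d, e, f = in_reg
        # Update from the last stage backwards so each stage sees old values.
        stage3 = stage2[0] + stage2[1]
        stage2 = [stage1[0] + stage1[1], stage1[2]]
        stage1 = [a + b, c + d, e + f]
        in_reg = tuple(sample)
    return outputs

rng = random.Random(0)
stimulus = [[rng.randrange(256) for _ in range(6)] for _ in range(20)]
# Both models produce the same output on every cycle.
assert simulate_unretimed(stimulus) == simulate_retimed(stimulus)
```

In the retimed model no register-to-register path contains more than one 2-input addition, while the unretimed model performs the whole 6-input sum in a single cycle.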

There are two types of retiming: backward retiming and forward retiming. Backward retiming removes registers from the output of a gate and creates new registers at the inputs of the same gate. Forward retiming does the exact opposite: it removes registers from the inputs of a gate and creates a new register at its output.

For backward retiming to work, the combinatorial logic must drive only the register and must not fan out to other logic. For forward retiming to work, each input of the gate must be driven by a register, and those registers must share the same control logic.

Figure 3 shows the same circuit with either forward or backward retiming.

Figure 3: AND gate either being forward retimed or backward retimed
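The two conditions stated above can be written down as simple checks on a single gate. This is a simplified sketch and not how Vivado implements its analysis: the `Register` class, the function names, and the use of plain strings to stand for non-register drivers or loads are all assumptions made for illustration. The control set is modelled as the combination of clock, clock enable, and reset.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class Register:
    """Hypothetical register record; fields are illustrative."""
    name: str
    clock: str
    enable: Optional[str] = None
    reset: Optional[str] = None

    @property
    def control_set(self):
        # Clock, clock enable and reset together form the control set.
        return (self.clock, self.enable, self.reset)

def can_forward_retime(input_drivers: Sequence[object]) -> bool:
    """Every gate input must be driven by a register, all with one control set."""
    if not input_drivers:
        return False
    if not all(isinstance(d, Register) for d in input_drivers):
        return False
    return len({d.control_set for d in input_drivers}) == 1

def can_backward_retime(gate_loads: Sequence[object]) -> bool:
    """The gate must drive exactly one load, and that load must be a register."""
    return len(gate_loads) == 1 and isinstance(gate_loads[0], Register)
```

For example, `can_forward_retime([Register("a_reg", "clk"), Register("b_reg", "clk", reset="rst")])` returns `False` because the two registers differ in their reset signal.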

GLOBAL RETIMING VS LOCAL RETIMING

There are two ways to enable automatic retiming in Vivado Synthesis: global and local.

Global retiming works on the full design and moves registers across large combinatorial logic structures based on the timing of the design.

It analyzes all of the logic in the design and moves registers on the worst-case paths in order to make the overall design faster. For this to work, the design must have accurate timing constraints in the .xdc file. Global retiming is enabled with the -retiming switch in synth_design or in the Vivado GUI under synthesis settings. This feature can also be used with the BLOCK_SYNTH feature in synthesis to target specific modules in your design.

Local retiming is when a user specifically tells the tool which logic to perform the retiming on using the retiming_forward/retiming_backward RTL attributes.

Care should be taken when performing local retiming as it is not timing driven and the tool will do exactly what is asked of it.

For more information on the use of retiming, please refer to (UG901) Vivado Design Suite User Guide : Synthesis.

ANALYZING MESSAGES FROM THE LOG FILE

Figure 4 shows an example where retiming can improve logic levels. The structure has a critical path of 3 logic levels coming from a 37-bit AND gate. The source register is called din1_dly_reg and the destination register is called tmp1_reg, which is followed by an extra register reached through 0 logic levels.

This is an ideal path to retime, as we can switch from one path with 3 logic levels followed by a path with 0 levels to a path with 2 logic levels followed by a path with 1 or 2 levels.

Figure 4: Circuit that can be backward retimed
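The figure of 3 logic levels for a 37-bit AND can be reproduced with a short calculation. The sketch below assumes the AND is built as a balanced tree of 6-input LUTs with no other packing or carry-chain tricks; the function name and the `lut_size` parameter are illustrative, and real synthesis results may differ.

```python
def lut_levels(num_inputs, lut_size=6):
    """Levels of LUTs needed to reduce num_inputs signals to one output."""
    levels = 0
    signals = num_inputs
    while signals > 1:
        # Each level groups the remaining signals into LUTs (ceiling division).
        signals = -(-signals // lut_size)
        levels += 1
    return levels

print(lut_levels(37))  # 3: 37 inputs -> 7 LUT outputs -> 2 -> 1
print(lut_levels(16))  # 2: 16 inputs -> 3 LUT outputs -> 1
```

With 36 inputs or fewer, two levels would be enough under the same assumption, which is why a 37-bit gate is a natural example of a path that just tips over into a third level.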

The synthesis log file looks similar to the following:

[Image: synthesis log excerpt]

From this log file you can see the reported logic levels before and after retiming, and the names of the new registers that were created. When synthesis creates new registers from retiming, it uses the suffix "bret" for registers that were backward retimed and "fret" for registers that were forward retimed.
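When scanning a list of register names after synthesis, the naming convention above can be used to pick out retimed registers. The sketch below is a guess at a convenient filter, not a documented Vivado interface: it assumes the suffix appears as an underscore-delimited token in the name, and the example names are hypothetical.

```python
import re

# Assumption: "bret"/"fret" appear as an underscore-delimited token.
_RETIMED = re.compile(r"_(bret|fret)(?:_|$)")

def retiming_direction(register_name):
    """Return 'backward', 'forward', or None for a register name."""
    match = _RETIMED.search(register_name)
    if match is None:
        return None
    return "backward" if match.group(1) == "bret" else "forward"

# Hypothetical register names, for illustration only.
for name in ["tmp1_reg_bret", "acc_reg_fret__1", "din1_dly_reg"]:
    print(name, retiming_direction(name))
```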

Figure 5 shows a circuit where incompatible register elements make retiming illegal. The structure again has a start register called din1_dly_reg going through a 37-bit AND gate, causing 3 levels of logic, and then ending at a register called tmp1_reg. In addition, the AND gate fans out to another register, highlighted in pink.

Figure 5: Example of a circuit that can't be retimed

This example cannot be retimed because of the register highlighted in pink. This register has an asynchronous reset, whereas tmp1_reg does not. Because the two registers do not have the same control set, they cannot be backward retimed into the AND gate logic. The log file in this example will show the following:

[Image: synthesis log excerpt]

The log file includes a message about incompatible flip-flops, and the logic levels reported before and after retiming are unchanged.

Retiming cannot happen in the following situations:

1. Timing exceptions on a register (multicycle paths, false paths, max delays)

2. Keep-type attributes on the register (DONT_TOUCH, MARK_DEBUG)

3. Registers with different control sets

4. Registers driving outputs or being driven by inputs (unless the design is marked as out-of-context).
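The four conditions in this list can be collected into a single check that reports why a register would be skipped. This is a hedged sketch for reasoning about a design by hand: the `RegisterInfo` fields, the `companions` argument (other registers that would have to move together with this one), and the reason strings are all hypothetical and do not correspond to any Vivado API.

```python
from dataclasses import dataclass

KEEP_ATTRIBUTES = frozenset({"DONT_TOUCH", "MARK_DEBUG"})

@dataclass(frozen=True)
class RegisterInfo:
    """Hypothetical summary of one register; not a Vivado object."""
    name: str
    control_set: tuple                    # (clock, clock enable, reset)
    has_timing_exception: bool = False    # multicycle, false path, max delay
    attributes: frozenset = frozenset()
    on_top_level_port: bool = False       # drives an output or is driven by an input

def retiming_blockers(reg, companions=(), out_of_context=False):
    """Return the reasons from the list above that apply to this register."""
    reasons = []
    if reg.has_timing_exception:
        reasons.append("timing exception")
    if reg.attributes & KEEP_ATTRIBUTES:
        reasons.append("keep-type attribute")
    if any(other.control_set != reg.control_set for other in companions):
        reasons.append("different control sets")
    if reg.on_top_level_port and not out_of_context:
        reasons.append("connected to a top-level port")
    return reasons
```

Applied to the Figure 5 situation, a register without a reset checked against a companion that has an asynchronous reset would return `["different control sets"]`.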

Example where retiming is unable to improve the critical path in a feedback loop:

When a path has the same source and destination register, retiming might not be able to improve the logic levels.

For example:

The critical path for the register "dout_reg" is highlighted in red. It goes through a reduction AND operator and ends at the reset pin of the same register.

Because the operand is 16 bits wide, the reduction AND operator consumes 2 logic levels.

[Image: retiming-1.jpg]
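One way to see why a feedback loop resists retiming is the textbook retiming graph model, in which combinational blocks are nodes, each edge carries a register count, and a retiming assigns an integer lag to every node. The sketch below uses that general model and is not a description of Vivado's algorithm. The two node names are hypothetical stand-ins for the reduction AND and the logic at the register's reset pin.

```python
from itertools import product

def retime(edges, lag):
    """Apply lags: regs' = regs + lag[dst] - lag[src]. None if illegal."""
    retimed = []
    for src, dst, regs in edges:
        new_regs = regs + lag[dst] - lag[src]
        if new_regs < 0:          # a negative register count is not realizable
            return None
        retimed.append((src, dst, new_regs))
    return retimed

# Hypothetical model of the loop: one register (dout_reg) on the way back.
LOOP = [("and_reduce", "reset_logic", 0),
        ("reset_logic", "and_reduce", 1)]

def loop_register_counts(edges, lag_range=range(-2, 3)):
    """Total registers in the loop for every legal lag assignment tried."""
    nodes = sorted({n for src, dst, _ in edges for n in (src, dst)})
    counts = set()
    for lags in product(lag_range, repeat=len(nodes)):
        retimed = retime(edges, dict(zip(nodes, lags)))
        if retimed is not None:
            counts.add(sum(regs for _, _, regs in retimed))
    return counts

print(loop_register_counts(LOOP))  # {1}
```

Every legal retiming leaves exactly one register in the loop, because the lags cancel when summed around a cycle. The register can change position, but all of the loop's combinational logic still sits between that one register and itself.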


The screen capture below shows how synthesis describes the nature of the critical path.

It also lists the cell names that are part of the critical path.

[Image: retiming-2.png]

Thanks to Chaithanya Dudha, who is the original author of this article.

2 Comments
Explorer

Does item 3 also cover clock domain crossings, so that they are never retimed under any circumstances? Or will such paths get optimized if the clocks do not have any asynchronous clock group setting?

Xilinx Employee

@tsjorgensen, yes, item 3 (different control sets) covers clock domain crossings.

A control set is the combination of the clock, clock enable and reset signal. A different clock means a different control set, so such registers will never be retimed.

Best regards

Dries