Today's designs are complicated, with many clock domains, embedded processors, IPs, complex state machines and sometimes even RTL generated by HLS tools from high-level languages like C/C++. This complexity is exacerbated by the many types of resets - processor, system, core, software and IP resets. Further complicating the reset architecture is the choice between synchronous and asynchronous resets and between active-high and active-low resets, which complicates the RTL coding style too. Unfortunately, reset architecture is often not considered early in the design cycle, so every designer decides the fate of the resets in their own block. The result is an ad-hoc, poorly planned and poorly implemented reset strategy that leads to many iterations, long debug and sometimes even product recalls.
The general recommendation is to use synchronous resets as much as possible. Asynchronous resets are perfectly acceptable as long as you are aware of their limitations. In fact, asynchronous resets are especially useful if a clock cannot be guaranteed. Even in a system where a stable clock cannot be guaranteed at start-up, a synchronous reset can still be used: the design is held in reset until the PLLs or MMCMs are locked and a stable clock is being generated, and only then is the reset released.
In this blog I will attempt to demystify resets as they apply to Xilinx FPGA designs. In Part 1 I will address a few design considerations that impact the reset architecture, the power impact of resets, coding styles for synchronous resets and how to synchronize an asynchronous reset.
The Need for Resets and Reset Planning
The benefit of resets in a system is that they force the design into a known state for simulation and ensure that every chip in the system is in a known state after power-up.
Reset architecture needs to be planned well and early in the design phase. Early planning avoids surprises when disparate teams spread all over the world are responsible for RTL design, verification, synthesis, implementation, timing closure, design validation and generation of the production bitstream. If not adequately planned, reset issues may show up long after the product has gone into production. A case in point: I own a smart refrigerator from a leading manufacturer. Its LCD panel sometimes refuses to respond to touch, and the only way to get it to respond is a hard reboot, just like the Windows PCs of yesteryear. Calling the factory gets you the same resolution - have you power cycled the refrigerator? Power cycling the refrigerator every time I want to change a setting is a painful process, don't you agree? Sometimes the LCD won't respond even after power cycling. I suspect a reset architecture issue. I'm sure there are other examples too.
The first step of the planning process is to decide whether a reset is even necessary. Xilinx FPGAs come with a Global Reset that resets everything in the device. It might be worthwhile to eliminate resets in your design and use the Global Reset instead. Eliminating resets is not an easy solution, however: it leads to coding issues (discussed later in this blog), and removing resets from legacy RTL code and third-party IP is not a trivial task. The ideal situation is one where you can remove the resets in your own part of the design and connect the third-party IP resets to the Global Reset.
Once it has been decided that the system or chip design architecture absolutely requires resets, the next steps are to decide what to reset and what not to reset, how to deal with synchronous and asynchronous resets, whether to use active-high or active-low resets and finally how to handle reset clock domain crossing.
Reset and the impact to system/chip power
Resets have a large impact on power, which can be attributed to two primary factors. First, excessive reset use generally creates 3-5% more logic (FFs and LUTs), which you can think of as roughly 3-5% more power (it can be more). Second, resets tend to have high fanout and to be timing critical. As a result, they often consume the more valuable routing resources, leaving the less timing-critical data paths to use less optimal routes. Since resets generally do not toggle often, the power they consume is much the same whether they sit on a short route or a long one, because dynamic power is driven by switching rates. Data paths, however, do switch, and if their wire lengths are longer they will consume more power even if they meet timing. So if wire lengths increase by, say, 15%, signal power increases by about the same amount. Reducing resets in your design therefore results in power savings as well.
What to reset within a design and how long should the reset be active
One of the decisions to be made at the planning stage is what gets reset and what does not. As a general guideline, resetting state machines and FIFO read and write pointers should be enough. Data paths seldom need to be reset if the state machine controlling the data path is properly designed. In most cases, only the first pipeline stage needs to be reset; the rest of the pipeline stages will be flushed in the subsequent clock cycles.
Another consideration is the duration of the reset pulse. Ideally the reset should be active long enough for all the pipeline registers to be flushed before valid data flows through the design. The reset pulse should be long enough, typically 20-50 clock cycles or until a few cycles after the PLLs or MMCMs lock, depending on the number of elements being reset - the more registers (or flip-flops) that are being reset, the longer the reset pulse. This helps ensure that recovery and removal times are met for registers placed far away from the reset register.
A common practice is to combine the 'locked' signal from a PLL or MMCM with the system, CPU or core reset (in an AND/OR-type LUT) before resetting all the flops in the design. Keep in mind that 'locked' is an asynchronous signal and that a LUT (OR/AND) in the path might produce a glitch, which could cause a spurious reset. The recommendation is to register the output of the LUT and use the output of that register to reset the flops: the source of a reset must be the output of a flip-flop, not the output of a LUT or other combinatorial block. Registering the LUT output also allows the tool to replicate the register if the fanout is very high.
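The guideline of resetting only the first pipeline stage can be sketched as follows. This is a minimal illustration, not code from the original post: the module name `pipe3`, the `valid`/data signal names and the 8-bit width are all assumptions, and it presumes the reset is held for at least as many cycles as there are pipeline stages.

```verilog
// Hypothetical 3-stage pipeline: only the first-stage valid bit is reset.
module pipe3 (
    input  wire       clk,
    input  wire       reset,      // synchronous, active-high
    input  wire       valid_in,
    input  wire [7:0] din,
    output wire       valid_out,
    output wire [7:0] dout
);
    reg       v1, v2, v3;
    reg [7:0] d1, d2, d3;

    // First stage: the valid bit is the only register with a reset.
    always @(posedge clk) begin
        if (reset)
            v1 <= 1'b0;
        else
            v1 <= valid_in;
    end

    // Later valid stages and all data registers have no reset.
    // With reset held for >= 3 cycles, v2 and v3 are flushed to 0
    // by the zeros shifting in from v1.
    always @(posedge clk) begin
        v2 <= v1;
        v3 <= v2;
        d1 <= din;
        d2 <= d1;
        d3 <= d2;
    end

    assign valid_out = v3;
    assign dout      = d3;
endmodule
```

Downstream logic only looks at `dout` when `valid_out` is high, so the unknown power-up contents of the data registers never matter.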
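A minimal sketch of registering the combined reset is shown below. The module and signal names (`reset_source`, `sys_reset`, `reset_out`) are hypothetical, the reset is assumed active-high, and `clk` is assumed to be running while the reset is evaluated. Because 'locked' is asynchronous, a real design would likely also pass the result through a synchronizer of the kind described in the next section; this sketch only shows the "flip-flop, not LUT, as reset source" idea.

```verilog
// Hypothetical reset source: the LUT output is registered before fanning out.
module reset_source (
    input  wire clk,
    input  wire locked,     // from PLL/MMCM, asynchronous
    input  wire sys_reset,  // active-high system reset (assumed)
    output wire reset_out   // registered reset for the rest of the design
);
    // Combinatorial (LUT) result: may glitch, so it is NOT used directly.
    wire reset_comb = sys_reset | ~locked;

    // Register starts asserted so the design is in reset after configuration.
    reg reset_reg = 1'b1;

    always @(posedge clk) begin
        reset_reg <= reset_comb;
    end

    // The reset seen by the design is driven by a flip-flop, which the
    // tool can replicate if the fanout is high.
    assign reset_out = reset_reg;
endmodule
```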
Synchronous and Asynchronous Resets
Another architectural decision that has to be made very early in the design process is whether to use a synchronous reset or an asynchronous reset. Xilinx's recommendation is to use synchronous resets throughout your design. Asynchronous resets are nevertheless very common in today's complex SoC designs. If there are any asynchronous resets in your design, the recommendation is to synchronize them. Xilinx also recommends the use of XPM CDC modules for all CDC topologies, including asynchronous resets. The XPM CDC modules are correct by construction and come with timing constraints. An asynchronous reset synchronizer allows the reset signal to be asserted asynchronously, while its de-assertion (or removal) is synchronous.
Fig. 1 shows an active-high asynchronous reset synchronizer, and the corresponding RTL coding is shown in Fig. 2. When the reset is asserted (1'b1), the Q-pins of the synchronizer flops go high and stay high until the reset is de-asserted (goes low). Once the reset is de-asserted, the 1'b0 on the D-pin of the first flop is captured on the next clock cycle, and one cycle after that (two clock cycles in total) the output of the second flop in the synchronizer goes low. If there are more flops in the synchronizer chain (more than the 2 flops shown in Fig. 1), it takes that many more cycles before the reset is de-asserted on the last flop of the chain.
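As an illustration of the XPM route, the sketch below instantiates the `xpm_cdc_async_rst` macro. The wrapper module name, the signal names and the chosen parameter values are assumptions made for this example; check the XPM documentation for your Vivado version for the exact parameter ranges and defaults.

```verilog
// Hypothetical wrapper around the XPM asynchronous reset synchronizer.
module xpm_reset_sync_example (
    input  wire clk,             // destination clock
    input  wire reset_a,         // asynchronous, active-high reset
    output wire reset_a_synced   // asserted asynchronously, released synchronously
);
    xpm_cdc_async_rst #(
        .DEST_SYNC_FF    (4),    // number of synchronizer stages (example value)
        .INIT_SYNC_FF    (0),    // do not rely on simulation init values
        .RST_ACTIVE_HIGH (1)     // 1 = active-high reset
    ) u_xpm_reset_sync (
        .src_arst  (reset_a),
        .dest_clk  (clk),
        .dest_arst (reset_a_synced)
    );
endmodule
```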
Fig. 1: Schematic for synchronizing an asynchronous reset
Fig. 2: Verilog code snippet for synchronizing an active-high asynchronous reset
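The figure itself is not reproduced in this text. A minimal sketch consistent with the description above might look like the following; the module name `reset_sync_high` and the first-stage register name `reset_a_meta_reg` are assumptions, while `reset_a_synched_reg` and `reset_a_synced` follow the names used in this blog.

```verilog
// Sketch of a 2-flop active-high asynchronous reset synchronizer.
module reset_sync_high (
    input  wire clk,
    input  wire reset_a,         // asynchronous, active-high
    output wire reset_a_synced   // asserts asynchronously, de-asserts synchronously
);
    // ASYNC_REG asks Vivado to treat these as synchronizer flops
    // and keep them close together.
    (* ASYNC_REG = "TRUE" *) reg reset_a_meta_reg;
    (* ASYNC_REG = "TRUE" *) reg reset_a_synched_reg;

    always @(posedge clk or posedge reset_a) begin
        if (reset_a) begin
            // Asynchronous assertion: both Q-pins go high immediately.
            reset_a_meta_reg    <= 1'b1;
            reset_a_synched_reg <= 1'b1;
        end else begin
            // D-pin of the first flop is tied to GND; the 0 takes
            // two clock cycles to reach the output.
            reset_a_meta_reg    <= 1'b0;
            reset_a_synched_reg <= reset_a_meta_reg;
        end
    end

    assign reset_a_synced = reset_a_synched_reg;
endmodule
```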
If the asynchronous reset comes from a top level port and will feed multiple clock domains, remember to synchronize the reset in each of the respective clock domains.
Notice that the code snippet is for an active-high reset, with the D-pin of the first flip-flop connected to GND. If the reset is active-low, the D-pin of the first flip-flop is connected to VCC and the asynchronous reset is connected to the CLR pins of the reset synchronizer (see Fig. 4 and Fig. 5 below).
Once the asynchronous reset has been synchronized in the top-level module, the subsequent RTL coding style (for the wire from the Q-pin of the reset_a_synched_reg flip-flop in Fig. 1 above) should ideally be that of a synchronous reset, as shown in the code snippet below:
Fig. 3: Code snippet for the rest of the design once the asynchronous reset has been synchronized in the top level.
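The figure itself is not reproduced in this text. A minimal sketch of the synchronous reset coding style for a lower-level module might look like this; the module name `sub_block` and the `din`/`dout` data signals are hypothetical, and only `reset_a_synced` comes from the text.

```verilog
// Sketch of a lower-level block using the synchronized reset synchronously.
module sub_block (
    input  wire       clk,
    input  wire       reset_a_synced,  // from the top-level reset synchronizer
    input  wire [7:0] din,
    output reg  [7:0] dout
);
    // The reset is NOT in the sensitivity list, so it is sampled only
    // on the clock edge and infers a synchronous reset.
    always @(posedge clk) begin
        if (reset_a_synced)
            dout <= 8'd0;
        else
            dout <= din;
    end
endmodule
```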
In the code snippet of Fig. 3, the hierarchical modules below the top-level module use a synchronous reset coding style. Depending on its fanout, the reset register 'reset_a_synced' can be replicated as necessary in order to meet timing. Another thing to note is that the register reset_a_synced goes high as soon as the asynchronous reset is active. When it goes low (inactive) depends on how many synchronizer stages there are. If there are 2 FFs in the synchronizer chain, the reset is inactive after two clock cycles. If there are three FFs in the chain, the reset is inactive after three clock cycles. If a register is added to aid replication, it adds another cycle of reset latency.
The first reason for recommending synchronous resets is that big blocks like DSPs and block RAMs support only synchronous resets by architecture. DSPs and block RAMs can be inferred if synchronous resets are used, whereas asynchronous resets might result in these structures being implemented in the fabric, which might hurt performance. In the DSP blocks, the pipeline registers support only synchronous resets. In block RAMs, the output registers support only synchronous resets, and using the output registers is an advantage as it reduces the clock-to-out (Tco).
The other reason for using synchronous resets is the flexibility it gives the tool to either hook the reset directly to the R or the CLR pin or merge the reset signal into the datapath. This flexibility reduces the number of control sets in the design and allows the placer to pack more flops into the same slices (refer to Chapter 5 of UG949: UltraFast Design Methodology Guide for the Vivado Design Suite for control sets).
The final reason for recommending synchronous resets is that a synchronous reset is automatically timed and does not need any special timing constraints. Synchronous resets are predictable (they take effect on the clock edge), whereas the release of an asynchronous reset is not always predictable - it can happen at any time.
Active-high and Active-low Resets
For control signals in general and resets in particular, it is recommended to pick one polarity, either active-high or active-low, and use it throughout your design. I have observed RTL issues where one hierarchical module designer assumed active-high while another designer assumed active-low for the same reset signal. The simulations were all done at the hierarchical level by the RTL engineers, so they all passed. When the FPGA was tested in the lab, it was noticed that one hierarchical block was always in reset. After a lot of debug the issue was traced to the inconsistent reset coding. The type of reset - active-high or active-low - needs to be decided at the planning stage itself to avoid surprises later in the design. The decision should be based on the system architecture, any legacy RTL code that will be reused and the reset styles in third-party IPs.
If possible, always use active-high resets when using Xilinx FPGAs (active-low resets require an inversion, adding a LUT in the path). If your design has active-low resets, the reset synchronizer and the RTL code snippet are as shown in Fig. 4 and Fig. 5 below:
Fig. 4: Active low asynchronous reset synchronizer
Fig. 5: RTL code snippet for active-low asynchronous reset synchronizer
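The figure itself is not reproduced in this text. A minimal sketch of an active-low version, consistent with the earlier note that the D-pin of the first flop is tied to VCC, might look like this; the module and signal names (`reset_sync_low`, `reset_a_n`, `reset_a_n_synced` and the two register names) are assumptions.

```verilog
// Sketch of a 2-flop active-low asynchronous reset synchronizer.
module reset_sync_low (
    input  wire clk,
    input  wire reset_a_n,         // asynchronous, active-low
    output wire reset_a_n_synced   // asserts (low) asynchronously, releases synchronously
);
    (* ASYNC_REG = "TRUE" *) reg reset_a_n_meta_reg;
    (* ASYNC_REG = "TRUE" *) reg reset_a_n_synched_reg;

    always @(posedge clk or negedge reset_a_n) begin
        if (!reset_a_n) begin
            // Asynchronous assertion: both flops are cleared to 0.
            reset_a_n_meta_reg    <= 1'b0;
            reset_a_n_synched_reg <= 1'b0;
        end else begin
            // D-pin of the first flop is tied to VCC; the 1 takes
            // two clock cycles to reach the output.
            reset_a_n_meta_reg    <= 1'b1;
            reset_a_n_synched_reg <= reset_a_n_meta_reg;
        end
    end

    assign reset_a_n_synced = reset_a_n_synched_reg;
endmodule
```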
In the next blog I will cover a few tricks and tips on combining multiple asynchronous resets, sequencing resets across clock domains, an insight into how you can manage reset behavior in Vivado and finally how Vivado manages resets in your design.