Adventurer
8,722 Views
Registered: 07-24-2016

ASYNC_REG and clock domain crossing

Greetings forum,

 

I have created a simple clock domain crossing circuit (CDCC), but I need some clarification on the ASYNC_REG attribute. If I am not mistaken, FFs that are connected to each other and have the ASYNC_REG attribute will be placed close together (in the same slice?) in order to optimize the timing of their interconnections. But in this design, have I... overdone it with the assignment of the attribute? Should it be applied only to the signal that connects the FDREs clocked by clk_dst? Am I putting too much work on the placer/router? Please see the VHDL code below:

 

library IEEE;
library UNISIM;
use IEEE.STD_LOGIC_1164.ALL;
use UNISIM.VComponents.all;

entity CDCC is
port(
    clk_src     : in  std_logic;    -- clock of source clock domain
    clk_dst     : in  std_logic;    -- clock of destination clock domain
    data_in     : in  std_logic;    -- signal input, synced to clk_src
    data_out_s  : out std_logic     -- signal output, synced to clk_dst
    );
end CDCC;

architecture RTL of CDCC is

    signal data_in_int          : std_logic := '0';
    signal data_in_reg          : std_logic := '0';
    signal data_sync_stage_0    : std_logic := '0';
    signal data_out_s_int       : std_logic := '0';

    attribute ASYNC_REG                      : string;
    attribute ASYNC_REG of data_in_int       : signal is "true";
    attribute ASYNC_REG of data_in_reg       : signal is "true";
    attribute ASYNC_REG of data_sync_stage_0 : signal is "true";
    attribute ASYNC_REG of data_out_s_int    : signal is "true";

begin

    -- register the incoming signal first...
    FDRE_reg_input_CDCC: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_in_reg,
        C   => clk_src,
        CE  => '1',
        R   => '0',
        D   => data_in_int
        );

    -- first sync stage
    FDRE_sync_CDCC_0: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_sync_stage_0,
        C   => clk_dst,
        CE  => '1',
        R   => '0',
        D   => data_in_reg
        );

    -- second sync stage
    FDRE_sync_CDCC_1: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_out_s_int,
        C   => clk_dst,
        CE  => '1',
        R   => '0',
        D   => data_sync_stage_0
        );

    data_in_int <= data_in;
    data_out_s  <= data_out_s_int;

end RTL;

 

 Cheers!

Accepted Solutions

Historian
14,677 Views
Registered: 01-23-2009

Re: ASYNC_REG and clock domain crossing

Yes. You have too many ASYNC_REG properties. It's not just that you are overdoing them; they are wrong...

 

First, when you instantiate the cells directly, ASYNC_REG should be applied to the cells, not the signals. When the flip-flops are inferred (using a clocked process), you apply the attribute to the signal that will become the flip-flop; but when you directly instantiate the cell, you should apply the attribute to the cell itself.

 

The only cells that should have the attribute set are the ones that form the metastability reduction chain. In your design, those are FDRE_sync_CDCC_0 and FDRE_sync_CDCC_1.

 

It is incorrect to apply them to FDRE_reg_input_CDCC, since that is not part of the metastability resolution chain, and the "report_cdc" command would probably flag this as a violation (and I have no idea what the placer/router will do with it).

 

Furthermore, it is meaningless to try and apply this to data_in_int since (at least in this code) this signal is not associated with a register at all - it is a wire...
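
A minimal sketch of how the single-bit design from the question might look with the attribute moved onto the two synchronizer cells. This is an illustration under assumptions, not a verified fix: it uses the VHDL "label" entity class to attach ASYNC_REG to the instance labels, drops the data_in_int wire, and leaves the source-domain register without the attribute. Whether the property survives the tool flow should still be checked in the implemented design.

```vhdl
library IEEE;
library UNISIM;
use IEEE.STD_LOGIC_1164.ALL;
use UNISIM.VComponents.all;

entity CDCC is
port(
    clk_src     : in  std_logic;    -- clock of source clock domain
    clk_dst     : in  std_logic;    -- clock of destination clock domain
    data_in     : in  std_logic;    -- signal input, synced to clk_src
    data_out_s  : out std_logic     -- signal output, synced to clk_dst
    );
end CDCC;

architecture RTL of CDCC is

    signal data_in_reg          : std_logic := '0';
    signal data_sync_stage_0    : std_logic := '0';
    signal data_out_s_int       : std_logic := '0';

    -- ASYNC_REG goes on the instantiated cells (instance labels),
    -- and only on the two flip-flops of the synchronizer chain.
    attribute ASYNC_REG                     : string;
    attribute ASYNC_REG of FDRE_sync_CDCC_0 : label is "TRUE";
    attribute ASYNC_REG of FDRE_sync_CDCC_1 : label is "TRUE";

begin

    -- source-domain register: NOT part of the chain, so no ASYNC_REG
    FDRE_reg_input_CDCC: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_in_reg,
        C   => clk_src,
        CE  => '1',
        R   => '0',
        D   => data_in
        );

    -- first sync stage
    FDRE_sync_CDCC_0: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_sync_stage_0,
        C   => clk_dst,
        CE  => '1',
        R   => '0',
        D   => data_in_reg
        );

    -- second sync stage
    FDRE_sync_CDCC_1: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_out_s_int,
        C   => clk_dst,
        CE  => '1',
        R   => '0',
        D   => data_sync_stage_0
        );

    data_out_s <= data_out_s_int;

end RTL;
```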

 

Avrum

12 Replies

Scholar drjohnsmith
8,707 Views
Registered: 07-09-2009

Re: ASYNC_REG and clock domain crossing

async_reg doesn't work the way one would expect.

Effectively it's useless.

The official line is along the lines of: if you use lots of Tcl, and check what the tools have done to prove it, then it can work.

For you and me, async_reg doesn't do what we'd want.

Do a search of the forums and the answer records; you will see a lot of chatter about it.

Rant over.

 

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
Historian
8,702 Views
Registered: 01-23-2009

Re: ASYNC_REG and clock domain crossing

@drjohnsmith wrote:

async_reg doesn't work the way one would expect.

Effectively it's useless.

 

We have to be careful with statements like this...

 

While there have been more than a fair share of issues with the support of the ASYNC_REG attribute/property, and I agree that you need to be careful with it (and maybe even "spot check" that the tools are doing what they are supposed to), we need to keep our eye on the fact that this is supposed to work and we should continue to use it as it is intended. If it doesn't work properly, then we need to continue filing SRs on it.

 

That being said, I haven't seen much chatter about 2016.3 and now 2016.4 is out - I expect that issues should resolve as each new version comes out...

 

So - in my opinion - we should continue to use them as they are intended - they are important.

 

Avrum

Scholar drjohnsmith
8,685 Views
Registered: 07-09-2009

Re: ASYNC_REG and clock domain crossing

Thanks.

"have to be careful,"

Being primarily a VHDL user, and doing smaller Kintex parts most of the time for companies, I'm very used to things not working as advertised and to putting in requests.

As for the original question: async_reg cannot be relied upon to do what one would expect, which is what I call broken.

Now, it can be 'fixed' by Tcl.

But why can't it be made easy to

a) from the VHDL, say: put these two registers into one block

b) from the VHDL, say: put this into an IOB

To quote Clarkson and Top Gear: how hard can it be...

 

 

 

 

Adventurer
8,635 Views
Registered: 07-24-2016

Re: ASYNC_REG and clock domain crossing

@avrumw

@drjohnsmith

 

Thank you for your replies. I also read this topic (LINK HERE), and decided to test VHDL code that instantiates FDREs in generate loops, with the ASYNC_REG attribute set in the design's .xdc file. I applied the attribute only to the two FFs clocked by clk_dst, as avrumw suggests. It seems that the placer put the FFs close together, as expected. I will attach everything below so that users can refer to it in the future.

 

library IEEE;
library UNISIM;
use IEEE.STD_LOGIC_1164.ALL;
use UNISIM.VComponents.all;

entity CDCC is
generic(
    NUMBER_OF_BITS : integer := 8); -- number of signals to be synced
port(
    clk_src     : in  std_logic;                                        -- input clk (source clock)
    clk_dst     : in  std_logic;                                        -- input clk (dest clock)
    data_in     : in  std_logic_vector(NUMBER_OF_BITS - 1 downto 0);    -- data to be synced
    data_out_s  : out std_logic_vector(NUMBER_OF_BITS - 1 downto 0)     -- synced data to clk_dst
    );
end CDCC;

architecture RTL of CDCC is
    
    signal data_in_reg          : std_logic_vector(NUMBER_OF_BITS - 1 downto 0) := (others => '0');
    signal data_sync_stage_0    : std_logic_vector(NUMBER_OF_BITS - 1 downto 0) := (others => '0');
    signal data_out_s_int       : std_logic_vector(NUMBER_OF_BITS - 1 downto 0) := (others => '0');

begin

-------------------------------------------------------
-- Register the input signals
-------------------------------------------------------
reg_input_CDCC: for I in 0 to (NUMBER_OF_BITS - 1) generate
FDRE_reg_input_CDCC: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_in_reg(I),
        C   => clk_src,
        CE  => '1',
        R   => '0',
        D   => data_in(I)
        );
end generate reg_input_CDCC;

-------------------------------------------------------
-- Synchronization stage 0
-------------------------------------------------------
sync_block_CDCC_0: for I in 0 to (NUMBER_OF_BITS - 1) generate
FDRE_sync_CDCC_0: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_sync_stage_0(I),
        C   => clk_dst,
        CE  => '1',
        R   => '0',
        D   => data_in_reg(I)
        );
end generate sync_block_CDCC_0;

-------------------------------------------------------
-- Synchronization stage 1
-------------------------------------------------------
sync_block_CDCC_1: for I in 0 to (NUMBER_OF_BITS - 1) generate
FDRE_sync_CDCC_1: FDRE
    generic map (INIT => '0')
    port map(
        Q   => data_out_s_int(I),
        C   => clk_dst,
        CE  => '1',
        R   => '0',
        D   => data_sync_stage_0(I)
        );
end generate sync_block_CDCC_1;

    data_out_s  <= data_out_s_int;
    
end RTL;

.XDC file snippet below. I instantiated the CDCC component at the top level of the design as: CDCC_200to125: CDCC generic map(...) etc...

 

set_property ASYNC_REG true [get_cells CDCC_200to125/sync_block_CDCC_0[*].FDRE_sync_CDCC_0]
set_property ASYNC_REG true [get_cells CDCC_200to125/sync_block_CDCC_1[*].FDRE_sync_CDCC_1]

And the result of the implemented design:

[Attached image: FFs.png]

 

 

If you have any other comments, feel free to contribute.

 

Cheers!

 

Historian
8,604 Views
Registered: 01-23-2009

Re: ASYNC_REG and clock domain crossing

I have a couple of questions/comments...

 

First, why are you instantiating FDRE cells, rather than inferring them using clocked processes?

 

FF: process (CLK)
begin
  if rising_edge(CLK) then
    if (RST = '1') then
      Q <= '0';
    else
      Q <= D;
    end if;
  end if;
end process FF;

When done this way, I am pretty sure (and others can correct me if I am wrong) that when you use the VHDL attribute format to set the ASYNC_REG attribute on the signal "Q", it correctly ends up on the resulting flip-flop.
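
As a rough illustration of the inferred style, here is a single-bit, two-stage synchronizer with the attribute placed on the signals that become the flip-flops. The entity name sync_2ff and the signal names are hypothetical, and the sketch assumes that async_in is already registered in the source clock domain.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity sync_2ff is
port(
    clk_dst  : in  std_logic;    -- destination clock
    async_in : in  std_logic;    -- assumed to come from a source-domain register
    sync_out : out std_logic     -- synchronized to clk_dst
    );
end sync_2ff;

architecture RTL of sync_2ff is

    signal sync_stage_0 : std_logic := '0';
    signal sync_stage_1 : std_logic := '0';

    -- For inferred flip-flops the attribute is set on the signals
    -- that will become the registers of the synchronizer chain.
    attribute ASYNC_REG                 : string;
    attribute ASYNC_REG of sync_stage_0 : signal is "TRUE";
    attribute ASYNC_REG of sync_stage_1 : signal is "TRUE";

begin

    sync: process (clk_dst)
    begin
        if rising_edge(clk_dst) then
            sync_stage_0 <= async_in;      -- first stage, may go metastable
            sync_stage_1 <= sync_stage_0;  -- second stage, resolves it
        end if;
    end process sync;

    sync_out <= sync_stage_1;

end RTL;
```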

 

Second, rather than (or maybe in addition to) looking at the placement of the flip-flops after place and route, you should verify that the ASYNC_REG property has been preserved through the tool flow. In the GUI you can open the implemented design, select the cell and look at the properties window, but ASYNC_REG is not one of the "standard" properties shown (by the way, Xilinx, if you are listening, it should be), so you have to add it to the list using the "+" icon. You can also get it through Tcl (with the implemented design open):

get_property ASYNC_REG [get_cells <hierarchical_name_of_flip_flop>]

It should return "TRUE" if the ASYNC_REG was preserved.

 

Finally, I want to caution about the generated "bank" of flip-flops as a clock crosser (which is what the generate loop does - it generates a bank of NUMBER_OF_BITS flip-flops for crossing a vector of width NUMBER_OF_BITS). As a general rule this is not a valid clock domain crossing (CDC) system. When crossing a bus of bits, you need to ensure that the bus remains correlated through the CDC. Done this way (with independent single-bit synchronizers), the bits do not remain correlated. The only exceptions I know of are if the bus was not correlated to start with (like a bus of independent status bits), or if the bus is coded so that it does not need correlation (i.e. is Gray coded). If the bus is anything else, then you can end up corrupting the bus as you bring it across clock domains - see this post on the risks of bus coherency and clock crossing.
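
To make the Gray-coding exception concrete, here is a small sketch of conversion functions that could sit on either side of such a crossing. The package name gray_pkg and the function names are made up for illustration, and the vectors are assumed to be at least one bit wide. Gray coding only helps when the value moves by one step at a time (a counter or FIFO pointer, for instance), because then only a single bit changes per update.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

package gray_pkg is
    function bin_to_gray(b : std_logic_vector) return std_logic_vector;
    function gray_to_bin(g : std_logic_vector) return std_logic_vector;
end package gray_pkg;

package body gray_pkg is

    -- Binary to Gray: g = b xor (b shifted right by one)
    function bin_to_gray(b : std_logic_vector) return std_logic_vector is
        -- normalize the argument to an (N-1 downto 0) range
        variable bn : std_logic_vector(b'length - 1 downto 0) := b;
        variable g  : std_logic_vector(b'length - 1 downto 0);
    begin
        g(g'high) := bn(bn'high);
        for i in g'high - 1 downto 0 loop
            g(i) := bn(i + 1) xor bn(i);
        end loop;
        return g;
    end function bin_to_gray;

    -- Gray to binary: each binary bit is the xor of all Gray bits above it
    -- and the Gray bit at its own position
    function gray_to_bin(g : std_logic_vector) return std_logic_vector is
        variable gn : std_logic_vector(g'length - 1 downto 0) := g;
        variable b  : std_logic_vector(g'length - 1 downto 0);
    begin
        b(b'high) := gn(gn'high);
        for i in b'high - 1 downto 0 loop
            b(i) := b(i + 1) xor gn(i);
        end loop;
        return b;
    end function gray_to_bin;

end package body gray_pkg;
```

In this arrangement the Gray-coded value would be registered in the source domain before it enters the synchronizers, and decoded only after the last synchronizer stage in the destination domain.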

 

Avrum

Adventurer
8,586 Views
Registered: 07-24-2016

Re: ASYNC_REG and clock domain crossing

@avrumw wrote:

I have a couple of questions/comments...

 

First, why are you instantiating FDRE cells, rather than inferring them using clocked processes?

When done this way, I am pretty sure (and others can correct me if I am wrong) that when you use the VHDL attribute format to set the ASYNC_REG attribute on the signal "Q", it correctly ends up on the resulting flip-flop.

 


 

I just wanted to see if it would work, as in the link I provided above, you mentioned something about a problem with the ASYNC_REG being asserted correctly in FDREs generated in a loop: LINK HERE. I am aware that the way you showed does the same thing with more compact code. But in any case, I don't think there is any real difference in the final implemented design. Correct?

 


@avrumw wrote:

 

 

Finally, I want to caution about the generated "bank" of flip-flops as a clock crosser (which is what the generate loop does - it generates a bank of NUMBER_OF_BITS flip-flops for crossing a vector of width NUMBER_OF_BITS). As a general rule this is not a valid clock domain crossing (CDC) system. When crossing a bus of bits, you need to ensure that the bus remains correlated through the CDC. Done this way (with independent single-bit synchronizers), the bits do not remain correlated. The only exceptions I know of are if the bus was not correlated to start with (like a bus of independent status bits), or if the bus is coded so that it does not need correlation (i.e. is Gray coded). If the bus is anything else, then you can end up corrupting the bus as you bring it across clock domains - see this post on the risks of bus coherency and clock crossing.

 

Avrum


Yes, I am aware of that. But in my case, this circuit will be used to synchronize several signals that are unrelated to each other and are sent from one FSM to another (the FSMs being clocked by clocks that are asynchronous to each other). So it is not much of a problem. By the way, I recently stumbled upon this paper which covers these issues in great detail.

Regarding the CDC of multi-bit signals that are part of a bus (e.g. a counter), I would strongly prefer FIFOs. Anyway, the post that you linked (the one I quoted right above) got me thinking; to be more precise, this part did:

 

"So, how do we fix this. We need to constrain the clock crossing paths so that this is impossible. What we really need to do is ensure that the skew on the different bits of our Gray code bus is less than one source clock period. Even in Vivado, we cannot directly constrain skew. But, if we constrain the maximum delay on the path to be less than one source clock period, then we can (indirectly) constrain the skew. How do we do this

 

set_max_delay -from <source_FFs_on_source_domain> -to <destination_FFs_on_destination_domain> <value_less_than_source_clock_period> -datapath_only"

 

So my question is: Can the same thing be achieved, with the code I attached in my previous post, if I add the following line in my .xdc file?

 

set_property ASYNC_REG true [get_cells CDCC_200to125/reg_input_CDCC[*].FDRE_reg_input_CDCC]

Wouldn't this constrain the source FF (reg_input), to be closer to the first FF of the synchronization chain?

 

P.S. As I understand it, the -datapath_only flag of the set_max_delay command omits the clock skew from the slack computation... forgive my ignorance, but wouldn't this result in a less accurate/realistic analysis of the path?

 

Cheers

 

Scholar drjohnsmith
8,549 Views
Registered: 07-09-2009

Re: ASYNC_REG and clock domain crossing

I don't know if it's possible, or whether it would ever happen, but would it not be nice if Xilinx had an IP block that ensured that two registers would be in the same unit for synchronisation?

Then we could do away with the need for async_reg, Tcl et al.; just instantiate the IP and be done with it.

I come to this from many decades back; the way we had to force two registers together was different in ISE. Then we find over the years that Vivado improves, and the approved methods of doing this sort of thing change.

But anyway, this post is the best set of descriptions I have seen of what the current approved method is.

Well done.

 

 

 

Adventurer
8,536 Views
Registered: 07-24-2016

Re: ASYNC_REG and clock domain crossing

@drjohnsmith

 


@drjohnsmith wrote:

I don't know if it's possible, or whether it would ever happen, but would it not be nice if Xilinx had an IP block that ensured that two registers would be in the same unit for synchronisation?

 


It would indeed be awesome, but this is more fun ;)

 


@drjohnsmith wrote:

 

But anyway, this post is the best set of descriptions I have seen of what the current approved method is.

Well done.


Hey, thanks. I just want to gather as much information as possible to make our lives easier...

 

But anyway, the question I asked in my last post still stands (and anyone's input is appreciated... I am also calling out to @avrumw, whose posts have been very helpful): the ASYNC_REG attribute seems to physically constrain registers so that the skew/signal propagation delay between them is minimal (correct me if I am wrong here). So if I apply the attribute to the FDRE_reg_input_CDCC of the circuitry I attached HERE, wouldn't this optimize the placement of this particular FF relative to the FF/LUT that produces the signal going into the component's data_in port, and relative to the first FDRE of the synchronization chain? And if this applies to all input registers of the component, could the component be used to synchronize an entire parallel data bus (e.g. a counter)?

 

Cheers

Historian
5,548 Views
Registered: 01-23-2009

Re: ASYNC_REG and clock domain crossing

would it not be nice if Xilinx had an IP block that ensured that two registers would be in the same unit for synchronisation?

 

So, first, in UltraScale there is the new HARD_SYNC cell. This is a hard block in the device that contains 2 or 3 (selectable) back-to-back, specially designed metastability reduction flip-flops. If you instantiate this cell, you don't need the ASYNC_REG property. The disadvantage of these is that there are a finite number of them, and they are in the block RAM column, which can make them hard to get to/get back from.

 

Second, there is nothing preventing you from creating your own RTL module/entity that can be customized with parameters/generics to implement back to back flip-flops with variable width and depth. If you find a way to get the ASYNC_REG onto these flip-flops (and again, I am fairly sure that the embedded RTL attribute works on inferred flip-flops in all recent versions), then you can simply instantiate this module/entity where you need it. This is, in fact, what I do in most of my designs and I have never had a problem with the ASYNC_REG (but I use Verilog and most of the issues I have seen in the forums have been with VHDL).
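
A rough sketch of what such a reusable entity might look like in VHDL, with width and depth as generics. The entity name sync_bits, the generic names WIDTH and STAGES, and the port names are all hypothetical. It also assumes that an ASYNC_REG attribute placed on the array signal reaches every inferred flip-flop, which should be confirmed with get_property on the implemented design. Each bit is synchronized independently, so this is only suitable for unrelated bits or a Gray-coded bus.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity sync_bits is
generic(
    WIDTH  : positive := 1;    -- number of independent bits to synchronize
    STAGES : positive := 2     -- flip-flops per bit (2 or more expected)
    );
port(
    clk_dst  : in  std_logic;                                -- destination clock
    async_in : in  std_logic_vector(WIDTH - 1 downto 0);     -- from source-domain registers
    sync_out : out std_logic_vector(WIDTH - 1 downto 0)      -- synchronized to clk_dst
    );
end sync_bits;

architecture RTL of sync_bits is

    -- chain(0) is the first stage, chain(STAGES - 1) the last
    type chain_t is array (0 to STAGES - 1) of std_logic_vector(WIDTH - 1 downto 0);
    signal chain : chain_t := (others => (others => '0'));

    -- Assumption: the attribute on the array signal is applied to all
    -- registers inferred from it.
    attribute ASYNC_REG          : string;
    attribute ASYNC_REG of chain : signal is "TRUE";

begin

    sync: process (clk_dst)
    begin
        if rising_edge(clk_dst) then
            chain(0) <= async_in;
            for s in 1 to STAGES - 1 loop
                chain(s) <= chain(s - 1);
            end loop;
        end if;
    end process sync;

    sync_out <= chain(STAGES - 1);

end RTL;
```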

 

Avrum

Historian
5,546 Views
Registered: 01-23-2009

Re: ASYNC_REG and clock domain crossing

But in any case, I don't think there is any real difference in the final implemented design. Correct?

 

The final result will be the same. But inference is always preferred over instantiation - yes, it is more compact, but it is also portable - the FDRE is specific to most modern Xilinx FPGAs, but not to other technologies (like ASIC, or...), but even if you plan on remaining with Xilinx forever, there is nothing that ensures that the flip-flop will always be called FDRE - some day Xilinx may need an FDRE_E1 (or something else)...

 

By the way, I recently stumbled upon this paper which covers these issues in great detail. 

 

Yes, this paper is definitely worth a read...

 

Regarding the CDC of multi-bit signals that are part of a bus (e.g. a counter), I would strongly prefer FIFOs

 

Clock crossing FIFOs are certainly the "biggest hammer" in our toolbox for bringing data between clock domains - for some combinations of clock rates and data throughputs they are the only ones that can be used. But

  a) clock crossing FIFOs are expensive. The hard FIFOs (BRAM based) in the FPGA are limited in number and can be far from other logic (in the block RAM column). Even if you use distributed RAM based FIFOs, they consume a fair number of cells

  b) Aside from the hard FIFOs (FIFO36 and FIFO18), all other FIFOs still require fabric-based CDCs with constraints. Generating the full/empty signals of the FIFOs requires bringing the address from the write clock domain to the read clock domain (and vice versa) - these crossings are done using fabric-based techniques (like Gray coding) and need synchronizers with constraints...

 

Particularly for the first reason, there is often a desire to use other CDCs...

 

Can the same thing be achieved, with the code I attached in my previous post, if I add the following line in my .xdc file? [putting ASYNC_REG on the last FF on the source domain]

 

I would not recommend this. The ASYNC_REG property does lots of things - it doesn't just influence the placer. It also

  - marks the cell as DONT_TOUCH

  - informs back-annotated simulations not to go to X on setup/hold violations on this cell

  - is used to help CDC analysis commands like report_cdc and report_synchronizer_mtbf

 

Some/most/all of these would not be appropriate for the source flip-flop. And, as I mentioned before, I am not certain what this will do to the placer. Since these two FFs are not on the same clock, they cannot be packed into the same slice (since they don't share the same control set). So, will the placer put them in adjacent slices? Or will the placer simply ignore it, since it can tell that this makes no sense according to the definition of ASYNC_REG?

 

And that's really the important point - you are not supposed to do this - the purpose of the ASYNC_REG property is pretty clearly defined, and it does not include this...

 

As I understand it, the -datapath_only flag of the set_max_delay command omits the clock skew from the slack computation... forgive my ignorance, but wouldn't this result in a less accurate/realistic analysis of the path?

 

You have to understand how static timing in Vivado is done. In Vivado, we attach clocks to specific pins or ports (most user-defined clocks should be attached to ports of the design). From there, they propagate forward. Static timing analysis then includes three parts: the "Source Clock Delay (SCD)" (the propagation delay from the source clock attachment point to the source flip-flop), the "Datapath Delay (DPD)" (the delay from the source flip-flop to the destination flip-flop) and the "Destination Clock Delay (DCD)" (the delay from the destination clock's attachment point to the destination flip-flop).

 

In Vivado, all clocks are related by default. So, even if the two clocks are attached to different clock inputs (and share nothing in common), the analysis includes all three components. This is true for both paths with no exceptions on them as well as paths that are covered by a set_max_delay (without the -datapath_only).

 

So, if, for example, your source clock goes through an MMCM to cancel the clock insertion delay (so the SCD is close to 0), whereas the destination clock goes through (as an extreme example, say...) a BUFR and a BUFG (so the DCD is on the order of 6ns), then doing a

set_max_delay 8 -from... -to... (without the -datapath_only)

will end up timing the path with the requirement that the datapath delay is less than 8 - SCD + DCD (with some fudge factors for jitter and the like), which effectively constrains the datapath to 14ns, which is WAY bigger than the 8ns you asked for.

 

With the -datapath_only flag, it ignores the SCD and DCD, and only verifies that the DPD is less than the requirement (in this case 8ns).

 

Avrum

Scholar drjohnsmith
5,537 Views
Registered: 07-09-2009

Re: ASYNC_REG and clock domain crossing

Making one's own 'IP' is, like you, what I do.

But it is in VHDL, as that is what my clients specify, and it seems far from stable. Vivado seems to find all sorts of ways of breaking it.

Hence my desire for Xilinx to take this head on and provide the IP; then they can break it and fix it.

As for UltraScale: my customers are far from that level of chips, but it is great to see Xilinx has provided something. Now, if you can instantiate this new dual register, it would be lovely if the same IP would give us the synchronising registers in, say, a Kintex, and use the built-in feature in UltraScale.

 

 
