fanxitian
Contributor
1,232 Views
Registered: ‎12-20-2011

How to constrain data crossing clocks

Hi, I want to multiply two signals (a and b) that come from two different clocks, clk_100M and clk_200M. Both clocks are generated by the same DCM without any phase shift. Signal a belongs to clk_100M and signal b belongs to clk_200M. My code is as follows:

//===========

(*ASYNC_REG=="true"*) reg  [7:0] a_dly;

reg [15:0] result;

always @(posedge clk_200M )  

     a_dly <= a;

always @(posedge clk_200M)

     result <= a_dly * b;

//=============

My question is how to constrain the path from a to a_dly to achieve timing closure. Should I use set_multicycle_path? Any suggestions? Thanks in advance.

 

 

1 Solution

Accepted Solutions
avrumw
Guide
1,052 Views
Registered: ‎01-23-2009

OK - first...

Let us first re-establish that these clocks are synchronous - they both come from the same MMCM and have the same type of clock buffer, and (in the case of UltraScale/UltraScale+) they are in the same CLOCK_DELAY_GROUP.

If that is the case, then these are synchronous domains. For synchronous domains you do not need any exceptions, and you should NOT use the ASYNC_REG property - it is intended specifically for synchronization chains going between asynchronous clock domains. Here this property will mess things up - including the ability to pack registers into the DSP48 cells (FFs with ASYNC_REG must be put in fabric flip-flops).
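
As a rough illustration of the point above, here is a minimal sketch of the original multiply with the ASYNC_REG attribute simply dropped. The module name and port list are made up for the example, and it assumes clk_100M and clk_200M really are synchronous outputs of the same MMCM, so no timing exception is attached to the a to a_dly path.

```verilog
// Sketch only: module and port names are hypothetical.
// Assumes clk_100M and clk_200M are synchronous (same MMCM, same buffer type).
module mult_sync_cross (
    input  wire        clk_200M,
    input  wire [7:0]  a,      // driven by a register clocked on clk_100M
    input  wire [7:0]  b,      // driven by a register clocked on clk_200M
    output reg  [15:0] result
);
    // Plain register, no ASYNC_REG: the tools stay free to pack it
    // into a DSP48 input register if they choose to.
    reg [7:0] a_dly;

    // a is launched on a clk_100M edge and captured on the next clk_200M edge.
    always @(posedge clk_200M)
        a_dly <= a;

    // 8 x 8 multiply; the 16-bit target keeps the full product.
    always @(posedge clk_200M)
        result <= a_dly * b;
endmodule
```

With only the clocks constrained, the tool should time the a to a_dly path as an ordinary synchronous path; if it fails, that is a real failure rather than something to waive.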

@fanxitian wrote:

Obviously, signal a will be sampled on Edge_1 rather than Edge_2. In this case, how do I add the timing exception in the constraints file?

Using the rules I outlined above, since a changes on the rising edge of clk_100 (let's call that Edge_0), it will be sampled by default on the next edge of the destination clock, which is Edge_1. This is the normal behavior and doesn't require any exceptions - the requirement will result in a 5ns path.

If, however, you wanted to do the opposite - force the sampling on Edge_2 (which would be done with the "select" asserted on the opposite periods of clk_200M) - then you would need an exception. In this case, you have designed a circuit where the data is not sampled on Edge_1, but is sampled only on Edge_2. Without an exception, the tool will still time this as a 5ns path, even though there is now 10ns for this propagation. So here you would declare a multicycle path:

set_multicycle_path 2 -end -from [get_cells a_reg] -to [get_cells a_dly_reg]
set_multicycle_path 1 -hold -end -from [get_cells a_reg] -to [get_cells a_dly_reg]

I have put the "-end" flag here to be explicit (it is the default, so is technically unnecessary) - this specifies that we are modifying the end of the static timing path - the capture flip-flop. If you were going the other way - holding data on the 200MHz clock for two clock periods so that it only changed on Edge_0 and Edge_2, and then capturing it on the 100MHz clock - then you would have to do the following (assuming b is on the 200MHz clock and b_dly is on the 100MHz clock):

set_multicycle_path 2 -start -from [get_cells b_reg] -to [get_cells b_dly_reg]
set_multicycle_path 1 -hold -start -from [get_cells b_reg] -to [get_cells b_dly_reg]

This indicates that you are pulling back the start of the static timing path by one clock, which is different than pushing forward the end of the static timing path. Without the -start, you would actually end up giving this path 15ns for the propagation (Edge_1 would normally be captured on the 100MHz clock at Edge_2, but by pushing this forward, you would end up with it being captured on the next 100MHz clock, which is Edge_4 - 15ns after Edge_1).

By the way, the generation of this "select" signal can be done "the right way" and "the wrong way" - the wrong way is trying to sample clk_100M on a flip-flop clocked on clk_200M. Take a look at this post for the right way to establish the phase relationship between two synchronous clocks.
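
The linked post is not reproduced here, so the following is only one commonly used approach and may differ from it: toggle a flip-flop on clk_100M, sample that toggle (a data signal, not the clock itself) on clk_200M, and compare it with a one-cycle-delayed copy. The module and signal names (capture_on_edge2, tgl_100, tgl_200, tgl_200_d) are made up for this sketch, and it assumes the two clocks are synchronous and that register initial values are honored by the tools.

```verilog
// Sketch only: names are hypothetical; assumes synchronous clk_100M/clk_200M.
module capture_on_edge2 (
    input  wire       clk_100M,
    input  wire       clk_200M,
    input  wire [7:0] a,        // driven by a register clocked on clk_100M
    output reg  [7:0] a_dly
);
    reg tgl_100   = 1'b0;  // flips on every clk_100M rising edge
    reg tgl_200   = 1'b0;  // tgl_100 sampled on clk_200M
    reg tgl_200_d = 1'b0;  // tgl_200 delayed by one clk_200M cycle

    always @(posedge clk_100M)
        tgl_100 <= ~tgl_100;

    always @(posedge clk_200M) begin
        tgl_200   <= tgl_100;
        tgl_200_d <= tgl_200;
    end

    // tgl_200 changes on the clk_200M edge that falls mid-way through the
    // clk_100M period (Edge_1), so the two copies differ only during the
    // cycle that ends on the edge coincident with clk_100M (Edge_2).
    wire select = tgl_200 ^ tgl_200_d;

    // Capture a only on Edge_2, i.e. 10ns after it was launched.
    always @(posedge clk_200M)
        if (select) a_dly <= a;
endmodule
```

With select generated like this, the a to a_dly path is only ever captured on Edge_2, which is the situation the -end multicycle exceptions earlier in this post describe; the tgl_100 to tgl_200 path itself remains an ordinary 5ns synchronous path.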

Avrum


13 Replies
hgleamon1
Teacher
1,219 Views
Registered: ‎11-14-2011

Are you having timing problems with this method?

If the two clocks are synchronous, off the top of my head, I can't see much wrong with what you have.

 

----------
"That which we must learn to do, we learn by doing." - Aristotle
drjohnsmith
Teacher
1,209 Views
Registered: ‎07-09-2009

did you mean one of the clocks to be 100 in your example ?

 

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
fanxitian
Contributor
1,169 Views
Registered: ‎12-20-2011
Yes, the period of clk_100M is twice that of clk_200M. Since I have many such multiplications, there are many setup and hold time violations in my design.
fanxitian
Contributor
1,166 Views
Registered: ‎12-20-2011
The problem is that setup and hold time violations are detected on the path from signal a to signal a_dly. Signal a belongs to clk_100M (whose period is twice that of clk_200M), while signal a_dly belongs to clk_200M. Should I set a "multiple cycle path" constraint for that path? For example:
set_multicycle_path -setup -start -from [get_pins {a[*]/C}] -to [get_pins {a_dly[*]/C}] 2
set_multicycle_path -hold -from [get_pins {a[*]/C}] -to [get_pins {a_dly[*]/C}] 1
avrumw
Guide
1,148 Views
Registered: ‎01-23-2009

@fanxitian wrote:

Should I set a "multiple cycle path" constraint for that path? For example:

No!

If the two clocks, clk_100 and clk_200 come from the same MMCM (there are no DCMs in any technology that can be used in Vivado - see this post on the difference between DCM, MMCM and PLL), and they each go through the same type of buffer (i.e. each goes through a BUFG - there is one additional requirement in UltraScale/UltraScale+), then the tool already knows how to figure out the requirements on this path. If they are failing timing, these are real failures.

In UltraScale/UltraScale+ the clocking structure is different - it is no longer true that the delays through all clock networks are equal. For these architectures, the two clocks (clk_100 and clk_200) must be placed in the same CLOCK_DELAY_GROUP - see this post on CLOCK_DELAY_GROUPs.

Avrum

hgleamon1
Teacher
1,128 Views
Registered: ‎11-14-2011

@avrumw As OP has not revealed what device he is targeting (although he did write "DCM"), could you explain (or link a previous explanation that you have probably written in another thread) what the method is for non-UltraScale architectures?

I am no Vivado or Series 7 expert. However, my experience from ISE and older architectures is that this kind of logic (inferring multipliers) can depend on whether the multiplication uses DSP slices or not. In some cases, adding pipeline stages on the inputs and outputs can help alleviate timing issues.

Or am I barking up the wrong tree?

----------
"That which we must learn to do, we learn by doing." - Aristotle
fanxitian
Contributor
1,125 Views
Registered: ‎12-20-2011

@avrumw  Thanks. Now I have another, similar problem. My design intent is simplified in the following drawing and description.

Signal a belongs to clk_100M, while signal select belongs to clk_200M. We want to sample a with clk_200M when the select signal is asserted high.

//======

(*ASYNC_REG=="true"*) reg  [7:0] a_dly;

always @(posedge clk_200M)

     if(select == 1) a_dly <= a;

//======

Obviously, signal a will be sampled on Edge_1 rather than Edge_2. In this case, how do I add the timing exception in the constraints file?

clk_100M and clk_200M are generated from the same MMCM, and the period of clk_100M is twice that of clk_200M.

[Attached image: 20190311152027.png]
hgleamon1
Teacher
1,117 Views
Registered: ‎11-14-2011

 

In which clock domain is "select" generated? 

----------
"That which we must learn to do, we learn by doing." - Aristotle
fanxitian
Contributor
1,113 Views
Registered: ‎12-20-2011
Signal "select" is generated in the clk_200M domain.
hgleamon1
Teacher
1,092 Views
Registered: ‎11-14-2011

This subject interests me because, although I have built up quite some experience with ISE and .ucf constraints, I am pretty much a beginner with Vivado and series 7 architectures. Even more so with .xdc constraints.

 

Out of interest, I created a very small design that generates some data for a and b and multiplies them together using the clock frequencies and domains that you have indicated in your design. I ran the design through Vivado with the most basic of clock constraints and no placement or IO constraints.

On a side note, I did this in ISE targeting a Spartan 6 first. ISE infers a DSP block and recommended more pipelining on the multiplier output. When I changed to Vivado and targeted a Series 7 device, no DSP was inferred and the design uses only FFs and LUTs (could be a synthesis setting, however).

 

I get no timing errors. This could be that my design is so simple that it is easy for the tools to place and route to meet the clock constraint.

Do you have a "busy" design or some other placement constraints?

I apologise if I am missing something but, to me, there should not be much of an issue, timing wise, with doing what you are trying to do.

 

----------
"That which we must learn to do, we learn by doing." - Aristotle
josephsamson
Explorer
1,082 Views
Registered: ‎10-05-2010

@fanxitian wrote:

(*ASYNC_REG=="true"*) reg  [7:0] a_dly;

 

Does your verilog have the == ? The correct syntax is (* ASYNC_REG = "true" *)
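
For reference, a minimal sketch of that attribute syntax in the kind of circuit it is meant for, a two-flop synchronizer between genuinely asynchronous domains. The module and signal names are made up for the example; it is not the circuit from this thread, where the clocks are synchronous.

```verilog
// Sketch only: names are hypothetical. Use this style only when the
// source of d_async is asynchronous to clk_dst.
module sync_2ff (
    input  wire clk_dst,
    input  wire d_async,   // single-bit signal from an unrelated clock domain
    output wire q_sync
);
    // Single "=" inside the attribute; both synchronizer flops carry it.
    (* ASYNC_REG = "TRUE" *) reg [1:0] sync_ff;

    always @(posedge clk_dst)
        sync_ff <= {sync_ff[0], d_async};

    assign q_sync = sync_ff[1];
endmodule
```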

 

---

Joe Samson

 


 

fanxitian
Contributor
1,037 Views
Registered: ‎12-20-2011

@avrumw Thanks very much. You solved my puzzle.
