- Re: What do trce_dly_max and trce_dly_min mean whe...

efpkopin

02-19-2019 02:15 PM - edited 02-19-2019 02:16 PM

I'm trying to understand how to define output delay constraints and I have myself tied up in knots trying to understand these concepts. Let me illustrate using an image that @avrumw has posted previously (see image below).

Assume I have an FPGA with two output signals: a clock signal, 'SysClk' and a data signal 'DataOut'. These signals travel across the PCB (and over some connection cables) eventually reaching an external device. For the following discussion, let me identify two particular instances in time for these two signals:

- Let **SysClk_re** be the rising edge of the SysClk signal

- Let **DataOut_A** be the time corresponding to the earliest 'established' point of each DataOut value. Said another way, this is the time corresponding to the '1ns' dashed line (the 'setup time') in the figure below.

Obviously, these two instances repeat every 10 ns in each signal. Given these definitions, let me ask some specific questions in the context of setting the output delay constraints:

1. Are we supposed to assume that **SysClk_re** and **DataOut_A** leave the FPGA simultaneously?

2. Furthermore, is the assumption that **SysClk_re** reaches the external device faster than **DataOut_A**?

3. And that:

3a. trce_dly_max is the maximum possible amount of time **DataOut_A** will get there after **SysClk_re**?

3b. trce_dly_min is the minimum possible amount of time **DataOut_A** will get there after **SysClk_re**?

If not, can anyone clarify these terms?

Accepted Solutions

markg@prosensing.com

02-19-2019 06:53 PM - edited 02-19-2019 08:16 PM

What you’ve described is source-synchronous single-data-rate (SDR) output from the FPGA. One way to think about the problem is that this is simply the transfer of data from a register located inside the FPGA to a register located outside the FPGA.

We can tackle analysis of the problem by first drawing a picture of the essential circuits and labeling the **timing-arcs**.

In the picture above, data is being transferred from the register, DAT1_reg, inside the FPGA to the external register, EX1_reg. Throughout the diagram, I have drawn timing-arcs that identify signal propagation delays. For example, t_mcd is the delay of the clock signal, CLK, as it travels from the MMCM clock generator to the clock-pin of the register, DAT1_reg. Another example, t_cod, is the delay of the data-signal as it travels from the data-pin of DAT1_reg to the output (Q-pin) of DAT1_reg (also called the clock-to-output time of DAT1_reg).

Next, we write an expression for the **data-arrival-time** at the external register. Terms in this expression come from the following sequence of events:

a) EDGE1 of CLK is launched from the MMCM at time, t_mle (EDGE1 is called the **launch-edge** of CLK)

b) EDGE1 of CLK travels through the FPGA clock tree, taking t_mcd seconds to reach DAT1_reg/C

c) DAT1_reg sees EDGE1 of CLK and takes t_cod seconds to transfer data from DAT1_reg/D to DAT1_reg/Q

d) The data travels from DAT1_reg/Q and through OBUF, taking (t_bfd + t_pxd) seconds to reach EX1_reg/D (t_bfd is the delay through the OBUF and t_pxd is the delay of the path/trace outside the FPGA over which the data propagates)

Adding up these times, gives the desired expression:

data-arrival-time = (t_mle + t_mcd + t_cod + t_bfd + t_pxd)    (1)

For the rest of this analysis, we will focus on what is needed for **setup timing-analysis**. That is, we will write an expression for **data-required-time** from the following sequence of events:

a) EDGE2 of CLK is launched from the MMCM at time, t_mce (EDGE2 is called the **capture-edge** of CLK, which is normally that edge of CLK that immediately follows EDGE1)

b) EDGE2 of CLK travels through the FPGA clock tree, taking t_mcc seconds to reach ODDR1/C (the ODDR component is often used to forward a clock out of the FPGA).

c) ODDR1 sees EDGE2 of CLK and it takes t_coc seconds to transfer the edge from either ODDR1/D1 or ODDR1/D2 to ODDR1/Q

d) The edge from ODDR1/Q travels through OBUF, taking (t_bfc + t_pxc) seconds to reach EX1_reg/C (t_pxc is the delay of the path/trace outside the FPGA over which the clock propagates)

Summing these timing-arcs gives (t_mce + t_mcc + t_coc + t_bfc + t_pxc), which is the time when the capture-edge of the clock arrives at EX1_reg. Satisfying the setup requirement for EX1_reg means that the data-arrival time must come t_sux seconds before the capture-edge of the clock arrives at EX1_reg (t_sux is the setup time requirement for EX1_reg). Thus, the desired expression is:

data-required-time(setup) = (t_mce + t_mcc + t_coc + t_bfc + t_pxc) - t_sux    (2)

Finally, we can write the setup timing-analysis requirement that data-arrival-time must be less than the data-required-time – or (data-required-time) minus (data-arrival-time) must be greater than zero. From equations (1) and (2) this requirement can be written as:

(t_mce + t_mcc + t_coc + t_bfc + t_pxc) - t_sux - (t_mle + t_mcd + t_cod + t_bfd + t_pxd) > 0    (3)

All the timing-arcs shown in (3) that are inside the FPGA are known to the Xilinx timing-analysis tools. It is our job to specify the timing-arcs that lie outside the FPGA and are not (yet) known to the Xilinx tools. The timing-arcs from (3) that are outside the FPGA are (t_pxc - t_sux - t_pxd) = ((t_pxc - t_pxd) - t_sux). Note that the negative of this value can be written as (EX1_reg setup time) + (external data-path delay minus external clock-path delay). It is this value that gets used in one of the needed constraints for an SDR interface. That is,

set_output_delay -clock $fwdclk -max [expr $t_pxd - $t_pxc + $t_sux] [get_ports $output_port]

Where $output_port is the name of the FPGA port where the data exits the FPGA and $fwdclk is the name of the clock that is being forwarded to EX1_reg. Defining $fwdclk is done using a create_generated_clock constraint, which is another story. Just ask if you need help with the create_generated_clock constraint. Finally, the other needed set_output_delay constraint comes from an analysis like the one above, except that we consider the hold time, t_hdx, of EX1_reg. In short, the result of this second analysis gives the second needed constraint as:

set_output_delay -clock $fwdclk -min [expr $t_pxd - $t_pxc - $t_hdx] [get_ports $output_port]
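
For readers who want to see where the two values come from, here is a sketch of the regrouping, not something taken from the original figure. It assumes T = t_mce - t_mle is one clock period, that t_bfd is the OBUF delay on the data path from step d), and that the hold check is made against the same clock edge that launched the data (the usual SDR case). The first line is the setup requirement (3) with the external terms collected; the second is the corresponding hold requirement.

```latex
\begin{align*}
% Setup: requirement (3) with internal and external arcs grouped
T + \underbrace{(t_{mcc} + t_{coc} + t_{bfc})}_{\text{clock path inside FPGA}}
  - \underbrace{(t_{mcd} + t_{cod} + t_{bfd})}_{\text{data path inside FPGA}}
  - \underbrace{(t_{pxd} - t_{pxc} + t_{sux})}_{\texttt{-max}\text{ value}} &> 0 \\[4pt]
% Hold: data launched by EDGE1 must arrive t_hdx after EDGE1 reaches EX1_reg
\underbrace{(t_{mcd} + t_{cod} + t_{bfd})}_{\text{data path inside FPGA}}
  - \underbrace{(t_{mcc} + t_{coc} + t_{bfc})}_{\text{clock path inside FPGA}}
  + \underbrace{(t_{pxd} - t_{pxc} - t_{hdx})}_{\texttt{-min}\text{ value}} &> 0
\end{align*}
```

In both lines the tools supply the two internal groups, and the third group is exactly the expression passed to set_output_delay.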

So, at long last we come to your question about trce_dly_max and trce_dly_min. These quantities refer to (t_pxd - t_pxc). That is, we usually know this trace-delay difference only to within some error (e.g. 0.02ns +/- 0.005ns). So, for this example, trce_dly_max=0.025ns and trce_dly_min=0.015ns.
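
As a rough illustration, the example numbers above could be plugged into the two constraints as follows. The setup and hold times are made-up placeholders (they must come from the external device's datasheet), and $fwdclk and $output_port are assumed to be defined already, as in the constraints above.

```tcl
# Sketch only. All times in ns.
# trce_dly_max / trce_dly_min are the 0.02 +/- 0.005 example from the text.
set trce_dly_max 0.025
set trce_dly_min 0.015

# HYPOTHETICAL datasheet values for EX1_reg, for illustration only.
set t_sux 1.000
set t_hdx 0.500

# -max uses the largest trace-delay difference plus the setup time;
# -min uses the smallest trace-delay difference minus the hold time.
set_output_delay -clock $fwdclk -max [expr $trce_dly_max + $t_sux] [get_ports $output_port]
set_output_delay -clock $fwdclk -min [expr $trce_dly_min - $t_hdx] [get_ports $output_port]
```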

Cheers,

Mark

efpkopin

02-20-2019 04:06 AM

Mark,

That is an *awesome* explanation that really clears it up. Thank you! Couple of follow up questions:

- To calculate the trace delay on our PCB, I assume I use the length of the trace (w/ tolerance) and assume a propagation speed. What is a good speed to use? I have seen 15 cm/ns. Is that valid or is there another way to determine it?

- Ok - that covers the source-synchronous SDR case. We *actually* have a source-synchronous DDR system (wanted to keep it simple to start with). Are there any nuances involved in applying the same approach to constrain the falling edge in a DDR set-up?

Best,

-Eric

markg@prosensing.com

02-20-2019 06:01 AM

Hi Eric,

You’re very welcome.

> What is a good speed to use? I have seen 15 cm/ns.

Yes, that’s a good number. For writing the set_output_delay constraints we need the trace-delay difference, (t_pxd - t_pxc), and not the actual values of t_pxd and t_pxc. This makes your job a lot easier because board layout software is often used to make t_pxd equal to t_pxc (i.e. to make the difference equal to zero).
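
A minimal sketch of turning trace lengths into the trace-delay difference, assuming the 15 cm/ns figure discussed above. The lengths and the length tolerance are invented placeholders, and the variable names are only chosen for this illustration.

```tcl
# Sketch only. Lengths in cm, times in ns.
# Propagation speed from the discussion above; the real value depends on the board.
set v_cm_per_ns 15.0

# HYPOTHETICAL routed lengths and per-trace length tolerance.
set len_dat_cm 6.00
set len_clk_cm 5.70
set len_tol_cm 0.03

# Nominal difference (data delay minus clock delay).
set tdif_nom [expr ($len_dat_cm - $len_clk_cm) / $v_cm_per_ns]

# Worst case assumes both traces are off by the full tolerance in opposite directions.
set tdif_tol [expr 2.0 * $len_tol_cm / $v_cm_per_ns]

set tdif_max [expr $tdif_nom + $tdif_tol]
set tdif_min [expr $tdif_nom - $tdif_tol]
```

With these placeholder lengths the nominal difference is roughly 0.02 ns. If the layout tool length-matches the two traces, tdif_nom is zero and only the tolerance term remains.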

> We *actually* have a source-synchronous DDR system (wanted to keep it simple to start with).

Although the same approach (data-arrival-time, data-required-time) is used, the DDR interface is more complicated than the SDR interface because you effectively have two data paths between the FPGA and the external register. These two data paths make it necessary to write four set_output_delay constraints. Search this Forum for “source synchronous DDR” posts written by Avrum. As you know, he is the master at this kind of thing.
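
For orientation, the four constraints usually take a shape like the sketch below: one -max/-min pair relative to the rising edge of the forwarded clock and a second pair relative to its falling edge. This is an outline under assumptions, not a checked DDR solution. The variables tsu_r, thd_r, tsu_f and thd_f (setup and hold of the external device at its rising and falling clock edges) are hypothetical names with placeholder values, and $fwdclk and $output_port are assumed to be defined as in the SDR case.

```tcl
# Sketch only. All times in ns; every number is a placeholder.
set trce_dly_max 0.025
set trce_dly_min 0.015
set tsu_r 1.000
set thd_r 0.500
set tsu_f 1.000
set thd_f 0.500

# Data captured on the rising edge of the forwarded clock.
set_output_delay -clock $fwdclk -max [expr $trce_dly_max + $tsu_r] [get_ports $output_port]
set_output_delay -clock $fwdclk -min [expr $trce_dly_min - $thd_r] [get_ports $output_port]

# Data captured on the falling edge: -clock_fall references the falling edge,
# and -add_delay keeps the rising-edge constraints instead of overwriting them.
set_output_delay -clock $fwdclk -max [expr $trce_dly_max + $tsu_f] [get_ports $output_port] -clock_fall -add_delay
set_output_delay -clock $fwdclk -min [expr $trce_dly_min - $thd_f] [get_ports $output_port] -clock_fall -add_delay
```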

Be sure to read UG903 for details on writing the create_generated_clock constraint that is needed for these interfaces.

Mark

markg@prosensing.com

07-12-2019 11:26 AM

I have been asked to show the Vivado implementation of a Single-Data-Rate (SDR) Source-Synchronous-Output (SSO) interface along with the needed constraints. In the constraints, ?.??? marks a number that the user must enter, since it is design-specific.

    # Source Synchronous Output, Single Data Rate (SDR), all times used below in nanoseconds
    # -------
    set tsu ?.???;                    #destination device setup time
    set thd ?.???;                    #destination device hold time
    set tdif_max ?.???;               #max trace delay diff (data-delay minus clock-delay)
    set tdif_min ?.???;               #min trace delay diff (data-delay minus clock-delay)
    set dat_port SSO1_DAT;            #name of FPGA port(s) used to forward the data
    set clk_port SSO1_CLK;            #name of FPGA port used to forward the clock
    set fclk_nam FCLK1;               #name of forwarded-clock assigned by create_generated_clock
    set fclk_src [get_pins ODDR1/C];  #source of the forwarded-clock

    # Describe the forwarded clock
    create_generated_clock -name $fclk_nam -source $fclk_src -divide_by 1 \
        -invert [get_ports $clk_port ]

    # Output delay constraints
    set_output_delay -clock $fclk_nam -max [expr $tdif_max + $tsu ] [get_ports $dat_port ]
    set_output_delay -clock $fclk_nam -min [expr $tdif_min - $thd ] [get_ports $dat_port ]

    # Sometimes the following constraint is needed by MMCM to correctly interpret phase shifts
    set_property PHASESHIFT_MODE LATENCY [get_cells MMCM1/inst/mmcm_adv_inst]
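
One possible way to check that the constraints above were picked up is to query the design from the Vivado Tcl console after synthesis or implementation. This is only a suggested check, not part of the constraint set; it reuses the port name SSO1_DAT from the listing.

```tcl
# Run in the Vivado Tcl console on an open synthesized or implemented design
# (these are report commands, not lines for the XDC file).

# The forwarded clock FCLK1 should appear among the generated clocks.
report_clocks

# Worst setup (max-delay) path ending at the data output port.
report_timing -to [get_ports SSO1_DAT] -delay_type max -max_paths 1

# Worst hold (min-delay) path ending at the data output port.
report_timing -to [get_ports SSO1_DAT] -delay_type min -max_paths 1
```

In each report, the output delay line should show the value computed from the -max or -min expression, which makes a sign or unit mistake easy to spot.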