12-31-2020 09:06 AM
Short version:
DAC chips have glitchy outputs, especially at max clock rate. What tests should I do, and what are good practices for interfacing with these chips?
Longer version:
I have a custom baseboard with multiple MAX5875 16-bit 2-channel 200 MSPS DACs. I'm using a Mercury+ KX2 FPGA module with a Kintex-7 FPGA.
I'm programming the DACs in interleaved mode, meaning the data for both channels is sent over the same 16 data lines. A 17th data line, select, determines which DAC channel receives the new data. Figure 4b in the data sheet shows the timing diagram for interleaved mode.
I'm seeing glitches on the DAC outputs when clocking at 200 MHz. I also see glitches at lower speeds, possibly less frequently. The glitches seem to be short (on the order of the clock period) and random (i.e. it's not the same bit glitching each time).
As a sanity check I made a program that programs only 1 bit high at a time independently for both DAC channels. For example I can send 0010_0000_0000_0000 to the IDAC, and 0000_0010_0000_0000 to the QDAC. I scanned through all combinations, and did not see the output glitches that I see with an arbitrary output. To me, this test at least indicates that all the data lines are connected properly and are being set correctly by the FPGA, at least in a simple case.
I also tried sending a phase shifted 200MHz clock out to the DACs, in the hope that I could find a window that had no glitches, and was reproducible. This approach did not work either. I can find edges where the data-clock phase is definitely not good, but I couldn't find a phase that worked without any glitches.
I've also attached my modules for programming the DACs.
Could I get some feedback and advice for an approach to getting these chips working correctly? Are there any standard practices for programming this sort of DAC chip?
I'm happy to provide more information about the board or program. Thank you all for the assistance.
12-31-2020 01:27 PM
First, what are your timing constraints? This interface looks simple if you just look at the timing diagram which shows tS and tH symmetrically around the rising edge of the clock. However, if you look at the datasheet numbers (which show tSETUP and tHOLD, which I assume to be the same) they are very NOT symmetrical; tSETUP=-0.6, tHOLD=2.1. This means the valid data must be there 0.6ns after the rising edge of the clock and remain until 2.1ns after the rising edge of the clock; which puts the valid window in the middle-ish of the high phase of the clock. This is not an ideal place for it.
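As a rough illustration of the arithmetic above, the sketch below turns the quoted tSETUP and tHOLD figures into a required data-valid window relative to the rising clock edge. The values are the ones quoted in this post; the function and constant names are made up for the example, and times are kept in integer picoseconds to avoid floating-point rounding.

```python
# Timing values quoted in the post above, in picoseconds.
T_SETUP_PS = -600        # negative setup: data may arrive up to 0.6 ns AFTER the edge
T_HOLD_PS = 2100         # data must stay stable until 2.1 ns after the edge
CLOCK_PERIOD_PS = 5000   # 200 MHz

def required_window(t_setup_ps, t_hold_ps):
    """Return (start, end) of the required data-valid window,
    measured from the rising clock edge (positive = after the edge)."""
    start = -t_setup_ps  # setup is measured before the edge, so negate it
    end = t_hold_ps
    return start, end

start, end = required_window(T_SETUP_PS, T_HOLD_PS)
width = end - start
print(f"data must be valid from +{start} ps to +{end} ps after the rising edge")
print(f"window width: {width} ps out of a {CLOCK_PERIOD_PS} ps period")
```

The 1.5ns width is the part of the 5ns period that the DAC actually needs; everything else is available as margin if the clock-to-data phase is placed well.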
To maximize the setup/hold margin, you will probably need to use two different clocks from the same MMCM - one to generate the data (and to run all your internal logic) and one to drive the ODDR that generates the forwarded clock. This assumes you are driving the DAC as a source synchronous device (forwarding the clock from the FPGA) rather than system synchronous (where the FPGA and DAC share a common clock input). This is a tough decision; the clock from the FPGA will have more jitter than a "clean" clock from an oscillator, and that jitter will translate into voltage error on the output of the DAC. Conversely, trying to do a 200MHz system synchronous interface with a 5ns period, of which the device uses 1.5ns, will be very tough (maybe impossible without some external clock feedback). So show us your clock arrangement on the board.
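To get a feel for where the forwarded clock edge could sit relative to the data clock, here is a deliberately idealized sketch. It assumes the data bus changes exactly on the launch edge, is stable for the rest of the period, and that there is no skew or uncertainty between outputs; none of that holds on real hardware, so treat the result as a starting point for the timing analysis, not a replacement for it. All names are hypothetical.

```python
def forwarded_clock_offset_range(period_ps, t_setup_ps, t_hold_ps):
    """Idealized range of forwarded-clock offsets (ps) relative to the
    data-launch edge. Assumes data changes at t = 0 and t = period_ps
    and is stable in between, with zero skew and zero uncertainty."""
    earliest = t_setup_ps            # capture edge - t_setup must be >= 0
    latest = period_ps - t_hold_ps   # capture edge + t_hold must be <= period
    return earliest, latest

def phase_deg_to_ps(phase_deg, period_ps):
    """Convert an MMCM-style phase shift in degrees to a time offset."""
    return phase_deg * period_ps / 360.0

earliest, latest = forwarded_clock_offset_range(5000, -600, 2100)
midpoint = (earliest + latest) / 2
print(f"ideal offset range: {earliest} ps to {latest} ps, midpoint {midpoint} ps")
print(f"a 180 degree shift at 200 MHz is {phase_deg_to_ps(180, 5000)} ps")
```

In practice the usable range is narrower, because the clock and data output delays differ and vary over process, voltage and temperature, which is what the set_output_delay constraints let the tool check.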
Second, this is a parallel DAC, so you have 16 data bits changing every clock. You may need to keep an eye on the Simultaneously Switching Outputs; there are limits to how many I/O of a given bank can change simultaneously without experiencing VCCO droop and/or GND bounce. In some of the newer families (like Kintex-7) it is pretty hard to violate these requirements (the bank power and ground have been increased from previous generations), but there are still limits - particularly for LVCMOS outputs.
Avrum
01-01-2021 06:55 AM - edited 01-01-2021 07:13 AM
01-01-2021 11:56 AM - edited 01-01-2021 12:00 PM
The constraints look reasonable - particularly if the fDACclkB is a positive shift with respect to the non shifted clock that generates the data. I would generally confirm these with a look at both the setup and the hold detailed timing report just to be sure it is using the correct edges (but I think it should).
By "voltage error" do you mean that this will decrease the SNR?
Yes. This is a known feature of almost all DACs - clock jitter results in reduced SNR.
How large of jitter can I expect from the FPGA generated clock? Is MMCM_TOUTJITTER the relevant spec here?
The jitter of the MMCM is part of it, but the clock will pick up more jitter as it travels through the clock tree (through a noisy digital environment) before it reaches the pin. More jitter can be coupled in from noise on your VCCO power supply. While these can be significant, they will probably all be of a similar order to the MMCM jitter - so if 3x or so of the MMCM jitter isn't a problem, you are probably OK. Unfortunately it isn't possible (or at least easy) to get a numerical value for "how much jitter is there on an FPGA forwarded clock", which is why it is generally avoided for high precision analog design. For this reason, many DACs (but not, apparently, this one) have a separate clock for the analog section and the digital data, with the assumption that the two clocks ultimately trace back to the same oscillator (they are mesochronous). The analog one usually comes straight from the oscillator (so is cleaner from a jitter point of view); the data one is forwarded through the sending device (the FPGA in this case). Unfortunately this requires a clock crossing in the DAC which increases latency (and adds uncertainty to the latency).
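To put a rough number on the "3x the MMCM jitter" rule of thumb, a commonly used estimate for the SNR ceiling set by clock jitter on a full-scale sine output is SNR = -20*log10(2*pi*f_out*sigma_j). The sketch below applies it; the output frequency and jitter values are hypothetical placeholders, not figures from this thread or from any datasheet.

```python
import math

def jitter_limited_snr_db(f_out_hz, jitter_rms_s):
    """SNR ceiling (dB) from sampling-clock jitter alone, for a
    full-scale sine at f_out_hz and RMS jitter jitter_rms_s."""
    return -20.0 * math.log10(2.0 * math.pi * f_out_hz * jitter_rms_s)

# Hypothetical values for illustration only.
F_OUT_HZ = 10e6          # assumed output tone
MMCM_JITTER_S = 100e-12  # assumed RMS jitter of the MMCM output

base = jitter_limited_snr_db(F_OUT_HZ, MMCM_JITTER_S)
tripled = jitter_limited_snr_db(F_OUT_HZ, 3 * MMCM_JITTER_S)
print(f"SNR ceiling at 1x jitter: {base:.1f} dB")
print(f"SNR ceiling at 3x jitter: {tripled:.1f} dB")
print(f"cost of 3x jitter: {base - tripled:.1f} dB")
```

Whatever the absolute numbers turn out to be, tripling the jitter lowers this ceiling by 20*log10(3), about 9.5 dB, and the ceiling also drops as the output frequency rises.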
So show us your clock arrangement on the board.
Apologies, I'm not quite sure what you're looking for here. Do you want to see my board layout?
Knowing that it is clock forwarded is what I needed.
I am using LVCMOS33 for these signals. The board has 7 of these chips in total. The DACs use 41 lines in bank
So LVCMOS33 is one of the worst I/O standards for Simultaneously Switching Noise (Which is what Xilinx calls it - others call it SSO), and 41 is a HUGE number - I would be highly suspicious that this is your problem. You can get some information on this through the report_ssn command, but I haven't used it in a long time and I don't remember what kind of configuration it needs. But you should definitely look into this. Keeping all but one of the DACs idle (sending a constant value on both phases of the output) while testing only one would be a good way to reveal if this is the problem; if the corruption goes away when you do this it would be a likely culprit. You might also be able to see some of the corruption with a really good oscilloscope looking at the noise on both the VCCO and the output signals, although SSN is really corruption on the internal FPGA power grid (after the inductance of the balls and the internal routing).
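One way to see why the one-bit-at-a-time test from the first post may not exercise this problem is to count how many data lines change on each clock. The sketch below does that for the interleaved one-hot pattern and for an arbitrary stream; the arbitrary words are made up for illustration, and the select line is not counted.

```python
def toggled_bits(prev_word, next_word, width=16):
    """Number of data lines that change state between two bus words."""
    mask = (1 << width) - 1
    return bin((prev_word ^ next_word) & mask).count("1")

def worst_case_toggles(words, width=16):
    """Largest number of lines switching on any single clock."""
    return max(toggled_bits(a, b, width) for a, b in zip(words, words[1:]))

# One-hot test from the first post: the I and Q words alternate on the bus.
one_hot_stream = [0b0010_0000_0000_0000, 0b0000_0010_0000_0000] * 4

# Hypothetical arbitrary data, chosen only to show heavy switching.
arbitrary_stream = [0x0000, 0xFFFF, 0x5555, 0xAAAA, 0x1234, 0xEDCB]

print("one-hot worst case:", worst_case_toggles(one_hot_stream), "lines per clock")
print("arbitrary worst case:", worst_case_toggles(arbitrary_stream), "lines per clock")
```

With at most two lines switching per clock per DAC, the one-hot test draws far less simultaneous switching current than real data, so a clean result there does not rule out SSN.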
This could also negatively affect the SNR of the DAC, since this will definitely manifest as jitter on the forwarded clock or clocks...
I thought of an alternate way to send the data to the DACs
I didn't follow everything, but I don't think it would be better. If I understand what you are trying to do, the net result would be similar, with the exception that you would be more sensitive to the duty cycle imbalance that is inherent in internal FPGA clocks. Also the constraints would probably have to change.
Avrum
01-01-2021 08:46 PM - edited 01-14-2021 07:26 PM
Actually, the constraints are not right...
First, according to the datasheet of the MAX5875, the setup and hold are specified with respect to the rising edge of the clock, so there shouldn't be a "-clock_fall" in the constraints.
Second, the -max of a set_output_delay is the setup time, which is -0.6ns. The -min of a set_output_delay is the negative of the hold time, which is -2.1. So the correct constraints are
create_generated_clock -name fDACclkB -source [get_pins {UUT/ODDR_fDACclkB/C}] -divide_by 1 [get_ports {fDACclkB}]
set_output_delay -clock [get_clocks fDACclkB] -max -0.6 [get_ports {fDAC0_out[*]}]
set_output_delay -clock [get_clocks fDACclkB] -min -2.1 [get_ports {fDAC0_out[*]}]
set_output_delay -clock [get_clocks fDACclkB] -max -0.6 [get_ports fDAC0_sel]
set_output_delay -clock [get_clocks fDACclkB] -min -2.1 [get_ports fDAC0_sel]
[edit: the -max and -min were originally reversed]
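The mapping from datasheet numbers to constraint values is easy to get backwards, so here is a small sketch of the rule stated above (-max is the setup time, -min is the negative of the hold time) together with the sanity check that -max should never be smaller than -min. The function names are hypothetical.

```python
def output_delay_from_setup_hold(t_setup_ns, t_hold_ns):
    """Map receiver setup/hold requirements to set_output_delay values."""
    return {"max": t_setup_ns, "min": -t_hold_ns}

def is_consistent(delays):
    """Cross check: the -max value must not be smaller than the -min value."""
    return delays["max"] >= delays["min"]

# Values quoted from the MAX5875 datasheet earlier in the thread.
delays = output_delay_from_setup_hold(t_setup_ns=-0.6, t_hold_ns=2.1)
print(delays, "consistent:", is_consistent(delays))

# The originally posted (reversed) values fail the cross check.
reversed_delays = {"max": -2.1, "min": -0.6}
print(reversed_delays, "consistent:", is_consistent(reversed_delays))
```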
Again, I would want to see the timing reports, but I suspect that this needs a set_multicycle_path 0 -setup on the output constraints
set_multicycle_path -setup 0 -to [get_ports {fDAC0_out[*] fDAC0_sel}]
set_multicycle_path -hold -1 -to [get_ports {fDAC0_out[*] fDAC0_sel}]
[Edit: I think the last line shouldn't be there - I would have to spend time figuring it out again...]
Avrum
01-13-2021 12:06 PM
Thanks so much for your detailed replies. I apologize I haven't followed up with this sooner.
I think I need to have correct output constraints before more testing of SSO issues, right? I added in the set_output_delay constraints you wrote. I'd like to give you the timing reports, but I'm not sure how to get them for these signals specifically.... All I've been able to get is reports about signals within the FPGA. How can I get reports on specific output signals?
Could you give a bit more explanation on the set_multicycle_path constraints you wrote? set_multicycle_path 0 almost seems like a paradox to me.
01-13-2021 06:53 PM
I'd like to give you the timing reports, but I'm not sure how to get them for these signals specifically..
In the Tcl console use the command
report_timing -to [get_ports {fDAC0_out[*]}]
This will give you a text report. If you want to put it in the GUI for interactive operations, use
report_timing -to [get_ports {fDAC0_out[*]}] -name fDAC0_out
Could you give a bit more explanation on the set_multicycle_path constraints you wrote? set_multicycle_path 0 almost seems like a paradox to me.
While the "set_multicycle_path" command is most often used for actual multicycle paths, it really modifies the default rules as to which clock edge captures a signal with respect to the clock edge that launched it. A normal path is captured on the clock after the one that launched it - this is the equivalent of set_multicycle_path 1
In a 2 cycle multicycle path, you want to change the relationship so that data launched at a given edge is captured not on the next clock edge, but the one after that - that is set_multicycle_path 2 - one more than a normal path.
In some cases on interfaces you want the data to be captured on the same edge as the edge that launches it; this is set_multicycle_path 0.
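The edge relationships described above can be sketched as simple arithmetic. This is a simplified model that assumes the launch and capture clocks have the same period and phase; it only illustrates how the multiplier moves the setup capture edge, with the default hold check taken one period before that edge. The function names are hypothetical.

```python
def setup_capture_edge(launch_ns, period_ns, multiplier=1):
    """Time of the edge used for the setup check.
    multiplier=1 is the default single-cycle relationship."""
    return launch_ns + multiplier * period_ns

def hold_check_edge(launch_ns, period_ns, multiplier=1):
    """Default hold check: one period before the setup capture edge."""
    return setup_capture_edge(launch_ns, period_ns, multiplier) - period_ns

PERIOD_NS = 5.0  # 200 MHz

for n in (1, 2, 0):
    print(f"set_multicycle_path {n}: setup capture at "
          f"{setup_capture_edge(0.0, PERIOD_NS, n)} ns, "
          f"hold check at {hold_check_edge(0.0, PERIOD_NS, n)} ns")
```

With a multiplier of 0 the setup check lands on the launch edge itself, which is why that value is not a paradox: it just selects a different edge pairing.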
But for the first pass, do not use the set_multicycle_path command and let me see the complete timing path - from there we can look at how the interface needs to be modified.
As for looking at the SSOs, you probably do need a set_output_delay command on the outputs - just so that the tool understands which clock the output is related to. Even if the interface fails timing, you can still do the SSO analysis.
Avrum
01-14-2021 05:14 AM - edited 01-14-2021 05:16 AM
Here's the output in the Tcl console. It only outputs the path for fDAC0_out[12], I guess because it has the smallest slack? I attached the rest as well. This is without the set_multicycle_path constraints.
report_timing -to [get_ports {fDAC0_out[*]}]
INFO: [Timing 38-91] UpdateTimingParams: Speed grade: -2, Delay Type: max.
INFO: [Timing 38-191] Multithreading enabled for timing update using a maximum of 2 CPUs
INFO: [Timing 38-78] ReportTimingParams: -to_pins -max_paths 1 -nworst 1 -delay_type max -sort_by slack.
Copyright 1986-2020 Xilinx, Inc. All Rights Reserved.
--------------------------------------------------------------------------------------
| Tool Version : Vivado v.2020.1.1 (win64) Build 2960000 Wed Aug 5 22:57:20 MDT 2020
| Date : Thu Jan 14 08:12:16 2021
| Host : DESKTOP-E9F111O running 64-bit major release (build 9200)
| Command : report_timing -to [get_ports {fDAC0_out[*]}]
| Design : top
| Device : 7k160t-ffg676
| Speed File : -2 PRODUCTION 1.12 2017-02-17
--------------------------------------------------------------------------------------
Timing Report
Slack (MET) : 2.571ns (required time - arrival time)
Source: fDAC_inst/FAST_DAC_0/s_out_reg[12]/C
(rising edge-triggered cell FDRE clocked by clk3_in_1 {rise@0.000ns fall@2.500ns period=5.000ns})
Destination: fDAC0_out[12]
(output port clocked by fDACclkB {rise@2.500ns fall@5.000ns period=5.000ns})
Path Group: fDACclkB
Path Type: Max at Slow Process Corner
Requirement: 2.500ns (fDACclkB rise@2.500ns - clk3_in_1 rise@0.000ns)
Data Path Delay: 4.630ns (logic 3.216ns (69.463%) route 1.414ns (30.537%))
Logic Levels: 1 (OBUF=1)
Output Delay: -2.100ns
Clock Path Skew: 2.792ns (DCD - SCD + CPR)
Destination Clock Delay (DCD): 10.083ns = ( 12.583 - 2.500 )
Source Clock Delay (SCD): 7.677ns
Clock Pessimism Removal (CPR): 0.386ns
Clock Uncertainty: 0.192ns ((TSJ^2 + DJ^2)^1/2) / 2 + PE
Total System Jitter (TSJ): 0.071ns
Discrete Jitter (DJ): 0.125ns
Phase Error (PE): 0.120ns
Location Delay type Incr(ns) Path(ns) Netlist Resource(s)
------------------------------------------------------------------- -------------------
(clock clk3_in_1 rise edge)
0.000 0.000 r
AA4 0.000 0.000 r clk (IN)
net (fo=0) 0.000 0.000 clkINPUT0/clk
AA4 IBUF (Prop_ibuf_I_O) 0.619 0.619 r clkINPUT0/IBUF_inst/O
net (fo=1, routed) 1.605 2.224 clkINPUT0/clk_int
BUFGCTRL_X0Y5 BUFG (Prop_bufg_I_O) 0.093 2.317 r clkINPUT0/BUFG_inst/O
net (fo=33, routed) 1.747 4.064 sADC_fDAC/clk_out
MMCME2_ADV_X1Y0 MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT3)
0.077 4.141 r sADC_fDAC/MMCME2_BASE_inst/CLKOUT3
net (fo=1, routed) 2.114 6.255 sADC_fDAC/clk3_in
BUFGCTRL_X0Y4 BUFG (Prop_bufg_I_O) 0.093 6.348 r sADC_fDAC/BUFG_3/O
net (fo=119, routed) 1.329 7.677 fDAC_inst/FAST_DAC_0/clk3
SLICE_X0Y157 FDRE r fDAC_inst/FAST_DAC_0/s_out_reg[12]/C
------------------------------------------------------------------- -------------------
SLICE_X0Y157 FDRE (Prop_fdre_C_Q) 0.204 7.881 r fDAC_inst/FAST_DAC_0/s_out_reg[12]/Q
net (fo=1, routed) 1.414 9.295 fDAC_inst/fDAC0_int[12]
F20 OBUF (Prop_obuf_I_O) 3.012 12.307 r fDAC_inst/pins[12].OBUF_inst0/O
net (fo=0) 0.000 12.307 fDAC0_out[12]
F20 r fDAC0_out[12] (OUT)
------------------------------------------------------------------- -------------------
(clock fDACclkB rise edge)
2.500 2.500 f
AA4 0.000 2.500 f clk (IN)
net (fo=0) 0.000 2.500 clkINPUT0/clk
AA4 IBUF (Prop_ibuf_I_O) 0.513 3.013 f clkINPUT0/IBUF_inst/O
net (fo=1, routed) 1.497 4.510 clkINPUT0/clk_int
BUFGCTRL_X0Y5 BUFG (Prop_bufg_I_O) 0.083 4.593 f clkINPUT0/BUFG_inst/O
net (fo=33, routed) 1.589 6.182 sADC_fDAC/clk_out
MMCME2_ADV_X1Y0 MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT4)
0.073 6.255 f sADC_fDAC/MMCME2_BASE_inst/CLKOUT4
net (fo=1, routed) 1.991 8.246 sADC_fDAC/clk4_in
BUFGCTRL_X0Y2 BUFG (Prop_bufg_I_O) 0.083 8.329 f sADC_fDAC/BUFG_4/O
net (fo=2, routed) 1.251 9.580 fDAC_inst/clk4
OLOGIC_X0Y166 ODDR (Prop_oddr_C_Q) 0.318 9.898 r fDAC_inst/ODDR_fDACclkB/Q
net (fo=1, routed) 0.000 9.898 fDAC_inst/fDACclkB_int
F19 OBUF (Prop_obuf_I_O) 2.685 12.583 r fDAC_inst/OBUF_clkB/O
net (fo=0) 0.000 12.583 fDACclkB
F19 r fDACclkB (OUT)
clock pessimism 0.386 12.969
clock uncertainty -0.192 12.778
output delay 2.100 14.878
-------------------------------------------------------------------
required time 14.878
arrival time -12.307
-------------------------------------------------------------------
slack 2.571
01-14-2021 11:27 AM - edited 01-14-2021 11:28 AM
Sorry - I also need to see the hold time path
report_timing -to [get_ports {fDAC0_out[*]}] -hold
Avrum
01-14-2021 12:00 PM
Do you want it for all bits?
report_timing -to [get_ports {fDAC0_out[*]}] -hold
INFO: [Timing 38-91] UpdateTimingParams: Speed grade: -2, Delay Type: min.
INFO: [Timing 38-191] Multithreading enabled for timing update using a maximum of 2 CPUs
INFO: [Timing 38-78] ReportTimingParams: -to_pins -max_paths 1 -nworst 1 -delay_type min -sort_by slack.
Copyright 1986-2020 Xilinx, Inc. All Rights Reserved.
--------------------------------------------------------------------------------------
| Tool Version : Vivado v.2020.1.1 (win64) Build 2960000 Wed Aug 5 22:57:20 MDT 2020
| Date : Thu Jan 14 14:59:26 2021
| Host : DESKTOP-E9F111O running 64-bit major release (build 9200)
| Command : report_timing -to [get_ports {fDAC0_out[*]}] -hold
| Design : top
| Device : 7k160t-ffg676
| Speed File : -2 PRODUCTION 1.12 2017-02-17
--------------------------------------------------------------------------------------
Timing Report
Slack (MET) : 1.273ns (arrival time - required time)
Source: fDAC_inst/FAST_DAC_0/s_out_reg[3]/C
(rising edge-triggered cell FDRE clocked by clk3_in_1 {rise@0.000ns fall@2.500ns period=5.000ns})
Destination: fDAC0_out[3]
(output port clocked by fDACclkB {rise@2.500ns fall@5.000ns period=5.000ns})
Path Group: fDACclkB
Path Type: Min at Fast Process Corner
Requirement: -2.500ns (fDACclkB rise@2.500ns - clk3_in_1 rise@5.000ns)
Data Path Delay: 1.679ns (logic 1.358ns (80.878%) route 0.321ns (19.122%))
Logic Levels: 1 (OBUF=1)
Output Delay: -0.600ns
Clock Path Skew: 2.114ns (DCD - SCD - CPR)
Destination Clock Delay (DCD): 5.756ns = ( 8.256 - 2.500 )
Source Clock Delay (SCD): 3.155ns = ( 8.155 - 5.000 )
Clock Pessimism Removal (CPR): 0.487ns
Clock Uncertainty: 0.192ns ((TSJ^2 + DJ^2)^1/2) / 2 + PE
Total System Jitter (TSJ): 0.071ns
Discrete Jitter (DJ): 0.125ns
Phase Error (PE): 0.120ns
Location Delay type Incr(ns) Path(ns) Netlist Resource(s)
------------------------------------------------------------------- -------------------
(clock clk3_in_1 rise edge)
5.000 5.000 r
AA4 0.000 5.000 r clk (IN)
net (fo=0) 0.000 5.000 clkINPUT0/clk
AA4 IBUF (Prop_ibuf_I_O) 0.126 5.126 r clkINPUT0/IBUF_inst/O
net (fo=1, routed) 0.700 5.826 clkINPUT0/clk_int
BUFGCTRL_X0Y5 BUFG (Prop_bufg_I_O) 0.026 5.852 r clkINPUT0/BUFG_inst/O
net (fo=33, routed) 0.697 6.549 sADC_fDAC/clk_out
MMCME2_ADV_X1Y0 MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT3)
0.050 6.599 r sADC_fDAC/MMCME2_BASE_inst/CLKOUT3
net (fo=1, routed) 0.931 7.530 sADC_fDAC/clk3_in
BUFGCTRL_X0Y4 BUFG (Prop_bufg_I_O) 0.026 7.556 r sADC_fDAC/BUFG_3/O
net (fo=119, routed) 0.599 8.155 fDAC_inst/FAST_DAC_0/clk3
SLICE_X0Y155 FDRE r fDAC_inst/FAST_DAC_0/s_out_reg[3]/C
------------------------------------------------------------------- -------------------
SLICE_X0Y155 FDRE (Prop_fdre_C_Q) 0.100 8.255 r fDAC_inst/FAST_DAC_0/s_out_reg[3]/Q
net (fo=1, routed) 0.321 8.576 fDAC_inst/fDAC0_int[3]
K16 OBUF (Prop_obuf_I_O) 1.258 9.834 r fDAC_inst/pins[3].OBUF_inst0/O
net (fo=0) 0.000 9.834 fDAC0_out[3]
K16 r fDAC0_out[3] (OUT)
------------------------------------------------------------------- -------------------
(clock fDACclkB rise edge)
2.500 2.500 f
AA4 0.000 2.500 f clk (IN)
net (fo=0) 0.000 2.500 clkINPUT0/clk
AA4 IBUF (Prop_ibuf_I_O) 0.292 2.792 f clkINPUT0/IBUF_inst/O
net (fo=1, routed) 0.765 3.557 clkINPUT0/clk_int
BUFGCTRL_X0Y5 BUFG (Prop_bufg_I_O) 0.030 3.587 f clkINPUT0/BUFG_inst/O
net (fo=33, routed) 0.946 4.533 sADC_fDAC/clk_out
MMCME2_ADV_X1Y0 MMCME2_ADV (Prop_mmcme2_adv_CLKIN1_CLKOUT4)
0.053 4.586 f sADC_fDAC/MMCME2_BASE_inst/CLKOUT4
net (fo=1, routed) 0.996 5.582 sADC_fDAC/clk4_in
BUFGCTRL_X0Y2 BUFG (Prop_bufg_I_O) 0.030 5.612 f sADC_fDAC/BUFG_4/O
net (fo=2, routed) 0.828 6.440 fDAC_inst/clk4
OLOGIC_X0Y166 ODDR (Prop_oddr_C_Q) 0.221 6.661 r fDAC_inst/ODDR_fDACclkB/Q
net (fo=1, routed) 0.000 6.661 fDAC_inst/fDACclkB_int
F19 OBUF (Prop_obuf_I_O) 1.595 8.256 r fDAC_inst/OBUF_clkB/O
net (fo=0) 0.000 8.256 fDACclkB
F19 r fDACclkB (OUT)
clock pessimism -0.487 7.769
clock uncertainty 0.192 7.961
output delay 0.600 8.561
-------------------------------------------------------------------
required time -8.561
arrival time 9.834
-------------------------------------------------------------------
slack 1.273
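For anyone reading these two reports, the slack figures can be reproduced from the summary rows. The sketch below redoes the arithmetic with the numbers copied from the setup and hold reports above, in integer picoseconds; the function names are made up, and the formulas are just the way the report rows add up, not a description of the tool's internals.

```python
def setup_slack_ps(clock_arrival, pessimism, uncertainty, output_delay, data_arrival):
    """Setup: required time minus data arrival time."""
    required = clock_arrival + pessimism - uncertainty - output_delay
    return required - data_arrival

def hold_slack_ps(clock_arrival, pessimism, uncertainty, output_delay, data_arrival):
    """Hold: data arrival time minus required time."""
    required = clock_arrival - pessimism + uncertainty - output_delay
    return data_arrival - required

# Setup report for fDAC0_out[12] (slow corner), values in ps.
setup = setup_slack_ps(clock_arrival=12583, pessimism=386, uncertainty=192,
                       output_delay=-2100, data_arrival=12307)

# Hold report for fDAC0_out[3] (fast corner), values in ps.
hold = hold_slack_ps(clock_arrival=8256, pessimism=487, uncertainty=192,
                     output_delay=-600, data_arrival=9834)

print(f"setup slack: {setup} ps (report shows 2571 ps, within rounding)")
print(f"hold slack:  {hold} ps")
```

Note that the setup check used an output delay of -2.1ns and the hold check used -0.6ns, which is what these reports were generated with.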
01-14-2021 07:24 PM
Something doesn't look right...
First, it shows the edges of fDACclkB rising at 2.5 and falling at 5.0 - an inverted clock with respect to the internal clock. That shouldn't be the case unless you specifically asked for it. This can be done (and may even be the correct solution) by reversing the D1 and D2 of the ODDR forwarding the clock (connecting D1 to 0 and D2 to 1) and also changing the create_generated_clock command to use the -invert flag. But we didn't discuss that in the constraints, so I am wondering what's going on here.
But if that is the case (the ODDR is reversed and the -invert flag is used) the clock edges used look consistent; on a 5ns period, the setup requirement is 2.5ns and the hold requirement is -2.5ns - effectively the clock edge before the setup capture edge, which is consistent. So we need to figure out how the inverted clock got there.
But the overall timing doesn't make sense (and it's probably my fault) - if you add up all the uncertainties they are way more than one clock period. After looking at it a bit I realize I reversed the -min and -max flags in the constraints - the calculations were all correct; I determined that the setup was -0.6 which means the max is -0.6 and the hold is 2.1, which means the min is -2.1, but I reversed them in the command. As a cross check your max should never be smaller than your min. I will correct the constraints in my earlier reply.
The only other thing to note is the clock structure isn't optimal - normally the clock coming in (on a clock capable pin) should go directly to the MMCM, but you have a BUFG between the pin and the MMCM. This adds an unnecessary, significant and not PVT compensated delay to the clock path - this will have a negative impact on overall timing.
So fix all these things (the set_output_delays, the extra BUFG and investigate how the inversion got there) and then send both path reports again.
Avrum