Debugging Aurora 64b66b when PLL is not locking (XCKU115)
I'm working with a 4-lane Aurora 64b66b IP core and have become stuck: I am getting no Tx/Rx traffic between the two boards I'm using. I've read the Aurora 64b66b document (https://www.xilinx.com/support/documentation/ip_documentation/aurora_64b66b/v11_2/pg074-aurora-64b66b.pdf), found its hardware debug section, and followed the steps up to the point where I can no longer verify the expected behaviour. In an ILA core I can see that tx_resetdone = '1' and rx_resetdone = '1'; however, mmcm_not_locked stays high and my channels/lanes never come up when the two designs are run together.
The documentation suggests that checking user_clk_i, which is derived from tx_clk_out, can indicate whether something is wrong with the design. I have already checked that user_clk_i is at least toggling by sampling it with an ILA clocked at 400MHz; my user_clk_i should be 161.133MHz. I observed the expected irregular pattern that occurs when you sample a clock with another, unsynchronized clock with f_s > 2*f_clock. I'm at a bit of a loss as to where to continue my debugging efforts. It seems as though user_clk_i is correct, which would indicate that tx_clk_out is also correct, but in that case why is mmcm_not_locked staying high?
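For reference, here is a minimal Python sketch of what that ILA capture should look like, assuming an ideal 50% duty-cycle user_clk_i at 161.133MHz sampled by a free-running 400MHz clock. The function names (sample_clock, run_lengths, estimate_frequency_khz) are hypothetical helpers written for this post, not part of any Xilinx tool. With these numbers each half period lasts about 1.24 sample periods, so the capture should show runs of one or two identical samples, and counting rising edges gives a rough frequency estimate to compare against the expected value.

```python
def sample_clock(f_clk_khz, f_s_khz, n_samples):
    """Sample an ideal 50% duty-cycle square wave at f_clk with a clock at f_s.

    Frequencies are integers in kHz so the phase arithmetic stays exact.
    Assumption: both clocks are ideal and start in phase at sample 0.
    """
    samples = []
    for n in range(n_samples):
        # Position within the current clock period, scaled to integers.
        # The clock is high during the first half of each period.
        phase = (2 * n * f_clk_khz) % (2 * f_s_khz)
        samples.append(1 if phase < f_s_khz else 0)
    return samples


def run_lengths(samples):
    """Return the lengths of consecutive runs of identical samples."""
    runs = []
    count = 1
    for prev, cur in zip(samples, samples[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs


def estimate_frequency_khz(samples, f_s_khz):
    """Estimate the sampled clock frequency by counting 0 -> 1 transitions.

    Only valid when f_s > 2 * f_clk, so that every half period is sampled.
    """
    rising = sum(1 for a, b in zip(samples, samples[1:]) if a == 0 and b == 1)
    return rising * f_s_khz / len(samples)


if __name__ == "__main__":
    F_CLK_KHZ = 161_133   # expected user_clk_i
    F_S_KHZ = 400_000     # ILA sampling clock
    captured = sample_clock(F_CLK_KHZ, F_S_KHZ, 40_000)
    print("first 32 samples:", "".join(str(s) for s in captured[:32]))
    print("distinct run lengths:", sorted(set(run_lengths(captured))))
    print("estimated frequency (kHz):", estimate_frequency_khz(captured, F_S_KHZ))
```

The same edge count applied to an exported ILA capture would show whether user_clk_i is close to 161.133MHz or merely toggling at some other rate. It says nothing about jitter or about whether the MMCM input is within its lock range, so a plausible pattern here does not by itself explain mmcm_not_locked.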
A bit of extra background information which may be important: the refclk of the quad that my transceivers are placed in (let's call this Quad A) is being used by another core, so I have sourced the refclk from the adjacent quad (let's call this Quad B). My method of doing this was to route the refclk pins of Quad B into Quad A and use them as the input to the differential buffer in Quad A. Further, to get around some errors caused by BEL/LOC constraint conflicts on the lanes themselves, I locked the Aurora core and then manually edited the lane LOCs (swapping 3 with 0, 2 with 1, etc.), because otherwise I would get critical warnings and the lanes would not get placed. Locking the IP core prevents OOC synthesis from overwriting my BEL/LOC fixes. In the end I get a design with no critical warnings, with the following clock summary.
What can I do to debug the design further? I could go back to the reference design with the refclk in the same quad and confirm that it works, which is maybe what I have to do. But I wonder if someone with more transceiver/Aurora experience can point me in the right direction for debugging these designs in hardware more effectively?
Thanks for taking the time to read this, I appreciate any guidance that can be given.