Registered: ‎07-03-2018

PCIE Link Training Issues (Stuck at LTSSM state 5)

We are attempting to get a Zynq XC7Z045 to talk to an NVMe card, specifically a Samsung 960 Pro SSD.  The '045 is part of a Knowres KRM-3Z7045 module.  Both that module and the NVMe card are plugged into a custom carrier board.

Our current setup instantiates just the processor core and the AXI PCIe bridge as a root complex, as shown in this FPGA Developer Blog post. We're running single lane, and we've tried both Gen2 and Gen1 speeds.  We are using Vivado 2018.1.

We are using ChipScope to look at the LTSSM in the AXI PCIe bridge.  We see that the LTSSM cycles between states 0, 2, 4, 5, 2D (timeout), and back to 0.
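
For anyone reading along, the cycle described above is easier to follow once the per-clock capture is collapsed into a sequence of distinct states. The following is only a minimal sketch: the names for codes 0x00, 0x02, 0x04 and 0x05 are an assumption based on a reading of the 7-series PCIe LTSSM encoding and should be checked against the product guide for the core in use, the 0x2D entry simply repeats the "timeout" label used in this post, and the `trace` list is made-up sample data, not an export from the actual capture.

```python
from itertools import groupby

# Assumed names for the LTSSM codes seen in the capture. Verify against the
# product guide for your core; 0x2D is labelled only as the post describes it.
LTSSM_NAMES = {
    0x00: "Detect.Quiet",
    0x02: "Detect.Active",
    0x04: "Polling.Active",
    0x05: "Polling.Configuration",
    0x2D: "timeout (as labelled in the post)",
}


def collapse(samples):
    """Reduce a per-clock list of LTSSM codes to the sequence of distinct states."""
    return [code for code, _ in groupby(samples)]


def describe(samples):
    """Return a readable name for each state in the collapsed sequence."""
    return [LTSSM_NAMES.get(code, f"unknown (0x{code:02X})") for code in collapse(samples)]


# Hypothetical capture: each entry is one sampled value of the ltssm bus.
trace = [0x00] * 3 + [0x02] * 2 + [0x04] * 5 + [0x05] * 4 + [0x2D] + [0x00] * 2

print(" -> ".join(describe(trace)))
```

A trace that never shows a code beyond 0x05 before returning to 0x00 matches the symptom in this thread: the root complex gets as far as Polling.Configuration and then falls back to Detect.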

While in state 4, the RX data appears to be a stream of TS1 ordered sets.  The LTSSM transitions to state 5 and the TX data then switches from TS1 to TS2.  We never see TS2 from the NVMe card, so the link eventually times out and the cycle repeats.
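
Telling TS1 from TS2 by eye in a waveform is tedious, so a small decoder can help. The sketch below assumes the Gen1/Gen2 (8b/10b) ordered-set layout as I understand it from the PCIe base specification: 16 symbols, starting with COM (K28.5, 0xBC), with symbols 6 to 15 carrying the identifier D10.2 (0x4A) for TS1 or D5.2 (0x45) for TS2, and D21.5 (0xB5) or D26.5 (0xBA) when lane polarity is inverted. It also assumes the capture has already been exported as lists of decoded byte values, and it ignores the K-character flags, which a real decoder should check. The function names and sample values are hypothetical.

```python
COM = 0xBC                    # K28.5, first symbol of a training set
TS1_ID, TS2_ID = 0x4A, 0x45   # D10.2 and D5.2
TS1_INV, TS2_INV = 0xB5, 0xBA # D21.5 and D26.5 (identifiers seen with inverted polarity)

IDENTIFIERS = {
    TS1_ID: "TS1",
    TS2_ID: "TS2",
    TS1_INV: "TS1-inverted",
    TS2_INV: "TS2-inverted",
}


def classify_ordered_set(symbols):
    """Classify one 16-symbol ordered set from decoded 8b/10b byte values."""
    if len(symbols) != 16 or symbols[0] != COM:
        return "other"
    ident = set(symbols[6:16])
    if len(ident) != 1:
        return "other"  # identifier symbols must all agree
    return IDENTIFIERS.get(ident.pop(), "other")


def max_consecutive(kinds, wanted):
    """Length of the longest run of `wanted` in a list of classifications."""
    best = run = 0
    for kind in kinds:
        run = run + 1 if kind == wanted else 0
        best = max(best, run)
    return best


# Illustrative ordered sets: link and lane numbers are PAD (0xF7) during Polling;
# the N_FTS, data rate and training control bytes here are placeholder values.
header = [COM, 0xF7, 0xF7, 0x20, 0x02, 0x00]
ts1 = header + [TS1_ID] * 10
ts2 = header + [TS2_ID] * 10

rx_kinds = [classify_ordered_set(s) for s in [ts1] * 20]
print("longest TS2 run on RX:", max_consecutive(rx_kinds, "TS2"))
```

As I read the specification, Polling.Configuration only advances once eight consecutive TS2 ordered sets have been received, so an RX stream whose longest TS2 run is zero is consistent with the timeout described here.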

Here is a picture of state 4, demonstrating TS1 on the RX channel, and the transition to sending TS1 on the TX channel. 

Screen Shot 2019-05-06 at 2.21.40 PM.png

Here is a picture of transitioning to state 5, and showing the TS1 to TS2 transition on the TX channel. Note that RX is still TS1.

Screen Shot 2019-05-06 at 2.23.45 PM.png

Here is a picture of the timeout from state 5 to 2D, then back to 0. Note that the SSD is still sending TS1.

Screen Shot 2019-05-09 at 12.10.08 PM.png

We suspect that the NVMe card is not achieving bit/symbol lock, and so its LTSSM is stuck in state 4. Our root complex is achieving lock, transitioning to state 5, and timing out while waiting for TS2 from the NVMe.

Our question is: Why?

One possibility is a signal integrity issue. It is, after all, our custom carrier board.

  • Differential impedance on the carrier board is 100 ohms.
  • TX AC coupling capacitors are 100 nF.
  • However, the entire channel from FPGA to NVMe is only about 50 mm long and is routed clear of all other signals, so we are not sure where a signal integrity problem would come from.

We've looked into using IBERT, but the AXI-PCIE block (v2.8) only lists the JTAG debugger in its debug options. Also, XAPP1198 says that "If the link does not train to any speed, including gen 1 speeds (2.5 Gb/s), then using Eye Scan is not recommended, and using the ltssm signal from the core is a better option."

The picture below shows another, infrequent issue, and we are unsure whether it is related to the larger problem. Here we have lost rxcdrlock, and the received data is corrupted as a result. We then get 8B/10B and receive errors on rx_status. The error is intermittent, and we are not certain what causes it or what it indicates. Is it a clue to the training failure?

Screen Shot 2019-05-06 at 3.23.44 PM.png

We would appreciate any suggestions.

2 Replies
Registered: ‎02-15-2017

Re: PCIE Link Training Issues (Stuck at LTSSM state 5)

Update on this problem (I'm also working on it).

We put a 4 GHz scope on the PCIe signals on the board, in both directions (FPGA->NVMe and NVMe->FPGA), and the signal integrity looks fine, so that's not the problem.  We think we have some kind of configuration problem, but we are mystified about what it might be.

Registered: ‎02-15-2017

Re: PCIE Link Training Issues (Stuck at LTSSM state 5)

The fix turned out to be simple.

The Samsung NVMe will not come out of reset with a x1 link, but it comes up just fine with a x4 link.

It's possible this is because we were talking to the wrong lane when we tried x1.  We don't know; we haven't tried other lanes.
