10-31-2019 11:22 PM
We have a custom board built around an UltraScale+ FPGA. The PCIe link on the board is not being detected.
We checked the LTSSM state and found that it is stuck at 03, which is the Polling.Compliance state.
I didn't find any document describing the reasons for getting stuck in Polling.Compliance.
What could be the reason, and how can we debug this further?
11-01-2019 12:35 AM
I believe you need to look at the PCIe specification for a detailed LTSSM explanation. You will not find a detailed explanation in the Xilinx documentation. I am not a PCIe expert, but I want to share my understanding.
(a) During Polling.Active, the protocol needs to obtain bit/symbol lock.
So the transmitter of each connected device sends at least 1024 consecutive TS1 ordered sets on all connected lanes.
(b) If (a) is successful (meaning the receiver side receives 8 consecutive TS1 or TS2 ordered sets), the LTSSM exits to Polling.Configuration.
(c) If, during (a), at least one of the lanes has never detected an exit from electrical idle since entering Polling.Active, the next state will be Polling.Compliance.
(d) If your device is stuck in the Polling.Compliance state, perhaps the partner device is not ready yet.
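Points (a) to (c) above can be sketched as a small Python model. This is only my simplified reading of those three points, not the full rule set from the PCIe specification: timeouts and the other exit conditions are left out, and the names `LaneStatus` and `next_state_after_polling_active` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class LaneStatus:
    # Consecutive TS1/TS2 ordered sets seen by this lane's receiver.
    consecutive_ts_received: int
    # True once this lane's receiver has seen an exit from electrical idle.
    exited_electrical_idle: bool


def next_state_after_polling_active(ts1_sent: int, lanes: Sequence[LaneStatus]) -> str:
    """Simplified next-state choice for points (a)-(c); timeouts are not modelled."""
    # (a) + (b): enough TS1s were sent and every lane received 8 consecutive TS1/TS2.
    if ts1_sent >= 1024 and all(l.consecutive_ts_received >= 8 for l in lanes):
        return "Polling.Configuration"
    # (c): at least one lane never saw an exit from electrical idle.
    if any(not l.exited_electrical_idle for l in lanes):
        return "Polling.Compliance"
    # Otherwise keep waiting (the real LTSSM would eventually time out).
    return "Polling.Active"


# Example: lane 0 never leaves electrical idle, the other 15 lanes are fine.
lanes = [LaneStatus(0, False)] + [LaneStatus(8, True)] * 15
print(next_state_after_polling_active(1024, lanes))
```

In this model a single silent lane is enough to land in Polling.Compliance, which is why the per-lane electrical idle signals are worth probing.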
Hope this helps.
Thanks & regards
11-01-2019 11:05 PM
Thanks for the reply
Is there any way to know which lane has not exited from electrical idle?
Also, could you please help me understand the meaning of this:
"♦ Note: If a passive test load is applied on all Lanes then the device will go to
11-07-2019 08:49 AM
Have you looked at Xilinx Answer 56616? There are lots of debugging steps in there that might help you.
11-07-2019 09:35 PM
They are not talking about polling issues. They specify signals but don't say anything about how to interpret those signals.
However, as far as I know:
1. The receiver is detected on all 16 lanes.
2. The reset signals are OK.
3. The LTSSM signal is stuck in Polling.
I don't know which signals to probe. Any help is appreciated.
11-10-2019 06:51 PM
Are you able to rule out the case where "the link partner is not detecting the board while the FPGA has already detected the link partner"?
If this is the case, rxelecidle is high while txelecidle is low.
11-13-2019 05:50 AM
>Are you able to rule out the case where "the link partner is not detecting the board while the FPGA has already detected the link partner"?
>If this is the case, rxelecidle is high while txelecidle is low.
Unlike the OP, in my case I'm doing a Gen3 x4. The design/board does link fine in another slot in the system (a x16), but fails to be detected in the other (x4) slot.
Coincidentally, I'm now stuck with almost the same problem (it gets to LTSSM state 3, but never exits). And, at least in my case, rx_elec_idle is high and txelecidle goes low (around the transition to LTSSM state 2). What does this imply?
11-14-2019 01:06 AM
It means the FPGA has detected the host but the host is not able to detect the FPGA.
Is that slot able to detect other devices?
The settings of the slots might be different (using falling-edge detection instead of rising-edge), or the receiver detection circuit of the board needs to be checked (the capacitance, for example).
11-14-2019 04:35 AM
Yes, that slot can detect other devices.
Previously I had just been JTAG loading with a warm reboot and not getting link/detection. I found that if I burn the bitfile into the configuration flash and power-cycle the machine, the card links and gets detected. Subsequent JTAG loading of the bitfile followed by a warm reboot also leads to link/detection.
I then erased the configuration flash, power-cycled, then JTAG loaded with a warm reboot and no link/detection.
Not sure why this behaviour exists.
11-14-2019 10:56 PM
My issue is solved. The problem was that my MGTAVTT supply was slowly degrading from 1.2 V to 1.1 V after some time.
When I powered this rail externally, the link was detected.
Just monitor your MGTAVTT rail continuously in both the erased and the flashed mode. It may also be degrading; in that case the link will also get stuck in Polling.
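If the rail is logged with a scope or DMM, a check like the following can flag the kind of slow droop described above. It is only a sketch: the 1.2 V nominal comes from this thread, but the ±3 % window and the function name `out_of_tolerance` are my assumptions, so take the real limits from the data sheet for your device.

```python
from typing import Iterable, List, Tuple

NOMINAL_MGTAVTT_V = 1.2  # nominal rail voltage mentioned in this thread
ASSUMED_TOLERANCE = 0.03  # assumed +/-3 % window; check your device data sheet


def out_of_tolerance(
    samples: Iterable[Tuple[float, float]],
    nominal: float = NOMINAL_MGTAVTT_V,
    tolerance: float = ASSUMED_TOLERANCE,
) -> List[Tuple[float, float]]:
    """Return the (time_s, volts) samples that fall outside the allowed window."""
    low = nominal * (1.0 - tolerance)
    high = nominal * (1.0 + tolerance)
    return [(t, v) for t, v in samples if not (low <= v <= high)]


# Example log: (seconds since power-up, measured volts), values made up.
log = [(0.0, 1.20), (60.0, 1.17), (120.0, 1.10)]
print(out_of_tolerance(log))  # -> [(120.0, 1.1)]
```

Logging over several minutes matters here, because the rail in this thread started at the right level and only sagged later.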
11-19-2019 04:50 PM
Vishnu, read this other forum posting (https://forums.xilinx.com/t5/PCIe-and-CPM/Spartan6-PCIe-field-troubles/td-p/706973). It might partially explain both yours and my observed behaviours (though I'm guessing for different reasons). When you say powered this rail externally, is that allowing your card to be configured earlier than normal (in relation to when the PC starts up)?
11-20-2019 12:57 AM
Thanks for the link. I hope this helps me.
Coming to the point, this has nothing to do with configuration time; if it did, it should work after a reboot. By "powered externally" I mean that the voltage level was not good enough because of a damaged power IC. I am now able to detect PCIe on two boards out of three. The last board also has some other issue: after correcting the power issue, the third board comes out of the Polling state and is now stuck in the Disabled state, as shown in the fig.
RXELECIDLE is now 0001. TXELECIDLE is ffff.
Does this RXELECIDLE value imply that lane 0 didn't come out of the electrical idle state?
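For reading values like these, a small helper can turn the hex word into a list of lanes. This is a sketch under two assumptions that should be checked against the core's port description: bit N of the word belongs to lane N, and a 1 means that lane is in electrical idle. The function name `lanes_in_electrical_idle` is made up.

```python
from typing import List


def lanes_in_electrical_idle(mask: int, num_lanes: int = 16) -> List[int]:
    """List the lanes whose bit is set, assuming bit N corresponds to lane N."""
    return [lane for lane in range(num_lanes) if (mask >> lane) & 1]


print(lanes_in_electrical_idle(0x0001))  # -> [0]
print(len(lanes_in_electrical_idle(0xFFFF)))  # -> 16
```

Under those assumptions, 0001 points at lane 0 only, and ffff means all 16 lanes are flagged.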
11-20-2019 04:24 AM
Based on a quote from the linked page - "If I understand correctly, the rx_elec_idle staying high indicates that the FPGA's receiver did not detect the transition when the far-end transmitter exits from electrical idle" - it seems like you have a problem on the first lane.
Also, for you, or others that read this, my point of linking to the other forum thread is to point out that it is not necessarily a valid assumption to think that doing a reboot is sufficient to rule out configuration time. Apparently some motherboards, and even some pcie slots of some motherboards, will "shut off" if they don't detect a link partner soon after power-up, and once off, won't "turn on" even after a reboot.
11-21-2019 11:07 AM - edited 11-21-2019 11:07 AM
Since lane 0 has some issue, we are not able to detect PCIe at any width. If we do lane reversal on the x16 (16th lane to 0th lane and vice versa), will this help to detect at least x8 or lower?
11-21-2019 11:29 AM
Tough question to answer. "Lane Reversal" section of pg213 covers this, but it's a little hard to follow. One thing I didn't notice until now is the note at the bottom of the page (pp 280 of v1.3) that says that the default is to have lane reversal support disabled, and, that lane reversal must not be enabled if link partner has reversal capability.
My _guess_ is that the chipsets on PCs do support lane reversal. If that's the case, and you intentionally reversed your lane order, and the slot you are using has the same width capabilities as your design (x16), then it would likely train to x8 in your case. But that is probably most dependent upon what the link partner's reversal capabilities are, and that's hard to know.
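To show why x8 is the guess, here is a toy Python model of the lane numbering. It assumes that a link always has to include logical lane 0, that only x16/x8/x4/x2/x1 widths are tried, and that reversal simply maps logical lane N to physical lane 15 - N. Real training also depends on what the link partner supports, which this does not model, and `usable_width` is a made-up name.

```python
from typing import Set


def usable_width(bad_physical_lanes: Set[int], num_lanes: int = 16,
                 reversed_order: bool = False) -> int:
    """Widest x16/x8/x4/x2/x1 link whose logical lanes 0..w-1 avoid the bad lanes."""

    def physical(logical: int) -> int:
        # With reversal, logical lane 0 sits on the highest physical lane.
        return num_lanes - 1 - logical if reversed_order else logical

    width = num_lanes
    while width >= 1:
        if all(physical(l) not in bad_physical_lanes for l in range(width)):
            return width
        width //= 2
    return 0  # no usable width, since even logical lane 0 is on a bad lane


print(usable_width({0}))                       # -> 0
print(usable_width({0}, reversed_order=True))  # -> 8
```

With physical lane 0 bad and no reversal, every candidate width contains the bad lane, which matches the "can't detect at any width" symptom. With reversal, logical lanes 0 to 7 land on physical lanes 15 to 8, so x8 is the widest option in this model.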