henry23
Visitor
12,734 Views
Registered: 09-18-2012

V6 PCIe AXI interface tx_tready stays low


Hello,

 

I am using the V6 PCIe endpoint IP v2.5, configured as Gen2 x8.

 

I find s_axis_tx_tready stuck low after successfully sending some TLPs. One way to recover is to perform a memory write to this endpoint.

 

I looked into the IP's source folder and found trn_tbuf_av[5:0] stuck at 5, which keeps s_axis_tx_tready deasserted.

So here is my question: how do I debug this situation? What can cause the TLP TX buffer to hang? Could the IP configuration cause this issue, or is something wrong in the PCIe bridge or root complex?

 

Attachment (hang): hang.PNG

Attachment (recovery): recovery.PNG

 

Thanks

Henry

 

Accepted Solutions
henry23
Visitor
19,742 Views
Registered: 09-18-2012

Finally, we fixed this issue by changing the PCIe reference clock to 100 MHz, taken directly from the motherboard.

In my hardware, a PLL (jitter attenuator) generates a 250 MHz reference clock from the motherboard's 100 MHz clock.

Bypassing this PLL and reconfiguring the IP for a 100 MHz reference clock, the 8b/10b errors and disparity errors in the PHY layer are gone. I figure the PLL may not work with SSC enabled.
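For context on why the jitter-attenuator PLL struggles here: PCIe allows the reference clock to be spread-spectrum modulated, a down-spread of up to 0.5% at roughly a 30-33 kHz modulation rate, and a narrow-bandwidth jitter attenuator may not be able to track that modulation. A minimal Python sketch of the resulting frequency range (spread parameters per the PCIe Base Specification; the function name is mine, for illustration only):

```python
# Illustrative only: PCIe SSC is a down-spread (the clock is only ever
# slowed, never sped up), up to 0.5 % below nominal, modulated at
# roughly 30-33 kHz per the PCIe Base Specification.
def ssc_range_hz(f_nominal_hz, downspread=0.005):
    """Return (min, max) instantaneous refclk frequency under SSC."""
    return (f_nominal_hz * (1.0 - downspread), f_nominal_hz)

lo, hi = ssc_range_hz(100e6)
print(lo, hi)  # a 100 MHz refclk sweeps between ~99.5 MHz and 100 MHz
```

A PLL whose loop bandwidth is well below the ~33 kHz modulation rate filters the spread out instead of following it, so its output drifts relative to the far end's SSC-tracked clock.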

 

Thanks

Henry


6 Replies
henry23
Visitor
12,729 Views
Registered: 09-18-2012

Caught this during recovery (I don't know if it is from the same event as the recovery capture above):

Attachment: recovery1.PNG

henry23
Visitor
12,704 Views
Registered: 09-18-2012

Reasons for the LTSSM to transition from L0 to Recovery:

1. Software sets the retrain bit

2. An error condition occurs in L0

3. TS1 or TS2 Ordered Sets are received

4. A speed change is needed

 

My situation is: when the TX data path is stuck, a memory read/write from the host can recover it. The LTSSM transitions from L0 to Recovery, then back to L0.

 

So what causes the TX data path to get stuck? If there were an SI issue corrupting data, the replay-timeout mechanism in the Data Link Layer should keep the TX data path from getting stuck.

 

I also noticed this is a simulation issue: http://china.xilinx.com/support/answers/60418.html

Virtex-6 Integrated Block Wrapper for PCI Express - The core may truncate some DLLPs/TLPs during the process of going into Recovery

henry23
Visitor
12,676 Views
Registered: 09-18-2012

I found the following when it is stuck:

 

1. The Posted Flow Control credit count is 0xe81, which is > 0x800, so the transmit buffer should not be fetched.

2. rx[x]_chanisaligned are all low, except rx0_chanisaligned

3. There are no read/write transactions on the MIM interface

4. The LTSSM stays in L0

 

So it seems the receiver cannot receive FC Update DLLPs, and the transmitter is stalled by Flow Control, but no error is reported and no Physical Layer recovery is attempted.

 

Am I right? Is it possible the PHY layer is not robust enough to recover? Does the core have an optional FC timeout timer? And why is the Posted FC count 0xe81? I think that would overflow the upstream receiver buffer; shouldn't it be 0xff0 or 0xff[x]? (My memory write packets are 64 DWs.)
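For reference, the Flow Control gating check in the PCIe Base Specification is modulo arithmetic over the credit field (12 bits for data credits; 1 data credit = 16 bytes = 4 DW). A minimal Python sketch with illustrative values (the function names are mine, not signals from the core):

```python
# Sketch of the PCIe Posted Data credit check, per the Base Spec:
# counters are free-running and compared modulo 2**FIELD_BITS.
FIELD_BITS = 12              # Posted Data credit field width
MOD = 1 << FIELD_BITS        # counters wrap modulo 0x1000

def credits_available(credit_limit, credits_consumed):
    """Credits still usable, computed modulo 2**FIELD_BITS."""
    return (credit_limit - credits_consumed) % MOD

def can_send(credit_limit, credits_consumed, payload_dw):
    """True if enough Posted Data credits remain for one TLP."""
    needed = (payload_dw + 3) // 4   # 1 data credit = 4 DW (16 bytes)
    return credits_available(credit_limit, credits_consumed) >= needed

# A 64-DW write consumes 16 data credits; with the advertised limit
# only 16 ahead of the consumed count, one more packet fits, then
# the gate closes until an UpdateFC DLLP raises the limit.
print(can_send(0x810, 0x800, 64))   # True  (exactly 16 credits left)
print(can_send(0x810, 0x801, 64))   # False (only 15 credits left)
```

Note that because the counters wrap, a raw snapshot like 0xe81 is not by itself "too large" or "too small"; only the modulo difference between the limit and the consumed count is meaningful, which is why a missing UpdateFC DLLP stalls the transmitter with no error reported.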

 

Hope someone can help me.

 

Thanks

Henry

markzak
Explorer
12,654 Views
Registered: 12-01-2010

Henry,

You've done quite a bit of investigative work, and I commend you for that. I had the exact same problem you did, with my buf_av credits stuck at 5.

 

See my post about this issue:

 

Throughput issues with DMA and tx_buf_av in 7 Series Integrated PCIe Block

 

Essentially, what is happening is that your packets are getting stuck in an upstream switch, because the switch doesn't support payloads of that size. I found that my system negotiated a max payload size of 256 bytes (64 DWs). Perhaps yours is lower. From what I've read, the baseline max_payload is 128 bytes for most systems.

 

Try decreasing your payload size to 128 bytes, and your problem will most likely go away.
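For anyone checking this on a live system: the negotiated Max_Payload_Size sits in bits 7:5 of the PCIe Device Control register (visible via lspci -vv or setpci). A small Python sketch of the encoding defined in the PCIe Base Specification (the helper names are mine):

```python
# Max_Payload_Size encoding in the Device Control register (bits 7:5):
# field value 0 -> 128 B, 1 -> 256 B, 2 -> 512 B, ... up to 4096 B.
def mps_bytes(devctl):
    """Decode Max_Payload_Size (bytes) from a Device Control value."""
    return 128 << ((devctl >> 5) & 0x7)

def with_mps(devctl, nbytes):
    """Return devctl with the MPS field set for nbytes (128..4096)."""
    enc = (nbytes // 128).bit_length() - 1   # 128 -> 0, 256 -> 1, ...
    return (devctl & ~(0x7 << 5)) | (enc << 5)

print(mps_bytes(0x0000))            # 128
print(mps_bytes(with_mps(0, 256)))  # 256
```

So dropping to 128 bytes corresponds to clearing the MPS field to 0, which every PCIe device is required to support.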

 

The reason your system recovers is that a recovery necessitates flushing the buffers and resetting all credits to their starting values.

 

henry23
Visitor
12,594 Views
Registered: 09-18-2012

Mark,

 

Thanks for your help.

 

I think my issue is different. We found the TLP max payload size is 256 bytes on Intel Ivy Bridge and 128 bytes on Haswell. We check the MPS in the driver and program it into the FPGA, so it should not be an MPS issue.

 

I have tried forcing the link speed to Gen1 x8, and the issue is gone. It looks more like a PHY RX-side SI issue; what's wrong is that the IP core or the switch side does not handle the PHY receive problem correctly or robustly.

 

This board works without issue on another motherboard, whose PCIe lanes come directly from the CPU, so we have confirmed this is an SI issue. We can't find any layout issue on the FPGA board, so it is more likely a whole-channel issue, including the motherboard side.

 

We will try some PCIe PHY tests on the failing motherboard and try modifying the GTX receiver's attributes/ports, especially the RX equalizers. Hopefully that solves the issue; otherwise we will just degrade the speed to Gen1.

 

Thanks

Henry

0 Kudos