Contributor

Zynq UltraScale+ NVMe Driver Interrupt Issue

Jump to solution

Hello, I'm using the 'DMA/Bridge Subsystem for PCI Express' core, acting as a root complex, mapped into my PS.

 

When I connected NVMe drives to the PCIe bus, the drives were enumerated, but access was painfully slow (mounting took 26 minutes), with the following message logged to dmesg every 60 seconds:

nvme nvme0: I/O xxx QID x timeout, completion polled

 

It seemed like the PS was missing the MSI interrupts altogether.
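For anyone else debugging this, a quick sanity check (just a sketch; it assumes the drive enumerates as nvme0) is to watch the NVMe queue interrupt counters in /proc/interrupts while I/O is outstanding:

grep -i nvme /proc/interrupts                  # list the nvme MSI vectors and their per-CPU counts
watch -n 1 'grep -i nvme /proc/interrupts'     # counters that never increment during I/O suggest the MSIs are not reaching the PS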

 

I noticed that since I downloaded the git repository, a patch to the MSI driver had been committed: https://github.com/Xilinx/linux-xlnx/commit/73ef385ad0e6f7643d7617fc72e5a8c6514e0f44 . Adding this in seems to have helped a lot - I can now mount the drives and successfully write several GB of data, but every few seconds the above message is still logged to dmesg, and I cannot access the drive for several seconds/minutes.
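For reference, a minimal sketch of one way to check whether that commit is already in a checked-out linux-xlnx tree, and to pull it in if it is not (the hash is taken from the link above; adjust for your own branch and workflow):

cd linux-xlnx
# exits successfully if the fix is already an ancestor of the current HEAD
git merge-base --is-ancestor 73ef385ad0e6f7643d7617fc72e5a8c6514e0f44 HEAD && echo "MSI fix already present"
# otherwise, bring in just that change
git cherry-pick 73ef385ad0e6f7643d7617fc72e5a8c6514e0f44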

 

I am also finding that the link between my root complex and the external PCIe switch is negotiating down to Gen 1 from Gen 3, but I believe this is a separate problem.
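For reference, the negotiated speed and width can also be confirmed from the Linux side (a sketch; substitute the root port's actual bus/device/function for 00:00.0, and note the sysfs attributes need a reasonably recent kernel):

lspci -vv -s 00:00.0 | grep -E 'LnkCap|LnkSta'             # advertised vs. negotiated link speed/width
cat /sys/bus/pci/devices/0000:00:00.0/current_link_speed
cat /sys/bus/pci/devices/0000:00:00.0/current_link_width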

 

Any idea how I can solve this?

Xilinx Employee

Hi @josh_tyler,

 

It sounds like you are experiencing the issue described here:

 

https://www.xilinx.com/support/answers/71106.html

 

We would *strongly* recommend ensuring you are on Vivado and PetaLinux 2018.1 and applying the above. Please note that, from a PCI Express perspective, there is a limit on the number of downstream devices with respect to multi-vector MSI. Please carefully review the steps to apply, as there is an IP change, a port/design addition, a driver patch, and a DTG edit (depending on your port/design). There is a simple example in the text of the Answer, and an advanced example available via PDF.

 

On the link training issue - I would recommend putting in a JTAG Debugger to look at the LTSSM and see where the Gen1 -> Gen3 training is failing back.  That functionality is described in PG194 (Product Guide 194) and expanded on in https://www.xilinx.com/support/answers/68134.html.   The hierarchy insertion of the JTAG Debugger has changed (it is now internal to the core) but the set-up is the same as described. 

 

For other known issues with the DMA/Bridge Subsystem - please reference Xilinx Answer 65443, and for other known issues with the PL Root Port driver and IP/driver interaction, please reference Xilinx Answer 70702. 

Contributor

Thank you @bethe, AR71106 has solved the Linux interrupt issue! I am using Vivado 2018.1; I'm not using PetaLinux, but I am using a Linux kernel built from the latest https://github.com/Xilinx/linux-xlnx/ (which appears to incorporate the PetaLinux patch).

 

As for the link speed training problem, I've put in an IBERT core and the JTAG debugger. The reset sequence looks good, and the LTSSM state machine follows the sequence below (found using a combination of the Tcl scripts and an ILA):

 

Detect -> Polling -> Configuration -> L0 -> R.Lock -> R.Eq -> R.Speed -> R.Lock -> R.Cfg -> R.Speed -> R.Lock -> R.Cfg -> R.Idle -> L0.

 

My link partner is a PCI Express switch, and it is configured for Gen3, since the devices downstream of the switch successfully negotiate Gen3 links.

 

I am trying to use the 7 Series debug guide in conjunction with the UltraScale debug guide, but this is made slightly difficult by the fact that some of the signal names appear to differ between the cores (for example, I have no signal named pl_link_partner_gen2_supported).

 

The eye diagrams I get from the IBERT core look very clean (albeit at Gen 1 speeds), and at this stage I am only attempting to train a link of width 1 between the root complex and switch.

Xilinx Employee

Hi @josh_tyler,

 

Can you upload a failing LTSSM .dat file from the JTAG debugger? That will let us see the equalization-phase substate it is failing in. (The .dat file contains a bit more information than the drawing.)

 

Thanks!

Contributor

Hi @bethe,

 

The ltssm .dat file is attached.

 

Thank you!

 

(Rename the file from .dat.txt to .dat.)

Contributor (Accepted Solution)

The solution to my link training issue turned out to be setting "Link Partner TX Preset" to 5 in the GT Settings tab.

 

I hadn't previously changed this setting because I could not find any information about what it did (and the documentation recommends not changing it), but thanks to @bethe for providing insight. This option provides presets for transmit settings which the endpoint can request from the upstream device during phase 2 of equalisation. Preset 4 works well for 95%+ of links, but for some very low loss situations Preset 5 can work better.

 

In my case, our board has a very short, low loss, chip-to-chip connection between the FPGA and the switch, and so Preset 5 appears to make the link speed train correctly!
