QDMA character device not present

Posted by Contributor (registered 06-30-2014):

I'm attempting to follow this reference to repeat the various QDMA benchmarks on our host system (CentOS 7, 64-bit x86_64). The target platform is zynqmp-based, and we've loaded a basic PCIe+QDMA design using what settings we could from Table 2 in that link. Some of the tests from that link we can get to work; others we cannot, specifically those involving dmaperf from dma_ip_drivers/QDMA/linux-kernel. We're using the 2019.2 toolset, and the target is installed in a PCIe x8 slot in the host.

On the host, we've installed the kernel drivers, libraries, and tools:

sudo yum install -y kernel-devel libaio-devel
sudo yum groupinstall -y "Development Tools"

git clone -b 2019.2 https://github.com/Xilinx/dma_ip_drivers
cd dma_ip_drivers/QDMA/linux-kernel
make
sudo make install
sudo make install-mods
sudo modprobe qdma
shutdown -r now

No variation of triggering a PCI bus rescan caused the devices to be discovered and bound, so we had to reboot. Upon reboot, we can see that the four PCIe physical functions are discovered:
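(For reference, what we tried was along these lines, a minimal sketch using the standard sysfs rescan interface, run as root; the 10ee vendor ID matches the dmesg output further down:)

echo 1 > /sys/bus/pci/rescan     # ask the kernel to re-enumerate the bus after programming the FPGA
lspci -d 10ee:                   # check whether the Xilinx functions appeared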

# lspci -vm
# non-applicable entries omitted
Device:	b3:00.0
Class:	Memory controller
Vendor:	Xilinx Corporation
Device:	Device 9038
SVendor:	Xilinx Corporation
SDevice:	Device 0007
NUMANode:	0

Device:	b3:00.1
Class:	Memory controller
Vendor:	Xilinx Corporation
Device:	Device 9138
SVendor:	Xilinx Corporation
SDevice:	Device 0007
NUMANode:	0

Device:	b3:00.2
Class:	Memory controller
Vendor:	Xilinx Corporation
Device:	Device 9238
SVendor:	Xilinx Corporation
SDevice:	Device 0007
NUMANode:	0

Device:	b3:00.3
Class:	Memory controller
Vendor:	Xilinx Corporation
Device:	Device 9338
SVendor:	Xilinx Corporation
SDevice:	Device 0007
NUMANode:	0

And dmesg indicates that the qdma module found those devices and appears to have initialized correctly in auto mode:

# dmesg | grep qdma
[    6.815045] qdma_vf: loading out-of-tree module taints kernel.
[    6.848722] qdma_vf: module verification failed: signature and/or required key missing - tainting kernel
[    6.880469] qdma_vf:qdma_mod_init: Xilinx QDMA VF Reference Driver v2019.2.125.213.
[    6.918093] qdma:qdma_mod_init: Xilinx QDMA PF Reference Driver v2019.2.125.213.
[    6.928078] qdma:probe_one: 0000:b3:00.0: func 0x0/0x4, p/v 1/0,0xffff8cfcbf279140.
[    6.928083] qdma:probe_one: Configuring 'b3:00:0' as master pf
[    6.928086] qdma:probe_one: Driver is loaded in auto(0) mode
[    6.928089] qdma:qdma_device_open: qdma_pf, b3:00.00, pdev 0xffff8cfcbf1f7000, 0x10ee:0x9038.
[    6.928102] qdma_pf 0000:b3:00.0: enabling device (0140 -> 0142)
[    6.928229] qdma:qdma_device_attributes_get: qdmab3000-p0000:b3:00.0: num_pfs:4, num_qs:2048, flr_present:1, st_en:1, mm_en:1, mm_cmpt_en:0, mailbox_en:1, mm_channel_max:1, qid2vec_ctx:0, cmpt_ovf_chk_dis:1, mailbox_intr:1, sw_desc_64b:1, cmpt_desc_64b:1, dynamic_bar:1, legacy_intr:1, cmpt_trig_count_timer:1
[    6.928232] qdma:qdma_device_open: Vivado version = vivado 2019.1
[    6.928235] qdma:xdev_identify_bars: User BAR 2.
[    6.928238] qdma_dev_entry_create: Created the dev entry successfully
[    6.928242] qdma:intr_setup: current device supports only (8) msix vectors per function. ignoring input for (32) vectors
[    6.928267] qdma_pf 0000:b3:00.0: irq 62 for MSI/MSI-X
[    6.928279] qdma_pf 0000:b3:00.0: irq 63 for MSI/MSI-X
[    6.928291] qdma_pf 0000:b3:00.0: irq 64 for MSI/MSI-X
[    6.928305] qdma_pf 0000:b3:00.0: irq 65 for MSI/MSI-X
[    6.928316] qdma_pf 0000:b3:00.0: irq 66 for MSI/MSI-X
[    6.928327] qdma_pf 0000:b3:00.0: irq 67 for MSI/MSI-X
[    6.928338] qdma_pf 0000:b3:00.0: irq 68 for MSI/MSI-X
[    6.928349] qdma_pf 0000:b3:00.0: irq 69 for MSI/MSI-X
[    6.943891] qdma:qdma_device_open: 0000:b3:00.0, b3000, pdev 0xffff8cfcbf1f7000, xdev 0xffff8cde127c9000, ch 1, q 0, vf 0.
[    6.943970] qdma:probe_one: 0000:b3:00.1: func 0x1/0x4, p/v 1/0,0xffff8cfcbf2792c0.
[    6.943973] qdma:probe_one: Driver is loaded in auto(0) mode
[    6.943977] qdma:qdma_device_open: qdma_pf, b3:00.01, pdev 0xffff8cfcbf1b8000, 0x10ee:0x9138.
[    6.943990] qdma_pf 0000:b3:00.1: enabling device (0140 -> 0142)
[    6.944031] qdma:qdma_device_attributes_get: qdmab3001-p0000:b3:00.1: num_pfs:4, num_qs:2048, flr_present:1, st_en:1, mm_en:1, mm_cmpt_en:0, mailbox_en:1, mm_channel_max:1, qid2vec_ctx:0, cmpt_ovf_chk_dis:1, mailbox_intr:1, sw_desc_64b:1, cmpt_desc_64b:1, dynamic_bar:1, legacy_intr:1, cmpt_trig_count_timer:1
[    6.944034] qdma:qdma_device_open: Vivado version = vivado 2019.1
[    6.944037] qdma:xdev_identify_bars: User BAR 2.
[    6.944040] qdma_dev_entry_create: Created the dev entry successfully
[    6.944044] qdma:intr_setup: current device supports only (8) msix vectors per function. ignoring input for (32) vectors
[    6.944084] qdma_pf 0000:b3:00.1: irq 70 for MSI/MSI-X
[    6.944096] qdma_pf 0000:b3:00.1: irq 71 for MSI/MSI-X
[    6.944108] qdma_pf 0000:b3:00.1: irq 72 for MSI/MSI-X
[    6.944120] qdma_pf 0000:b3:00.1: irq 73 for MSI/MSI-X
[    6.944133] qdma_pf 0000:b3:00.1: irq 74 for MSI/MSI-X
[    6.944144] qdma_pf 0000:b3:00.1: irq 75 for MSI/MSI-X
[    6.944156] qdma_pf 0000:b3:00.1: irq 76 for MSI/MSI-X
[    6.944170] qdma_pf 0000:b3:00.1: irq 77 for MSI/MSI-X
[    6.944486] qdma:qdma_device_open: 0000:b3:00.1, b3001, pdev 0xffff8cfcbf1b8000, xdev 0xffff8cfcbaeaa000, ch 1, q 0, vf 0.
[    6.944538] qdma:probe_one: 0000:b3:00.2: func 0x2/0x4, p/v 1/0,0xffff8cfcbf279440.
[    6.944542] qdma:probe_one: Driver is loaded in auto(0) mode
[    6.944546] qdma:qdma_device_open: qdma_pf, b3:00.02, pdev 0xffff8cfcbf1b9000, 0x10ee:0x9238.
[    6.944552] qdma_pf 0000:b3:00.2: enabling device (0140 -> 0142)
[    6.944578] qdma:qdma_device_attributes_get: qdmab3002-p0000:b3:00.2: num_pfs:4, num_qs:2048, flr_present:1, st_en:1, mm_en:1, mm_cmpt_en:0, mailbox_en:1, mm_channel_max:1, qid2vec_ctx:0, cmpt_ovf_chk_dis:1, mailbox_intr:1, sw_desc_64b:1, cmpt_desc_64b:1, dynamic_bar:1, legacy_intr:1, cmpt_trig_count_timer:1
[    6.944580] qdma:qdma_device_open: Vivado version = vivado 2019.1
[    6.944583] qdma:xdev_identify_bars: User BAR 2.
[    6.944585] qdma_dev_entry_create: Created the dev entry successfully
[    6.944615] qdma:intr_setup: current device supports only (8) msix vectors per function. ignoring input for (32) vectors
[    6.944615] qdma_pf 0000:b3:00.2: irq 78 for MSI/MSI-X
[    6.944628] qdma_pf 0000:b3:00.2: irq 79 for MSI/MSI-X
[    6.944640] qdma_pf 0000:b3:00.2: irq 80 for MSI/MSI-X
[    6.944652] qdma_pf 0000:b3:00.2: irq 81 for MSI/MSI-X
[    6.944664] qdma_pf 0000:b3:00.2: irq 82 for MSI/MSI-X
[    6.944676] qdma_pf 0000:b3:00.2: irq 83 for MSI/MSI-X
[    6.944690] qdma_pf 0000:b3:00.2: irq 84 for MSI/MSI-X
[    6.944703] qdma_pf 0000:b3:00.2: irq 85 for MSI/MSI-X
[    6.944999] qdma:qdma_device_open: 0000:b3:00.2, b3002, pdev 0xffff8cfcbf1b9000, xdev 0xffff8cfcbaeac000, ch 1, q 0, vf 0.
[    6.945059] qdma:probe_one: 0000:b3:00.3: func 0x3/0x4, p/v 1/0,0xffff8cfcbf2795c0.
[    6.945061] qdma:probe_one: Driver is loaded in auto(0) mode
[    6.945065] qdma:qdma_device_open: qdma_pf, b3:00.03, pdev 0xffff8cfcbf1ba000, 0x10ee:0x9338.
[    6.945071] qdma_pf 0000:b3:00.3: enabling device (0140 -> 0142)
[    6.945098] qdma:qdma_device_attributes_get: qdmab3003-p0000:b3:00.3: num_pfs:4, num_qs:2048, flr_present:1, st_en:1, mm_en:1, mm_cmpt_en:0, mailbox_en:1, mm_channel_max:1, qid2vec_ctx:0, cmpt_ovf_chk_dis:1, mailbox_intr:1, sw_desc_64b:1, cmpt_desc_64b:1, dynamic_bar:1, legacy_intr:1, cmpt_trig_count_timer:1
[    6.945100] qdma:qdma_device_open: Vivado version = vivado 2019.1
[    6.945103] qdma:xdev_identify_bars: User BAR 2.
[    6.945106] qdma_dev_entry_create: Created the dev entry successfully
[    6.945110] qdma:intr_setup: current device supports only (8) msix vectors per function. ignoring input for (32) vectors
[    6.945133] qdma_pf 0000:b3:00.3: irq 86 for MSI/MSI-X
[    6.945145] qdma_pf 0000:b3:00.3: irq 87 for MSI/MSI-X
[    6.945156] qdma_pf 0000:b3:00.3: irq 88 for MSI/MSI-X
[    6.945170] qdma_pf 0000:b3:00.3: irq 89 for MSI/MSI-X
[    6.945182] qdma_pf 0000:b3:00.3: irq 90 for MSI/MSI-X
[    6.945195] qdma_pf 0000:b3:00.3: irq 91 for MSI/MSI-X
[    6.945207] qdma_pf 0000:b3:00.3: irq 92 for MSI/MSI-X
[    6.945220] qdma_pf 0000:b3:00.3: irq 93 for MSI/MSI-X
[    6.957611] qdma:qdma_device_open: 0000:b3:00.3, b3003, pdev 0xffff8cfcbf1ba000, xdev 0xffff8cde12080800, ch 1, q 0, vf 0.

One odd thing you'll notice is that despite using Vivado 2019.2 to create the design, the version reported back from the target enumerates as 2019.1. We're guessing this just means the core in that release reports the wrong version, and it probably isn't what is blocking us, because there are no other errors indicating that the probe quit early for any device (there are no "close" messages).

When we try to run the dmaperf examples, we have problems. For example, using the bi_mm_1_1_64 config modified for our bus, b3, we get this error:

# export PATH=$PATH:/usr/local/sbin
# dmaperf -c bi_mm_1_1_64
dmactl qdmab3000 q add idx 0 mode mm dir h2c
Zero Qs
dmactl qdmab3000 q start idx 0 dir h2c idx_ringsz 5
Zero Qs
Error: Cannot find /dev/qdmab3000-MM-0

So despite the lack of errors when the QDMA module is loaded, it does not create the character devices under /dev/qdma* for any of the functions, even though walking through the code suggests the probe completed for each device (otherwise we would see "close" messages with some error).
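(For completeness, the check is simply looking for the per-queue nodes under /dev, using the qdmab3000 naming that appears in the dmesg output above; on our host nothing is listed:)

ls -l /dev/qdmab3000*    # expected to show qdmab3000-MM-0 and friends once queues exist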

Nothing in the linked AR, or in the links it provides, has so far pointed to how to get the character devices to appear, and yet I've found a few forum posts on here indicating others do have those devices with earlier versions of Vivado (2018.3 after some patches, and 2019.1).

Is this broken in 2019.2?

Is there something else that must be done to make the character devices appear on the host system?

Thanks in advance.

 

Accepted Solution, from Contributor (registered 06-30-2014):

It would be nice if the AR detailing those test results also detailed the precise steps for recreating the tests, with notes on where some value X should be tied to the number of processor cores, etc. That's all we're really after here: a way to repeat the tests so we can get an idea of how well this card and chassis perform with a baseline configuration before we start adding to it.

After days of chipping away at this, I became suspicious that dmactl is supposed to create those devices as needed, but I couldn't quite stitch together how to make it happen until I stumbled across another forum post where the user talked about setting qmax:

 

echo N > /sys/bus/pci/devices/0000:AB:CD.E/qdma/qmax

where the value N is in the range 0 to (512 - 8 * NUM_VF), according to dma_ip_drivers/QDMA/linux-kernel/docs/README, section 2.1. With that set to a non-zero value, dmactl is now able to create the character devices.
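As a concrete illustration for the master PF in this thread (b3:00.0), the sketch below uses 8 as an arbitrary non-zero value within the documented limit; run as root:

echo 8 > /sys/bus/pci/devices/0000:b3:00.0/qdma/qmax   # allocate 8 queues to this function
cat /sys/bus/pci/devices/0000:b3:00.0/qdma/qmax        # read it back to confirm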

 

However, that's only part of the story, since certain combinations of qmax value and dmaperf configuration cause the computer to lock up with:

 

Message from syslogd@host_computer at DATE TIME ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [dmaperf:4905]

Also, I was originally trying any dmaperf configuration I could, because I could get none of them to work. Once I was able to get the character devices created, it became clear that the design used for the AR's testing had to be the combined memory-mapped and streaming design, not simply streaming as seemed to be implied earlier in the instructions.

Summary: if you're finding that the /dev/qdma* character devices don't exist while trying to use dmaperf, it's probably because you need to set qmax after loading the driver, one of the many steps not covered in the AR.
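Putting the whole host-side sequence together, a sketch under the assumptions above (the value 8 and the bi_mm_1_1_64 config are just the examples used earlier in this thread):

sudo modprobe qdma                                              # load the PF driver
echo 8 | sudo tee /sys/bus/pci/devices/0000:b3:00.0/qdma/qmax   # give the function some queues
dmaperf -c bi_mm_1_1_64                                         # dmactl can now create /dev/qdmab3000-MM-0 etc.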

 


Reply from Contributor (registered 01-13-2020):

qmax should be initialized for any use of QDMA: dmaperf, dma_to_device, dma_from_device, etc. It also needs to be re-initialized after a software reset (see the example design register set).

Reply from Contributor (registered 06-30-2014):

Are you saying it should have been initialized automatically by using those tools, or are you restating what I said in my previous post, that one must set the value manually (and that the AR doesn't direct the user to do that before running the tests it describes)?

Reply from Contributor (registered 01-13-2020):
Sorry for the inaccuracy. I am restating what you said in your previous post, with a small clarification.
qmax should be a power of 2, and it should be initialized (distributed among the PCIe functions) manually by the user immediately after loading the qdma driver or after a soft reset of the QDMA IP core.
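For illustration: assuming the 2048 total queues reported in the dmesg output earlier and no VFs, an even power-of-2 split across the four PFs on bus b3 could be done like this (just a sketch, run as root):

for fn in 0 1 2 3; do
    echo 512 > /sys/bus/pci/devices/0000:b3:00.${fn}/qdma/qmax
done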
Reply from Contributor (registered 06-30-2014):

Ah, thank you for the insight: the power-of-2 requirement is a new detail I hadn't run across yet.

This is somewhat off-topic, but as far as I can tell from trying to make these dmaperf configs work on our system, this value needs to be set to at least the number of queues used in the test. So if it's an 8-queue test, we must have qmax=8 on the bus:device.function in question. To calculate the number of threads that will be used in a given test, we use:

 

total_threads = num_queues * num_threads * (direction == bi ? 2 : 1)

An 8-queue bidirectional test therefore takes 32 threads. Naturally, if total_threads exceeds the number of threads the host can run concurrently, we get a kernel timeout and lock-up.
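A tiny helper, purely hypothetical and just restating the formula above, makes the arithmetic explicit (the 32-thread figure above corresponds to num_threads=2):

total_threads() {   # args: num_queues num_threads direction
    local mult=1
    [ "$3" = "bi" ] && mult=2
    echo $(( $1 * $2 * mult ))
}
total_threads 8 2 bi    # prints 32
total_threads 4 1 bi    # prints 8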

Has that been your experience?

 

Reply from Contributor (registered 01-13-2020):

I don't have much research experience with dmaperf threads.
We found that above roughly 16 total threads, dmaperf has problems deleting queues. We also found that for the QDMA MM interface, throughput increases as the number of PFs increases up to the maximum of 4, whereas for QDMA ST, throughput increases as the number of queues increases with a single PF.
Now I use dmaperf for tests with 1 thread per queue per direction. For example:
q=0:3, PF=0:0, dir=bi, threads=1.
Total threads = 8. Such a configuration hits the queue-removal problem only very rarely.

Reply from Contributor (registered 06-30-2014):

If you can share: how many physical CPU cores does your system have?

We have 8 physical cores, and whether HyperThreading is enabled or not, we start seeing this problem when the number of threads exceeds the physical core count.
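(For anyone comparing, a quick way to see the physical-core versus hardware-thread counts on the host is the standard lscpu tool; nothing QDMA-specific here:)

lscpu | egrep '^CPU\(s\)|Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)'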

FWIW, so far anyway, the DPDK driver and test tooling described in the AR are not exhibiting this problem at all.

Reply from Contributor (registered 01-13-2020):

We have 4 physical cores (8 threads with HyperThreading) in an Intel processor with PCIe 2.0. The card is connected directly to the processor's x16 lanes, without any PCIe switches.
We also have a modern Xeon with PCIe 3.0, but it hasn't been tested yet.
IMHO, I didn't find a strong correlation between the number of physical cores and the number of queues at which invalidation fails for the Linux kernel driver, but I did find that the probability of invalidation problems rises as the number of queues rises.
If I remember right, with DPDK we did not encounter any problems with queue invalidation; it seems fast and accurate.
