adatatw
Participant

Access to QDMA register blocks at offset 0x1400 or 0x2000 always fails


Dear all,

I'm trying to bring up a simple test design using the QDMA IP, but every access that I make (using "dd" command with /dev/mem) to certain blocks of registers fails (timing out, presumably) - specifically, the register blocks at 0x1400 and 0x2000. Accesses to registers 0 - 0x13FF work fine.

The QDMA PCIe endpoint appears to be 100% reliable at the link layer; as far as I can tell the LTSSM never exits L0 and it always links up at the expected speed and number of lanes. Failed reads do not cause the QDMA endpoint to stop responding to PCIe configuration transactions.

Reads of the registers at 0x1400 or 0x2000 always return 0xFFFFFFFF, and if enough such failed reads occur, the machine eventually dies with a machine check (i.e. the machine decides it has had enough). This behaviour is entirely consistent over bitstreams generated with Vivado 2018.3, 2019.1.3 and 2019.2 using either our own simple design or the Xilinx QDMA IP example design. Both designs fully meet timing.

On the other hand, we have an UltraRAM memory attached to the M_AXIL and M_AXI_BRIDGE interfaces, and we can read and write this without any problem using the same /dev/mem method.

Thus it seems there is a functional problem associated with certain register blocks in the QDMA endpoint.

Hardware is our own VU3P board running at Gen 3 speed, 16 lanes, hosted in an x86_64 machine running various Linux distributions. An XDMA-based design works perfectly.

Any ideas about what might be causing this? From the behaviour seen, I get the impression that the problematic register blocks might live in a different clock domain to the other register blocks, and said clock domain is not responding (held in reset?). Could this hypothesis be correct?

Thanks in advance for any pointers.

 

deepeshm
Xilinx Employee

Could you provide the details of the register blocks in reference to PG302? 

In PG302 (the Product Guide for the QDMA IP), the block at 0x1400 is QDMA_TRQ_EXT_0, which is reserved.

Thanks.

adatatw
Participant

@deepeshm

Thank you for the response.

Sorry - in my original post I should have written 0x2400 instead of 0x2000. When using dd with /dev/mem, we cannot successfully read registers such as the RTL Version Register at 0x2414. Similarly, when using the Xilinx driver, we see that its attempts to access registers in the block at 0x2400 fail. In either case, reads to this block of registers appear to time out (0xFFFFFFFF data returned).

Regarding the unused region at 0x1400 - while I understand that this region is unused, it seems to me rather poor design to have "holes" in a BAR, to which accesses simply fail to complete. At the very least, reads should complete with arbitrary data and writes should be discarded.

deepeshm
Xilinx Employee

Thank you. We will look into your suggestion regarding access to the 0x1400 address space. Let us check whether we can reproduce the issue with the 0x2400 address. Could you confirm whether this is the only address block where you see the issue, or whether you see it on other addresses as well?

Thanks.

adatatw
Participant

@deepeshm 

Here are the results of attempting to read using /dev/mem from the following address ranges:

0x0 to 0x13FF (various register blocks) - OK, magic numbers in registers look good

0x1400 to 0x140F* (reserved region) - all bytes returned are 0xFF, probable timeout

0x2400 to 0x240F* (QDMA_PF_MAILBOX) - all bytes returned are 0xFF, probable timeout

0x10000 to 0x17FFF (QDMA_TRQ_MSIX) - OK, values consistent with BIOS setting mask bit for each MSI-X table entry

0x18000 to 0x1800C (QDMA_TRQ_SEL_QUEUE_PF) - OK, all registers 0 as per post-reset state

* = I did not read further because a machine check crashes the machine after a certain number of read failures. I rebooted the machine between testing 0x1400 and 0x2400 in order to return the endpoint to a working state.

Another piece of information: if any failure occurs when accessing BAR0, the "good" regions become "bad" as well. So it seems that the logic associated with BAR0 locks up on the first access to a "bad" region of BAR0 (e.g. 0x2400).

Is it possible that the machine that I am using needs to set something up in PCIe configuration space, possibly related to virtualization, in order to deassert a reset somewhere in the QDMA IP? It is a Supermicro X10SAT motherboard. The QDMA IP is not configured to have any virtual functions, though.

deepeshm
Xilinx Employee

For the following, could you please try by enabling mailbox in the core configuration GUI?

0x2400 to 0x240F* (QDMA_PF_MAILBOX) - all bytes returned are 0xFF, probable timeout

Thanks.


adatatw
Participant

@deepeshm 

I have enabled that option now, and it fixes the issue of the QDMA PCIe endpoint hanging up on access to the 0x2400 register block. The Xilinx QDMA driver also seems to work now.

The Xilinx QDMA driver seems to assume that the mailbox registers are enabled, but the "Enable mailbox among functions" checkbox is unchecked by default in the IP. So users are left to guess why their design crashes the Xilinx driver, even with a 100% default QDMA configuration.

Having holes in a BAR containing registers, where the holes swallow up unsuspecting PCIe transactions (never to be seen again), is not a sane design. QDMA is not the only Xilinx IP that behaves this way; in the XDMA IP, there is also a large hole in the register BAR which behaves similarly.

Anyway, ranting aside, thank you for your help @deepeshm.
