xivar (Observer)

QDMA C2H completion interface locks up when using multiple queues

I have a weird issue with QDMA v3.0: when I send packets to 15 or more queues, the C2H completion interface locks up. I am using streaming mode for C2H (and H2C), and when this lock-up occurs c2h_cmpt_ready gets stuck low (as seen in the ILA capture below) and never recovers.

[ILA capture: cmpt_tready_stuck_at_low.png]

Since I couldn't find many clues in the user logic that interfaces to QDMA, I changed the value of MDMA_PFCH_CACHE_DEPTH to 32 (from the default 16). With this change it worked fine until I started using 34/35 queues; then it locked up again. So I changed the prefetch cache depth to 64, and with that I could activate 40 queues without any lock-up. I haven't tried more queues, because 40 is the maximum I need for now. However, my intuition says it could very well break again when I activate more than 64 queues.

According to pg302, MDMA_PFCH_CACHE_DEPTH is a performance-tuning parameter, so I think something else is going wrong in the logic/IP configuration. Has anyone here seen anything like this? Can someone help me resolve this issue, please?

Some additional points to consider.

1. With MDMA_PFCH_CACHE_DEPTH=16, fewer than 15 active queues work flawlessly.

2. When 15 or more queues are “activated” (at the same time or at random times), the C2H CMPT interface breaks. “Activated” here simply means that C2H has received at least one packet with that QID.

3. Out of the 40 queues in my design, it doesn't matter which combination of 15 queues I use; it always breaks once 15 queues are active.

4. Traffic is at a low rate.

Thank you


12 Replies
markramona (Explorer)

Are you incrementing the global packet ID?

I'm able to use QDMA ST-C2H with about 64 queues (very likely more).

xivar (Observer)

@markramona yes, I am incrementing `c2h_cmpt_ctrl_wait_pld_pkt_id` after every completion entry write-back. I can also confirm that there are enough credits available for the queues.

Below is a screenshot of multiple packet transfers on different QIDs.

[ILA capture: qdma_c2h_st.png]

This is for 14 queues; I used QIDs 0-13 for this capture. They all work fine, with no packet drops or lock-ups. However, once I send a packet to another QID (one that's not among 0-13), the CMPT interface becomes non-responsive afterwards, and C2H locks up because of that. As I mentioned, it doesn't matter which QID combinations I use.

It's interesting to know that you are able to use 64 or more queues without any issues. It tells me the problem might be somewhere in the user logic (or in the IP configuration). There are a few things that are different in my implementation of the user logic. I don't use a FIFO (as in the speed example) to buffer up C2H packets or CMPT write-backs and transfer them at a later time. Instead, I follow each C2H packet transfer immediately with the corresponding CMPT entry write-back, and this is true for every C2H packet transfer (as you can see from the screenshot above and in the sketch below). I am not sure if that's the problem, though.
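To make that concrete, here is a stripped-down sketch of the scheme (not my actual RTL; apart from c2h_cmpt_ctrl_wait_pld_pkt_id, the signal names, the 16-bit ID width, and the start value are placeholders, and CMPT back-pressure handling is omitted):

  // One packet ID per C2H packet; the completion entry is issued together with
  // the packet's last data beat, with no intermediate FIFO.
  reg [15:0] pkt_id;

  always @(posedge axi_aclk) begin
      if (!axi_aresetn)
          pkt_id <= 16'd1;                                          // assumed start value
      else if (c2h_tvalid && c2h_tready && c2h_tlast)
          pkt_id <= pkt_id + 1'b1;                                  // advance once per packet
  end

  assign c2h_cmpt_ctrl_wait_pld_pkt_id = pkt_id;                    // CMPT waits for this payload
  assign c2h_cmpt_tvalid = c2h_tvalid && c2h_tready && c2h_tlast;   // write-back follows the packet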

markramona (Explorer)

We definitely have a FIFO for the completion, but we also throttle the whole bus if our completion FIFO is full.
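In pseudo-RTL the idea is roughly this (a sketch only; all names are placeholders, not our actual code):

  // Hold off new C2H packets while the completion FIFO is nearly full, so every
  // packet that does go out can have its completion entry queued behind it.
  wire   stall            = cmpt_fifo_almost_full;                   // FIFO status flag
  assign start_new_packet = have_packet && !stall;                   // throttle point
  assign cmpt_fifo_wr_en  = c2h_tvalid && c2h_tready && c2h_tlast;   // one entry per packet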

dmsspb (Contributor)

Have you tried QDMA v4.0 with the updated driver?

markramona (Explorer)

We also do a few more things:

  • We only send packets 4096 bytes long.
  • We also add the magic numbers from their example design to the completion data.

Can you show us what you send in your completion message?

xivar (Observer) [Accepted Solution]

@dmsspb  @markramona

I haven't tried version 4.0 yet, and my completion data is very different from what is in the example design. But I may have just found a likely reason for my observation.

In the driver, the part of the code that configures the register QDMA_C2H_PFCH_CFG (pg302 v3.0, page 147) was removed for some reason. This means the register is left at its default value, and the default assumes a prefetch cache depth of 64 (I think). However, in my IP instantiation the prefetch cache depth is configured as 16, so the register configuration is clearly wrong here.

I can't explain exactly how that relates to the issue I am seeing, but it looks like the probable cause. I will run some tests after fixing the register configuration and will mark this post as the accepted solution once I confirm my theory.


dmsspb (Contributor)

We use an 8-byte completion for C2H-ST with bits [19:4] = packet length. The other bits of the completion we use for our own purposes.
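Packed into a 64-bit word, that layout looks roughly like this (only the [19:4] length field is fixed; the other names are placeholders for our own flags):

  // 8-byte completion entry: bits [19:4] carry the packet length,
  // bits [3:0] and [63:20] are user-defined.
  wire [63:0] cmpt_entry = {user_flags[43:0], pkt_len[15:0], user_low[3:0]};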

markramona (Explorer)

I see that we have the following:

  assign m_axis_c2h_cmpt_tdata = {{492{1'b0}}, packet_ctrl_len, 4'h8};  // 492 zero bits + 16-bit packet_ctrl_len + 4'h8 fill the 512-bit bus

Not sure if that helps you.

markramona (Explorer)
output [511:0] m_axis_c2h_cmpt_tdata,

xivar (Observer)

I ran a bunch of tests over extended periods of time after correctly populating the register QDMA_C2H_PFCH_CFG, and I haven't seen any C2H lock-up since. So I am happy to say this issue has been resolved; after all, it was an error introduced on my side.

I mentioned that I use a different completion entry format in my application. It looks like this:

m_axis_c2h_cmpt_tdata = { 224'b0, pkt_length[15:0], pkt_checksum_fail, pkt_err, pkt_last, pkt_first, num_disc_used[3:0], cmpt_entry_type[7:0] };

The completion entry size is also changed to 8B. We had to tweak the corresponding part of the driver to make this work, of course, but QDMA seems fine with that.

Thank you for all the help @markramona  @dmsspb 

markramona (Explorer)

Are you saying that you were configuring the driver wrong with respect to QDMA_C2H_PFCH_CFG (pg302 v3.0, page 147)? Or was this a bug introduced in recent updates to the QDMA kernel driver?

xivar (Observer)

@markramona 

Yes, this was a mistake introduced by us in the driver. In the process of modifying the driver, part of the code that populates QDMA_C2H_PFCH_CFG (prefetch configuration) was commented out.

I do not think there is any bug in the Xilinx QDMA driver. Apologies if this has caused any confusion or trouble.