Adventurer
3,009 Views
Registered: ‎01-26-2017

Petalinux 2017.2 Kernel panic in "skb_copy_and_csum_dev()" inside "xilinx_axienet_main.c"


Hi,

 

I am observing a kernel panic with 100% reproducibility when passing traffic through the 10G PCS+MAC, which is connected through AXI DMA to the PS on the Zynq UltraScale+ ZCU102 board.

 

I've traced the kernel panic to a memcpy originating from the "skb_copy_and_csum_dev()" call inside "drivers/net/ethernet/xilinx/xilinx_axienet_main.c" when using an MTU of 9000.

 

Can you please help me find a workaround for this?

 

Here are some specifics of my test setup:

- I am using the ZCU102 board in a configuration similar to the one provided in XAPP1305: the PS is connected to an AXI DMA block, which is connected to a 10G PCS+MAC Ethernet block. The 10G interface is eth1 in Linux

- Vivado and Petalinux are both 2017.2

 

After booting petalinux, this is how I reproduce the kernel panic:

ifconfig eth1 mtu 9000

ping 192.168.1.1 # this pings my other board successfully.

ping 192.168.1.1 -s 8800 # this pings my other board successfully.

ping 192.168.1.1 -s 9000 # this command causes a kernel panic.

 

I was able to look at the kernel panic stack trace and find that the last called function was a memcpy from a xmit function within drivers/net/ethernet/xilinx/xilinx_axienet_main.c. By inserting printk statements I was able to narrow down the line causing the panic to a skb_copy_and_csum_dev() call.

 

Also, the same Vivado design implemented in 2016.4 from XAPP1305 executes "ping -s 9000" without any issues. That version uses a patch to implement the 10G core driver functionality within drivers/net/ethernet/xilinx/xilinx_axienet_main.c and has a very different infrastructure from the 2017.2 version.

 

Thanks,

Adam


10 Replies
Adventurer
2,971 Views
Registered: ‎01-26-2017

Update:

 

After more tracing, it looks like a line in net/core/skbuff.c calls the memcpy that causes the kernel panic. Here is a summary of what I have so far. Can someone please suggest how I can change the driver to avoid this error, or whether there is something within Vivado or Petalinux I can change to avoid it?

 

---

 

in "axienet_start_xmit" in drivers/net/ethernet/xilinx/xilinx_axienet_main.c:
skb_copy_and_csum_dev(skb, q->tx_buf[q->tx_bd_tail]); /* THIS LINE CAUSES PANIC */

in "skb_copy_and_csum_dev" in net/core/skbuff.c:
skb_copy_from_linear_data(skb, to, csstart); /* THIS LINE CAUSES PANIC */

in "skb_copy_from_linear_data" in include/linux

static inline void skb_copy_from_linear_data(const struct sk_buff *skb,
                                             void *to,
                                             const unsigned int len)
{
        memcpy(to, skb->data, len);
}

 

---

 

I printed out the value of csstart before the crash and it is 9010.
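
For readers following the trace: in the mainline skb_copy_and_csum_dev(), csstart is the checksum start offset for CHECKSUM_PARTIAL packets and the linear data length (skb_headlen) otherwise, so 9010 is most likely the number of bytes the memcpy writes into the driver's TX copy buffer. The following is a minimal userspace sketch, not driver code, of a copy that refuses to write past a fixed-size slot. The names bounce_copy and slot_len are hypothetical and do not exist in the kernel or in the Xilinx driver.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the kernel or the Xilinx driver):
 * copy `len` bytes into a fixed-size slot only if they fit.
 * Returns 0 on success, -1 if the copy would overflow the slot.
 */
static int bounce_copy(void *slot, size_t slot_len,
                       const void *src, size_t len)
{
    if (len > slot_len)
        return -1;      /* refuse: writing would run past the slot */
    memcpy(slot, src, len);
    return 0;
}
```

With a check like this, an oversized copy would surface as an error return instead of corrupting memory beyond the slot. Whether the driver should drop such a packet or handle it some other way is a separate design decision that this sketch does not address.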

 

Please let me know if you have any suggestions to help fix this.

 

Thanks,

Adam

Adventurer
2,931 Views
Registered: ‎01-26-2017

Hi,

 

Has anyone at Xilinx had a chance to look at this? Any suggestions or recommendations to try?

 

Thanks!

Adam

Adventurer
2,886 Views
Registered: ‎01-26-2017

Hi,

 

Checking in again to see if there are any recommendations or suggestions from Xilinx to help with this issue.

 

Thanks,

Adam

Xilinx Employee
2,844 Views
Registered: ‎02-20-2014

Hi Adam,

 

Have you tried the 2017.1 release of XAPP1305? It looks like the last update was in 2017.1.

 

Can you confirm whether the following works?

ping 192.168.1.1 -s 8800 # this pings my other board successfully.

 

Based on the driver implementation, if DRE is not enabled in the hardware, the TX buffers are allocated as:

XAE_MAX_PKT_LEN * TX_BD_NUM where XAE_MAX_PKT_LEN = 8192 
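
To relate the 8192-byte figure to the csstart value of 9010 reported earlier in the thread, here is a small arithmetic sketch. It assumes standard header sizes (14-byte Ethernet header without a VLAN tag, 20-byte IPv4 header without options, 8-byte ICMP echo header) and the usual rule that IPv4 fragment data is a multiple of 8 bytes. The names first_frame_len, fits_tx_slot and TX_SLOT_LEN are illustrative only; TX_SLOT_LEN simply mirrors the XAE_MAX_PKT_LEN value quoted above.

```c
#define ETH_HDR_LEN   14u    /* Ethernet header, no VLAN tag (assumption) */
#define IPV4_HDR_LEN  20u    /* IPv4 header without options (assumption) */
#define ICMP_HDR_LEN   8u    /* ICMP echo header */
#define TX_SLOT_LEN 8192u    /* mirrors XAE_MAX_PKT_LEN = 8192 from the post */

/*
 * Length in bytes of the first Ethernet frame produced by
 * `ping -s payload` on an interface with the given MTU.
 * Assumes mtu > IPV4_HDR_LEN.
 */
static unsigned int first_frame_len(unsigned int mtu, unsigned int payload)
{
    unsigned int ip_data = payload + ICMP_HDR_LEN;
    unsigned int max_data = mtu - IPV4_HDR_LEN;

    if (ip_data > max_data)
        ip_data = max_data & ~7u;   /* first fragment: multiple of 8 bytes */

    return ETH_HDR_LEN + IPV4_HDR_LEN + ip_data;
}

/* Nonzero if a frame of this length fits in one TX copy slot. */
static int fits_tx_slot(unsigned int frame_len)
{
    return frame_len <= TX_SLOT_LEN;
}
```

Under these assumptions, first_frame_len(9000, 9000) evaluates to 9010, which matches the printed csstart and is larger than one 8192-byte slot. By the same arithmetic, -s 8800 also produces a frame larger than 8192, so frame size alone does not decide whether the panic occurs; it also depends on whether the driver takes its copy path for that packet.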

 

Thanks,

Radhey

Adventurer
2,833 Views
Registered: ‎01-26-2017

@radheys

 

Thanks for the reply.

 

Yes, I used the xapp1305 2017.1 version as my starting point.

 

The "ready to test" binary files in the xapp1305 2017.1 package have a limit on the maximum MTU (by default they are set to 1500, and you can change them to values less than ~1024). I increased the FIFO sizes in the xapp1305 2017.1 design to allow higher MTU values, and I observe the kernel panic on 2017.1 as well as after migrating the project to 2017.2. The ping -s 8800 works on both the 2017.1 and 2017.2 designs I have. I described the specific setup I am using more thoroughly here:

 

https://forums.xilinx.com/t5/Networking-and-Connectivity/Is-there-petalinux-2017-2-support-for-10G-Ethernet-Subsystem-v2/td-p/789848

 

I checked my petalinux /components/plnx_workspace/device-tree-generation/plnx_aarch64-system.dts files and I see in the ethernet@8100000 device tree entry:

"xlnx,include-dre;"

 

However, when I inserted printk statements into xilinx_axienet_main.c, I saw the skb_copy_and_csum_dev() function being executed, which eventually leads to the kernel panic. The if statement just above this line indicates that q->eth_hasdre is FALSE, which seems to contradict the "xlnx,include-dre;" line in the petalinux device tree. I copied the relevant code below:

 

 

	if (!q->eth_hasdre &&
	    (((phys_addr_t)skb->data & 0x3) || (num_frag > 0))) {
		skb_copy_and_csum_dev(skb, q->tx_buf[q->tx_bd_tail]);

		cur_p->phys = q->tx_bufs_dma +
			      (q->tx_buf[q->tx_bd_tail] - q->tx_bufs);

		if (num_frag > 0) {
			pad = skb_pagelen(skb) - skb_headlen(skb);
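
The excerpt above is cut off, so the following userspace sketch models only the condition that guards the copy. It is meant to make the decision easier to reason about, not to reproduce the driver. The function name takes_copy_path and its parameters are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of the condition quoted above (names are illustrative):
 * the copy into the driver's TX buffer happens only when DRE is reported
 * as absent AND the packet is either not 4-byte aligned or has fragments.
 */
static bool takes_copy_path(bool has_dre, uintptr_t data_addr,
                            unsigned int num_frag)
{
    return !has_dre && ((data_addr & 0x3) || num_frag > 0);
}
```

With has_dre set to true the function returns false for every address and fragment count, which is why the value the driver reads into q->eth_hasdre determines whether skb_copy_and_csum_dev() is ever reached from this branch.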

 

 

 

Xilinx Employee
2,821 Views
Registered: ‎02-18-2014

Hi,

 

Does the DMA node have the xlnx,include-dre; property?

The driver expects the xlnx,include-dre; property on the DMA node.

 

https://github.com/Xilinx/linux-xlnx/blob/master/drivers/net/ethernet/xilinx/xilinx_axienet_main.c#L3404

q->eth_hasdre = of_property_read_bool(np,
  "xlnx,include-dre");

 

Could you please add the xlnx,include-dre; property to the DMA node if it is not there and let me know what you observe when testing?

 

Kedar....

Xilinx Employee
2,815 Views
Registered: ‎02-20-2014

Hi Adam,

 

Thanks for the clarification. Looks like "xlnx,include-dre" is not set properly.

To enable DRE support "xlnx,include-dre" should be added in DMA channel node.

 

Example:

axi_dma_0: dma@80000000 {
	<snip>
	dma-channel@80000000 {
		compatible = "xlnx,axi-dma-mm2s-channel";
		dma-channels = <0x1>;
		interrupts = <0 89 4>;
		xlnx,datawidth = <0x40>;
		xlnx,device-id = <0x0>;
		xlnx,include-dre ;
	};
	dma-channel@80000030 {
		compatible = "xlnx,axi-dma-s2mm-channel";
		dma-channels = <0x1>;
		interrupts = <0 90 4>;
		xlnx,datawidth = <0x40>;
		xlnx,device-id = <0x0>;
		xlnx,include-dre ;
	};
};

 

Debug Flow (Please provide your observation)

a) Add xlnx,include-dre in DMA node and see if it works? Please share your dts for reference.

b) Add a print in axi ethernet driver to print the value of q->eth_hasdre after DT parsing.

c) In function axienet_start_xmit()  add print to display TX queue index.

 

u16 map = skb_get_queue_mapping(skb); /* Single dma queue default */

pr_info("skb_map index %u\n", map);

 

Thanks,

Radhey

 

 

 

Xilinx Employee
4,329 Views
Registered: ‎02-18-2014

Small correction on Radhey's comment 

To enable DRE support, the "xlnx,include-dre" property should be added to the DMA node, not to the DMA channel nodes.

 

Example DT node for  DMA:

axi_eth_0_dma: dma@80040000 {
	#dma-cells = <1>;
	clock-names = "s_axi_lite_aclk";
	clocks = <&misc_clk_0>;
	compatible = "xlnx,eth-dma";
	interrupt-parent = <&gic>;
	interrupts = <0 89 4 0 90 4>;
	reg = <0x0 0x80040000 0x0 0x10000>;
	xlnx,include-dre ;
};

(Accepted solution)

Adventurer
2,801 Views
Registered: ‎01-26-2017

@appanad

 

Thanks!!! Adding "xlnx,include-dre" in the DMA node fixed my issue! I can now set the mtu to 9000 and send many packets without experiencing a kernel panic.

 

@radheys

 

Thanks for the "include-dre" suggestion!

 

The petalinux flow included the "include-dre" option in the ethernet node and in the DMA S2MM and MM2S channel nodes, but NOT in the DMA node, as mentioned by appanad.

 

Before I added include-dre to the DMA node, q->eth_hasdre was 0. After adding "include-dre" to the DMA node, q->eth_hasdre is 1 and the kernel panics went away.
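
The behaviour described here follows from how a boolean device tree property is read: of_property_read_bool() reports whether the property is present on the one node it is given, so the same property on a sibling or child node has no effect. Below is a minimal userspace model of that lookup. The struct dt_node type and the node_has_prop function are hypothetical stand-ins, not kernel APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node (not struct device_node). */
struct dt_node {
    const char *name;
    const char *const *props;   /* NULL-terminated list of property names */
};

/*
 * Models a boolean property lookup: true only if `prop` is present
 * on this exact node. Child and sibling nodes are not consulted.
 */
static bool node_has_prop(const struct dt_node *np, const char *prop)
{
    for (size_t i = 0; np->props && np->props[i]; i++)
        if (strcmp(np->props[i], prop) == 0)
            return true;
    return false;
}
```

In this model, a DMA node whose property list lacks "xlnx,include-dre" yields false even when its channel nodes carry the property, which corresponds to q->eth_hasdre being 0 before the change and 1 after it.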

 

I am attaching my device trees to this post as well. All modifications were completed within the system-user.dtsi.

 

Thanks again,

Adam

 

 

Xilinx Employee
1,234 Views
Registered: ‎02-20-2014

Hi Adam,

 

Good to hear that the kernel panic is fixed.

Thanks @appanad

 

-Radhey
