Observer eazrael
Registered: ‎05-19-2017

Zynq-7000 ethernet broken in Petalinux 2018.1?

After upgrading from Petalinux 2017.4 to Petalinux 2018.1, the Linux kernel crashes during boot while initializing the Zynq-7000 integrated Ethernet device. I've bisected the Xilinx Linux kernel between 2017.4 and 2018.1, and I think I found the commit causing the problem.
 
I am not sure whether this is caused by my setup or by something else, so it would be nice if others could have a look at this to rule out that I am the only one affected. Possible reasons:
  • Z-Turn peculiarities (I've got the version with the KSZ9031N instead of the Atheros PHY)
  • My hardware design/device tree
  • Z-Turn U-Boot init code I hacked from their outdated Ubuntu release into recent U-Boot releases (worked with 2017.4)
  • Anything else? 
 
Here's my error description. During boot, the kernel oopses in the Cadence Ethernet driver (in macb_interrupt, called from __free_irq). It is probably caused by an interrupt occurring while the IRQ is being released:
[    0.875121] libphy: Fixed MDIO Bus: probed
[    0.881203] CAN device driver interface
[    0.886178] libphy: MACB_mii_bus: probed
[    0.890434] Unable to handle kernel paging request at virtual address 6b6b6b73
[    0.897577] pgd = c0004000
[    0.900266] [6b6b6b73] *pgd=00000000
[    0.903829] Internal error: Oops - BUG: 5 [#1] PREEMPT SMP ARM
[    0.909642] Modules linked in:
[    0.912684] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc4-xilinx #1
[    0.919536] Hardware name: Xilinx Zynq Platform
[    0.924051] task: ef0cf7c0 task.stack: ef0e4000
[    0.928573] PC is at macb_interrupt+0x24/0x3a8
[    0.932997] LR is at __free_irq+0x17c/0x298
[    0.937158] pc : [<c04d6674>]    lr : [<c016b920>]    psr: 60000193
[    0.943407] sp : ef0e5d18  ip : ef0e5d58  fp : ef0e5d54
[    0.948615] r10: 20000113  r9 : ef2f2538  r8 : ef215c14
[    0.953824] r7 : 0000001b  r6 : ef215ce4  r5 : 6b6b6b6b  r4 : ef2f2538
[    0.960333] r3 : c04d6650  r2 : 00000000  r1 : 6b6b6b6b  r0 : 0000001b
[    0.966845] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
[    0.974048] Control: 18c5387d  Table: 0000404a  DAC: 00000051
[    0.979776] Process swapper/0 (pid: 1, stack limit = 0xef0e4210)
[    0.985766] Stack: (0xef0e5d18 to 0xef0e6000)
[    0.990108] 5d00:                                                       ef0e5d54 ef0e5d28
[    0.998273] 5d20: c016b1e0 c016aefc ef0e5d44 ef215c00 eea4df80 ef215ce4 0000001b ef215c14
[    1.006431] 5d40: ef2f2538 20000113 ef0e5d8c ef0e5d58 c016b920 c04d665c eea66a80 ef2f2538
[    1.014590] 5d60: ef0e5d7c 0000001b ef2f2538 ef214810 00000006 eea66a80 00000000 c0464db8
[    1.022749] 5d80: ef0e5da4 ef0e5d90 c016bad8 c016b7b0 eea4ddc0 ef0e5db8 ef0e5db4 ef0e5da8
[    1.030909] 5da0: c016f4d0 c016ba8c ef0e5dec ef0e5db8 c0464f98 c016f4c0 eea66e00 eea4ddc0
[    1.039067] 5dc0: a0000113 ef214810 c0b8756c c0b87570 ffffffed 00000000 c0b429f0 00000000
[    1.047227] 5de0: ef0e5e04 ef0e5df0 c0465598 c0464df4 ef214810 c0b8756c ef0e5e34 ef0e5e08
[    1.055386] 5e00: c04616e0 c0465564 00000000 ef214810 ef214844 c0b429f0 c0b3a6d8 c0a32ef8
[    1.063545] 5e20: c0a82f9c 00000000 ef0e5e54 ef0e5e38 c04618dc c046159c 00000000 c0b429f0
[    1.071705] 5e40: c0461850 c0b3a6d8 ef0e5e7c ef0e5e58 c045fbdc c046185c ef12fe70 ef18cac0
[    1.079863] 5e60: c06e609c c0b429f0 eea69880 00000000 ef0e5e8c ef0e5e80 c0461154 c045fb58
[    1.088023] 5e80: ef0e5eb4 ef0e5e90 c0460c98 c0461138 c08b0b66 ef0e5ea0 c0b429f0 c0b6ab00
[    1.096182] 5ea0: 000000b2 00000000 ef0e5ecc ef0e5eb8 c04624b4 c0460b20 ffffe000 c0b6ab00
[    1.104342] 5ec0: ef0e5edc ef0e5ed0 c046331c c0462410 ef0e5eec ef0e5ee0 c0a32f18 c04632e8
[    1.112500] 5ee0: ef0e5f5c ef0e5ef0 c0101d18 c0a32f04 00000000 c08b556e ef0e5f00 ef0e5f08
[    1.120659] 5f00: c013dcb4 c0a00654 00000000 c091f380 000000b1 c091f380 00000006 00000006
[    1.128817] 5f20: 000000b2 c091e598 efffcde4 00000000 00000000 00000007 c0b6ab00 00000007
[    1.136978] 5f40: c0b6ab00 000000b2 c0a51840 c0b6ab00 ef0e5f94 ef0e5f60 c0a00f50 c0101c14
[    1.145136] 5f60: 00000006 00000006 00000000 c0a00648 00000000 c06def9c 00000000 00000000
[    1.153296] 5f80: 00000000 00000000 ef0e5fac ef0e5f98 c06defb4 c0a00dc0 00000000 c06def9c
[    1.161454] 5fa0: 00000000 ef0e5fb0 c0107c08 c06defa8 00000000 00000000 00000000 00000000
[    1.169614] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    1.177772] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    1.185942] [<c04d6674>] (macb_interrupt) from [<c016b920>] (__free_irq+0x17c/0x298)
[    1.193662] [<c016b920>] (__free_irq) from [<c016bad8>] (free_irq+0x58/0x6c)
[    1.200695] [<c016bad8>] (free_irq) from [<c016f4d0>] (devm_irq_release+0x1c/0x20)
[    1.208246] [<c016f4d0>] (devm_irq_release) from [<c0464f98>] (release_nodes+0x1b0/0x1cc)
[    1.216403] [<c0464f98>] (release_nodes) from [<c0465598>] (devres_release_all+0x40/0x4c)
[    1.224566] [<c0465598>] (devres_release_all) from [<c04616e0>] (driver_probe_device+0x150/0x2c0)
[    1.233418] [<c04616e0>] (driver_probe_device) from [<c04618dc>] (__driver_attach+0x8c/0xb8)
[    1.241835] [<c04618dc>] (__driver_attach) from [<c045fbdc>] (bus_for_each_dev+0x90/0xa0)
[    1.249994] [<c045fbdc>] (bus_for_each_dev) from [<c0461154>] (driver_attach+0x28/0x30)
[    1.257980] [<c0461154>] (driver_attach) from [<c0460c98>] (bus_add_driver+0x184/0x1ec)
[    1.265966] [<c0460c98>] (bus_add_driver) from [<c04624b4>] (driver_register+0xb0/0xf0)
[    1.273955] [<c04624b4>] (driver_register) from [<c046331c>] (__platform_driver_register+0x40/0x54)
[    1.282983] [<c046331c>] (__platform_driver_register) from [<c0a32f18>] (macb_driver_init+0x20/0x28)
[    1.292095] [<c0a32f18>] (macb_driver_init) from [<c0101d18>] (do_one_initcall+0x110/0x130)
[    1.300433] [<c0101d18>] (do_one_initcall) from [<c0a00f50>] (kernel_init_freeable+0x19c/0x1e0)
[    1.309116] [<c0a00f50>] (kernel_init_freeable) from [<c06defb4>] (kernel_init+0x18/0x11c)
[    1.317358] [<c06defb4>] (kernel_init) from [<c0107c08>] (ret_from_fork+0x14/0x2c)
[    1.324900] Code: e8bd4000 e5915000 e1a04001 e5911008 (e5953008)
[    1.330980] ---[ end trace 1ebb284224a5896a ]---
[    1.335680] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    1.335680]
[    1.344743] CPU1: stopping
[    1.347431] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D         4.14.0-rc4-xilinx #1
[    1.355497] Hardware name: Xilinx Zynq Platform
[    1.360033] [<c011151c>] (unwind_backtrace) from [<c010c3ec>] (show_stack+0x20/0x24)
[    1.367752] [<c010c3ec>] (show_stack) from [<c06cc750>] (dump_stack+0xa8/0xdc)
[    1.374953] [<c06cc750>] (dump_stack) from [<c010fa64>] (handle_IPI+0x238/0x330)
[    1.382330] [<c010fa64>] (handle_IPI) from [<c01014cc>] (gic_handle_irq+0x94/0xa0)
[    1.389879] [<c01014cc>] (gic_handle_irq) from [<c010cef0>] (__irq_svc+0x70/0xb0)
[    1.397336] Exception stack(0xef0fff40 to 0xef0fff88)
[    1.402377] ff40: 00000001 00000000 00000000 00000000 00000000 00000000 ffffe000 c0b054a8
[    1.410537] ff60: 0000406a 413fc090 00000000 ef0fff9c ef0fff90 ef0fff90 c01086e4 c01086e8
[    1.418690] ff80: 60000113 ffffffff
[    1.422172] [<c010cef0>] (__irq_svc) from [<c01086e8>] (arch_cpu_idle+0x30/0x4c)
[    1.429555] [<c01086e8>] (arch_cpu_idle) from [<c06e5f84>] (default_idle_call+0x40/0x48)
[    1.437621] [<c06e5f84>] (default_idle_call) from [<c015aa60>] (do_idle+0x110/0x1c8)
[    1.445345] [<c015aa60>] (do_idle) from [<c015ac84>] (cpu_startup_entry+0x28/0x2c)
[    1.452897] [<c015ac84>] (cpu_startup_entry) from [<c010f5ac>] (secondary_start_kernel+0x130/0x154)
[    1.461923] [<c010f5ac>] (secondary_start_kernel) from [<0010194c>] (0x10194c)
[    1.469131] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    1.469131]
This does not happen every time, but on almost every attempt. It may be related to the randomness of interrupts.
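
One detail of the oops above may be worth spelling out. The faulting address 6b6b6b73 looks like the kernel's freed-memory poison pattern plus a small offset: 0x6b is POISON_FREE in include/linux/poison.h, r5 and r1 both hold 6b6b6b6b, and the faulting instruction word e5953008 decodes to `ldr r3, [r5, #8]`. The short sketch below only redoes that arithmetic. It assumes slab debugging/poisoning was enabled in this kernel build, which the log suggests but does not prove; the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: reconstruct the faulting address from the register dump.
# Assumption: r5 holds slab poison (POISON_FREE = 0x6b repeated),
# i.e. the handler dereferenced a pointer read from already-freed memory.
poison=$((0x6b6b6b6b))   # value of r5 (and r1) in the oops
offset=8                 # immediate in "ldr r3, [r5, #8]" (e5953008)

# Address the load tried to access: r5 + 8
fault=$(printf '0x%08x' $((poison + offset)))
echo "$fault"
```

If that reading is right, macb_interrupt is being run against driver data that has already been freed during the failed probe cleanup, rather than hitting a random bad pointer.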
 
I did a git bisect on the Xilinx kernel from GitHub. This commit seems to be the culprit:
$ git bisect bad
66bdede495c71da9c5ce18542976fae53642880b is the first bad commit
commit 66bdede495c71da9c5ce18542976fae53642880b
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date:   Wed Oct 18 13:54:03 2017 +0200

    of_mdio: Fix broken PHY IRQ in case of probe deferral

    If an Ethernet PHY is initialized before the interrupt controller it is
    connected to, a message like the following is printed:

        irq: no irq domain found for /interrupt-controller@e61c0000 !

    However, the actual error is ignored, leading to a non-functional (POLL)
    PHY interrupt later:

        Micrel KSZ8041RNLI ee700000.ethernet-ffffffff:01: attached PHY driver [Micrel KSZ8041RNLI] (mii_bus:phy_addr=ee700000.ethernet-ffffffff:01, irq=POLL)

    Depending on whether the PHY driver will fall back to polling, Ethernet
    may or may not work.

    To fix this:
      1. Switch of_mdiobus_register_phy() from irq_of_parse_and_map() to
         of_irq_get().
         Unlike the former, the latter returns -EPROBE_DEFER if the
         interrupt controller is not yet available, so this condition can be
         detected.
         Other errors are handled the same as before, i.e. use the passed
         mdio->irq[addr] as interrupt.
      2. Propagate and handle errors from of_mdiobus_register_phy() and
         of_mdiobus_register_device().

    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 19c3e6ee64cbeebdb85ae4b2935c28cac70f2557 c2a372a9a35c51c0a565e1559d2dcb79d4612487 M      drivers
 
Git bisect log between tags/xilinx-v2017.4 and tags/xilinx-v2018.1:
git bisect start
# good: [b450e900fdb473a53613ad014f31eedbc80b1c90] imx274: Fix error handling
git bisect good b450e900fdb473a53613ad014f31eedbc80b1c90
# bad: [15b23f7fa80ed8166af46fb4dd971dbc12d46ad2] drm: xlnx: zynqmp: Disable a plane when the fb format changes
git bisect bad 15b23f7fa80ed8166af46fb4dd971dbc12d46ad2
# good: [0be75179df5e20306528800fc7c6a504b12b97db] Merge tag 'driver-core-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
git bisect good 0be75179df5e20306528800fc7c6a504b12b97db
# good: [2cd648c110b5570c3280bd645797658cabbe5f5c] include/linux/sem.h: correctly document sem_ctime
git bisect good 2cd648c110b5570c3280bd645797658cabbe5f5c
# good: [aae3dbb4776e7916b6cd442d00159bea27a695c1] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
git bisect good aae3dbb4776e7916b6cd442d00159bea27a695c1
# good: [ae46654bcff303b33facbbd04a3ad9c21d303f9b] Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect good ae46654bcff303b33facbbd04a3ad9c21d303f9b
# good: [2569e7e1d684e418ba7ffc9d0ad9a5f5247df0a0] Merge commit 'keys-fixes-20170927' into fixes-v4.14-rc3
git bisect good 2569e7e1d684e418ba7ffc9d0ad9a5f5247df0a0
# bad: [b5ac3beb5a9f0ef0ea64cd85faf94c0dc4de0e42] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect bad b5ac3beb5a9f0ef0ea64cd85faf94c0dc4de0e42
# good: [8d473320eebf938e9c2e3ce569e524554006362c] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect good 8d473320eebf938e9c2e3ce569e524554006362c
# good: [e7a36a6ec9cf1b60273e48ee980b8920f333bd4d] Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good e7a36a6ec9cf1b60273e48ee980b8920f333bd4d
# good: [545ea16f7c42969f94c769d0c2267cf4a65e5850] Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect good 545ea16f7c42969f94c769d0c2267cf4a65e5850
# good: [c92e8c02fe664155ac4234516e32544bec0f113d] tcp/dccp: fix ireq->opt races
git bisect good c92e8c02fe664155ac4234516e32544bec0f113d
# good: [54d431176429e9cf064461589e5174349a9f73da] sock: correct sk_wmem_queued accounting on efault in tcp zerocopy
git bisect good 54d431176429e9cf064461589e5174349a9f73da
# good: [e5f468b3f23313994c5e6c356135f9b0d76bcb94] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect good e5f468b3f23313994c5e6c356135f9b0d76bcb94
# good: [98870943a561c64aca22d10820a881aa4fa728e4] net: stmmac: Fix stmmac_get_rx_hwtstamp()
git bisect good 98870943a561c64aca22d10820a881aa4fa728e4
# good: [7433a8d6fa60a2f6910206fa10f3550c8f11f45f] textsearch: fix typos in library helpers
git bisect good 7433a8d6fa60a2f6910206fa10f3550c8f11f45f
# bad: [864e2a1f8aac05effac6063ce316b480facb46ff] ipv6: flowlabel: do not leave opt->tot_len with garbage
git bisect bad 864e2a1f8aac05effac6063ce316b480facb46ff
# bad: [66bdede495c71da9c5ce18542976fae53642880b] of_mdio: Fix broken PHY IRQ in case of probe deferral
git bisect bad 66bdede495c71da9c5ce18542976fae53642880b
# first bad commit: [66bdede495c71da9c5ce18542976fae53642880b] of_mdio: Fix broken PHY IRQ in case of probe deferral
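
For anyone who wants to retrace the bisection without repeating every build, the log above is in the format that `git bisect replay` accepts. The following is a rough sketch, not a tested procedure: the helper name `replay_bisect` and both arguments are hypothetical, and it assumes you already have a clone of the Xilinx kernel tree (presumably github.com/Xilinx/linux-xlnx) and have saved the log lines above into a text file.

```shell
#!/bin/sh
# Hypothetical helper: replay a saved bisect log inside a kernel checkout.
#   $1 = path to a clone of the Xilinx kernel tree (assumed to exist)
#   $2 = file containing the "git bisect ..." log lines from this post
replay_bisect() {
    repo=$1
    log=$2

    # "git -C" changes directory first, so make the log path absolute.
    case $log in
        /*) ;;
        *) log=$PWD/$log ;;
    esac

    # Refuse to continue if the path is not a git repository.
    if ! git -C "$repo" rev-parse --git-dir >/dev/null 2>&1; then
        echo "not a git checkout: $repo" >&2
        return 1
    fi

    # Refuse to continue if the log file cannot be read.
    if [ ! -r "$log" ]; then
        echo "cannot read bisect log: $log" >&2
        return 1
    fi

    # Re-apply every good/bad verdict recorded in the log.
    git -C "$repo" bisect replay "$log"
}
```

It would be called as `replay_bisect /path/to/linux-xlnx bisect.log`, followed by `git bisect reset` in the checkout once you are done. Replaying only repeats my good/bad verdicts, so to actually confirm the result you would still need to build and boot the commits around 66bdede495c7 on your own hardware.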
 
Can somebody please try to reproduce the issue?
 
Any help is appreciated.