11-05-2018 05:30 AM
I'm using Xilinx's linux-xlnx kernel tagged xilinx-v2018.1 to integrate with the 1/2.5G AXI Ethernet Subsystem on a PicoZed SOM and FMCv2 carrier. My colleagues and I followed this post to add our own RTC core and integrate it with both the subsystem and linux kernel. The suggestions from that post mirror those from Xilinx's wiki, which states you have to bring your own clock driver to use the subsystem in a 1588 environment (link).
Per that last link, our subsystem is connected in non-processor mode. What's less clear from that page is whether the 1G MAC can also use 1588: it lists the 1G MAC and, as a separate bullet, 1588, unlike the other MACs, where the name and 1588 share a bullet. Version 7.1 of the documentation states that 1588 is supported for both 1G and 2.5G operation (page 6). I've read the driver source in xilinx_axienet_main.c and do not see where 1G is singled out versus 2.5G.
From all the above, I have to believe it should be possible to use 1588 support on a 1G MAC in Linux. However, I get a kernel panic when I follow the instructions to enable sync and then start the LinuxPTP daemon (section, 1588 Testing):
[root@host]:# phc2sys -s eth0 -w &
[root@host]:# ptp4l -m -i eth0
ptp4l[100.796]: selected /dev/ptp0 as PTP clock
ptp4l[100.797]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[100.798]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[108.234]: port 1: LISTENING to MASTER on ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES
ptp4l[108.235]: selected local clock 000a35.fffe.000000 as best master
ptp4l[108.235]: assuming the grand master role
xilinx_axienet 41000000.ethernet: Did't get FIFO rx interrupt 9437184
Unhandled fault: imprecise external abort (0x1406) at 0x0043aeec
pgd = c0004000
[0043aeec] *pgd=00000000
Internal error: Oops - BUG: 1406 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-xilinx-v2018.1 #8
Hardware name: Xilinx Zynq Platform
task: c0a06d80 task.stack: c0a00000
PC is at cpuidle_enter_state+0xe4/0x1b4
LR is at cpuidle_enter_state+0xc0/0x1b4
pc : [<c04bb828>] lr : [<c04bb804>] psr: 600f0013
sp : c0a01f70 ip : 00000015 fp : 00000000
r10: 00000000 r9 : 00000019 r8 : 6f5fbbaa
r7 : 00000000 r6 : ef7d4340 r5 : 00000019 r4 : 6f642674
r3 : ef7d4e80 r2 : 2ee8f000 r1 : 00000019 r0 : 00000000
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 18c5387d Table: 1fce404a DAC: 00000051
Process swapper/0 (pid: 0, stack limit = 0xc0a00210)
Stack: (0xc0a01f70 to 0xc0a02000)
1f60: ef7d4340 c0a32b70 00000000 ef7d4340
1f80: c0a32b70 ffffe000 c0a03c68 c0a03cb4 c0933a30 00000000 00000000 c014e0d8
1fa0: 000000be 00000001 ffffffff c0a03c40 efffcd00 c014e270 c0a3ba4c c0900bbc
1fc0: ffffffff ffffffff 00000000 c090066c 00000000 c0933a30 00000000 c0a3bc94
1fe0: c0a03c58 c0933a2c c0a08004 0000406a 413fc090 0000807c 00000000 00000000
[<c04bb828>] (cpuidle_enter_state) from [<c014e0d8>] (do_idle+0x148/0x1a8)
[<c014e0d8>] (do_idle) from [<c014e270>] (cpu_startup_entry+0x18/0x1c)
[<c014e270>] (cpu_startup_entry) from [<c0900bbc>] (start_kernel+0x304/0x364)
Code: f10c0080 e3a00000 ebf2dff3 f1080080 (e0540008)
---[ end trace 9f32ffb8a7d3d425 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
CPU1: stopping
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D 4.14.0-xilinx-v2018.1 #8
Hardware name: Xilinx Zynq Platform
[<c010e814>] (unwind_backtrace) from [<c010aa30>] (show_stack+0x10/0x14)
[<c010aa30>] (show_stack) from [<c05f0a88>] (dump_stack+0x80/0xa0)
[<c05f0a88>] (dump_stack) from [<c010cf78>] (ipi_cpu_stop+0x3c/0x70)
[<c010cf78>] (ipi_cpu_stop) from [<c010d798>] (handle_IPI+0x64/0x84)
[<c010d798>] (handle_IPI) from [<c0101420>] (gic_handle_irq+0x84/0x90)
[<c0101420>] (gic_handle_irq) from [<c010b48c>] (__irq_svc+0x6c/0xa8)
Exception stack(0xef05ff58 to 0xef05ffa0)
ff40: 00000000 00000019
ff60: 2ee9f000 ef7e4e80 78557396 00000019 ef7e4340 00000000 78010d99 00000019
ff80: 00000000 00000000 00000015 ef05ffa8 c04bb804 c04bb828 60000013 ffffffff
[<c010b48c>] (__irq_svc) from [<c04bb828>] (cpuidle_enter_state+0xe4/0x1b4)
[<c04bb828>] (cpuidle_enter_state) from [<c014e0d8>] (do_idle+0x148/0x1a8)
[<c014e0d8>] (do_idle) from [<c014e270>] (cpu_startup_entry+0x18/0x1c)
[<c014e270>] (cpu_startup_entry) from [<001016cc>] (0x1016cc)
---[ end Kernel panic - not syncing: Attempted to kill the idle task!
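For anyone reading the log above, two of its numbers are easier to reason about once decoded. The sketch below is only a reading aid and makes assumptions: it treats the value in parentheses after "imprecise external abort" as an ARMv7 fault status register in the short-descriptor layout (fault status in bits [3:0] plus bit 10, write-not-read in bit 11, an implementation-defined external-abort bit in bit 12), and it simply converts the decimal value from the "FIFO rx interrupt" message to hex so it can be compared against register bit masks. The function name `decode_fsr` is hypothetical, and nothing here identifies which peripheral access triggered the abort.

```python
# Reading aid for the oops above. Assumes the ARMv7 short-descriptor
# fault status layout; decode_fsr is a hypothetical helper name.

FS_ASYNC_EXTERNAL_ABORT = 0b10110  # "imprecise" (asynchronous) external abort


def decode_fsr(fsr: int) -> dict:
    """Split a fault status value into the fields used by the short-descriptor format."""
    fs = (((fsr >> 10) & 0x1) << 4) | (fsr & 0xF)  # FS[4] is bit 10, FS[3:0] are bits 3:0
    return {
        "fs": fs,
        "write_not_read": bool((fsr >> 11) & 0x1),  # bit 11
        "ext": (fsr >> 12) & 0x1,                   # bit 12, implementation defined
    }


if __name__ == "__main__":
    fields = decode_fsr(0x1406)
    print(fields, fields["fs"] == FS_ASYNC_EXTERNAL_ABORT)
    # Decimal value printed by the driver message, shown in hex.
    print(hex(9437184))
```

Under those assumptions the fault status field matches the kernel's own "imprecise external abort" wording. Because the abort is reported asynchronously, the PC in the trace (the idle loop) is not necessarily the code that performed the faulting bus access.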
I've added debug logging statements to the transmit ISR (not pictured above) to prove that the ISR does exit successfully. Moments later though, the above panic occurs.
Has anyone else gone through this exercise and gotten 1588 working by following that Xilinx wiki?
11-14-2018 02:49 AM - edited 11-14-2018 03:37 AM
There is a timestamp prerequisite for 1588 mentioned in the wiki page at https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842485/Linux+AXI+Ethernet+driver
Have you tried enabling the XILINX_AXI_EMAC_HWTSTAMP flag in the kernel configuration? The flag is available in the master branch; see https://github.com/Xilinx/linux-xlnx/blob/master/drivers/net/ethernet/xilinx/Kconfig
Try enabling it; timestamping should then be enabled for the 1588 settings.
11-14-2018 04:19 AM
Yes, I have that flag enabled. I have also patched in PTP drivers that interface with our RTC kernel driver, which manages the IP core attached to the system time inputs of the AXI Ethernet Subsystem.
In case it helps in debugging this, I've continued to trace this down, and the crash only occurs if ptp4l is running (configured as either grand master or slave). My test configuration is a separate system with ptp4l configured as master and the embedded (Zynq-based) system as slave. I've also attempted all of this with my kernel revision set to the head of master from last week; the result is the same.
11-22-2018 07:40 AM
11-26-2018 06:27 AM - edited 11-26-2018 09:26 AM
The kernel panic log is in the original post and the config is attached (renamed because of the forum rules). An example HDF and XDC are attached as well.
Edit 1: Added HDF and XDC.
11-28-2018 05:44 AM
Based on the kernel documentation, there must be a call to skb_tx_timestamp() as close to the transmission time frame as possible (bullets 1-3) followed by a call to skb_hwtstamp_tx() and freeing the original SKB. Notably, that skb_hwtstamp_tx() method doesn't exist anywhere in the linux-xlnx kernel source for 2018.1 since it eventually changed to skb_tstamp_tx() (i.e., this is a documentation error). That method (and the cleanup) are called whether one is using the TSN core or not, so that takes care of bullet 4 (I think). So to summarize, it looks like the non-TSN use of the xilinx-axienet driver is missing a call to skb_tx_timestamp(), at the very least.
The patch I've attached applies that call approximately where the TSN route would have, and I did verify that line of code is being executed (via dev_info print statement). However, this does not result in a change; it still kernel panics with a similar stack listing (also attached).
12-11-2018 10:10 AM
I've looked at your HDF file, and it was built with Vivado 2018.2. Have you tried the PetaLinux kernel tagged with the same version (xilinx-v2018.2)? We do not support using the drivers with a different tool version. In our software regression tests with 2018.2 we do not see any issue, and timestamping works as well.
12-31-2018 05:31 AM - edited 12-31-2018 05:38 AM
Yes, we're using the kernel tagged xilinx-v2018.2 from https://github.com/xilinx/linux-xlnx. I realize that an earlier log and post pointed to 2018.1; since then we have moved the kernel to 2018.2 (even though there appeared to be no significant changes to the related driver in a non-TSN configuration like ours).
You say you all are not seeing any such issue -- can you please elaborate on your example design? Have you all attempted to run ptp4l against that example design? Any chance you all can post the example design?
[Edits for clarifying the version as well as requesting access to the example design]
01-03-2019 11:24 AM - edited 01-08-2019 05:58 AM
We had an odd breakthrough today wherein I added "twoStepFlag 0" to my PTP4L configuration on the target device. The kernel panic no longer happens; instead it has been replaced with a "timed out while polling for tx timestamp" error, and the device never syncs to the master clock. I tried increasing tx_timestamp_timeout to 100 (milliseconds, from the default of 1), but I still get the error.
We noticed that our master clock could only be configured for 2-step mode, so we reconfigured the IP core to 2-step and updated the target. Interestingly enough, we still get the kernel panic if running in 2-step mode. If we again set the above twoStepFlag to 0 (to make it 1-step), PTP4L starts and runs without there being a kernel panic (again though, that time out message is present and it does not sync).
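For reference, the two ptp4l settings discussed above can be captured in a small generator so the 1-step and 2-step variants are easy to switch between while testing. This is a minimal sketch, not the configuration actually used on the target: `twoStepFlag` and `tx_timestamp_timeout` are the options named in the post, the `[global]` section header follows the linuxptp config file format, and the helper name `render_ptp4l_config` is hypothetical.

```python
# Minimal sketch: render a ptp4l config with the two options from the post.
# render_ptp4l_config is a hypothetical helper, not part of linuxptp.


def render_ptp4l_config(two_step: bool, tx_timestamp_timeout_ms: int) -> str:
    """Return ptp4l config text; two_step=False corresponds to 'twoStepFlag 0'."""
    lines = [
        "[global]",
        f"twoStepFlag {1 if two_step else 0}",
        f"tx_timestamp_timeout {tx_timestamp_timeout_ms}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 1-step variant with the longer timeout tried in the post.
    print(render_ptp4l_config(two_step=False, tx_timestamp_timeout_ms=100), end="")
```

Saving the output to a file and passing it with `ptp4l -m -i eth0 -f <file>` should reproduce the 1-step run described above; flipping `two_step` to `True` gives the 2-step case that still panics.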
Any further suggestions, given the above, @jadhavs? We are running this effort in parallel with trying to use the TSN license (completely separate build, same goal), and neither effort is having much success getting basic 1588 PTPv2 working.
01-09-2019 11:29 AM