12-02-2020 07:11 AM
Hi. I have a zynq 7000 running linux from an QSPI NOR flash.
When booting, I'm experiencing a halt in the boot of about 4s. Here is the boot log around where is pauses, with timestamps:
[ 0.798108] PTP clock support registered [ 0.802036] EDAC MC: Ver: 3.0.0 [ 0.812179] FPGA manager framework [ 0.815749] fpga-region fpga-full: FPGA Region probed [ 0.832276] clocksource: Switched to clocksource arm_global_timer [ 0.848572] NET: Registered protocol family 2 [ 0.853599] TCP established hash table entries: 4096 (order: 2, 16384 bytes) [ 0.860611] TCP bind hash table entries: 4096 (order: 3, 32768 bytes) [ 0.867127] TCP: Hash tables configured (established 4096 bind 4096) [ 0.873660] UDP hash table entries: 256 (order: 1, 8192 bytes) [ 0.879435] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes) [ 0.886159] NET: Registered protocol family 1 [ 5.182359] hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available [ 5.192075] workingset: timestamp_bits=30 max_order=17 bucket_order=0 [ 5.202481] io scheduler noop registered [ 5.206322] io scheduler deadline registered [ 5.210635] io scheduler cfq registered (default) [ 5.215384] io scheduler mq-deadline registered [ 5.219827] io scheduler kyber registered [ 5.225905] dma-pl330 f8003000.dmac: Loaded driver for PL330 DMAC-241330 [ 5.232658] dma-pl330 f8003000.dmac: DBUFF-128x8bytes Num_Chans-8 Num_Peri-4 Num_Events-16 [ 5.241639] Serial: 8250/16550 driver, 1 ports, IRQ sharing disabled [ 5.249406] 43c10000.serial: ttyS0 at MMIO 0x43c11000 (irq = 49, base_baud = 6250000) is a 16550A [ 5.266918] brd: module loaded [ 5.272036] m25p80 spi0.0: non-uniform erase sector maps are not supported yet. [ 5.279450] m25p80 spi0.0: s25fl512s (65536 Kbytes) [ 5.284332] 8 ofpart partitions found on MTD device spi0.0 [ 5.289727] Creating 8 MTD partitions on "spi0.0":
I've added the pmu in the dts (present in the above log), but that does not seem to make it any faster.
Any suggestions on how to debug or solve is highly appreciated.
12-03-2020 12:40 AM
My rootfs.cpio is 24.8MB here is my full partition table:
[INFO] : Partition Information: [INFO] : Image: fsbl.elf [INFO] : Partition 0: fsbl.elf.0, Size: 98312 [INFO] : Image: fpga.bit [INFO] : Partition 1: fpga.bit.0, Size: 4045568 [INFO] : Image: u-boot.elf [INFO] : Partition 2: u-boot.elf.0, Size: 394284 [INFO] : Image: u-boot.dtb [INFO] : Partition 3: u-boot.dtb.0, Size: 12418 [INFO] : Image: u-boot-script.itb [INFO] : Partition 4: u-boot-script.itb.0, Size: 3383 [INFO] : Image: default_selector.bin [INFO] : Partition 5: default_selector.bin.0, Size: 1 [INFO] : Image: default_count.bin [INFO] : Partition 6: default_count.bin.0, Size: 1 [INFO] : Image: kernel-ramdisk-dtb.itb [INFO] : Partition 7: kernel-ramdisk-dtb.itb.0, Size: 9461276 [INFO] : Partition 8: kernel-ramdisk-dtb.itb.0, Size: 9461276 [INFO] : Image: ubi.img [INFO] : Partition 9: ubi.img.0, Size: 9699328 [INFO] : After Link -- Dump of Binary Image ---- 00000000 Len: 000008a0 Res: 00000000 "BootHeader" 000008c0 Len: 00000040 Res: 00000000 "ImageHeaderTable" 00000900 Len: 00000020 Res: 00000000 "ImageHeader fsbl.elf" 00000940 Len: 00000020 Res: 00000000 "ImageHeader fpga.bit" 00000980 Len: 00000020 Res: 00000000 "ImageHeader u-boot.elf" 000009c0 Len: 00000020 Res: 00000000 "ImageHeader u-boot.dtb" 00000a00 Len: 00000028 Res: 00000000 "ImageHeader u-boot-script.itb" 00000a40 Len: 0000002c Res: 00000000 "ImageHeader default_selector.bin" 00000a80 Len: 00000028 Res: 00000000 "ImageHeader default_count.bin" 00000ac0 Len: 0000002c Res: 00000000 "ImageHeader kernel-ramdisk-dtb.itb" 00000b00 Len: 0000002c Res: 00000000 "ImageHeader kernel-ramdisk-dtb.itb" 00000b40 Len: 0000001c Res: 00000140 "ImageHeader ubi.img" 00000c80 Len: 00000040 Res: 00000000 "PartitionHeader fsbl.elf.0" 00000cc0 Len: 00000040 Res: 00000000 "PartitionHeader fpga.bit.0" 00000d00 Len: 00000040 Res: 00000000 "PartitionHeader u-boot.elf.0" 00000d40 Len: 00000040 Res: 00000000 "PartitionHeader u-boot.dtb.0" 00000d80 Len: 00000040 Res: 00000000 "PartitionHeader u-boot-script.itb.0" 00000dc0 Len: 00000040 Res: 00000000 "PartitionHeader default_selector.bin.0" 00000e00 Len: 00000040 Res: 00000000 "PartitionHeader default_count.bin.0" 00000e40 Len: 00000040 Res: 00000000 "PartitionHeader kernel-ramdisk-dtb.itb.0" 00000e80 Len: 00000040 Res: 00000000 "PartitionHeader kernel-ramdisk-dtb.itb.0" 00000ec0 Len: 00000040 Res: 00000000 "PartitionHeader ubi.img.0" 00000f00 Len: 00000040 Res: 00000800 "PartitionHeader Null" 00001700 Len: 00018008 Res: 00000000 "fsbl.elf.0" 00019740 Len: 003dbb00 Res: 00000000 "fpga.bit.0" 003f5240 Len: 0006042c Res: 00000000 "u-boot.elf.0" 00455680 Len: 00003084 Res: 00000000 "u-boot.dtb.0" 004c0000 Len: 00000d38 Res: 00000000 "u-boot-script.itb.0" 00500000 Len: 00000004 Res: 00000000 "default_selector.bin.0" 00540000 Len: 00000004 Res: 00000000 "default_count.bin.0" 005c0000 Len: 00905e1c Res: 00000000 "kernel-ramdisk-dtb.itb.0" 011c0000 Len: 00905e1c Res: 00000000 "kernel-ramdisk-dtb.itb.0" 01e00000 Len: 00940000 Res: 00000000 "ubi.img.0" -- End of Dump [INFO] : Bootimage generated successfully
12-07-2020 10:56 AM
You might try rebuilding your kernel with CONFIG_ARM_PMU disabled (it is in menuconfig under Device Drivers -> Performance monitor support). This should get rid of the line with "hw perfevents".
If you still have a long delay at the same spot, there is something else (that is not printing any details). In that case I would try booting with console_loglevel increased to show more debug info during boot.
12-09-2020 05:04 AM
(Somehow I'm not able to post, this is my fifth try)
I tried what you suggested rfs613 but it didn't help.
Here is the new bootlog with increased console_loglevel
[ 0.781097] NET: Registered protocol family 2
[ 0.786151] TCP established hash table entries: 4096 (order: 2, 16384 bytes)
[ 0.793162] TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.799649] TCP: Hash tables configured (established 4096 bind 4096)
[ 0.806100] UDP hash table entries: 256 (order: 1, 8192 bytes)
[ 0.811875] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[ 0.818306] NET: Registered protocol family 1
[ 0.822604] PCI: CLS 0 bytes, default 64
[ 5.191401] workingset: timestamp_bits=30 max_order=17 bucket_order=0
[ 5.201770] io scheduler noop registered
[ 5.205674] io scheduler deadline registered
[ 5.209923] io scheduler cfq registered (default)
[ 5.214662] io scheduler mq-deadline registered
[ 5.219102] io scheduler kyber registered
[ 5.225273] dma-pl330 f8003000.dmac: Loaded driver for PL330 DMAC-241330
[ 5.231903] dma-pl330 f8003000.dmac: DBUFF-
12-09-2020 08:57 AM
(I got a bunch of email notifications for new posts, but couldn't see anything new on the forum - until now)
So in your latest log, there is one new line: "PCI: CLS 0 bytes, default 64", which wasn't there before. This is coming from pci_apply_final_quirks() in drivers/pci/quirks.c. Am not entirely sure what this is doing, but the question that comes to mind is: are you actually using PCI interface? Maybe try switching this off as well (CONFIG_PCI under Device Drivers)? Maybe it is trying to scan PCI endpoints, and spending a long time on each (non-existing) device before it times out?
Sounds like we can at least rule out the CONFIG_ARM_PMU... so some progress
12-10-2020 12:44 AM - edited 12-10-2020 12:45 AM
Thanks for sticking by me @rfs613!
(I think it rejects my posts if I @mention someone? this is my
second third time writing this post)
I've disabled PCI (good to get rid off anyway since it's not use as you suggested), but I still see the pause. This time I've included the full bootlog, just in case. I'm starting to think it might be just the QSPI attached NOR flash being slow, and that it shows up at that place in the boot time by coincidence?
(Maybe it's not liking too long posts? I've provided this link to the bootlog)
12-10-2020 06:31 AM
(The only quirk I'm aware of with posting on the site is that the login duration is fairly short... if you start typing a reply, then come back an hour later to finish and send it, usually it gets rejected. So I have gotten into the habit of copying my reply into clipboard before hitting "Reply" button... as a backup in case it logs me out).
Anyhow, thanks for the full boot log. It looks like you are loading kernel+ramdisk into RAM. The kernel does probe the QSPI, but this just involves reading its ID, which is just a few bytes long. So doubtful that this would cause a large delay.
Later on, when you mount the ubifs partition, that requires scanning the app-data partition, which certainly take time. You can in fact see it takes about 2 seconds, from 5.6 to 7.6 in your boot log. But this doesn't explain the earlier 4 second pause.
Two more thoughts come to mind:
1) When you watch the boot messages scrolling by, does the 4-second delay actually occur between 0.8 and 5.1? In other words, do the timestamps printed before each line match the actual wall-time?
2) Does the vendor of the SOM provide pre-built binary images for the board? Might be useful for comparison, eg. to see if they also have a similar delay.
12-15-2020 02:15 AM
Thanks for the suggestions @rfs613.
1) Yes, the delay is between those lines.
2) not that I know of, and we have been using the provided dts and configs, with the same results.
I'm not sure how to continue, but some sort of degging tool would be great for this. Maybe there is some probing capabilities that I'm not aware of?
12-16-2020 07:20 PM - edited 12-16-2020 07:21 PM
Apologies for not replying sooner... very busy time for me right now.
There are a large number of debugging tools, here are just a few:
I would also consider contacting the vendor of the SOM. To find out if they are aware of the delay, and perhaps what causes it. There is a good chance their other clients have the same issue.
[edit: fixed link]