05-09-2018 03:54 PM
Looks like the ubifs has been broken. It worked well in 2017.3
I get lots of assert failed messages like -
UBIFS assert failed in ubifs_change_lp at 540
UBIFS assert failed in ubifs_release_lprops at 278
These all seem to be related to mutex locking
Functionally it was OK for a while, but now I'm starting to see random mount failures
Anyone else seen this ?
05-10-2018 06:03 AM
Is this related to UBIFS on top of a NOR flash? Similar observations are reported for the nor flash of zcu102 (not Petalinux, but the meta-layer below (meta-xilinx)): https://lists.yoctoproject.org/pipermail/meta-xilinx/2018-April/003774.html.
05-10-2018 04:59 PM
Hi Martin !
Thanks for your feedback !
We are using Micron Flash with on-board ECC. Turns out those assert warnings are pretty harmless and go away after applying the latest patches from the official u-boot GitHub. However, there is still something not working quite right. In Linux I can read & write the UBIFS partitions, but u-boot fails to boot a partition with the error below. It's almost like it doesn't do ECC correctly, because some files can be read while others fail
ubi0: attaching mtd2
ubi0 error: check_corruption: PEB 51 contains corrupted VID header, and the data does not contain all 0xFF
ubi0 error: check_corruption: this may be a non-UBI PEB or a severe VID header corruption which requires manual inspection
Volume identifier header dump:
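The error text above describes a two-step decision: the VID header is bad, and the data area is not all 0xFF. A minimal sketch of that decision, assuming the usual reading that an all-0xFF data area looks like erased flash (for example after an interrupted write) while anything else needs manual inspection. The function and enum names are hypothetical and this is a simplified model, not the actual UBI code:

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every byte in buf equals 0xFF (looks like erased flash). */
int is_all_ff(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0xFF)
            return 0;
    return 1;
}

/* Hypothetical verdicts for a PEB whose VID header failed its check. */
enum peb_verdict {
    PEB_LOOKS_ERASED,      /* data area is all 0xFF: probably harmless */
    PEB_NEEDS_INSPECTION   /* data present under a bad VID header */
};

/* Classify a PEB with a bad VID header by looking at its data area only. */
enum peb_verdict classify_bad_vid_peb(const uint8_t *data_area, size_t len)
{
    return is_all_ff(data_area, len) ? PEB_LOOKS_ERASED
                                     : PEB_NEEDS_INSPECTION;
}
```

PEB 51 in the log would land in the second case, which fits a reader that interprets the page differently from the writer, although it does not prove it.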
05-14-2018 02:54 AM
Maybe the same trouble as me : https://forums.xilinx.com/t5/Embedded-Linux/NAND-Support-after-updating-from-2016-3-to-2017-4/m-p/854069#M25872
You should also try to completely erase the memory.
05-14-2018 09:37 AM
Hi Trigger !
Thanks for your feedback. I think my problems have something to do with the on-die ECC that Micron has. If I turn that off and let the NAND controller handle it, file operations are stable in Linux, but I can't mount the partition at all in u-boot. Initially it complains about the min-write size not being the same (2048 bytes in Linux vs. 512 bytes in u-boot), but even after I add the NAND_NO_SUBPAGE_WRITE option to the NAND controller driver in u-boot it fails to mount for some reason.
One would think that something as "old" and established as NAND & UBIFS would just work, but evidently that is not the case
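On the min-write size mismatch mentioned above: a rough sketch of why 2048 vs. 512 could matter to UBI. It assumes the default UBI placement, where the VID header follows the 64-byte EC header rounded up to the header write unit and the data starts at the next full min-I/O boundary. The struct and function names are hypothetical, and the 64-byte header sizes should be checked against the UBI source in your tree:

```c
#include <stdint.h>

#define UBI_EC_HDR_SIZE  64u  /* erase-counter header size (assumed) */
#define UBI_VID_HDR_SIZE 64u  /* volume-identifier header size (assumed) */

/* Round x up to the next multiple of a (a must be non-zero). */
uint32_t align_up(uint32_t x, uint32_t a)
{
    return (x + a - 1) / a * a;
}

/* Hypothetical container for the two offsets inside each PEB. */
struct ubi_layout {
    uint32_t vid_hdr_offset;  /* where the VID header is expected */
    uint32_t data_offset;     /* where the LEB data starts */
};

/* min_io_size: page size; hdrs_min_io_size: sub-page size, or the
 * page size when sub-page writes are not allowed. */
struct ubi_layout ubi_default_layout(uint32_t min_io_size,
                                     uint32_t hdrs_min_io_size)
{
    struct ubi_layout l;

    l.vid_hdr_offset = align_up(UBI_EC_HDR_SIZE, hdrs_min_io_size);
    l.data_offset = align_up(l.vid_hdr_offset + UBI_VID_HDR_SIZE,
                             min_io_size);
    return l;
}
```

Under these assumptions, ubi_default_layout(2048, 512) puts the VID header at 512 and the data at 2048, while ubi_default_layout(2048, 2048) puts them at 2048 and 4096. Two sides that disagree on the sub-page size would then look for the headers in different places.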
05-15-2018 02:25 AM
05-15-2018 12:04 PM
Hi Trigger !
Thank you for looking into this ! There is code in the arasan_nand.c anfc_ecc_ooblayout_ondie64_ecc() function, that looks very much like the pre-patched code you sent !
oobregion->offset = (section * 16) + 8;
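To make that line concrete, here is a small standalone sketch of the ECC regions it implies. It assumes a 64-byte OOB area split into four 16-byte chunks with 8 ECC bytes each; only the offset formula comes from the driver line quoted above, the section count and length are assumptions, and the struct and function names are hypothetical:

```c
#include <stdio.h>

#define OOB_SIZE     64  /* assumed spare-area size per page */
#define ECC_SECTIONS  4  /* assumed: one section per 16-byte OOB chunk */
#define ECC_LEN       8  /* assumed ECC bytes per section */

/* Hypothetical stand-in for the kernel's OOB region descriptor. */
struct oob_region {
    int offset;
    int length;
};

/* Fill r with the ECC region of the given section.
 * Returns 0 on success, -1 if the section is out of range. */
int ondie_ecc_region(int section, struct oob_region *r)
{
    if (section < 0 || section >= ECC_SECTIONS)
        return -1;
    r->offset = (section * 16) + 8;  /* upper half of each 16-byte chunk */
    r->length = ECC_LEN;
    return 0;
}

/* Print the ECC byte ranges for all sections. */
void print_ondie_ecc_map(void)
{
    struct oob_region r;

    for (int s = 0; ondie_ecc_region(s, &r) == 0; s++)
        printf("section %d: ECC bytes %d..%d of %d\n",
               s, r.offset, r.offset + r.length - 1, OOB_SIZE);
}
```

With these numbers the ECC bytes occupy OOB offsets 8..15, 24..31, 40..47 and 56..63. If Linux and u-boot disagree on this map, each side would read the other's ECC bytes as free OOB bytes or the reverse.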
My gut feeling is that this On-Die business does not see extensive use, and is therefore littered with problems, so I'd like to just use the Arasan controller ECC instead. I've done a few more tests, so I'll summarize below
* Reading & Writing UBIFS partition is stable in Linux (After disabling the on-die ECC in arasan_nand.c)
* Writing a 100Mb file directly to NAND using nandwrite in Linux and then reading it back and comparing it in u-boot shows it's correct
* Also, writing the same 100Mb file to NAND using 'nand write' in u-boot and then reading it back using nanddump in Linux also works OK
* Writing a ubifs root filesystem from u-boot using "ubi write" works, as far as the writing is concerned. I can mount the rootfs in Linux, however if I try to mount it in u-boot I get this error -
ZynqMP> ubifsmount ubi0:rootfs
UBIFS error (ubi0:0 pid 0): ubifs_recover_master_node: failed to recover master node
Error reading superblock on volume 'ubi0:rootfs' errno=-22!
This is the same error I get when writing the file system in Linux and then trying to mount it in u-boot (attaching works BTW). So this is starting to look like a problem with ubifsmount.
05-18-2018 09:59 AM
To summarize the findings after 2 weeks of fighting this: we are using a Micron MT29UZ4B8DZZHGPB Flash, but presumably one would see the same problems on any Micron NAND with On-Die ECC.
UBIFS in PetaLinux 2018.1 is unstable (in Linux) with On-Die ECC enabled. Once you stress the system with a lot of reads and writes, you start getting errors (see below). 2017.3 is stable using the same tests (and the same board).
[ 3776.989891] UBIFS error (ubi2:0 pid 3077): do_readpage: bad data node (block 1, inode 8154)
[ 3776.998174] magic 0x6101831
[ 3777.001818] crc 0x5cbe6942
[ 3777.005539] node_type 1 (data node)
[ 3777.009532] group_type 0 (no node group)
[ 3777.013870] sqnum 105293
[ 3777.017256] len 2064
[ 3777.020465] key (8154, data, 1)
[ 3777.024634] size 4096
[ 3777.027841] compr_typ 1
[ 3777.030797] data size 2016
[ 3777.034007] data:
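A quick sanity check on the header fields in that dump, as a minimal sketch. The constants (node magic 0x06101831, a 48-byte data node header, a 4096-byte UBIFS block) are assumptions based on the UBIFS on-flash format and should be verified against ubifs-media.h in your tree. The struct and function names are hypothetical:

```c
#include <stdint.h>

#define UBIFS_NODE_MAGIC   0x06101831u  /* assumed node magic */
#define UBIFS_DATA_NODE_SZ 48u          /* assumed data node header size */
#define UBIFS_BLOCK_SIZE   4096u        /* assumed UBIFS block size */

/* Hypothetical container for the fields printed in the dump. */
struct data_node_dump {
    uint32_t magic;      /* "magic" */
    uint32_t len;        /* "len": whole node, header included */
    uint32_t size;       /* "size": uncompressed data size */
    uint32_t data_size;  /* "data size": compressed payload size */
};

/* Returns 1 if the header fields are self-consistent, 0 otherwise.
 * This says nothing about the payload or the CRC. */
int data_node_header_plausible(const struct data_node_dump *d)
{
    if (d->magic != UBIFS_NODE_MAGIC)
        return 0;
    if (d->len != UBIFS_DATA_NODE_SZ + d->data_size)
        return 0;
    if (d->size == 0 || d->size > UBIFS_BLOCK_SIZE)
        return 0;
    return 1;
}
```

The values above (magic 0x6101831, len 2064, size 4096, data size 2016) pass this check, since 2064 = 48 + 2016. The printed data size is most likely derived from len, so that comparison mainly documents the header arithmetic. A plausible header would point at the payload rather than the header, but it is a hint, not proof.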
If you disable On-Die ECC, the same tests pass in Linux using PL 18.1.
If I write a partition from Linux in PL 17.3 then U-boot 18.3 can mount it and read files from ubifs. Also, reading the files from Linux (18.1) is OK. However, as soon as I write a new file from Linux, u-boot can no longer read it (see error below). I can still read it in Linux though, so it looks like some kind of incompatibility problem.
UBIFS error (ubi0:0 pid 0): read_block: bad data node (block 661, inode 5763)
node_type 1 (data node)
group_type 0 (no node group)
key (5763, data, 661)
data size 3027
UBIFS error (ubi0:0 pid 0): do_readpage: cannot read page 661 of inode 5763, error -22
Error reading file '/boot/Image'
Writing a 100mb file using 'nand write' (in u-boot) and then reading it in Linux with nanddump, creates matching files w/o errors (and vice versa). So the NAND-layer looks OK
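The raw round-trip comparison can be done page by page, so that a mismatch points at a specific NAND page instead of just "files differ". A minimal sketch, assuming both dumps contain page data only (no OOB) and a 2048-byte page as suggested by the min-write size mentioned earlier in the thread. The function name is hypothetical:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 2048  /* assumed NAND page size */

/* Compare two dump files page by page.
 * Returns the number of differing pages, or -1 if a file cannot be
 * opened or the files differ in length. *first_bad receives the index
 * of the first differing page, or -1 if no page differs. */
long compare_dumps(const char *path_a, const char *path_b, long *first_bad)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    uint8_t a[PAGE_SIZE], b[PAGE_SIZE];
    long page = 0, bad = 0;

    *first_bad = -1;
    if (!fa || !fb) {
        if (fa)
            fclose(fa);
        if (fb)
            fclose(fb);
        return -1;
    }
    for (;;) {
        size_t na = fread(a, 1, PAGE_SIZE, fa);
        size_t nb = fread(b, 1, PAGE_SIZE, fb);

        if (na != nb) {          /* files differ in length */
            bad = -1;
            break;
        }
        if (na == 0)             /* both files at EOF */
            break;
        if (memcmp(a, b, na) != 0) {
            if (*first_bad < 0)
                *first_bad = page;
            bad++;
        }
        page++;
    }
    fclose(fa);
    fclose(fb);
    return bad;
}
```

A result of 0 for both directions (u-boot write then Linux read, and the reverse) matches the observation above that the NAND layer itself looks OK.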
So we'll stick with 2017.3 for now
12-20-2018 12:38 PM
I'm having this exact same issue with Micron Nand with on-die ECC. Was a resolution ever reached in 2018+ kernels?
12-21-2018 11:08 AM
Hi Justin !
The problem is that no one is maintaining the Xilinx NAND driver, and in Linux 4.14 there were changes made to the MTD-side Micron drivers that are incompatible with the Arasan driver. I told them about this in April and it has still not been fixed, so I'm guessing not many of their customers are affected. Welcome to the club ! hehe
See attached patch / hack
Have a great weekend !