luojf9
Visitor
718 Views
Registered: ‎11-18-2018

ZynqMP NAND flash ECC failed, nandtest failed

I am working on a ZynqMP board with PetaLinux 2018.3.

To use the NAND flash, my DTS is:

&nand0 {
    status = "okay";
    arasan,has-mdma;
    nand@0 {
        reg = <0x0>;
        #address-cells = <2>;
        #size-cells = <2>;
        partition@0 {
            label = "test";
            reg = <0x0 0x0 0x0 0x200000>;
        };

        partition@1 {
            label = "jffs2";
            reg = <0x0 0x200000 0x0 0x1e00000>;
        };

        partition@2 {
            label = "part1";
            reg = <0x0 0x2000000 0x1 0xfe000000>;
        };

    };
};
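
The partition layout above can be sanity-checked with a little arithmetic. This is only a sketch: it assumes each `reg` entry is four 32-bit cells (two address cells, two size cells, as declared by `#address-cells` and `#size-cells`) and takes the 2048 KiB erase size and 8192 MiB chip size from the kernel log below. The helper names `decode` and `layout` are made up for illustration.

```python
# Decode the <addr_hi addr_lo size_hi size_lo> cells of each partition and
# check alignment and contiguity. Values are copied from the DTS above.
ERASE_SIZE = 2048 * 1024          # kernel log: erase size 2048 KiB
CHIP_SIZE = 8192 * 1024 * 1024    # kernel log: 8192 MiB

PARTITIONS = {
    "test":  (0x0, 0x0,       0x0, 0x200000),
    "jffs2": (0x0, 0x200000,  0x0, 0x1e00000),
    "part1": (0x0, 0x2000000, 0x1, 0xfe000000),
}

def decode(cells):
    """Combine two 32-bit cells each into a 64-bit (offset, size) pair."""
    addr_hi, addr_lo, size_hi, size_lo = cells
    return (addr_hi << 32) | addr_lo, (size_hi << 32) | size_lo

def layout(partitions):
    """Return [(label, start, end)] with end exclusive."""
    result = []
    for label, cells in partitions.items():
        start, size = decode(cells)
        result.append((label, start, start + size))
    return result

expected_start = 0
for label, start, end in layout(PARTITIONS):
    aligned = start % ERASE_SIZE == 0 and end % ERASE_SIZE == 0
    contiguous = start == expected_start
    print(f"{label:6s} 0x{start:010x}-0x{end:010x} "
          f"aligned={aligned} contiguous={contiguous}")
    expected_start = end
print("covers whole chip:", expected_start == CHIP_SIZE)
```

Under these assumptions the three partitions are erase-block aligned, back to back, and end exactly at the 8192 MiB chip size, so the layout itself does not look like the cause of the errors.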

My kernel boot log shows:

[ 2.118228] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0x64
[ 2.123038] nand: Micron MT29F64G08CBABAWP
[ 2.127101] nand: 8192 MiB, MLC, erase size: 2048 KiB, page size: 8192, OOB size: 744
[ 2.134888] arasan_nand ff100000.nand: HW ECC selected
[ 2.140187] Scanning device for bad blocks
[ 2.144156] Bad eraseblock 0 at 0x000000000000
[ 2.155531] Bad eraseblock 75 at 0x000009600000
[ 2.155929] Bad eraseblock 79 at 0x000009e00000
[ 2.159929] Bad eraseblock 90 at 0x00000b400000
[ 2.163504] Bad eraseblock 91 at 0x00000b600000
[ 2.175518] Bad eraseblock 173 at 0x000015a00000
[ 2.197754] Bad eraseblock 411 at 0x000033600000
[ 2.203862] Bad eraseblock 476 at 0x00003b800000
[ 2.226332] Bad eraseblock 716 at 0x000059800000
[ 2.251811] Bad eraseblock 988 at 0x00007b800000
[ 2.461408] Bad eraseblock 3229 at 0x000193a00000
[ 2.474063] Bad eraseblock 3364 at 0x0001a4800000
[ 2.510103] Bad eraseblock 3749 at 0x0001d4a00000
[ 2.549934] 3 ofpart partitions found on MTD device arasan_nand.0
[ 2.550374] Creating 3 MTD partitions on "arasan_nand.0":
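
The bad-block addresses in this log are the block index multiplied by the erase size, so it is easy to work out which partition each bad block falls into. A rough sketch, assuming the partition boundaries from the DTS above; the function name is hypothetical:

```python
ERASE_SIZE = 2048 * 1024  # erase size reported in the log above

# Bad eraseblock indices copied from the kernel log.
BAD_BLOCKS = [0, 75, 79, 90, 91, 173, 411, 476, 716, 988, 3229, 3364, 3749]

# (label, start, end) byte ranges taken from the DTS; end is exclusive.
PARTITIONS = [
    ("test",  0x0,       0x200000),
    ("jffs2", 0x200000,  0x2000000),
    ("part1", 0x2000000, 0x200000000),
]

def bad_blocks_per_partition(bad_blocks, partitions, erase_size):
    """Map each partition label to the bad block indices inside it."""
    found = {label: [] for label, _, _ in partitions}
    for block in bad_blocks:
        addr = block * erase_size
        for label, start, end in partitions:
            if start <= addr < end:
                found[label].append(block)
    return found

for label, blocks in bad_blocks_per_partition(
        BAD_BLOCKS, PARTITIONS, ERASE_SIZE).items():
    print(f"{label}: {len(blocks)} bad block(s) {blocks}")
```

With these numbers, bad block 0 is the only eraseblock of the 2 MiB "test" partition, and the "jffs2" partition contains no bad blocks. That would agree with the `Bad blocks : 0` that nandtest reports below, assuming /dev/mtd1 is the "jffs2" partition.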

But when I run nandtest, it reports errors:

nandtest /dev/mtd1
ECC corrections: 0
ECC failures : 245
Bad blocks : 0
BBT blocks : 0
00000000: reading (1 of 4)...
ECC failed at 00000000
00000000: checking...
compare failed. seed 1467388221
Byte 0x8a is 81 should be c1
Byte 0x12f is 2b should be 23
Byte 0x2e6 is 03 should be 01
Byte 0x301 is 31 should be 33
Byte 0x5ee is 5c should be 58
Byte 0x7a7 is 01 should be 09
Byte 0x84a is 96 should be b6
Byte 0x95b is b4 should be bc
Byte 0xb7c is 73 should be 33
Byte 0xc0d is db should be fb
Byte 0xc1f is 9f should be df
Byte 0xc2f is 63 should be e3
Byte 0xc44 is 2d should be 6d
Byte 0xc5c is 9d should be bd
Byte 0xc6b is a6 should be 86
Byte 0xc6d is 53 should be 52
Byte 0xc94 is 55 should be 51
Byte 0xc9a is 53 should be 5b
Byte 0xcb1 is 96 should be b6
Byte 0xcb4 is b2 should be f2
Byte 0xcca is cd should be c9
Byte 0xcd4 is 4e should be 4c
Byte 0xcf3 is b5 should be f5
Byte 0xcfa is 63 should be 67
Byte 0xd07 is ae should be ac

Some bits are flipped, and there is no apparent pattern.

But when I disable MDMA,

it runs correctly, but it is very slow.

Please help me. Thank you.
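
The "no pattern" observation can be made a bit more concrete by XOR-ing each byte read back with the expected byte from the nandtest output above. This is just an illustrative sketch; `popcount` and `summarize` are made-up helper names, and only the 25 mismatches listed in the post are included.

```python
# (offset, byte read back, byte expected), copied from the nandtest output above.
MISMATCHES = [
    (0x08a, 0x81, 0xc1), (0x12f, 0x2b, 0x23), (0x2e6, 0x03, 0x01),
    (0x301, 0x31, 0x33), (0x5ee, 0x5c, 0x58), (0x7a7, 0x01, 0x09),
    (0x84a, 0x96, 0xb6), (0x95b, 0xb4, 0xbc), (0xb7c, 0x73, 0x33),
    (0xc0d, 0xdb, 0xfb), (0xc1f, 0x9f, 0xdf), (0xc2f, 0x63, 0xe3),
    (0xc44, 0x2d, 0x6d), (0xc5c, 0x9d, 0xbd), (0xc6b, 0xa6, 0x86),
    (0xc6d, 0x53, 0x52), (0xc94, 0x55, 0x51), (0xc9a, 0x53, 0x5b),
    (0xcb1, 0x96, 0xb6), (0xcb4, 0xb2, 0xf2), (0xcca, 0xcd, 0xc9),
    (0xcd4, 0x4e, 0x4c), (0xcf3, 0xb5, 0xf5), (0xcfa, 0x63, 0x67),
    (0xd07, 0xae, 0xac),
]

def popcount(value):
    """Number of set bits in a small non-negative integer."""
    return bin(value).count("1")

def summarize(mismatches):
    """Count flipped bits by direction across all mismatching bytes."""
    summary = {"bytes": 0, "single_bit_bytes": 0,
               "one_to_zero": 0, "zero_to_one": 0}
    for _offset, read, expected in mismatches:
        diff = read ^ expected                      # bits that differ
        summary["bytes"] += 1
        summary["single_bit_bytes"] += int(popcount(diff) == 1)
        summary["one_to_zero"] += popcount(diff & expected)  # expected 1, read 0
        summary["zero_to_one"] += popcount(diff & read)      # expected 0, read 1
    return summary

print(summarize(MISMATCHES))
```

For the bytes listed, every mismatch is a single flipped bit, and the flips go in both directions (1 to 0 and 0 to 1), which fits the description of random-looking bit errors.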

7 Replies
luojf9
Visitor
675 Views
Registered: ‎11-18-2018

Maybe it is a hardware problem. How should I configure the ZynqMP to use NAND flash?

luojf9
Visitor
658 Views
Registered: ‎11-18-2018

Should I modify the Linux driver to get the NAND flash working correctly?

joescou
Visitor
275 Views
Registered: ‎02-09-2021

I have a similar issue. What did you finally do?
I'm trying to modify the Arasan driver, but there is not much information about that controller. (There is a new driver on master, but I must keep my code on the 2020.1 version.)

watari
Teacher
251 Views
Registered: ‎06-16-2013

Hi @joescou 

 

Did you format the NAND device?

It looks like a file system or formatting issue.

Could you check that?

 

Best regards,

joescou
Visitor
237 Views
Registered: ‎02-09-2021

Hi @watari !!

Yes, I have formatted the NAND partition: first I erased everything, then formatted it for UBIFS.

flash_erase

 

root:~# flash_erase /dev/mtd3 0 0
Erasing 1024 Kibyte @ 3900000 --  1 % complete flash_erase: Skipping bad block at 03a00000
flash_erase: Skipping bad block at 03b00000
Erasing 1024 Kibyte @ fdb00000 -- 99 % complete flash_erase: Skipping bad block at fdc00000
flash_erase: Skipping bad block at fdd00000
flash_erase: Skipping bad block at fde00000
flash_erase: Skipping bad block at fdf00000
Erasing 1024 Kibyte @ fdf00000 -- 100 % complete

 

ubiformat

 

root:~# ubiformat /dev/mtd3
ubiformat: mtd3 (nand), size 4261412864 bytes (4.0 GiB), 4064 eraseblocks of 1048576 bytes (1024.0 KiB), min. I/O size 8192 bytes
libscan: scanning eraseblock 4063 -- 100 % complete
ubiformat: 4040 eraseblocks are supposedly empty
ubiformat: 6 bad eraseblocks found, numbers: 58, 59, 4060, 4061, 4062, 4063
ubiformat: warning!: 18 of 4058 eraseblocks contain non-UBI data
ubiformat: continue? (y/N) y
ubiformat: warning!: only 0 of 4058 eraseblocks have valid erase counter
ubiformat: erase counter 0 will be used for all eraseblocks
ubiformat: note, arbitrary erase counter value may be specified using -e option
ubiformat: continue? (y/N) y
ubiformat: use erase counter 0 for all eraseblocks
ubiformat: formatting eraseblock 4063 -- 100 % complete

 

ubiattach

root:~# ubiattach -p /dev/mtd3
UBI device number 0, total 4058 LEBs (4188635136 bytes, 3.9 GiB), available 3980 LEBs (4108124160 bytes, 3.8 GiB), LEB size 1032192 bytes (1008.0 KiB)

ubimkvol

root:~#  ubimkvol -N rootfs -m /dev/ubi0
Set volume size to 4108124160
Volume ID 0, size 3980 LEBs (4108124160 bytes, 3.8 GiB), LEB size 1032192 bytes (1008.0 KiB), dynamic, name "rootfs", alignment 1

mount

root:~# mount -t ubifs ubi0:rootfs /tmp/dest
root:~# dmesg
[ 1961.455150] ubi0: attaching mtd3
[ 1964.448284] ubi0: scanning is finished
[ 1964.458122] ubi0: attached mtd3 (name "rootfs", size 4064 MiB)
[ 1964.464000] ubi0: PEB size: 1048576 bytes (1024 KiB), LEB size: 1032192 bytes
[ 1964.471140] ubi0: min./max. I/O unit sizes: 8192/8192, sub-page size 8192
[ 1964.477925] ubi0: VID header offset: 8192 (aligned 8192), data offset: 16384
[ 1964.484964] ubi0: good PEBs: 4058, bad PEBs: 6, corrupted PEBs: 0
[ 1964.491049] ubi0: user volume: 0, internal volumes: 1, max. volumes count: 128
[ 1964.498262] ubi0: max/mean erase counter: 0/0, WL threshold: 4096, image sequence number: 1340877874
[ 1964.507385] ubi0: available PEBs: 3980, total reserved PEBs: 78, PEBs reserved for bad PEB handling: 74
[ 1964.516774] ubi0: background thread "ubi_bgt0d" started, PID 342
[ 2018.260478] UBIFS (ubi0:0): default file-system created
[ 2018.266098] UBIFS (ubi0:0): Mounting in unauthenticated mode
[ 2018.271893] UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 349
[ 2018.471902] UBIFS (ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[ 2018.479298] UBIFS (ubi0:0): LEB size: 1032192 bytes (1008 KiB), min./max. I/O unit sizes: 8192 bytes/8192 bytes
[ 2018.489391] UBIFS (ubi0:0): FS size: 4097802240 bytes (3907 MiB, 3970 LEBs), journal size 33030144 bytes (31 MiB, 32 LEBs)
[ 2018.500424] UBIFS (ubi0:0): reserved for root: 4952683 bytes (4836 KiB)
[ 2018.507031] UBIFS (ubi0:0): media format: w5/r0 (latest is w5/r0), UUID 5868A244-2BCE-420C-B75F-A2F7E307ABAF, small LPT model
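
The sizes in the ubiformat, ubiattach and dmesg output above are consistent with each other, which can be checked with a short calculation. This is a sketch only: it assumes the EC header and the VID header each occupy one min. I/O unit (in line with "VID header offset: 8192" and "data offset: 16384"), and that the available PEBs are the good PEBs minus the reserved ones, which matches this log but is not taken from UBI documentation. The function name is hypothetical.

```python
PEB_SIZE = 1048576     # physical eraseblock size (ubiformat / dmesg)
MIN_IO = 8192          # min. I/O unit size
TOTAL_PEBS = 4064      # eraseblocks on mtd3
BAD_PEBS = 6           # "bad PEBs: 6"
RESERVED_PEBS = 78     # "total reserved PEBs: 78"

def ubi_geometry(peb_size, min_io, total_pebs, bad_pebs, reserved_pebs):
    """Return (LEB size, good PEBs, available LEBs, available bytes)."""
    # Assumption: two header units (EC + VID) are taken from every PEB.
    leb_size = peb_size - 2 * min_io
    good_pebs = total_pebs - bad_pebs
    available = good_pebs - reserved_pebs
    return leb_size, good_pebs, available, available * leb_size

leb_size, good, available, nbytes = ubi_geometry(
    PEB_SIZE, MIN_IO, TOTAL_PEBS, BAD_PEBS, RESERVED_PEBS)
print(f"LEB size {leb_size}, good PEBs {good}, "
      f"available LEBs {available} ({nbytes} bytes)")
```

The results (LEB size 1032192, 4058 good PEBs, 3980 available LEBs, 4108124160 bytes) match the figures printed by ubiattach and ubimkvol, so the UBI setup itself appears sane.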

 
Then I copied data and tested reading it back from the file system:

rsync --info=progress2 -a /tmp/orig/ /tmp/dest
find /tmp/dest -type f -print -exec cat {} + > /dev/null


After some iterations, UBIFS errors appear:

  ........................
[ 4375.577085] CPU: 3 PID: 481 Comm: cat Not tainted 5.4.0-dirty #16
[ 4375.583161] Hardware name: xlnx,zynqmp (DT)
[ 4375.587326] Call trace:
[ 4375.589760]  dump_backtrace+0x0/0x158
[ 4375.593404]  show_stack+0x14/0x20
[ 4375.596704]  dump_stack+0xb0/0xd8
[ 4375.600001]  ubifs_read_node+0x1d0/0x248
[ 4375.603907]  ubifs_tnc_read_node+0xb0/0xb8
[ 4375.607987]  ubifs_tnc_locate+0x1bc/0x1e8
[ 4375.611979]  do_readpage+0x17c/0x3d8
[ 4375.615538]  ubifs_readpage+0x5c/0x460
[ 4375.619271]  generic_file_read_iter+0x550/0xa78
[ 4375.623785]  new_sync_read+0xf4/0x190
[ 4375.627430]  __vfs_read+0x2c/0x40
[ 4375.630728]  vfs_read+0x94/0x160
[ 4375.633940]  ksys_read+0x64/0xf0
[ 4375.637152]  __arm64_sys_read+0x14/0x20
[ 4375.640972]  el0_svc_common.constprop.3+0xb0/0x168
[ 4375.645745]  el0_svc_handler+0x6c/0x88
[ 4375.649477]  el0_svc+0x8/0xc
[ 4375.652354] UBIFS error (ubi0:0 pid 481): do_readpage: cannot read page 0 of inode 17481, error -22
[ 4375.661491] UBIFS error (ubi0:0 pid 481): ubifs_read_node: bad node type (255 but expected 1)
[ 4375.670042] UBIFS error (ubi0:0 pid 481): ubifs_read_node: bad node at LEB 639:729064, LEB mapping status 0
[ 4375.679780] Not a node, first 24 bytes:
[ 4375.679784] 00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
   ........................
[ 4375.696471] CPU: 0 PID: 481 Comm: cat Not tainted 5.4.0-dirty #16
[ 4375.702545] Hardware name: xlnx,zynqmp (DT)
[ 4375.706711] Call trace:
[ 4375.709145]  dump_backtrace+0x0/0x158
[ 4375.712789]  show_stack+0x14/0x20

 
Then I tried nandtest and saw differences between the data written and the data read back. Note that each difference is a single bit (I don't know why).
Starting from scratch (UBI unmounted, MTD detached, and erased again):

root:~# flash_erase /dev/mtd3 0 0
Erasing 1024 Kibyte @ 3900000 --  1 % complete flash_erase: Skipping bad block at 03a00000
flash_erase: Skipping bad block at 03b00000
Erasing 1024 Kibyte @ fdb00000 -- 99 % complete flash_erase: Skipping bad block at fdc00000
flash_erase: Skipping bad block at fdd00000
flash_erase: Skipping bad block at fde00000
flash_erase: Skipping bad block at fdf00000
Erasing 1024 Kibyte @ fdf00000 -- 100 % complete

root:~# nandtest /dev/mtd3
ECC corrections: 0
ECC failures   : 0
Bad blocks     : 2
BBT blocks     : 4
01900000: checking...of 4)...
compare failed. seed 524533467
Byte 0x9217c is a8 should be ac
01900000: checking...of 4)...
compare failed. seed 524533467
Byte 0x9217c is a8 should be ac
01900000: checking...of 4)...
compare failed. seed 524533467
Byte 0x9217c is a8 should be ac
01900000: checking...of 4)...
compare failed. seed 524533467
Byte 0x9217c is a8 should be ac
read/check 4 of 4 failed. seed 524533467
02000000: checking...of 4)...
compare failed. seed 2145676023
Byte 0x480df is 4a should be ca
02000000: checking...of 4)...
compare failed. seed 2145676023
Byte 0x480df is 4a should be ca
02000000: checking...of 4)...
compare failed. seed 2145676023
Byte 0x480df is 4a should be ca
02000000: checking...of 4)...
compare failed. seed 2145676023
Byte 0x480df is 4a should be ca
read/check 4 of 4 failed. seed 2145676023
02100000: checking...of 4)...
compare failed. seed 679366166
Byte 0x5803a is 3c should be 38
02100000: checking...of 4)...
compare failed. seed 679366166
Byte 0x5803a is 3c should be 38
02100000: checking...of 4)...
compare failed. seed 679366166
Byte 0x5803a is 3c should be 38
02100000: checking...of 4)...
compare failed. seed 679366166
Byte 0x5803a is 3c should be 38
read/check 4 of 4 failed. seed 679366166
...

 

So I think the failures reported by nandtest are caused by failures in the ECC engine (Arasan controller) or by something else in the NAND controller driver.

Many thanks and regards!
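
To see where the failing bytes sit, the nandtest offsets above can be split into a page index and a column within the page, together with the direction of the flipped bit. A minimal sketch, assuming the byte offsets printed by nandtest are relative to the start of the eraseblock being checked and using the 8192-byte page size from the ubiformat output; `locate` is a made-up helper name.

```python
PAGE_SIZE = 8192  # min. I/O unit size reported by ubiformat above

# (eraseblock address, byte offset, byte read, byte expected) from nandtest.
FAILURES = [
    (0x01900000, 0x9217c, 0xa8, 0xac),
    (0x02000000, 0x480df, 0x4a, 0xca),
    (0x02100000, 0x5803a, 0x3c, 0x38),
]

def locate(offset, read, expected, page_size=PAGE_SIZE):
    """Return (page index in block, column in page, 1->0 mask, 0->1 mask)."""
    page, column = divmod(offset, page_size)
    diff = read ^ expected
    return page, column, diff & expected, diff & read

for block, offset, read, expected in FAILURES:
    page, column, lost, gained = locate(offset, read, expected)
    print(f"block 0x{block:08x} page {page} column 0x{column:x} "
          f"1->0 mask 0x{lost:02x} 0->1 mask 0x{gained:02x}")
```

For these three samples each failure is a single flipped bit, two of them 1 to 0 and one 0 to 1, and all three land in the first 512 bytes of their page. Three samples are far too few to conclude anything, but it may be worth checking whether that holds across the full nandtest output.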

watari
Teacher
174 Views
Registered: ‎06-16-2013

Hi @joescou 

 

Which value is the expected data at byte 0x9217c: a8 or ac?

 

As you may know, bits default to high (1) after the NAND flash is erased.

If the expected value is ac, the controller wrote a8 instead of ac, which looks like an ECC engine issue.

If the expected value is a8, it might be a controller issue, a power issue, or a NAND flash issue.

 

Best regards,

joescou
Visitor
131 Views
Registered: ‎02-09-2021

Hi @watari

The expected data at byte 0x9217c was 0xac (10101100), but I read 0xa8 (10101000), so by your reasoning that points to an ECC engine issue.
But byte 0x5803a should have been 0x38 (00111000), and I read 0x3c (00111100), so in that case it could be a controller, NAND flash, or power issue.

The ECC engine is also inside the controller (HW mode), so this seems to be a problem with the Arasan controller (or with its interaction with the NAND flash memory).

There is not much detailed information on the Arasan controller, and there are many issues related to it.

Best regards,

 
