Arasan NAND controller Suspected DMA problems in Petalinux 2019.2


Hi There !

We have a ZynqMP system with Micron MT29F4G08 NAND (it's an MT29UZ4B8DZZHGPB-107 DDR+Flash combo device). We run UBIFS and over the last year have noticed two cases of flash file system corruption. This led me to try the mtd-utils 2.1.1 tests, and sure enough the io_paral test fails about every fifth time.

Using Petalinux 2017.3 I've made the following observations - 

* When it fails, the read buffer contains an entire page (2048 bytes) of zeros where data should be present. Data after the failed page is correct.

* Only reads fail (I verified this by patching mtd-utils to retry the read).
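
The retry check mentioned above can be sketched roughly as follows. This is not the actual mtd-utils patch from the post; the function name `read_verify_retry` and its return codes are hypothetical, and it assumes the volume can be read with plain `pread()` on an open file descriptor. The idea is that if a later re-read of the same range returns the expected data, the data on flash was fine and the first read was at fault.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for pread() */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read len bytes at offset off from fd into buf and
 * compare them with expected. On a mismatch, re-read up to max_retries
 * more times.
 *
 * Returns:
 *    0  the first read matched
 *   >0  the n-th re-read matched (the earlier reads were bad, so the
 *       stored data is intact and the read path is suspect)
 *   -1  the data never matched (the stored data is probably wrong)
 *   -2  I/O error or short read
 */
int read_verify_retry(int fd, off_t off, const unsigned char *expected,
                      unsigned char *buf, size_t len, int max_retries)
{
    for (int attempt = 0; attempt <= max_retries; attempt++) {
        ssize_t got = pread(fd, buf, len, off);

        if (got < 0 || (size_t)got != len)
            return -2;
        if (memcmp(buf, expected, len) == 0)
            return attempt;
    }
    return -1;
}
```

A positive return value points at the read path, while -1 suggests the write did not land correctly. Whether a re-read really goes back to the flash depends on any caching between the caller and the device, which this sketch does not control.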



Using Petalinux 2019.2 I've made the following observations - 

* Both reads and writes fail, but writes fail more often. The failure mode is different; see below.


Running io_paral /dev/ubi5
[io_paral] write_thread():301: written and read data are different at byte: 14640

[io_paral] write_thread():303: Wbuf: -- Data Written

9D 7F 3A CD 5E 2A 98 1F  17 35 AF E1 BD 0C 3B EC  |  ..:.^*...5....;.
B4 36 2A CA 05 67 07 D7  44 E6 EE D4 5C C7 AC 0E  |  .6*..g..D...\...
8B B8 2B F4 C6 31 5C 21  C3 90 C5 0D AF 7F 4C C5  |  ..+..1\!......L.
D4 74 57 00 E2 C0 10 5E  7C 21 92 F7 FD 99 81 D4  |  .tW....^|!......
39 92 4C 28 D4 03 9E D6  E9 1C 10 D7 3E 67 77 6A  |  9.L(........>gwj

C5 BE F2 FA F2 B3 A7 89  15 EC 47 EA 54 84 CD 16  |  ..........G.T...
EC 96 3C 83 CC D0 8E 71  50 40 5E A9 93 25 DD 6E  |  ..<....qP@^..%.n


[io_paral] write_thread():309: Rbuf:  -- Data Read back

9D 7F 3A CD 5E 2A 98 1F  17 35 AF E1 BD 0C 3B EC  |  ..:.^*...5....;.
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................

C5 BE F2 FA F2 B3 A7 89  15 EC 47 EA 54 84 CD 16  |  ..........G.T...
EC 96 3C 83 CC D0 8E 71  50 40 5E A9 93 25 DD 6E  |  ..<....qP@^..%.n

Instead of full pages being incorrect, I now see blocks of 64 bytes and 128 bytes that are incorrect. After the incorrect data I see 960 bytes of correct data, then another block of 128 bytes of incorrect data. I've attached the full printout along with the MTD patch if you want to try it out.
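
To measure block sizes like the 64-byte and 128-byte runs described above, the written and read-back buffers can be scanned for contiguous runs of differing bytes. The sketch below is an illustration only, not part of io_paral; `struct mismatch_run`, `find_mismatch_runs` and the `max_gap` parameter are hypothetical names. The gap tolerance matters because the test data is random: the written data in the dump above contains a `00` byte that happens to match the zeroed read-back, which would otherwise split one bad block into two.

```c
#include <stddef.h>

/* One contiguous run of bytes where the read-back data differs. */
struct mismatch_run {
    size_t offset;  /* index of the first differing byte */
    size_t length;  /* bytes from the first to the last differing byte */
};

/*
 * Compare wbuf and rbuf (len bytes each) and store up to max_runs runs
 * of differing bytes in runs[]. Differing bytes separated by at most
 * max_gap matching bytes are merged into one run.
 * Returns the total number of runs found, which may exceed max_runs.
 */
size_t find_mismatch_runs(const unsigned char *wbuf, const unsigned char *rbuf,
                          size_t len, size_t max_gap,
                          struct mismatch_run *runs, size_t max_runs)
{
    size_t count = 0;
    size_t i = 0;

    while (i < len) {
        if (wbuf[i] == rbuf[i]) {
            i++;
            continue;
        }

        size_t start = i;
        size_t last_bad = i;  /* last differing byte seen in this run */

        /* Extend the run until more than max_gap matching bytes follow. */
        for (i = start + 1; i < len && i - last_bad <= max_gap + 1; i++) {
            if (wbuf[i] != rbuf[i])
                last_bad = i;
        }

        if (count < max_runs) {
            runs[count].offset = start;
            runs[count].length = last_bad - start + 1;
        }
        count++;
        i = last_bad + 1;
    }
    return count;
}
```

Printing `offset % 2048` for each run (2048 bytes being the page size mentioned earlier) would show whether the bad blocks sit at fixed positions within a page, which is the kind of regularity that would support or weaken the DMA theory.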

It looks to me like some form of DMA problem. Is anyone else seeing this, or does anyone have an idea what to try next?

Thanks in advance, 

​/Otto

Accepted Solutions

For anyone suffering from this problem: adding

nand-ecc-mode = "on-die";

to the device tree seems to have fixed it.

 

Actually, the 2019.2 driver for the Arasan block does not have proper support for on-die ECC. Xilinx did put together a patch for it that should be available in an AR (Answer Record) somewhere; if not, let me know and I can send it.

@ottob

Hello ottob,

Could you tell me where the patch is?


Here it is (the system would not allow me to attach it). Are you using Micron NAND as well?

 

 

From 9c06b93a38fc98fa54c7fc3f30572796bdc31b59 Mon Sep 17 00:00:00 2001
From: Naga Sureshkumar Relli <naga.sureshkumar.relli@xilinx.com>
Date: Tue, 11 Feb 2020 23:03:04 -0700
Subject: [LINUX PATCH] mtd: rawnand: Add on-die ecc support

Add new APIs anfc_{read/write}_page(), when on-die ecc enabled.
Also don't overwrite read/write_page hooks when driver is already
registered these hooks.

Signed-off-by: Naga Sureshkumar Relli <naga.sureshkumar.relli@xilinx.com>
---
drivers/mtd/nand/raw/arasan_nand.c | 151 ++++++++++++++++++++++-------
drivers/mtd/nand/raw/nand_micron.c | 7 +-
2 files changed, 120 insertions(+), 38 deletions(-)

diff --git a/drivers/mtd/nand/raw/arasan_nand.c b/drivers/mtd/nand/raw/arasan_nand.c
index 58ad872b3476..825ba406d57e 100644
--- a/drivers/mtd/nand/raw/arasan_nand.c
+++ b/drivers/mtd/nand/raw/arasan_nand.c
@@ -217,6 +217,32 @@ static const struct mtd_ooblayout_ops anfc_ooblayout_ops = {
.free = anfc_ooblayout_free,
};

+/* Generic flash bbt decriptors */
+static u8 bbt_pattern[] = { 'B', 'b', 't', '0' };
+static u8 mirror_pattern[] = { '1', 't', 'b', 'B' };
+
+static struct nand_bbt_descr bbt_main_descr = {
+ .options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE
+ | NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP |
+ NAND_BBT_SCAN2NDPAGE,
+ .offs = 4,
+ .len = 4,
+ .veroffs = 20,
+ .maxblocks = 4,
+ .pattern = bbt_pattern
+};
+
+static struct nand_bbt_descr bbt_mirror_descr = {
+ .options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE
+ | NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP |
+ NAND_BBT_SCAN2NDPAGE,
+ .offs = 4,
+ .len = 4,
+ .veroffs = 20,
+ .maxblocks = 4,
+ .pattern = mirror_pattern
+};
+
static inline struct anfc_nand_chip *to_anfc_nand(struct nand_chip *nand)
{
return container_of(nand, struct anfc_nand_chip, chip);
@@ -429,6 +455,47 @@ static void anfc_write_data_op(struct nand_chip *chip, const u8 *buf,
pktsize);
}

+static int anfc_read_page(struct mtd_info *mtd,
+ struct nand_chip *chip, uint8_t *buf,
+ int oob_required, int page)
+{
+ u32 ret;
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+
+ ret = nand_read_page_op(chip, page, 0, NULL, 0);
+ if (ret)
+ return ret;
+
+ anfc_read_data_op(chip, buf, mtd->writesize,
+ DIV_ROUND_UP(mtd->writesize, achip->pktsize),
+ achip->pktsize);
+ if (oob_required)
+ chip->ecc.read_oob(mtd, chip, page);
+
+ return 0;
+}
+
+static int anfc_write_page(struct mtd_info *mtd,
+ struct nand_chip *chip, const uint8_t *buf,
+ int oob_required, int page)
+{
+ u32 ret;
+ struct anfc_nand_chip *achip = to_anfc_nand(chip);
+
+ ret = nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+ if (ret)
+ return ret;
+
+ anfc_write_data_op(chip, buf, mtd->writesize,
+ DIV_ROUND_UP(mtd->writesize, achip->pktsize),
+ achip->pktsize);
+
+ if (oob_required)
+ chip->ecc.write_oob(mtd, chip, page);
+
+ return 0;
+}
+
static int anfc_read_page_hwecc(struct mtd_info *mtd,
struct nand_chip *chip, u8 *buf,
int oob_required, int page)
@@ -542,47 +609,59 @@ static int anfc_ecc_init(struct mtd_info *mtd,
unsigned int ecc_strength, steps;
struct nand_chip *chip = mtd_to_nand(mtd);
struct anfc_nand_chip *achip = to_anfc_nand(chip);
+ struct anfc_nand_controller *nfc = to_anfc(chip->controller);

- ecc->mode = NAND_ECC_HW;
- ecc->read_page = anfc_read_page_hwecc;
- ecc->write_page = anfc_write_page_hwecc;
+ if (ecc_mode == NAND_ECC_ON_DIE) {
+ anfc_config_ecc(nfc, 0);
+ ecc->strength = 1;
+ ecc->bytes = 0;
+ ecc->size = mtd->writesize;
+ ecc->read_page = anfc_read_page;
+ ecc->write_page = anfc_write_page;
+ chip->bbt_td = &bbt_main_descr;
+ chip->bbt_md = &bbt_mirror_descr;
+ } else {
+ ecc->mode = NAND_ECC_HW;
+ ecc->read_page = anfc_read_page_hwecc;
+ ecc->write_page = anfc_write_page_hwecc;

- mtd_set_ooblayout(mtd, &anfc_ooblayout_ops);
+ mtd_set_ooblayout(mtd, &anfc_ooblayout_ops);

- steps = mtd->writesize / chip->ecc_step_ds;
+ steps = mtd->writesize / chip->ecc_step_ds;

- switch (chip->ecc_strength_ds) {
- case 12:
- ecc_strength = 0x1;
- break;
- case 8:
- ecc_strength = 0x2;
- break;
- case 4:
- ecc_strength = 0x3;
- break;
- case 24:
- ecc_strength = 0x4;
- break;
- default:
- ecc_strength = 0x0;
+ switch (chip->ecc_strength_ds) {
+ case 12:
+ ecc_strength = 0x1;
+ break;
+ case 8:
+ ecc_strength = 0x2;
+ break;
+ case 4:
+ ecc_strength = 0x3;
+ break;
+ case 24:
+ ecc_strength = 0x4;
+ break;
+ default:
+ ecc_strength = 0x0;
+ }
+ if (!ecc_strength)
+ ecc->total = 3 * steps;
+ else
+ ecc->total =
+ DIV_ROUND_UP(fls(8 * chip->ecc_step_ds) *
+ chip->ecc_strength_ds * steps, 8);
+
+ ecc->strength = chip->ecc_strength_ds;
+ ecc->size = chip->ecc_step_ds;
+ ecc->bytes = ecc->total / steps;
+ ecc->steps = steps;
+ achip->ecc_strength = ecc_strength;
+ achip->strength = achip->ecc_strength;
+ ecc_addr = mtd->writesize + (mtd->oobsize - ecc->total);
+ achip->eccval = ecc_addr | (ecc->total << ECC_SIZE_SHIFT) |
+ (achip->strength << BCH_EN_SHIFT);
}
- if (!ecc_strength)
- ecc->total = 3 * steps;
- else
- ecc->total =
- DIV_ROUND_UP(fls(8 * chip->ecc_step_ds) *
- chip->ecc_strength_ds * steps, 8);
-
- ecc->strength = chip->ecc_strength_ds;
- ecc->size = chip->ecc_step_ds;
- ecc->bytes = ecc->total / steps;
- ecc->steps = steps;
- achip->ecc_strength = ecc_strength;
- achip->strength = achip->ecc_strength;
- ecc_addr = mtd->writesize + (mtd->oobsize - ecc->total);
- achip->eccval = ecc_addr | (ecc->total << ECC_SIZE_SHIFT) |
- (achip->strength << BCH_EN_SHIFT);

if (chip->ecc_step_ds >= 1024)
achip->pktsize = 1024;
diff --git a/drivers/mtd/nand/raw/nand_micron.c b/drivers/mtd/nand/raw/nand_micron.c
index f5dc0a7a2456..f0b9663626df 100644
--- a/drivers/mtd/nand/raw/nand_micron.c
+++ b/drivers/mtd/nand/raw/nand_micron.c
@@ -501,8 +501,11 @@ static int micron_nand_init(struct nand_chip *chip)
chip->ecc.size = 512;
chip->ecc.strength = chip->ecc_strength_ds;
chip->ecc.algo = NAND_ECC_BCH;
- chip->ecc.read_page = micron_nand_read_page_on_die_ecc;
- chip->ecc.write_page = micron_nand_write_page_on_die_ecc;
+ if (!chip->ecc.read_page)
+ chip->ecc.read_page = micron_nand_read_page_on_die_ecc;
+
+ if (!chip->ecc.write_page)
+ chip->ecc.write_page = micron_nand_write_page_on_die_ecc;

if (ondie == MICRON_ON_DIE_MANDATORY) {
chip->ecc.read_page_raw = nand_read_page_raw_notsupp;
--
2.17.1

 


Hello @ottob,

Thank you for the reply.

It worked out well, thanks to you.
