maenpaa
Adventurer
Registered: ‎09-30-2014

QSPI flash controller failure

Dear experts,

 

I am having trouble accessing QSPI memory on an xc7z100 (PSS_IDCODE == 0x3736093) when running Linux. I have only been able to reproduce this problem on the xc7z100; it does not show up on smaller Zynq devices.

 

In the Linux kernel code retrieved from https://github.com/Xilinx/linux-xlnx.git, the initial communication with QSPI happens in standard SPI mode. The firmware reads the device ID and other data, to properly talk to the device.

 

This works most of the time. Sometimes, however, the transfer gets corrupted: the Zynq sends fewer clock cycles to the SPI bus than the kernel asks for, yet status register 0xE000D004 updates as if the bus transfer were complete. Corrupted data is then found in Rx_data_REG. (The octets are right but misplaced, and some data is missing because it was never transferred over the bus.)

 

How can I work around this so that the system comes up every time?

 

Example demonstration of the problem:
The image below shows part of an SPI bus transfer. Signal quality is good, and the QSPI chip responds correctly according to the specs. In this example transfer, the kernel is reading the device ID (command RDID, 0x9f). The expected reply is eight bytes (changed from the default of six, SPI_NOR_MAX_ID_LEN, for demonstration purposes). The register TXD0 was written twice by spi-zynq-qspi.c:zynq_qspi_fill_tx_fifo. Yet the chip only sends 40 clock pulses and claims to have received 0x0000000102204d00, where 0x0102204d00 is the data that was actually seen on the bus. In the successful case, the reply would have been 0x0102204d00804331. (The type of corruption is not fully deterministic; sometimes the corruption is caused by extraneous SCK pulses in a preceding step.)

kuva_.png
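
The arithmetic behind this capture can be checked with a short script. This is only a sketch using the numbers quoted above (two TXD0 writes, 40 SCK pulses, the two reply values). The helper name `clock_shortfall` is made up for illustration, and it assumes that each TXD0 write requests 32 clock cycles in standard SPI mode.

```python
# Values quoted in the post above; the helper name is hypothetical.
EXPECTED_REPLY = bytes.fromhex("0102204d00804331")  # reply in the successful case
REPORTED_REPLY = bytes.fromhex("0000000102204d00")  # what the controller returned
OBSERVED_SCK_PULSES = 40                            # counted on the scope capture
TXD0_WRITES = 2                                     # assumed: 32 bits per write


def clock_shortfall(txd0_writes, observed_pulses):
    """Return (missing_bits, missing_bytes) relative to the requested transfer."""
    requested = txd0_writes * 32
    missing_bits = requested - observed_pulses
    return missing_bits, missing_bits // 8


missing_bits, missing_bytes = clock_shortfall(TXD0_WRITES, OBSERVED_SCK_PULSES)
bytes_on_bus = OBSERVED_SCK_PULSES // 8

# The bytes that did cross the bus are a correct prefix of the expected reply,
on_bus = EXPECTED_REPLY[:bytes_on_bus]
# and they show up at the end of the reported value, after the zero bytes.
print(missing_bits, missing_bytes, on_bus.hex(), REPORTED_REPLY.endswith(on_bus))
```

The shortfall is 24 bits, that is exactly 3 bytes, which matches the three zero bytes at the start of the reported value.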

 

Rebooting the Zynq appears to affect the behavior. Sometimes everything works reliably. If, however, things go wrong, the odds of corruption are above 50% per bus operation. Approximately one in ten reboots leads to the undesirable state.

20 Replies
maenpaa
Adventurer
Registered: ‎09-30-2014

The above question addresses the QSPI driver in the Zynq chip (chapter 12 in UG585), in I/O mode.

Linux is mentioned only because I am using it to talk to the QSPI driver in the Zynq.

ericv
Scholar
Registered: ‎04-13-2015

We've developed a QSPI driver for the Zynq and ran into the same kind of issues.
Lowering the bus clock helped sometimes.
We didn't see any improvement from playing with the delays.
QSPI parts from some manufacturers seem to behave much worse than others.
maenpaa
Adventurer
Registered: ‎09-30-2014

@ericv,

 

Could you be more verbose? What were the "worst performing chips"?

 

I have not noticed any dependence on bus clock frequency.

ericv
Scholar
Registered: ‎04-13-2015

We've written QSPI drivers for Cyclone V, Arria V & Zynq, testing them with about 75 different parts mounted on paddle boards.

They all work OK at max speed on the Altera chips, but we saw the same clock slipping issue on the Zynq.

We did not see any correlation with the memory size, but lowering the clock did help in some cases.

Almost all of the Micron & Macronix parts we have are fine on the Zynq.

Some of the Spansion (or Cypress) and Winbond parts we have were the ones with the most issues.

What memory size do you need?

When I have time I can check if we have one of that size that works properly.

 

maenpaa
Adventurer
Registered: ‎09-30-2014

I have been using S25FL512S, with Spansion logo printed on the case.

 

How does the memory type affect this? In this particular mode, Zynq is driving the clock as far as I have understood.

ericv
Scholar
Registered: ‎04-13-2015

I completely agree about the clock.

My guess is that the output pin drive capability may not be strong enough, but some manufacturers' inputs may be more tolerant of out-of-spec signals than others.

I tried to change the drive strength in Vivado but couldn't, most likely due to my lack of knowledge of Vivado.

 

As for the S25FL512S: we got the S25FL256S on the ZedBoard to work.

It exhibits some random errors at 100 MHz in 1-4-4 mode (it requires 5 dummy cycles, but the driver does not use the controller's dummy byte insertion).

The S25FL127S exhibits the clock slipping at 100 MHz, but not at 50 MHz.

These are the Spansion parts I've quickly checked.

You could try the Micron N25Q512; it has the same pin-out as the S25FL512S.

That one works OK at 50 MHz but, like the S25FL256S, it's a bit flaky at 100 MHz.

 

The 100 MHz limitation could be due to our handmade paddle boards.

 

 

 

smarell
Community Manager
Registered: ‎07-23-2012

Are you using feedback clock when operating at 100 MHz?
ericv
Scholar
Registered: ‎04-13-2015

No, it's disabled.

Talking about the clock: the fact that the controller has this feature suggests that the clock used for RX is the pin clock and not the internal bus clock.

This is not clear from the block diagram in the QSPI section of the TRM.

If that's the case, you should check the clock signal with a high-bandwidth scope instead of a logic analyzer.

 

ericv
Scholar
Registered: ‎04-13-2015

If you want to try our driver, you can get it from here:

http://www.code-time.com/bsp.html

The projects are for the ZedBoard but I think the demos should work on all Zynq chips.

 

 

 

maenpaa
Adventurer
Registered: ‎09-30-2014

I have the feeling that @ericv and I are seeing different problems.

I am running at lower speeds (50 MHz; I also tried significantly slower ones). Trace lengths are also rather short (~3 cm, ~6 pF, no connectors or anything), so my issue is probably not caused by transmission line distortions.

In my case, if the initial handshake works, there is no corruption in the torture test either.

ericv
Scholar
Registered: ‎04-13-2015

You have a good point :-)

I've carefully looked again at your first post and here's what I think seems to be happening:

You are receiving 3 extra bytes at the beginning (000000) and you are missing the last 3 bytes (804331).

This would be consistent with the RX FIFO already holding 3 bytes when the transfer through the TXD0 register starts.

When writing to / reading from the QSPI controller FIFOs, unless the TXD1, TXD2 or TXD3 registers are used, data is written / read in chunks of 4 bytes, and you've indicated that TXD0 is written twice (so 2 times 4 bytes).

I am not familiar with the Linux driver; it may not make sure the RX FIFO is empty before starting a new transfer.

If there is leftover data in the RX FIFO, you should randomly see 1, 2 or 3 extra & missing bytes (the same number of extra as missing each time).
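
To make this hypothesis concrete, here is a toy model of it. It is a sketch, not a description of the real controller: it treats the RX FIFO as a plain byte queue, assumes the stale bytes are zeros, and ignores how bytes are ordered inside a 32-bit register word. The names `read_words` and `simulate` are invented for the example.

```python
from collections import deque

# Reply expected in the successful case, as quoted in the first post.
REPLY = bytes.fromhex("0102204d00804331")


def read_words(rx_fifo, n_words):
    """Pop n_words 4-byte entries from a byte-granular toy FIFO."""
    return bytes(rx_fifo.popleft() for _ in range(4 * n_words))


def simulate(stale_bytes, reply):
    """Model two 32-bit RX reads when `stale_bytes` zero bytes were left over."""
    rx_fifo = deque(bytes(stale_bytes))  # assumed leftover content: zeros
    rx_fifo.extend(reply)                # bytes shifted in during the transfer
    return read_words(rx_fifo, 2)        # driver reads two 32-bit entries


for stale in range(4):
    print(stale, simulate(stale, REPLY).hex())
```

With three stale bytes the model returns 0000000102204d00, the value reported in the first post, and with one or two stale bytes it loses the same number of bytes at the end as it gains at the start.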

 

 

 

maenpaa
Adventurer
Registered: ‎09-30-2014

@ericv,

 

I would summarize my problem like this: the number of clock cycles on the bus occasionally differs from what was requested by a multiple of 8 bits.

 

>You are receiving 3 extra bytes at the beginning (000000) and you are missing the last 3 bytes (804331).

 

Quite right. The extra bytes (000000) never physically appeared on the bus. The number of physical SPI clock cycles was lower than requested in this example. Since I am reading the 32-bit RX FIFO twice, 64 bits appear as a result of these two reads. What I found fascinating was that the FIFO data hints at the first lword being wrong, rather than at a premature termination of the bus transaction.

 

 

>I am not familiar with the Linux driver, it may not make sure the RX FIFO is empty before starting a new transfer.

 

I browsed through the relevant routines, and found no bugs. But then, I am not an expert in kernel debugging.

 

ericv
Scholar
Registered: ‎04-13-2015

Shouldn't TXD0 be written 3 times instead of 2?

Your 8-byte reply from the flash memory follows the Read JEDEC ID command byte (0x9F).

To do so, TXD0 should receive these writes:

0x9FXXXXXX    : op-code + 3 dummy bytes for 3 bytes read

0xXXXXXXXX   : 4 dummy bytes for 4 byte read

0xXXXXXXXX   : 4 dummy for 1 byte read + filling
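
The layout above can be generated for any reply length. The sketch below assumes, as in this proposal, that the op-code and the dummy bytes all go through TXD0 in 4-byte groups. The function name `rdid_tx_words` and the choice of 0x00 as the dummy value are assumptions for illustration, and the order of the bytes inside the actual 32-bit register write is not modelled.

```python
def rdid_tx_words(reply_len, opcode=0x9F, dummy=0x00):
    """Return the 4-byte groups to push through TXD0 for one RDID transfer."""
    # One op-code byte, then one dummy byte clocked out per reply byte.
    payload = bytes([opcode]) + bytes([dummy]) * reply_len
    # Pad to a multiple of 4 bytes, since TXD0 always takes a full word.
    pad = -len(payload) % 4
    payload += bytes([dummy]) * pad
    return [payload[i:i + 4] for i in range(0, len(payload), 4)]


words = rdid_tx_words(8)
for w in words:
    print(w.hex())
```

For an 8-byte reply this yields three groups: the op-code plus 3 dummies, 4 dummies, and a last group where only the first dummy fetches reply data and the other three are filling.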

 

 

maenpaa
Adventurer
Registered: ‎09-30-2014

Maybe I should have been more thorough in my description. The bus transfer starts with chip select, followed by the query (a TXD1 write containing the actual bus command 0x9f, and a subsequent RXD flush). These parts did not fit into the scope capture: Linux is not an RTOS, and there are large gaps between the steps.

From the HW point of view, it could be done as you suggested. However, that would require a large rework of the driver.

ericv
Scholar
Registered: ‎04-13-2015

Section 12.2.3 in the TRM / UG585:

 

The user must empty the TxFIFO between consecutive accesses from:
• TXD0 to TXD1/TXD2/TXD3
• TXD1 to TXD0/TXD1/TXD2/TXD3
• TXD2 to TXD0/TXD1/TXD2/TXD3
• TXD3 to TXD0/TXD1/TXD2/TXD3

 

There is no way to manually empty the TX FIFO.

After the op-code is written with TXD1, is there a check on TX FIFO empty (with TX threshold set to 0) before moving on using TXD0?

 

 

 

 

 

 

maenpaa
Adventurer
Registered: ‎09-30-2014

@ericv,

 

It would be really convenient to have explicit TXDx-empty flags somewhere, I agree. However, the information can be obtained indirectly: the number of data entries in the RX FIFO matches the number of TXDx writes. Once the corresponding data has been seen in the RX FIFO, one can conclude that the TX FIFO has been emptied.

Note: I did not cross-check this before replying, so I might remember it wrong.
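
The idea can be sketched with a toy model. Everything here is hypothetical: `ToyQspi` and `wait_tx_drained` are made-up names, and the model simply assumes that every TX entry shifted out produces exactly one RX entry and that the RX FIFO starts empty. Whether the real controller behaves this way is the uncross-checked part.

```python
from collections import deque


class ToyQspi:
    """Toy controller: every TX entry shifted out yields exactly one RX entry."""

    def __init__(self):
        self.tx = deque()
        self.rx = deque()

    def write_txd(self, entry):
        self.tx.append(entry)

    def clock(self):
        """Shift out one TX entry, if any, and capture one RX entry for it."""
        if self.tx:
            self.tx.popleft()
            self.rx.append(0x00)  # placeholder for the captured data


def wait_tx_drained(dev, writes_issued, max_polls=100):
    """Read RX entries until their count matches the number of TX writes."""
    received = 0
    for _ in range(max_polls):
        while dev.rx:
            dev.rx.popleft()
            received += 1
        if received == writes_issued:
            return True   # all writes accounted for, so TX must be empty
        dev.clock()       # stand-in for time passing on the bus
    return False          # give up instead of spinning forever


dev = ToyQspi()
dev.write_txd(0x9F)
dev.write_txd(0x00)
drained = wait_tx_drained(dev, writes_issued=2)
print(drained, len(dev.tx))
```

The count only means something if the RX FIFO was empty before the first write; any stale entry would make the driver conclude too early that the TX FIFO has drained.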

ericv
Scholar
Registered: ‎04-13-2015

@maenpaa

I agree...

except that there are no registers that report the number of entries in the FIFOs, and the FIFO full/empty flags are typically for 32 bytes.

If the RX FIFO already contains a few unread bytes before the TXD1 write is issued, I am not sure how the flags behave, and it may or may not explain the extra received bytes you see.

ericv
Scholar
Registered: ‎04-13-2015

I forgot to mention in my previous comment this key passage from section 12.2.4 of the TRM:

 

The RxFIFO interrupt status bit indicates when data is available before data is actually available for
read. The latency is associated with clock domain crossing and is almost always made-up by the time
that software takes to service the interrupt.

 

 

maenpaa
Adventurer
Registered: ‎09-30-2014

I agree that "almost always" is a surprising expression in a TRM. It would be nice if the spec were more explicit about how long a wait is guaranteed to suffice. My feeling is that a 90% success rate does not qualify as almost always, so these are probably unrelated.

trenz-al
Scholar
Registered: ‎11-09-2013

Maybe the "always" should be read as "mostly", meaning that MOST of the time that makes up the latency is taken by the interrupt latency.
