03-05-2020 06:36 AM
I am working with a custom board that we designed, which carries an UltraScale+ Zynq (ZU6) and 9 DDR4 components (9 x8 components for a 64-bit interface plus ECC). We first successfully tested the prototype with Samsung's 8Gb K4A8G085WB dies, rated up to 2666 MT/s, for a total of 8GB of RAM on the PS controller running at up to 2400 MT/s.
Happy with the results, we proceeded to test the final DDR4 16Gb K4AAG085WB monolithic dies, also from Samsung and from the same series, differing only in capacity and a lower maximum speed rating of 2400 MT/s. We were fairly confident they would work, since the only difference is that they use 17 instead of 16 row address bits, and on DDR4 the A16 pin is multiplexed with the RAS_n signal, so it is always connected. In Vivado's Processing System Configuration we changed DRAM Device Capacity (per die) to 16384 MBits and Row Address Count (Bits) to 17, then ran the standalone DRAM test from SDK 2019.1. Running the memory test returned a massive number of errors, but only during the first test method, MT0(0):
Starting Memory Test...
16384MB length - Address 0x0...
---------+--------+------------------------------------------------+-----------
  TEST   | ERROR  |           PER-BYTE-LANE ERROR COUNT            |   TIME
         | COUNT  |  #0 ,  #1 ,  #2 ,  #3 ,  #4 ,  #5 ,  #6 ,  #7  |  (sec)
---------+--------+------------------------------------------------+-----------
Memtest_0 ERROR: Addr=0x00 rd/RefVal/xor =0x0000000C00000000 0x0000000400000000 0x0000000800000000
Memtest_0 ERROR: Addr=0x10 rd/RefVal/xor =0x0000001C00000010 0x0000001400000010 0x0000000800000000
Memtest_0 ERROR: Addr=0x20 rd/RefVal/xor =0x0000002C00000020 0x0000002400000020 0x0000000800000000
Memtest_0 ERROR: Addr=0x30 rd/RefVal/xor =0x0000003C00000030 0x0000003400000030 0x0000000800000000
Memtest_0 ERROR: Addr=0x40 rd/RefVal/xor =0x0000004C00000040 0x0000004400000040 0x0000000800000000
Memtest_0 ERROR: Addr=0x50 rd/RefVal/xor =0x0000005C00000050 0x0000005400000050 0x0000000800000000
Memtest_0 ERROR: Addr=0x60 rd/RefVal/xor =0x0000006C00000060 0x0000006400000060 0x0000000800000000
Memtest_0 ERROR: Addr=0x70 rd/RefVal/xor =0x0000007C00000070 0x0000007400000070 0x0000000800000000
Memtest_0 ERROR: Addr=0x80 rd/RefVal/xor =0x0000008C00000080 0x0000008400000080 0x0000000800000000
Memtest_0 ERROR: Addr=0x90 rd/RefVal/xor =0x0000009C00000090 0x0000009400000090 0x0000000800000000
MT0( 0)  | 939521936 | 0, 0, 0, 805304208, 939521936, 0, 0, 805304208 | 390.974090
All other tests pass with no errors.
Since the only thing that changed was the use of the A16 address line, we expected it to be the culprit. To verify this, we changed the addressing map from BANK COL ROW to ROW BANK COL, and the erroneous memory area moved to the boundary between the lower and upper 8GB. We also tried reducing the speed and relaxing the timings, down to 1600 MT/s, with no success.
We designed a custom test to better understand what is happening: we write data to an address and to the same address plus an 8GB offset, then read back both locations. Writing at the 8GB offset overwrites the contents at the original address, and the same happens in the other direction, so reads and writes appear to be insensitive to the A16 address bit and the two addresses alias to the same physical memory.
Currently we run the board with the 16Gbit components configured as 8Gb, with no issues, using the ROW BANK COL address map. We tried measuring some signals with the 1600 MT/s configuration and saw nothing worrying, but due to the limited bandwidth of the probe (1GHz differential) we cannot say for sure. We also checked for Command/Address parity errors, but the ALERT_n signal remains high, so the DRAM components appear to capture C/A correctly.
Can you please confirm that the ZynqMP is expected to work with 16Gb single-rank DDR4 memories? Does anyone have a hint as to what the issue could be in this case and what we could try? I am not sure whether hardware or configuration is to blame at this point...
07-29-2020 05:22 AM
K4AAG085WB is dual rank, whereas K4A8G085WB is single rank. You cannot simply increase the address lines; you need to enable the Dual Rank checkbox in the DDR configuration in Vivado.