08-23-2019 10:52 AM
Hello,
I'm trying to verify correct ECC operation on a custom ZU2EG MPSoC board with PS-attached DDR3L memory (40 bits wide, 3 chips). The FSBL initializes ECC (or so it claims), and the EDAC driver loads without errors and creates all sysfs entries as expected. However, I'm not able to trigger an error report or a corrupted readback using the error injection method described here:
https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18841897/Zynq+UltraScale+MPSoC+-+64-bit+DDR+access+with+ECC
The same sequence of actions works as expected on other MPSoC boards. The main difference between the boards that seems relevant here is (LP)DDR4 vs. DDR3L. The non-working board is the only one with DDR3L.
The MPSoC TRM (UG1085) mentions that ECC is not supported for LPDDR3 (which is not what this board uses). Is there any known limitation for DDR3L or for a 40-bit configuration with regard to ECC error injection?
Thanks, David
08-26-2019 02:46 AM - edited 08-26-2019 02:48 AM
Hi,
The link you referred to is for 64-bit DDR4 with ECC.
Your configuration is 2x16 with ECC [40 bit wide] DDR3L.
Can you please tell me what the differences are between the LPDDR4 board and the DDR3L board you are using?
If possible, can you please share screenshots of the PS configuration for both LPDDR4 and DDR3L?
Thank you
Kind Regards,
Kasthuri
08-26-2019 08:23 AM
Hello Kasthuri,
Thanks for your response.
I tried several other DDR address ranges listed in /proc/iomem and found one that works on my system. Writes and reads to the address used in the wiki article work correctly, so I had assumed DDR RAM was mapped there, but that address does not appear in /proc/iomem on my system: I only have 1 GB as opposed to 4 GB. There must be some hardware address aliasing that lets plain data accesses succeed, but error injection apparently does not work through the alias.
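A quick way to check up front whether an address falls inside memory the kernel actually knows about is to look it up in the "System RAM" ranges of /proc/iomem. The following is only a sketch: the function name in_system_ram is made up for illustration, and it assumes it is run as root, since unprivileged reads of /proc/iomem show all addresses as zero.

```shell
# in_system_ram ADDR [IOMEM_FILE]   (hypothetical helper, not part of any tool)
# Returns 0 if ADDR lies inside a "System RAM" range, 1 otherwise.
in_system_ram() {
    addr=$(($1))                      # accepts decimal or 0x-prefixed hex
    iomem=${2:-/proc/iomem}
    while IFS= read -r line; do
        case $line in
            *"System RAM"*) ;;        # only RAM ranges are of interest
            *) continue ;;
        esac
        range=${line%% :*}            # e.g. "00000000-3fefffff"
        range=${range##* }            # drop any leading indentation
        start=${range%-*}
        end=${range#*-}
        if [ "$addr" -ge $((0x$start)) ] && [ "$addr" -le $((0x$end)) ]; then
            return 0
        fi
    done < "$iomem"
    return 1
}
```

For example, `in_system_ram 0x3feffff0 && echo "listed as System RAM"` checks the address used below before injecting an error at it.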
The following triggers the expected result:
$ echo "CE" > /sys/devices/system/edac/mc/mc0/inject_data_poison
$ echo 0x3feffff0 > /sys/devices/system/edac/mc/mc0/inject_data_error
$ devmem 0x3feffff0 32 0x1234a596
$ devmem 0x3feffff0
$ dmesg
…
[ 715.119124] EDAC MC0: 1 CE DDR ECC error type :CE Row 32735 Bank 7 Col 0 BankGroup Number 0 Block Number 1016 on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x0 offset:0x0 grain:1 syndrome:0x0)
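The dmesg line can be cross-checked against the error counters that the EDAC core keeps per memory controller. This is a minimal sketch, assuming the standard ce_count and ue_count sysfs attributes are present under mc0 on this kernel; the function name edac_counts is made up for illustration.

```shell
# edac_counts [MC_DIR]   (hypothetical helper)
# Prints the correctable (CE) and uncorrectable (UE) error counters
# of one EDAC memory controller directory.
edac_counts() {
    mc_dir=${1:-/sys/devices/system/edac/mc/mc0}
    ce=$(cat "$mc_dir/ce_count") || return 1
    ue=$(cat "$mc_dir/ue_count") || return 1
    printf 'CE=%s UE=%s\n' "$ce" "$ue"
}
```

Calling `edac_counts` before and after the devmem read should show the CE counter increasing if the injected error was detected.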
Thanks again for looking into it, David