06-12-2017 01:16 PM
I have a bare-metal memory test, written in C, that copies the contents of the ZYNQ on-chip memory starting at 0x0000_0000 to another area and then modifies the on-chip memory as a confidence test that it works. The data is then restored to its original values (although that is likely meaningless, since my application lives in DDR memory and is launched there by the boot loader). This test has been working well for two years.
Now, for some reason, I'm getting intermittent Data Abort exceptions when reading address 0x0000_0000, even though neither the test code nor the application has changed. I have seen this on two different ZYNQ devices, so it is not unique to one board. The code is a simple for loop with one long pointer set to 0x0000_0000 and a second long pointer set to 0x000F_0000; each iteration copies the data at the first pointer to the location of the second pointer, and then both pointers are incremented with the "++" operator. The copy operation stops before reaching address 0x0003_0000. The test works fine about 95% of the time. I never saw this behavior before last week and, as I stated, we have been using this test application since 2015 without a problem. The Data Abort exception routine confirms the offending address as 0x0000_0000, so the very first copy fails.
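For reference, the copy loop described above looks roughly like the sketch below. It is not the actual test source: the names `copy_words`, `save_ocm`, `OCM_BASE`, `SAVE_BASE` and `OCM_END` are made up for illustration, the "long pointer" is assumed to be a 32-bit word pointer, the `volatile` qualifiers are an assumption, and the loop is assumed to run right up to 0x0003_0000 (the post only says it stops before that address).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; the addresses are the ones quoted in the post. */
#define OCM_BASE   0x00000000u  /* start of the on-chip memory under test */
#define SAVE_BASE  0x000F0000u  /* area the original contents are copied to */
#define OCM_END    0x00030000u  /* assumed end of the copy (exclusive) */

/* Copy `words` 32-bit words from src to dst, one word per iteration,
 * advancing both pointers with ++ as described in the post. */
void copy_words(volatile uint32_t *dst, const volatile uint32_t *src,
                size_t words)
{
    while (words--) {
        *dst++ = *src++;
    }
}

/* Target-only wrapper: saves the on-chip memory contents to SAVE_BASE.
 * Do not call this on a hosted system, it dereferences address 0. */
void save_ocm(void)
{
    copy_words((volatile uint32_t *)(uintptr_t)SAVE_BASE,
               (const volatile uint32_t *)(uintptr_t)OCM_BASE,
               (OCM_END - OCM_BASE) / sizeof(uint32_t));
}
```

Keeping the loop in `copy_words` and the fixed addresses in `save_ocm` means the loop logic itself can be exercised on ordinary buffers, separately from the hardware addresses.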
What is going on? The pointers are definitely initialized correctly. My guess is some kind of timing or alignment issue with the memory controller, but I have no idea what would cause it.
06-14-2017 05:17 AM
Ok, I added a short delay between reading the 0x0000_0000 location and writing to the 0x000F_0000 location. That reduced the failure rate to about 0.5%. I don't understand why this is necessary, and I will have to explain it to my management at some point, so I would really like some insight, particularly from Xilinx personnel!
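The change amounts to something like the following sketch. Again, this is illustrative only: `short_delay`, `copy_words_delayed` and the `spins` parameter are hypothetical names, and the busy-wait loop is just one assumed way of implementing "a short delay"; the actual delay mechanism and its length are not stated here.

```c
#include <stddef.h>
#include <stdint.h>

/* Crude busy-wait; the volatile counter keeps the compiler from
 * removing the empty loop. The number of spins is an arbitrary knob. */
static void short_delay(unsigned spins)
{
    for (volatile unsigned i = 0; i < spins; ++i) {
        /* intentionally empty */
    }
}

/* Same word-by-word copy, but with a delay between the read of each
 * source word and the write of that word to the destination. */
void copy_words_delayed(volatile uint32_t *dst,
                        const volatile uint32_t *src,
                        size_t words, unsigned spins)
{
    for (size_t i = 0; i < words; ++i) {
        uint32_t value = src[i];  /* read (address 0x0000_0000 on the first pass) */
        short_delay(spins);       /* the added delay */
        dst[i] = value;           /* write to the save area */
    }
}
```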