04-18-2012 11:53 AM
After 2 days of debugging my custom MIG design (DDR3, on an ML605), which at seemingly random times didn't complete the write leveling initialization step, I came across this answer document:
The answer record describes a potential problem with the MIG reference design during initialization:
These failures are a result of a voltage spike occurring on the VCC15V rail during reconfiguration, which causes an Over-Voltage (OV) Fault to occur on the TI Power Controller. This temporarily shuts off power to the DDR3 1.5V power rail during reconfiguration, which then causes the incorrect DDR3 initialization sequence, and then causes various stages of calibration to fail.
As far as I understand, this means that the voltage rails for the DDR module are not stable right after reconfiguration. To work around this issue, I added a large (~2 second) delay to the IODELAY_CTRL reset signal, which postpones initialization until well after reconfiguration completes. I have a few questions about this issue and the workaround:
1) Does anyone have experience with this issue? I searched the forums but I couldn't find any posts related to it.
2) The issue seems to be temperature related: the higher the die temperature readout, the higher the chance of an initialization failure. In my specific case, the workaround seems to be successful up to a die temperature of 42 degrees. Is the two-second delay insufficient to avoid the effect of the voltage spike?
3) The MPMC does not seem to have this problem on the ML605, so there is a way around the issue that does not require raising the OV fault threshold in the TI Power Controller as proposed in the answer record. Any clues on how it was solved in this block?
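For reference, the sizing of the delay workaround described above can be sketched in Python. This is only a behavioural model, not the HDL in my design: it assumes the delay counter runs from a 200 MHz reference clock, and the names delay_counter_params and ResetStretcher are hypothetical. It shows how many cycles a ~2 second delay needs, how wide the counter has to be, and how the stretched reset behaves.

```python
import math


def delay_counter_params(delay_s, clk_hz):
    """Return (cycles, counter_width_bits) for a reset delay.

    clk_hz is an assumption: 200 MHz is used in the example below.
    """
    cycles = math.ceil(delay_s * clk_hz)
    # The counter must be able to hold the value `cycles` itself.
    width = max(1, cycles.bit_length())
    return cycles, width


class ResetStretcher:
    """Cycle-level model of a reset that stays asserted for `cycles`
    clock ticks after the incoming reset is released."""

    def __init__(self, cycles):
        self.cycles = cycles
        self.count = 0

    def tick(self, rst_in):
        """Advance one clock cycle and return the stretched reset."""
        if rst_in:
            self.count = 0            # restart the delay on every reset
        elif self.count < self.cycles:
            self.count += 1           # count up, then saturate
        return self.count < self.cycles


cycles, width = delay_counter_params(2.0, 200e6)
print(cycles, width)
```

In hardware this would be a saturating counter whose "not yet saturated" flag drives the IODELAY_CTRL reset. Note that such a delay only postpones initialization; it does nothing if the power controller has already switched the rail off.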
04-18-2012 08:06 PM
The AR that you cite, 39767, has a complete and permanent solution to this problem for the board: adjust the overvoltage fault threshold as described. That's the best way to solve this problem.
04-19-2012 12:24 AM
Thanks! I am aware of the offered solution, and we are currently waiting for the USB-to-GPIO tool to increase the threshold voltage.
However, I was hoping to learn why delaying the PHY initialization isn't enough to mitigate the effects of the spike protection.
For example, I am now looking at an FPGA programmed with the reference MIG design (downloaded from this page: http://www.xilinx.com/products/boards/ml605/reference_designs.htm ). The fourth GPIO LED is not on, i.e. phy_init_done is not raised. Intuitively, pressing the cpu_reset button (which is connected to sys_rst according to the UCF file) should reset the entire MIG and DDR and restart the initialization. Since the FPGA is not being reconfigured this time round, there should be no voltage spikes and initialization should succeed. However, this is not the case: no matter how many times I attempt to reset, phy_init_done is never raised.
Is there something that could explain this behaviour?
04-20-2012 08:24 AM
A small follow-up: using our new USB-to-GPIO toy, we got some new insight into the exact nature of the problem:
The graph in the top-left corner shows the 1.5V rail voltage. It is nice and constant until the over-voltage protection kicks in. At that point the power on this rail is completely switched off, depriving the DDR module of any power. Switching the board off and on is the only way we could escape from this state. This explains why resetting the MIG reference design has no effect.
We have increased the over-voltage protection threshold, and now it seems to work fine.
There is just one mystery left: why doesn't this happen more often, for example, when loading a MIG-based MPMC onto the device? The chances of seeing the problem aren't too high, but once it happens, the DDR becomes completely useless.