04-23-2019 02:54 AM
We are working on the second revision of a design involving Artix-7 FPGAs. After a run time of several days, we're seeing the FPGAs fail with a short from some of the IO power rails to GND. Usually many of the IO pins themselves are also caught up in the short, and on some occasions some of the other power rails (both IO and internal rails) are as well.
The short is definitely occurring inside the FPGA itself; this has been proven by lifting the devices off the board and probing the pins directly.
On the first iteration of our design we didn't see this issue at all, and nothing has changed with regard to the power supplies between the 2 designs. About all that has changed is that some signals have moved from one IO pin to another (and I have checked, they are definitely still on the correct bank for their voltage). I have probed all the relevant power rails and they all appear to be OK, though I haven't yet observed them at the actual time of failure.
2 different models/packages are affected in an identical manner:
Do you have any idea why this might be occurring?
04-24-2019 02:06 AM
Have you followed the power sequence requirement on p8 of DS181?
Note that the requirement for 3.3V VCCO is a must, not a recommendation.
04-24-2019 10:32 AM
Hi iguo, thanks for your reply!
VCCINT and VCCBRAM rise together (same rail).
Then VCCAUX and the 1V8 VCCO banks rise together (same rail).
Then the 3V3 VCCO banks rise.
VCCO-VCCAUX never exceeds +1.5V (or -1.8V).
Also, this failure has only occurred when the unit has been running for an extended period of time, by which I mean "several days", not during a power cycle.
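A sequencing claim like the one above can be backed up by post-processing a scope capture of the ramps. The following is only a minimal sketch: the function name, the sample format, and the 2.625 V default (the VCCO minus VCCAUX figure as I recall it from DS181 for 3.3V HR banks) are assumptions and should be checked against the current data sheet.

```python
def vcco_vccaux_margin(t_s, vcco_v, vccaux_v, limit_v=2.625):
    """Summarise VCCO - VCCAUX over a sampled power-up or power-down ramp.

    t_s, vcco_v, vccaux_v are equal-length lists of samples (seconds, volts).
    limit_v is an ASSUMED limit; confirm it in DS181 before relying on it.
    """
    if not t_s or not (len(t_s) == len(vcco_v) == len(vccaux_v)):
        raise ValueError("need three non-empty sample lists of equal length")

    diffs = [o - a for o, a in zip(vcco_v, vccaux_v)]

    # Accumulate the time of every interval whose two end points both exceed the limit.
    over_s = 0.0
    for i in range(1, len(t_s)):
        if diffs[i - 1] > limit_v and diffs[i] > limit_v:
            over_s += t_s[i] - t_s[i - 1]

    return {
        "max_diff_v": max(diffs),
        "min_diff_v": min(diffs),
        "time_over_limit_s": over_s,
    }


# Idealised version of the sequence described above: VCCAUX (1.8 V) first, then 3.3 V VCCO.
good = vcco_vccaux_margin(
    [0.0, 1e-3, 2e-3, 3e-3],
    [0.0, 0.0, 3.3, 3.3],
    [0.0, 1.8, 1.8, 1.8],
)
print(good)
```

With these idealised samples the difference stays between -1.8 V and +1.5 V, which matches the figures quoted above. A real capture would also need to cover power-down.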
04-24-2019 12:17 PM
Be sure to plan for the situation where other boards power up before the FPGA board. FPGA IO pins are often connected to the power rail through a diode. So, if the FPGA and its rail are powered down, external devices can send damaging current into the IO pin. Check the FPGA data sheet for the current limit specifications of the IO pins.
04-24-2019 12:42 PM
Sounds like ESD damage?
Is the bench grounded? All handling done using ESD wrist or ankle straps? Check the bench and test equipment grounding/power?
Who has access to the test area? Do you trust them?
What is the die temperature? What are the voltages on the die (using XADC)?
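For anyone logging these values over a multi-day run, raw XADC codes can be converted offline. This is a sketch using the transfer functions as I understand them from UG480 (12-bit code, temperature scale factor 503.975, 3 V full scale for the on-chip supply sensors). The helper names are made up for illustration, and the constants should be confirmed in UG480.

```python
def code_from_register(reg16):
    """XADC status registers hold the 12-bit result MSB-justified in 16 bits."""
    return (reg16 >> 4) & 0xFFF


def xadc_temp_c(code12):
    """Die temperature in degrees C from a 12-bit XADC code (per UG480, to be verified)."""
    return code12 * 503.975 / 4096 - 273.15


def xadc_supply_v(code12):
    """On-chip supply sensor voltage (VCCINT, VCCAUX, VCCBRAM) from a 12-bit code."""
    return code12 / 4096 * 3.0


# Example codes: 0x977 is roughly room temperature, 0x555 is roughly 1.0 V.
print(round(xadc_temp_c(0x977), 2), round(xadc_supply_v(0x555), 3))
```

Note that the on-chip supply sensors do not measure the VCCO rails, so the bank supplies still need to be probed externally.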
04-24-2019 11:41 PM
The FPGAs are the very first things that are powered, so there's no risk of something else driving them before they are powered.
I will triple check, but I'm fairly sure all current limit requirements are met. There are some LEDs that get close, but they are on a different IO bank to the one that is failing.
04-24-2019 11:58 PM
Everyone in the lab has to wear ankle straps, there is a test/sign in process that everyone has to follow (and the person in charge of the lab gets very vocal if someone tries to enter without following it). All benches etc. are grounded and tested yearly.
There's only a handful of us who ever enter the lab, and we're all fairly well informed of the risks of ESD, so to answer the question of do I trust them in this regard: Yes.
Obviously it's impossible to completely rule out the possibility of ESD damage, but we do take a lot of precautions to prevent it so it seems unlikely.
I'll get back to you shortly regarding the die temperature and the voltages, I've checked them all before but not written them down.
04-25-2019 12:10 AM
A bit of additional information: I lifted and probed a few more of the FPGAs yesterday.
All of the 50Ts are failing on bank 15.
All of the 15Ts are failing on bank 14.
As I mentioned in my original post, on a couple of the FPGAs some other banks also get caught up in it, but those banks have always failed, and on the vast majority of devices it's just those banks that have failed. All connecting circuitry on those banks has been hextuple-checked, and we can't see anything that might be an issue.
Is there anything about these banks that might make them more susceptible to any form of damage? Or any other reason why they specifically might fail?
04-25-2019 05:57 AM
Any long cables?
A long cable can cause damaging reflections which exceed the abs max voltage limits.
Has a signal integrity engineering study been done on all IO?
Unplugging and plugging in cables can also destroy unprotected IO pins (banks).
Finally, are these banks connected to LEDs/pushbuttons which are touched by non-ESD protected operators? Always add extra protection on any interface that people touch.
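To illustrate the reflection point, here is a first-order estimate of the initial voltage step at an unterminated receiver. It is a textbook lossless-line sketch, not a substitute for a proper signal integrity simulation; the function names and the example impedances (10 ohm driver, 50 ohm trace) are assumptions chosen only for illustration.

```python
import math


def reflection_coefficient(z_load, z0):
    """Voltage reflection coefficient at the far end of a line of impedance z0."""
    if math.isinf(z_load):
        return 1.0  # open circuit: the full incident wave is reflected
    return (z_load - z0) / (z_load + z0)


def first_step_at_receiver(vdd, r_source, z0, z_load):
    """Voltage at the load when the first incident edge arrives.

    The driver launches vdd * z0 / (r_source + z0) into the line, and the
    load sees the incident wave plus its reflection.
    """
    incident = vdd * z0 / (r_source + z0)
    return incident * (1.0 + reflection_coefficient(z_load, z0))


# Hypothetical case: 3.3 V driver with 10 ohm output impedance, 50 ohm trace, open-ended receiver.
print(first_step_at_receiver(3.3, 10.0, 50.0, math.inf))
```

In this hypothetical case the receiver initially sees about 5.5 V from a 3.3 V driver, which is why the measured overshoot should be compared against the abs max input voltage in the data sheet.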
04-25-2019 08:47 AM
No long cables. No cables at all on the IO for that matter, all traces terminate on the PCB within ~10cm. The power supply traces are up to ~50cm in length, with an extra ~30cm of cable to the main supply, however the trace is interrupted by a ferrite bead ~10cm away from the FPGA.
There is no part of the board exposed (directly or indirectly) to a non-ESD protected operator. The board is fully contained inside an earthed metal enclosure.
04-25-2019 09:04 AM
So if signal integrity is good (no voltages at the IO exceed abs max numbers), then it has no excuse to be failing. I suggest you contact your local Xilinx authorized distributor to get a return merchandise authorization (RMA) to get a failure analysis done.
Of course, if you did not buy these from an authorized distributor, the other explanation is that these are recovered scrap, and they were damaged before you ever got them (one reason why you NEVER should use anything other than authorized Xilinx distributor supplied devices).
Which brings up another possibility: the devices were damaged in your PCB assembly process (either by ESD, or because the solder was too hot).
04-25-2019 04:24 PM
Since the damage seems to originate from banks 14 and 15, check the voltage at CFGBVS again and reread what UG470 says about CFGBVS and banks 14 and 15. Also ensure that your JTAG programmer module is obeying these voltage rules for configuration. The JTAG programmer module should be using the voltage found on the VREF pin of the JTAG connector to decide what programming voltage is used on the other JTAG pins.
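The CFGBVS rule can be written down as a quick consistency check against the schematic. This sketch encodes my reading of UG470 (CFGBVS tied to VCCO_0 selects 2.5V/3.3V operation for banks 0, 14, and 15 during configuration; CFGBVS tied to GND selects 1.5V/1.8V). The function name and data layout are hypothetical, and the rule itself should be confirmed in UG470.

```python
def cfgbvs_mismatches(cfgbvs_high, bank_vcco):
    """Return the configuration banks whose VCCO disagrees with the CFGBVS setting.

    cfgbvs_high: True if CFGBVS is tied to VCCO_0, False if tied to GND.
    bank_vcco:   mapping of bank number to nominal VCCO in volts.
    Assumed rule: high -> 2.5 V or 3.3 V, low -> 1.8 V or lower.
    """
    mismatched = []
    for bank in (0, 14, 15):
        vcco = bank_vcco.get(bank)
        if vcco is None:
            continue  # bank not listed, nothing to check
        is_high_voltage = vcco > 1.8 + 1e-9
        if is_high_voltage != cfgbvs_high:
            mismatched.append(bank)
    return mismatched


# Hypothetical board: CFGBVS grounded while banks 0 and 14 run at 3.3 V.
print(cfgbvs_mismatches(False, {0: 3.3, 14: 3.3, 15: 1.8}))
```

As I understand it, the combination to worry about most is CFGBVS tied low while one of these banks is supplied or driven above 1.8V.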