I just finished the initial layout for the XCZU3EG DDR4 interface using fly-by routing topology and two Micron MT40A512M16JY-083E IT:B DDR4 SDRAMs (8 Gb, 512M x16). My question is about the skew constraint between CK and the DQS signals. The example in UG583 is straightforward, but I was looking for a little more information on devices that have two byte lanes per chip and a single CK input. My initial flight times from the Zynq to the DDR are listed below:
DDR1 -> CK = 93ps, DQS0 = 201ps, DQS1 = 194ps
DDR2 -> CK = 182ps, DQS2 = 280ps, DQS3 = 255ps
Is it OK that my byte lanes arrive out of order (DQS1<DQS0<DQS3<DQS2), or do I need to lengthen my traces so that they arrive sequentially? Sequential arrival seems like the logical choice, but the DQS/DQ signals don't have any constraints across byte lanes, which leads me to believe the error handling can correct for that.
If they can be left out of sequential order, does the CK constraint for the "first device" apply to the fastest DQS or to DQS0 always?
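To make the numbers easier to compare, here is a minimal Python sketch that tabulates the DQS-minus-CK flight time per byte lane and the DQS arrival order from the values above. The dictionary layout and the function name `ck_to_dqs_skew` are just my own illustration; the allowed skew window is deliberately not included and has to be taken from UG583.

```python
# Flight times in ps, copied from the list above (Zynq to each DDR4 device).
# The structure and names here are illustrative assumptions, not from UG583.
flight_ps = {
    "DDR1": {"CK": 93, "DQS": {"DQS0": 201, "DQS1": 194}},
    "DDR2": {"CK": 182, "DQS": {"DQS2": 280, "DQS3": 255}},
}


def ck_to_dqs_skew(flights):
    """Return {dqs_name: dqs_delay - ck_delay} in ps.

    Each DQS is compared against the CK of the device it lands on.
    """
    skew = {}
    for device in flights.values():
        for name, dqs_delay in device["DQS"].items():
            skew[name] = dqs_delay - device["CK"]
    return skew


# Flatten all DQS delays so the lanes can be sorted by arrival time.
all_dqs = {
    name: delay
    for device in flight_ps.values()
    for name, delay in device["DQS"].items()
}

skew = ck_to_dqs_skew(flight_ps)
arrival_order = sorted(all_dqs, key=all_dqs.get)

for name in sorted(skew):
    print(f"{name}: DQS - CK = {skew[name]} ps")
print("Arrival order:", " < ".join(arrival_order))
```

With these values the per-lane DQS-minus-CK differences are 108 ps (DQS0), 101 ps (DQS1), 98 ps (DQS2) and 73 ps (DQS3), so the answer to the "first device" question decides whether the constraint is checked against 108 ps or 101 ps on DDR1.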
Does anyone have at least anecdotal evidence this will work? I do not have the board space to lengthen my byte lanes enough to make them arrive sequentially.
My only option to clean up the layout would be bit/byte swapping, and I'm trying to avoid that at this time since I am new to DDR4 and I'm not sure how to account for those changes in software. (The ability to swap byte lanes is another hint that sequential arrival is not needed, but I'm still guessing without an actual statement in the guidelines.)