10-10-2019 03:45 PM
How do I determine how much clock skew I can have between the two 100MHz clocks going to FPGA_A and FPGA_B in the picture shown?
I need deterministic latency on this link, i.e. the number of clock periods to transfer data from FPGA_A to FPGA_B should be constant. I am particularly worried that just as SERDES delays will be affected by PVT, so will the clock delays both inside and outside the chip.
I have two Artix 7 FPGAs with a unidirectional SERDES link as shown in the attached picture. FPGA_A has SERDES TX and FPGA_B has SERDES RX. These use LVDS I/Os. I use IDELAYCTRL blocks, so I think that the SERDES lanes will be compensated for PVT variations. It is also important to note that the SERDES data and clock are provided to FPGA_B over a 10ft cable.
10-12-2019 01:43 AM - edited 10-12-2019 01:48 AM
The serial interface is not capable of deterministic latency. I read some papers about this, and the authors mentioned that the PPM difference must be less than 100.
But the most important thing is that there is bit slip in the SERDES, and that causes inconsistent latency.
If you can align the phase and compensate for the bit slip, maybe you will achieve your goal.
10-12-2019 11:01 AM
Have a look at standard SerDes blocks, such as Aurora.
All of them work asynchronously.
The clock at the receiver has no reference to the phase of the transmitter.
They can also allow for the inherent small difference between the Tx and Rx clocks.
The protocol adds extra information to the sent data at the transmitter, which the receiver can recognise and remove. Extra data is also used for things like marking the start / stop of frames. The protocol is also used for things like ensuring there is sufficient information for the receive clock to be recovered and ensuring there is no DC bias in the data stream.
The Tx and Rx FIFOs ensure that the data rates match on average, so no data is lost.
I have a recollection of synchronous protocols; I think the original ATM link was one. But that's 30 years ago.
I don't know if you can send data without using a protocol; I can't see how you would ensure the above.
If you want to ensure latency between multiple links is predictable and constant,
then try using the JESD204 link protocol.
It's used in ADC / DAC links, where the data HAS to arrive at the far end at a constant rate.
10-12-2019 04:36 PM
Thank you for your answer.
I am not seeing why in a synchronous system bitslip will cause uncertainty of 1 clock period. Can you please point to some references that can help me understand?
10-13-2019 12:21 AM - edited 11-03-2019 12:55 AM
I had some issues with this before, when I needed to reach deterministic latency, and I am sure about the bit slip function.
I think the RX clock and TX clock can have a phase difference of anywhere from 0 to 360 degrees, depending on the bit slip amount, and that causes 1 clock of uncertainty in latency.
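To make the claim above concrete, here is a toy model (not a description of the actual ISERDES hardware) of how the phase of the receiver's word clock relative to the transmitter's word boundary changes the observed latency. The function name, the `rx_phase` parameter and the zero pipeline delay are assumptions made purely for illustration.

```python
def latency_bits(width, rx_phase, pipeline_bits=0):
    """Bit periods from the first TX bit until the RX word clock presents the word.

    rx_phase is the position (0..width-1) of the RX word-clock edge
    relative to the TX word boundary; it is a hypothetical parameter.
    """
    last_bit_arrival = pipeline_bits + width - 1
    # Wait for the first RX word-clock edge at or after the last bit arrives.
    wait = (rx_phase - last_bit_arrival) % width
    return last_bit_arrival + wait

WIDTH = 4  # 4:1 serialization, chosen only as an example
latencies = [latency_bits(WIDTH, phase) for phase in range(WIDTH)]
spread = max(latencies) - min(latencies)
print("latency per phase (bit periods):", latencies)
print("spread (bit periods):", spread)
```

In this model the spread is almost one full word clock, so the latency is only repeatable if the phase relationship comes up the same way after every reset.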
Hope this is helpful for you.
10-13-2019 02:49 AM
10-13-2019 09:02 AM
First, let's make sure we are all on the same page. The original post clearly mentioned PLL, LVDS, IDELAYCTRL, so I am pretty sure these are not using the gigabit transceivers (GTX/GTH), but are using the ISERDES/OSERDES. Some of the responses talking about Aurora and JESD204 are for gigabit transceivers...
The issue here is not about clock tolerance (really clock skew); it is about the entire interface.
Regardless of whether you are using the ISERDES/OSERDES, the IDDR/ODDR or conventional flip-flops (in the IOB or elsewhere) what you are describing here is a system synchronous interface - both devices get a common clock and try and communicate synchronously with that common clock. This system must be analyzed as a complete system.
You don't say if this is SDR or DDR, so I am going to assume SDR (thus the interface is running at 100Mbps/pin).
In essence, you have a timing path that launches at the rising edge of the 100MHz clock in the transmitting FPGA and is captured at the next rising edge of the 100MHz clock in the receiving FPGA (I am assuming that the PLLs in both FPGAs are running at 1:1, so the interface is really 100Mbps/pin, not some multiple).
The "common point" for this system is the CLK_GEN. From here, we need to add up all the delays
For this entire path, the following equation must be true
SCD(max) + DPD(max) <= 10ns + DCD(min) - jitter(max)
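As a rough sketch, the inequality above can be turned into a slack calculation. This assumes SCD, DPD and DCD stand for the source clock delay, data path delay and destination clock delay; the delay values below are made-up placeholders, not measurements of this system.

```python
def setup_slack_ns(scd_max, dpd_max, dcd_min, jitter_max, period=10.0):
    """Slack of: SCD(max) + DPD(max) <= period + DCD(min) - jitter(max).

    A negative result means the inequality is violated.
    """
    return (period + dcd_min - jitter_max) - (scd_max + dpd_max)

# Hypothetical numbers, in nanoseconds, for illustration only.
slack = setup_slack_ns(scd_max=3.0, dpd_max=26.0, dcd_min=22.0, jitter_max=0.3)
print(f"setup slack = {slack:.2f} ns")
```

Each argument would have to be built up from the worst-case figures for every element on that path (board traces, buffers, PLL, cable, and so on).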
Now, let's get to the 10ft cable - are you even sure the I/O in question is capable of driving a 10ft cable at 100MHz? It's not clear that the answer to this is yes. Even if you can, this is going to introduce a large delay on both of the paths through the cable; a signal in a cable usually propagates at around 1/2 the speed of light, which is about 0.5ft/ns (roughly 2ns/ft) - therefore your cable has a delay of about 20ns - this is more than twice your bit period.
Now, you have a 10ft cable on both the DPD and the DCD, so these will cancel somewhat, but even if these cancel to within 10%, this adds 2ns of uncertainty to this path. The uncertainties on the rest of the long list of items above will almost certainly add more than the remaining 8ns of uncertainty allowed in this system, which means that this system will almost certainly not meet timing. And this is at 100Mbps - at anything higher it is almost certainly out of the question.
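The cable arithmetic in the last two paragraphs can be checked with a few lines. The velocity factor of 0.5 and the 10% mismatch are the rough figures used above, not properties of any particular cable; a real cable's datasheet value should be substituted.

```python
C_FT_PER_NS = 0.9836  # speed of light in vacuum, in feet per nanosecond

def cable_delay_ns(length_ft, velocity_factor=0.5):
    """One-way propagation delay of a cable of the given length."""
    return length_ft / (C_FT_PER_NS * velocity_factor)

BIT_PERIOD_NS = 10.0                 # 100 Mbps per pin
cable_ns = cable_delay_ns(10.0)      # 10 ft cable
mismatch_ns = 0.10 * cable_ns        # assumed 10% clock/data mismatch
remaining_ns = BIT_PERIOD_NS - mismatch_ns

print(f"cable delay      ~ {cable_ns:.1f} ns")
print(f"10% mismatch     ~ {mismatch_ns:.1f} ns")
print(f"remaining budget ~ {remaining_ns:.1f} ns")
```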
On top of this, you have the question of framing; the transmitter is taking parallel data and converting it to serial, the receiver is doing the opposite. Without any explicit mechanism, there is no way for the receiver to know which bit in the serial stream is the most significant bit of the original parallel word. This has nothing to do with clock skew or latencies, but with how the FPGAs came out of reset; each of them somewhere is maintaining a counter of which bit it is sending/receiving next; these counters count from 0 to 9, but without some mechanism of synchronizing them, they will end up reconstructing the parallel word at the receiver incorrectly.
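The framing problem can be illustrated with a small software model. This is only a sketch: the 10-bit width follows the 0 to 9 counters mentioned above, and the word values are arbitrary test data.

```python
def serialize(words, width):
    """Flatten parallel words into a list of bits, MSB first."""
    return [(w >> (width - 1 - i)) & 1 for w in words for i in range(width)]

def deserialize(bits, width, offset):
    """Regroup bits into words, starting at an arbitrary bit offset."""
    usable = bits[offset:]
    count = len(usable) // width
    words = []
    for k in range(count):
        value = 0
        for bit in usable[k * width:(k + 1) * width]:
            value = (value << 1) | bit
        words.append(value)
    return words

WIDTH = 10
tx_words = [0x2A5, 0x13C, 0x0F0, 0x3FF, 0x001]  # arbitrary test data
bits = serialize(tx_words, WIDTH)

# Only the offset that matches the transmitter's bit counter recovers the data.
aligned_offsets = []
for offset in range(WIDTH):
    rx_words = deserialize(bits, WIDTH, offset)
    if rx_words == tx_words[:len(rx_words)]:
        aligned_offsets.append(offset)
print("offsets that recover the original words:", aligned_offsets)
```

Every other offset produces words built from the tail of one transmitted word and the head of the next, which is why the receiver needs a known pattern or a synchronizing signal to find the boundary.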
In short, this system is likely not viable. It is specifically for systems like this (sending data over long cables) that we have things like the Gigabit transceivers running protocols like Aurora (or other encoded formats) - to handle the long transmission times over a cable (using clock/data recovery). However, these have no latency guarantees nor predictable latencies; you need to have an even higher level of protocol (like JESD204) to control latency, and it does this with additional parallel signals and embedded sequences...
(And just a quick note, the IDELAYCTRL is only required to calibrate the IDELAY/ODELAY, it has no effect on anything else, including the ISERDES/OSERDES).
10-13-2019 10:56 AM
10-13-2019 11:59 AM
Thank you for the response. And thank you for narrowing it down to my application i.e. LVDS IO SERDES.
We have 4:1 serialization factor. SDR. So the serial clock rate is 400MHz. The data stream is encoded. We use IDELAYCTRL + DELAY TAPS to bit align. Subsequently, we do word alignment with bitslip. I would not like to increase the cost of the solution by using JESD204 if I do not have to.
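For readers following along, the two alignment steps described above can be sketched in software. This is a simplified model, not the actual firmware: bitslip is modeled as a one-bit left rotation of a repeating training word, and the tap scan results and the training word `0b1101` are hypothetical values.

```python
def center_of_widest_window(tap_ok):
    """Return the middle tap of the longest run of error-free delay taps."""
    best_start, best_len, start = 0, 0, None
    # A trailing False closes a run that extends to the last tap.
    for i, ok in enumerate(list(tap_ok) + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    if best_len == 0:
        raise ValueError("no error-free tap found")
    return best_start + best_len // 2

def bitslips_needed(rx_word, training_word, width):
    """Count one-bit left rotations until rx_word matches training_word."""
    mask = (1 << width) - 1
    word = rx_word
    for count in range(width):
        if word == training_word:
            return count
        word = ((word << 1) | (word >> (width - 1))) & mask
    return None  # pattern never matched; the link is not bit-aligned

# Hypothetical scan of 32 delay taps: True means no bit errors at that tap.
tap_ok = [False] * 5 + [True] * 13 + [False] * 5 + [True] * 9
best_tap = center_of_widest_window(tap_ok)
slips = bitslips_needed(0b0111, 0b1101, width=4)
print("chosen tap:", best_tap, "bitslips:", slips)
```

The bitslip count is worth logging on every power-up: if it changes from one start-up to the next, the word boundary (and potentially the latency) is changing with it.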
We have cable drivers and receivers on the board that help with the 10ft cable transmission. The prototype system is working in the lab. You are correct that most of the skew between the clock and data will cancel out in the cable, as the same cable is used to carry those signals. Further, my assumption is that the clocks after the PLL will be compensated, so the contribution to skew is only from the IBUFG to the PLL.
Take a look at this old paper. It talks about XCVRs but the premise I believe can be applicable to SERDES IO blocks as well. It seems to indicate that deterministic latency can be achieved in a synchronous system.
10-13-2019 12:03 PM
Thank you for your continued interest. JESD204 links increase the cost in the system for me. And as you pointed out, my application uses SERDES IO blocks and not XCVRs.
10-14-2019 12:05 AM
In the paper "High-Speed, Fixed-Latency Serial Links With FPGAs for Synchronous Transfers",
the authors make some changes to the IP core and achieve deterministic latency. They worked on the bit slip issue that I mentioned in my last comment.
I think the solution here is similar: you must compensate for the bit slip.
10-14-2019 12:22 AM
10-14-2019 01:18 PM
I am still not clear on how much clock skew we can tolerate if the IDELAYCTRL and bitslip are enabled. Maybe someone who knows how the bitslip and IDELAY blocks work together can help.
The confusion is this: if the clocks are synchronous, then ideally the system should provide deterministic latency. But we are not clear how much margin we have against PVT variation.
10-15-2019 01:40 AM
If I understand the diagram correctly, you are sending one clock to FPGA A and FPGA B; when you transmit from FPGA A to FPGA B, you are not sending a clock with the data.
This would then be considered an asynchronous data transfer by the FPGA, as the clock is not coming with the data.
The PVT tolerances will impact the data capture at the ISERDES. You need to meet static timing analysis for your interface to have reliable capture of data.
The latency numbers are documented in the SelectIO UG and are not impacted by PVT : https://www.xilinx.com/support/documentation/user_guides/ug471_7Series_SelectIO.pdf