helmutforren
Scholar
1,132 Views
Registered: ‎06-23-2014

XAPP585 ISERDES Shortcoming


I just spent an emergency debugging trip out of town, because my ISERDES was working for 2 but not the 3rd Camera Link camera.  The final result was to find a shortcoming in the XAPP585 example code.

Specifically, the DDR version of this example code goes through a "training" process (my term) to find the center of the data eye, where there are 7 data bits per incoming Camera Link clock.  The starting point of this training is bit_rate_value, which is expressed in a funky hex-nibble representation of decimal (give me a break!).  The endpoint is the subsequent value of c_delay_in, which is supposed to position sampling in the middle of the data eye.
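To illustrate the hex-nibble-as-decimal encoding of bit_rate_value, here is a minimal Python sketch.  The helper name decode_bcd_nibbles, the example value 0x0595, and the reading of the number as Mb/s are all assumptions made for illustration; none of them come from the XAPP585 source.

```python
def decode_bcd_nibbles(value: int) -> int:
    """Read each hex nibble of `value` as one decimal digit."""
    result = 0
    place = 1
    while value:
        digit = value & 0xF
        if digit > 9:
            raise ValueError(f"nibble {digit:#x} is not a decimal digit")
        result += digit * place
        place *= 10
        value >>= 4
    return result


# Hypothetical setting: 85 MHz clock times 7 bits per clock, written as hex digits.
bit_rate_value = 0x0595
print(decode_bcd_nibbles(bit_rate_value))  # 595
```

In other words, the hex digits are simply read off as if they were decimal digits.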

Well, we found that c_delay_in was starting with 23 but finishing at somewhat random values, like 11 (good video), 9 or 7 (bad video), and 4 (no video).  We discovered we could predict the final video outcome by looking at the value of c_delay_in while loading up top-level application software on a PCIe host.  Every time we power cycled the CL camera, c_delay_in came to a different final value. 

Meanwhile, we had other frequency counting logic in order to use a DRP so that our PLL/MMCM could work over the full Camera Link range of 20-85MHz.  (No, the Kintex-7 can't cover this full range with a single PLL/MMCM setting, at least not per the datasheet under all conditions.)  This frequency counting logic looked for the CL clock to be stable, such as 75MHz +/- 1MHz maximum change.  After this got stable, we checked to see if we needed to use DRP to reconfigure the PLL/MMCM.

Upon inspection, we saw that the XAPP585 "training" in serdes_1_to_7_mmcm_idelay_ddr.v used a state machine that was reset whenever the PLL/MMCM was *not* locked.  As soon as the PLL/MMCM achieved lock, the "training" would begin.  In our case, this third camera was a bit wobbly at first (whatever that means).  Since it appears the training is only on the known, fixed duty clock, it must have been the clock that was a bit wobbly.  In fact, our frequency counter didn't claim "stable" until a few seconds AFTER the PLL/MMCM claimed lock.  Our scope wasn't fast enough to really see this, and I doubt any might be good enough, but we deduced that the 3rd camera's output frequency wasn't stable enough, in either frequency or duty cycle, for the first few seconds.  Because the state machine trained as soon as the PLL/MMCM was locked, it was training on this wobbly frequency.  This was the reason for the random final c_delay_in.  

We had already modified serdes_1_to_7_mmcm_idelay_ddr.v to support DRP, guided by our frequency counter.  Now, we took our frequency counter's "stable" output and added it (negated) to the reset condition for the serdes_1_to_7_mmcm_idelay_ddr.v training state machine.  This meant that training would not happen until AFTER our frequency counter felt things were stable.  And as mentioned before, this happened several seconds AFTER the PLL/MMCM locked.
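The modified reset condition can be sketched as a small behavioral model in Python.  This is not the HDL itself; the function name training_reset and the one-sample-per-second timeline are assumptions chosen only to show lock arriving before stability.

```python
def training_reset(mmcm_locked: bool, freq_stable: bool) -> bool:
    """Hold the training state machine in reset until both conditions hold."""
    return (not mmcm_locked) or (not freq_stable)


# Hypothetical timeline, one sample per second: lock comes early, stability later.
timeline = [
    (False, False),  # PLL/MMCM not yet locked
    (True, False),   # locked, but the frequency counter is not yet satisfied
    (True, False),
    (True, True),    # frequency counter finally claims "stable"
]
for second, (locked, stable) in enumerate(timeline):
    state = "reset" if training_reset(locked, stable) else "train"
    print(second, state)
```

With the original condition (locked only), training would have started at the second sample; with the added term it waits for the last one.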

This totally FIXED our problems.  Now, c_delay_in finished at 20 or 21 every time.  (Evidently, while 11 worked, 20 or 21 was the correct answer, and was consistently on the opposite side of 11 from the failing 9, 7, 4.) Final software captured video was GREAT EVERY TIME.

So, my lesson to you is to do as we did.  Add your own frequency stability check, and hold that training state machine in reset long beyond when the PLL/MMCM locks (as liberal as it is), and only train after your home-grown more conservative stability check has passed.  You might as well add the DRP as well.  Note that my frequency counter is pretty simple.  I create a microsecond gate (periodic pulse) from the 200MHz system clock.  Then I simply count the number of rising CL edges per microsecond gate (with a bunch of clock crossing logic involved).  I have an (8x - 1x) shift/add in there to effect multiplying the result by 7 data bits per CL clock in order to get an actual diagnostic MHz readout.  I give it a while to not change much and then call it "stable".  I use this for both the DRP and now the training. 
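The (8x - 1x) shift/add is just a multiply by 7 without a multiplier.  A tiny Python sketch of the arithmetic, with times_seven and the 75 MHz example as illustrative assumptions:

```python
def times_seven(count: int) -> int:
    """Compute 7 * count as 8 * count - count: one left shift and one subtract."""
    return (count << 3) - count


cl_edges_per_microsecond = 75  # hypothetical 75 MHz Camera Link clock
print(times_seven(cl_edges_per_microsecond))  # 525
```

In logic this costs a wire shift and a subtractor, which is why it is attractive for a diagnostic readout.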

Note in my example that the failing-video camera's clock was already within the default range, so the DRP never had to be changed.  Were I needing to change it, this would have reset the training while the PLL/MMCM was turned off during DRP configuration.  This would have led to a later training time and good video.  However, this wasn't the case.  So the PLL/MMCM was free to keep the old, bad training, until I explicitly hooked my stable signal into the training state machine's reset condition.

Note:  I'm using Vivado 2018.1 and an in-house upgrade of the DDR Verilog code from XAPP585 v1.1.1 (March 2015).  XAPP585 v1.1.2 (July 2018) wasn't available when this work started, but I just checked and the example code is substantially the same, appearing to have the same shortcoming.  In fact, v1.1.2 appears to ship no new Verilog at all, just the same v1.1.1 code.

1 Solution

Accepted Solutions
helmutforren
Scholar
1,059 Views
Registered: ‎06-23-2014

@klumsde , you are absolutely correct.  That's one reason I said "shortcoming" and not "bug".

(1) The lock issue is only a problem because the specific camera we're using doesn't have a stable clock at first.  My freq detector gets us past that timeframe.  To rephrase your rephrasing: During the unstable clock, the MMCM or PLL remains locked and so training does not occur again.  However, as that unstable clock varies, the data timing varies with it.  As a result, the previously trained and now time-fixed ISERDES logic is foiled by the varying data time.  We get bad data, including at times bad FVAL, LVAL, DVAL and thus misshapen frames.

(2) I hear a little defensiveness regarding the XAPP585 covered frequency range, LOL!  Don't worry.  I wasn't assuming the XAPP585 design was for anything more than a single chosen freq.  It's my own need that requires it to be generic across freq's.  I only mentioned this for completeness of my description.  When I modified the XAPP585 code to include the DRP, I used ?XAPP888?.  Together, these worked beautifully.  It took less than half a day to research, code, test, and confirm.  It was after this that I wrote the freq detector in order to command the code to change.

The frequency counter looks like this:

  • Uses 200MHz system clock.  Has a counter go from 0 to 199, therefore it loops every microsecond.  A microsecond gate is toggled each microsecond, so it's high for 1us and low for 1us.
  • Each microsecond low edge, a tentative freq count is compared to the prior value.  If they match to within +/- 2, then a stable counter is decremented.  Otherwise this counter is set to 2,000,000.
  • If the stable counter reaches zero, stability is claimed.  This should take 4 seconds, because the microsecond low edge only occurs every 2us and 2,000,000 occurrences are required.
  • MEANWHILE, the freq count used is generated as follows:
    • The microsecond gate from the 200MHz system clock domain is crossed over to refer to the CL clock domain. 
    • Every CL clock, the freq counter is incremented IFF the clock-crossed microsecond gate is high.  As a result, when the clock-crossed microsecond gate goes low, the freq counter stops being incremented.  Ignoring edge cases, the counter should give the number of CL clocks per microsecond.  This is in fact "MHz"!  So the counter directly represents the MHz of the CL clock.  It's in error slightly due to clock crossing of the microsecond gate and other edge cases.  That's why the stability check allows a +/- 2 variation, chosen by experiment/experience.  That counter is then crossed over to the system clock domain for the stability logic, and locally cleared for the next time around.
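
The stability check described in the list above can be sketched as a behavioral model in Python.  This is not the actual HDL: the class name StabilityChecker, the sample values, and the choice to drop the stable flag whenever the counter is reloaded are assumptions for illustration.  The clock-domain crossing is not modeled; each call to sample() stands for one comparison at the low edge of the microsecond gate.

```python
class StabilityChecker:
    """Counts down while successive frequency counts agree within a tolerance."""

    def __init__(self, reload: int = 2_000_000, tolerance: int = 2):
        self.reload = reload
        self.tolerance = tolerance
        self.remaining = reload
        self.prior = None
        self.stable = False

    def sample(self, freq_count: int) -> bool:
        """Feed one per-gate count (roughly MHz) and return the stable flag."""
        matches = (
            self.prior is not None
            and abs(freq_count - self.prior) <= self.tolerance
        )
        if matches:
            if self.remaining > 0:
                self.remaining -= 1
        else:
            self.remaining = self.reload  # any jump restarts the wait
        self.prior = freq_count
        self.stable = self.remaining == 0
        return self.stable


checker = StabilityChecker(reload=5)  # small reload so the demo finishes quickly
counts = [60, 71, 75, 74, 75, 75, 76, 75, 75]  # hypothetical wobbly start-up
flags = [checker.sample(c) for c in counts]
print(flags)

# With the values from the post: 2,000,000 comparisons, one every 2 us.
seconds_to_stable = 2_000_000 * 2 / 1_000_000
print(seconds_to_stable)  # 4.0
```

The early jumps (60, 71, 75) keep reloading the counter, so the flag only goes true after five consecutive in-tolerance comparisons.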

The logic description above should be correct, but I may have made an error while reviewing my code.  The code itself absolutely works correctly.

2 Replies
klumsde
Moderator
1,080 Views
Registered: ‎04-18-2011

Hi @helmutforren 

Thanks for providing this post. The information is very useful to other users. 

I think it is interesting to make some points about it so that anyone looking to use it as a base to design the camera link in their system can judge the XAPP585 fairly. 

So what I got from your post was that the issue is that in your system the frame or pixel clock is not very stable at the beginning and it takes time for it to get to an acceptable quality. It seems to be further confused by the MMCM locked signal. The thing to say about this is that to achieve lock you must get the frequency and phase of the input clock to match the feedback clock at the PFD to an acceptable level. If you look at the data sheet you can see that we can tolerate a reasonable amount of jitter on this clock (<20% of the input period or 1ns max). So what I take from this is that you can lock the MMCM with an input clock that can appear jittery and in turn get instability in the clock and data alignment training. 

So in the XAPP we are making (the not unreasonable) assumption that the incoming clock is of good quality.

In this case your frequency counter seems necessary given the potential for this situation to occur in your design. 

The next thing you mentioned was that you had to do DRP due to the fact that your incoming clock could have a wide range of frequencies. What I would say here is that nowhere in the XAPP is it promised to accommodate any and all input clocks without having to manage MMCM VCO / MMCM output dividers. 

So there is a case to make sure that your design is robust when you try to use it in a system where the input clock is not known. 

By the way how do you drive this frequency counter? Do you take an MMCM output that is a copy of the input clock and observe it there?
