02-20-2020 02:24 PM
Is there any Xilinx-recommended or "blessed" approach to get high-resolution PWM on the 7-series FPGAs, without resorting to combinatorial tricks like they use for time-to-digital converters? This is for open-loop control of a high-frequency synchronous buck converter for a drain supply modulator (efficient RF power amplification).
I would like to get < 200-ps resolution on a 10-MHz PWM. I am driving the FETs with separate PWMs and adjustable dead times, so every cycle (100 ns) there are actually three edges to place with that resolution: the falling edge of PWM1, and the rising and falling edges of PWM2.
Linearity need not be perfect as long as it's monotonic. Target device is Spartan 7 so no gigabit transceivers.
Assuming a ~ 300-MHz clock, phase selection between 8 (?) outputs of a MMCM, every 45 degrees, doesn't quite get me there. I don't know how evenly spaced those are, or if using multiple MMCMs to get even more phases is feasible.
There is the IDELAY logic, but I don't know how workable that is for a PWM output. (Only HR banks available, so no ODELAY.)
Any other suggestion besides a carry-chain delay line and having to muck about with calibration over PVT?
02-24-2020 03:38 AM
I am not an expert on the application here.
Maybe a block diagram would be the best way for us to understand what you wish to do.
You seem to want 3 edges that you can tune in terms of phase.
Would the fine phase shift on the MMCM work here?
02-24-2020 08:19 AM
The requirement is to generate two PWM signals of relatively high frequency (probably 8 MHz) to drive the upper and lower MOSFET in a synchronous buck power stage. The frequency is so high (for a buck converter, not for an FPGA!) in order to achieve a ~ 1 MHz bandwidth once the switching frequency is filtered out. This is the amplitude modulation which is then applied to the drains of an efficient switching RF PA. (The phase modulation is performed in the RF PA drive.) Taken together, this goes by various names, Kahn technique, EER, or polar transmit.
Getting back to the PWM control of the synchronous buck, we may want to run this in a zero-voltage-switching (ZVS) mode for improved efficiency. To achieve ZVS it is necessary to vary the dead time between the two PWMs. (In a conventional synchronous buck, this dead time might be fixed at a length which minimizes "shoot through".) The upshot is that we have two PWMs, call them PWMU (upper) and PWML (lower), always with the following sequence (no overlap):
PWMU 0 -> 1
PWMU 1 -> 0
PWML 0 -> 1
PWML 1 -> 0
which repeats at an 8 MHz rate. The three middle events (PWMU 1 -> 0, PWML 0 -> 1, and PWML 1 -> 0) should be adjustable with "reasonable" precision. This is where it gets a bit fuzzy. I would like control on the order of 1 part in 1000. For 8 MHz, that implies 125 ps resolution. This is not a definite requirement but it is going to need to be somewhere in that ballpark. I don't expect amazing linearity at that level of resolution but it should at least be monotonic.
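As a quick sanity check on the numbers above, the required edge-placement step is just the PWM period divided by the number of steps wanted. A minimal Python sketch (the function name is only for illustration):

```python
def required_resolution_ps(pwm_freq_hz: float, steps: int) -> float:
    """Return the edge-placement step in picoseconds for `steps` levels per PWM period."""
    period_ps = 1e12 / pwm_freq_hz  # PWM period in picoseconds
    return period_ps / steps


if __name__ == "__main__":
    # 8 MHz PWM with control on the order of 1 part in 1000
    print(f"{required_resolution_ps(8e6, 1000):.1f} ps")
    # 10 MHz PWM at the same relative precision
    print(f"{required_resolution_ps(10e6, 1000):.1f} ps")
```

At 8 MHz this gives the 125 ps figure; at 10 MHz the same relative precision would need 100 ps.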
(For those wondering, no, I am not going to attempt closed-loop control of the buck. This sort of thing runs open loop with LUTs in the FPGA which are tweaked to pre-distort for nonlinearities, etc.)
I am fairly inexperienced with FPGAs, but at this point the leading contender which is not a DIY delay line would be to abuse some IDELAY blocks. Other suggestions would be welcome though.
02-24-2020 08:27 AM
I forgot to mention: I did look over the fine-phase-shift capabilities in the MMCM, but I believe that would be much too slow. I don't recall the exact numbers, but by the time you allow for the requisite number of clocks to achieve a phase inc/dec, I think the maximum cadence was something like 5 MHz? And that assumes that the phase always moves in single increments/decrements. So, not really applicable to PWM at all. The IDELAY delay line, on the other hand, looks just right, except that it's not designed for delaying an output signal. I may just need to abuse it and find a way to calibrate out the non-deterministic routing delays. As long as I can "lock" a design in place so it doesn't change on every implementation run, or when adding/removing other parts of the FPGA design, this would be acceptable.
02-24-2020 10:46 PM
As long as you only want to have that many steps and don't need to actually toggle that often (switching LVCMOS33 takes a while...), you might just use OSERDES to output up to 1250 Mbps. That would give you a step width of 800 ps. I'm using a similar strategy to control a PWM for an electric motor.
02-24-2020 11:56 PM
02-25-2020 12:35 PM
@drjohnsmith I'm not sure you have understood the problem statement / requirements that I outlined earlier. The reason for using a higher-than-normal frequency for the switching power stage, is because we need high bandwidth at the *output* of the power stage. In this case the target is 1 MHz bandwidth, so a PWM frequency between 5-10 MHz is reasonable. This slide deck may be instructive:
(See especially pages 33 and following. There is also an illustration of ZVS operation on page 27.)
@klasha The requirement (which is admittedly a bit fuzzy) *is* in fact to "have that many steps" which is what leads to the 100-ps ballpark figure. 800-ps resolution is only ~ 150 steps for an 8-MHz PWM, not enough. Reducing the PWM frequency would limit the bandwidth at the output of the synchronous buck.
BTW, I think I am hearing the argument being made, What use is there for resolution in the 100s of ps when LVCMOS rise/fall times may themselves be greater than 1 ns? These are really unrelated. There is no particular reason why PWM resolution ceases to be useful once it falls below the rise/fall time of the signal. You do have to consider noise effects, but other than that, PWM resolution is still PWM resolution, irrespective of the rise/fall time.
I appreciate the ideas. One thing that would help me, I think, is if someone could explain the pitfalls and reasons not to just use an IDELAY block routed to and from the fabric. Apart from a carry-chain delay line, this is the only technique I haven't been able to rule out yet. I have seen replies in other threads to the effect "that won't do what you want," but I have never exactly understood why.
02-25-2020 02:36 PM
In a Spartan-7 you can use the OSERDES running at a significantly higher clock rate on the "high speed" side. In a -1 device the BUFIO can go to 600 MHz, and in a -2 it can go to 680 MHz. The OSERDES can run at DDR, so you can get the equivalent of 1200 MT/s, which gives you a resolution of 833 ps. These would be essentially perfectly linear (well, except for the internal clock duty cycle: each pair of bits would span exactly 1666 ps, but that time might not be split evenly between the two bits); the -2 part would bring the resolution down to 735 ps. That's not the 200 ps you were looking for, but it gets you "closer".
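The OSERDES figures can be reproduced with a few lines of Python. This is only an arithmetic sketch: the function names are made up for illustration, and the 8 MHz PWM frequency is taken from earlier in the thread.

```python
def ddr_bit_time_ps(clk_hz: float) -> float:
    """Bit time in ps for a DDR serializer: two bits per clock period."""
    return 1e12 / (2.0 * clk_hz)


def steps_per_period(pwm_freq_hz: float, step_ps: float) -> float:
    """Number of distinct edge positions that fit in one PWM period."""
    return (1e12 / pwm_freq_hz) / step_ps


if __name__ == "__main__":
    for grade, clk_hz in (("-1", 600e6), ("-2", 680e6)):
        step = ddr_bit_time_ps(clk_hz)
        steps = steps_per_period(8e6, step)
        print(f"speed grade {grade}: {step:.0f} ps per bit, {steps:.0f} steps per 8 MHz period")
```

So a -1 part gives roughly 150 steps per period and a -2 part roughly 170, both well short of the 1000 steps being asked for.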
The Spartan-7 family has only HR I/O banks, so there are no ODELAYs (so that can't help you).
As for using multiple phases of the MMCM, I'm not sure how that helps you - are you considering having multiple fabric flip-flops, each running on a different phase of the clock, and then ANDing (or ORing) the resulting outputs together? If so, you will have a really tough time getting this to produce reproducible (or linear) results - the routing from each FF to the AND/OR gate is going to be critical, and this is VERY hard to control.
Other than that, I don't really have any ideas. The Kintex-7 would be able to run faster (800 MHz in some speed grades), and has some pins with ODELAYs - you can change the ODELAY "pretty quickly" but it isn't really intended for changing at those rates (a couple of hundred MHz is probably the limit of how fast the ODELAY will really change).
02-25-2020 10:06 PM
A 200-MHz rate of delay updates is not slow; it is much faster than needed! My proposed PWM frequency is 8 MHz, which offers an eternity, comparatively speaking, to update an ODELAY during the glitch-free window. This only imposes a min/max duty-cycle limit; a range of, say, 5%-95% would be acceptable.
Unfortunately, moving to a Kintex-7 would blow a hole in the budget and the PCB (probably not doable on four layers).
I believe the way the multiple-clock-phases method usually works is that you have all of your clock phases distributed by the clock network (so relatively low skew). One of the phases is selected (via a mux) to clock a flip-flop which registers the output of a coarse counter. The coarse counter provides the bulk of the delay in increments of 1/(300 MHz), say, and the flip-flop interpolates within a single 1/(300 MHz) period. I'm not an expert so this is probably a flawed explanation but the multiple-clock-phases approach is well known in the literature on TDCs so it obviously works. The number of phases is usually small however.
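To make the coarse/fine split described above concrete, here is a small Python model. It is only a sketch: it assumes ideal, evenly spaced phases, the 300 MHz clock and 8 phases mentioned in this thread, and a hypothetical function name. It says nothing about how the clock selection would actually be built in the FPGA.

```python
def split_edge(t_ps: float, coarse_clk_hz: float = 300e6, n_phases: int = 8):
    """Split a target edge time into (coarse clock count, phase index).

    Assumes n_phases ideal clock phases evenly spaced over one coarse period.
    """
    fine_step_ps = 1e12 / coarse_clk_hz / n_phases  # spacing between adjacent phases
    k = round(t_ps / fine_step_ps)                  # nearest fine step overall
    return divmod(k, n_phases)                      # (whole coarse periods, phase within period)


if __name__ == "__main__":
    for t in (1_250, 50_000, 52_000):
        coarse, phase = split_edge(t)
        print(f"edge at {t} ps -> coarse count {coarse}, phase {phase}")
```

With these numbers the fine step is about 417 ps, so the coarse counter supplies multiples of 3.33 ns and the phase index picks one of 8 positions inside that period.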
What about IDELAY? In my simplistic understanding, it can take input from the fabric and send output to the fabric, and one can obviously go from the fabric to a nearby output pin; ergo, it should be possible to connect the output of an IDELAY to an output pin (via some extra routing). Unless it just isn't possible to build a purely combinational path from one to the other?
If this isn't possible, then a carry-chain delay line may be the last option standing. This is also a mainstay of the TDC literature but I would really like to avoid it because of the calibration difficulties.
02-25-2020 11:59 PM
02-26-2020 11:50 AM
"I believe the way the multiple-clock-phases method usually works is that you have all of your clock phases distributed by the clock network (so relatively low skew). One of the phases is selected (via a mux) to clock a flip-flop which registers the output of a coarse counter."
Multiplexing clocks together is fairly limited - you need to use the BUFGMUX to do the MUXing (and keep things on the dedicated resources) - you can only MUX together a finite number of clocks (due to the interconnections between BUFGMUXes) - I doubt you would be able to do 8, and I am not even sure that 4 is possible... They would all have to come from the same MMCM.
The other thing you can investigate is the dynamic phase shift of the MMCM. That is a "relatively" fast update. A single increment/decrement operation advances/retards the clock phase by 1/56 of a VCO period (which is pretty small). However, it takes 12 PSCLK cycles to do this. The PSCLK can be driven at 450 MHz in the Spartan-7 in a -1 speed grade, so you can do one increment/decrement every 27 ns - you would have to do the math to see if that gives you the ability to move the variable edges of the PWM outputs "enough" in the amount of time available. You might be able to couple this with the clock MUX; have a 2:1 MUX between a phase-shifted and an unshifted clock output of the same MMCM, select the unshifted clock for the "start" edge of your PWM and then switch to the shifted clock, which can be set to the phase you want for the "stop" edge. The static timing analysis of this will be nasty, but with the appropriate exceptions you should be able to make this work.
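For reference, the phase-shift numbers work out as follows. This is a rough Python sketch; the 1200 MHz VCO frequency is an assumption chosen only for illustration (the actual value depends on the MMCM configuration), and the function names are hypothetical.

```python
def mmcm_fine_step_ps(vco_hz: float) -> float:
    """Size of one dynamic phase-shift step: 1/56 of a VCO period, in ps."""
    return 1e12 / vco_hz / 56


def ps_update_time_ns(psclk_hz: float, cycles: int = 12) -> float:
    """Time for one increment/decrement, given PSCLK cycles per operation."""
    return cycles * 1e9 / psclk_hz


if __name__ == "__main__":
    vco_hz = 1200e6  # ASSUMED VCO frequency, for illustration only
    print(f"fine step:   {mmcm_fine_step_ps(vco_hz):.2f} ps")
    print(f"update time: {ps_update_time_ns(450e6):.2f} ns per step")
```

Under that assumption each step is about 15 ps, which is plenty of resolution; the question is whether one such step every ~27 ns is fast enough.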
Lastly, you may be able to use the IDELAY, but you will be fighting the tool. The IDELAY input can come from the fabric and the output can go to the fabric (without a flip-flop), but it uses general-purpose routing to get there and back, so the round-trip delay will change from run to run and will be PVT dependent. However, for a PWM (which is essentially self-timed) this may be acceptable - changing the IDELAY value should change the delay by the expected amount even if there are other unknown delays in the path.
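The point about unknown routing delay can be illustrated with a toy Python model. Everything here is an assumption for illustration: the tap size uses the 1/(32 x 2 x F_REF) relation as I recall it from the 7-series SelectIO documentation with a 200 MHz reference clock, the routing offsets are made-up numbers, and the function names are hypothetical.

```python
def idelay_tap_ps(refclk_hz: float = 200e6, taps: int = 32) -> float:
    """Nominal tap size in ps, ASSUMING tap = 1 / (taps * 2 * F_REF)."""
    return 1e12 / (taps * 2 * refclk_hz)


def path_delay_ps(tap: int, routing_offset_ps: float, refclk_hz: float = 200e6) -> float:
    """Total fabric -> IDELAY -> fabric delay: unknown routing offset plus tap delay."""
    if not 0 <= tap <= 31:
        raise ValueError("tap must be in 0..31")
    return routing_offset_ps + tap * idelay_tap_ps(refclk_hz)


if __name__ == "__main__":
    # Two hypothetical implementation runs with different (made-up) routing offsets
    for offset_ps in (1234.0, 2950.0):
        delta = path_delay_ps(10, offset_ps) - path_delay_ps(4, offset_ps)
        print(f"offset {offset_ps} ps: moving tap 4 -> 10 adds {delta} ps")
```

In this model the change from one tap setting to another is the same regardless of the routing offset, which is why the absolute round-trip delay may not matter for a PWM edge. It ignores PVT variation of the routing itself and any tap nonlinearity.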
02-27-2020 06:46 PM - edited 02-27-2020 07:02 PM
@avrumw, I still don't think the MMCM fine-shift is going to be fast enough. Back of the envelope says 1 MHz modulation bandwidth requires a PWM duty cycle change from 0% to 100% and back to 0% in approximately 1 us. So, single phase steps 27 ns apart implies only ~ 20 discrete duty cycles between 0 and 100%. Since the fine phase steps are finer than that, I don't see how it can work. It's a shame the MMCM doesn't support parallel loading of the delay tap like IDELAY/ODELAY. I guess the underlying architecture is different?
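The back-of-the-envelope estimate can be written out in Python. This sketch takes the 12-cycle, 450 MHz PSCLK figures quoted earlier in the thread and assumes a 1200 MHz VCO purely for illustration; the variable names are mine.

```python
PWM_PERIOD_NS = 125.0             # 8 MHz PWM
RAMP_NS = 500.0                   # 0% -> 100% in half of a 1 us modulation cycle
STEP_TIME_NS = 12 / 450e6 * 1e9   # one phase increment every ~26.7 ns
FINE_STEP_PS = 1e12 / 1200e6 / 56  # ASSUMED 1200 MHz VCO, 1/56 period per step

# How many single increments fit into one 0% -> 100% ramp
steps_in_ramp = int(RAMP_NS // STEP_TIME_NS)

# How many single increments it would take to sweep one full PWM period
steps_needed = PWM_PERIOD_NS * 1000 / FINE_STEP_PS

# Time for that full sweep when moving one increment at a time
sweep_time_us = steps_needed * STEP_TIME_NS / 1000

if __name__ == "__main__":
    print(f"increments available per ramp: {steps_in_ramp}")
    print(f"increments needed for full sweep: {steps_needed:.0f}")
    print(f"time for full sweep: {sweep_time_us:.0f} us")
```

Under these assumptions only about 18 increments fit into a ramp, while sweeping the full duty range one increment at a time would take thousands of increments and a couple of hundred microseconds.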
Thanks for the explanation re: IDELAY. I know that I would be "misusing" the block but I suspect it is the best option. An a-priori unknown and PVT-sensitive additive delay is not a problem, except that there are two PWMs. I suspect the *differential* delay change vs PVT is acceptable, though, if stuff is placed close together. Having this delay change every time the design changes will be an issue... Can't the tools be instructed to "freeze" a portion of the design in place after you're happy with it? Strategic manual placement of a CLB or two? I'm assuming the same routing path would generally get used, if the gates/flops are in the same locations between runs?
02-27-2020 06:56 PM
@drjohnsmith, you're right, I wasn't clear what you were suggesting. But I don't think that is a viable solution. Dithering as you propose is essentially trading bandwidth for resolution. Obviously it would be fine if this were just a normal power supply. But when the buck is switching at 8 MHz and we need 1 MHz bandwidth at the output (using a fairly steep LCLC filter), there aren't a lot of extra switching cycles to "spend" on any averaging scheme.
Also, to the extent that particular switching instants are required to achieve ZVS, averaging doesn't work. The closer you get to the right moment, the closer you get to ZVS, and averaging "too soon" on one cycle with "too late" on the next is no good, since power dissipation doesn't come in positive and negative flavors :)
02-27-2020 11:48 PM
07-29-2020 12:15 AM
@mark04 did you find any solution to your problem in the end? I'm facing more or less exactly the same problem and was thinking of using the ODELAY units - until I found out that the HR pins don't have any (and my XC7Z020 device does not have any at all). Like you, I am now thinking of abusing the IDELAY units. Was this possible in the end?