12-23-2015 02:04 AM - edited 12-23-2015 05:24 AM
I'm trying to bypass a BUFR inside the Xilinx IP GMII-to_RMII using a (post-synthesis) tcl script, trying to solve some timing issues.
In the screenshot, the BUFR to be bypassed is the green one. Its input comes from the rgmii_clk signal. This rgmii_clk signal (which enters the FPGA through an IBUF) should go directly to the IDDR buffers, instead of through a BUFR. (Basically the BUFR just makes my timing worse, as the signals from my external phy are center-aligned.)
Everything in my script seems to work fine, I can disconnect nets, remove the BUFR, ... but the final step of connecting the rgmii_clk signal to the clock net fails (I took the input of the BUFG further down the road as 'reference' to get this net name).
I get this warning :
WARNING: [Netlist 29-47] Cannot modify dont-touch cell 'design_1_i/gmii_to_rgmii_0/U0/i_gmii_to_rgmii_block/design_1_gmii_to_rgmii_0_0_core'
I cannot find any explanation on a 'dont-touch cell'.
My first assumption was that I could not do this because the IP is 'locked' and from Xilinx, but I can delete the BUFR, disconnect nets, ...
here's my current tcl script :
# *** POST SYNTHESIS TCL SCRIPT TO BYPASS BUFR on RGMII_RXC ***

# *** IBUF ***
set rxc_ibuf [get_cells -hierarchical rgmii_rxc_ibuf_i]
set rxc_ibuf_out [get_pins $rxc_ibuf/O]
# disconnect the output of the IBUF
disconnect_net -objects $rxc_ibuf_out

# *** BUFG ***
set rxc_bufg [get_cells -hierarchical bufg_rgmii_rx_clk]
set rxc_bufg_sig_in [get_nets -of_objects [get_pins $rxc_bufg/I]]; # keep track of this signal, we'll need it later

# *** detach the BUFR on all it's pins ***
set rxc_bufr [get_cells -hierarchical bufr_rgmii_rx_clk]; # get ref to the BUFR
disconnect_net -objects [get_pins $rxc_bufr/I]; # disconnect pins
disconnect_net -objects [get_pins $rxc_bufr/O]
disconnect_net -objects [get_pins $rxc_bufr/CE]
disconnect_net -objects [get_pins $rxc_bufr/CLR]
remove_cell $rxc_bufr; # remove the BUFR now

# *** connect net now so the 'bypass' gets established ***
connect_net -net $rxc_bufg_sig_in -objects $rxc_ibuf_out -hier
----> so this last command throws the warning mentioned above
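For reference, DONT_TOUCH is a Vivado property on cells and nets that tells the tools not to modify or optimize the object. The sketch below is only an inspection aid, not a fix: it walks down the hierarchy path named in the warning and prints the DONT_TOUCH value at each level, so it becomes visible which level carries the property. It assumes the synthesized design is open in Vivado and that the path from the warning message exists in that design.

```tcl
# Inspection only: print DONT_TOUCH for every hierarchy level of the cell
# named in the [Netlist 29-47] warning. Run in the Vivado Tcl console with
# the synthesized design open. Nothing in the netlist is modified.
set path design_1_i/gmii_to_rgmii_0/U0/i_gmii_to_rgmii_block/design_1_gmii_to_rgmii_0_0_core
set parts [split $path /]

for {set i 0} {$i < [llength $parts]} {incr i} {
    # Rebuild the hierarchical name one level at a time
    set name [join [lrange $parts 0 $i] /]
    set cell [get_cells -quiet $name]
    if {[llength $cell] == 0} {
        puts "not found: $name"
        continue
    }
    puts "DONT_TOUCH=[get_property DONT_TOUCH $cell]  $name"
}
```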
12-23-2015 04:18 AM
this does not sound like the right way to get things right.
are you talking GMII to RGMII or you want RMII ??
12-23-2015 04:36 AM - edited 12-23-2015 04:39 AM
I'm sorry - that was a typo - I corrected it in the subject title. It concerns the GMII to RGMII IP from Xilinx (user guide PG160). Bypassing the BUFR was one of the possible / proposed solutions by Xilinx support to solve my timing closure issue with this IP combined with my phy, and I can see why - because the BUFR just shifts away the center of the data. The IDDR regs could be directly clocked from the IBUF. And I'm trying to do this in the post-synthesis netlist.
12-23-2015 04:57 AM
i see.
well the GMII to RGMII seems to be a really hard nut. sometimes it works, and sometimes there is a lot of work required to get it working. been there many times also.
that xilinx suggested that trick with the bufr is new to me, we have not bothered to ask xilinx in this regard.
12-23-2015 05:07 AM - edited 12-23-2015 05:10 AM
Thanks @trenz-al, I'm starting to feel less dumb now seeing I'm not the only one :-) , been struggling with timing closure on this IP for many days now, while I'm just connecting this to the same Marvell phy as used on the Zedboards etc.
I'm communicating with someone from Xilinx on this, made some progress, but we'll need to pick it up after new year again, as we both go on holidays :-) In the meantime I cannot sit still, and just trying to learn as much as I can. These timing constraints on DDR / source synchronous interfaces are not easy to grasp.
I try to use the phy in 'mode 1', which is center-aligned data transfer, as I thought this would avoid the use of a PLL / MMCM, so fewer resources (which is what mode 1 was meant for).
Xilinx proposed 4 solutions
solution 1 didn't give timing closure
solution 2 looks like cheating
solution 3 is the one I'm trying now - hence the subject of this post :-)
solution 4 : tried this one, didn't give a better results (slightly worse actually, as it introduces more uncertainty).
To me it looks like the BUFR in the IP on rgmii_rxc is a major game breaker in this. Also for a center-aligned mode, I think the IDELAYs on the rgmii_rxd[*] are unnecessary, and just introduce more uncertainty. I would at least put an IDELAY also on the rgmii clock, so both can be shifted in opposite directions if needed to get a maximum data valid window.
So I also considered removing these IDELAYs on the rgmii_rxd[*] from the IP, if I ever manage to remove the BUFR first :-)
12-23-2015 06:14 AM
hi yes, not dumb, not alone.
you are really using scripts to modify the RTL after generating it from ENCRYPTED sources? This is not something to fight with at christmas.
1) no matter what way you use, to get the timing closure, what you get in FPGA then, it may not work properly, or may not work reliably
2cents for christmas.
todo:
1 make sure you have the NDA datasheet from marvell, make sure the PHY register setting matches what the IP core is implementing with regard to the clock position.
2 create your own RGMII Test IP
3 use it to validate the hardware you have
4 compare the results when using xilinx wrapper
5 write your own wrapper
I bet this all takes less time than having fun removing the BUFR
12-23-2015 06:53 AM - edited 12-23-2015 06:56 AM
hello @trenz-al, I kept this solution that xilinx proposed as a last resort :-) indeed I prefer to fight with a turkey on christmas instead of encrypted IP, though in the synthesized schematic view the inner working of the IP is no longer encrypted, and relatively easy to understand.
I got my inspiration for the script from this post
regarding the todo's :
1) I do have an NDA signed with Marvell. When I read your answer, maybe I must conclude that the IP is not meant to be used with mode 1, but only with mode '0', edge aligned (?)
2) didn't really get what you mean with steps 2 to 5 : I built a design like on the screenshot, guess this is what you mean with steps 2 & 3. So what exactly do you mean with 4 & 5?
12-23-2015 07:11 AM
I meant that you write 100% your own VHDL code from scratch, and do not use the xilinx encrypted wrapper.
12-23-2015 07:23 AM
@trenz-al - ok - I was assuming you meant this, but was not sure :-)
It's not a bad idea. I actually found a description and example code here, by ... Xilinx :-)
indeed I've been thinking about this too, although I actually hope Xilinx will improve/adapt their IP. I'm an embedded engineer, and amateur VHDL programmer, and under quite some time pressure. But it doesn't look that hard indeed.
I'll give the mode 0 a try, and see if that would lead me to timing closure, if not then I'll be writing VHDL while eating turkey.
thanks for the tips!
12-24-2015 01:25 PM
@ronnywebers "This rgmii_clk signal (which enters the FPGA through an IBUF) should go directly to the IDDR buffers, instead of through a BUFR"
Perhaps I'm missing something, but I would expect that you should ***REPLACE*** the BUFR with a BUFIO, ***NOT*** connect the IBUF output directly to the IDDR clock inputs as you have written above.
i.e. the revised clock path should look like this: IBUF => BUFIO => IDDR
And if there's other non-IDDR stuff hanging off the BUFR net, it probably needs to stay on the BUFR clock net.
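To see what actually hangs off the BUFR net before deciding what could move to a BUFIO, the loads can be listed from the Tcl console. This is a minimal read-only sketch, assuming the synthesized design is open and that the BUFR instance is named bufr_rgmii_rx_clk as in the script in the first post; it prints the primitive type (REF_NAME) of each cell driven by the BUFR output.

```tcl
# Read-only query: list every leaf load on the BUFR output net together
# with the primitive type of the cell it belongs to.
# The instance name is taken from the script in the first post.
set rxc_bufr [get_cells -hierarchical bufr_rgmii_rx_clk]
set bufr_net [get_nets -of_objects [get_pins $rxc_bufr/O]]

foreach load [get_pins -leaf -of_objects $bufr_net -filter {DIRECTION == IN}] {
    set cell [get_cells -of_objects $load]
    puts "[get_property REF_NAME $cell]  $load"
}
```

Any load whose type is not an IDDR (or other IOB cell) would be a candidate for staying on the BUFR clock net.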
Also, what device are you targeting?
Notes:
- I don't know that your IBUF=>IDDR connection attempt is causing your "Netlist 29-47" error message,
but it would definitely cause other design problems
- I have never used the Xilinx RGMII stuff , the above advice is based on general V6 BUFIO/IDDR clocking experience
-Brian
12-26-2015 07:52 PM
"I'm trying to bypass a BUFR inside the Xilinx IP GMII-to_RMII using a (post-synthesis) tcl script, trying to solve some timing issues."
What do you mean? How do you plan on clocking the IDDRs if you remove the BUFR?
The BUFR is a clock buffer, it takes the clock coming in on the clock capable pin and drives it to the regional clock network. It cannot be removed - it is the one (and only one) way to get it on the regional clock network. It is not surprising that your script to remove it doesn't work; the entire clock net would have to be removed from the regional clock network and placed in general fabric routing and, believe me, if you did succeed in doing this, your timing would get MUCH worse.
If you managed to remove the BUFR in the RTL, the tools would detect an unbuffered clock, and would automatically insert a BUFG. Again, your timing would get MUCH worse...
The BUFR and/or BUFIO are really your best bet for clocking a source synchronous interface - center or edge aligned (there tends to be very little difference between the BUFR and BUFIO).
I know you are still waiting for an answer on your constraint question (I am planning to get to it when I have time), but for now, you almost certainly need to abandon this idea of removing the BUFR - it's not the way to go.
Avrum
01-04-2016 03:14 AM - edited 01-04-2016 03:15 AM
Hello @brimdavis & @avrumw,
thank you both for your answers. I checked the proposal that I received from Xilinx support again: you're both right, here is what they wrote:
"Clock to the IDDR can be directly sourced from the BUFIO bypassing the BUFG/BUFR."
As I am not very familiar with the 'low level' hardware in the Zynq (7Z020) device that I'm using, I was assuming BUFIO meant IBUF, but that clearly isn't the case ... ? If I look at UG471, under '7 series FPGA SelectIO Primitives' I can see 'IOBUF' (bidirectional buffer type) but not 'BUFIO' ... is this the same buffer?
I'm starting to think it would be a better idea to get rid of the IDELAYs on the RX data lines (instead of trying to do something with the BUFR, as you both discourage), as my phy outputs center-aligned data. The IDELAYs just 'offset' the center of the eye, making my timing worse. I already tried to put an IDELAY on the clock input (before the BUFR), so that both data and clock go through IDELAYs, to get the same delay paths on both data and clock, but I didn't get timing closure with that workaround either (it was also proposed by Xilinx). But maybe that was due to wrong constraints - so I'll check this again.
@avrumw : thanks for replying to my constraint question, greatly appreciated, I'll go through your answer this afternoon!
01-04-2016 07:04 AM
"under '7 series FPGA SelectIO Primitives' I can see 'IOBUF' (bidirectional buffer type) but not 'BUFIO' ... is this the same buffer?"
No. The IOBUF is a tristateable input/output buffer - as such, it is described in UG471.
The BUFIO is a clock buffer. Like all clock buffers, it is described in the "Clocking Resources" user guide - UG472. This user guide is really a "must read" for all FPGA designers - there is nothing more important than clocking.
In the FPGA there are a number of different dedicated clock networks, each driven by their own buffer
- BUFG: Global clock buffer - drives the global clock network; can access any clocked cell anywhere in the die
- BUFH: Half clock buffer - uses the global clocking resources in one clock region; can clock any clocked cell in the clock region
- BUFR: Regional clock buffer - uses the regional clock network; also restricted to all clocked elements in one region, but is able to be synchronous to the I/O clock network
- BUFIO: I/O clock buffer - uses the I/O clock network; accesses only the (high speed) clocks in the IOB of one clock region; the IOB flip-flop, the IDDR/ODDR and the CLK pin of the ISERDES/OSERDES
- BUFMR: (Not really a clock buffer) - allows a clock entering on an MRCC pin of the FPGA to reach the BUFIO and BUFR of its own clock region as well as the clock region above and below
These should be the only things driven by input clocks (and generated MMCM/PLL clocks), and should be the only things driving clock pins of clocked cells.
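A quick way to check this rule on a real netlist is to ask which primitive drives each clock pin. The following is a minimal read-only sketch for the Vivado Tcl console, assuming a synthesized 7 series design is open; it looks only at IDDR cells (whose clock pin is named C) and prints the type of the driving cell, which should be one of the buffers listed above.

```tcl
# Read-only check: for every IDDR in the design, report which primitive
# drives its clock pin (C). Expect BUFIO, BUFR, BUFH or BUFG here.
foreach iddr [get_cells -hierarchical -filter {REF_NAME == IDDR}] {
    set clk_net [get_nets -of_objects [get_pins $iddr/C]]
    set drv_pin [get_pins -quiet -leaf -of_objects $clk_net -filter {DIRECTION == OUT}]
    if {[llength $drv_pin] == 0} {
        # No leaf driver pin, e.g. the net is driven by a top-level port
        puts "$iddr: no leaf driver found"
        continue
    }
    set drv_cell [get_cells -of_objects $drv_pin]
    puts "$iddr: clocked by [get_property REF_NAME $drv_cell] ($drv_cell)"
}
```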
Xilinx support confused you by the word "bypass" - they didn't mean it literally. They were suggesting using the BUFIO and I/O clock network, rather than the BUFR and regional clock network. The BUFIO is supposed to be slightly faster than the BUFR (although in some devices the tools don't agree with this). Xilinx support was suggesting that using the BUFIO instead of the BUFR might buy you some more margin on your interface - not actually bypassing the BUFR (and using no clock buffer - this just doesn't work).
As for not using the IDELAYs, that's also not likely to work. The FPGA doesn't want a centered window (see your post on constraining center aligned interfaces), it wants a window that comes slightly after the edge of the clock (when using BUFIO or BUFR clocking) - you will need the IDELAYs to center the valid window in the required window.
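The window the external device provides is what the input constraints describe to the tools; whether the FPGA's required window fits inside it is then what timing analysis reports. As a rough sketch only, center-aligned DDR input constraints usually take the shape below. The port names and the four data-valid numbers are placeholders (assumptions, not values from this thread or from any datasheet); only the 8 ns period follows from the 125 MHz RGMII clock at gigabit speed.

```tcl
# Sketch of center-aligned DDR input constraints (XDC).
# ASSUMPTIONS: port names and all dv_* values are placeholders and must be
# replaced with the real port names and the PHY datasheet numbers.
set period 8.000 ;# 125 MHz RGMII receive clock
set dv_bre 1.200 ;# data valid before rising edge  (placeholder)
set dv_are 1.200 ;# data valid after rising edge   (placeholder)
set dv_bfe 1.200 ;# data valid before falling edge (placeholder)
set dv_afe 1.200 ;# data valid after falling edge  (placeholder)

create_clock -name rgmii_rxc -period $period [get_ports rgmii_rxc]
set rx_ports [get_ports {rgmii_rxd[*] rgmii_rx_ctl}]

# Data sampled on the falling edge changes relative to the rising edge
set_input_delay -clock rgmii_rxc -max [expr {$period/2 - $dv_bfe}] $rx_ports
set_input_delay -clock rgmii_rxc -min $dv_are $rx_ports
# Data sampled on the rising edge changes relative to the falling edge
set_input_delay -clock rgmii_rxc -max [expr {$period/2 - $dv_bre}] $rx_ports -clock_fall -add_delay
set_input_delay -clock rgmii_rxc -min $dv_afe $rx_ports -clock_fall -add_delay
```

With constraints of this form in place, the setup and hold slack reported on the IDDR inputs shows how far the IDELAY tap value needs to move the data.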
Why don't you fix your constraints then open a new thread to discuss the timing of your interface - based on the window and clocking (and speed grade of your device) - we can look at fixing the timing of your interface. RGMII should be able to be captured by the FPGA - it's not really all that fast...
Avrum
01-04-2016 07:54 AM
Thanks @avrumw, that's a clear overview of the clocking possibilities. I'll go through UG 472, you're right that it's a must read. So far I focussed on firmware in the SDK, but I must go through this one :-)
I'll go to your reply on my other post, and get back to you asap on that post - had to help a colleague today so didn't find the time yet. I think you're completely right that it should be feasible to get the timing right, I think I'm just struggling with the right timing constraints and nothing more, so accepting your last answer as solution.