Contributor
412 Views
Registered: 03-04-2016

US+ BUFG_GT fanout and cascading BUFG


Hi, 

I have a question about the US+ BUFG_GTs used for GTY.

 

Currently I'm using a BUFG_GT to drive the FPGA fabric.

Since TXOUTCLK has a fanout of about 8000, I'd like to know whether there are any restrictions on the maximum fanout of a BUFG_GT.

 

Furthermore, I'm interested in cascading one (or even more?) BUFG after the BUFG_GT (or maybe a BUFGCE whose CE is driven by the GTY_RESET_DONE indication, to reduce clock skew?).

Currently TXOUTCLK runs at 299.88 MHz and drives logic with a depth of 14 levels (using GTY in XCVUP5).

The worst paths are all inside my TXOUTCLK clock domain (after routing: WNS -0.2 ns, TNS -3.2 ns, no hold violations).

With several post-routing phys_opt steps I managed to get the design to meet timing.
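
To put those numbers in perspective, here is a rough back-of-the-envelope sketch of the timing budget. It assumes the 14 logic levels share the clock period evenly and ignores setup time, clock skew and clock uncertainty, so it is only an approximation and not what the Vivado timing report computes. The function names are made up for illustration.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in ns for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def per_level_budget_ns(freq_mhz: float, logic_levels: int) -> float:
    """Average delay available per logic level (cell + routing).

    Simplification: setup time, skew and uncertainty are ignored.
    """
    return clock_period_ns(freq_mhz) / logic_levels


def estimated_fmax_mhz(freq_mhz: float, wns_ns: float) -> float:
    """Rough achievable frequency given the worst negative slack.

    A negative WNS means the worst path is longer than the period,
    so the estimated fmax is below the target frequency.
    """
    return 1000.0 / (clock_period_ns(freq_mhz) - wns_ns)


if __name__ == "__main__":
    target_mhz = 299.88   # TXOUTCLK frequency from the post
    levels = 14           # logic depth from the post
    wns_ns = -0.2         # WNS after routing, from the post

    print(f"period:        {clock_period_ns(target_mhz):.3f} ns")
    print(f"per-level:     {per_level_budget_ns(target_mhz, levels) * 1000:.0f} ps")
    print(f"approx. fmax:  {estimated_fmax_mhz(target_mhz, wns_ns):.1f} MHz")
```

Under these assumptions the period is about 3.33 ns, which leaves roughly 238 ps per logic level, and a WNS of -0.2 ns corresponds to the worst path running at about 283 MHz instead of the target 299.88 MHz.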

 

Is it recommended, or even allowed, to cascade a BUFG(CE) after a BUFG_GT in implementation?

Could the BUFG be used to optimize timing or will it lead to even more timing issues?

 

 

I'd be thankful for any advice on improving the clock tree and timing!

Best regards, Michael


Accepted Solutions
Teacher drjohnsmith
398 Views
Registered: 07-09-2009

Re: US+ BUFG_GT fanout and cascading BUFG


The clock network in the FPGA is "special" and is not constrained in the same way as the logic layers.

The clock layers are implemented nothing like the simplified schematic presented in the documentation.

     Not least, load is not a problem for the clock network,

           but the length is a concern.

It can take many hundreds of ps to get a clock across the chip,

   which is one of the constraints the tools are trying to optimise.

 

I'm 99.9 % certain that adding explicit buffers will slow the clock down,

   assuming the tools don't trim them out.

"Give it a go" is the easy and final answer; the tools don't lie.

The way to improve timing is to look at your algorithm. Generally, adding more pipeline stages in the data path is a good start.

(I had a design a while back where the best layout for the logic blocks was in different areas of the chip because, apart from a bus between each of the 4 blocks, they were very tightly coupled. I changed the algorithm so the blocks could cope with a three-clock delay between the 4 blocks: still the same data rate, just a delay, and the design met timing easily.)
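
The "same data rate, just a delay" point can be illustrated with a small behavioural model. This is only a sketch in Python, not HDL and not the code from that design; the function name and the sample values are hypothetical. It models a chain of registers clocked once per sample: every clock still produces one output, it just arrives a fixed number of clocks later.

```python
from collections import deque


def pipeline(samples, stages):
    """Model `stages` back-to-back registers fed with one sample per clock.

    Returns the value seen at the pipeline output on each clock.
    None stands for the unknown reset contents of the registers.
    """
    if stages == 0:
        return list(samples)          # no registers: output equals input

    regs = deque([None] * stages, maxlen=stages)
    out = []
    for s in samples:
        out.append(regs[0])           # oldest register drives the output
        regs.append(s)                # clock edge: shift in the new sample
    return out


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6]
    print(pipeline(data, 3))
```

With three stages the output is the input stream shifted by three clocks. The throughput is unchanged, but each register stage gives the router a full clock period to cover part of the distance between blocks.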

 

Don't even look at post-routing phys_opt unless you're within a few ps of meeting timing, is my advice.

 

 

 

 

 


2 Replies

Contributor
331 Views
Registered: 03-04-2016

Re: US+ BUFG_GT fanout and cascading BUFG


@drjohnsmith 

 

Thanks a lot for your explanation! I won't insert a BUFG manually, since place_design/route_design don't insert one either.

Since my own PCS layer is driven by TXOUTCLK, I'll try creating separate registers (e.g. sync & prep length) for each lane to improve timing instead of sharing them (currently the four PCS instances, one per GTY channel, use the same sync & prep registers).

I've already created a two-stage pipeline to provide those registers in different areas of the chip.

This will obviously lead to a high fanout on those registers, which I'll try to compensate for by duplicating the registers and keeping the equivalent registers.

 

Edit:

 

I've added several pipeline registers to improve timing. I've also added a TX DATA / RX DATA pipeline stage to improve the placement of my module close to my own implementation without making the routing distance to the GTY too long.

With this I get much better timing results and faster implementation runs, since Vivado doesn't have to optimize as much as before.

 

 

Best regards,

  Michael

 
