Adventurer
5,216 Views
Registered: 03-31-2017

xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

2017.1 has ramped up the pressure to drop dcps for Xilinx (and user) IP, and AR #69690 goes into detail on why xcis are preferred to dcps. However, it seems to me that many, if not all, of these reasons are entirely bogus. The whole tone of #69690 is that dcps are somehow broken. The only rational argument in #69690 is that the included xdc files are incorrect because they're OOC, but it ignores the fact that these OOC xdcs are overridden by later in-context xdcs.

 

So, question - which do you use? Why? Do you believe #69690?

 

Consider:

 

  1. The Xilinx IP dcps are an obvious make target, like any other dcp. I have no intention of allowing the Xilinx tools to rebuild all their own dcps every time I do a build; that's my problem, not theirs. A lot of us have flows which are entirely based on dcps (a minimal build-script sketch follows this list).
  2. It's the natural fit for a bottom-up OOC flow. There's no conceptual difference between a Xilinx IP dcp and my own bottom-up dcps.
  3. The 2017.1 dcps do still contain the xdc files, despite what it says in #69690 (well, at least the ones I've checked do).
  4. #69690 says "The generated DCP contains the constraints that were used for the OOC synthesis run. This was an out of context synthesis run which needed reasonable constraints in order to produce a reasonable netlist. However, those constraints have no knowledge of the external design." Well, actually, most of what I do is OOC. Are you saying OOC doesn't work? Of course it does.
  5. #69690 goes on: "The XCI file points to the original XDC constraints that will be applied when Vivado synthesis and implementation processes have access to the entire design. Having a knowledge of the external design allows the constraints to be set based on the design (not an artificial estimate or default value)". What? Are you suggesting that we should run up the GUI for every project, load all the constraints, and *then* regenerate IP? And you're seriously suggesting that the tools somehow use these extra constraints? I don't believe it. The IP is almost certainly generated completely OOC. Later in-context xdcs then override the defaults, as explained in the docs.
  6. #69690 says "Simulation is ~100 times slower when using a dcp file. The simulation time difference is due to the difference between a structural netlist simulation and behavioral models that are shipped with IP (outside DCP)". This is completely bogus, for 2 reasons: (1) simulation is not Xilinx's concern, and no-one in their right mind uses a dcp for simulation, and (2) well, actually, Xilinx is not shipping behavioural models anyway, because everything is moving to encrypted netlists (yes, a total PITA).
  7. Version control is no better in one flow than the other. The xci goes under VC, not the dcp.
  8. #69690 says (against the dcp flow) "You cannot recreate an IP core (or upgrade or make changes to it). If you lose track of the XCI file of the IP core you have no knowledge of the settings or DCP content so the XCI will always need to be preserved in any case". Well, you've completely missed the point. Of course you have to preserve the xci in a dcp flow, and I'm no more likely to lose an xci than I am to lose my own sources.
  9. #69690 says "Xilinx never tests standalone DCP for our IP Catalog". No, of course not. But you don't test my bitfiles either. And you don't test the xcis. So what's your point?
  10. #69690 says "Vivado needs information embedded in the XCI to correctly do memory initialization". Pass. Maybe this is true; if it is, then you need to fix the dcp, instead of junking it. And, if it is true, then I can simply use the xci instead of the dcp if I need memory initialisation.
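For reference on point 1, here's a minimal sketch of the kind of per-IP make target I mean - a batch-mode Tcl script driven from make. The part number, file names, and make rule are all illustrative, not from any real flow:

  # build_ip.tcl -- rebuild one IP's dcp from its xci, invoked from a
  # make rule along the lines of:
  #   %.dcp: %.xci
  #           vivado -mode batch -source build_ip.tcl -tclargs $<
  set_part xc7k325tffg900-2              ;# illustrative part
  set xci [lindex $argv 0]
  read_ip $xci                           ;# load the IP customization
  generate_target all [get_files $xci]   ;# emit the HDL and xdc output products
  synth_ip [get_ips]                     ;# OOC synthesis -> dcp next to the xci

make then only rebuilds a dcp when its xci (or the tool version) changes, exactly like any other object file.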


10 Replies
Guide
5,191 Views
Registered: 01-23-2009

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

eml,

 

Most of what you say is correct and reasonable.

 

As you said, the only "real" problem with the DCP is that it contains constraints that are "false" - some dummy constraints were used as part of the OOC synthesis of the IP, which are then written into the DCP. But the scope of this "problem" is pretty limited; most of these constraints (like create_clock) will be overridden when the IP is integrated into your top...

 

But some won't. Of specific concern are the constraints on clock domain crossing circuits within IPs. These often use set_max_delay -datapath_only. The values for these constraints are extracted from the clocks of the OOC synthesized IP (using something like get_property PERIOD [get_clocks -of_objects <some net/pin/port>]). So, if the OOC IP used 100MHz clocks, then the value used will be 10ns. And this will be a fixed 10ns - in the DCP it is no longer related to the clocks of the IP - it becomes the constant "10". If you then integrate it into a system that has faster clocks, the set_max_delay will remain at 10ns unless overridden by another constraint.

 

In the .xci flow, these constraints are overridden by the .xdc file for the IP, which re-applies the get_property PERIOD [get_clocks -of_objects <some net/pin/port>], but this time it gets the periods of your real clocks, rather than the fake clock used in OOC synthesis.
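To make that concrete, here is a minimal sketch of such a CDC constraint. The register names are invented; real IP xdcs look similar but are scoped to the IP's hierarchy, and the exact clock chosen varies by IP:

  # As shipped in the IP's .xdc: the delay budget is re-evaluated against
  # whatever clock actually reaches the destination register in-context.
  set dst_clk [get_clocks -of_objects [get_pins {sync_reg[0]/C}]]
  set_max_delay -datapath_only \
      -from [get_pins src_reg/C] \
      -to   [get_pins {sync_reg[0]/D}] \
      [get_property PERIOD $dst_clk]

  # What effectively survives in the OOC DCP: the expression has already
  # been evaluated against the dummy 100MHz OOC clock, leaving a constant:
  #   set_max_delay -datapath_only -from ... -to ... 10.000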

 

So if you use the DCP file and don't (in another way) override the clock domain crossing constraint, your clock crossing circuit can end up under-constrained (which is bad...)

 

The bigger argument for this is that "Xilinx wants you to do it this way". When Xilinx recommends against a methodology, even if you have a workaround that seems to make sense, you are straying off the well-beaten path of "what Xilinx wants you to do" into territory that may be more subject to bugs, or may be discontinued entirely in the future (it's hard to see how the latter applies in this case, but as a general rule it is something to consider).

 

Finally as for the bottom up OOC flow (and you won't like this), it is an orphaned technology... The full bottom up OOC flow (using placed and routed OOC modules) is only supported in the 7 series - it is specifically not supported in UltraScale or UltraScale+ (although OOC synthesis is still supported).

 

Avrum

Scholar
5,187 Views
Registered: 09-16-2009

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?


@avrumw wrote:

Finally as for the bottom up OOC flow (and you won't like this), it is an orphaned technology... The full bottom up OOC flow (using placed and routed OOC modules) is only supported in the 7 series - it is specifically not supported in UltraScale or UltraScale+ (although OOC synthesis is still supported).

 

Avrum


 

Can you explain that last bullet?  I thought Xilinx was pushing OOC as the only way to go.  They changed the default from "Global" (basically RTL, top-down flows) to OOC (basically bottom-up netlist flows).  They were strongly encouraging folks to only use OOC.

 

Now that's reversed again? 

 

(To be clear, we don't use any of the recommended Xilinx flows - Just The RTL for us.  In our opinion, Xilinx is just completely, utterly, and hopelessly lost in their IP and revision control methodologies.  So my knowledge of the "official" flows is rather limited...)

 

Thanks,

 

Mark

Guide
5,175 Views
Registered: 01-23-2009

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

The words "out-of-context" mean a lot of things to a lot of people. But the most important distinction is are you talking about OOC synthesis or "full" OOC implementation. When one talks about "bottom up OOC", one is often talking about full OOC implementation. The original post mentioned "bottom up".

 

In "bottom up OOC implementation", the OOC module is fully placed and routed in a PBLOCK with some additional HD constraints (HD.PARTPIN and HD.CLK_SRC). This is done truly "out-of-context" - no description of the top level exists during the OOC synthesis or place and route. Once this is done, the OOC module can be directly imported into a design with the place and route information "locked" - the place and route of the "top level" will not touch the placement and routing of the stuff in the OOC partition. It is this flow that is not supported in UltraScale/UltraScale+. (Although somtehing similar can be done with the Partial Reconfiguration flow).

 

OOC synthesis is supported in all technologies supported by Vivado, and is the preferred method for synthesizing IP.
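For completeness, the knob that selects between the two synthesis styles for a given IP is a property on the .xci (documented in UG896; the IP name here is hypothetical):

  # Synthesize the IP out-of-context, producing its own dcp (the default):
  set_property GENERATE_SYNTH_CHECKPOINT true [get_files char_fifo.xci]
  # ...or fold the IP's RTL into the global, top-down synthesis run instead:
  set_property GENERATE_SYNTH_CHECKPOINT false [get_files char_fifo.xci]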

 

Avrum

Scholar
5,167 Views
Registered: 09-16-2009

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

Thanks for the clarifications.

 

In our design flows, ANY OOC is just bad methodology.  So I grouped the two together in my head.  Netlist-based flows are bad.  Netlist-based flows with place and route info are worse.

 

The tradeoff just doesn't make any sort of sense to me.  RTL flows = manage 1 design.  Netlist flows = manage N instances.  Netlist flows with place and route = manage N * M instances.  Moving towards the right just to save a little CPU time?  No way.  We'll throw more CPU at the problem any day of the week and just manage the 1 RTL design, thank you.

 

Regards,

 

Mark

Guide
5,156 Views
Registered: 01-23-2009

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

Mark,

 

OOC flows do have a number of advantages. The biggest one is being able to "divide and conquer" the timing closure problem. With a full bottom-up OOC build methodology, a number of designers can each design sub-modules of the system, and fully implement them OOC. Each designer can worry about attaining timing closure on that one module (as long as an appropriate budget is set for the interconnecting signals), and once that is done, the top design can simply be stitched together. Furthermore, if one module has an RTL bug, it is possible to replace only that module and not affect the timing closure of the other modules.

 

But, of course, this comes with a cost. You must partition your FPGA into PBLOCKs, which means manual floorplanning. This is always a complicated task, and can make meeting timing harder (or even significantly harder). Furthermore, if you get your PBLOCK sizing wrong, or your signal budget between modules wrong, then the whole thing may have to be trashed and started over. It also means you can effectively use less of the FPGA since you need to size each PBLOCK with margin... (Note: I have never done this, and probably never will...)

 

The one place where (at least I think) OOC flows are really nice is early in the development cycle of a system. If you have a module that is "more or less complete", but not the system around it, it is very nice to be able to synthesize and even implement (if you want) that module OOC in order to see if it can meet timing. This way you can identify architectural issues with your design early on. This has always been possible by treating your module as if it were the entire design, but that approach has problems:

  - the number of I/O of the sub-module may be larger than what your device has

  - assigning the I/O to FPGA pins gives you a floorplan that is dominated by the (fake) placement of the I/O

     - this can obscure the real timing issues, mixing them up with ones due to the fake I/O

 

These can all be worked around (creating wrappers around the module with flip-flops, LFSR chains, XOR chains, etc...), but it is much easier to simply declare the module OOC and throw it into the FPGA...
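A minimal non-project sketch of that "throw the module into the FPGA" experiment (the part, file, module, and clock details are all illustrative):

  # Sanity-check one module's timing by synthesizing it out-of-context
  set_part xc7k325tffg900-2                        ;# illustrative part
  read_verilog my_module.v
  synth_design -top my_module -mode out_of_context
  # constrain it with the clock you expect in the real system
  create_clock -name clk -period 4.0 [get_ports clk]
  report_timing_summary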

 

Avrum

Scholar
5,147 Views
Registered: 09-16-2009

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

 

Perhaps it's just the nature of the designs I've been working on that bottom-up methodologies have never been necessary.  "Divide and conquer", as you say, with timing closure is, in the end, just covering up tool shortcomings.

 

If the tool can't handle the entire design top-down, then you as the designer are forced into bottom-up methodologies.  Vivado's much better than ISE was at timing closure.  It has been able to handle everything we throw at it top-down, without issues.

 

There are tons of other tools in my box for addressing timing closure problems.  Bottom-up (OOC) is way near the bottom of my list as to what I'd use to address the problem.  The UltraFast Methodology guide suggests this too - you go way deep into Xilinx's recommended strategies before "Considering Floorplanning" is suggested.

 

As to budgeting timing results - sure, a few throw-away OOC runs just to make sure your sub-module is in the ballpark timing-wise (with margin!) - that's a great idea.  But those are explorations only.  If the RTL works there, it should work top-down too.

 

Regards,

 

Mark

Guide
5,141 Views
Registered: 01-23-2009

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

"Divide and conquer", as you say, with timing closure is, in the end, just covering up tool shortcomings.

 

That's a bit harsh... The tools are pretty good. Remember, place and route are REALLY REALLY complicated (NP-hard) processes...

 

But as for the rest, I completely agree. Floorplanning really should be a "last resort" solution for timing closure, and OOC place and route requires floorplanning. There are LOTS and LOTS and LOTS of things you should do before resorting to floorplanning.

 

And Vivado is really good at getting timing closure - even on big and complicated designs. And run times are better constrained than they were in ISE.

 

But, we also need to consider that some of these FPGAs are just plain HUGE! Consider a Virtex-7 2000T - it is not inconceivable that you may have 4 (or more) main subsystems in an FPGA, each filling a significant portion of this really immense device (or even a complete SLR on its own). For a system like this, maybe splitting the place and route process makes sense for the reasons I mentioned above...

 

Avrum

Scholar
5,125 Views
Registered: 09-16-2009

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?


@avrumw wrote:

"Divide and conquer", as you say, with timing closure is, in the end, just covering up tool shortcomings.

 

That's a bit harsh... The tools are pretty good. Remember, place and route are REALLY REALLY complicated (NP-hard) processes...



Actually, we're agreeing here.  I think Vivado's doing just fine given the circumstances.  There's no reason to resort to bottom-up methodologies when top-down is working fine for almost all designs.  I mean, it can ALWAYS be faster.  But it's working quite well right now.

 


@avrumw wrote:

But, we also need to consider that some of these FPGAs are just plain HUGE! Consider a Virtex-7 2000T - it is not inconceivable that you may have 4 (or more) main subsystems in an FPGA, each filling a significant portion of this really immense device (or even a complete SLR on its own). For a system like this, maybe splitting the place and route process makes sense for the reasons I mentioned above...

 


 

Agreed - but those, in my opinion, are the exception, not the rule.  For those exceptional designs, you're going to need to resort to some exceptional flows.  That's always been the case for those pushing the bleeding edge.  But why burden the rest of us with bottom-up flows and methodologies when only the top 1% should be doing it?

 

To the OP - sorry I've hijacked your thread.  I'll leave it alone...

 

Regards,

 

Mark

Adventurer
5,097 Views
Registered: 03-31-2017

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

> To the OP - sorry I've hijacked your thread.  I'll leave it alone...

 

No apology necessary - what with the death of Usenet, it's difficult to get any real insight into this stuff, and forums are a very poor substitute.

Scholar
2,580 Views
Registered: 04-26-2012

Re: xci or dcp for Xilinx IP? Is AR #69690 complete nonsense?

@eml "However, it seems to me that many, if not all, of these reasons are entirely bogus."

 

The following Xilinx blog entry (more of the same pablum as the AR) also covers this topic:

  https://forums.xilinx.com/t5/Vivado-Expert-Series-Blog/Support-for-IP-using-quot-Standalone-quot-dcp-Instead-of-xci/ba-p/793092

 

The above also argues that checking zip archives (aka .xcix files) into source control is a good thing [face palm].

 

-Brian

 

p.s. I find that much of the IP-related pain in Vivado version migrations is due to Xilinx's narcissistic decision to only support the latest version of their IP in a given Vivado release, making it impossible to rebuild designs in a new version of Vivado unless the old IP output products have been archived somewhere.

  This is even worse when using IP {dis}Integrator, wherein one cannot make any changes to an existing BD until first updating all the Xilinx IP therein.
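For what it's worth, that forced upgrade step is the usual report_ip_status / upgrade_ip dance; a minimal sketch (whether you want to run it unconditionally is exactly the complaint above):

  # In the newer Vivado, after opening the project:
  report_ip_status               ;# shows which IP are locked to older versions
  upgrade_ip [get_ips *]         ;# upgrade everything so the BD becomes editable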