07-05-2018 08:35 AM
I'm designing a systolic FIR in HLS, but Vivado doesn't seem to want to use the DSP ACIN/ACOUT cascade for the sample shift register.
What I expect for an N tap FIR with constant coefficients <18b:
* Samples are <28b
* Coefficients are <18b and constant
* N DSPs (1 per tap)
* First & last DSPs with latency 4 (A1, A2, D, AD, P regs; no M)
* All other DSPs with latency 5 (A1, A2, D, AD, M, P regs)
* Samples feed into A of the first DSP and propagate through A1, A2, and ACOUT to the next DSP
* For symmetric cases, the second sample feeds into D from an external SRL
* Coefficients in B
This should result in a latency=N*5-2, II=1 using N DSPs and a fairly small number of LUT/FFs.
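To make the intended dataflow concrete, here is a minimal behavioral sketch in plain C++ (not HLS code, and not something that forces DSP inference). It assumes a simplified tap with only two sample registers (standing in for A1/A2 feeding ACOUT) and one accumulator register (standing in for P feeding PCOUT); the D, AD, and M registers and the symmetric pre-adder are left out, so the latency of this model is N+1 rather than the N*5-2 figure above. The names `SystolicFir` and `direct_fir` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified systolic FIR model: per tap, two sample registers (a1, a2)
// and one partial-sum register (p). Assumption: no D/AD/M registers.
class SystolicFir {
public:
    explicit SystolicFir(std::vector<int32_t> coeffs)
        : c_(std::move(coeffs)),
          a1_(c_.size(), 0), a2_(c_.size(), 0), p_(c_.size(), 0) {
        assert(!c_.empty());
    }

    // One clock edge: shift in sample x, return the last tap's P register.
    int64_t step(int32_t x) {
        const std::size_t n = c_.size();
        // Walk from the last tap to the first so every tap still sees the
        // pre-edge register values of its upstream neighbour.
        for (std::size_t k = n; k-- > 0;) {
            const int64_t pcin = (k == 0) ? 0 : p_[k - 1];   // PCIN cascade
            p_[k] = pcin + static_cast<int64_t>(c_[k]) * a2_[k];
            a2_[k] = a1_[k];
            a1_[k] = (k == 0) ? x : a2_[k - 1];              // ACIN cascade
        }
        return p_[n - 1];
    }

    // Cycles between a sample entering and its first contribution leaving.
    std::size_t latency() const { return c_.size() + 1; }

private:
    std::vector<int32_t> c_;   // constant coefficients (B input)
    std::vector<int32_t> a1_;  // first sample register per tap
    std::vector<int32_t> a2_;  // second sample register per tap
    std::vector<int64_t> p_;   // partial-sum register per tap
};

// Direct-form reference: y[n] = sum_k c[k] * x[n-k], with x[<0] = 0.
std::vector<int64_t> direct_fir(const std::vector<int32_t>& c,
                                const std::vector<int32_t>& x) {
    std::vector<int64_t> y(x.size(), 0);
    for (std::size_t n = 0; n < x.size(); ++n)
        for (std::size_t k = 0; k < c.size() && k <= n; ++k)
            y[n] += static_cast<int64_t>(c[k]) * x[n - k];
    return y;
}
```

Calling `step()` once per input sample, the value returned on call t should equal `direct_fir` at index t - `latency()` (and zero before that), which is the property the two-registers-per-tap sample chain is there to provide.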
For a 64-tap symmetric filter, I see the correct number of DSPs but nearly 50k FFs instead of ~1k that I expect.
I found a few files that seem to be related:
These seem to define the various functions the DSP can synthesize to, but none include using any of the carry paths. I tried modifying these files to include the 3 new functions (fir_first, fir_mid, and fir_last) and calling _ssdm_op_DSP directly, but I get the following error:
ERROR: [TECH 200-102] Failed to evaluate 'dsp48_macro_latency_lookup DSP_Macro dsp': key "ACIN" not known in dictionary in platform 'DefaultPlatform'.
This makes some sense, given that none of the other builtins use the ACIN port, but it did exist in the dsp48e2.json file.
DefaultPlatform seems to be defined by the set of files in <VIVADO>/2018.2/common/config/CoreList.xml, which includes DSP_Macro but doesn't provide any clues beyond that.
Has anyone else attempted to dig this far?
DefaultPlatform does seem to be data-driven. Any advice on where or how to add the missing keys?
07-05-2018 10:17 AM
Can't open the file on my phone,
but are you using the FIR Compiler?
07-05-2018 10:30 AM
I tried that first, but the FIR compiler can't do fully-pipelined (II=1) super-sample-rate (4x and 8x samples per clock) FIRs from HLS, which is ultimately what I need to make.
I'm just starting with a simpler II=1, 1 sample/clk case
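For reference, this is roughly what "P samples per clock" has to compute, written as a plain C++ behavioral sketch (an assumption-laden illustration, not HLS code and not the FIR Compiler's implementation). Each call to `step()` stands for one clock cycle that accepts P new samples and produces P outputs, matching what a one-sample-per-clock FIR would produce over P cycles. The class name `SsrFir` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Behavioral model of a super-sample-rate FIR: P samples in and
// P results out per call, with II=1 meaning one call per clock.
template <std::size_t P>
class SsrFir {
public:
    explicit SsrFir(std::vector<int32_t> coeffs)
        : c_(std::move(coeffs)),
          hist_(c_.empty() ? 0 : c_.size() - 1, 0) {
        assert(!c_.empty());
    }

    std::array<int64_t, P> step(const std::array<int32_t, P>& block) {
        const std::size_t n = c_.size();
        // Window = previous N-1 samples (oldest first) + the P new samples.
        std::vector<int32_t> w(hist_);
        w.insert(w.end(), block.begin(), block.end());

        std::array<int64_t, P> y{};
        for (std::size_t j = 0; j < P; ++j)       // one output per lane
            for (std::size_t k = 0; k < n; ++k)   // y[j] = sum_k c[k]*x[j-k]
                y[j] += static_cast<int64_t>(c_[k]) * w[n - 1 + j - k];

        // Keep the newest N-1 samples for the next call.
        hist_.assign(w.end() - static_cast<std::ptrdiff_t>(n - 1), w.end());
        return y;
    }

private:
    std::vector<int32_t> c_;     // constant coefficients
    std::vector<int32_t> hist_;  // last N-1 input samples, oldest first
};
```

With coefficients {1, 2, 3} and P=4, feeding the blocks {1, 2, 3, 4} and then {5, 6, 7, 8} gives the same eight outputs as running the samples 1..8 through a serial FIR one at a time.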
07-06-2018 10:42 AM
You say the FIR Compiler can't do what you want?
I wonder why it can't.
Is there something about the interleaving/pipelining that means it can't route as you expect?
Have you drawn out how you would expect the DSP blocks to be implemented and connected?
If you can provide that, we can see if it is actually possible.
07-06-2018 11:01 AM
07-06-2018 01:21 PM - edited 07-06-2018 01:25 PM
Sorry, I can't see anything in that diagram about super sample rate, or the 4 or 8 samples per clock you refer to.
Edit: just had a thought.
Are the tools being clever,
in that they route the design to meet your timing constraints and then stop?
If the device has space, this is what they end up with.
This is from a while back;
it might be a way of forcing the usage
07-07-2018 02:18 AM
If you know exactly what you want,
instantiate the DSPs as you want them.
If you want to encourage Vivado to use DSPs, use the last link.
If you want to meet timing/layout, then let the tools run.
The tools work harder the longer they run,
so it's probably just that the tools meet your requirements doing things simply and then stop.
You see this a lot in these big chips:
one can add lots to a design, and the usage seems not to change.
Personally I'd either not worry and just make a note,
or, more likely, instantiate and get exactly what I want,
and keep this for future use.