05-31-2018 11:29 PM - edited 06-08-2018 11:35 AM
I use Vivado 2017.4, for xc7z030fbg484-2.
From "UG953 (v2017.4) December 20, 2017" pages 166-167
I get this info about MACC_MACRO:
But when I tested it, I got:
1 - CREG == PREG == 1;
2 - AREG == BREG == CREG == PREG == 1;
3 - AREG == BREG == CREG == PREG == MREG == 1;
4 - AREG == BREG == 2; CREG == PREG == MREG == 1;
These results come from synthesis, and they match the simulation results.
The files "top.vhd" and "dsp48.vhd" are attached.
Is it a mistake in the datasheet, or in my configuration?
As I understand it, there is no way to disable CREG (or to enable it, going by the datasheet)?
My simulation results:
In addition to that, there is misleading info about the width of the output (P port).
There is an attribute that determines the width:
But from the block schematic and the port description, we see:
Which information is correct?
Is the output port width determined by the input port widths,
or is it determined by the attribute?
06-06-2018 12:27 PM
I don't see any difference in the latency measurements. The CREG is not used for MACC, so it doesn't matter if you enable it or not.
06-08-2018 12:26 AM - edited 06-08-2018 12:27 AM
Did you check it yourself, or are you saying that based on my info?
Differences between my project and the datasheet:
1 - MREG, PREG
2 - MREG
3 - all the same
4 - all the same
What do you mean by "not used"?
CREG is always enabled (== 1), but there is no info about that in the datasheet.
06-08-2018 04:34 AM
I don't see anywhere in your tests where you tested MREG = 1 with all others =0. I also don't see the case in your tests where MREG=PREG=1, with the others=0.
CREG should not matter, since the logical setup for a MAC in a single DSP48 is that A and B feed the multiplier and the P output feeds back through the X or Z multiplexer to the accumulator. I think either mux should work, but it's been a while since I coded a MAC. Even if you were to build a multi-DSP48 MAC, the CREG delay would be concurrent with one of the other register delays and shouldn't add to the latency.
06-08-2018 04:57 AM
To "see" that, you can synthesize the design with my code, or check it any other way.
I created 4 MACC_MACRO instantiations, one for each possible value of the "Latency" parameter.
According to the datasheet, if I use "Latency" = 1, I should get a DSP48 with MREG == 1.
Instead of that, I got a DSP48 with PREG == 1 and MREG == 0 (let's forget about CREG for now).
Clearly not what is written in the datasheet.
The same goes for "Latency" == 2: according to the datasheet MREG should be == 1,
but after synthesis I got MREG == 0.
Clearly not what is written in the datasheet.
I don't really care about latency as you describe it; all I care about is which registers are used inside the DSP48.
When I synthesize or simulate my design, I see that CREG == 1, but according to the datasheet it should be 0.
As I understand it, if a register is not listed, then it should be 0.
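For readers who want to repeat the experiment described above, here is a minimal sketch of the four-instance test. It is not the attached "top.vhd" (which is not reproduced in this thread); the entity name macc_latency_sweep, the packed output bus p and the chosen widths are assumptions made for illustration. The generic and port names follow the MACC_MACRO template in UG953.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;
library unimacro;
use unimacro.vcomponents.all;

-- Hypothetical test bench top: one MACC_MACRO per LATENCY value (1 to 4).
entity macc_latency_sweep is
  port (
    clk       : in  std_logic;
    rst       : in  std_logic;
    ce        : in  std_logic;
    load      : in  std_logic;
    a         : in  std_logic_vector(24 downto 0);
    b         : in  std_logic_vector(17 downto 0);
    load_data : in  std_logic_vector(47 downto 0);
    -- Four 48-bit results packed into one bus; bits 47..0 belong to LATENCY = 1.
    p         : out std_logic_vector(4*48 - 1 downto 0)
  );
end entity macc_latency_sweep;

architecture structural of macc_latency_sweep is
begin
  gen_lat : for i in 0 to 3 generate
    macc_i : MACC_MACRO
      generic map (
        DEVICE  => "7SERIES",
        LATENCY => i + 1,   -- swept value: 1, 2, 3, 4
        WIDTH_A => 25,
        WIDTH_B => 18,
        WIDTH_P => 48
      )
      port map (
        P         => p(48*i + 47 downto 48*i),
        A         => a,
        ADDSUB    => '1',   -- high selects add
        B         => b,
        CARRYIN   => '0',
        CE        => ce,
        CLK       => clk,
        LOAD      => load,
        LOAD_DATA => load_data,
        RST       => rst
      );
  end generate gen_lat;
end architecture structural;
```

After synthesis, the AREG, BREG, CREG, MREG and PREG properties of each resulting DSP48E1 cell can be compared against the table in UG953.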
06-08-2018 09:52 AM
When I first read your post, I thought you were questioning the latency of the macro versus the latency you asked for. I'm not sure why you care which registers are used.
Looking at the data sheet, it does seem to imply that a latency of 1 will produce a MAC that uses the MREG. PREG=1 or AREG=BREG=1 will also produce a one clock latency. AREG=BREG=1 is probably bad for timing, so I don't expect the MAC will ever be built this way. If you ask the tool to produce a one cycle latency MAC and it does that, it's done its job. Should the data sheet list every possible register combination that may be used? I suppose it could. Maybe Xilinx will update it someday.
06-08-2018 11:41 AM
I had a task to write a universal DSP48 entity, without using any IP cores, where we could choose which registers are used, in any possible configuration.
I tried to play with the macro and got these results, so I decided to post on the forum. As I understand it, it's better to use a full DSP48 instantiation.
To be honest, I didn't even think about "Latency" representing real-life latency, because this macro works as a MAC instruction only when the "Latency" parameter is equal to 2.
My simulation results:
If I have found a mistake in the datasheet, is there a proper procedure for reporting it to Xilinx?
06-10-2018 11:20 AM
There are times a Xilinx employee will see a post and enter a change request. I don't know how to make that happen.
Instead of using the macro, why not just write a MAC in RTL? Both the multiply and addition operators are synthesizable. The multiplication will almost always use the DSP. The synthesis tool may be smart enough to use the adder in the DSP. The code will be portable between different generations of parts and even between vendors.
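A minimal sketch of what such an RTL MAC might look like is below. The entity name mac_rtl, the port widths (chosen to match the 25x18 multiplier and 48-bit accumulator of a DSP48E1) and the register names are assumptions for illustration. Whether the tool packs the registers and the adder into a single DSP slice depends on the synthesis tool and version.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical inferred multiply-accumulate: p <= p + a*b.
entity mac_rtl is
  port (
    clk : in  std_logic;
    rst : in  std_logic;                 -- synchronous, clears all stages
    a   : in  signed(24 downto 0);
    b   : in  signed(17 downto 0);
    p   : out signed(47 downto 0)
  );
end entity mac_rtl;

architecture rtl of mac_rtl is
  signal a_r : signed(24 downto 0) := (others => '0');  -- candidate for AREG
  signal b_r : signed(17 downto 0) := (others => '0');  -- candidate for BREG
  signal m_r : signed(42 downto 0) := (others => '0');  -- candidate for MREG
  signal p_r : signed(47 downto 0) := (others => '0');  -- candidate for PREG
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        a_r <= (others => '0');
        b_r <= (others => '0');
        m_r <= (others => '0');
        p_r <= (others => '0');
      else
        a_r <= a;
        b_r <= b;
        m_r <= a_r * b_r;     -- 25 x 18 bits gives a 43-bit product
        p_r <= p_r + m_r;     -- numeric_std sign-extends m_r to 48 bits
      end if;
    end if;
  end process;

  p <= p_r;
end architecture rtl;
```

The comments only name the DSP48 registers each signal could map to; the tool is free to implement them differently.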
06-10-2018 10:19 PM - edited 06-10-2018 10:20 PM
I usually write the MAC as "A*B+C", and it works, but I have a problem using the internal registers inside the DSP48. If I try to use MREG, AREG, BREG, PREG or CREG, the compiler uses external resources, places several DSP slices, or does something else.
I would like to find a 100% working solution, where I can use any internal register I like and be sure that it will compile exactly the way I want.
06-11-2018 04:29 AM
You can always instantiate the DSP48 slice and set it up any way you want to. This usually isn't pretty, but you can get what you want.
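As a rough sketch of that approach, the wrapper below instantiates a DSP48E1 from the UNISIM library and exposes the register choices as generics. The entity name dsp48_mult_add and the G_* generics are hypothetical; the primitive's generic and port names are taken from the DSP48E1 template in UG953 and should be checked against the library guide for your Vivado version. It computes P = A*B + C and does not align the C path with the multiplier path, so the relative latencies depend on the chosen generics.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;

-- Hypothetical wrapper: P = A*B + C with user-selected pipeline registers.
entity dsp48_mult_add is
  generic (
    G_AREG : integer range 0 to 2 := 1;
    G_BREG : integer range 0 to 2 := 1;
    G_CREG : integer range 0 to 1 := 1;
    G_MREG : integer range 0 to 1 := 1;
    G_PREG : integer range 0 to 1 := 1
  );
  port (
    clk : in  std_logic;
    rst : in  std_logic;
    a   : in  std_logic_vector(24 downto 0);
    b   : in  std_logic_vector(17 downto 0);
    c   : in  std_logic_vector(47 downto 0);
    p   : out std_logic_vector(47 downto 0)
  );
end entity dsp48_mult_add;

architecture structural of dsp48_mult_add is
  constant ZEROS : std_logic_vector(47 downto 0) := (others => '0');
  signal a_ext   : std_logic_vector(29 downto 0);
begin
  -- The A port is 30 bits wide; the multiplier uses A(24 downto 0).
  a_ext(24 downto 0)  <= a;
  a_ext(29 downto 25) <= (others => a(24));   -- sign extension

  dsp_i : DSP48E1
    generic map (
      A_INPUT => "DIRECT", B_INPUT => "DIRECT",
      USE_DPORT => FALSE, USE_MULT => "MULTIPLY", USE_SIMD => "ONE48",
      AREG => G_AREG, ACASCREG => G_AREG,
      BREG => G_BREG, BCASCREG => G_BREG,
      CREG => G_CREG, MREG => G_MREG, PREG => G_PREG,
      -- Control inputs are constants here, so their registers are bypassed.
      ALUMODEREG => 0, CARRYINREG => 0, CARRYINSELREG => 0,
      INMODEREG => 0, OPMODEREG => 0
    )
    port map (
      CLK => clk,
      A => a_ext, B => b, C => c, D => ZEROS(24 downto 0),
      P => p,
      OPMODE     => "0110101",   -- Z = C, Y = M, X = M
      ALUMODE    => "0000",      -- Z + X + Y + CIN
      INMODE     => "00000",
      CARRYINSEL => "000", CARRYIN => '0',
      -- Cascade inputs are unused in this single-slice sketch.
      ACIN => ZEROS(29 downto 0), BCIN => ZEROS(17 downto 0), PCIN => ZEROS,
      CARRYCASCIN => '0', MULTSIGNIN => '0',
      -- Data-path clock enables always on; unused ones tied off.
      CEA1 => '1', CEA2 => '1', CEB1 => '1', CEB2 => '1', CEC => '1',
      CEM => '1', CEP => '1', CEAD => '0', CED => '0',
      CEALUMODE => '0', CECARRYIN => '0', CECTRL => '0', CEINMODE => '0',
      RSTA => rst, RSTB => rst, RSTC => rst, RSTM => rst, RSTP => rst,
      RSTD => '0', RSTALLCARRYIN => '0', RSTALUMODE => '0',
      RSTCTRL => '0', RSTINMODE => '0',
      -- Unused outputs.
      ACOUT => open, BCOUT => open, PCOUT => open,
      CARRYCASCOUT => open, MULTSIGNOUT => open, CARRYOUT => open,
      OVERFLOW => open, UNDERFLOW => open,
      PATTERNDETECT => open, PATTERNBDETECT => open
    );
end architecture structural;
```

For an accumulator instead of A*B + C, the Z multiplexer would select P rather than C (OPMODE "0100101"), which is the feedback path discussed earlier in the thread.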
06-12-2018 04:07 AM
Yep, that was my initial plan, but I wanted to research a little bit.
Also, I have found this article. It says that it's possible to use the registers from VHDL code.
Haven't fully tested that yet.
Another question if I may.
Do people instantiate the DSP48 slice if they want to use the internal registers?
Maybe everyone is using IP cores now and doesn't even think about it?
06-12-2018 05:12 AM
It depends. If I'm writing code to be portable or readable, then I usually just use the * and + operators and make sure that there are some pipeline registers around the arithmetic. The synthesis tool will almost always try to put the multiplication in the DSP. The adder is more of a crap shoot. The adder might end up in the DSP if your code looks enough like something the tool recognizes. Don't bet on it. There are attributes that can help utilize the DSP. I don't play with them much.
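A minimal sketch of the "pipeline registers around the arithmetic" style, combined with a synthesis attribute, is below. The entity name mult_add_pipe and the signal names are made up for illustration. The use_dsp attribute is described in UG901; some older Vivado releases spell it use_dsp48, so treat the exact name as an assumption to check for your version. It is a request to the tool, not a guarantee.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical inferred A*B + C with one register per DSP48 stage.
entity mult_add_pipe is
  port (
    clk : in  std_logic;
    a   : in  signed(24 downto 0);
    b   : in  signed(17 downto 0);
    c   : in  signed(47 downto 0);
    p   : out signed(47 downto 0)
  );
end entity mult_add_pipe;

architecture rtl of mult_add_pipe is
  -- Ask synthesis to map the arithmetic in this architecture to DSP slices.
  attribute use_dsp : string;
  attribute use_dsp of rtl : architecture is "yes";

  signal a_r : signed(24 downto 0) := (others => '0');
  signal b_r : signed(17 downto 0) := (others => '0');
  signal c_r : signed(47 downto 0) := (others => '0');
  signal m_r : signed(42 downto 0) := (others => '0');
  signal p_r : signed(47 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      a_r <= a;             -- input stage (candidate for AREG)
      b_r <= b;             -- input stage (candidate for BREG)
      c_r <= c;             -- single C stage; the DSP48E1 has only one CREG
      m_r <= a_r * b_r;     -- multiplier stage (candidate for MREG)
      p_r <= c_r + m_r;     -- adder stage (candidate for PREG)
    end if;
  end process;

  p <= p_r;
end architecture rtl;
```

Note that c reaches the adder one cycle earlier than the matching a and b in this sketch. Adding a second register on c would align the data, but that extra register could not be absorbed into the slice.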
I'll use a core for something complicated or if it fits in well with what I want to do. I use a lot of FIR filter cores but someday I want to try the FIR code @jmcclusk posted if I can ever find it again.
If you really want access to all of the DSP's features, you need to instantiate. I wrote a programmable resampler for a Spartan 6 a few years back and needed to instantiate. It was good practice. Might even get the guy I wrote it for to buy me lunch some day.