12-23-2019 05:35 AM - edited 12-23-2019 11:13 AM
I have to do a multiplication operation of two 1024-bit unsigned integers comming from serial communication (RS232) for the target board of BASYS 2. It seems that simply writing a verilog code multiplying these two numbers by (*) does not synthesize, or more precisely takes very long time to synthesize, at the end I think the resources of my board ( DSP) will not be enough for direct multiplication. What simple method/topology I can use to do such a big multiplication operation for varying inputs at the correct timing? And if I am doing the multiplication operation for smaller numbers at my serial communication system in one STATE, will the new method take more than one state? How will it perform in terms of timing when I am sending my outputs as character by character to the Putty screen ( 8-bit each character)
Edit: Thank you all for your help! untill now I have tried direct multiplication and fully serial add- and -shift multiplicaitons and they did not work.
This is my target device resources:
https://reference.digilentinc.com/_media/basys3:basys3_ss.pdf, it seems that I do not have even
DSP in my FPGA, and my tries to implement use more than %700 ratio of the area!
I think that I need a super-compressing method to do this kind of multiplication,
do you think that I can do it with any other algorithim or even using BRAM?
12-23-2019 05:41 AM
Two options I know of:
12-23-2019 05:49 AM
If you have two number so 1024 bits in , how many bits out are you expecting ?
answer 2048 , think about how much logic that is going to take as a single adder. ..
look up booths multipliers,
You could try tw techniques,
a) serial multiplication, e.g.
b) look at splitting the multiplicatoin into parts,
12-23-2019 06:32 AM
please use Vivado HLS to write a multiplier IP. you can limit resource by dicrective ALLOCATION, detail at to ug902 ch1 "optimize the design"--"optimize for area".
12-24-2019 08:21 AM
Yes you can multiply two 1024-bit numbers in XC3S100E, but please keep in mind that it is one of the smallest FPGAs. If I looked correctly (Table 9 of https://www.xilinx.com/support/documentation/data_sheets/ds312.pdf) it contains about 1920 flip-flops. So if you want to multiply two 1024 bit numbers you will need to store them in RAM (preferably BlockRAM since number of LUT is also limited).
As a prove, it can be done, I propose to use clasic pen and paper multiplication (http://www.klever.net/spilikin/arithmetikin/multiply#.XgI2lFVKgaw ). To make it work in FPGA few modification should be done.
First, since Spartan 3 multipliers support 18x18 bits, I would start by splitting 1024bit wide number into 18bit digits.
Second modification is that I would not store all results of multiplications but only one partial sum.
The proposed solution will need 4096bit memory (2x 1024-bit operands, 1x 2048 partial sum result), one 18x18 multiplier, one 36bit adder, 2 6bit counter for counting the position of digit and some FSM control logic (There you need to solve issue of cary for addition, so possibly another 6bit counter).
Now, lets compute expected latency of proposed algorithm, first we need to know the number of digits in each operand (1024/18) which is 57. So for every digit of the second operand, we will need to perform 57 multiplication and addition. Since there is 57 digits in the second operand as well, 57*57 MAC operation will be necessary.
12-24-2019 08:28 AM