Observer maxdz8
Registered: ‎01-08-2018

How to add 64 bits properly?

I cannot believe I have come to this, but after trying all sorts of variations I could figure out, none of them satisfies me.

DSP48 looks like a good candidate, but the P register mandates staggering of the partial sums, with the higher bits coming one clock late. AB register 2 comes in handy, but I still get the results staggered. Maybe I should just buffer them in logic again, but that doesn't feel good. In my current design I know I can meet timing even without P registers, but I used to get plenty of DRC violations. And OFC the dedicated carry chains would end up completely unused.

Carry chains are awesome. They just work and they are easy to use. It turns out they're currently what stops me from raising the clock rate. While I am still convinced slow and wide would be a better idea, I have architected the whole system to reach higher clocks. I was thinking of between 300 and 350 MHz on my 7Z020; I am currently playing it easy at 100 MHz.

So I'm currently considering a hybrid approach where the DSP does the low bits, then feeding the carry to a 16-bit chain. This should get me a nice 'aligned' output. Or maybe I could do the carry chain first and feed its carry into DSP.
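
The arithmetic behind this hybrid split can be sketched in software. The following is only a behavioural model in Python, not HDL: the function name `hybrid_add64` is hypothetical, and how the carry would actually be taken out of the DSP slice and routed into the fabric carry chain is not modelled.

```python
MASK48 = (1 << 48) - 1
MASK16 = (1 << 16) - 1

def hybrid_add64(a, b):
    """Add two 64-bit values as a 48-bit low part plus a 16-bit high part."""
    # Low 48 bits: the part that would map onto the DSP48 adder.
    low = (a & MASK48) + (b & MASK48)
    carry = low >> 48  # carry out of bit 47, either 0 or 1
    # High 16 bits: the part that would map onto a 16-bit carry chain,
    # with the DSP carry fed in as carry-in.
    high = ((a >> 48) & MASK16) + ((b >> 48) & MASK16) + carry
    # Reassemble; any carry out of bit 63 is discarded (wrap-around).
    return ((high & MASK16) << 48) | (low & MASK48)
```

The reverse ordering mentioned above (carry chain first, DSP second) would be the same arithmetic with the split point at bit 16 instead of bit 48.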

But the whole point is, I see several options and none appears clearly superior to the others. At the same time, I'm still not sure when to prefer one over the other. Suggestions welcome.


Accepted Solutions
Participant mwerner2000
Registered: ‎06-05-2015

Re: How to add 64 bits properly?

Dear maxdz8,

 

well, this is actually a good question. In my experience, adding with the DSP does not provide a great benefit in terms of saving LUT resources. Ofc you save a little bit, but the carry chains are quite efficient as well. The only thing the DSP is really good at is multiplication, and its adder is mostly used in multiply-and-add operations. Since you would have to use a hybrid approach anyway, because the DSP can only output 48 bits, I would recommend sticking with the pure LUT implementation.

Now if you want to reach 300+ MHz, which imho is challenging but not impossible, your only chance is to pipeline the adder: add registers halfway to shorten the logic path. I would recommend setting the timing constraints to the desired frequency from the start, because that will give you an idea of how to achieve the frequency. 100 MHz is for the weak ^^ because even a Spartan-3 could handle that relatively easily.
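
As a rough illustration of what "registers halfway" means, here is a behavioural Python sketch (not HDL) of a 64-bit adder split into two 32-bit stages. The class name `PipelinedAdder64`, the 32/32 split, and the `clock` method are assumptions made for the example; a real design would also register the output and would pick the split from the timing report.

```python
MASK32 = (1 << 32) - 1

class PipelinedAdder64:
    """Two-stage 64-bit adder: 32-bit low add, then 32-bit high add."""

    def __init__(self):
        # Pipeline register between the two stages (None means empty).
        self.stage1 = None

    def clock(self, a=None, b=None):
        """Advance one clock; return the sum of the operands given on the previous call."""
        # Stage 2: finish the operation captured on the previous clock.
        result = None
        if self.stage1 is not None:
            low, carry, a_hi, b_hi = self.stage1
            high = (a_hi + b_hi + carry) & MASK32
            result = (high << 32) | low
        # Stage 1: add the low halves, then register the low sum,
        # its carry, and the untouched high halves.
        if a is None:
            self.stage1 = None
        else:
            low_sum = (a & MASK32) + (b & MASK32)
            self.stage1 = (low_sum & MASK32, low_sum >> 32,
                           (a >> 32) & MASK32, (b >> 32) & MASK32)
        return result
```

Each stage only has to propagate a carry across 32 bits, so the longest combinational path is roughly half that of the flat 64-bit adder. A new pair of operands can still be accepted on every call, so throughput is unchanged and only latency grows by one clock.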

Does your design have specific requirements concerning the latency of the operation, or why are you so concerned about the additional clock cycles?

 

Best regards,

Martin

3 Replies
Observer maxdz8
Registered: ‎01-08-2018

Re: How to add 64 bits properly?

Thank you. I'm unsure I want to invest more in this specific experiment but I am interested in acquiring a rule of thumb so discussion follows.

The main challenge so far was fitting the device. It is just the obvious learning exercise in cryptomining, so there's basically no requirement on latency at all.

The first versions were DSP-less and did not fit. Right now I use about 50% of everything. Besides adding DSPs, I basically reviewed the whole thing, and there have been algorithmic changes as well. Maybe I can fit without DSPs now, but I'm not confident. There are some >130% congestion spots in this implementation, but they're not stable.

It is a fully synchronous design stemming from the 'create AXI device' wizard. I have read a bit about timing constraints, but I have not retained them as my day job has become very demanding. In general, I am fairly satisfied with the 'implicit constraint' of stabilizing before the next clock.

To my surprise, my slowest adder has a delay of just 5.5 ns in this implementation, so indeed one clock of pipelining looks like it could be enough.
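
A back-of-the-envelope check of that estimate, as a small Python sketch. The helper names are hypothetical, and the model is deliberately crude: it assumes the 5.5 ns path can be cut into equal pieces and ignores register setup, clock-to-out, and any routing changes after re-implementation.

```python
import math

def max_clock_mhz(delay_ns):
    """Highest clock (MHz) at which a path of delay_ns fits in one period."""
    return 1000.0 / delay_ns

def stages_needed(delay_ns, target_mhz):
    """Number of equal pipeline stages so each stage fits in one clock period."""
    period_ns = 1000.0 / target_mhz
    return math.ceil(delay_ns / period_ns)

# Figures from this thread: a 5.5 ns adder, targets of 100, 300 and 350 MHz.
for target in (100, 300, 350):
    print(target, "MHz ->", stages_needed(5.5, target), "stage(s)")
```

Under these assumptions the unpipelined adder tops out at about 182 MHz, and both 300 MHz and 350 MHz need two stages, i.e. one extra register level. At 350 MHz the margin is thin (two stages of 2.75 ns each against a 2.86 ns period), so real register overhead could push that case to a third stage.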

I am a bit concerned that my current design does not employ any interleaving, so keeping the pipelines full will be problematic. More than fixing this design, I am interested in understanding a rule of thumb. I will keep it in mind.

Participant mwerner2000
Registered: ‎06-05-2015

Re: How to add 64 bits properly?

Dear maxdz8,

 

can you give us some hints on what device you are using, how many instances of the adder you are trying to fit, and how large each instance is? Maybe there is something wrong with the way you are describing the adder, or maybe you are just trying to squeeze too much logic into that poor FPGA ;)

 

Best regards,

Martin
