Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Community Forums
- :
- Forums
- :
- Software Development and Acceleration
- :
- HLS
- :
- Reciprocal - Divider Sizing

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

Highlighted
I need to calculate the reciprocal of a number and the fixed point functions in the HLS math library doesn't have supported for that directly. Instead I have a relatively large divide to calculate my reciprocal. The denominator has 13 integer bits and 16 fractional bits. The first problem is that the fractional precision of the numerator dictates the fractional precision of my output. I need 32 fractional bits so my 1 is represented with 1 integer bit and 32 fractional bits. At the same time the fractional precision of my divider adds to the numerator's integer bits. As a result HLS ends up generating a relatively larger divider that is consuming a large number of resources. This would be necessary if the denominator could be less than one, but I know from context won't happen. Is there anyway I can get a smaller divider? Perhaps's there's an alternative to calculate the reciprocal that is smaller?

jprice

Scholar

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-12-2016 04:20 PM

4,874 Views

Registered:
01-28-2014

1 Solution

Accepted Solutions

Highlighted

u4223374

Advisor

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-13-2016 06:41 AM - edited 10-13-2016 07:02 AM

8,642 Views

Registered:
04-26-2015

Straight long division is what I normally use.

Depending on the resource and timing requirements (and bit-width) you might be able to get away with just using a LUT (having only a single input makes reciprocals far more practical than division). For 12-bit input, 12-bit output you'd get away with three 18K block RAMs, (for 4096*12-bit total) which is pretty affordable in most systems - and because they're dual-ported you can potentially do two divides per clock cycle (although persuading HLS to do that requires care). However, RAM requirements go up very quickly as the bit width rises; 16-bit input/output will take 64 18K RAMs.

For the larger widths, there are methods where you use a low-resolution LUT to get an approximation, then refine it with Newton's Method or something similar. Or you can have a shallower, wider LUT that gives you a couple of Taylor series terms for each point to allow accurate interpolation. These approaches make sense where you're dealing with large-ish numbers and you don't have enough clock cycles to do the usual 1-bit-per-cycle process, but they tend to be fairly DSP and RAM-intensive.

Edit: I see that your number won't be suitable for a pure lookup table. Realistically, if you can afford the 32 cycles it'll take to do it with long division, that'll be by far the easiest approach.

5 Replies

Highlighted
##

Jump to solution
I've found that you can often write your own division algorithm in C that works as well as - or better than - the built-in one (generally "better than" occurs when you can make assumptions that the tool cannot). That may well be the best way to do this one; a basic reciprocal function is essentially a division where lots of assumptions are available.

u4223374

Advisor

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-12-2016 05:06 PM

4,862 Views

Registered:
04-26-2015

Re: Reciprocal - Divider Sizing

Highlighted
##

Jump to solution
Yea this is the best I can think of as well. I think you can implement binary long division but skip many of the bits since you know the remainder won't be greater than the denominator for some number of iterations.Are you aware of a more efficient algorithm?

jprice

Scholar

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-13-2016 05:55 AM

4,838 Views

Registered:
01-28-2014

Re: Reciprocal - Divider Sizing

Highlighted

u4223374

Advisor

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-13-2016 06:41 AM - edited 10-13-2016 07:02 AM

8,643 Views

Registered:
04-26-2015

Straight long division is what I normally use.

Depending on the resource and timing requirements (and bit-width) you might be able to get away with just using a LUT (having only a single input makes reciprocals far more practical than division). For 12-bit input, 12-bit output you'd get away with three 18K block RAMs, (for 4096*12-bit total) which is pretty affordable in most systems - and because they're dual-ported you can potentially do two divides per clock cycle (although persuading HLS to do that requires care). However, RAM requirements go up very quickly as the bit width rises; 16-bit input/output will take 64 18K RAMs.

For the larger widths, there are methods where you use a low-resolution LUT to get an approximation, then refine it with Newton's Method or something similar. Or you can have a shallower, wider LUT that gives you a couple of Taylor series terms for each point to allow accurate interpolation. These approaches make sense where you're dealing with large-ish numbers and you don't have enough clock cycles to do the usual 1-bit-per-cycle process, but they tend to be fairly DSP and RAM-intensive.

Edit: I see that your number won't be suitable for a pure lookup table. Realistically, if you can afford the 32 cycles it'll take to do it with long division, that'll be by far the easiest approach.

Highlighted
##

Jump to solution

muzaffer

Teacher

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-13-2016 06:33 PM - edited 10-13-2016 06:35 PM

4,817 Views

Registered:
03-31-2012

Re: Reciprocal - Divider Sizing

you can implement SRT as described here: http://users.minet.uni-jena.de/~nez/rechnerarithmetik_5/fdiv_bug/intel_white11.pdf

Just make sure to include all the lut entries properly ;-)

Also these guys know something about division too: http://i.stanford.edu/pub/cstr/reports/csl/tr/95/675/CSL-TR-95-675.pdf

-K

- Please mark the Answer as "Accept as solution" if information provided is helpful.

Give Kudos to a post which you think is helpful and reply oriented.

Give Kudos to a post which you think is helpful and reply oriented.

Highlighted
##

Jump to solution

jprice

Scholar

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

10-14-2016 05:08 AM

4,799 Views

Registered:
01-28-2014

Re: Reciprocal - Divider Sizing

Thanks guys! That was useful information (plus I know what the Pentium floating point bug was now, that was a very interesting read).

Moral of the story: Divides are tricky, but we all knew that :).