Re: ap_fixed details

cyviz

Adventurer

03-26-2016 04:27 PM

I have a few questions regarding the ap_fixed datatype that I can't find answers to.

- Can ap_fixed(M,0) ever be negative? Is it an invalid but allowed datatype, or is the sign bit explicit or ignored?

- Is ap_fixed(N,1) correct to cover the range [-1, 1)?

- What ap_fixed(W,I) datatype is the lossless PRODUCT of any ap_fixed(a,b)*ap_fixed(c,d)?

- What ap_fixed(W,I) datatype is the lossless SUM of any ap_fixed(a,b)+ap_fixed(c,d)?

1 Solution

Accepted Solutions

muzaffer

Teacher

11-03-2017 03:35 PM

I just found this old thread and wanted to contribute a solution to it which I didn't have at the time:

The traits file in %XILINX/include/utils/x_hls_traits.h has the solution to this issue. One can define a trait based on types and use them like this:

typedef ap_fixed<...> foo;

typedef ap_fixed<...> bar;

typedef typename hls::x_traits<foo, bar>::MULT_T fooXbar;

This is how multiplication etc operators can expand their outputs based on the size of the input variable.
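As a rough illustration of the idea, here is a minimal, self-contained sketch of such a trait in plain C++11. It does not use the Xilinx headers: `fx_format`, `fx_traits`, `mult_t` and `add_t` are hypothetical names invented for this example, and the width rules are the usual lossless fixed-point rules rather than anything read out of x_hls_traits.h.

```cpp
#include <cassert>

// Hypothetical stand-in for ap_fixed<W, I>: it carries only the format
// (W = total bits, I = integer bits including the sign bit), no value.
template <int W, int I>
struct fx_format {
    static constexpr int width = W;
    static constexpr int int_bits = I;
    static constexpr int frac_bits = W - I;
};

constexpr int max_of(int x, int y) { return x > y ? x : y; }

// Hypothetical trait in the spirit of hls::x_traits<A, B>.
template <typename A, typename B>
struct fx_traits {
    // Lossless product: total widths add, integer bits add.
    using mult_t = fx_format<A::width + B::width, A::int_bits + B::int_bits>;

    // Lossless sum: one extra integer bit, and the finer of the two fractions.
    static constexpr int sum_int = max_of(A::int_bits, B::int_bits) + 1;
    static constexpr int sum_frac = max_of(A::frac_bits, B::frac_bits);
    using add_t = fx_format<sum_int + sum_frac, sum_int>;
};

using foo = fx_format<16, 4>;                // 4 integer bits, 12 fractional bits
using bar = fx_format<12, 2>;                // 2 integer bits, 10 fractional bits
using fooXbar = fx_traits<foo, bar>::mult_t; // fx_format<28, 6>
using fooPbar = fx_traits<foo, bar>::add_t;  // fx_format<17, 5>

static_assert(fooXbar::width == 28 && fooXbar::int_bits == 6, "product format");
static_assert(fooPbar::width == 17 && fooPbar::int_bits == 5, "sum format");
```

With the real headers, the same pattern would presumably be applied to actual ap_fixed types through the MULT_T member shown above.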


11 Replies

muzaffer

Teacher

04-04-2016 11:12 PM

1) not sure, probably invalid.

2) Yes.

3) ap_fixed<a+c, b+d>

4) ap_fixed<max(a,c)+1, max(b,d)+1>
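The product rule in 3) can be checked by brute force for small widths. The sketch below assumes the usual model of a signed fixed-point value as a two's-complement raw integer of W bits. The raw product of an a-bit and a c-bit value has (a-b)+(c-d) fractional bits, so "fits in a+c raw bits" is the same statement as "fits in ap_fixed<a+c, b+d>". The helper names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// True if v fits in a two's-complement integer of `bits` bits.
bool fits_signed(std::int64_t v, int bits) {
    const std::int64_t lo = -(std::int64_t{1} << (bits - 1));
    const std::int64_t hi = (std::int64_t{1} << (bits - 1)) - 1;
    return lo <= v && v <= hi;
}

// Multiply every pair of raw signed values of widths wa and wc and
// report whether all raw products fit in out_bits bits.
bool all_products_fit(int wa, int wc, int out_bits) {
    const std::int64_t a_lo = -(std::int64_t{1} << (wa - 1));
    const std::int64_t a_hi = (std::int64_t{1} << (wa - 1)) - 1;
    const std::int64_t c_lo = -(std::int64_t{1} << (wc - 1));
    const std::int64_t c_hi = (std::int64_t{1} << (wc - 1)) - 1;
    for (std::int64_t a = a_lo; a <= a_hi; ++a) {
        for (std::int64_t c = c_lo; c <= c_hi; ++c) {
            if (!fits_signed(a * c, out_bits)) {
                return false;
            }
        }
    }
    return true;
}
```

For example, `all_products_fit(5, 4, 9)` holds while `all_products_fit(5, 4, 8)` does not: the only pair that needs the ninth bit is the most negative value times the most negative value, (-16) * (-8) = 128.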

cyviz

Adventurer

04-04-2016 11:47 PM

Thanks, I think I came to the same conclusion. I wish there was a simpler way to define 3) and 4) though. It gets a bit messy to define a bunch of lossless sums and products of some initial datatypes. It would be nice to be able to typedef a sum or product of two other types.

When it comes to 1), if you sign extend, I suppose you will get some split range. The tools do seem to accept zero integer digits.

u4223374

Advisor

04-05-2016 03:44 AM

It'd be sort of nice to have a whole page in HLS that lists all the variables in the project, which variables each one depends upon, and allows you to enter an equation to define its length. That way (a) it's nice and neat (the equations don't clutter up the code), and (b) if you change something it's easy to see how it'll affect everything else.

Matlab's FPGA tools have a very limited version of this (in that there's a page listing all the variables where you can set widths for each one). Perhaps Vivado HLS could do better?

cyviz

Adventurer

04-05-2016 03:49 AM

Another thought. If you start with two ap_fixed types and do all the math in a ridiculously large ap_fixed<4096,2048> type, will the tools just optimize the unused bits out (preferably before it propagates into the top hierarchy)?

u4223374

Advisor

04-05-2016 06:16 AM

I'm pretty sure that HLS will not. HLS only does really straightforward optimisations, so in "for (int i = 0; i < 10; i++)" it'll recognise that actually "i" can just be an unsigned 4-bit value, not a signed 32-bit one.

Vivado itself (non-HLS) will trim them out during synthesis and implementation, if it can logically guarantee the validity of that approach - but this still leaves you with a huge module right up to that point.

There are, of course, some areas where the tools just can't follow the logic through. If you have a 3-element vector (x,y,z) and you multiply that by L = 1/sqrt(x^2 + y^2 + z^2) then the vector is normalised. Mathematically, it's easy to show that the resulting components must be no more than 1. However, what HLS (and Vivado) sees is that you're multiplying a 32-bit (for example, maybe 16.16 fixed-point) vector by a 32-bit value (16.16 fixed-point) and therefore the result should be 64-bit (32.32 fixed-point). It can't figure out that when x, y, and/or z are large, L is always small - and vice versa - which actually ensures that 1.32 fixed-point output would work perfectly.

cyviz

Adventurer

04-05-2016 07:00 AM

Your example is worth a closer look. I guess the type of x limits your range more than you may be aware of at first sight.

The expression x^2+y^2+z^2 can overflow earlier than expected unless you cast x to some higher width/precision that can handle the sum.

Even sqrt has precision rules, but yes, the tools will not know this function at compile time, so they won't know the type. For sum and product, however, the tools will know.

u4223374

Advisor

04-05-2016 05:49 PM

Yes, you'd want to have the sum (x^2 + y^2 + z^2) as at least a 66-bit (34.32) value in this case, which then gets cut down quite a lot when you do the square root.

It's a problem that will always occur when you have inter-dependent data. For example, when you're summing edges in an image, it's not possible to have every single location giving a large positive edge. You can get a single large positive edge in the X axis by having pixels [0, 255] (i.e. the edge value is 255). However, you can't have two of those in a row because that would imply that the image looks like [0, 255, 510] (and 510 is not a valid 8-bit value). In fact, using a very simple X-axis edge detector (edge[y][x] = image[y][x] - image[y][x-1]) the absolute maximum sum of edges in each line is 9-bit signed (and is just the last element minus the first element).
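The telescoping argument can be seen in a few lines of plain C++. This is only a sketch of the simple edge detector described above, applied to a single image line. The function name is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum of edge[x] = line[x] - line[x-1] over one image line (x = 1 .. n-1).
// The intermediate terms cancel, so the result equals line.back() - line.front()
// and always lies in [-255, 255], i.e. it fits in 9 bits signed.
int sum_of_edges(const std::vector<std::uint8_t>& line) {
    int sum = 0;
    for (std::size_t x = 1; x < line.size(); ++x) {
        sum += static_cast<int>(line[x]) - static_cast<int>(line[x - 1]);
    }
    return sum;
}
```

A line such as {0, 255, 0, 255, 17} contains two maximal positive edges, yet its edge sum is just 17 - 0 = 17.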

Even sum and product can be a little bit challenging. Take the following example:

ap_uint<8> image[640*480];
int accumulator = 0;
for (int i = 0; i < 640*480; i++) {
    accumulator += image[i];
}

A very simple reading of this says that after the first loop iteration the accumulator will be 8-bit. After the second loop iteration it'll be (8-bit + 8-bit => 9-bit). After the third loop iteration it'll be (9-bit + 8-bit => 10-bit). After the fourth loop iteration it'll be (10-bit + 8-bit => 11-bit). And so on, until after the 307200th iteration it'll be 307207-bit. Of course, a more advanced analysis would correctly recognise that log2(640*480) < 19, so a 27-bit accumulator would do nicely. This comes down to how good HLS is at (a) figuring out loop tripcounts, and (b) interpreting what the user is doing.
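The 27-bit figure can be double-checked with a small compile-time calculation. The sketch below is ordinary C++11 rather than HLS code. `ceil_log2`, `kNumPixels` and the other names are chosen for this example only.

```cpp
#include <cassert>
#include <cstdint>

// Smallest k such that 2^k >= n, for n >= 1.
constexpr int ceil_log2(std::uint64_t n) {
    return n <= 1 ? 0 : 1 + ceil_log2((n + 1) / 2);
}

constexpr std::uint64_t kNumPixels = 640 * 480;  // 307200 loop iterations
constexpr int kPixelBits = 8;
constexpr int kAccBits = kPixelBits + ceil_log2(kNumPixels);

static_assert(ceil_log2(kNumPixels) == 19, "log2(640*480) rounds up to 19");
static_assert(kAccBits == 27, "8-bit pixels summed 307200 times need 27 bits");

// Worst case for the accumulator: every pixel is 255.
std::uint64_t worst_case_sum() {
    std::uint64_t acc = 0;
    for (std::uint64_t i = 0; i < kNumPixels; ++i) {
        acc += 255;
    }
    return acc;
}
```

The worst-case sum is 255 * 307200 = 78336000, which is below 2^27 but above 2^26, so 27 unsigned bits are both sufficient and necessary.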

With regards to (a) this would imply a significant change to the functionality of the loop_tripcount pragma. Currently, if you put incorrect values in here, it just means that the latency analysis is wrong (the design will work fine but it'll run for a different time to what you expected). If HLS decides bit-widths based on that pragma, then putting incorrect values in would produce an unworkable design as internal variables would overflow.

cyviz

Adventurer

04-06-2016 12:56 AM

Yes, I see your points regarding the loops. I would not be using the int type like that in HLS. I would define the loop variable "i" as ap_uint<19>, and if the tools allowed, the accumulator should be typedef'd like:

typedef accumulator (i'type)*(image*'type)

But for now, I would have to do it manually:

typedef ap_uint<19+8> accumulator;

u4223374

Advisor

04-06-2016 03:52 AM - edited 04-06-2016 03:58 AM

HLS does seem to actually handle ints as loop variables correctly (so it cuts that one down to 19-bit).

For the other stuff, you can do all the definitions in the preprocessor.

#include <ap_int.h>

#define IMAGE_WIDTH 640
#define IMAGE_HEIGHT 480
#define NUM_PIXELS (IMAGE_WIDTH * IMAGE_HEIGHT)
#define PIXEL_WIDTH 8
// There are a few ways of doing LOG2 in the preprocessor.
#define NUM_PIXELS_LOG2 LOG2(NUM_PIXELS)
#define IMAGE_SUM_WIDTH (PIXEL_WIDTH + NUM_PIXELS_LOG2)
// Edges can be positive or negative so this needs to have a sign bit added
#define PIXEL_EDGE_WIDTH (PIXEL_WIDTH + 1)
#define IMAGE_SQUARE_SUM_WIDTH (PIXEL_WIDTH * 2 + NUM_PIXELS_LOG2)

typedef ap_uint<PIXEL_WIDTH> pixel_t;
typedef ap_uint<IMAGE_SUM_WIDTH> accumulator_t;
typedef ap_uint<IMAGE_SQUARE_SUM_WIDTH> square_accumulator_t;
typedef ap_uint<NUM_PIXELS_LOG2> image_index_t;

void test(pixel_t image[NUM_PIXELS]) {
    accumulator_t accumulator = 0;
    square_accumulator_t square_accumulator = 0;
    for (image_index_t i = 0; i < NUM_PIXELS; i++) {
        pixel_t pixel = image[i];
        accumulator += pixel;
        square_accumulator += pixel * pixel;
    }
}
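One possible way of writing the LOG2 macro, given here only as a sketch: a chain of helper macros that compute floor(log2) of a 32-bit constant by binary search, wrapped so that the result rounds up. The helper names (`LOG2_1` to `LOG2_16`, `FLOOR_LOG2`) and the `width_tag` struct are made up for this example. The macro expands to an integer constant expression, so it can be used as a template argument.

```cpp
#include <cassert>

// floor(log2(n)) for 1 <= n < 2^32, built up from 1-, 2-, 4-, 8- and 16-bit steps.
#define LOG2_1(n)  ((n) >= 2 ? 1 : 0)
#define LOG2_2(n)  ((n) >= 4 ? 2 + LOG2_1((n) >> 2) : LOG2_1(n))
#define LOG2_4(n)  ((n) >= 16 ? 4 + LOG2_2((n) >> 4) : LOG2_2(n))
#define LOG2_8(n)  ((n) >= 256 ? 8 + LOG2_4((n) >> 8) : LOG2_4(n))
#define LOG2_16(n) ((n) >= 65536 ? 16 + LOG2_8((n) >> 16) : LOG2_8(n))
#define FLOOR_LOG2(n) LOG2_16(n)

// ceil(log2(n)): number of bits needed to count n distinct values.
#define LOG2(n) ((n) <= 1 ? 0 : FLOOR_LOG2((n) - 1) + 1)

// Stand-in for a width-parameterised type such as ap_uint<W>.
template <int W>
struct width_tag {
    static constexpr int value = W;
};

typedef width_tag<8 + LOG2(640 * 480)> acc_width;  // 8 + 19 = 27 bits

static_assert(LOG2(640 * 480) == 19, "640*480 pixels need 19 index bits");
```

In C++11 or later a constexpr function would do the same job with less macro expansion. The macro form has the advantage that it also works where only the preprocessor and integer constant expressions are available.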

nanson

Explorer

01-09-2018 06:06 AM

Regarding your earlier answer to Q4 (lossless sum): it seems to be wrong in my tests with ap_fixed<5,3> holding 3.25 and ap_fixed<5,2> holding 1.125.

It should be corrected to ap_fixed<max(a-b,c-d)+max(b,d)+1, max(b,d)+1>. Thanks
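The test case can be reproduced without the HLS headers. The sketch below models only the format arithmetic and the quantisation step. `Format`, `lossless_sum` and `add_in` are hypothetical names for this example, and truncation toward minus infinity is assumed as the quantisation mode.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Format of an ap_fixed<W, I>-like type: total width and integer bits.
struct Format {
    int width;
    int int_bits;
};

constexpr int frac_bits(Format f) { return f.width - f.int_bits; }

// Lossless sum format: keep the finer fraction, add one integer bit for the carry.
Format lossless_sum(Format x, Format y) {
    const int i = std::max(x.int_bits, y.int_bits) + 1;
    const int f = std::max(frac_bits(x), frac_bits(y));
    return Format{i + f, i};
}

// Exact sum of a and b, then quantised to the fractional bits of `out`
// by truncating toward minus infinity (assumed quantisation mode).
double add_in(Format out, double a, double b) {
    const int f = frac_bits(out);
    return std::ldexp(std::floor(std::ldexp(a + b, f)), -f);
}
```

For ap_fixed<5,3> and ap_fixed<5,2> this gives a 7-bit format with 4 integer bits, in which 3.25 + 1.125 = 4.375 is exact. In ap_fixed<6,4>, which has only 2 fractional bits, the same sum is truncated to 4.25.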