Adventurer

ap_fixed details

I have a few questions about the ap_fixed datatype that I can't find answers to.

 

-Can ap_fixed(M,0) ever hold a negative value? Is it an invalid but permitted datatype, and is the sign bit explicit or ignored?

-Is ap_fixed(N,1) the correct type to cover the range [-1, 1)?

-What ap_fixed(W,I) datatype is the lossless PRODUCT of any ap_fixed(a,b)*ap_fixed(c,d)?

-What ap_fixed(W,I) datatype is the lossless SUM of any ap_fixed(a,b)+ap_fixed(c,d)?

 

Accepted Solutions
Teacher muzaffer

I just found this old thread and wanted to contribute a solution to it which I didn't have at the time:

 

The traits file in %XILINX/include/utils/x_hls_traits.h has the solution to this issue. One can define a trait based on two types and use it like this:

 

typedef ap_fixed<...> foo;

typedef ap_fixed<...> bar;

 

typedef typename hls::x_traits<foo, bar>::MULT_T fooXbar;

 

This is how operators such as multiplication can widen their result types based on the sizes of the input types.
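To illustrate the idea without the HLS headers, here is a minimal sketch of a type-level trait in plain C++. The names fixed_fmt and mult_traits are hypothetical stand-ins invented for this example; they are not the contents of x_hls_traits.h. The sketch assumes the product rule given elsewhere in this thread (total widths add, integer widths add).

```cpp
// Hypothetical stand-in for a fixed-point type: it only records the format.
template <int W, int I>
struct fixed_fmt {
    static constexpr int width = W;     // total bits
    static constexpr int int_bits = I;  // integer bits (sign bit included)
};

// Primary template is left undefined; only fixed_fmt pairs are supported.
template <class A, class B>
struct mult_traits;

// Lossless product format: total widths add and integer widths add.
template <int W1, int I1, int W2, int I2>
struct mult_traits<fixed_fmt<W1, I1>, fixed_fmt<W2, I2>> {
    typedef fixed_fmt<W1 + W2, I1 + I2> MULT_T;
};

typedef fixed_fmt<18, 2> foo;
typedef fixed_fmt<12, 4> bar;
typedef mult_traits<foo, bar>::MULT_T fooXbar;

static_assert(fooXbar::width == 30, "18 + 12 total bits");
static_assert(fooXbar::int_bits == 6, "2 + 4 integer bits");
```

The result type is derived from the operand types alone, so changing foo or bar updates fooXbar with no manual width bookkeeping.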

11 Replies
Teacher muzaffer
1) not sure, probably invalid.
2) Yes.
3) ap_fixed<a+c, b+d>
4) ap_fixed<max(a,c)+1, max(b,d)+1>
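For questions 1) and 2), the representable range follows from ordinary two's-complement arithmetic. The sketch below assumes that ap_fixed<W,I> is a W-bit two's-complement value with W - I fractional bits and with the sign bit counted in I. The helpers pow2, fixed_min and fixed_max are illustrative names, not part of any Xilinx header.

```cpp
// 2^e for any integer e, usable in constant expressions.
constexpr double pow2(int e) {
    return e == 0 ? 1.0 : (e > 0 ? 2.0 * pow2(e - 1) : 0.5 * pow2(e + 1));
}

// Most negative value: the sign bit alone, with weight -2^(I-1).
constexpr double fixed_min(int /*W*/, int I) {
    return -pow2(I - 1);
}

// Most positive value: all bits set except the sign bit,
// i.e. 2^(I-1) minus one LSB, where one LSB is 2^-(W-I).
constexpr double fixed_max(int W, int I) {
    return pow2(I - 1) - pow2(I - W);
}

static_assert(fixed_min(8, 1) == -1.0, "I = 1 starts at -1");
static_assert(fixed_max(8, 1) == 1.0 - pow2(-7), "I = 1 stops one LSB below +1");
static_assert(fixed_min(8, 0) == -0.5, "I = 0 would still be signed");
```

Under that assumption, ap_fixed<N,1> covers -1 up to one LSB below +1, and a format with zero integer bits would cover -0.5 up to one LSB below +0.5, so it can still be negative.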
Adventurer

Thanks, I think I came to the same conclusion. I wish there were a simpler way to define 3) and 4), though. It gets a bit messy to define a bunch of lossless sums and products of some initial datatypes. It would be nice to be able to typedef a sum or product of two other types.

 

When it comes to 1), if you sign extend, I suppose you will get some split range. The tools do seem to accept zero integer digits.

 

Scholar u4223374

It'd be sort of nice to have a whole page in HLS that lists all the variables in the project, which variables each one depends upon, and allows you to enter an equation to define its length. That way (a) it's nice and neat (the equations don't clutter up the code), and (b) if you change something it's easy to see how it'll affect everything else.

 

Matlab's FPGA tools have a very limited version of this (in that there's a page listing all the variables where you can set widths for each one). Perhaps Vivado HLS could do better?

Adventurer

Another thought: if you start with two ap_fixed types and do all the math in a ridiculously large ap_fixed<4096,2048> type, will the tools just optimize away the unused bits (preferably before they propagate up to the top of the hierarchy)?

 

Scholar u4223374

I'm pretty sure that HLS will not. HLS only does really straightforward optimisations, so in "for (int i = 0; i < 10; i++)" it'll recognise that actually "i" can just be an unsigned 4-bit value, not a signed 32-bit one.

 

Vivado itself (non-HLS) will trim them out during synthesis and implementation, if it can logically guarantee the validity of that approach - but this still leaves you with a huge module right up to that point.

 

There are, of course, some areas where the tools just can't follow the logic through. If you have a 3-element vector (x,y,z) and you multiply that by L = 1/sqrt(x^2 + y^2 + z^2) then the vector is normalised. Mathematically, it's easy to show that the resulting components must be no more than 1. However, what HLS (and Vivado) sees is that you're multiplying a 32-bit (for example, maybe 16.16 fixed-point) vector by a 32-bit value (16.16 fixed-point) and therefore the result should be 64-bit (32.32 fixed-point). It can't figure out that when x, y, and/or z are large, L is always small - and vice versa - which actually ensures that 1.32 fixed-point output would work perfectly.

 

 

Adventurer

Your example is worth looking closer at. I guess the type of X limits your range more than you may be aware of at first sight.

The expression x^2+y^2+z^2 can overflow earlier than expected unless you cast X to some higher width/precision that can handle the sum.

 

Even sqrt has precision rules, but yes, the tools will not know this function at compile time, so it won't know the type. For sum and product however, the tools will know.

 

Scholar u4223374

Yes, you'd want to have the sum (x^2 + y^2 + z^2) as at least a 66-bit (34.32) value in this case, which then gets cut down quite a lot when you do the square root.

 

It's a problem that will always occur when you have inter-dependent data. For example, when you're summing edges in an image, it's not possible to have every single location giving a large positive edge. You can get a single large positive edge in the X axis by having pixels [0, 255] (ie so the edge value is 255). However, you can't have two of those in a row because that would imply that the image looks like [0, 255, 510] (and 510 is not a valid 8-bit value). In fact, using a very simple X-axis edge detector (edge[y][x] = image[y][x] - image[y][x-1]) the absolute maximum sum of edges in each line is 9-bit signed (and is just the last element minus the first element).
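The telescoping argument for the edge sum can be checked with a small sketch in plain C++. The function edge_sum and the sample rows are made up for illustration, and pixels are held in plain int rather than an HLS type.

```cpp
// Sum of the simple X-axis edges edge[x] = row[x] - row[x-1] for x = 1..n-1.
constexpr int edge_sum(const int* row, int n) {
    return n < 2 ? 0 : (row[n - 1] - row[n - 2]) + edge_sum(row, n - 1);
}

// An arbitrary row of 8-bit pixel values.
constexpr int kRow[] = {0, 255, 17, 200, 3, 128};
constexpr int kRowLen = sizeof(kRow) / sizeof(kRow[0]);

// A row that alternates between the two extremes.
constexpr int kAlternating[] = {255, 0, 255, 0};

// The intermediate terms cancel, leaving last element minus first element.
static_assert(edge_sum(kRow, kRowLen) == kRow[kRowLen - 1] - kRow[0],
              "edge sum telescopes to last minus first");
static_assert(edge_sum(kAlternating, 4) == -255,
              "still bounded by +/-255 however many edges there are");
```

Because the sum always collapses to the difference of two 8-bit pixels, it stays within [-255, 255], which fits in a 9-bit signed value regardless of the row length.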

 

 

Even sum and product can be a little bit challenging. Take the following example:

 

ap_uint<8> image[640*480];

int accumulator = 0;
for (int i = 0; i < 640*480; i++) {
    accumulator += image[i];
}


A very simple reading of this says that after the first loop iteration the accumulator will be 8-bit. After the second loop iteration it'll be (8-bit + 8-bit => 9-bit). After the third loop iteration it'll be (9-bit + 8-bit => 10-bit). After the fourth loop iteration it'll be (10-bit + 8-bit => 11-bit). And so on, until after the 307200th iteration it'll be 307207-bit. Of course, a more advanced analysis would correctly recognise that log2(640*480) < 19, so a 27-bit accumulator would do nicely. This comes down to how good HLS is at (a) figuring out loop tripcounts, and (b) interpreting what the user is doing.

With regard to (a), this would imply a significant change to the functionality of the loop_tripcount pragma. Currently, if you put incorrect values in here, it just means that the latency analysis is wrong (the design will work fine, but it will run for a different time from what you expected). If HLS decided bit-widths based on that pragma, then putting incorrect values in would produce an unworkable design, as internal variables would overflow.
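When the trip count is a compile-time constant, the accumulator width from the example can be derived in the source itself. The following is a minimal sketch in plain C++; ceil_log2 and the k-prefixed constants are illustrative names, not HLS features.

```cpp
// Smallest number of bits b such that 2^b >= n (ceiling of log2 for n >= 1).
constexpr int ceil_log2(unsigned long long n, int bits = 0) {
    return (1ULL << bits) >= n ? bits : ceil_log2(n, bits + 1);
}

constexpr int kNumPixels = 640 * 480;                  // 307200 iterations
constexpr int kPixelBits = 8;                          // ap_uint<8> pixels
constexpr int kCountBits = ceil_log2(kNumPixels);      // growth from summing
constexpr int kAccumBits = kPixelBits + kCountBits;    // unsigned accumulator width

// Worst case: every pixel is 255.
constexpr unsigned long long kWorstSum = 255ULL * kNumPixels;

static_assert(kCountBits == 19, "log2(640*480) rounds up to 19");
static_assert(kAccumBits == 27, "8 + 19 bits");
static_assert(kWorstSum < (1ULL << kAccumBits), "worst-case sum fits in 27 bits");
```

This only works because the loop bound is known at compile time; it says nothing about loops whose trip count comes from the pragma or from run-time data.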

 

Adventurer

Yes, I see your points regarding the loops. I would not be using the int type like that in HLS. I would define the loop variable "i" as ap_uint<19> (unsigned, because a signed 19-bit counter tops out at 262,143, which is below the 307,200 iterations), and if the tools allowed, the accumulator should be typedef'd like:

 

  typedef accumulator (i'type)*(image*'type)

 

But for now, I would have to do it manually:

  typedef ap_uint<19+8> accumulator;

 

 

Scholar u4223374

HLS does seem to actually handle ints as loop variables correctly (so it cuts that one down to 19-bit).

 

For the other stuff, you can do all the definitions in the preprocessor.

 

#include <ap_int.h> // Xilinx arbitrary-precision integer types (ap_uint)

#define IMAGE_WIDTH 640
#define IMAGE_HEIGHT 480

#define NUM_PIXELS (IMAGE_WIDTH * IMAGE_HEIGHT)
#define PIXEL_WIDTH 8
#define NUM_PIXELS_LOG2 LOG2(NUM_PIXELS) // LOG2 (ceiling of log2) is not defined here; there are a few ways of doing it in the preprocessor.
#define IMAGE_SUM_WIDTH (PIXEL_WIDTH + NUM_PIXELS_LOG2)
#define PIXEL_EDGE_WIDTH (PIXEL_WIDTH + 1) // Edges can be positive or negative so this needs to have a sign bit added
#define IMAGE_SQUARE_SUM_WIDTH (PIXEL_WIDTH * 2 + NUM_PIXELS_LOG2)


typedef ap_uint<PIXEL_WIDTH> 		pixel_t;
typedef ap_uint<IMAGE_SUM_WIDTH> 	accumulator_t;
typedef ap_uint<IMAGE_SQUARE_SUM_WIDTH> square_accumulator_t; // was PIXEL_SQUARE_SUM_WIDTH, which is never defined
typedef ap_uint<NUM_PIXELS_LOG2> 	image_index_t;

void test(pixel_t image[NUM_PIXELS]) {

	accumulator_t accumulator = 0;
	square_accumulator_t square_accumulator = 0;

	for (image_index_t i = 0; i < NUM_PIXELS; i++) {
		pixel_t pixel = image[i];
		accumulator += pixel;
		square_accumulator += pixel*pixel;
	}

}
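As a sanity check on the two accumulator widths, the worst-case totals can be measured directly. This is a sketch in plain C++ under the assumption of 8-bit pixels and a 640x480 image; bits_needed and the k-prefixed constants are hypothetical names used only here.

```cpp
// Number of bits needed to hold the unsigned value v (0 needs 0 bits).
constexpr int bits_needed(unsigned long long v, int bits = 0) {
    return v == 0 ? bits : bits_needed(v >> 1, bits + 1);
}

constexpr unsigned long long kPixels = 640ULL * 480ULL;   // NUM_PIXELS
constexpr unsigned long long kMaxPixel = 255ULL;          // largest 8-bit pixel

// Largest possible sum and sum of squares over the whole image.
constexpr unsigned long long kMaxSum = kMaxPixel * kPixels;
constexpr unsigned long long kMaxSquareSum = kMaxPixel * kMaxPixel * kPixels;

static_assert(bits_needed(kMaxSum) == 8 + 19,
              "matches PIXEL_WIDTH + NUM_PIXELS_LOG2");
static_assert(bits_needed(kMaxSquareSum) == 2 * 8 + 19,
              "matches PIXEL_WIDTH * 2 + NUM_PIXELS_LOG2");
```

For this image size the rule-based widths are exactly as wide as the worst case requires; for other sizes the rule can be up to one bit wider than the worst case requires.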
Explorer

@muzaffer

 

Regarding your earlier answer to Q4 (lossless sum): it appears to be wrong in my tests with ap_fixed<5,3> holding 3.25 and ap_fixed<5,2> holding 1.125.

The result needs the larger of the two fractional widths (a-b and c-d), so it should be corrected to ap_fixed<max(a-b,c-d)+max(b,d)+1, max(b,d)+1>. Thanks
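The 3.25 + 1.125 case can be worked through in plain C++. This sketch assumes that ap_fixed<W,I> has W - I fractional bits and that a lossless sum keeps the larger fractional width and the larger integer width plus one carry bit. Fmt, sum_fmt and representable are hypothetical helpers for illustration only.

```cpp
struct Fmt { int W; int I; };  // total bits, integer bits (sign included)

constexpr int imax(int a, int b) { return a > b ? a : b; }
constexpr int frac_bits(Fmt f) { return f.W - f.I; }

// 2^e for e >= 0 (sufficient for the formats used here).
constexpr double pow2(int e) { return e <= 0 ? 1.0 : 2.0 * pow2(e - 1); }

// Lossless sum: larger fractional width, larger integer width plus one carry bit.
constexpr Fmt sum_fmt(Fmt x, Fmt y) {
    return Fmt{imax(frac_bits(x), frac_bits(y)) + imax(x.I, y.I) + 1,
               imax(x.I, y.I) + 1};
}

constexpr bool is_integer(double s) {
    return s == static_cast<double>(static_cast<long long>(s));
}

// True if v scaled by 2^frac is an integer inside the signed W-bit range.
constexpr bool representable(double v, Fmt f) {
    return is_integer(v * pow2(frac_bits(f)))
        && v * pow2(frac_bits(f)) >= -pow2(f.W - 1)
        && v * pow2(frac_bits(f)) <= pow2(f.W - 1) - 1.0;
}

static_assert(sum_fmt(Fmt{5, 3}, Fmt{5, 2}).W == 7, "3 fractional + 4 integer bits");
static_assert(sum_fmt(Fmt{5, 3}, Fmt{5, 2}).I == 4, "max(3, 2) + 1 integer bits");
static_assert(representable(3.25 + 1.125, Fmt{7, 4}), "4.375 fits in <7,4>");
static_assert(!representable(3.25 + 1.125, Fmt{6, 4}), "<6,4> drops a fractional bit");
```

The format <6,4> in the last check is what max(a,c)+1, max(b,d)+1 gives for these operands. It has only two fractional bits, so 4.375 cannot be stored exactly.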
