Explorer
4,846 Views
Registered: 08-26-2014

Why is HLS not implementing a pure-combinatorial fixed-point multiplication?


Hello,

 

I am comparing a simple multiplication using floating-point and fixed-point formats. The code is shown below; I only change the in_t and out_t types between fixed-point and floating-point formats:

 

void simple_operations(const in_t mat[2], out_t invOut[2])
{
    invOut[0] = mat[0] * mat[1];
}
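
For reference, the two variants differ only in the typedefs for in_t and out_t. A minimal sketch of what they might look like (the ap_fixed widths below are hypothetical, chosen only to illustrate a matching 64-bit pair):

#include "ap_fixed.h"

#ifdef USE_FIXED
typedef ap_fixed<64, 32> in_t;   // hypothetical fixed-point format: 64 bits total, 32 integer bits
typedef ap_fixed<64, 32> out_t;
#else
typedef double in_t;             // 64-bit floating-point counterpart
typedef double out_t;
#endif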

 

I am a bit confused because, when using the same number of bits in both formats (32 and 64 bits), both fixed-point versions have more latency than their floating-point counterparts. See the attached picture:

 

[Attached image: fixed_floating-point test.png]

 

Does anyone know why this is happening? With a pure-combinatorial approach it should output the result in zero or one clock cycles, but apparently this is not the case and I don't know why.

 

Many thanks,

 

Cerilet

 

 

0 Kudos
1 Solution

2 Replies
Scholar austin
9,138 Views
Registered: 02-27-2008

Re: Why is HLS not implementing a pure-combinatorial fixed-point multiplication?


c,

 

One stage of logic, with a register, may operate in some families in as little as 1.5 nanoseconds. But the required number of stages for a full multiply is not going to be one stage (not unless you have a look-up table with 128 input bits and a 64-bit-wide output).

 

The tools try their best to use the resources effectively.  You can ask for less delay, and see what happens.

 

Also, a 64-bit floating-point multiply is in no way equivalent to a 64-bit fixed-point multiply. Different animals.
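
To make the "different animals" point concrete: a 64-bit fixed-point product needs a full 64x64-bit integer multiply, while a 64-bit (double-precision) float only multiplies the 53-bit mantissas and adds the exponents, so the fixed-point datapath is actually the wider one. A rough sketch (the ap_fixed width is an assumption, not taken from the original design):

#include "ap_fixed.h"

typedef ap_fixed<64, 32> fix64_t;   // hypothetical 64-bit fixed-point format
typedef double           flt64_t;   // IEEE-754 double: 53-bit mantissa, 11-bit exponent

void width_demo(const fix64_t a, const fix64_t b,
                const flt64_t c, const flt64_t d,
                fix64_t &p_fix, flt64_t &p_flt)
{
    p_fix = a * b;   // scheduled as a 64x64-bit multiplier: more DSP slices and stages
    p_flt = c * d;   // floating-point core built around a 53x53-bit mantissa multiply
}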

 

Latency increases as pipeline registers are required to meet timing.
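
As a concrete way to "see what happens", one can ask HLS directly for a zero-latency schedule and then read the timing report. A minimal sketch assuming Vivado HLS and its LATENCY directive (the pragma and its placement are an assumption, not part of the original code):

void simple_operations(const in_t mat[2], out_t invOut[2])
{
#pragma HLS LATENCY min=0 max=0
    // Requests a purely combinational schedule. If the full multiplier does not
    // fit in the target clock period, HLS warns that the latency constraint
    // cannot be met - which is exactly where the extra pipeline registers would
    // otherwise be inserted.
    invOut[0] = mat[0] * mat[1];
}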

Austin Lesea
Principal Engineer
Xilinx San Jose
Explorer
4,784 Views
Registered: 08-26-2014

Re: Why is HLS not implementing a pure-combinatorial fixed-point multiplication?


Thanks for your answer, Austin.

 

I actually thought about that, and by increasing the clock period the compiler managed to execute the multiplication with 0 latency. Here are the results:

 

[Attached image: fixed-point test.png]

 

I know the difference between fixed- and floating-point variables, but in my study, in which matrix multiplication is only a small part, I want to point out the main strengths and weaknesses of both implementations. I was actually expecting faster execution times with fixed-point variables. Big surprise!

 

Thanks again,

 

Cerilet

0 Kudos