Visitor endofday
Visitor
242 Views
Registered: ‎07-24-2018

Effect of an accumulated variable in matrix multiplication

Hello,

I have two matrix multiplication functions, shown below. The first accumulates into a local variable Ci. The second has no Ci and accumulates directly into the output array C. The latency of the version with the accumulator variable is much lower (7,680,009 versus 12,800,007 cycles, roughly 40% less). Why does adding a variable Ci for the accumulation improve the latency so much?

Thanks.

The function with the accumulator variable (latency 7,680,009):

void matmul(float A[1600], float B[1600][1600], float C[1600])
{
#pragma HLS ARRAY_RESHAPE variable=B complete dim=1
	width:for(int i=0; i<1600; i++)
	{
		float Ci = 0;
		product:for(int k=0; k<1600; k++)
		{
#pragma HLS PIPELINE II=1
			Ci += A[k]*B[k][i];
		}
		C[i] = Ci;
	}
}

The function without the accumulator variable (latency 12,800,007):

void matmul(float A[1600], float B[1600][1600], float C[1600])
{
#pragma HLS ARRAY_RESHAPE variable=B complete dim=1
	width:for(int i=0; i<1600; i++)
	{
		product:for(int k=0; k<1600; k++)
		{
#pragma HLS PIPELINE II=1
			C[i] += A[k]*B[k][i];
		}
	}
}
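
Before comparing latencies, it can help to confirm that the two versions are functionally equivalent in plain C simulation. The second version never initializes C[i], so it only matches the first if the caller zeroes C beforehand. A minimal sketch of such a check, assuming a small size N = 8 instead of 1600 and the hypothetical names matmul_local, matmul_direct, count_mismatches and self_check (none of these appear in the original post):

```c
#include <assert.h>

#define N 8 /* assumption: small size for a quick software check; the post uses 1600 */

/* Version 1: accumulate in a local scalar, store once per output element. */
void matmul_local(float A[N], float B[N][N], float C[N])
{
    for (int i = 0; i < N; i++) {
        float Ci = 0.0f;
        for (int k = 0; k < N; k++)
            Ci += A[k] * B[k][i];
        C[i] = Ci;
    }
}

/* Version 2: accumulate directly into C; the caller must zero C first. */
void matmul_direct(float A[N], float B[N][N], float C[N])
{
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++)
            C[i] += A[k] * B[k][i];
}

/* Runs both versions on the same input and returns the number of mismatches. */
int count_mismatches(void)
{
    float A[N], B[N][N], C1[N], C2[N];
    int mismatches = 0;

    for (int k = 0; k < N; k++) {
        A[k] = (float)(k + 1);
        for (int i = 0; i < N; i++)
            B[k][i] = (float)(k - i);
    }
    for (int i = 0; i < N; i++) {
        C1[i] = -1.0f; /* garbage on purpose: version 1 overwrites it */
        C2[i] = 0.0f;  /* version 2 relies on this zeroing */
    }

    matmul_local(A, B, C1);
    matmul_direct(A, B, C2);

    for (int i = 0; i < N; i++)
        if (C1[i] != C2[i]) /* small integer-valued inputs, so sums are exact */
            mismatches++;
    return mismatches;
}

/* Convenience wrapper for a testbench: aborts if the versions disagree. */
void self_check(void)
{
    assert(count_mismatches() == 0);
}
```

The HLS pragmas are left out here because this check only concerns functional behavior, not the synthesized schedule.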
1 Solution

Accepted Solutions
Voyager tedbooth
Voyager
219 Views
Registered: ‎03-28-2016

Re: Effect of an accumulated variable in matrix multiplication

"Ci" is a register implemented with flip-flops, whereas "C[1600]" is a memory implemented with Block RAMs. Accumulating into C[i] needs a RAM read and a RAM write on every iteration, and each of those takes longer than updating a register. That likely accounts for the difference in the latency.
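
One way to read the two reported latencies is to divide each by the 1600 x 1600 = 2,560,000 inner-loop iterations. That gives about 3 cycles per iteration with Ci and about 5 without, which is consistent with the RAM read-modify-write adding roughly two cycles to the accumulation on each iteration, and suggests that neither version reached the requested II=1. A small sketch of that arithmetic (the helper name cycles_per_iteration is hypothetical, not something HLS provides):

```c
#include <assert.h>

/* Rounds latency / iterations to the nearest whole number of cycles.
 * Hypothetical helper for back-of-the-envelope checks only. */
long cycles_per_iteration(long latency, long iterations)
{
    assert(iterations > 0);
    return (latency + iterations / 2) / iterations;
}
```

With the numbers from the question, cycles_per_iteration(7680009L, 1600L * 1600L) evaluates to 3 and cycles_per_iteration(12800007L, 1600L * 1600L) evaluates to 5. The achieved II and the reason for any II violation should be confirmed in the synthesis report rather than inferred from this estimate alone.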

Ted Booth - Tech. Lead FPGA Design Engineer
www.designlinxhs.com
3 Replies
Scholar u4223374
Scholar
215 Views
Registered: ‎04-26-2015

Re: Effect of an accumulated variable in matrix multiplication

In addition to what @tedbooth has said - what interface type are you defining for C? If it is, for example, an AXI Master, then that could take ages (in FPGA terms) to update.

Visitor endofday
Visitor
184 Views
Registered: ‎07-24-2018

Re: Effect of an accumulated variable in matrix multiplication

I did not explicitly specify an interface, so I assume the default one was used.
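
For reference, to my knowledge Vivado HLS gives array arguments of the top-level function an ap_memory interface when none is specified, so C would be a plain memory port here rather than an AXI master. A sketch that states the interface explicitly and keeps the accumulation in a local register, so C is written only once per output element whichever interface is chosen (the macro N and the INTERFACE pragma line are additions for illustration, not part of the original code):

```c
#define N 1600

void matmul(float A[N], float B[N][N], float C[N])
{
/* Assumption: make the (believed default) memory interface explicit.
 * An AXI master would use "m_axi" here instead of "ap_memory". */
#pragma HLS INTERFACE ap_memory port=C
#pragma HLS ARRAY_RESHAPE variable=B complete dim=1
	width:for(int i=0; i<N; i++)
	{
		float Ci = 0; /* accumulate in a register, not in C */
		product:for(int k=0; k<N; k++)
		{
#pragma HLS PIPELINE II=1
			Ci += A[k]*B[k][i];
		}
		C[i] = Ci; /* single write per output element */
	}
}
```

A regular C compiler ignores the HLS pragmas (possibly with a warning), so the same function can be run in C simulation. The interface that was actually synthesized is listed in the interface summary of the synthesis report.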