Visitor po092000
Registered: 07-24-2016

Data Transfer Time problem

Hi, my name is Soo and I have a problem.

 

I synthesised the Convolution_Layer_1 function and got these results.

result_hls.png <HLS>

result_sdsoc.png <SDSoC>

 

 

void Convolution_Layer_1(float src[28*28], float convolution_filer[25*20], float dst[24*24*20])
{
    int col, row, col_f, row_f;
    int feature_map;

    float temp;

    // Local copy of the 28x28 input image, fully partitioned along the column dimension
    float _src[28][28];
    #pragma HLS ARRAY_PARTITION variable=_src complete dim=2

    // Copy the input image into the local buffer
    for (row = 0; row < 28; row++)
    {
        #pragma HLS PIPELINE II=1
        for (col = 0; col < 28; col++)
        {
            _src[row][col] = src[row * 28 + col];
        }
    }

    // 5x5 convolution producing 20 feature maps of 24x24 each
    for (feature_map = 0; feature_map < 20; feature_map++)
    {
        for (row = 0; row < 24; row++)
        {
            #pragma HLS PIPELINE II=1
            for (col = 0; col < 24; col++)
            {
                temp = 0;
                for (row_f = 0; row_f < 5; row_f++)
                {
                    for (col_f = 0; col_f < 5; col_f++)
                    {
                        // Slide the 5x5 filter window over the input
                        temp += _src[row + row_f][col + col_f] * convolution_filer[feature_map * 25 + row_f * 5 + col_f];
                    }
                }
                // Write the output once the full window has been accumulated
                dst[feature_map * 24 * 24 + row * 24 + col] = temp;
            }
        }
    }
}

 

The problem is that the total time of this function is much longer than the HLS result: about 200x slower.

I don't know why I get this result.

I think data transfer is the problem. Is that true?

The transfer size is 3136 (src) + 2000 (convolution_filter) + 46080 (dst) = 51216 bytes.
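
As a quick check of that arithmetic, here is a minimal sketch. It assumes 4-byte floats and takes the element counts from the function signature; `transfer_bytes` and the enum names are hypothetical helpers, not part of the project.

```c
#include <stddef.h>

/* Element counts taken from the Convolution_Layer_1 signature. */
enum {
    SRC_ELEMS    = 28 * 28,      /* input image */
    FILTER_ELEMS = 25 * 20,      /* twenty 5x5 filters */
    DST_ELEMS    = 24 * 24 * 20  /* twenty 24x24 feature maps */
};

/* Total bytes moved per call: all three arrays, sizeof(float) bytes each. */
static size_t transfer_bytes(void)
{
    return (size_t)(SRC_ELEMS + FILTER_ELEMS + DST_ELEMS) * sizeof(float);
}
```

On a platform where `sizeof(float)` is 4, `transfer_bytes()` returns 51216.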

I use the axi-simple bus, and I don't know the reason for this problem.

Is there any way to solve it?

 

2 Replies
Xilinx Employee
Registered: 06-29-2015

Re: Data Transfer Time problem

Hi Soo,

 

Do you have any SDSoC pragmas on the interface of the function (e.g. access_pattern(SEQUENTIAL))?
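
For reference, such a pragma is placed immediately before the function and might look like the sketch below. This is only an illustration, not the original project code: it assumes every array is read or written exactly once and in index order, which is why the inputs are first copied into local buffers (`img` and `filt` are hypothetical names). Compilers other than the SDSoC toolchain simply ignore the pragma.

```c
/* Hint that each argument is accessed once, in order, so it can be streamed.
 * Assumption: the body below really does access the arrays that way. */
#pragma SDS data access_pattern(src:SEQUENTIAL, convolution_filer:SEQUENTIAL, dst:SEQUENTIAL)
void Convolution_Layer_1(float src[28*28], float convolution_filer[25*20], float dst[24*24*20])
{
    float img[28][28];    /* local copy of the input image (hypothetical name) */
    float filt[20][5][5]; /* local copy of the filters (hypothetical name) */
    int f, r, c, rf, cf;

    /* Read every input element exactly once, in index order. */
    for (r = 0; r < 28; r++)
        for (c = 0; c < 28; c++)
            img[r][c] = src[r * 28 + c];
    for (f = 0; f < 20; f++)
        for (rf = 0; rf < 5; rf++)
            for (cf = 0; cf < 5; cf++)
                filt[f][rf][cf] = convolution_filer[f * 25 + rf * 5 + cf];

    /* Compute from the local copies; each output is written once, in order. */
    for (f = 0; f < 20; f++)
        for (r = 0; r < 24; r++)
            for (c = 0; c < 24; c++) {
                float acc = 0.0f;
                for (rf = 0; rf < 5; rf++)
                    for (cf = 0; cf < 5; cf++)
                        acc += img[r + rf][c + cf] * filt[f][rf][cf];
                dst[f * 24 * 24 + r * 24 + c] = acc;
            }
}
```

Whether a sequential pattern is appropriate depends on how the function actually touches its arguments, so treat this as a starting point for experimentation.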

 

Can you post the data motion network report?

 

Thanks
Sam

Xilinx Employee
Registered: 06-29-2015

Re: Data Transfer Time problem

Hi Soo,

 

Another thing to point out is the units. The HLS report gives cycles at whatever clock rate the accelerator runs at (most built-in SDSoC platforms default to 100MHz). The SDSoC performance estimation report gives processor cycles at whatever frequency the ARM processor runs at (533, 666, or 800MHz depending on which board you're using; the zc702 is 666MHz). So 12000 cycles @100MHz is about 80000 cycles at 666MHz.
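
The unit conversion can be sketched as a small helper. `rescale_cycles` is a hypothetical name used only for illustration; it expresses the same wall-clock time as a cycle count at a different clock frequency.

```c
#include <stdint.h>

/* Convert a cycle count measured at from_mhz into the equivalent number of
 * cycles at to_mhz (same elapsed time). Integer division truncates. */
static uint64_t rescale_cycles(uint64_t cycles, uint64_t from_mhz, uint64_t to_mhz)
{
    return cycles * to_mhz / from_mhz;
}
```

With the numbers above, `rescale_cycles(12000, 100, 666)` returns 79920, i.e. about 80000 processor cycles.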

 

Additionally, the SDSoC estimate also includes the time it takes to set up the accelerator and transfer the data to and from it. So out of the 220000 cycles the estimate gave, 80000 (or 36%) is for the actual accelerator execution. Probably 5000-10000 cycles go to software doing AXI-Lite writes to initialize the accelerator and set up the DMAs for data transfer. The remaining cycles, roughly 130000, are the estimated data transfer time. It's likely that this cost is overly pessimistic and you may see lower actual execution time.
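
A rough way to reproduce this kind of breakdown is sketched below. The struct and function names are hypothetical, and the inputs are the rounded figures from the reports, so the output is only as precise as those.

```c
#include <stdint.h>

/* Rough split of an estimated cycle count into its three parts. */
typedef struct {
    uint64_t accel;    /* accelerator execution */
    uint64_t setup;    /* software setup: AXI-Lite writes, DMA programming */
    uint64_t transfer; /* whatever is left over: data movement */
} cycle_split;

/* Assumes accel + setup <= total; the remainder is attributed to transfer. */
static cycle_split split_estimate(uint64_t total, uint64_t accel, uint64_t setup)
{
    cycle_split s;
    s.accel = accel;
    s.setup = setup;
    s.transfer = total - accel - setup;
    return s;
}

/* Share of the total as a whole percentage, rounded to nearest. */
static unsigned percent_of(uint64_t part, uint64_t total)
{
    return (unsigned)((part * 100 + total / 2) / total);
}
```

For example, `percent_of(80000, 220000)` evaluates to 36, matching the accelerator share quoted above.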

 

If you want to see exactly how long each event takes during actual execution, consider enabling the Trace feature. You can do this by clicking the "Enable Event Tracing" checkbox in the project overview. Just make sure you clean your project first, and remember that incremental build currently does NOT work with trace, so any time you make a change you'll need to do a manual clean and then build.

 

You can read more about the Trace feature in Chapter 13 of UG1027: http://www.xilinx.com/support/documentation/sw_manuals/xilinx2016_2/ug1027-sdsoc-user-guide.pdf

 

And you can use the tutorials in Chapter 8 of UG1028 as a guide on what steps to follow: http://www.xilinx.com/support/documentation/sw_manuals/xilinx2016_2/ug1028-intro-to-sdsoc.pdf

 

Sam
