Adventurer
839 Views
Registered: ‎08-27-2018

Actual speedup is not matching the estimated speedup

Hi

I am implementing the following function on a ZCU102 board. The Performance Estimation Report gives an estimated speedup of 2.1 for the bubbleSort_hw() function, but when I actually run it on the device, the speedup is 0.489869.

 

Can anyone please suggest where I made the mistake? Please let me know if any further information is required.

 

Software function:

 

void bubbleSort_sw(int vec[N]){
	int temp;

	for (int i=0 ; i< N-1; i++){
		for (int j=0 ; j< N-i-1; j++){
			if (vec[j] > vec[j+1]) {
				temp = vec[j];
				vec[j] = vec[j+1];
				vec[j+1] = temp;
			}
		}
	}
}

 

Hardware function:

#pragma SDS data zero_copy(vec, vec2)
#pragma SDS data access_pattern(vec:SEQUENTIAL, vec2:SEQUENTIAL)
#pragma SDS data mem_attribute(vec:PHYSICAL_CONTIGUOUS, vec2:PHYSICAL_CONTIGUOUS)
void bubbleSort_hw(int vec[N], int vec2[N]){
	int temp, temp1, temp2;
	int vec_internal[N];

	//copy to internal memory
	for (int i=0; i<N; i++){
	#pragma HLS UNROLL
		vec_internal[i] = vec[i];
	}

	for (int i=0 ; i< N-1; i++){
		for (int j=1 ; j< N; j++){
		#pragma HLS PIPELINE II=1
			temp1 = vec_internal[j-1];
			temp2 = vec_internal[j];

			if (temp2 < temp1) {
				temp = temp2;
				temp2 = temp1;
				temp1 = temp;
			}
			vec_internal[j-1] = temp1;
			vec_internal[j] = temp2;
		}
	}

	//copy back to external memory
	for (int i=0; i<N; i++){
	#pragma HLS UNROLL
		vec2[i] = vec_internal[i];
	}
}
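
One thing that may be worth checking in the hardware function above: each inner-loop iteration reads vec_internal[j-1], which the previous iteration has just written as vec_internal[j], so the requested II=1 might not be achievable. The following is only a minimal sketch of the same compare-and-swap passes with the larger value carried in a local variable, so each iteration does one array read and one array write. The names bubble_passes_carry and is_sorted_ascending are hypothetical, N = 1024 is assumed from the test log, and the HLS pragmas are left out so it compiles as plain C++. It has not been measured on the ZCU102, so any effect on the speedup is an assumption.

```cpp
#include <cassert>

constexpr int N = 1024;  // assumption: matches the "1024 integer sorting" log line

// Same N-1 full-length passes as bubbleSort_hw, but the larger value of each
// pair is kept in a local variable instead of being written to
// vec_internal[j] and read back on the next iteration.
void bubble_passes_carry(int vec_internal[N]) {
    assert(vec_internal != nullptr);
    for (int i = 0; i < N - 1; i++) {
        int carry = vec_internal[0];               // running maximum of this pass
        for (int j = 1; j < N; j++) {
            int cur = vec_internal[j];             // one array read per iteration
            bool swap = cur < carry;
            vec_internal[j - 1] = swap ? cur : carry;  // one array write per iteration
            carry = swap ? carry : cur;
        }
        vec_internal[N - 1] = carry;               // largest value ends the pass
    }
}

// Hypothetical host-side helper to check the result.
bool is_sorted_ascending(const int v[N]) {
    for (int k = 1; k < N; k++) {
        if (v[k - 1] > v[k]) return false;
    }
    return true;
}
```

In an HLS build, the PIPELINE pragma would presumably go in the inner loop, in the same place as in the original function.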


 

[Attached screenshots: Selection_009.png, Selection_010.png]

 

Running on hardware:

Testing 1024 iterations of 1024 integer sorting...
Average number of CPU cycles running BUbbleSort in software: 25960071
Average number of CPU cycles running BubbleSort in hardware: 52993910
Speed up: 0.489869
TEST PASSED

 

Thanks

 

5 Replies
Xilinx Employee
821 Views
Registered: ‎07-12-2017

Re: Actual speedup is not matching the estimated speedup

Hi @immwn ,

 

I think the speedup calculation done in the host code is wrong. Can you please check that?

If I divide the following two numbers, I get a speedup of 2.04:

Average number of CPU cycles running BUbbleSort in software: 25960071
Average number of CPU cycles running BubbleSort in hardware: 52993910
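
For reference, the two numbers above can be divided either way, and the direction changes the meaning. The sketch below assumes the usual definition of speedup as software cycles divided by hardware cycles; the function names speedup and slowdown are hypothetical and only illustrate the arithmetic. With that definition, these counts give about 0.49, and the inverse ratio of about 2.04 says how many times more cycles the hardware run took.

```cpp
#include <cassert>
#include <cstdint>

// Speedup under the assumed definition: software cycles / hardware cycles.
// A value below 1.0 means the hardware run took more cycles.
double speedup(std::uint64_t sw_cycles, std::uint64_t hw_cycles) {
    assert(hw_cycles != 0);
    return static_cast<double>(sw_cycles) / static_cast<double>(hw_cycles);
}

// Inverse ratio: how many times more cycles the hardware run took.
double slowdown(std::uint64_t sw_cycles, std::uint64_t hw_cycles) {
    assert(sw_cycles != 0);
    return static_cast<double>(hw_cycles) / static_cast<double>(sw_cycles);
}
```

Calling speedup(25960071, 52993910) reproduces the 0.489869 printed by the test program, while slowdown(25960071, 52993910) gives roughly 2.04.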

 

Adventurer
803 Views
Registered: ‎08-27-2018

Re: Actual speedup is not matching the estimated speedup

Hi @hatchuta,

 

I checked my host program and found nothing wrong. Here it is:

 

int bubbleSort_test(int *vec_sw, int *vec_hw, int *vec_hw_ret){
    std::cout << "Testing " << NUM_TESTS << " iterations of "<< N << " integer sorting..." << std::endl;

    perf_counter hw_ctr, sw_ctr;

    for (int i = 0; i < NUM_TESTS; i++){
        init_array(vec_sw);
        for (int k=0; k<N; k++){
            vec_hw[k] = vec_sw[k];
        }
        sw_ctr.start();
        bubbleSort_sw(vec_sw);
        sw_ctr.stop();

        hw_ctr.start();
        bubbleSort_hw(vec_hw, vec_hw_ret);
        hw_ctr.stop();

        if (result_check(vec_sw, vec_hw_ret))
            return 1;
    }

    uint64_t sw_cycles = sw_ctr.avg_cpu_cycles();
    uint64_t hw_cycles = hw_ctr.avg_cpu_cycles();
    double speedup = (double) sw_cycles / (double) hw_cycles;

    std::cout << "Average number of CPU cycles running BUbbleSort in software: " << sw_cycles << std::endl;
    std::cout << "Average number of CPU cycles running BubbleSort in hardware: " << hw_cycles << std::endl;
    std::cout << "Speed up: " << speedup << std::endl;

    return 0;
}

Any suggestions?

 

Adventurer
749 Views
Registered: ‎08-27-2018

Re: Actual speedup is not matching the estimated speedup

Hi @hatchuta, I am waiting for your opinion.

Cheers

Moderator
576 Views
Registered: ‎09-19-2018

Re: Actual speedup is not matching the estimated speedup

Hi @immwn,

 

Have you tried setting the build configuration to Release when running on hardware? Let me know the results once you have tried it.

Thanks,

Vijoy

Thanks,
Vijoy Sunil Kumar
Product Applications Engineer

Teacher xilinxacct
513 Views
Registered: ‎10-23-2018

Re: Actual speedup is not matching the estimated speedup

@immwn

Reverse your calculation (hw_cycles / sw_cycles). That should give the ~2x result.
