Explorer
1,402 Views
Registered: ‎08-16-2017

Performance profiling of custom software

Hello Everyone,

 

I want to improve the performance of a custom software application using an FPGA-based accelerator.

To begin, I would like to determine how much speedup I can achieve with the FPGA. I have studied the software and know which of its algorithms can be deployed to the FPGA.

Is there an efficient way to measure the execution time of the individual algorithms in the software, so that I can come up with a good estimate of the performance improvement I can achieve?

I have used the Break() command before, but I don't know how accurate it is.

 

Thank you

5 Replies
Moderator
1,328 Views
Registered: ‎11-09-2015

Hi @vivek,

 

It sounds like a job for SDAccel. You might want to have a look at this tool.

 

Regards,


Florent
Product Application Engineer - Xilinx Technical Support EMEA
**~ Don't forget to reply, give kudos, and accept as solution.~**
Xilinx Employee
1,306 Views
Registered: ‎09-08-2011

Hi vivek,

 

Have you used SDAccel's HW_Emulation? It should give you a CPU cycle estimate of how long the design took to run, which you can compare against your standard C implementation to estimate the performance speedup.

If you are looking for something specific from the profiling tools in SDAccel or from the emulation, let us know and we can hopefully explain how to do it.

 

Regards,

 

Evan

If at first you don't succeed, try redefining success?
Explorer
1,300 Views
Registered: ‎08-16-2017

@evant and @florentw - Great, thank you both.

Is there a free or trial version of the SDAccel tool?

I downloaded SDx (which combines SDSoC and SDAccel) but wasn't able to create an SDAccel project.

 

I wrote some of the software algorithms in C++ and was able to measure their execution time. I then converted the same C++ code into RTL using Vivado HLS, ran it with the same test vectors, and found the latency. From that I calculated the execution time = latency (in clock cycles) * clock period. However, I didn't take into account the communication overhead between the FPGA and the host computer. Is this a fair way to estimate the speedup? Is there a way to make an educated guess at the communication overhead without using SDAccel or another tool?

 

Thank you.

Xilinx Employee (accepted solution)
1,732 Views
Registered: ‎09-08-2011

SDAccel requires the ap_opencl license feature, so the SDx tool will only allow SDSoC if you don't have the other licenses.

 

You can try Nimbix or AWS to use the dev environment and run on available platforms.

 

Your HLS-based estimate is about the right way to understand how long the kernel takes to execute. However, HLS is only aware of what exists in the C code you wrote, so you would have to estimate, based on the larger FPGA design, how much time would be spent on communication overhead. SDAccel takes that overhead into account. Have a look at the user guide:

https://www.xilinx.com/support/documentation/sw_manuals/xilinx2017_4/ug1207-sdaccel-optimization-guide.pdf

"Figure 23: Application Timeline Window" on page 55 gives you an idea of what SDAccel reports in terms of run time, including host-to-FPGA overhead. It also lets you drill into where time might be wasted in a design, which allows further speedup.

 

 

Explorer
1,285 Views
Registered: ‎08-16-2017

How do I purchase or get the ap_opencl license?

 

 
