Adventurer
Registered: ‎12-16-2013

kernel start time in out-of-order command queue

Hi all,

 

I have used pipes to add two vectors, i.e. c[i] = a[i] + b[i].

In the host code I used the CL_MEM_USE_HOST_PTR | CL_MEM_EXT_PTR_XILINX flags for the buffers of vectors a and b. There are three kernels connected through two pipes: the first one (read_data_kernel) reads a and b element by element and pushes them into a pipe; the second one (add_data_kernel) pops the a and b elements one by one from that pipe and pushes the result into another pipe; and the last one writes the result (c) to host memory.
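For readers who want to reason about the setup without the attached code, here is a rough host-side C++ model of the three stages and two pipes described above. It is only a sketch under stated assumptions: it is not OpenCL kernel code, the function name `simulate_pipeline`, the per-stage start cycles, and the "one element per cycle" behaviour are all made up for illustration, and the pipes are modelled as unbounded FIFOs.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Result of one simulated run. All times are abstract "cycles".
struct PipelineRun {
    std::vector<int> c;              // output vector, c[i] = a[i] + b[i]
    std::size_t adder_stall_cycles;  // cycles the adder was running but had no input
    std::size_t total_cycles;        // cycles until the last element was written
};

// Hypothetical model: each stage becomes active at its own start cycle and
// then handles at most one element per cycle.
PipelineRun simulate_pipeline(const std::vector<int>& a,
                              const std::vector<int>& b,
                              std::size_t read_start,
                              std::size_t add_start,
                              std::size_t write_start) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::queue<std::pair<int, int>> pipe_in;  // read stage -> add stage
    std::queue<int> pipe_out;                 // add stage -> write stage
    PipelineRun run{{}, 0, 0};
    std::size_t read_idx = 0;
    std::size_t added = 0;
    std::size_t cycle = 0;

    while (run.c.size() < n) {
        // Stages are evaluated downstream-first, so an element needs one
        // cycle per hop through a pipe.
        if (cycle >= write_start && !pipe_out.empty()) {
            run.c.push_back(pipe_out.front());
            pipe_out.pop();
        }
        if (cycle >= add_start && added < n) {
            if (!pipe_in.empty()) {
                const std::pair<int, int> ab = pipe_in.front();
                pipe_in.pop();
                pipe_out.push(ab.first + ab.second);
                ++added;
            } else {
                ++run.adder_stall_cycles;  // started, but nothing to pop yet
            }
        }
        if (cycle >= read_start && read_idx < n) {
            pipe_in.push(std::make_pair(a[read_idx], b[read_idx]));
            ++read_idx;
        }
        ++cycle;
    }
    run.total_cycles = cycle;
    return run;
}
```

Calling it with a later `read_start` than `add_start` reproduces the pattern in the question: the adder is running but spends its first cycles stalled on an empty pipe, and the stall count grows with the gap between the two start times.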

 

This is the timing diagram after hardware emulation. My questions are:

Why does read_data_kernel start after a delay, so that add_data_kernel has to stall while it waits for data coming from read_data_kernel through the pipe?

Shouldn't all three kernels start at the same time?

Is there any way to reduce this delay?

 

The code is also attached.

 

Thanks

Mohammad

 

add_timing.png

1 Solution

Accepted Solutions
Xilinx Employee
Registered: ‎11-28-2007

Re: kernel start time in out-of-order command queue

When running on the actual hardware, the kernel start-up time is about 30 µs, which unfortunately can't be hidden. In your case, it's about 1% of your computation time, which shouldn't be that much of an overhead.

Cheers,
Jim
6 Replies
Xilinx Employee
Registered: ‎07-18-2014

Re: kernel start time in out-of-order command queue

I have also observed a similar issue in the past. I suspect the delay you are seeing before the kernels start is due to kernel initialization (the host configuring the kernel registers before starting each kernel). If so, it is only an initial delay and should become negligible when you run with a large data count.

Change your DATA_LENGTH to something larger (for example 65536) and check the overall timeline.
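The amortization argument can be put into numbers with a small C++ sketch. The 30 µs start-up figure is the one quoted in the accepted solution; the linear cost model and any per-element cost you plug in are assumptions for illustration only, not measurements from this design, and both function names are made up.

```cpp
#include <cassert>
#include <cstddef>

// Fraction of the total run time that is spent on a fixed start-up delay.
double startup_overhead_fraction(double startup_us, double compute_us) {
    assert(startup_us >= 0.0 && compute_us >= 0.0);
    assert(startup_us + compute_us > 0.0);
    return startup_us / (startup_us + compute_us);
}

// Hypothetical linear cost model: compute time grows with the element count.
// us_per_element is a placeholder you would have to measure yourself.
double compute_time_us(std::size_t data_length, double us_per_element) {
    assert(us_per_element >= 0.0);
    return static_cast<double>(data_length) * us_per_element;
}
```

For example, `startup_overhead_fraction(30.0, compute_time_us(65536, per_element))` shrinks as the data length grows, because the start-up term stays fixed while the compute term scales. The same formula also shows why the delay matters when the compute time is short relative to it.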

 

 

Xilinx Employee
Registered: ‎07-18-2014

Re: kernel start time in out-of-order command queue

You can also refer to the Xilinx OpenCL Pipe Memory example:
https://github.com/Xilinx/SDAccel_Examples/tree/master/getting_started/dataflow/dataflow_pipes_ocl
Its timeline will show you that all three kernels run concurrently.
Adventurer
Registered: ‎12-16-2013

Re: kernel start time in out-of-order command queue

Hi @heeran

 

The Xilinx pipe example shows the same behaviour (maybe it's not an issue).

This is the timing diagram from HW emulation on the KU115.

The adder stage starts first; after a delay the input stage starts, and after a further delay the output stage starts.

 

Thanks

Mohammad

xilinx_pipe_example.png

Xilinx Employee
Registered: ‎07-18-2014

Re: kernel start time in out-of-order command queue

Hi @mohava,
Yes, this is not an issue. This is the kernel initialization delay.
Adventurer
Registered: ‎12-16-2013

Re: kernel start time in out-of-order command queue

Hi @heeran

 

I think you are right and the delay is all about kernel initialisation.

As far as I can tell, it depends on the number of kernel arguments and their types. Global pointers seem to add the most delay.

This delay is sometimes not negligible. I spent some time implementing an efficient histogram algorithm on the FPGA, and it now takes less than 4 ms to process 67 MB, not counting this delay. But the delay is about 15 ms, which is a huge overhead.

Now my questions are:

Which parameters affect this delay?

Is there any way to reduce this delay, or to hide it behind other computations?

Interestingly, kernel profiling (using CL_QUEUE_PROFILING_ENABLE) does not report this delay. It says the kernel execution time is 4 ms (in my histogram design), while it actually takes about 20 ms. (If there are two kernels, the profiler reports the high execution time for the other one, which does the computation, and not for the one that receives the data.)
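One way to see where the unreported time goes is to look at all four event timestamps rather than only START and END. OpenCL event profiling defines CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, each read with clGetEventProfilingInfo. The C++ sketch below only does the arithmetic on those values; the struct and function names are hypothetical, and whether the delay discussed here actually shows up between SUBMIT and START on this runtime is an assumption that would need checking against a wall-clock measurement.

```cpp
#include <cassert>
#include <cstdint>

// The four per-command timestamps from OpenCL event profiling, in nanoseconds.
// In real host code each field would be filled by clGetEventProfilingInfo.
struct EventTimestamps {
    std::uint64_t queued;
    std::uint64_t submit;
    std::uint64_t start;
    std::uint64_t end;
};

struct LatencyBreakdown {
    std::uint64_t queue_to_submit_ns;  // time spent waiting in the host queue
    std::uint64_t submit_to_start_ns;  // time between submission and execution start
    std::uint64_t execution_ns;        // END - START, usually quoted as "kernel time"
    std::uint64_t total_ns;            // END - QUEUED
};

// Splits the lifetime of one enqueued command into its three phases.
LatencyBreakdown breakdown(const EventTimestamps& t) {
    assert(t.queued <= t.submit && t.submit <= t.start && t.start <= t.end);
    LatencyBreakdown b;
    b.queue_to_submit_ns = t.submit - t.queued;
    b.submit_to_start_ns = t.start - t.submit;
    b.execution_ns = t.end - t.start;
    b.total_ns = t.end - t.queued;
    return b;
}
```

If only `execution_ns` is printed, a long gap before START stays invisible, which would be consistent with a profile that says 4 ms while the run takes about 20 ms end to end.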

 

 

Thanks

Mohammad
