UPGRADE YOUR BROWSER

We have detected your current browser version is not the latest one. Xilinx.com uses the latest web technologies to bring you the best online experience possible. Please upgrade to a Xilinx.com supported browser:Chrome, Firefox, Internet Explorer 11, Safari. Thank you!

cancel
Showing results for 
Search instead for 
Did you mean: 
Participant twoism_
Participant
5,814 Views
Registered: ‎12-30-2015

OpenCL kernel with async_work_group_copy()

Jump to solution


Hello,
I'm trying to synthesize an OpenCL kernel for matrix multiplication. The kernel works, but if I change the memory copy loop:

__kernel void __attribute__ ((reqd_work_group_size(WG_SIZE_0, 1, 1)))
mmul(__global const data_t *ma, __global const data_t *mb, __global data_t *mc) {

	// Input and output rows buffer
	__local data_t a_row[WG_SIZE_0];
	__local data_t c_row[WG_SIZE_0];

	// Matrix A row index
	const int i = get_global_id(0);

	// Copy row i (global_id index) form matrix A
	for (int j = 0; j < WG_SIZE_0; ++j) {
		a_row[j] = ma[i * WG_SIZE_0 + j];
	}


with:

	async_work_group_copy(a_row, ma + i * WG_SIZE_0, WG_SIZE_0, 0);
	barrier(CLK_LOCAL_MEM_FENCE);


The design becomes un-synthesizable, with the following errors:

@E [SYNCHK-11] mmul_OpenCL/srcs/mmul.cl:5: in function 'mmul': Variable '.01.reg2mem.priv.i.i' has an unsynthesizable type '[8 x i32 addrspace(1)*]*' (possible cause(s): pointer to pointer or global pointer).
@E [SYNCHK-22] mmul_OpenCL/srcs/mmul.cl:5: in function 'mmul': memory copy is not supported unless used on bus interface possible cause(s): non-static/non-constant local array with initialization).
@I [SYNCHK-10] 2 error(s), 0 warning(s).
@E [HLS-70] Synthesizability check failed.

Any ideas why this isn't working?


 

0 Kudos
1 Solution

Accepted Solutions
Xilinx Employee
Xilinx Employee
10,581 Views
Registered: ‎11-28-2007

Re: OpenCL kernel with async_work_group_copy()

Jump to solution
const int i = get_global_id(0);
async_work_group_copy(a_row, ma + i * WG_SIZE_0, WG_SIZE_0, 0);

The way async_work_group_copy is used in your code violates the OpenCL Spec. Below statement is copied from OpenCL 1.2 spec (page 278 https://www.khronos.org/registry/cl/specs/opencl-1.2.pdf ). In your code, each work-item passes a new source pointer values to async_work_group_copy fucntion.

 
async copy is performed by all work-items in a workgroup and this built-in function must therefore be encountered by all work-items in a work-group executing the kernel with the same argument values; otherwise the results are undefined.
Cheers,
Jim
0 Kudos
4 Replies
Highlighted
Xilinx Employee
Xilinx Employee
5,784 Views
Registered: ‎11-28-2007

Re: OpenCL kernel with async_work_group_copy()

Jump to solution

What is the actual data type of data_t? Can you please attach the complete code so we can try it out?

 


@twoism_ wrote:


Hello,
I'm trying to synthesize an OpenCL kernel for matrix multiplication. The kernel works, but if I change the memory copy loop:

__kernel void __attribute__ ((reqd_work_group_size(WG_SIZE_0, 1, 1)))
mmul(__global const data_t *ma, __global const data_t *mb, __global data_t *mc) {

	// Input and output rows buffer
	__local data_t a_row[WG_SIZE_0];
	__local data_t c_row[WG_SIZE_0];

	// Matrix A row index
	const int i = get_global_id(0);

	// Copy row i (global_id index) form matrix A
	for (int j = 0; j < WG_SIZE_0; ++j) {
		a_row[j] = ma[i * WG_SIZE_0 + j];
	}


with:

	async_work_group_copy(a_row, ma + i * WG_SIZE_0, WG_SIZE_0, 0);
	barrier(CLK_LOCAL_MEM_FENCE);


The design becomes un-synthesizable, with the following errors:

@E [SYNCHK-11] mmul_OpenCL/srcs/mmul.cl:5: in function 'mmul': Variable '.01.reg2mem.priv.i.i' has an unsynthesizable type '[8 x i32 addrspace(1)*]*' (possible cause(s): pointer to pointer or global pointer).
@E [SYNCHK-22] mmul_OpenCL/srcs/mmul.cl:5: in function 'mmul': memory copy is not supported unless used on bus interface possible cause(s): non-static/non-constant local array with initialization).
@I [SYNCHK-10] 2 error(s), 0 warning(s).
@E [HLS-70] Synthesizability check failed.

Any ideas why this isn't working?


 


 

Cheers,
Jim
0 Kudos
Participant twoism_
Participant
5,774 Views
Registered: ‎12-30-2015

Re: OpenCL kernel with async_work_group_copy()

Jump to solution
#define WG_SIZE_0 8
typedef int data_t;

0 Kudos
Participant twoism_
Participant
5,770 Views
Registered: ‎12-30-2015

Re: OpenCL kernel with async_work_group_copy()

Jump to solution


The problem arises when I substitute the memory copy loop with the async_work_group_copy call. The rest of the code is a plain matrix multiplier. You can replicate (and isolate) the error by synthesizing with the following snippet:

mmul.h

#ifndef MMUL_H
#define MMUL_H

#define WG_SIZE_0 8
typedef int data_t;

#endif



mmul.cl

#include <clc.h>
#include "mmul.h"

__kernel void __attribute__ ((reqd_work_group_size(WG_SIZE_0, 1, 1)))
mmul(__global const data_t *ma, __global const data_t *mb, __global data_t *mc) {

	// Input and output rows buffer
	__local data_t a_row[WG_SIZE_0];
	__local data_t c_row[WG_SIZE_0];

	// Matrix A row index
	const int i = get_global_id(0);

	// Unused
	//data_t j = get_global_id(1);
	
	// Copy row i (global_id index) form matrix A
//	for (int j = 0; j < WG_SIZE_0; ++j) {
//		a_row[j] = ma[i * WG_SIZE_0 + j];
//	}

	async_work_group_copy(a_row, ma + i * WG_SIZE_0, WG_SIZE_0, 0);
	// ensure all values have been read before begin the next stage
	barrier(CLK_LOCAL_MEM_FENCE);

}


Errors:

@I [HLS-10] Checking synthesizability ...
@E [SYNCHK-11] mmul_compute_unit:62: in function '__AESL_work_groupA': Variable '.0.reg2mem.priv.i.i' has an unsynthesizable type '[8 x i32 addrspace(1)*]*' (possible cause(s): pointer to pointer or global pointer).
@W [SYNCHK-77] The top function '__AESL_work_groupA' (mmul_compute_unit:62) has no outputs. Possible cause(s) are: (1) Output parameters are passed by value; (2) intended outputs (parameters or global variables) are never written; (3) there are infinite loops.
@I [SYNCHK-10] 1 error(s), 1 warning(s).
@E [HLS-70] Synthesizability check failed.


Thanks

 

0 Kudos
Xilinx Employee
Xilinx Employee
10,582 Views
Registered: ‎11-28-2007

Re: OpenCL kernel with async_work_group_copy()

Jump to solution
const int i = get_global_id(0);
async_work_group_copy(a_row, ma + i * WG_SIZE_0, WG_SIZE_0, 0);

The way async_work_group_copy is used in your code violates the OpenCL Spec. Below statement is copied from OpenCL 1.2 spec (page 278 https://www.khronos.org/registry/cl/specs/opencl-1.2.pdf ). In your code, each work-item passes a new source pointer values to async_work_group_copy fucntion.

 
async copy is performed by all work-items in a workgroup and this built-in function must therefore be encountered by all work-items in a work-group executing the kernel with the same argument values; otherwise the results are undefined.
Cheers,
Jim
0 Kudos