08-01-2018 06:23 PM
The code fragment below shows how we enqueue a task to the kernel to perform a simple calculation on a mapped buffer. When this fragment runs in an ordinary function on a single thread, it works fine. However, when we put the same code in a thread function and run multiple threads of that function in parallel, we see data corruption.
    Buffer buf (ctx, CL_MEM_READ_WRITE, bufsize);
    K.setArg(0, buf);
    int * ary = (int*)Q.enqueueMapBuffer(buf, CL_TRUE, CL_MAP_WRITE, 0, bufsize);
    // populate ary
    Q.enqueueUnmapMemObject(buf, ary);
    Q.enqueueTask(K);
    ary = (int*)Q.enqueueMapBuffer(buf, CL_TRUE, CL_MAP_READ, 0, bufsize);
    // consume ary
    Q.enqueueUnmapMemObject(buf, ary);
Is this code fragment thread safe? What is the proper way to enqueue tasks in parallel from a multi-threaded program?
08-01-2018 09:58 PM
Hi @kwan.huen ,
I am assuming that you want to call enqueueTask on the same kernel from multiple threads.
In your makefile, define a macro that controls the number of threads/compute units (CUs) being launched.
The host thread count should equal the compute unit count. This means your kernel code and Makefile must be configured accordingly.
Please have a look at this example,
1. The command queue must be created with out-of-order execution enabled (CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE).
2. Your thread count equals the compute unit count, so instantiate that many kernel objects and synchronize using the kernel/read/write events.
3. When launching multiple threads, have each host thread use its own kernel object, selected by the thread's index.
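To illustrate step 3, here is a minimal sketch of the "one kernel object per host thread" pattern. As far as I understand the OpenCL specification, clSetKernelArg is not safe to call concurrently on the same kernel object, so a shared K whose argument is set by one thread can be overwritten by another thread before enqueueTask runs. So that it compiles without an OpenCL runtime, the sketch uses FakeKernel, a hypothetical stand-in for cl::Kernel, and a plain std::vector in place of the mapped buffer. The names FakeKernel, worker and run_per_thread_kernels are my own and do not come from any library. In real host code each kernels[tid] would be a separate cl::Kernel created from the same program.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for cl::Kernel (NOT the OpenCL API): setArg() stores
// a pointer inside the object and run() uses whatever pointer is stored at
// that moment, which is why sharing one object between threads is unsafe.
struct FakeKernel {
    std::vector<int>* arg0 = nullptr;
    void setArg(std::vector<int>* buf) { arg0 = buf; }
    void run() { for (int& v : *arg0) v *= 2; }  // the "simple calculation"
};

// Each host thread touches only kernels[tid] and buffers[tid].
void worker(std::size_t tid,
            std::vector<FakeKernel>& kernels,
            std::vector<std::vector<int>>& buffers) {
    std::vector<int>& buf = buffers[tid];
    for (std::size_t i = 0; i < buf.size(); ++i) {
        buf[i] = static_cast<int>(tid * 1000 + i);  // populate
    }
    kernels[tid].setArg(&buf);  // private kernel object, so no race on its argument
    kernels[tid].run();         // stands in for enqueueTask plus waiting on its event
}

// Launches num_threads workers, one kernel object and one buffer per thread,
// and returns the buffers after all threads have joined.
std::vector<std::vector<int>> run_per_thread_kernels(std::size_t num_threads,
                                                     std::size_t n) {
    assert(num_threads > 0);
    std::vector<FakeKernel> kernels(num_threads);
    std::vector<std::vector<int>> buffers(num_threads, std::vector<int>(n));
    std::vector<std::thread> threads;
    for (std::size_t tid = 0; tid < num_threads; ++tid) {
        threads.emplace_back(worker, tid, std::ref(kernels), std::ref(buffers));
    }
    for (std::thread& t : threads) t.join();
    return buffers;
}
```

Calling run_per_thread_kernels(4, 8) runs four threads, each doubling the values in its own 8-element buffer. Because no thread reads or writes another thread's kernel object or buffer, no locking is needed in this sketch. If you must share a single kernel object instead, the setArg and enqueue calls would have to be guarded together by a mutex.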
I hope this helps.