Visitor skotti
Registered: ‎06-18-2018

Do reads into local arrays inside another loop always produce read requests?

Hello!

I have a kernel in which I copy an array into local memory because the following loop iterates over it.

However, this copy produces long read requests, which do not occur when, for example, I read array B at the very beginning of the function. What is the reason for this?

__kernel void kernel0(__global double *A, __global double *B, __global double *tmp, double alpha)
{
    int b0 = get_group_id(0), b1 = get_group_id(1);
    int t0 = get_local_id(0), t1 = get_local_id(1);

    local double local_A[210] __attribute__((xcl_array_partition(complete,1)));
    local double local_B[39900] __attribute__((xcl_array_partition(block,210,1)));

    __attribute__((xcl_pipeline_loop))
    for (int c0 = 0; c0 < 39900; c0++) {
        local_B[c0] = B[c0];
    }

    for (int c0 = 0; c0 <= 179; c0 += 1)
        __attribute__((xcl_pipeline_loop))
        for (int c1 = 0; c1 <= 189; c1 += 1) {
            double result = 0.0;
            for (int c3 = 0; c3 <= 209; c3 += 1) {
                local_A[c3] = A[c0 * 210 + c3];
            }
            for (int c2 = 0; c2 <= 209; c2 += 1) {
                result += (((alpha) * local_A[c2]) * local_B[c2 * 190 + c1]);
            }
            tmp[c0 * 190 + c1] = result;
            barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
        }
}

Accepted Solutions
Xilinx Employee
Registered: ‎06-17-2008

Re: Do reads into local arrays inside another loop always produce read requests?

Hi @skotti,

I am not quite sure I understand your question. Are you saying that the read into local_B works fine, but the read into local_A produces long read requests?

Copying into a local buffer inside a nested loop is supported. For an example, see: https://github.com/Xilinx/SDAccel_Examples/blob/master/getting_started/kernel_to_gmem/burst_rw_ocl/src/vadd.cl

Could you elaborate on what you mean by a 'long read request'?

Visitor skotti
Registered: ‎06-18-2018

Re: Do reads into local arrays inside another loop always produce read requests?

I mean that when reading into the local array for A (local_A), I see a long readreq command, up to 50 cycles or so, before each value is read from DDR memory.

Can I somehow get rid of this latency?

Xilinx Employee
Registered: ‎06-17-2008

Re: Do reads into local arrays inside another loop always produce read requests?

Accessing DDR memory always incurs some latency, and I do not think 50 cycles is especially large. The latency depends on the memory access pattern, the memory controller settings, and so on. Rather than trying to reduce that latency, I think the better approach is to hide it, for example by using burst transfers or on-chip pipes. You may refer to UG1207 or the GitHub examples for more details.
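
As a rough illustration of the burst-transfer suggestion, here is a minimal sketch of how the kernel from the question might be restructured. It rests on several assumptions that are not confirmed anywhere in this thread: that the kernel runs as a single work-item (so the barrier is dropped), that the row of A can be copied once per c0 iteration because it does not depend on c1, and that the tool infers a burst from a separate pipelined copy loop with sequential addresses. The name kernel0_hoisted, the PIPELINE_LOOP macro, the NI/NJ/NK constants and the host shim at the top are hypothetical and only there for illustration; the array partitioning attributes from the original are left out for brevity.

```c
/* Hypothetical shim: lets the sketch build as plain C for a functional check.
   In OpenCL C, __OPENCL_VERSION__ is predefined and the shim is skipped. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#define __local
#define PIPELINE_LOOP
#else
#define PIPELINE_LOOP __attribute__((xcl_pipeline_loop))
#endif

#define NI 180 /* rows of A and tmp (c0 range in the original)     */
#define NJ 190 /* columns of B and tmp (c1 range in the original)  */
#define NK 210 /* shared dimension (c2/c3 range in the original)   */

__kernel void kernel0_hoisted(__global const double *A,
                              __global const double *B,
                              __global double *tmp,
                              double alpha)
{
    __local double local_A[NK];
    __local double local_B[NK * NJ];

    /* Copy all of B once, as in the original kernel. */
    PIPELINE_LOOP
    for (int i = 0; i < NK * NJ; i++) {
        local_B[i] = B[i];
    }

    for (int c0 = 0; c0 < NI; c0++) {
        /* Row c0 of A is the same for every c1, so copy it once per c0
           in its own pipelined loop with sequential addresses, instead of
           re-reading it inside the pipelined c1 loop. */
        PIPELINE_LOOP
        for (int c3 = 0; c3 < NK; c3++) {
            local_A[c3] = A[c0 * NK + c3];
        }

        /* The compute loop now touches only local memory and tmp. */
        PIPELINE_LOOP
        for (int c1 = 0; c1 < NJ; c1++) {
            double result = 0.0;
            for (int c2 = 0; c2 < NK; c2++) {
                result += alpha * local_A[c2] * local_B[c2 * NJ + c1];
            }
            tmp[c0 * NJ + c1] = result;
        }
    }
}
```

Whether a burst is actually inferred for the local_A copy is something to confirm in the tool's reports. The intent is only that DDR is read in one sequential pass per row rather than from inside the pipelined compute loop.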
