08-04-2016 07:20 AM
Hi, I have a problem. This is my code:
#pragma SDS data copy(src_2[0:12*12*20])
#pragma SDS data mem_attribute(src_2:NON_CACHEABLE)
#pragma SDS data copy(convolution_filer_2[0:5*5*20*50])
#pragma SDS data mem_attribute(convolution_filer_2:NON_CACHEABLE)
#pragma SDS data copy(dst_2[0:50*8*8])
void Convolution_Layer_2(float src_2[12*12*20], float convolution_filer_2[5*5*20*50], float dst_2[50*8*8]);
The error message is:
Description Resource Path Location Type
Function "Convolution_Layer_2" argument "convolution_filer_2" is mapped to RAM interface, but it's size is bigger than 16384. Please specify #pragma SDS data zero_copy(convolution_filer_2) or #pragma SDS data access_pattern(convolution_filer_2:SEQUENTIAL) work C/C++ Problem
I think the problem is a limit on the transfer size.
Is there any way to solve this, such as setting a maximum transfer size?
08-04-2016 09:23 AM - edited 08-04-2016 09:25 AM
Please check out the SDSoC User guide page 93 which presents the maximum BRAM depth as 16384.
FPGAs have limited on-chip BRAM storage, so the maximum array size for a function argument is limited to 16K (16384 elements) in SDSoC. This limitation applies only to arrays on the interface of the function. Arrays stored as BRAMs internal to your function can be of any size (just beware that a very large array might not fit on the device).
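To see why only one of the three arguments triggers the error, the element counts from the declaration in the question can be compared against the 16384 limit. This is a minimal sketch; the enum names and the `fits_ram_interface` helper are made up for illustration and are not part of SDSoC.

```c
/* Element counts taken from the array bounds in the question. */
enum {
    SRC_ELEMS    = 12 * 12 * 20,      /* 2880  */
    FILTER_ELEMS = 5 * 5 * 20 * 50,   /* 25000 */
    DST_ELEMS    = 50 * 8 * 8,        /* 3200  */
    MAX_RAM_INTERFACE_ELEMS = 16384   /* limit quoted in the error message */
};

/* Hypothetical helper: returns 1 if an argument with this many elements
   stays within the RAM interface limit, 0 otherwise. */
static int fits_ram_interface(int elems)
{
    return elems <= MAX_RAM_INTERFACE_ELEMS;
}
```

Only `convolution_filer_2` (25000 elements) exceeds the limit; `src_2` and `dst_2` are well below it, which matches the single argument named in the error.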
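You have two options to work around this limitation:
1. Add #pragma SDS data access_pattern(convolution_filer_2:SEQUENTIAL), so the array is streamed into the function instead of being mapped to a RAM interface. The function must then access the elements in sequential order.
2. Add #pragma SDS data zero_copy(convolution_filer_2), so the function accesses the array in PS memory directly through an AXI master interface.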
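Applied to the declaration from the question, the two pragmas named in the error message might look roughly like this. It is a sketch rather than a tested configuration: it assumes the pragmas for src_2 and dst_2 stay as they are, and that only one of the two variants is used for the filter at a time.

```c
/* Variant A: keep the filter's copy pragma and declare sequential access.
   As I understand it, the hardware function must then read
   convolution_filer_2[0], [1], [2], ... in index order, each element once. */
#pragma SDS data copy(convolution_filer_2[0:5*5*20*50])
#pragma SDS data access_pattern(convolution_filer_2:SEQUENTIAL)

/* Variant B (instead of A): replace the filter's copy pragma with zero_copy,
   so the function reads the array in place in PS memory.
#pragma SDS data zero_copy(convolution_filer_2[0:5*5*20*50])
*/

void Convolution_Layer_2(float src_2[12*12*20], float convolution_filer_2[5*5*20*50], float dst_2[50*8*8]);
```

With zero_copy, the SDSoC documentation expects the buffer to be physically contiguous, which normally means allocating it with sds_alloc rather than malloc. An ordinary C compiler ignores both pragmas, so the declaration still builds outside SDSoC.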
Be warned that if you choose #2, your access pattern dictates the achievable bandwidth of your system. The AXI protocol handshake takes at least 4 cycles per read/write operation, in addition to the time the PS memory controller takes to respond with the data for reads. You may end up with 20+ cycles of latency on each read operation. So be sure to pipeline the loop that accesses this argument, and make the accesses sequential, in order to get burst reads/writes.
If you are able to pipeline the read/write loop on this array and the accesses are sequential and incrementing, Vivado HLS will automatically produce a design that emits burst reads/writes, which is just as fast as a DMA engine.
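One common way to get such a loop is to copy the whole filter into a local array first, in strict index order, and run the convolution from the local copy. The sketch below assumes this approach; `load_filter`, `FILTER_LEN` and the parameter names are hypothetical and not taken from the original code.

```c
#define FILTER_LEN (5 * 5 * 20 * 50)   /* 25000 elements, as in the question */

/* Hypothetical helper: reads the filter from external memory one element per
   iteration, with the index increasing by one each time, so that the tool
   can turn the loop into a burst read. */
static void load_filter(const float *filter_ext, float filter_local[FILTER_LEN])
{
    int i;
load_loop:
    for (i = 0; i < FILTER_LEN; i++) {
#pragma HLS PIPELINE II=1
        filter_local[i] = filter_ext[i];
    }
}
```

Inside Convolution_Layer_2, this would be called once at the top with a local `float filter_local[FILTER_LEN]` array, and the convolution loops would then index `filter_local` in whatever order they need. The local array becomes internal BRAM, so it is not subject to the 16384-element interface limit, but it still has to fit on the device.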