I am trying to implement xfopencv applications. However, I am getting a huge overhead from the copyTo and copyFrom functions, which I use to transfer data between cv::Mat and xf::Mat. When I measure only the hardware function, the timing is quite good. But when I move my hw_ctr.start as shown below, the measurement also includes the copyTo call, and the result is disastrous: for the crop example, I get a delay of 930 ms. Am I doing something wrong? Accelerating the hardware function is pointless with this copyTo/copyFrom overhead. I can also see that the generated block design does not use any memory; I cannot find any RAM-related IP.
hw_ctr.start(); // timer now starts before the copy
imgInput.copyTo(in_img.data); // so the measurement also includes copyTo
crop_accel(imgInput, imgOutput, roi); // call the hardware function
hw_ctr.stop();
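One way to see how much of the 930 ms belongs to the copy and how much to the accelerator is to time the two stages separately. The following is only a minimal sketch under stated assumptions: it uses std::chrono instead of hw_ctr, and `time_ms`, `StageTimes`, and `time_stages` are hypothetical names. The two lambdas are plain CPU stand-ins for copyTo and crop_accel, not the xfopencv implementations.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: runs `stage` once and returns the elapsed time in ms.
template <typename F>
double time_ms(F&& stage) {
    const auto t0 = std::chrono::steady_clock::now();
    stage();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Elapsed time of each stage, reported separately.
struct StageTimes {
    double copy_ms;
    double kernel_ms;
};

// Times a stand-in copy and a stand-in kernel with two separate measurements.
// `dst` must already be at least as large as `src`.
StageTimes time_stages(const std::vector<std::uint8_t>& src,
                       std::vector<std::uint8_t>& dst) {
    assert(dst.size() >= src.size());
    StageTimes t{};

    // Stage 1: stand-in for the host-side copy (where copyTo would go).
    t.copy_ms = time_ms([&] {
        for (std::size_t i = 0; i < src.size(); ++i) {
            dst[i] = src[i];
        }
    });

    // Stage 2: stand-in for the accelerated call (where crop_accel would go).
    // Here it simply inverts every byte on the CPU.
    t.kernel_ms = time_ms([&] {
        for (std::size_t i = 0; i < src.size(); ++i) {
            dst[i] = static_cast<std::uint8_t>(255 - dst[i]);
        }
    });

    return t;
}
```

Replacing the two lambda bodies with the real copyTo and crop_accel calls would give one number per stage instead of a single combined figure.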
Is there no answer? Are all these applications pointless? The complexity of this copyTo function is O(n^4): it runs a lot of nested iterations, each with multiplications and additions. At a resolution of 1920x1080, for example, the iteration count is enormous. The implementation produces a correct result, but what use is it if the code takes 1 second to execute?
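To put a number on the iteration count at 1920x1080, here is a minimal sketch. It assumes the copy visits every element once through nested loops over rows, columns, and channels, with a multiply and add to compute each index; the actual xfopencv copyTo may be structured differently. `element_count`, `copy_per_element`, and `copy_bulk` are hypothetical names, and the bulk variant is only valid when both buffers are contiguous and share the same layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Number of element copies a per-element loop performs for one frame.
constexpr std::size_t element_count(std::size_t rows, std::size_t cols,
                                    std::size_t channels) {
    return rows * cols * channels;
}

// Per-element copy: every iteration recomputes the index with
// multiplications and additions before moving a single byte.
void copy_per_element(const std::uint8_t* src, std::uint8_t* dst,
                      std::size_t rows, std::size_t cols,
                      std::size_t channels) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            for (std::size_t ch = 0; ch < channels; ++ch) {
                const std::size_t idx = (r * cols + c) * channels + ch;
                dst[idx] = src[idx];
            }
        }
    }
}

// Bulk copy: one memcpy over the whole frame. Assumes both buffers are
// contiguous, non-overlapping, and use the same element layout.
void copy_bulk(const std::uint8_t* src, std::uint8_t* dst,
               std::size_t rows, std::size_t cols, std::size_t channels) {
    assert(src != nullptr && dst != nullptr);
    std::memcpy(dst, src, element_count(rows, cols, channels));
}
```

Under these assumptions a 1920x1080 frame needs 2,073,600 loop iterations with one channel and 6,220,800 with three, so any per-iteration cost is multiplied by millions on every frame.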