We are using the SDSoC environment with a ZedBoard to build a classifier that uses FFT features. The image to be transformed is 32*32. We would like to evaluate the Xilinx xfft IP for the transformation. When using this IP for acceleration, I noticed a limitation: the array moved to the xfft IP must be the same size as the FFT transform length. Our image has 1024 values, but we only do 32-point FFTs. I have read in several forum entries and papers that, to do a 2D FFT with the Xilinx 1D xfft IP, we would need to implement a VHDL controller that buffers the image and stores intermediate results.
Designing a controller like that is new to me. Could you point me to resources covering the basics, and to tutorials for this kind of controller design? I'm new to VHDL as well. How long do you think it would take me to implement such a controller successfully? I have only very basic VHDL design knowledge; so far I have just designed a few rudimentary adders.
I also noticed that transferring 32 floats to the xfft IP takes unexpectedly long. We have compiled the FFTW library on the ARM chip, and it is 100 to 1000 times faster than using the xfft IP. We use the xfft IP as a C-callable IP and call it 64 times for the 32*32 image (32 row transforms plus 32 column transforms) to do a full FFT transformation. Is the synchronization overhead overshadowing the fast computation of the 32-point xfft IP? What else can we do to improve the execution time of the xfft IP for 32*32 images?
Xilinx has a 2D FFT demo, built with SysGen, which may help you. Go to your Xilinx Vivado installation directory. For example, if you have installed Vivado 2018.2 on C:\, go to C:\Xilinx\Vivado\2018.2\examples\sysgen_demos and find sysgenMRI_2D_FFT.html, sysgenMRI_2D_FFT.slx and sysgenMRI_2D_FFT_PreloadFcn.m.
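As a software reference for what a buffering controller has to orchestrate, here is a minimal Python sketch of the row-column decomposition: 32 row transforms, a transpose through an intermediate buffer, then 32 column transforms. It is only an illustration under stated assumptions. The names fft1d and fft2d are hypothetical and not part of any Xilinx API, and a naive DFT stands in for one call to the xfft IP.

```python
import cmath

N = 32  # transform length, matching the 32*32 image


def fft1d(x):
    """Naive O(n^2) DFT. Stand-in (assumption) for one call to a 1D FFT core."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def fft2d(image):
    """Row-column 2D DFT. Returns (spectrum, number of 1D transforms used)."""
    calls = 0

    # Pass 1: one 1D transform per row.
    row_pass = []
    for row in image:
        row_pass.append(fft1d(row))
        calls += 1

    # Intermediate buffer: transpose so that columns become contiguous rows.
    transposed = [list(col) for col in zip(*row_pass)]

    # Pass 2: one 1D transform per column of the row-pass result.
    col_pass = []
    for col in transposed:
        col_pass.append(fft1d(col))
        calls += 1

    # Transpose back to the original row/column orientation.
    spectrum = [list(r) for r in zip(*col_pass)]
    return spectrum, calls


# Arbitrary test image, not data from the original post.
image = [[float((r * N + c) % 7) for c in range(N)] for r in range(N)]
spectrum, calls = fft2d(image)
print(calls)  # 64
```

The count of 64 matches the number of xfft calls mentioned in the question. In this sketch the whole image and the intermediate result stay in one buffer between the two passes, which is the role the controller would play in hardware.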