11-21-2019 01:42 PM - edited 11-21-2019 01:49 PM
I'm simulating the FFT IP in Vivado to check its results (timing, accuracy) against custom ADC data. I have successfully simulated and verified a 256-point FFT for a single channel. Now I want to simulate 128 channels (that would mean about 11 IP blocks with 12 channels on each). As I understand it, I have two options. I can have 128 files, each with 256 points, and feed them into the channels simultaneously. Or I can cook up an intricate method involving fgets, fseek and sscanf (or maybe more) to read each 256-point "frame" from one file into 128 registers. That would still require me to declare and define 128 registers manually.
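The single-file option can be prototyped offline before any testbench code is written. The following is a minimal sketch, not a tested flow: it assumes the combined file holds whitespace-separated decimal integers with one 256-sample frame per row, that samples fit in 16 bits, and that one hex file per channel is a convenient stimulus format. The function names and the `chNNN.hex` naming are hypothetical.

```python
def split_frames(text, num_channels=128, frame_len=256):
    """Split whitespace-separated integer samples into one list per channel.

    Assumes the samples are stored channel after channel (one frame per row).
    """
    samples = [int(tok) for tok in text.split()]
    expected = num_channels * frame_len
    if len(samples) != expected:
        raise ValueError(f"expected {expected} samples, found {len(samples)}")
    return [samples[ch * frame_len:(ch + 1) * frame_len]
            for ch in range(num_channels)]


def to_hex_lines(frame, width_bits=16):
    """Format samples as two's-complement hex words, one per line."""
    mask = (1 << width_bits) - 1          # assumed sample width
    digits = (width_bits + 3) // 4
    return "\n".join(f"{s & mask:0{digits}x}" for s in frame) + "\n"


def write_channel_files(frames, prefix="ch"):
    """Write one hex file per channel, e.g. ch000.hex (hypothetical naming)."""
    for ch, frame in enumerate(frames):
        with open(f"{prefix}{ch:03d}.hex", "w") as f:
            f.write(to_hex_lines(frame))
```

Files in this one-word-per-line hex form are the kind of input that Verilog's `$readmemh` loads into a memory array, so a loop over an array index might replace 128 hand-declared registers. Whether that fits this particular testbench is an open question.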
Is there an easier way to do it? Or am I using the wrong tool, and should I use another one instead (Vivado HLS or SDK)?
Another question: if I push a single file containing 128 rows of 256 samples, one frame at a time, the simulation window suggests that I have one extra sample in my file, which causes the core to raise event_input_channel_halt after the last sample of the file. If I delete one sample from my file, it runs without raising any error interrupts. I think it's an indexing issue, but the first sample is loaded first and the last sample of the file is loaded last into the frame. I don't have a handle yet on why it behaves this way. Where is the extra sample coming from?
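Two quick checks may help narrow this down. The sketch below first counts the samples in the file against the expected 128 x 256. It then models one possible cause, which the thread does not confirm: a read loop that tests for end-of-file before reading, so that after a trailing newline the final read fails and the previous value is pushed once more. The function names are hypothetical, and the file is assumed to hold whitespace-separated integers.

```python
def sample_surplus(text, num_frames=128, frame_len=256):
    """Return how many samples the file has beyond num_frames * frame_len.

    A positive result means extra samples, a negative one means missing samples.
    """
    return len(text.split()) - num_frames * frame_len


def read_eof_tested_before_read(text):
    """Model of a loop shaped like: while (!eof) { read value; push value; }.

    End-of-file is only detected when a read fails, so the loop body runs one
    extra time and the stale last value is pushed again.
    """
    tokens = text.split()
    pos, eof, value, pushed = 0, False, None, []
    while not eof:
        if pos < len(tokens):
            value = int(tokens[pos])      # successful read
            pos += 1
        else:
            eof = True                    # failed read: value is left unchanged
        pushed.append(value)              # pushed regardless of read success
    return pushed


def read_checking_each_read(text):
    """Model of a loop that pushes a value only after a successful read."""
    return [int(tok) for tok in text.split()]
```

If the file itself has the right count but the core still sees one sample too many, the read loop is the more likely place to look, under the assumption that it resembles the first model.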
Appreciate any and all help and suggestions. Thank you.
11-23-2019 07:13 PM
The FFT IP can be configured for a maximum of 12 channels, and multichannel is available on the burst I/O architectures only.
Multichannel on a single FFT IP is a sequential concept. If a single FFT IP runs at a 120 MHz clock frequency and is set to 12 channels, the input throughput of each channel is 120 / 12 = 10 MHz.
You can also create multiple FFT IPs, each set to one channel only. For example, if you instantiate 2 FFT IPs, each configured with a single channel and running at 120 MHz, the overall input throughput of the 2 FFT IPs is 120 * 2 = 240 MHz.
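The arithmetic in the two cases above can be written out as a small sketch. It assumes, as the reply does, that one IP accepts one input sample per clock cycle; the function names are hypothetical. The last helper also reproduces the block count implied by 128 channels at 12 channels per IP.

```python
import math


def per_channel_rate_mhz(clock_mhz, channels):
    """Input rate seen by each channel when one IP is shared sequentially."""
    return clock_mhz / channels


def aggregate_rate_mhz(clock_mhz, num_ips):
    """Combined input rate of several single-channel IPs running in parallel."""
    return clock_mhz * num_ips


def ips_needed(total_channels, channels_per_ip):
    """Number of IP instances required to cover all channels."""
    return math.ceil(total_channels / channels_per_ip)


print(per_channel_rate_mhz(120, 12))   # 10.0
print(aggregate_rate_mhz(120, 2))      # 240
print(ips_needed(128, 12))             # 11
```

Note that each individual IP still runs at 120 MHz in the second case; the 240 figure is the sum over both instances.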
11-25-2019 06:52 PM - edited 11-26-2019 04:56 PM
I can understand the input frequency being divided into multiple channels but I'm not sure I follow your second comment. If I have 2 FFT blocks with single-channel on both of them, wouldn't my overall input channel throughput be just 120 instead of 240 MHz? I'm talking about when the two blocks are taking data in parallel as shown.
In any case, multichannel is not the way to go for me. I'm trying to implement a 2D FFT on the Zynq UltraScale+ ZCU102 board, and I ran synthesis and implementation for 1 FFT block and for 2 FFT blocks, as shown.
Now, we can see that for 2 blocks the resource numbers increase by a factor of 2. Going by these numbers, I cannot have more than 2 FFT blocks operating simultaneously in my FPGA. Even if I ignore the IO utilization percentages, the DSP utilization percentages won't let me run 256 blocks simultaneously.
The reason I want to instantiate 256 blocks of the FFT IP is that I want to process the incoming data simultaneously and not one frame after the other. That's why I chose multichannel: it would reduce the number of blocks by a huge margin. But going by your comment, the low throughput is a deal-breaker. Processing multiple 256-point frames sequentially would be a lot slower than processing all the 256-point frames at once. Am I missing something? Is there a better way to implement a 2D FFT?
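For reference, a 2D DFT is separable: it can be computed as 1D transforms over every row followed by 1D transforms over every column of the intermediate result. This row-column decomposition means a single 1D engine could, in principle, be reused sequentially with a buffer in between, rather than instantiating one block per row. The sketch below illustrates the decomposition with a naive 1D DFT in plain Python and checks it against the direct 2D definition. It is a numerical illustration only, not a model of the FFT IP. The timing helper assumes one sample per clock, ignores latency and buffering, and uses the 120 MHz figure from the earlier reply purely as an example.

```python
import cmath


def dft_1d(x):
    """Naive O(n^2) 1D DFT, standing in for a single hardware FFT engine."""
    n = len(x)
    return [sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
            for k in range(n)]


def dft_2d_row_column(a):
    """2D DFT computed as row transforms followed by column transforms."""
    rows = [dft_1d(row) for row in a]                  # pass 1: every row
    cols = [dft_1d(list(col)) for col in zip(*rows)]   # pass 2: every column
    return [list(row) for row in zip(*cols)]           # transpose back


def dft_2d_direct(a):
    """2D DFT straight from the definition, used only as a cross-check."""
    m, n = len(a), len(a[0])
    return [[sum(a[r][c] * cmath.exp(-2j * cmath.pi * (u * r / m + v * c / n))
                 for r in range(m) for c in range(n))
             for v in range(n)]
            for u in range(m)]


def sequential_time_us(rows, cols, clock_mhz):
    """Rough input time for one engine to take in all row and column frames.

    Assumes one sample per clock and ignores latency and buffer overhead.
    """
    total_samples = 2 * rows * cols    # row pass plus column pass
    return total_samples / clock_mhz
```

Calling `sequential_time_us(256, 256, 120)` gives a first-order figure for a 256 x 256 input on one reused engine, which can then be weighed against the frame rate of the incoming data.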
12-04-2019 12:36 AM
For the second statement, I just meant that the overall throughput, summed across both FFT IPs, increases to 240 MHz, i.e. 240 million samples per second. I hope this is clear.