I have two FFT cores: one is 64 points and the other is 256 points.
The 64-point FFT core has a latency of 1.108 us.
The 256-point FFT core has a latency of 3.48 us.
I need to start processing the 64-point FFT output at the moment the 256-point core's output starts coming out, so the two outputs have to be aligned in time.
The latency difference is 2.372 us, so the 64-point path would need hundreds of pipeline stages to cover for the latency of the 256-point core.
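
To put a number on "hundreds", here is a rough Python sketch that converts the latency gap into clock cycles. The clock frequency is not stated above, so the 250 MHz used here is purely an assumption for illustration, and `delay_cycles` is a hypothetical helper name.

```python
def delay_cycles(long_latency_us, short_latency_us, clock_mhz):
    """Clock cycles by which the faster path must be delayed.

    Microseconds times MHz gives cycles; round() absorbs floating-point noise.
    """
    return round((long_latency_us - short_latency_us) * clock_mhz)


# ASSUMPTION: 250 MHz clock (not given in the post; substitute the real value).
CLOCK_MHZ = 250

print(delay_cycles(3.48, 1.108, CLOCK_MHZ))  # 593 under the assumed 250 MHz clock
```

With a different clock the count changes proportionally, but it stays in the hundreds for any clock in the usual range, which is why a plain register pipeline looks wasteful.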
Is there a smarter way to cover for the latency difference?
One way to do it without writing a lot of code is to use a FIFO: write the 64-point FFT output into it and start reading after X cycles, where X is the latency difference in clock cycles.
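
The FIFO idea can be checked with a small behavioural model before touching any HDL. This is only a sketch in Python, not synthesizable code: `DelayFifo` and its `clock` method are hypothetical names, and it assumes one sample is written every cycle.

```python
from collections import deque


class DelayFifo:
    """Behavioural model of a FIFO used as a fixed delay of `delay` cycles."""

    def __init__(self, delay):
        self.delay = delay      # X: cycles to wait before the first read
        self.fifo = deque()

    def clock(self, sample):
        """One clock cycle: always write, read only once X samples are queued."""
        self.fifo.append(sample)
        if len(self.fifo) > self.delay:
            return self.fifo.popleft()  # sample written `delay` cycles ago
        return None                     # still filling, nothing valid to read


# Small example with X = 3 instead of the real cycle count.
d = DelayFifo(3)
outputs = [d.clock(i) for i in range(6)]
print(outputs)  # [None, None, None, 0, 1, 2]
```

In this model the FIFO never holds more than X + 1 samples, so the real FIFO only needs to be slightly deeper than the delay. Such a FIFO typically maps onto on-chip memory rather than a long chain of registers.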