Contributor
Registered: 06-20-2019

SDSoc FFT timing estimation

Hello,

I am trying to estimate the execution time of an FFT application on the Ultra96 board. I am following the document attached to this question; the table on page 4 lists previously measured timings. However, my results are considerably worse. I am also using the floating-point implementation, and everything else is the same.

I set the config to 1 for a forward FFT. Since this is not a run-time configurable implementation, I do not set any other value, because the remaining fields do not apply to the floating-point implementation.

I am also comparing xfft against FFTW. I use a perf_counter class to measure timing: it counts clock cycles, and I multiply the count by the clock period.
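The cycles-times-period conversion described above can be sketched as follows. This is a minimal illustration, not the actual perf_counter implementation: the helper name cycles_to_us and its counter_hz parameter are hypothetical. It assumes the cycle count comes from a free-running counter whose frequency is known; that frequency is the counter's own, which is not necessarily the accelerator clock.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert a cycle count to microseconds.
// counter_hz must be the frequency of the counter that produced `cycles`
// (assumption: this is the timer frequency, not the accelerator clock).
double cycles_to_us(std::uint64_t cycles, double counter_hz) {
    assert(counter_hz > 0.0);
    const double period_s = 1.0 / counter_hz;  // seconds per counter tick
    return static_cast<double>(cycles) * period_s * 1e6;
}
```

For example, 1200 cycles of a 1.2 GHz counter correspond to 1 us. Using the wrong frequency here scales every reported time by the same factor, so it is worth checking before comparing against published numbers.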

I measured the time with buffers allocated by both sds_alloc_cacheable and sds_alloc_non_cacheable.

Can you see any mistake? Why can't I reproduce the results in the document? Should I change the configuration? (The NFFT, cyclic_prefix and scaling fields should not matter for my application, because it is a floating-point build that is not run-time configurable; I am using previously compiled libraries.)

 

My code is as follows:

#include <complex>
#include <cstdint>
#include <fftw3.h>
#include "sds_lib.h" // sds_alloc_non_cacheable
// The headers for perf_counter and for the precompiled xfft library
// (config_t, xfft_config, xfft) are also required.

class lib_fft_cl
{
public:
    perf_counter hw_ctr, hw_ctr1; // perf_counter measures elapsed clock cycles

    // Bit 0 of the config word selects the direction: 1 = forward FFT.
    void config_settings(bool &is_it_fw)
    {
        config_t config_val = 0;
        if (is_it_fw == true) {
            config_val = (config_val | (uint16_t)0x0001);
        }
        xfft_config(&config_val);
    }

    // Times one FFTW execution (hw_ctr) and one xfft call (hw_ctr1).
    // Note: out and times are currently unused.
    void run(std::complex<float> *input, std::complex<float> *output,
             std::complex<float> *out, fftwf_plan &m_plan, const int times)
    {
        hw_ctr.start();
        fftwf_execute(m_plan);
        hw_ctr.stop();

        hw_ctr1.start();
        xfft((uint64_t *)input, (uint64_t *)output);
        hw_ctr1.stop();
    }
}; // closes the class (missing in the original post)

int main(void)
{
    const int fft_length = 2048;
    int n_times = 128;
    fftwf_plan plan, plan1;
    bool is_it_fw_fft;
    is_it_fw_fft = true;
    lib_fft_cl lib_fft;

    // Buffers for the hardware FFT
    std::complex<float> *datain1 = (std::complex<float> *)sds_alloc_non_cacheable(sizeof(std::complex<float>) * fft_length);
    std::complex<float> *datain2 = (std::complex<float> *)sds_alloc_non_cacheable(sizeof(std::complex<float>) * fft_length);

    // Buffers for FFTW
    std::complex<float> *in = (std::complex<float> *)fftwf_malloc(sizeof(fftwf_complex) * fft_length);
    std::complex<float> *out = (std::complex<float> *)fftwf_malloc(sizeof(fftwf_complex) * fft_length);

    // Forward FFT
    lib_fft.config_settings(is_it_fw_fft);
    plan = fftwf_plan_dft_1d(fft_length, (fftwf_complex *)(in), (fftwf_complex *)(out), FFTW_FORWARD, FFTW_ESTIMATE);
    lib_fft.run(datain1, datain2, out, plan, n_times);

    // Inverse FFT
    is_it_fw_fft = false;
    lib_fft.config_settings(is_it_fw_fft);
    plan1 = fftwf_plan_dft_1d(fft_length, (fftwf_complex *)(in), (fftwf_complex *)(out), FFTW_BACKWARD, FFTW_ESTIMATE);
    lib_fft.run(datain1, datain2, out, plan1, n_times);

    // free functions are implemented
    return 0;
}
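In the code above, run receives a repetition count but executes and times each transform only once, so a single measurement may include one-off costs such as the first call. As a hedged sketch of one alternative, the hypothetical helper below (mean_time_us is not part of the original code or of any library) times a callable over several repetitions with std::chrono and returns the mean. It uses the host's steady clock rather than perf_counter, purely so that the example is self-contained.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: run fn() `repetitions` times and return the
// mean wall-clock time per call in microseconds.
template <typename Fn>
double mean_time_us(Fn &&fn, int repetitions) {
    assert(repetitions > 0);
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    for (int i = 0; i < repetitions; ++i) {
        fn();  // the operation under test, e.g. one FFT call
    }
    const auto t1 = clock::now();
    const std::chrono::duration<double, std::micro> total = t1 - t0;
    return total.count() / repetitions;
}
```

A call such as mean_time_us([&] { fftwf_execute(plan); }, n_times) would then report the average over n_times executions instead of a single run.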

The results I am getting are shown in the attachment (for FFT lengths of 2048 and 8192).

As you can see, for 8192 points I am getting 333 us, which is too much; we expected something close to 100 us. Why would that be?


Accepted Solutions
Contributor
Registered: 10-25-2017

Re: SDSoc FFT timing estimation

I have built a custom SDSoC platform for the Ultra96 with a 300 MHz accelerator clock and have listed below the execution times I'm seeing for an 8192-point FFT. The numbers line up well with the whitepaper.

Non-blocking XFFT (cached data): 40 us

Blocking XFFT (cached data): 122 us

Blocking XFFT (non-cached data): 112 us

 

Replies

Contributor
Registered: 10-25-2017

Re: SDSoc FFT timing estimation

What is the clock frequency that you are using for the FFT accelerator?

Contributor
Registered: 10-25-2017

Re: SDSoc FFT timing estimation

Are you using this SDSoC platform?  If so, there are only 100 MHz & 200 MHz clocks available, with the 100 MHz clock being the default.  The whitepaper you referenced used a 300 MHz clock. 

If you compiled with the default 100 MHz clock, it would make sense that you are seeing processing times approximately 3x those given in the whitepaper.
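The 3x reasoning can be checked with a quick back-of-the-envelope calculation. The sketch below assumes that the execution time is purely inversely proportional to the accelerator clock, which ignores any fixed software or data-transfer overhead; scale_time_us is a hypothetical helper name.

```cpp
#include <cassert>

// Hypothetical helper: rescale a measured time from one clock frequency to
// another, assuming time is inversely proportional to the clock.
double scale_time_us(double measured_us, double measured_mhz, double target_mhz) {
    assert(measured_mhz > 0.0 && target_mhz > 0.0);
    return measured_us * measured_mhz / target_mhz;
}
```

Under that assumption, the 333 us reported for the 8192-point FFT at 100 MHz would correspond to roughly 111 us at 300 MHz, which is in the range the question was expecting.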

Contributor
Registered: 06-20-2019

Re: SDSoc FFT timing estimation

The document provides timings for 300 MHz, as you said. For my hardware function, I can set the clock frequency to 200 MHz, while the data motion network clock is set to 100 MHz. I cannot set a value higher than 200 MHz (as you stated, I have only two options, 100 and 200 MHz). I am now trying to create a custom platform so that I can increase the frequency (I am not sure whether it is doable) and see if it makes a difference. If I manage, I will let you know the results. However, I want to ask something about this clock. If I find a way to raise it as high as I want, will my FPGA still give correct results? Perhaps I would not see any visible problem, but the hardware would silently compute something wrong in the background, so that it appears to work while the results are actually incorrect. Is that possible?

Just to give some more information: in another project of mine, the total time is 3.2 ms without an accelerator. Interestingly, when I use the accelerator (marking a function as a HW function), I get 22.9 ms with the clock frequency set to 100 MHz. After I changed the clock to 200 MHz, the time dropped from 22.9 ms to 15.5 ms (which is an expected result).
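One way to read the two measurements above is to split the total time into a part that scales with the clock and a part that does not. The sketch below fits a hypothetical two-term model, T(f) = fixed + k / f, to two (time, frequency) points. The model itself is an assumption: two points always fit it exactly, so it is only a rough guide and not a measurement of the real overhead. TimeModel and fit_two_points are illustrative names.

```cpp
#include <cassert>

// Hypothetical two-term model: T(f) = fixed_ms + k_ms_mhz / f_mhz
struct TimeModel {
    double fixed_ms;  // part that does not change with the clock
    double k_ms_mhz;  // clock-dependent part, in ms * MHz

    double predict_ms(double f_mhz) const { return fixed_ms + k_ms_mhz / f_mhz; }
};

// Solve the model exactly from two measurements taken at different clocks.
TimeModel fit_two_points(double t1_ms, double f1_mhz, double t2_ms, double f2_mhz) {
    assert(f1_mhz > 0.0 && f2_mhz > 0.0 && f1_mhz != f2_mhz);
    const double k = (t1_ms - t2_ms) / (1.0 / f1_mhz - 1.0 / f2_mhz);
    return TimeModel{t1_ms - k / f1_mhz, k};
}
```

With 22.9 ms at 100 MHz and 15.5 ms at 200 MHz, this fit gives a clock-independent part of about 8.1 ms, which is already larger than the 3.2 ms software-only time. If the model holds, raising the clock alone would not make the accelerated version faster than the software one.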

Contributor
Registered: 10-25-2017

Re: SDSoc FFT timing estimation

Keep in mind that the C-Callable IP will be clocked at the same rate as the data movement clock. So if you select 100 MHz for the data movement clock, then the FFT accelerator will also be clocked at 100 MHz.
