Explorer

Core Generator - FFT bits

I'm trying to implement an FFT generated with the CORE Generator Fast Fourier Transform v7.1 core: Radix-4 Burst I/O architecture, 4096 points, 16-bit samples.

I read in the manual that the output width is 16 + log2(4096) + 1 = 29 bits, but what I need to know is this:

the datasheet (page 7, section "Bit and Digit Reversal") says that "output is in reversal order".

I don't understand whether this means that the bits are reversed or that the sample output order is reversed. In the first case, with 16-bit input i0-i1-i2-...-i15, the output bits would be o27-o28-o25-o26-o23-o24-o21-o22-o19-o20 and so on, since Radix-4 uses DIGIT REVERSAL (i is an input bit, o is an output bit). In the second case, with input samples s0-s1-s2-s3...s4094-s4095, the output would be os4095-os4094-os4093...os2-os1-os0 (s is a 16-bit input sample, os is a 29-bit output sample). Which one is correct?

Accepted Solutions

Advisor

@kensou wrote:

 

I didn't understand in what order I get xk_re and xk_im, what the order of their bits is, and how to interpret the output index xk_index: do I have to apply the digit reversal to understand the index, or do I just take the binary value directly? For example, if I have xk_index = "000000111111", do I take its own value (63), so that the xk value is the 64th output, or do I apply the digit reversal and take the value as "111111000000" (reversed), so that the respective output is the 4033rd output? Could someone explain the correct use of this module?


Bit/digit reversal only affects the order in which values emerge from the FFT core. The index values are correct and don't need to be reversed. The data sheet explains what reverse order looks like so that you can decide how to deal with such values.

 

Whether or not bit reversal matters will depend entirely on your application. If you don't care about the order that values emerge from the core, you can save resources and latency by using reverse order. If you'd rather they were neatly in order, use natural.

 

In pseudo code, the bit reversal process looks something like:

int natural_samples[4096];

while([current_index, sample_value] = get_fft_output())
  natural_samples[current_index] = sample_value

send_to_output(natural_samples[])

The FFT generates a sample_value for each current_index (for the purpose of the exercise, you could consider the order in which the indices are produced to be random) and these are stuck into an array using that index as a key. Once you've filled in all 4096 values, you can start outputting the data, now in natural order (0, 1, 2, ..., 4095).

 

The FFT algorithm inherently produces its data in reversed order; the natural-order output option just saves you from having to implement the above reordering yourself.
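
For anyone who wants to run the pseudocode above, here is a minimal Python sketch of the same idea. The function name `reorder_to_natural` and the simulated `(index, value)` stream are hypothetical stand-ins for XK_INDEX and XK_RE/XK_IM; the values are placeholders, not real FFT results.

```python
import random

def reorder_to_natural(fft_output, n_points):
    """Store each (index, value) pair at its own index, whatever order it arrives in."""
    natural = [None] * n_points
    for index, value in fft_output:
        natural[index] = value
    return natural

# Simulated core output: indices arrive in an arbitrary order. Each value is a
# made-up placeholder derived from its index so the result is easy to check.
N = 4096
stream = [(k, 10 * k) for k in range(N)]
random.shuffle(stream)

natural_samples = reorder_to_natural(stream, N)
print(natural_samples[:4])
```

The arrival order never enters the function, which is the point: the index that accompanies each sample is all that is needed.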

 

Advisor

It isn't the data bits (XK_RE/XK_IM) that are reversed, but the order of the samples themselves. A 4096 (2^12) point FFT will output 4096 values (with whatever precision you require - e.g. 28 bits, if that's what you've selected).

 

Perhaps the best way to visualise what bit reversal means is to simulate the core. Feed in 4096 words and then watch the FFT start outputting data. In particular, have a look at the XK_INDEX port ("Index of output data"). In natural order mode, it'll count 0, 1, 2, ..., 4095. In bit-reversed order, it'll go 0, 2048, 1024, 3072, etc. (though with Radix-4 it will be different, as per the data sheet).
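
If a simulator isn't at hand, the plain bit-reversed index sequence for a 12-bit index can be previewed in software. This is only a sketch of the index arithmetic (the helper name `bit_reverse` is made up here), it does not model the core, and it does not apply to Radix-4, which uses digit reversal.

```python
def bit_reverse(index, width):
    """Reverse the `width`-bit binary representation of `index`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (index & 1)  # move the current LSB into the result
        index >>= 1
    return result

WIDTH = 12  # log2(4096)
first_indices = [bit_reverse(n, WIDTH) for n in range(8)]
print(first_indices)
```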

 

Explorer

This is the datasheet page about this problem:

 

Bit and Digit Reversal
Each architecture offers the option of natural or reversed ordering of output data, with data being input
in natural order. The FFT algorithm reorders the samples during processing such that data input in natural
order is output in reversed order. The core can optionally output the data in natural order. However,
this imposes a cost on each architecture. For the Burst I/O architectures, this imposes a time
penalty, because unloading the data cannot take place at the same time as loading input data for the
next frame, so separate unload and load phases are required. In the pipelined architecture, it requires
additional RAM storage to perform the reordering.
In the Radix-2, Burst I/O, Radix-2 Lite, Burst I/O, and Pipelined, Streaming I/O architectures, the Bit
Reverse order is simple to calculate, by taking the index of the data point, written in binary, and reversing
the order of the digits. Hence, 0000, 0001, 0010, 0011, 0100,...(0, 1, 2, 3, 4,...) becomes 0000, 1000, 0100,
1100, 0010,...(0, 8, 4, 12, 2,...).
In the case of the Radix-4, Burst I/O architecture, the reversal applies to digits and, therefore, is called
Digit Reversal. A digit in Radix-4 is two bits. Hence, 0000, 0001, 0010, 0011, 0100,...(0, 1, 2, 3, 4,...)
becomes 0000, 0100, 1000, 1100, 0001,...(0, 4, 8, 12, 1,...), as the pairs of digits are reversed. Where the
transform size requires an odd number of index bits, the odd digit in the least significant place is
moved to the most significant place, so 00000, 00001, 00010, 00011, 00100,... (0, 1, 2, 3, 4,...) becomes
00000, 10000, 00100, 10100, 01000,...(0, 16, 4, 20, 8,...)
Note: The core outputs a data point index along with the data, so this section is for information only.
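
The digit-reversal rule in the excerpt above, including the odd-width case, can be checked with a short Python sketch. The function name `digit_reverse` is an assumption made for illustration; the rule itself is taken from the datasheet text quoted above.

```python
def digit_reverse(index, width):
    """Radix-4 digit reversal of a `width`-bit index, per the datasheet excerpt.

    Pairs of bits (radix-4 digits) are reversed; if `width` is odd, the single
    least significant bit is moved to the most significant place.
    """
    bits = format(index, "0{}b".format(width))
    odd_bit = ""
    if width % 2:
        odd_bit, bits = bits[-1], bits[:-1]
    digits = [bits[i:i + 2] for i in range(0, len(bits), 2)]
    return int(odd_bit + "".join(reversed(digits)), 2)

print([digit_reverse(n, 4) for n in range(5)])   # 4-bit example from the datasheet
print([digit_reverse(n, 5) for n in range(5)])   # 5-bit (odd width) example
print([digit_reverse(n, 12) for n in range(5)])  # 12-bit index of a 4096-point transform
print(digit_reverse(0b000000111111, 12))         # the index value 63 discussed in this thread
```

As the note at the end of the excerpt says, none of this has to be implemented to use the core, because the index is output alongside each data point.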

 

My case is Radix-4 Burst I/O.

My FFT has this component declaration (I left out the control signals to keep the post short; they are in my design):


component fft_4096
    port (
    clk: IN std_logic;
    start: IN std_logic;
    xn_re: IN std_logic_VECTOR(15 downto 0);
    xn_im: IN std_logic_VECTOR(15 downto 0);
    xn_index: OUT std_logic_VECTOR(11 downto 0);
    xk_index: OUT std_logic_VECTOR(11 downto 0);
    xk_re: OUT std_logic_VECTOR(28 downto 0);
    xk_im: OUT std_logic_VECTOR(28 downto 0));
end component;

 

So the index is 12 bits wide, the real and imaginary outputs are 29 bits each, and the real and imaginary inputs are 16 bits each.
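
As a quick check of those widths, here is the arithmetic from the first post written out in Python. The function name is made up, and the formula is the one quoted from the manual earlier in the thread (input width + log2(N) + 1); it is assumed here to describe the configuration without scaling.

```python
import math

def output_width(input_width, n_points):
    """Output width as quoted from the manual: input width + log2(N) + 1."""
    return input_width + int(math.log2(n_points)) + 1

index_width = int(math.log2(4096))    # width of xn_index / xk_index
print(index_width, output_width(16, 4096))
```

The results agree with the 11 downto 0 and 28 downto 0 ranges in the component declaration.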

 

I didn't understand in what order I get xk_re and xk_im, what the order of their bits is, and how to interpret the output index xk_index: do I have to apply the digit reversal to understand the index, or do I just take the binary value directly? For example, if I have xk_index = "000000111111", do I take its own value (63), so that the xk value is the 64th output, or do I apply the digit reversal and take the value as "111111000000" (reversed), so that the respective output is the 4033rd output? Could someone explain the correct use of this module?

Explorer

So, let me repeat this back to confirm that I understood correctly.

When the FFT is processing, the first output index is x"000" (12 bits), the second is x"800" (2048 decimal), the third is x"400" (1024 decimal), the fourth is x"C00" (3072 decimal), and so on? So it would mean that these four results are, in order, the first, the 2049th (2049 - 1 = 2048), the 1025th and the 3073rd output sample? I need to know because I also need the IFFT, so the order of the outputs is essential.

Advisor

Sounds about right, as long as you're not using Radix-4, which uses digit reversal instead of bit reversal.

If you're using the same core for the IFFT, bear in mind that it only accepts samples in natural order, so it might be easier to generate the FFT outputs in natural order; otherwise you'll have to perform the reordering yourself at some point.

Explorer

I didn't understand

 

I have to do this:

 

FFT -> sum with another stream of samples -> IFFT

I use the same core for both FFT and IFFT. So, if the FFT gives me its output samples in digit-reversed order, do I have to reorder them at the input of the IFFT to get the correct result out of the IFFT? I can probably change the order of the adders so that they work in digit-reversed (or bit-reversed) order, but do I have to put the results back in natural order before the IFFT? That would be very resource-expensive...

Advisor

As far as I can tell from the data sheet, the core will only accept inputs in natural order.

 

You can configure the core to reorder samples and output in natural order, which will make things very easy but may use some more BRAM. If you're running low on resources, you could try clever things like running the FFT core at double the speed and switching it from FFT to IFFT and back again.

 

What you do may also depend on what your other stream of samples looks like, the sample rate, whether your input data are continuous, etc.
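
One way to picture the FFT -> sum -> IFFT chain with reversed-order output is sketched below in Python. It assumes the second stream sits in a memory that can be addressed by xk_index (an assumption, not something stated in the thread); the spectra are placeholders rather than real FFT results, and `add_in_arrival_order` is a hypothetical name. The adder then works in arrival order, and the only reordering is the write address of the buffer that feeds the IFFT.

```python
def add_in_arrival_order(fft_output, other_spectrum):
    """Add the matching bin of other_spectrum to each FFT output as it arrives.

    fft_output yields (xk_index, xk) pairs in whatever order the core emits
    them. Each sum is written to a buffer at address xk_index, so the buffer
    ends up in natural order, ready to be read out to the IFFT.
    """
    ifft_input = [None] * len(other_spectrum)
    for xk_index, xk in fft_output:
        ifft_input[xk_index] = xk + other_spectrum[xk_index]
    return ifft_input

N = 16
# Placeholder spectra, chosen so the sums are easy to check.
spectrum_a = [complex(k, -k) for k in range(N)]
spectrum_b = [complex(100, 0)] * N

# Radix-4 digit-reversed arrival order for 16 points: swap the two base-4 digits.
arrival_order = [4 * (n % 4) + n // 4 for n in range(N)]
fft_output = [(k, spectrum_a[k]) for k in arrival_order]

ifft_input = add_in_arrival_order(fft_output, spectrum_b)
print(arrival_order[:5])
print(ifft_input[:3])
```

The sketch says nothing about hardware cost; in an FPGA the buffer would still take one frame of RAM, which is the resource concern raised above.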

Explorer

Radix-4 Burst with natural order uses more than 100% of the device resources; it seems that Radix-2 Burst with natural order fits... I'll try it.

I can't speed up the clock.

Explorer

I'm sorry if I'm disturbing you, could you also reply to my other FFT thread?

Explorer

I forgot: of course, thanks a lot for your help!!!

Explorer

I'll rewrite my other questions about the FFT here:

I didn't understand the timing of a few signals of the FFT.

When does "fwd_inv_we" have to be asserted? Whenever I like, or is there a specific moment? The datasheet shows it asserted when rfd and start are both '1' (page 32); how do I have to use it?



"unload" input, in datasheet seems to be '1' when edone is '1'... am I wrong?

Advisor

Why can't you generate a faster clock using a DCM or similar? Of course, you shouldn't do this if you don't need to.

 

From page 26 of DS260,

 

The user is allowed great flexibility to set the transform type (Forward/Inverse) and the scaling schedule. The FWD_INV and SCALE_SCH values are latched into temporary registers whenever the corresponding WE pins are High. FWD_INV_WE and SCALE_SCH_WE can be asserted at any time until 3 cycles after START is asserted

 


"unload" input, in datasheet seems to be '1' when edone is '1'... am I wrong?

From page 33,

 

UNLOAD can be asserted any time from when EDONE goes High. 

 

 

 


Explorer

Is it enough to hold unload for 1 clock cycle when edone is '1', or does it also have to be '1' while the xk samples are being output?

If I directly connect edone to unload, as I did:

u_FFT : FFT
port map(
...
edone => unload_int,
unload => unload_int,
...
);

would that be fine?

 

I can't speed up the clocks. The problem is that there are periods during elaboration when the inputs aren't available and, without an input enable, I don't know how to make the FFT "wait" until they are available.

Just another thing: I hold the start input at '1' for the whole time of the FFT elaboration, setting it to '0' only when my elaboration is over... is that OK, or do I have to pulse start only at the beginning of every single input frame?

Advisor

If you want to unload the data as soon as it's ready, just tie UNLOAD high:


In addition to using pulses,
START and UNLOAD can be tied High (Figure 14). In this case, the core continuously loads, processes, and unloads data. (p. 33)

 


Is it enough to hold unload for 1 clock cycle when edone is '1', or does it also have to be '1' while the xk samples are being output?


The timing diagram indicates that only one cycle is required. Have you tried simulating the core? It would be very quick for you to answer these questions with a simple test bench if you're not confident about what the data sheet says.

 


I can't speed up the clocks. The problem is that there are periods during elaboration when the inputs aren't available and, without an input enable, I don't know how to make the FFT "wait" until they are available.


You could always use a FIFO to wait for enough data.
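
A behavioural sketch of that idea in Python: samples are buffered as they trickle in, and a frame is released only once a full 4096 of them are available, which is the moment START would be pulsed. The class `FrameFifo` and its methods are hypothetical names; a real design would use a FIFO core or block RAM, and the sample values are placeholders.

```python
from collections import deque

class FrameFifo:
    """Buffer samples arriving at irregular times and release whole frames only."""

    def __init__(self, frame_size):
        self.frame_size = frame_size
        self.samples = deque()

    def push(self, sample):
        self.samples.append(sample)

    def frame_ready(self):
        return len(self.samples) >= self.frame_size

    def pop_frame(self):
        if not self.frame_ready():
            raise RuntimeError("not enough samples buffered for a full frame")
        return [self.samples.popleft() for _ in range(self.frame_size)]

fifo = FrameFifo(frame_size=4096)
for sample in range(5000):      # placeholder samples arriving over time
    fifo.push(sample)

frames = []
while fifo.frame_ready():       # only at this point would START be pulsed
    frames.append(fifo.pop_frame())
print(len(frames), len(fifo.samples))
```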

 


Just another thing: I hold the start input at '1' for the whole time of the FFT elaboration, setting it to '0' only when my elaboration is over... is that OK, or do I have to pulse start only at the beginning of every single input frame?


What do you mean by 'elaboration'? From DS260,

 

START is only valid when the core is idle or if the core is in bit reversed output mode and unloading the processed data.

 

If you're using bit/digit-reversed output order, you need to assert START again to unload the core. Please have a go at simulating the core - it really is the best way to get to grips with it.



Explorer

Simulating the FFT, nothing happens and this warning appears:

 

WARNING:HDLCompiler:746 - "N:/M.53d/rtf/vhdl/src/unisims/primitive/ARAMB36_INTERNAL.vhd" Line 3033: Range is empty (null range)

Advisor

From memory, the FFT core generates lots of warnings that are safe to ignore. If you'd like us to look at why your simulation isn't working, please post your test bench code and the parameters you used to create your FFT core.

Explorer

Here is the VHDL code I used:
--------------------------------------------------------------------------------------------------------------

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

 

ENTITY FFT_test IS
END FFT_test;
 
ARCHITECTURE behavior OF FFT_test IS
 
    -- Component Declaration for the Unit Under Test (UUT)
    COMPONENT fft_module
    PORT(
         clk_in : IN  std_logic;
         reset : IN  std_logic;
         start : IN  std_logic;
         xn_real : IN  std_logic_vector(15 downto 0);
         xn_im : IN  std_logic_vector(15 downto 0);
         xn_data_valid : IN  std_logic;
         xk_im : OUT  std_logic_vector(28 downto 0);
         xk_real : OUT  std_logic_vector(28 downto 0);
         xk_data_valid : OUT  std_logic;
         xk_index : OUT  std_logic_vector(11 downto 0)
        );
    END COMPONENT;

   --Inputs
   signal clk_in : std_logic := '0';
   signal reset : std_logic := '0';
   signal xn_real : std_logic_vector(15 downto 0) := (others => '0');
   signal xn_im : std_logic_vector(15 downto 0) := (others => '0');
   signal xn_data_valid : std_logic := '0';
   signal start : std_logic := '0';

     --Outputs
   signal xk_im : std_logic_vector(28 downto 0);
   signal xk_real : std_logic_vector(28 downto 0);
   signal xk_data_valid : std_logic;
   signal xk_index : std_logic_vector(11 downto 0);

   constant clk_in_period : time := 15 ns;
 
BEGIN
   uut: fft_module PORT MAP (
          clk_in => clk_in,
          reset => reset,
          xn_real => xn_real,
          xn_im => xn_im,
          xn_data_valid => xn_data_valid,
          xk_im => xk_im,
          xk_real => xk_real,
          xk_data_valid => xk_data_valid,
          xk_index => xk_index,
          start => start
        );

   -- Clock process definitions
   clk_in_process :process
   begin
        clk_in <= '0';
        wait for clk_in_period/2;
        clk_in <= '1';
        wait for clk_in_period/2;
   end process;
 

   stim_proc: process
   begin        
        reset <= '1';
         start <= '0';
         xn_real <= x"0000";
         xn_im <= x"0000";
         wait for clk_in_period;
         reset <= '0';
         start <= '1';
         xn_data_valid <= '1';
         xn_real <= x"9876";
         xn_im <= x"5431";
         wait for clk_in_period;
      -- hold reset state for 100 ns.
      wait for 100 ns;   
      wait;
   end process;
END;

--------------------------------------------------------------------------------------------------------------

library ieee;
  use IEEE.STD_LOGIC_1164.all;
  use IEEE.numeric_std.all;
  use IEEE.std_logic_signed.all;
--  use IEEE.std_logic_arith.all;
library unisim;
use unisim.vcomponents.all;

entity fft_module is

 Port ( clk_in : in  STD_LOGIC; -- clk
            reset : in  STD_LOGIC;

            start : in  STD_LOGIC;
            xn_real : in  STD_LOGIC_VECTOR (15 downto 0);
            xn_im   : in  STD_LOGIC_VECTOR (15 downto 0);
            xn_data_valid : in  STD_LOGIC;   
            xk_im : out  STD_LOGIC_VECTOR (28 downto 0);
            xk_real : out STD_LOGIC_VECTOR (28 downto 0);
            xk_data_valid : out std_logic;
            xk_index: OUT std_logic_VECTOR(11 downto 0));
end fft_module;

architecture Behavioral of fft_module is


signal sclr : std_logic := '0';
-- 'start' is already a port of this entity, so it is not redeclared as a signal
signal fwd_inv    : std_logic := '0';
signal fwd_inv_we : std_logic := '0';
signal rfd   : std_logic := '0';
signal busy  : std_logic := '0';
signal edone : std_logic := '0';
signal done  : std_logic := '0';

signal xn_real_int  : std_logic_VECTOR(15 downto 0);
signal xn_im_int    : std_logic_VECTOR(15 downto 0);
signal xn_index_int: std_logic_VECTOR(11 downto 0);

component fft_unit
    port (
     sclr : in STD_LOGIC := 'X';
    rfd : out STD_LOGIC;
    start : in STD_LOGIC := 'X';
    fwd_inv : in STD_LOGIC := 'X';
    dv : out STD_LOGIC;
    done : out STD_LOGIC;
    clk : in STD_LOGIC := 'X';
    busy : out STD_LOGIC;
    fwd_inv_we : in STD_LOGIC := 'X';
    edone : out STD_LOGIC;
    xn_re : in STD_LOGIC_VECTOR ( 15 downto 0 );
    xk_im : out STD_LOGIC_VECTOR ( 28 downto 0 );
    xn_index : out STD_LOGIC_VECTOR ( 11 downto 0 );
    xk_re : out STD_LOGIC_VECTOR ( 28 downto 0 );
    xn_im : in STD_LOGIC_VECTOR ( 15 downto 0 );
    xk_index : out STD_LOGIC_VECTOR ( 11 downto 0 )
     );
end component;

begin

sinc_reset_process : process( reset, clk_in)
begin
if reset = '1' then
   sclr <= '1';
elsif rising_edge(clk_in) then
       sclr <= '0';
end if;
end process;
 
 
fft_programm_process : process(clk_in, sclr)
variable counter : integer range 4 downto 0 := 0;
begin
if (sclr = '1') then
     fwd_inv <= '1';
     fwd_inv_we <= '0';
     counter := 0;
elsif rising_edge(clk_in) then
        fwd_inv <= '1';
        if (counter  < 3 ) then
             if (counter = 1 )  then
                  fwd_inv_we <= '1';
             else fwd_inv_we <= '0';
             end if;
             counter := counter + 1;
        else fwd_inv_we <= '0';
        end if;
end if;
end process;    
    
data_in_process : process(sclr, clk_in )
    begin
if (sclr = '1') then
     xn_real_int <= x"0000";
     xn_im_int <= x"0000";
elsif rising_edge(clk_in) then
        if (start = '1') then
                xn_real_int <= xn_real;
                xn_im_int <= xn_im;
        else xn_real_int <= x"0000";
              xn_im_int <=   x"0000";
        end if;
end if;
end process;  

fft_component : fft_unit
    port map (
            sclr => sclr,
            rfd => rfd,
            start => start,
            fwd_inv => fwd_inv,      
            dv   => xk_data_valid,
            done => done,
            clk => clk_in,
            busy => busy,   
            fwd_inv_we => fwd_inv_we,
            edone => edone,
            xn_re => xn_real_int,    

            xn_im => xn_im_int,              

            xn_index => xn_index_int,
            xk_re => xk_real,   

            xk_im => xk_im,      
            xk_index => xk_index );
end Behavioral;
-----------------------------------------------------------------------------------------------------

Explorer

I forgot: Radix-4 Burst I/O, with the sclr signal but no ce, 66 MHz target frequency.

Advisor

A couple of errors - you're missing a semicolon on line 95:

 

             start : in  STD_LOGIC

 This causes the compiler to complain, on line 142:

begin
end process;

 Remove the 'end process'

 

On line 52, you're declaring elaborate_en:

elaborate_en => elaborate_en

 But this isn't declared on the fft_module entity (line 92). Similarly, you haven't declared 'start' in this port map.

 

Note that I don't really know VHDL, so I could be wrong about how to define test benches, but this fails with many errors for me. You said, "Simulating the FFT nothing happens" - did you manage to get ISim to run on this code, or did ISE fail with synthesis problems? If there's a version of your code that you managed to get working, please give us a copy of that.

 

Explorer

That was a previous version; I forgot to replace elaborate_en with start. In my project, synthesis and the syntax check run without problems, so wherever you find elaborate_en, replace it with start. Thank you.

Explorer

I just modified the code, it should be the correct one... thank you

Explorer

I got the simulation working! I have only one problem with this FFT:

I changed the code so that it pulses the start signal for one clock cycle at the beginning (xn_index = x"000") and again after elaboration (when busy goes back to '0' after being '1') to unload the samples.

I saw that after busy returns to '0', the rfd signal IMMEDIATELY becomes '1', so the core takes in new samples and xn_index increases every clock cycle. After a few cycles, dv goes to '1' and xk_index increases as well. So, in the end, the FFT takes in samples whenever busy is '0', even while xk samples are being output, dv is '1' and xk_index is increasing.

 

Is there a way to have THREE separate states, one for LOADING, one for PROCESSING and one for UNLOADING, where

in LOADING I load samples and xn_index increases until it reaches its maximum, WITHOUT UNLOADING SAMPLES

in PROCESSING (busy = '1') the FFT processes the samples and doesn't load or unload ANYTHING

in UNLOADING I unload the samples and xk_index increases until it reaches its end, WITHOUT LOADING SAMPLES

so that I can then return to the LOADING state?

I did this (no problems) but, as I wrote, when I pulse start again to unload the samples at the end of processing, the core also begins to load...

I'd like a signal timing like the one on page 36, but I can't implement natural order (lack of resources); instead I implemented a simple buffer to reorder the digit-reversed samples (working without problems).

Explorer

The problem is the lack of UNLOAD input in digit/bit reverse...

Advisor

If the problem is that the FFT begins loading data while you're unloading it, why not delay asserting START until you're ready? You don't have to assert START to unload the data (Figure 12).

 

To emulate UNLOAD with a natural bit order output, you could use a FIFO. With your additional re-ordering buffer, you can just detect when the FFT has finished outputting the frame and then signal to the next block that it's ready, and that block can use an UNLOAD signal to retrieve the data in reversed order.
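
The "delay asserting START until you're ready" suggestion can be sketched as a small state machine, modelled here in Python rather than VHDL. Everything in it is an assumption for illustration: the class `StartGate`, the `frame_available` flag, and the idea that busy marks the processing phase and dv marks the unload phase, with unloading starting on its own after processing as described above. It is not a model of the core's real timing, so check it against a simulation.

```python
from enum import Enum

class State(Enum):
    IDLE = 0
    LOADING = 1
    PROCESSING = 2
    UNLOADING = 3

class StartGate:
    """Decide when to pulse START so that loading never overlaps unloading."""

    def __init__(self):
        self.state = State.IDLE

    def step(self, frame_available, busy, dv):
        """Advance one clock cycle; return True when START should be pulsed."""
        start = False
        if self.state is State.IDLE and frame_available:
            start = True                      # one-cycle START pulse
            self.state = State.LOADING
        elif self.state is State.LOADING and busy:
            self.state = State.PROCESSING     # core has begun processing
        elif self.state is State.PROCESSING and dv:
            self.state = State.UNLOADING      # output samples are emerging
        elif self.state is State.UNLOADING and not dv:
            self.state = State.IDLE           # frame fully unloaded
        return start

# Hand-written trace of (frame_available, busy, dv), one tuple per clock cycle.
trace = [
    (True, False, False),   # idle with a frame waiting
    (True, False, False),   # loading
    (True, True, False),    # processing
    (True, False, False),   # busy has fallen, dv not yet high
    (True, False, True),    # unloading
    (True, False, True),    # unloading
    (True, False, False),   # unload finished
    (True, False, False),   # idle again
]
gate = StartGate()
starts = [gate.step(*cycle) for cycle in trace]
print(starts)
```

Even though a new frame is waiting the whole time, START is pulsed only in the first and last cycles of the trace, so the three phases stay separate.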

Observer

Hi 

 

I am using the FFT core v7.1 with the floating point setting, but I want to use it with the fixed point setting, and I can't understand the options in the menu. Can anyone help by describing the options so that I can set up my core?

 

Regards 

Muneeb Ziaa
