We have detected your current browser version is not the latest one. Xilinx.com uses the latest web technologies to bring you the best online experience possible. Please upgrade to a Xilinx.com supported browser:Chrome, Firefox, Internet Explorer 11, Safari. Thank you!

cancel
Showing results for
Did you mean:
Highlighted
Visitor
642 Views
Registered: ‎03-12-2019

## Adder tree using xilinx ip core

Hello everyone,

I have a theoritical question: I need to add , in 2'complement, 8 complex values, coded in fixed point 32_24 format. the first solution that poped out to my mind is to use the adder ip core to design an adder tree, after having done a unit 2x32 bits complex adder. the issue is I'm not quite comfortable with how 2's complement addition is done and not sure how to manage precision through the tree stages...now my question(s): is it possible to use the same complex adder  through all the stages, by choosing a precision that cover the final result? if not can any one suggest how to do it?

I appologize if this is not the right place for this question.

1 Solution

Accepted Solutions
Scholar
575 Views
Registered: ‎02-24-2014

## Re: Adder tree using xilinx ip core

Recursion is a nice solution to building adder trees.   I typically use a VHDL recursive adder that operates on a vector of values to add.    Dividing the vector by 2 gives a binary tree, and dividing into 3 pieces gives a ternary tree, which is more efficient in terms of logic, but a bit slower in Fmax.   In VHDL it's pretty easy, and Verilog 2005 standard specifies that module recursion is supported.

In VHDL, the tricky part is defining a data type that's extensible in 2 dimensions,  precision and vector length.   Here's an example in VHDL-2008:

```library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.Numeric_Std.all;

type unsigned_array is array(natural range <>) of unsigned;
end package;```

The data type "unsigned_array" is our port object for the input.  In your case, it would be an array of complex values.  Note that we need to manually compute the precision of the output.   In a perfect world this, would be computed by a function in this package, to make it easier to use.   Calculating output precision can be tricky, especially if the inputs aren't using the full range possible.    Extra functions such as saturation and clipping are left as an exercise for the reader.

Next, here is the recursive adder tree in VHDL-2008.  This one uses ternary adders in behavioral code, which may or may not synthesize correctly.  Sometimes Vivado is recalcitrant on the issue.   There's also a bit of code to detect the corner case when the input is an array of 1 bit unsigned values.

```library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
library xil_defaultlib;

Port ( clk : in STD_LOGIC;
A : in unsigned_array;
Q : out unsigned
);

architecture Behavioral of add_reduce_unsigned is

Port ( clk : in STD_LOGIC;
A : in unsigned_array;
Q : out unsigned
);
end component;

constant C_word_len : natural := A(A'high)'length;

begin

gen: case A'length generate
when 1 | 2 | 3 =>               -- ternary addition is supported for Series 7 and higher
proc: process(clk) is
begin
if rising_edge(clk) then
if A'length = 1 then
Q <= resize( A(A'low),Q'length);
elsif A'length = 2 then
Q <= resize( A(A'high),Q'length) + resize( A(A'low),Q'length);
else
Q <= resize( A(A'high),Q'length) + resize( A(A'high - 1),Q'length) + resize(A(A'low),Q'length);
end if;
end if;
end process;
when 4 | 5 | 6 =>
gen_1bit: if A(A'high)'length = 1 generate   -- special case, when we are just adding bits instead of words
proc: process(clk) is
begin
if rising_edge(clk) then
if A'length = 4 then
Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'low),Q'length);
elsif A'length = 5 then
Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'high-3),Q'length)+resize(A(A'low),Q'length);
else
Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'high-3),Q'length)+resize(A(A'high-4),Q'length)+resize(A(A'low),Q'length);
end if;
end if;
end process;
else generate
signal tmp1, tmp2 : unsigned(Q'length-2 downto 0);
begin
port map( clk => clk,
A => A(A'high downto A'high - A'length/2 + 1),
Q => tmp1 );
port map( clk => clk,
A => A(A'high - A'length/2 downto A'low),
Q => tmp2 );
proc: process(clk) is
begin
if rising_edge(clk) then
Q <= resize(tmp1,Q'length) + resize(tmp2,Q'length);
end if;
end process;
end generate gen_1bit;
when others =>
signal tmp1, tmp2, tmp3 : unsigned(Q'length-2 downto 0);
begin
port map( clk => clk,
A => A(A'high downto A'high - A'length/3 + 1),
Q => tmp1 );
port map( clk => clk,
A => A(A'high - A'length/3 downto A'low + A'length/3),
Q => tmp2 );
port map( clk => clk,
A => A(A'low + A'length/3 - 1 downto A'low),
Q => tmp3 );
proc: process(clk) is
begin
if rising_edge(clk) then
Q <= resize(tmp1,Q'length) + resize(tmp2,Q'length) + resize(tmp3,Q'length);
end if;
end process;
end generate gen;
end Behavioral;
```

And finally a synthesis testcase to verify it works in Vivado synthesis.

```--
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
library xil_defaultlib;

Port ( clk : in STD_LOGIC;
dat_in : unsigned(31 downto 0);
sum : out unsigned(42 downto 0));

architecture Behavioral of test_add is

signal A : unsigned_array(511 downto 0)(31 downto 0);

begin

proc: process(clk) is
begin
if rising_edge(clk) then
for i in A'range loop  -- just build a big shift register on A
if i = 0 then
A(i) <= dat_in;
else
A(i) <= A(i-1);
end if;
end loop;
end if;
end process;

port map( clk => clk,
A => A,
Q => sum );

end Behavioral;
```

Don't forget to close a thread when possible by accepting a post as a solution.
6 Replies
Scholar
576 Views
Registered: ‎02-24-2014

## Re: Adder tree using xilinx ip core

Recursion is a nice solution to building adder trees.   I typically use a VHDL recursive adder that operates on a vector of values to add.    Dividing the vector by 2 gives a binary tree, and dividing into 3 pieces gives a ternary tree, which is more efficient in terms of logic, but a bit slower in Fmax.   In VHDL it's pretty easy, and Verilog 2005 standard specifies that module recursion is supported.

In VHDL, the tricky part is defining a data type that's extensible in 2 dimensions,  precision and vector length.   Here's an example in VHDL-2008:

```library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.Numeric_Std.all;

type unsigned_array is array(natural range <>) of unsigned;
end package;```

The data type "unsigned_array" is our port object for the input.  In your case, it would be an array of complex values.  Note that we need to manually compute the precision of the output.   In a perfect world this, would be computed by a function in this package, to make it easier to use.   Calculating output precision can be tricky, especially if the inputs aren't using the full range possible.    Extra functions such as saturation and clipping are left as an exercise for the reader.

Next, here is the recursive adder tree in VHDL-2008.  This one uses ternary adders in behavioral code, which may or may not synthesize correctly.  Sometimes Vivado is recalcitrant on the issue.   There's also a bit of code to detect the corner case when the input is an array of 1 bit unsigned values.

```library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
library xil_defaultlib;

Port ( clk : in STD_LOGIC;
A : in unsigned_array;
Q : out unsigned
);

architecture Behavioral of add_reduce_unsigned is

Port ( clk : in STD_LOGIC;
A : in unsigned_array;
Q : out unsigned
);
end component;

constant C_word_len : natural := A(A'high)'length;

begin

gen: case A'length generate
when 1 | 2 | 3 =>               -- ternary addition is supported for Series 7 and higher
proc: process(clk) is
begin
if rising_edge(clk) then
if A'length = 1 then
Q <= resize( A(A'low),Q'length);
elsif A'length = 2 then
Q <= resize( A(A'high),Q'length) + resize( A(A'low),Q'length);
else
Q <= resize( A(A'high),Q'length) + resize( A(A'high - 1),Q'length) + resize(A(A'low),Q'length);
end if;
end if;
end process;
when 4 | 5 | 6 =>
gen_1bit: if A(A'high)'length = 1 generate   -- special case, when we are just adding bits instead of words
proc: process(clk) is
begin
if rising_edge(clk) then
if A'length = 4 then
Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'low),Q'length);
elsif A'length = 5 then
Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'high-3),Q'length)+resize(A(A'low),Q'length);
else
Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'high-3),Q'length)+resize(A(A'high-4),Q'length)+resize(A(A'low),Q'length);
end if;
end if;
end process;
else generate
signal tmp1, tmp2 : unsigned(Q'length-2 downto 0);
begin
port map( clk => clk,
A => A(A'high downto A'high - A'length/2 + 1),
Q => tmp1 );
port map( clk => clk,
A => A(A'high - A'length/2 downto A'low),
Q => tmp2 );
proc: process(clk) is
begin
if rising_edge(clk) then
Q <= resize(tmp1,Q'length) + resize(tmp2,Q'length);
end if;
end process;
end generate gen_1bit;
when others =>
signal tmp1, tmp2, tmp3 : unsigned(Q'length-2 downto 0);
begin
port map( clk => clk,
A => A(A'high downto A'high - A'length/3 + 1),
Q => tmp1 );
port map( clk => clk,
A => A(A'high - A'length/3 downto A'low + A'length/3),
Q => tmp2 );
port map( clk => clk,
A => A(A'low + A'length/3 - 1 downto A'low),
Q => tmp3 );
proc: process(clk) is
begin
if rising_edge(clk) then
Q <= resize(tmp1,Q'length) + resize(tmp2,Q'length) + resize(tmp3,Q'length);
end if;
end process;
end generate gen;
end Behavioral;
```

And finally a synthesis testcase to verify it works in Vivado synthesis.

```--
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
library xil_defaultlib;

Port ( clk : in STD_LOGIC;
dat_in : unsigned(31 downto 0);
sum : out unsigned(42 downto 0));

architecture Behavioral of test_add is

signal A : unsigned_array(511 downto 0)(31 downto 0);

begin

proc: process(clk) is
begin
if rising_edge(clk) then
for i in A'range loop  -- just build a big shift register on A
if i = 0 then
A(i) <= dat_in;
else
A(i) <= A(i-1);
end if;
end loop;
end if;
end process;

port map( clk => clk,
A => A,
Q => sum );

end Behavioral;
```

Don't forget to close a thread when possible by accepting a post as a solution.
Scholar
555 Views
Registered: ‎07-09-2009

## Re: Adder tree using xilinx ip core

For fixed point addition / subtraction, you gain one extra bit for each addition / subtraction.

these might be good pointers

https://www.dsprelated.com/showarticle/139.php

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>
Scholar
542 Views
Registered: ‎09-16-2009

## Re: Adder tree using xilinx ip core

@drjohnsmith wrote:

For fixed point addition / subtraction, you gain one extra bit for each addition / subtraction.

Not quite right - it's logarithmic - For N additions / subtractions, you add log2(N) bits.. I.e. Adding 8 quantities, you'll need log2(8) = 3 extra bits at the output.

Regards,

Mark

Visitor
532 Views
Registered: ‎03-12-2019

## Re: Adder tree using xilinx ip core

hello jmcclusk, Thank you for your thourough answer! actuallyI was thinking of using the core, but if vhdl supports this "recursion" I will go for it, although I only know its name, no other details...one occasion to learn new things. just an other question if I may, in how many clock cycles is recursion adder tree slower compared to a conventional one?
thanks again!
Visitor
527 Views
Registered: ‎03-12-2019