**UPGRADE YOUR BROWSER**

We have detected your current browser version is not the latest one. Xilinx.com uses the latest web technologies to bring you the best online experience possible. Please upgrade to a Xilinx.com supported browser:Chrome, Firefox, Internet Explorer 11, Safari. Thank you!

Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Community Forums
- :
- Forums
- :
- Vivado RTL Development
- :
- Synthesis
- :
- Adder tree using xilinx ip core

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

Highlighted

meguellatix

Visitor

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-12-2019 06:03 AM

642 Views

Registered:
03-12-2019

Hello everyone,

I have a theoritical question: I need to add , in 2'complement, 8 complex values, coded in fixed point 32_24 format. the first solution that poped out to my mind is to use the adder ip core to design an adder tree, after having done a unit 2x32 bits complex adder. the issue is I'm not quite comfortable with how 2's complement addition is done and not sure how to manage precision through the tree stages...now my question(s): is it possible to use the same complex adder through all the stages, by choosing a precision that cover the final result? if not can any one suggest how to do it?

I appologize if this is not the right place for this question.

1 Solution

Accepted Solutions

jmcclusk

Scholar

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-14-2019 07:11 AM - edited 03-14-2019 07:50 AM

575 Views

Registered:
02-24-2014

Recursion is a nice solution to building adder trees. I typically use a VHDL recursive adder that operates on a vector of values to add. Dividing the vector by 2 gives a binary tree, and dividing into 3 pieces gives a ternary tree, which is more efficient in terms of logic, but a bit slower in Fmax. In VHDL it's pretty easy, and Verilog 2005 standard specifies that module recursion is supported.

In VHDL, the tricky part is defining a data type that's extensible in 2 dimensions, precision and vector length. Here's an example in VHDL-2008:

library IEEE; use IEEE.STD_LOGIC_1164.ALL; use IEEE.Numeric_Std.all; package Add_Reduce is type unsigned_array is array(natural range <>) of unsigned; end package;

The data type "unsigned_array" is our port object for the input. In your case, it would be an array of complex values. Note that we need to manually compute the precision of the output. In a perfect world this, would be computed by a function in this package, to make it easier to use. Calculating output precision can be tricky, especially if the inputs aren't using the full range possible. Extra functions such as saturation and clipping are left as an exercise for the reader.

Next, here is the recursive adder tree in VHDL-2008. This one uses ternary adders in behavioral code, which may or may not synthesize correctly. Sometimes Vivado is recalcitrant on the issue. There's also a bit of code to detect the corner case when the input is an array of 1 bit unsigned values.

library IEEE; use IEEE.STD_LOGIC_1164.ALL; use IEEE.NUMERIC_STD.ALL; library xil_defaultlib; use xil_defaultlib.Add_Reduce.all; entity add_reduce_unsigned is Port ( clk : in STD_LOGIC; A : in unsigned_array; Q : out unsigned ); end add_reduce_unsigned; architecture Behavioral of add_reduce_unsigned is component add_reduce_unsigned is Port ( clk : in STD_LOGIC; A : in unsigned_array; Q : out unsigned ); end component; constant C_word_len : natural := A(A'high)'length; begin gen: case A'length generate when 1 | 2 | 3 => -- ternary addition is supported for Series 7 and higher proc: process(clk) is begin if rising_edge(clk) then if A'length = 1 then Q <= resize( A(A'low),Q'length); elsif A'length = 2 then Q <= resize( A(A'high),Q'length) + resize( A(A'low),Q'length); else Q <= resize( A(A'high),Q'length) + resize( A(A'high - 1),Q'length) + resize(A(A'low),Q'length); end if; end if; end process; when 4 | 5 | 6 => gen_1bit: if A(A'high)'length = 1 generate -- special case, when we are just adding bits instead of words proc: process(clk) is begin if rising_edge(clk) then if A'length = 4 then Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'low),Q'length); elsif A'length = 5 then Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'high-3),Q'length)+resize(A(A'low),Q'length); else Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'high-3),Q'length)+resize(A(A'high-4),Q'length)+resize(A(A'low),Q'length); end if; end if; end process; else generate signal tmp1, tmp2 : unsigned(Q'length-2 downto 0); begin g1: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'high downto A'high - A'length/2 + 1), Q => tmp1 ); g2: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'high - A'length/2 downto A'low), Q => tmp2 ); proc: process(clk) is begin if rising_edge(clk) then Q <= resize(tmp1,Q'length) + resize(tmp2,Q'length); end if; end process; end generate gen_1bit; when others => signal tmp1, tmp2, tmp3 : unsigned(Q'length-2 downto 0); begin g1: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'high downto A'high - A'length/3 + 1), Q => tmp1 ); g2: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'high - A'length/3 downto A'low + A'length/3), Q => tmp2 ); g3: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'low + A'length/3 - 1 downto A'low), Q => tmp3 ); proc: process(clk) is begin if rising_edge(clk) then Q <= resize(tmp1,Q'length) + resize(tmp2,Q'length) + resize(tmp3,Q'length); end if; end process; end generate gen; end Behavioral;

And finally a synthesis testcase to verify it works in Vivado synthesis.

-- library IEEE; use IEEE.STD_LOGIC_1164.ALL; use IEEE.NUMERIC_STD.ALL;

library xil_defaultlib; use xil_defaultlib.Add_Reduce.all; entity test_add is Port ( clk : in STD_LOGIC; dat_in : unsigned(31 downto 0); sum : out unsigned(42 downto 0)); end test_add; architecture Behavioral of test_add is signal A : unsigned_array(511 downto 0)(31 downto 0); begin proc: process(clk) is begin if rising_edge(clk) then for i in A'range loop -- just build a big shift register on A if i = 0 then A(i) <= dat_in; else A(i) <= A(i-1); end if; end loop; end if; end process; g1: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A, Q => sum ); end Behavioral;

Don't forget to close a thread when possible by accepting a post as a solution.

6 Replies

jmcclusk

Scholar

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-14-2019 07:11 AM - edited 03-14-2019 07:50 AM

576 Views

Registered:
02-24-2014

Recursion is a nice solution to building adder trees. I typically use a VHDL recursive adder that operates on a vector of values to add. Dividing the vector by 2 gives a binary tree, and dividing into 3 pieces gives a ternary tree, which is more efficient in terms of logic, but a bit slower in Fmax. In VHDL it's pretty easy, and Verilog 2005 standard specifies that module recursion is supported.

In VHDL, the tricky part is defining a data type that's extensible in 2 dimensions, precision and vector length. Here's an example in VHDL-2008:

library IEEE; use IEEE.STD_LOGIC_1164.ALL; use IEEE.Numeric_Std.all; package Add_Reduce is type unsigned_array is array(natural range <>) of unsigned; end package;

The data type "unsigned_array" is our port object for the input. In your case, it would be an array of complex values. Note that we need to manually compute the precision of the output. In a perfect world this, would be computed by a function in this package, to make it easier to use. Calculating output precision can be tricky, especially if the inputs aren't using the full range possible. Extra functions such as saturation and clipping are left as an exercise for the reader.

Next, here is the recursive adder tree in VHDL-2008. This one uses ternary adders in behavioral code, which may or may not synthesize correctly. Sometimes Vivado is recalcitrant on the issue. There's also a bit of code to detect the corner case when the input is an array of 1 bit unsigned values.

library IEEE; use IEEE.STD_LOGIC_1164.ALL; use IEEE.NUMERIC_STD.ALL; library xil_defaultlib; use xil_defaultlib.Add_Reduce.all; entity add_reduce_unsigned is Port ( clk : in STD_LOGIC; A : in unsigned_array; Q : out unsigned ); end add_reduce_unsigned; architecture Behavioral of add_reduce_unsigned is component add_reduce_unsigned is Port ( clk : in STD_LOGIC; A : in unsigned_array; Q : out unsigned ); end component; constant C_word_len : natural := A(A'high)'length; begin gen: case A'length generate when 1 | 2 | 3 => -- ternary addition is supported for Series 7 and higher proc: process(clk) is begin if rising_edge(clk) then if A'length = 1 then Q <= resize( A(A'low),Q'length); elsif A'length = 2 then Q <= resize( A(A'high),Q'length) + resize( A(A'low),Q'length); else Q <= resize( A(A'high),Q'length) + resize( A(A'high - 1),Q'length) + resize(A(A'low),Q'length); end if; end if; end process; when 4 | 5 | 6 => gen_1bit: if A(A'high)'length = 1 generate -- special case, when we are just adding bits instead of words proc: process(clk) is begin if rising_edge(clk) then if A'length = 4 then Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'low),Q'length); elsif A'length = 5 then Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'high-3),Q'length)+resize(A(A'low),Q'length); else Q <= resize(A(A'high),Q'length)+resize(A(A'high-1),Q'length)+resize(A(A'high-2),Q'length)+resize(A(A'high-3),Q'length)+resize(A(A'high-4),Q'length)+resize(A(A'low),Q'length); end if; end if; end process; else generate signal tmp1, tmp2 : unsigned(Q'length-2 downto 0); begin g1: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'high downto A'high - A'length/2 + 1), Q => tmp1 ); g2: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'high - A'length/2 downto A'low), Q => tmp2 ); proc: process(clk) is begin if rising_edge(clk) then Q <= resize(tmp1,Q'length) + resize(tmp2,Q'length); end if; end process; end generate gen_1bit; when others => signal tmp1, tmp2, tmp3 : unsigned(Q'length-2 downto 0); begin g1: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'high downto A'high - A'length/3 + 1), Q => tmp1 ); g2: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'high - A'length/3 downto A'low + A'length/3), Q => tmp2 ); g3: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A(A'low + A'length/3 - 1 downto A'low), Q => tmp3 ); proc: process(clk) is begin if rising_edge(clk) then Q <= resize(tmp1,Q'length) + resize(tmp2,Q'length) + resize(tmp3,Q'length); end if; end process; end generate gen; end Behavioral;

And finally a synthesis testcase to verify it works in Vivado synthesis.

-- library IEEE; use IEEE.STD_LOGIC_1164.ALL; use IEEE.NUMERIC_STD.ALL;

library xil_defaultlib; use xil_defaultlib.Add_Reduce.all; entity test_add is Port ( clk : in STD_LOGIC; dat_in : unsigned(31 downto 0); sum : out unsigned(42 downto 0)); end test_add; architecture Behavioral of test_add is signal A : unsigned_array(511 downto 0)(31 downto 0); begin proc: process(clk) is begin if rising_edge(clk) then for i in A'range loop -- just build a big shift register on A if i = 0 then A(i) <= dat_in; else A(i) <= A(i-1); end if; end loop; end if; end process; g1: entity xil_defaultlib.add_reduce_unsigned port map( clk => clk, A => A, Q => sum ); end Behavioral;

Don't forget to close a thread when possible by accepting a post as a solution.

drjohnsmith

Scholar

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-14-2019 08:03 AM

555 Views

Registered:
07-09-2009

Re: Adder tree using xilinx ip core

For fixed point addition / subtraction, you gain one extra bit for each addition / subtraction.

these might be good pointers

https://www.dsprelated.com/showarticle/139.php

<== If this was helpful, please feel free to give Kudos, and close if it answers your question ==>

markcurry

Scholar

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-14-2019 10:03 AM - edited 03-14-2019 10:03 AM

542 Views

Registered:
09-16-2009

Re: Adder tree using xilinx ip core

@drjohnsmith wrote:

For fixed point addition / subtraction, you gain one extra bit for each addition / subtraction.

Not quite right - it's logarithmic - For N additions / subtractions, you add log2(N) bits.. I.e. Adding 8 quantities, you'll need log2(8) = 3 extra bits at the output.

Regards,

Mark

meguellatix

Visitor

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-14-2019 10:24 AM

532 Views

Registered:
03-12-2019

Re: Adder tree using xilinx ip core

thanks again!

meguellatix

Visitor

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-14-2019 10:27 AM

527 Views

Registered:
03-12-2019

Re: Adder tree using xilinx ip core

hello drjohnsmith,

thank you for the link, it's very clearly explained and intuitive!

thank you for the link, it's very clearly explained and intuitive!

meguellatix

Visitor

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-14-2019 10:28 AM

527 Views

Registered:
03-12-2019

Re: Adder tree using xilinx ip core

hi,

i think he meant one bit per stage...

i think he meant one bit per stage...