Anonymous

Different LUT6 placement in FPGA Editor and PlanAhead

Hi,

 

I'm about to write a simple and very^2 small systolic processor (2-4 Virtex-5 slices). All LUT, CARRY and FF placements are done by hand (including BEL constraints) to achieve the best performance and area results.

My problem is that I discovered differences in the BEL positions of LUT6 instances within one slice between PlanAhead and FPGA Editor. The cost is additional routing out of the slice and back into it, because the following FFs are then not placed appropriately.

 

A simple example: I define

 

  LUT1 BEL = A6LUT    FF1 BEL = AFF

  LUT2 BEL = B6LUT    FF2 BEL = BFF

  LUT3 BEL = C6LUT    FF3 BEL = CFF

  LUT4 BEL = D6LUT    FF4 BEL = DFF

 

PlanAhead displays all as expected while the FPGA-Editor shows the LUTs in reverse order:


  LUT1 BEL = D6LUT    FF1 BEL = AFF

  LUT2 BEL = C6LUT    FF2 BEL = BFF

  LUT3 BEL = B6LUT    FF3 BEL = CFF

  LUT4 BEL = A6LUT    FF4 BEL = DFF

 

FPGA Editor also shows the LUT O6 signals routed out of the slice and back in again (the FFs are not placed appropriately).

 

My current solution is to reverse the flip-flop BEL constraints (AFF..DFF => DFF..AFF) of the following FF stage. This results in no extra routing cost (the static timing report shows better results), but it is only a hack, not a real solution.
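
As a rough illustration of this workaround, the reversed flip-flop constraints could look like the UCF fragment below. This is a sketch only: the instance names FF1..FF4 follow the simple example above, and the slice location SLICE_X1Y0 is an assumed placeholder, not taken from a real design.

```
# Hypothetical instance names and slice location, for illustration only.
# FF BEL order is mirrored (AFF..DFF => DFF..AFF) so that each FF ends up
# next to the LUT position that FPGA Editor reports.
INST "FF1" LOC = SLICE_X1Y0 | BEL = DFF;
INST "FF2" LOC = SLICE_X1Y0 | BEL = CFF;
INST "FF3" LOC = SLICE_X1Y0 | BEL = BFF;
INST "FF4" LOC = SLICE_X1Y0 | BEL = AFF;
```

The LUT constraints stay as originally written (A6LUT..D6LUT); only the FF order changes.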


Can anybody help me with this problem?

 

 

Thanks,

 

Jotta

7 Replies
Xilinx Employee
Registered: ‎11-28-2007

Would it be possible for you to provide a test case?

 

Cheers,

Jim

 

Anonymous

Hi,

 

I have just created a simple test scenario with 4 slices:

 

ISE WebPack 11.1, Floorplanner 11.2 (but the problem was the same with Floorplanner 11.1)

 

Device: XC5VFX30 -1 FF665

 

VHDL Code:

 

-------------------------------------------------

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

Library UNISIM;
use UNISIM.vcomponents.all;

-------------------------------------------------

entity root is
  port
  (
    clock  : in  std_logic;
    inputs : in  std_logic_vector(7 downto 0);
    outputs: out std_logic_vector(7 downto 0)
  );
end root;

-------------------------------------------------

architecture arch of root is

  constant width: integer := 8;

  -- input buffer FFs
  signal ib_ffd_i: std_logic_vector(width-1 downto 0);
  signal ib_ffd_o: std_logic_vector(width-1 downto 0);

  -- LUT test LUT6s
  signal lt_lut_i : std_logic_vector(width-1 downto 0);
  signal lt_lut_o5: std_logic_vector(width-1 downto 0);
  signal lt_lut_o6: std_logic_vector(width-1 downto 0);

  -- LUT test FFDs
  signal lt_ffd_i: std_logic_vector(width-1 downto 0);
  signal lt_ffd_o: std_logic_vector(width-1 downto 0);

begin

  -- input buffer FFDs
  ib_ffd_gen:for i in 0 to width-1 generate
    FDRSE_inst: FDRSE
    generic map
    (
      INIT => '0'
    )
    port map
    (
      C  => clock,
      CE => '1',
      R  => '0',
      S  => '0',
      D  => ib_ffd_i(i),
      Q  => ib_ffd_o(i)
    );
  end generate;

  -- LUTs
  lt_lut_gen:for i in 0 to width-1 generate
    LUT6_inst:LUT6_2
    generic map
    (
      INIT => X"0000_0006_0000_0002"
    )
    port map
    (
      I0 => lt_ffd_o(i),
      I1 => lt_lut_i(i),
      I2 => '0',
      I3 => '0',
      I4 => '0',
      I5 => '1',
      O5 => lt_lut_o5(i),
      O6 => lt_lut_o6(i)
    );
  end generate;

  -- FFDs
  lt_ffd_gen:for i in 0 to width-1 generate
    FDRSE_inst: FDRSE
    generic map
    (
      INIT => '0'
    )
    port map
    (
      C  => clock,
      CE => '1',
      R  => '0',
      S  => '0',
      D  => lt_ffd_i(i),
      Q  => lt_ffd_o(i)
    );
  end generate;

  -- connections
  ib_ffd_i <= inputs;
  lt_lut_i <= ib_ffd_o;
  lt_ffd_i <= lt_lut_o6;

  -- output
  outputs <= lt_ffd_o(7 downto 0);

end arch;

-------------------------------------------------

 

 

and the corresponding UCF file:

 

 

# === Clock ===
NET "clock" LOC = E18 | IOSTANDARD = LVCMOS33;
NET "clock" TNM_NET = "clock_in";
TIMESPEC TS_clock_in = PERIOD "clock_in" 10 ns;

# === Inputs/Outputs ===
NET "inputs[0]" LOC = AD14 | IOSTANDARD = LVCMOS18;
NET "inputs[1]" LOC = AD13 | IOSTANDARD = LVCMOS18;
NET "inputs[2]" LOC = AE13 | IOSTANDARD = LVCMOS18;
NET "inputs[3]" LOC = AF13 | IOSTANDARD = LVCMOS18;
NET "inputs[4]" LOC = AF14 | IOSTANDARD = LVCMOS18;
NET "inputs[5]" LOC = AF15 | IOSTANDARD = LVCMOS18;
NET "inputs[6]" LOC = AE15 | IOSTANDARD = LVCMOS18;
NET "inputs[7]" LOC = AD15 | IOSTANDARD = LVCMOS18;

NET "outputs[0]" LOC = AD16 | IOSTANDARD = LVCMOS18;
NET "outputs[1]" LOC = AE16 | IOSTANDARD = LVCMOS18;
NET "outputs[2]" LOC = AF17 | IOSTANDARD = LVCMOS18;
NET "outputs[3]" LOC = AE17 | IOSTANDARD = LVCMOS18;
NET "outputs[4]" LOC = AD18 | IOSTANDARD = LVCMOS18;
NET "outputs[5]" LOC = AE18 | IOSTANDARD = LVCMOS18;
NET "outputs[6]" LOC = AF18 | IOSTANDARD = LVCMOS18;
NET "outputs[7]" LOC = AF19 | IOSTANDARD = LVCMOS18;

# === Locations ===
INST "ib_ffd_gen[0].FDRSE_inst" LOC = SLICE_X0Y0 | BEL = AFF;
INST "ib_ffd_gen[1].FDRSE_inst" LOC = SLICE_X0Y0 | BEL = BFF;
INST "ib_ffd_gen[2].FDRSE_inst" LOC = SLICE_X0Y0 | BEL = CFF;
INST "ib_ffd_gen[3].FDRSE_inst" LOC = SLICE_X0Y0 | BEL = DFF;
INST "ib_ffd_gen[4].FDRSE_inst" LOC = SLICE_X0Y1 | BEL = AFF;
INST "ib_ffd_gen[5].FDRSE_inst" LOC = SLICE_X0Y1 | BEL = BFF;
INST "ib_ffd_gen[6].FDRSE_inst" LOC = SLICE_X0Y1 | BEL = CFF;
INST "ib_ffd_gen[7].FDRSE_inst" LOC = SLICE_X0Y1 | BEL = DFF;

INST "lt_lut_gen[0].LUT6_inst" LOC = SLICE_X1Y0 | BEL = A6LUT;
INST "lt_lut_gen[1].LUT6_inst" LOC = SLICE_X1Y0 | BEL = B6LUT;
INST "lt_lut_gen[2].LUT6_inst" LOC = SLICE_X1Y0 | BEL = C6LUT;
INST "lt_lut_gen[3].LUT6_inst" LOC = SLICE_X1Y0 | BEL = D6LUT;
INST "lt_lut_gen[4].LUT6_inst" LOC = SLICE_X1Y1 | BEL = A6LUT;
INST "lt_lut_gen[5].LUT6_inst" LOC = SLICE_X1Y1 | BEL = B6LUT;
INST "lt_lut_gen[6].LUT6_inst" LOC = SLICE_X1Y1 | BEL = C6LUT;
INST "lt_lut_gen[7].LUT6_inst" LOC = SLICE_X1Y1 | BEL = D6LUT;

INST "lt_ffd_gen[0].FDRSE_inst" LOC = SLICE_X1Y0 | BEL = AFF;
INST "lt_ffd_gen[1].FDRSE_inst" LOC = SLICE_X1Y0 | BEL = BFF;
INST "lt_ffd_gen[2].FDRSE_inst" LOC = SLICE_X1Y0 | BEL = CFF;
INST "lt_ffd_gen[3].FDRSE_inst" LOC = SLICE_X1Y0 | BEL = DFF;
INST "lt_ffd_gen[4].FDRSE_inst" LOC = SLICE_X1Y1 | BEL = AFF;
INST "lt_ffd_gen[5].FDRSE_inst" LOC = SLICE_X1Y1 | BEL = BFF;
INST "lt_ffd_gen[6].FDRSE_inst" LOC = SLICE_X1Y1 | BEL = CFF;
INST "lt_ffd_gen[7].FDRSE_inst" LOC = SLICE_X1Y1 | BEL = DFF;

 

No matter where I place the slices, there is always the mentioned difference between Floorplanner and FPGA Editor.

 

 

One additional remark: for the other slice instances, all placement information (including slice and BEL positions) is displayed in the "Instance Properties->General" tab, but for my LUTs I only see the slice information and no BEL information/constraint.

 

 

If you need more information, let me know.

 

jotta

 

 

 

 

Anonymous

What I forgot to mention:

 

The LUT6 INIT setting (in my code: X"0000_0006_0000_0002") doesn't matter; the same problem occurs with many different INIT values.

 

jotta

Xilinx Employee
Registered: ‎11-28-2007

This looks like the same issue with LUT6_2 discussed in this thread: http://forums.xilinx.com/xlnx/board/crawl_message?board.id=IMPBD&message.id=808

 

Additional info can be found in this blog.

 

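I modified your test case (see VHD below; the changes are the LOCK_PINS and S attribute declarations in the architecture and the LOCK_PINS attribute in the LUT generate block) and everything looks OK now (see attached snapshot in FED).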

 

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

Library UNISIM;
use UNISIM.vcomponents.all;

-------------------------------------------------

entity root is
  port
  (
    clock  : in  std_logic;
    inputs : in  std_logic_vector(7 downto 0);
    outputs: out std_logic_vector(7 downto 0)
  );
end root;

-------------------------------------------------

architecture arch of root is

  constant width: integer := 8;


  -- input buffer FFs
  signal ib_ffd_i: std_logic_vector(width-1 downto 0);
  signal ib_ffd_o: std_logic_vector(width-1 downto 0);


  -- LUT test LUT6s
  signal lt_lut_i : std_logic_vector(width-1 downto 0);
  signal lt_lut_o5: std_logic_vector(width-1 downto 0);
  signal lt_lut_o6: std_logic_vector(width-1 downto 0);


  -- LUT test FFDs
  signal lt_ffd_i: std_logic_vector(width-1 downto 0);
  signal lt_ffd_o: std_logic_vector(width-1 downto 0);

  attribute LOCK_PINS : string;
  attribute S         : string;

  attribute S         of lt_ffd_o : signal is "TRUE";
  attribute S         of lt_lut_i : signal is "TRUE"; 


begin

  -- input buffer FFDs
  ib_ffd_gen:for i in 0 to width-1 generate
    FDRSE_inst: FDRSE
    generic map
    (
      INIT => '0'
    )
    port map
    (
      C  => clock,
      CE => '1',
      R  => '0',
      S  => '0',
      D  => ib_ffd_i(i),
      Q  => ib_ffd_o(i)
    );
  end generate;


  -- LUTs
  lt_lut_gen:for i in 0 to width-1 generate
    attribute LOCK_PINS of LUT6_inst   : label  is "ALL";
  begin 

    LUT6_inst:LUT6_2
    generic map
    (
      INIT => X"0000_0006_0000_0002"
    )
    port map
    (
      I0 => lt_ffd_o(i),
      I1 => lt_lut_i(i),
      I2 => '0',
      I3 => '0',
      I4 => '0',
      I5 => '1',
      O5 => lt_lut_o5(i),
      O6 => lt_lut_o6(i)
    );

   
     
  end generate;


  -- FFDs
  lt_ffd_gen:for i in 0 to width-1 generate
    FDRSE_inst: FDRSE
    generic map
    (
      INIT => '0'
    )
    port map
    (
      C  => clock,
      CE => '1',
      R  => '0',
      S  => '0',
      D  => lt_ffd_i(i),
      Q  => lt_ffd_o(i)
    );
  end generate;


  -- connections
  ib_ffd_i <=  inputs;
  lt_lut_i <= ib_ffd_o;
  lt_ffd_i <=  lt_lut_o6;


  -- output
  outputs <= lt_ffd_o(7 downto 0);

end arch;


 

 

Cheers,
Jim
ScreenHunter_01 Dec. 12 07.55.gif
Anonymous

Hi Jim,

 

just tried it with success, thanks for your time.

 

I also tried it without the LOCK_PINS attribute, and that works as well. I have no insight into the routing and mapping algorithms, but I think that without LOCK_PINS you may give PAR more freedom for optimization. For me as a user it doesn't matter if PAR swaps the input pins, as long as it fulfills my BEL constraints and keeps the semantics unchanged.

 

Thanks,

 

jotta

Visitor
Registered: ‎03-05-2012

I am seeing not this precise behavior but nonetheless a failure of the placement directives to operate properly.  In particular, the BEL="?6LUT" directive seems to have no effect whatsoever on LUT6_2 instances.  It works fine for LUT6_D or LUT6 instances, but not LUT6_2.

 

The LOC directive does seem to work just fine.  The software responds by putting the first LUT6_2 I declare in the D6LUT position, the second in the C6LUT position, and so on, but I haven't run enough test cases to see if that's universal.

 

A second problem (unrelated but both are affecting me) is that there are eight flip flops in a Spartan 6 slice, but I can only find BEL specifiers for four of them: FFA, FFB, FFC, FFD.  Each of those specifiers seems to specify the "pair" of flip flops in that part of the slice, and the software seems to use whichever one it feels like.  How do I specify *completely* any of the eight flops?

 

My application supports use of all eight of the slice flip flops (I'm very happy about this), but I need to figure out how to get the placement control I need.

 

Best regards,

Kip

 

Visitor
Registered: ‎03-05-2012

PS: I am working in Verilog, and my preference is to put all of my LOC and BEL specifiers in the Verilog source, not in the UCF file.

 

Kip

 
