Observer lss1776
Observer
1,003 Views
Registered: ‎11-02-2017

Synthesis can't finish

I want to transpose an RGB picture, so I wrote a function. Simulation succeeds, but when I synthesize the code, synthesis does not finish even after an hour. How can I solve this? The picture and the window are both 256*256.

Here is my function.

void Transpose (RGB_IMAGE& src,RGB_IMAGE& dst,int rows,int cols)
{
 WD win0,win1,win2;   // one window per colour channel
 rgb_data pix;

 // Read the whole image, storing each pixel at the swapped (col,row) position
 for(int row=0;row<rows;row++)
  for(int col=0;col<cols;col++)
  {
   src >> pix;
   win0.insert(pix.val[0],col,row);
   win1.insert(pix.val[1],col,row);
   win2.insert(pix.val[2],col,row);
  }

 // Write the stored (already transposed) image back out in raster order
 for(int row=0;row<rows;row++)
  for(int col=0;col<cols;col++)
  {
   pix.val[0]=win0.getval(row,col);
   pix.val[1]=win1.getval(row,col);
   pix.val[2]=win2.getval(row,col);
   dst << pix;
  }
}

Thanks

Accepted Solutions
Scholar u4223374
Scholar
1,180 Views
Registered: ‎04-26-2015

Re: Synthesis can't finish

The problem is that you're asking HLS for a set of 196608 (256*256*3) 8-bit registers, with three 65536-to-1 multiplexers and three 1-to-65536 demultiplexers.

For reference, I have previously had HLS build a 1024-to-1 mux and 1-to-1024 demux simply as a way of wasting space on the FPGA (I wanted to verify performance when it was almost full). If I remember correctly, that block took something like 20% of a Zynq 7045. What you're asking for is vastly more difficult.

I expect that HLS is getting bogged down building huge multiplexers. As @rosa_bpc has said, hls::Window is designed for small spaces; 3x3 or 5x5 would be common. A 9-to-1 or 25-to-1 mux is not a huge piece of hardware.
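
The numbers above can be checked with a few lines of plain C++. This is only a sketch of the arithmetic, not HLS code. The function names are made up for illustration, and the one-register-per-element model is the one described in this answer.

```cpp
#include <cassert>

// Image size and channel count taken from the question.
const long ROWS = 256;
const long COLS = 256;
const long CHANNELS = 3;

// Number of 8-bit registers if every window element becomes a register.
long total_registers() { return ROWS * COLS * CHANNELS; }

// Inputs of the read multiplexer needed for one full-image window.
long full_mux_inputs() { return ROWS * COLS; }

// Inputs of the read multiplexer for a k-by-k window (3x3, 5x5, ...).
long small_mux_inputs(long k)
{
    assert(k > 0);
    return k * k;
}
```

A 3x3 window needs a 9-input mux and a 5x5 window a 25-input mux, against 65536 inputs per channel for the full 256*256 window.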

 

You have two basic options:

(1) Get rid of the hls::Windows and use a simple block RAM buffer instead. 256*256*8-bit*3 will still need 96 BRAM_18K blocks, which is pretty substantial, but it's definitely something that HLS will be able to achieve.

(2) Do a block-wise transpose. This requires semi-random access to either input or output images (or both), so an AXI Master is generally used. You then read in small (e.g. 32x32) blocks, transpose those (32*32*8-bit*3 is only going to need three block RAMs) and write them out, rather than doing the whole image at once.
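
As a rough illustration of both options, here is a minimal software sketch in plain C++. It is not tested in HLS. Plain pointers stand in for the image stream in option (1) and for the AXI Master memory in option (2). No interface or pipeline pragmas are shown. All names (MAX_DIM, BLOCK, CHANNELS, transpose_buffered, transpose_blockwise) are hypothetical, and the image is assumed to be interleaved 8-bit RGB.

```cpp
#include <cassert>
#include <cstdint>

const int MAX_DIM  = 256;  // largest supported image edge (assumption)
const int BLOCK    = 32;   // tile edge for the block-wise variant
const int CHANNELS = 3;    // R, G, B

// Option (1): buffer the whole image in plain arrays, read it back transposed.
// src holds rows x cols pixels; dst receives cols x rows pixels.
// Size check: 256*256 bytes per channel / 2048 bytes per BRAM_18K
// (assuming the 2K x 9 configuration) = 32 per channel, 96 in total.
void transpose_buffered(const uint8_t *src, uint8_t *dst, int rows, int cols)
{
    assert(rows > 0 && rows <= MAX_DIM);
    assert(cols > 0 && cols <= MAX_DIM);

    // static keeps the 192 KB buffer off the software stack.
    static uint8_t buf[CHANNELS][MAX_DIM][MAX_DIM];

    for (int r = 0; r < rows; r++)
        for (int c = 0; c < cols; c++)
            for (int ch = 0; ch < CHANNELS; ch++)
                buf[ch][r][c] = src[(r * cols + c) * CHANNELS + ch];

    // Output rows run over input columns.
    for (int c = 0; c < cols; c++)
        for (int r = 0; r < rows; r++)
            for (int ch = 0; ch < CHANNELS; ch++)
                dst[(c * rows + r) * CHANNELS + ch] = buf[ch][r][c];
}

// Option (2): transpose one BLOCK x BLOCK tile at a time.
// Only the small tile buffer is kept locally; src and dst are random-access.
void transpose_blockwise(const uint8_t *src, uint8_t *dst, int rows, int cols)
{
    assert(rows > 0 && cols > 0);

    uint8_t tile[CHANNELS][BLOCK][BLOCK];

    for (int br = 0; br < rows; br += BLOCK) {
        for (int bc = 0; bc < cols; bc += BLOCK) {
            // Clip the tile at the image border.
            int h = (rows - br < BLOCK) ? rows - br : BLOCK;
            int w = (cols - bc < BLOCK) ? cols - bc : BLOCK;

            // Read the tile row by row.
            for (int r = 0; r < h; r++)
                for (int c = 0; c < w; c++)
                    for (int ch = 0; ch < CHANNELS; ch++)
                        tile[ch][r][c] =
                            src[((br + r) * cols + (bc + c)) * CHANNELS + ch];

            // Write it out column by column into the mirrored tile position.
            for (int c = 0; c < w; c++)
                for (int r = 0; r < h; r++)
                    for (int ch = 0; ch < CHANNELS; ch++)
                        dst[((bc + c) * rows + (br + r)) * CHANNELS + ch] =
                            tile[ch][r][c];
        }
    }
}
```

Both functions produce the same output. The difference is the local storage: the first needs the full 256*256*3 bytes, the second only 32*32*3 bytes, at the cost of non-sequential accesses to the source and destination.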

 

5 Replies
Contributor
Contributor
970 Views
Registered: ‎03-13-2017

Re: Synthesis can't finish


Hello. I would like to know what WD is. I propose this solution:

 

void Transpose (RGB_IMAGE& src,RGB_IMAGE& dst,int rows,int cols)
{
#pragma HLS INTERFACE axis port=src
#pragma HLS INTERFACE axis port=dst

#pragma HLS INTERFACE ap_none port=rows
#pragma HLS INTERFACE ap_none port=cols

    WD win0,win1,win2;
    rgb_data pix;

    for(int row=0;row<rows;row++)
    {
#pragma HLS loop_flatten off
#pragma HLS PIPELINE II=1
        for(int col=0;col<cols;col++)
        {
#pragma HLS loop_flatten off
#pragma HLS PIPELINE II=1

            pix = src.read();

            win0.insert(pix.val[0],col,row);
            win1.insert(pix.val[1],col,row);
            win2.insert(pix.val[2],col,row);

            dst.write(pix);
        }
    }
}

 

Observer lss1776
Observer
961 Views
Registered: ‎11-02-2017

Re: Synthesis can't finish

WD is defined as an hls::Window of size 256*256, so I think this may be the problem.
Contributor
Contributor
952 Views
Registered: ‎03-13-2017

Re: Synthesis can't finish

Why do you need WD? The usual size of an hls::Window is 3x3.

Observer lss1776
Observer
944 Views
Registered: ‎11-02-2017

Re: Synthesis can't finish

I want to transpose an RGB picture, and the picture needs to be stored, so I use a window of the same size as the picture.