Observer martin-91x
Observer
4,279 Views
Registered: ‎10-02-2015

Image streaming


Hi,

I'm currently working on an image processing project. Here is an overview of my project structure and functions:

The first function only converts the three incoming streams (red, green and blue) to a grayscale image. The image type is

typedef uint8 grayScale_image[HEIGHT][WIDTH]

and the inner loops are pipelined.

void toGrayscale(grayScale_image img_in_r, grayScale_image img_in_g, grayScale_image img_in_b, grayScale_image img_out) {

	uint11 line_nr, column_nr;
	
	for (line_nr = 0; line_nr < HEIGHT; line_nr++) {
		toGrayscale_label0:
		for (column_nr = 0; column_nr < WIDTH; column_nr++) {
			img_out[line_nr][column_nr] = (img_in_r[line_nr][column_nr] + img_in_g[line_nr][column_nr] + img_in_b[line_nr][column_nr]) / 3;
		}
	}
}

 

The second function involved is a simple 3x3 Gaussian filter:

void prefilter(grayScale_image img_in, grayScale_image img_out) {

		uint11 line_nr, column_nr;
		uint8 line_buff0[WIDTH], line_buff1[WIDTH];
		uint8 a_00, a_01, a_02, a_10, a_11, a_12, a_20, a_21, a_22;
	
		a_00 = a_01 = a_02 = a_10 = a_11 = a_12 = a_20 = a_21 = a_22 = 0;

		for (line_nr = 0; line_nr < HEIGHT; line_nr++) {
			prefilter_label1:
			for (column_nr = 0; column_nr < WIDTH; column_nr++) {

				a_00 = a_01;
				a_01 = a_02;
				a_02 = line_buff0[column_nr];
	
				a_10 = a_11;
				a_11 = a_12;
				a_12 = line_buff1[column_nr];
	
				a_20 = a_21;
				a_21 = a_22;
				a_22 = img_in[line_nr][column_nr];
	
				if (column_nr > 1 && line_nr > 1) {
					img_out[line_nr - 1][column_nr - 1] = (a_00 + 2 * a_01 + a_02 + 2 * a_10 + 4 * a_11 + 2 * a_12 + a_20 + 2 * a_21 + a_22) / 16;
				}

				line_buff1[column_nr] = a_22;
				line_buff0[column_nr] = a_12;
			}
		}
}

 

Both functions are designed to process one sample per clock cycle. When synthesized separately, both have the same latency and interval. In the top-level function the two are connected (with the dataflow directive applied, and the stream directive applied to step0out):

void top(grayScale_image img_in_r, grayScale_image img_in_g, grayScale_image img_in_b, grayScale_image img_out) {

	grayScale_image step0out;

	toGrayscale(img_in_r, img_in_g, img_in_b, step0out);
	prefilter(step0out, img_out);
}

 

And here comes the problem: after connecting the two functions, I get much higher intervals than I expected. The synthesis report says the following

 

WARNING: [SCHED 204-68] Unable to enforce a carried dependency constraint (II = 1, distance = 1)
   between mem_fifo write on port 'img_out' and 'store' operation (MY_IP/HW_functions.c:11) of variable 'tmp_9_i', MY_IP/HW_functions.c:11 on array 'img_out'.

INFO: [SCHED 204-61] Pipelining result: Target II: 1, Final II: 2, Depth: 4.

 

when pipelining the inner loop in the first function and

 

WARNING: [SCHED 204-68] Unable to enforce a carried dependency constraint (II = 1, distance = 1)
   between mem_fifo read on port 'img_in' and 'load' operation ('a_22', MY_IP/HW_functions.c:38) on array 'img_in'.
WARNING: [SCHED 204-68] Unable to enforce a carried dependency constraint (II = 2, distance = 1)
   between mem_fifo read on port 'img_in' and 'load' operation ('a_22', MY_IP/HW_functions.c:38) on array 'img_in'.

INFO: [SCHED 204-61] Pipelining result: Target II: 1, Final II: 3, Depth: 3.

 

when pipelining the inner loop in the second function.

So my question is: what causes this behavior? I expected a latency of twice the latency of one function (as both have the same latency) and the same II as before. But as you can see, pipelining now increases my latency. Why?

 

 

Thank you for your help.

 

 

 

PS: I also get these warnings, which were not present when synthesizing both functions separately.

 

WARNING: [XFORM 203-542] Cannot flatten a loop nest 'Loop-1' (MY_IP/HW_functions.c:8:7) in function 'toGrayscale_Loop_1_proc' :
WARNING: [XFORM 203-542] the outer loop is not a perfect loop.
WARNING: [XFORM 203-542] Cannot flatten a loop nest 'Loop-1' (MY_IP/HW_functions.c:24:8) in function 'prefilter_Loop_1_proc' :
WARNING: [XFORM 203-542] the outer loop is not a perfect loop.

 

1 Solution

Accepted Solutions
Scholar u4223374
Scholar
7,865 Views
Registered: ‎04-26-2015

Re: Image streaming


A couple of ideas:

 

(1) It looks like you've got FIFOs for the image inputs/outputs at the moment. Is that what you expect? Normally for imaging applications you'd use AXI4 streams, and these should resolve the issues it's having with FIFOs on the inputs.

 

(2) Switching to the hls::stream type may well improve your performance substantially. Because that only has read() and write() functions (as opposed to indexed access), HLS is much less likely to get confused. In particular, your prefilter function only writes an image of (WIDTH-1)*(HEIGHT-1) pixels, which might confuse HLS as it doesn't know what to do with the "spare" pixel on each line. With streams, it doesn't have to worry about that.
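
As a rough illustration of point (2), here is a minimal sketch of toGrayscale written against a stream interface. It is not the poster's final code. The `Stream<T>` class is a hypothetical stand-in with only read(), write() and empty(), so that the sketch compiles without the Vivado HLS headers; in an HLS project one would presumably include hls_stream.h and use hls::stream<T> instead. The small HEIGHT and WIDTH values are also assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical stand-in for hls::stream<T> (assumption: only read(),
// write() and empty() are needed here). Not part of any Xilinx library.
template <typename T>
class Stream {
public:
    void write(const T &v) { fifo_.push(v); }
    T read() {
        assert(!fifo_.empty());  // reading an empty stream is a bug here
        T v = fifo_.front();
        fifo_.pop();
        return v;
    }
    bool empty() const { return fifo_.empty(); }
private:
    std::queue<T> fifo_;
};

// Illustrative image size (assumption, not from the thread).
const int HEIGHT = 4;
const int WIDTH = 6;

// Each iteration reads exactly one pixel from every input stream and
// writes exactly one pixel to the output, in raster order.
void toGrayscale(Stream<uint8_t> &in_r, Stream<uint8_t> &in_g,
                 Stream<uint8_t> &in_b, Stream<uint8_t> &out) {
    for (int line_nr = 0; line_nr < HEIGHT; line_nr++) {
        for (int column_nr = 0; column_nr < WIDTH; column_nr++) {
            // In an HLS build, a PIPELINE directive would apply to this loop.
            unsigned r = in_r.read();
            unsigned g = in_g.read();
            unsigned b = in_b.read();
            out.write(static_cast<uint8_t>((r + g + b) / 3));
        }
    }
}
```

Because the only operations are read() and write(), there is no index expression for the tool to analyse: the access order is sequential by construction.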

 

 

4 Replies
Observer martin-91x
Observer
4,236 Views
Registered: ‎10-02-2015

Re: Image streaming


Thanks, that helped me a lot.

 

@(1): All 4 input/output arrays are set to the axis interface using directives, so I expect to have AXI4 streams.

 

@(2): Switching to hls::stream solved my problem. But I don't know why using arrays with a stream directive is a problem. My code was based on C, not C++. I was thinking about switching to C++, but while reading the Vivado HLS paper I found this:

There is no requirement to use hls::streams and the same implementation can be performed
using arrays in the C code. The hls::stream construct does help enforce good coding practices.
More details on hls::streams are provided in HLS Stream Library in Chapter 2.

Those lines made me stick to C.

For the spare pixels: My real function uses a smaller output image with [HEIGHT-2][WIDTH-2], but to keep it simple I decided to skip that in my post above.

 

Scholar u4223374
Scholar
4,227 Views
Registered: ‎04-26-2015

Re: Image streaming


(1) That's odd, never seen that warning with AXI streams.

 

(2) You can definitely do image processing with arrays in C++ too, but (as in C) you have to be very careful. If your output image is 640*480 (for example), you must write 640 pixels per line (whereas your current code writes 639), they must be written in order, and each must be written exactly once. Furthermore, you have to write all of that in a way that is sufficiently obvious for HLS to recognise it. In your code, I suspect that something like this would have worked:

 

for (line_nr = 0; line_nr < (HEIGHT + 1); line_nr++) {
	prefilter_label1:
	for (column_nr = 0; column_nr < (WIDTH + 1); column_nr++) {

		// ... code removed
	
		int dataOut;
		if ((column_nr < WIDTH) && (line_nr < HEIGHT)) {
			dataOut = (a_00 + 2 * a_01 + a_02 + 2 * a_10 + 4 * a_11 + 2 * a_12 + a_20 + 2 * a_21 + a_22) / 16;
		} else {
			dataOut = 0;
		}
		if (column_nr > 1 && line_nr > 1) {
			img_out[line_nr - 1][column_nr - 1] = dataOut;
		} 
		
		// ... code removed
	}
}

With hls::streams it has a guarantee that those conditions are met (apart from line length, but streams don't need a line length).
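
For comparison, a stream-based version of prefilter might look like the sketch below. It is an illustration under assumptions, not the code the original poster ended up with. `Stream<T>` is again a hypothetical stand-in for hls::stream<T>, and HEIGHT and WIDTH are small illustrative values. The sketch reads HEIGHT*WIDTH pixels and writes (HEIGHT-2)*(WIDTH-2) pixels, matching the reduced output size mentioned earlier in the thread. The line buffers are zero-initialised here, which the original code does not do.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Hypothetical stand-in for hls::stream<T>; not a Xilinx API.
template <typename T>
class Stream {
public:
    void write(const T &v) { fifo_.push(v); }
    T read() {
        assert(!fifo_.empty());
        T v = fifo_.front();
        fifo_.pop();
        return v;
    }
    bool empty() const { return fifo_.empty(); }
private:
    std::queue<T> fifo_;
};

// Illustrative image size (assumption, not from the thread).
const int HEIGHT = 4;
const int WIDTH = 6;

// 3x3 Gaussian filter: one read per input pixel, at most one write per
// iteration, both strictly in raster order.
void prefilter(Stream<uint8_t> &in, Stream<uint8_t> &out) {
    uint8_t line_buff0[WIDTH] = {0};  // pixels from two lines above
    uint8_t line_buff1[WIDTH] = {0};  // pixels from the line above
    uint8_t a_00 = 0, a_01 = 0, a_02 = 0;
    uint8_t a_10 = 0, a_11 = 0, a_12 = 0;
    uint8_t a_20 = 0, a_21 = 0, a_22 = 0;

    for (int line_nr = 0; line_nr < HEIGHT; line_nr++) {
        for (int column_nr = 0; column_nr < WIDTH; column_nr++) {
            // Shift the 3x3 window one column to the left.
            a_00 = a_01; a_01 = a_02; a_02 = line_buff0[column_nr];
            a_10 = a_11; a_11 = a_12; a_12 = line_buff1[column_nr];
            a_20 = a_21; a_21 = a_22; a_22 = in.read();

            // The window only covers valid pixels from row 2, column 2 on.
            if (column_nr > 1 && line_nr > 1) {
                unsigned sum = a_00 + 2 * a_01 + a_02
                             + 2 * a_10 + 4 * a_11 + 2 * a_12
                             + a_20 + 2 * a_21 + a_22;
                out.write(static_cast<uint8_t>(sum / 16));
            }

            // Update the line buffers for the next line.
            line_buff0[column_nr] = a_12;
            line_buff1[column_nr] = a_22;
        }
    }
}
```

The consumer of `out` simply reads however many pixels arrive, so the write side never has to account for the border pixels it skips.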

Observer martin-91x
Observer
4,217 Views
Registered: ‎10-02-2015

Re: Image streaming


(1) Now, after switching to hls::stream, all warnings are gone :)

 

(2) Yeah, I know. As I said, my output image has a size of [HEIGHT - 2][WIDTH - 2], so this shouldn't be a problem at all.

But I was confused, as toGrayscale definitely writes 640*480 pixels in order and prefilter reads the same amount in the same order. The output is still the same (with 638*478) and no warning occurs. But I got your point. In the future, I will use hls::stream, as this makes life much easier.
