Adventurer
Registered: ‎10-19-2015

Running Control Logic and Output in Parallel

Hi all,

 

I am trying to run parts of my HLS C code in parallel. There is a "ControlLogic" block which increments a variable named row_counter_ref and writes it to the cntrs_out array (an output port), as shown in the code below.

 

I want the code in ControlLogicBlock, ProcessingBlock and OutputLogicBlock to run in parallel. How can I do that? I implemented this on an FPGA, and it seems that whenever OutputLogicBlock starts running, ControlLogicBlock stops. According to the HLS code, ControlLogicBlock should run whenever its conditions are met, but as the waveform shows, the counter (cntrs_out) does not get incremented. This means that ControlLogicBlock does not run while OutputLogicBlock is running.

 

 

 

#include "loop_core.h"




void  loop_core(const in_point_t* din, out_point_t dout[ROWS*(OUTPUT_WIDTH+1)],
		ap_uint<12> *current_row_cntr,
		ap_uint<12> *hdr_row_cntr,
		ap_uint<4>  *hdr_frm_cntr,
		ap_uint<16> cntrs_out[COLS]) {

#pragma HLS DATAFLOW
#pragma HLS INTERFACE axis register depth=2 port=dout
// Note: depth>1 for din breaks Co-Sim.
#pragma HLS INTERFACE axis register depth=1 port=din
#pragma HLS INTERFACE axis register depth=1 port=cntrs_out
#pragma HLS DATA_PACK variable=din field_level
#pragma HLS DATA_PACK variable=dout field_level
#pragma HLS INTERFACE ap_ctrl_none port=return


	in_point_t temp_in;
	temp_in = *din;
	out_point_t tmp_out;

	apuint2 temporal;
	apuint1 init_ref;


    /*
     * BRAM REF IMAGES
     * */
	static apuint3 REF_IMAGE1[ROWS][COLS];
	//static apuint1 REF_IMAGE2[ROWS][COLS];
	//static apuint1 REF_IMAGE3[ROWS][COLS];
	#pragma HLS ARRAY_PARTITION variable=REF_IMAGE1 cyclic factor=64 dim=2
	//#pragma HLS ARRAY_PARTITION variable=REF_IMAGE2 cyclic factor=64 dim=2
	//#pragma HLS ARRAY_PARTITION variable=REF_IMAGE3 cyclic factor=64 dim=2


    /*
     * BRAM buffer image
     * */
	 static apuint1 buffer[ROWS][COLS];
     #pragma HLS ARRAY_PARTITION variable=buffer cyclic factor=64 dim=2

	 static apuint1 buffer_variable;

    /*
     * Counters
     * */
	static ap_uint<16> counters[COLS] ={0};
    //#pragma HLS ARRAY_PARTITION variable=counters complete dim=1
    #pragma HLS ARRAY_PARTITION variable=counters cyclic factor=64 dim=1

	static ap_uint<16> counters_temp[PACK_WIDTH] ={0};
    #pragma HLS ARRAY_PARTITION variable=counters_temp complete dim=1
    //#pragma HLS ARRAY_PARTITION variable=counters cyclic factor=128 dim=1


	// row counter that counts up to 511 rows
	static ap_uint<12> row_counter = 0;

	// Row counter for the reference image (reset at ROWS*4 in ControlLogicBlock)
	static ap_uint<12> row_counter_ref = 0;

	// Frame counter that counts up to 15 frames
	static ap_uint<4> frame_counter = 0;

	// Reference frame flag: set when row_counter_ref reaches ROWS, cleared by OutputLogicBlock
	static ap_uint<4> frame_counter_ref = 0;

	// Global frame counter (16 bits wide)
    static ap_uint<16> global_frame_counter = 0;

	// A temporary variable used to pack 16 or 64 pixels into one word
	OUT_DATA_TYPE_HW temp;

	// A temporary variable that holds 2048-bits of data to register.
	IN_DATA_TYPE_HW IN_DATA_LOCAL;

	// A temporary variable that holds 64-bit data
	OUT_DATA_TYPE_HW  ROW_DATA_CURRENT;

	// Indicates true if being processed.
	static ap_uint<1> process = 0;

	// Frame valid signal
	static ap_uint<1> frame_valid = 0;

	// Line valid signal
	static ap_uint<1> line_valid = 0;

	// Column index into cntrs_out
	static ap_uint<11> col_index = 0;

	// Copy the reference image to the buffer

	// Enable processing if one row (2048 bits) has been received
	ControlLogicBlock: {
	// This logic needs to run in Parallel with the Output Logic
	if (temp_in.dv==1 && temp_in.fv==1 && temp_in.lv==1 ) {
		IN_DATA_LOCAL = temp_in.in_data;
		init_ref = temp_in.init;
		temporal = temp_in.temporal;

		if(init_ref==1) {

			row_counter_ref += 1;
			if(row_counter_ref<=ROWS) {
				process = 1;

			}
			else{
				if(row_counter_ref==ROWS*4){
					row_counter_ref=0;
				}
				process = 0;
			}


			cntrs_out[col_index] = row_counter_ref;
			col_index +=1;
		}
		else {
			row_counter += 1;
		}
		//cout<<"ROW counter = "<<row_counter<<" Ref row counter= "<<row_counter_ref<<endl;
	}
	}

	*hdr_row_cntr = temp_in.dv;

	// Initialize BRAM with reference images
	// ProcessingBlock
	if(init_ref==1 &&process==1) {
		cout<<"Row counter ="<<row_counter_ref<<endl;
		frame_valid = 1;
		// Initialize the reference images
		InitRef1:for(ap_uint<8> i=0;i<ITERATIONS;i++) {
			#pragma HLS PIPELINE II=1
			ROW_DATA_CURRENT=IN_DATA_LOCAL.range(PACK_WIDTH*i+PACK_WIDTH_ONE_LESS, PACK_WIDTH*i);
			InitRef2:for(ap_uint<16> j=0;j<PACK_WIDTH;j++){
				//cout<< buffer[row_counter-1][i*PACK_WIDTH+j]<<ROW_DATA_CURRENT.bit(j)<<endl;
				REF_IMAGE1[row_counter_ref-1][i*PACK_WIDTH+j].bit(temporal) = ROW_DATA_CURRENT.bit(j);
				// tmp_out.out_data.bit(j) = REF_IMAGE1[row_counter_ref-1][i*PACK_WIDTH+j].bit(temporal);
			}
			if(i==ITERATIONS-1) {
				process = 0;
			}
		}
		//cout<<"Initalizing "<<"ROWS = "<<row_counter<<endl;
		*hdr_frm_cntr = process;
	}

	tmp_out.row_cntr = row_counter;
	tmp_out.frm_cntr = frame_counter;
	*current_row_cntr = temp_in.fv;

	if(row_counter_ref==ROWS) {
		if(frame_counter_ref==0) {
			frame_counter_ref = 1;
		}
	}

	// OutputLogicBlock
	// This logic needs to run in Parallel with the Output Logic
	if(init_ref==1 && frame_counter_ref==1) {
		//cout<<"Processing Hardware Output"<<endl;
		frame_valid = 1;
		frame_counter_ref = 0;
		RefOut1:for(int rs=0;rs<ROWS;rs++) {
			RefOut2:for(int i=0;i<OUTPUT_WIDTH+1;i++) {
				#pragma HLS PIPELINE II=1
				if(i==OUTPUT_WIDTH) {
					line_valid = 0;
					tmp_out.out_data = 0;
				}
				else {
					line_valid = 1;
					RefOut3:for(int j=0;j<PACK_WIDTH;j++) {
						tmp_out.out_data.bit(j) = REF_IMAGE1[rs][i*PACK_WIDTH+j].bit(temporal);
					}
				}
				if(rs==ROWS-1 && i==OUTPUT_WIDTH) {
					frame_valid= 0;
				}
				tmp_out.fv = frame_valid;
				tmp_out.line = line_valid;
				dout[rs*(OUTPUT_WIDTH+1)+i] = tmp_out;
			}
		}
	}
}
#endif

 

output_waveform_xilinx.PNG
2 Replies
Scholar jprice
Registered: ‎01-28-2014

Re: Running Control Logic and Output in Parallel

 

I think for the dataflow directive to behave the way you want, you need to split your three blocks into three separate functions. I'm also suspicious of using static variables in dataflow, but that's because the dataflow directive is very fragile.
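
To make the suggestion concrete, here is a rough sketch of what "three separate functions" could look like. It is not the original design: the names control_logic, processing, output_logic, loop_core_split and N_ROWS are hypothetical, the data types are simplified to plain integers, and Stream is a small host-side stand-in written so the sketch compiles without the HLS headers (in a real project the channels would presumably be hls::stream objects). Each stage keeps its own state as a local variable and talks to the next stage only through a stream, which is the shape the dataflow directive generally expects. Whether this gives the desired overlap on hardware still has to be checked in the tool reports.

```cpp
#include <cstdint>
#include <queue>

// Host-side stand-in for a streaming channel (assumption: the real design
// would use hls::stream<T> instead of this class).
template <typename T>
class Stream {
public:
    void write(const T& v) { q_.push(v); }
    T read() { T v = q_.front(); q_.pop(); return v; }
    bool empty() const { return q_.empty(); }
private:
    std::queue<T> q_;
};

const int N_ROWS = 4;  // hypothetical small row count for illustration

// Stage 1: counts incoming rows, reports the counter, forwards the row.
void control_logic(Stream<uint32_t>& din, Stream<uint32_t>& rows,
                   Stream<uint16_t>& cntrs_out) {
    uint16_t row_counter_ref = 0;  // local state, not a shared static
    for (int r = 0; r < N_ROWS; r++) {
#pragma HLS PIPELINE II=1
        uint32_t row = din.read();
        row_counter_ref += 1;
        cntrs_out.write(row_counter_ref);
        rows.write(row);
    }
}

// Stage 2: placeholder for the reference-image processing (identity here).
void processing(Stream<uint32_t>& rows, Stream<uint32_t>& processed) {
    for (int r = 0; r < N_ROWS; r++) {
#pragma HLS PIPELINE II=1
        processed.write(rows.read());
    }
}

// Stage 3: drives the output port from the processed rows.
void output_logic(Stream<uint32_t>& processed, Stream<uint32_t>& dout) {
    for (int r = 0; r < N_ROWS; r++) {
#pragma HLS PIPELINE II=1
        dout.write(processed.read());
    }
}

// Top level: one producer and one consumer per channel, called in order.
void loop_core_split(Stream<uint32_t>& din, Stream<uint32_t>& dout,
                     Stream<uint16_t>& cntrs_out) {
#pragma HLS DATAFLOW
    Stream<uint32_t> rows;
    Stream<uint32_t> processed;
    control_logic(din, rows, cntrs_out);
    processing(rows, processed);
    output_logic(processed, dout);
}
```

In C simulation the three calls simply run one after another; the intent is that on hardware the stages can overlap because the only coupling between them is the two streams. A host-side check would write N_ROWS values into din, call loop_core_split, and read back N_ROWS counter values and N_ROWS output words.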

Explorer
Registered: ‎08-31-2017

Re: Running Control Logic and Output in Parallel

@jprice

 

Would you please elaborate a bit more on what you mean by "the dataflow directive is very fragile"? Thanks :)
