moon5756
Explorer
Registered: ‎09-05-2015

Output array doesn't seem to be properly initialized


Hi all,

I slightly modified the matrix multiplication tutorial code so that it works for complex numbers. The code is below.

ar and ai are the real and imaginary parts of the elements of matrix A.

br and bi are the real and imaginary parts of the elements of matrix B.

pr and pi are the real and imaginary parts of the elements of the output matrix.

 

 

void matrixmul(
		t_input_scalar ar[MAT_A_ROWS][MAT_A_COLS],
		t_input_scalar ai[MAT_A_ROWS][MAT_A_COLS],
		t_input_scalar br[MAT_B_ROWS][MAT_B_COLS],
		t_input_scalar bi[MAT_B_ROWS][MAT_B_COLS],
		t_output_scalar pr[MAT_A_ROWS][MAT_B_COLS],
		t_output_scalar pi[MAT_A_ROWS][MAT_B_COLS]){

  // Iterate over the rows of the A matrix
   Row: for(int i = 0; i < MAT_A_ROWS; i++) {
      // Iterate over the columns of the B matrix
      Col: for(int j = 0; j < MAT_B_COLS; j++) {
    	  pr[i][j] = 0;
    	  pi[i][j] = 0;
         // Do the inner product of a row of A and col of B
         Product: for(int k = 0; k < MAT_B_ROWS; k++) {
        	 pr[i][j] = ar[i][k] * br[k][j] - ai[i][k] * bi[k][j] + pr[i][j];
        	 pi[i][j] = ar[i][k] * bi[k][j] + ai[i][k] * br[k][j] + pi[i][j];
         }
      }
   }
}
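
For reference, the two update lines in the Product loop are the usual expansion of a complex product into its real and imaginary parts. Written out below, the superscripts r and i are just shorthand for the ar/ai, br/bi and pr/pi arrays, and K stands for MAT_B_ROWS; this notation is introduced here for illustration and does not come from the tutorial.

```latex
\begin{aligned}
p_{ij} &= \sum_{k=0}^{K-1} \left(a^{r}_{ik} + \mathrm{i}\,a^{i}_{ik}\right)\left(b^{r}_{kj} + \mathrm{i}\,b^{i}_{kj}\right) \\
p^{r}_{ij} &= \sum_{k=0}^{K-1} \left(a^{r}_{ik}\,b^{r}_{kj} - a^{i}_{ik}\,b^{i}_{kj}\right) \\
p^{i}_{ij} &= \sum_{k=0}^{K-1} \left(a^{r}_{ik}\,b^{i}_{kj} + a^{i}_{ik}\,b^{r}_{kj}\right)
\end{aligned}
```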

 

 

Below is the HLS synthesis report for the code above.

I am not sure about pr_V_q0 and pi_V_q0.

pr and pi are explicitly initialized to zero, so why do they still appear as inputs?

synth_output.PNG

 

Thanks in advance.

 

 

1 Solution

Accepted Solutions
u4223374
Advisor
Registered: ‎04-26-2015

@moon5756 The port interface type is ap_memory. This means that these arrays are not stored inside the HLS block - it's just going to create a port that you need to connect to an external memory.

Since HLS needs to both read from and write to that memory, it has created both read and write ports.

 

If you change the interface type to s_axilite, that might do the job. I'm not sure if HLS will be smart enough to see that this is only written from the block; you might need to do more optimization. This might be worthwhile anyway; depending on your data types it could give a substantial performance boost.

5 Replies
moon5756
Explorer
Registered: ‎09-05-2015

@u4223374 Thanks for the quick reply.

void matrixmul(
      mat_a_t a[MAT_A_ROWS][MAT_A_COLS],
      mat_b_t b[MAT_B_ROWS][MAT_B_COLS],
      result_t res[MAT_A_ROWS][MAT_B_COLS])
{
  // Iterate over the rows of the A matrix
   Row: for(int i = 0; i < MAT_A_ROWS; i++) {
      // Iterate over the columns of the B matrix
      Col: for(int j = 0; j < MAT_B_COLS; j++) {
         res[i][j] = 0;
         // Do the inner product of a row of A and col of B
         Product: for(int k = 0; k < MAT_B_ROWS; k++) {
            res[i][j] += a[i][k] * b[k][j];
         }
      }
   }

}

The code above is from the HLS tutorial, and when I synthesize it, res does not get an input port.

Because I changed very little, I thought pr and pi shouldn't have input ports either.

 

nithink
Xilinx Employee
Registered: ‎09-04-2017

Can you share a snapshot of the interface for the second example, as you did for the first?

From what you describe, it should be the same in both cases unless some directives are being applied to the interfaces.

Thanks,

Nithin

moon5756
Explorer
Registered: ‎09-05-2015

@nithink Below is the interface snapshot of the second example from the tutorial.

mat_mul_interface.PNG

 

 

Anyway, the fact that I only need to connect to the external memory for the ap_memory port resolves the issue. Thanks @u4223374!

To be clearer, the matrix multiplication for complex elements writes the accumulated sum out through pr_V_d0 and pi_V_d0 and reads it back in through pr_V_q0 and pi_V_q0.

The matrix multiplication for the integer matrix (the second code snippet), on the other hand, accumulates the sum inside the function.
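
One way to make the accumulation explicitly local in the complex version is to sum into two scalar variables and write each output element exactly once. This is only a sketch under stated assumptions: the 2x2 dimensions and the int typedefs are placeholders, because the real definitions are not shown in this thread, and the name matrixmul_local_acc is hypothetical. Whether the tool then drops the pr_V_q0 and pi_V_q0 read ports may depend on the tool version and directives, so the synthesis report should be checked.

```cpp
// Placeholder definitions, assumed for illustration only;
// the thread does not show the real header.
const int MAT_A_ROWS = 2;
const int MAT_A_COLS = 2;
const int MAT_B_ROWS = 2;
const int MAT_B_COLS = 2;
typedef int t_input_scalar;
typedef int t_output_scalar;

// Hypothetical variant of matrixmul: pr and pi are only ever written,
// never read back, because the running sums live in local variables.
void matrixmul_local_acc(
        t_input_scalar ar[MAT_A_ROWS][MAT_A_COLS],
        t_input_scalar ai[MAT_A_ROWS][MAT_A_COLS],
        t_input_scalar br[MAT_B_ROWS][MAT_B_COLS],
        t_input_scalar bi[MAT_B_ROWS][MAT_B_COLS],
        t_output_scalar pr[MAT_A_ROWS][MAT_B_COLS],
        t_output_scalar pi[MAT_A_ROWS][MAT_B_COLS]) {
    // Iterate over the rows of the A matrix
    for (int i = 0; i < MAT_A_ROWS; i++) {
        // Iterate over the columns of the B matrix
        for (int j = 0; j < MAT_B_COLS; j++) {
            // Local accumulators for the real and imaginary sums
            t_output_scalar acc_r = 0;
            t_output_scalar acc_i = 0;
            // Inner product of a row of A and a column of B
            for (int k = 0; k < MAT_B_ROWS; k++) {
                acc_r += ar[i][k] * br[k][j] - ai[i][k] * bi[k][j];
                acc_i += ar[i][k] * bi[k][j] + ai[i][k] * br[k][j];
            }
            // Single write per output element
            pr[i][j] = acc_r;
            pi[i][j] = acc_i;
        }
    }
}
```

Called from a C test bench with the same six arrays as the original function, this should produce the same results, since the arithmetic is unchanged and only the place where the partial sums are held differs.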

nithink
Xilinx Employee
Registered: ‎09-04-2017

If you look at the RTL, these inputs are created but not used in the logic.

Thanks,

Nithin
