Explorer
746 Views
Registered: ‎09-05-2015

## Output array doesn't seem to be properly initialized

Hi all,

I slightly modified the matrix multiplication tutorial code so that it works for complex numbers. The code is below.

ar and ai are the real and imaginary parts of an element of matrix A.

br and bi are the real and imaginary parts of an element of matrix B.

pr and pi are the real and imaginary parts of an element of the output matrix.

```
void matrixmul(
    t_input_scalar ar[MAT_A_ROWS][MAT_A_COLS],
    t_input_scalar ai[MAT_A_ROWS][MAT_A_COLS],
    t_input_scalar br[MAT_B_ROWS][MAT_B_COLS],
    t_input_scalar bi[MAT_B_ROWS][MAT_B_COLS],
    t_output_scalar pr[MAT_A_ROWS][MAT_B_COLS],
    t_output_scalar pi[MAT_A_ROWS][MAT_B_COLS]) {

    // Iterate over the rows of the A matrix
    Row: for (int i = 0; i < MAT_A_ROWS; i++) {
        // Iterate over the columns of the B matrix
        Col: for (int j = 0; j < MAT_B_COLS; j++) {
            pr[i][j] = 0;
            pi[i][j] = 0;
            // Do the complex inner product of a row of A and a column of B
            Product: for (int k = 0; k < MAT_B_ROWS; k++) {
                pr[i][j] = ar[i][k] * br[k][j] - ai[i][k] * bi[k][j] + pr[i][j];
                pi[i][j] = ar[i][k] * bi[k][j] + ai[i][k] * br[k][j] + pi[i][j];
            }
        }
    }
}
```

Below is the HLS synthesis report for the code above.

I am not sure about pr_V_q0 and pi_V_q0.

pr and pi are explicitly initialized to 0's. Why do I still have them as inputs?

1 Solution

Accepted Solutions
720 Views
Registered: ‎04-26-2015

@moon5756 The port interface type is ap_memory. This means that these arrays are not stored inside the HLS block - it's just going to create a port that you need to connect to an external memory.

Since HLS needs to both read and write from that memory, it's created both read and write ports.

If you change the interface type to s_axilite, that might do the job. I'm not sure if HLS will be smart enough to see that this is only written from the block; you might need to do more optimization. This might be worthwhile anyway; depending on your data types it could give a substantial performance boost.
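As a rough sketch of the s_axilite suggestion above: in Vivado HLS the interface mode of an argument can be requested with an `INTERFACE` pragma inside the function body. The sizes and the `int` typedefs below are stand-ins, because the thread does not show the header, and the function name `matrixmul_axilite` is made up for illustration. Whether this removes the read ports in the synthesized design has not been checked here.

```cpp
// Assumed sizes and types for illustration only; the real header is not shown in the thread.
#define MAT_A_ROWS 2
#define MAT_A_COLS 2
#define MAT_B_ROWS 2
#define MAT_B_COLS 2
typedef int t_input_scalar;
typedef int t_output_scalar;

// Hypothetical variant of the posted function with interface pragmas on the outputs.
void matrixmul_axilite(
    t_input_scalar ar[MAT_A_ROWS][MAT_A_COLS],
    t_input_scalar ai[MAT_A_ROWS][MAT_A_COLS],
    t_input_scalar br[MAT_B_ROWS][MAT_B_COLS],
    t_input_scalar bi[MAT_B_ROWS][MAT_B_COLS],
    t_output_scalar pr[MAT_A_ROWS][MAT_B_COLS],
    t_output_scalar pi[MAT_A_ROWS][MAT_B_COLS]) {
// Request an AXI4-Lite slave interface for the outputs instead of the default ap_memory ports.
// A plain C++ compiler ignores these pragmas, so the function still runs in C simulation.
#pragma HLS INTERFACE s_axilite port=pr
#pragma HLS INTERFACE s_axilite port=pi

    Row: for (int i = 0; i < MAT_A_ROWS; i++) {
        Col: for (int j = 0; j < MAT_B_COLS; j++) {
            pr[i][j] = 0;
            pi[i][j] = 0;
            // Complex inner product of row i of A and column j of B
            Product: for (int k = 0; k < MAT_B_ROWS; k++) {
                pr[i][j] = ar[i][k] * br[k][j] - ai[i][k] * bi[k][j] + pr[i][j];
                pi[i][j] = ar[i][k] * bi[k][j] + ai[i][k] * br[k][j] + pi[i][j];
            }
        }
    }
}
```

The same directive can usually be applied from the tool's directives pane or a Tcl directives file instead of editing the source.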

5 Replies

Explorer
707 Views
Registered: ‎09-05-2015

@u4223374 Thanks for the quick reply.

```
void matrixmul(
    mat_a_t a[MAT_A_ROWS][MAT_A_COLS],
    mat_b_t b[MAT_B_ROWS][MAT_B_COLS],
    result_t res[MAT_A_ROWS][MAT_B_COLS])
{
    // Iterate over the rows of the A matrix
    Row: for (int i = 0; i < MAT_A_ROWS; i++) {
        // Iterate over the columns of the B matrix
        Col: for (int j = 0; j < MAT_B_COLS; j++) {
            res[i][j] = 0;
            // Do the inner product of a row of A and col of B
            Product: for (int k = 0; k < MAT_B_ROWS; k++) {
                res[i][j] += a[i][k] * b[k][j];
            }
        }
    }
}
```

The code above is from the HLS tutorial, and when I synthesize it, res doesn't have an input port.

Because I changed very little, I expected that pr and pi wouldn't have input ports either.

Xilinx Employee
681 Views
Registered: ‎09-04-2017

Can you share a snapshot of the interface for the second example, as you did for the first?

From what you described, it should be the same in both cases unless directives are being applied to the interfaces.

Thanks,

Nithin

Explorer
650 Views
Registered: ‎09-05-2015

@nithink Below is the interface snapshot of the second example from the tutorial.

Anyway, the fact that I only need to connect to the external memory for the ap_memory port resolves the issue. Thanks @u4223374!

To be clearer, the complex matrix multiplication writes the accumulated sums out on pr_V_d0 and pi_V_d0, and reads them back in on pr_V_q0 and pi_V_q0.

On the other hand, the integer matrix multiplication (the second code snippet) accumulates the sum inside the function.
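The difference described above can be made explicit in the source. A minimal sketch, assuming `int` stand-ins for the scalar types and 2x2 sizes (the real header is not shown in the thread): the running sums live in local variables, so `pr` and `pi` are only ever written, once per element. The function name `matrixmul_local` and the accumulator names are made up for illustration, and whether the tool then drops the `q0` ports has not been checked here.

```cpp
// Assumed sizes and types for illustration only; the real header is not shown in the thread.
#define MAT_A_ROWS 2
#define MAT_A_COLS 2
#define MAT_B_ROWS 2
#define MAT_B_COLS 2
typedef int t_input_scalar;
typedef int t_output_scalar;

// Hypothetical variant: accumulate locally, then write each output element exactly once.
void matrixmul_local(
    t_input_scalar ar[MAT_A_ROWS][MAT_A_COLS],
    t_input_scalar ai[MAT_A_ROWS][MAT_A_COLS],
    t_input_scalar br[MAT_B_ROWS][MAT_B_COLS],
    t_input_scalar bi[MAT_B_ROWS][MAT_B_COLS],
    t_output_scalar pr[MAT_A_ROWS][MAT_B_COLS],
    t_output_scalar pi[MAT_A_ROWS][MAT_B_COLS]) {

    Row: for (int i = 0; i < MAT_A_ROWS; i++) {
        Col: for (int j = 0; j < MAT_B_COLS; j++) {
            // Local accumulators: the output arrays are never read back
            t_output_scalar acc_r = 0;
            t_output_scalar acc_i = 0;
            Product: for (int k = 0; k < MAT_B_ROWS; k++) {
                // (ar + j*ai) * (br + j*bi): real and imaginary parts
                acc_r += ar[i][k] * br[k][j] - ai[i][k] * bi[k][j];
                acc_i += ar[i][k] * bi[k][j] + ai[i][k] * br[k][j];
            }
            pr[i][j] = acc_r;
            pi[i][j] = acc_i;
        }
    }
}
```

The arithmetic is the same as in the posted code; only the place where the partial sums are held changes.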

Xilinx Employee
621 Views
Registered: ‎09-04-2017

If you look at the RTL, these inputs are created but not used in the logic.

Thanks,

Nithin