Explorer
Registered: ‎05-23-2017

How to reduce the runtime of compilation during hardware emulation.

The compilation for hardware emulation takes a very long time.

 

Here is the relevant part of the log:

 

INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'read_query_pca' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...
INFO: [SCHED 204-11] Finished scheduling.
INFO: [HLS 200-111]  Elapsed time: 28643.7 seconds; current allocated memory: 3.936 GB.
INFO: [BIND 205-100] Starting micro-architecture generation ...
INFO: [BIND 205-101] Performing variable lifetime analysis.
INFO: [BIND 205-101] Exploring resource sharing.
INFO: [BIND 205-101] Binding ...
INFO: [BIND 205-100] Finished micro-architecture generation.
INFO: [HLS 200-111]  Elapsed time: 4.29 seconds; current allocated memory: 3.936 GB.
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [HLS 200-42] -- Implementing module 'read_query_or' 
INFO: [HLS 200-10] ----------------------------------------------------------------
INFO: [SCHED 204-11] Starting scheduling ...

Just implementing module 'read_query_pca' takes almost 8 hours (28643.7 seconds for scheduling, according to the log).

It's a very simple function.

void read_query_pca(const ap_uint<512>  *query_pca, D_point_pca *query_pca_oc){
//  #pragma HLS INLINE
    ap_uint<512> query_pca_temp;
    D_point_pca query_pca_oc_temp;
    #pragma HLS ARRAY_PARTITION variable=query_pca_oc_temp.x complete dim=0
    int loop_num = D_PCA/16;// each 512 bit including 16 float data
loop_rd_q_pca:for(int j=0; j<loop_num; j++){
        query_pca_temp = query_pca[j];
        for(int k=0; k<16; k++){
            #pragma HLS unroll
            unsigned int i_temp=query_pca_temp.range(32*(k+1) -1, 32*k);
            float f_temp=*(float*)(&i_temp);
            query_pca_oc_temp.x[j*16+k] = f_temp;
        }
    }
    *query_pca_oc = query_pca_oc_temp;
}

I guess the compiler is trying many different optimization strategies for the function, which is why it takes so long.

Is there a way to reduce the implementation time?

 

Accepted Solutions

Re: How to reduce the runtime of compilation during hardware emulation.

I finally found that this was caused by the partitioning of a very large array.
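
The post does not say which array was involved or how the code was changed. For readers with the same symptom, here is a minimal sketch of one possible mitigation, assuming the array in question is the local copy in read_query_pca: partition it cyclically by 16 rather than completely, so the tool builds 16 banks instead of one register per element. The Word512 struct (a stand-in for ap_uint<512>), the D_point_pca layout, and D_PCA = 64 are hypothetical, chosen only so the sketch compiles with an ordinary C++ compiler, which ignores the HLS pragmas. Whether this actually shortens scheduling for a given design is an assumption to verify.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins so the sketch builds without the HLS headers.
constexpr int D_PCA = 64;   // assumed size; the post does not state it
constexpr int LANES = 16;   // 16 floats per 512-bit word, as in the post

static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

struct Word512 { std::uint32_t lane[LANES]; };  // stand-in for ap_uint<512>
struct D_point_pca { float x[D_PCA]; };         // assumed layout

void read_query_pca(const Word512 *query_pca, D_point_pca *query_pca_oc) {
    D_point_pca tmp;
    // Cyclic partition by 16 instead of "complete": element j*16+k lands in
    // bank k, so the 16 writes of one iteration go to 16 different banks.
#pragma HLS ARRAY_PARTITION variable=tmp.x cyclic factor=16 dim=1
loop_rd_q_pca:
    for (int j = 0; j < D_PCA / LANES; j++) {
#pragma HLS PIPELINE II=1
        Word512 w = query_pca[j];
        for (int k = 0; k < LANES; k++) {
#pragma HLS UNROLL
            float f;
            // Copy the 32 raw bits into a float (no pointer cast needed).
            std::memcpy(&f, &w.lane[k], sizeof f);
            tmp.x[j * LANES + k] = f;
        }
    }
    *query_pca_oc = tmp;
}
```

With the real ap_uint<512> type, the lane read would be the original query_pca_temp.range(32*(k+1)-1, 32*k). The factor of 16 matches the number of floats unpacked per 512-bit word, so it is the smallest factor that still lets the unrolled inner loop write all 16 values in one cycle.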
