Explorer
Registered: ‎05-23-2017

hls::stream using problem


In the top function, "a" is the stream channel between function_a (producer) and function_b (consumer).

I want to pipeline these two functions using the dataflow pragma.

Which of the following two ways of using hls::stream is better?

 

Thanks.

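// Pattern 1: the loop is in top(); each call moves one element.
// Types are assumed to be int; the original post did not give them.
#include <hls_stream.h>

void function_a(int input, hls::stream<int> &a) {
    a << input;            // producer: write one element
}

void function_b(hls::stream<int> &a, int &b) {
    int b1 = a.read();     // consumer: read one element
    b += b1;
}

void top(int input, int &b) {
    hls::stream<int> a;
    for (int i = 0; i < 100; i++) {
#pragma HLS dataflow
        function_a(input, a);
        function_b(a, b);
    }
}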

 

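// Pattern 2: each function loops internally; top() calls each one once.
// Types are assumed to be int; the original post did not give them.
#include <hls_stream.h>

void function_a(int input, hls::stream<int> &a) {
    for (int i = 0; i < 100; i++) {
        a << input;            // producer: write 100 elements
    }
}

void function_b(hls::stream<int> &a, int &b) {
    for (int i = 0; i < 100; i++) {
        int b1 = a.read();     // consumer: read 100 elements
        b += b1;
    }
}

void top(int input, int &b) {
    hls::stream<int> a;
#pragma HLS dataflow
    function_a(input, a);
    function_b(a, b);
}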
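
One way to sanity-check that the two patterns compute the same result is a plain C++ model. The sketch below is only an illustration: it assumes a queue-based stand-in (IntStream, a hypothetical name) instead of the real hls::stream, and it runs the producer and consumer one after the other, so it says nothing about hardware timing or resource use. The names pattern1, pattern2 and max_depth are also made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical software stand-in for hls::stream<int>; NOT the Xilinx class.
struct IntStream {
    std::queue<int> q;
    std::size_t max_depth = 0;  // highest occupancy seen so far

    void write(int v) {
        q.push(v);
        if (q.size() > max_depth) max_depth = q.size();
    }
    int read() {
        assert(!q.empty());     // reading an empty stream is an error here
        int v = q.front();
        q.pop();
        return v;
    }
};

const int N = 100;  // iteration count taken from the post

// Pattern 1: the loop is in the caller, one element per producer/consumer call.
int pattern1(int input, IntStream &a) {
    int b = 0;
    for (int i = 0; i < N; i++) {
        a.write(input);  // body of function_a
        b += a.read();   // body of function_b
    }
    return b;
}

// Pattern 2: each stage loops internally and is called once.
int pattern2(int input, IntStream &a) {
    int b = 0;
    for (int i = 0; i < N; i++) a.write(input);  // function_a
    for (int i = 0; i < N; i++) b += a.read();   // function_b
    return b;
}
```

Calling pattern1(3, s) and pattern2(3, s) on fresh IntStream objects returns the same sum. In this sequential model the first pattern never holds more than one element in the stream, while the second holds all 100 before the consumer starts; how that maps to FIFO sizing in a real dataflow region depends on the tool and is not modeled here.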

 


Accepted Solutions
Explorer
Registered: ‎05-23-2017

Re: hls::stream using problem

I implemented a design using both of these strategies and found that the second one is much better.

I am not sure of the reason.

7 Replies
Teacher xilinxacct
Registered: ‎10-23-2018

Re: hls::stream using problem

@mathmaxsean

The pseudocode doesn't have quite enough detail (or isn't quite what I expected) to answer definitively.

I am assuming function_a and function_b share the same stream (producer and consumer).

If the stream were not shared, the first pattern would allow DATAFLOW to get parallelism in the loop. The second pattern would not, once you add a return, if statements, etc.

The first pattern might also be smaller, as it has fewer control structures.

Hope that helps

If so, please mark as solution accepted. Kudos also welcomed. :-)

Scholar u4223374
Registered: ‎04-26-2015

Re: hls::stream using problem

@mathmaxsean I prefer the second approach, because each function call has some overhead (unless the functions are inlined), so it makes sense to have each function do a lot of work. This pushes the overhead down to being a very small proportion of total time.

 

However, your two examples are fundamentally different. The first one will send every even-numbered stream element (0, 2, 4, 6, 8, 10, etc) to function_a, and every odd-numbered stream element (1, 3, 5, 7, 9, 11 etc) to function_b. The second will send the first 100 elements to function_a and the second 100 elements to function_b.
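
The call-overhead argument above can be put into rough numbers. The sketch below is a made-up cost model, not anything measured from an HLS tool: it assumes each non-inlined call costs a fixed number of cycles and each element costs a fixed amount of work, with no overlap between producer and consumer. The names Cost and estimate and all parameter values are hypothetical.

```cpp
// Hypothetical cost model; all cycle counts are illustrative assumptions.
struct Cost {
    long calls;            // total calls to function_a plus function_b
    long overhead_cycles;  // cycles spent on call overhead alone
    long total_cycles;     // overhead plus per-element work
};

// calls_per_function: how many times each of the two functions is called.
// elements: how many stream elements each function handles in total.
Cost estimate(long calls_per_function, long elements,
              long call_overhead, long work_per_element) {
    long calls = 2 * calls_per_function;               // producer + consumer
    long overhead = calls * call_overhead;
    long work = 2 * elements * work_per_element;       // write side + read side
    return {calls, overhead, overhead + work};
}
```

With an assumed overhead of 3 cycles per call and 1 cycle of work per element, estimate(100, 100, 3, 1) models the first pattern (200 calls, 600 overhead cycles out of 800) and estimate(1, 100, 3, 1) models the second (2 calls, 6 overhead cycles out of 206). If the tool inlines the functions, this overhead term largely disappears and the model no longer applies.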

Explorer
Registered: ‎05-23-2017

Re: hls::stream using problem

Sorry for making the code unclear.

I have added more details.

Teacher xilinxacct
Registered: ‎10-23-2018

Re: hls::stream using problem

@mathmaxsean

Because of the data dependency, the timing will be the same, but the first one will use a couple fewer LUTs. For the most part, as written, the two are nearly equal.

Hope that helps

If so, please mark as solution accepted. Kudos also welcomed. :-)

Teacher xilinxacct
Registered: ‎10-23-2018

Re: hls::stream using problem

@mathmaxsean

That's odd... I did too and found the opposite... Maybe I made some different assumptions in filling in the pseudo code. Can you share your code? I am really curious. Thanks

Teacher xilinxacct
Registered: ‎10-23-2018

Re: hls::stream using problem

@mathmaxsean

Thanks for the brief glimpse of your code (the post is now deleted). The actual code you posted had a very different work profile, so the parallelism of the result is very different, and placing the loop in a different location would indeed prove better there. That should answer the question about the reason.

Glad you found the right balance for that code.
