mbence76
Explorer
265 Views
Registered: ‎01-18-2019

Stream write()/read() instruction order: a pitfall worth knowing


Dear All,

let me show you a pitfall worth knowing about:

Say your IP core connects to an SPI master through an input stream i2m and an output stream m2i.
Whenever you send out a word, you also receive one.

This code works perfectly fine in C simulation, but might not work in RTL:

 

for (i=0; i<10; i++) {
  m2i.write(i);
  i2m.read(); // drop word
}

 


This is because the HLS synthesizer may swap the order of the write and read operations, or schedule them in parallel.
This is not a bug: the tool simply does not know that the two operations are related.
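
To make the ordering requirement concrete, here is a minimal plain C++ sketch (not HLS code). `SpiMasterModel` is a hypothetical stand-in for the SPI master described above, assuming it produces exactly one RX word for every TX word it is given. C simulation always runs the statements in source order, so it only ever exercises the first function; a reordered schedule corresponds to the second one.

```cpp
#include <cassert>
#include <queue>

// Hypothetical software model of the SPI master (not Xilinx code):
// an RX word becomes available only after a TX word was written.
struct SpiMasterModel {
    std::queue<short> i2m;  // RX words waiting to be read by the IP core

    // Models m2i.write(): shifting a word out shifts one word in.
    // The RX value is a simple loopback, chosen only for illustration.
    void write_tx(short word) { i2m.push(word); }

    // Models a read attempt; returns false if nothing was received yet.
    bool try_read_rx(short &out) {
        if (i2m.empty()) return false;
        out = i2m.front();
        i2m.pop();
        return true;
    }
};

// Source order: write first, then read. Returns the number of successful reads.
int run_write_then_read(int n) {
    SpiMasterModel spi;
    int ok = 0;
    short rx = 0;
    for (int i = 0; i < n; i++) {
        spi.write_tx(static_cast<short>(i));
        if (spi.try_read_rx(rx)) ok++;
    }
    return ok;
}

// Swapped order: the read is attempted before the write of the same iteration.
int run_read_then_write(int n) {
    SpiMasterModel spi;
    int ok = 0;
    short rx = 0;
    for (int i = 0; i < n; i++) {
        if (spi.try_read_rx(rx)) ok++;  // first attempt finds nothing
        spi.write_tx(static_cast<short>(i));
    }
    return ok;
}
```

In this model the swapped order simply misses one word. With a blocking read in real hardware, the first read would presumably stall instead, since no TX word has been sent yet.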


If you write

 

	for (i=0; i<10; i++) m2i.write(i);
	for (i=0; i<10; i++) i2m.read();   // drop words

 

this does solve the problem, although possibly by mere luck.

Is there a way to state explicitly that one instruction must be executed before the other?

Thank you

Miklos

1 Solution

Accepted Solutions
dsakjl
Explorer
237 Views
Registered: ‎07-20-2018

Hi @mbence76 ,

try to use "pragma HLS dependence".

Regards.


6 Replies
mbence76
Explorer
133 Views
Registered: ‎01-18-2019

Hi @dsakjl ,

I accepted your solution, but it turned out that my code had started to work for some other reason.

The HLS dependence pragma is a great thing, but it does not seem able to prescribe the order of two seemingly independent operations, such as a write to one stream and a read from another.

Here is a simplified snippet of my code.

Could you please give a concrete pragma line showing how to enforce the dependence?

My SPI master must have a TX word in m2i first in order to produce an RX word in i2m.

Thank you very much.

void my_main(hls::stream<short> &i2m,
             hls::stream<short> &m2i)
{
#pragma HLS TOP
#pragma HLS INTERFACE ap_fifo port=i2m
#pragma HLS INTERFACE ap_fifo port=m2i

    for (int i = 0; i < 10; i++) {
        m2i.write(i);   // send one TX word to the SPI master
        i2m.read();     // drop the RX word that comes back
    }

} // my_main

 

dsakjl
Explorer
112 Views
Registered: ‎07-20-2018

Hi @mbence76 ,

I suggest trying the undocumented function ap_wait_n(), which "sleeps" for a given number of cycles.

You can find more on its usage in this thread: https://forums.xilinx.com/t5/High-Level-Synthesis-HLS/Vitis-Vivado-HLS-BUG-in-pulse-generation/m-p/1171461

Note that you need to include the header file: ap_utils.h
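
A minimal sketch of what this could look like in the top function posted above. The `#if` block and the stand-in `hls::stream` and `ap_wait_n` definitions are my own additions so that the sketch also builds with a plain C++ compiler; they are not the Xilinx implementations. `run_wait_demo` is a hypothetical test helper. Whether the wait actually constrains the schedule in RTL is not established here.

```cpp
#include <cassert>
#include <queue>

#if defined(__has_include)
#  if __has_include(<hls_stream.h>) && __has_include(<ap_utils.h>)
#    include <hls_stream.h>
#    include <ap_utils.h>   // provides ap_wait_n()
#    define HAVE_XILINX_HLS 1
#  endif
#endif

#ifndef HAVE_XILINX_HLS
// Stand-ins for a plain C++ build only (assumption, not Xilinx code).
namespace hls {
template <typename T>
class stream {
    std::queue<T> q_;
public:
    void write(const T &v) { q_.push(v); }
    T read() { T v = q_.front(); q_.pop(); return v; }
    bool empty() const { return q_.empty(); }
};
}  // namespace hls
inline void ap_wait_n(int) {}  // no-op in software
#endif

// Top function from the thread, with the suggested delay inserted.
void my_main(hls::stream<short> &i2m, hls::stream<short> &m2i) {
#pragma HLS INTERFACE ap_fifo port=i2m
#pragma HLS INTERFACE ap_fifo port=m2i
    for (int i = 0; i < 10; i++) {
        m2i.write(static_cast<short>(i));  // TX word first
        ap_wait_n(1);                      // intended: one clock of separation
        i2m.read();                        // then drop the RX word
    }
}

// Hypothetical C-simulation helper: preloads the RX stream, runs the top
// function, and returns how many TX words were produced (-1 on leftovers).
int run_wait_demo() {
    hls::stream<short> i2m, m2i;
    for (int i = 0; i < 10; i++) i2m.write(static_cast<short>(100 + i));
    my_main(i2m, m2i);
    if (!i2m.empty()) return -1;
    int count = 0;
    while (!m2i.empty()) { m2i.read(); count++; }
    return count;
}
```

Note that C simulation cannot tell the difference: it passes with or without the wait, so the effect has to be checked in the schedule viewer or in RTL co-simulation.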

I don't know whether this will help. Send me your test bench if you can, so I can test it myself.

Regards.

bchebrol
Xilinx Employee
71 Views
Registered: ‎06-04-2018

Hi @mbence76 ,

You can use the following dependence_inter design for reference:

https://github.com/Xilinx/Vitis_Accel_Examples/tree/master/cpp_kernels/dependence_inter
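
For readers looking for the general shape of such a pragma line, here is a minimal sketch of an inter-iteration dependence pragma on an array. It is not taken from the linked example; `shift_copy`, `demo_upper_sum` and `N` are made-up names. The pragma is written in the older positional form (`inter false`); newer Vitis HLS releases also document a `type=inter dependent=false` spelling.

```cpp
#include <cassert>

const int N = 8;

// Reads touch buf[0..N-1] and writes touch buf[N..2N-1], so no iteration
// uses a value written by an earlier iteration. The pragma states that
// fact so the tool does not assume a loop-carried dependence on buf.
void shift_copy(int buf[2 * N]) {
    for (int i = 0; i < N; i++) {
#pragma HLS PIPELINE II=1
#pragma HLS dependence variable=buf inter false
        buf[i + N] = buf[i] + 1;
    }
}

// Hypothetical helper: fills the lower half with 0..N-1, runs the loop,
// and returns the sum of the upper half.
int demo_upper_sum() {
    int buf[2 * N] = {0};
    for (int i = 0; i < N; i++) buf[i] = i;
    shift_copy(buf);
    int sum = 0;
    for (int i = N; i < 2 * N; i++) sum += buf[i];
    return sum;
}
```

The pragma takes a `variable=` argument and describes memory dependences on that one variable. It can relax or assert a dependence on an array, but it offers no obvious way to relate a write on one stream to a read on another, which is the situation in this thread.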

Thanks,

Vishnu

mbence76
Explorer
29 Views
Registered: ‎01-18-2019

Hi @dsakjl ,

I inserted an ap_wait_n(1); after m2i.write(i); and before i2m.read();

It worked in some places but not in others. What really raised my eyebrows was finding out that I had forgotten to include ap_utils.h. How come the compiler did not complain?

And even if I managed to add one clock cycle, it does not seem to be the proper way to enforce operation order. Honestly, I am becoming frustrated.

I have now spent days trying to find a reliable way to make sure one operation comes before another. Is it really too much to ask from a compiler?

It is perfectly clear to me what it means to compile into hardware, and what parallelization, pipelining and dataflow are for. I have tried turning everything off, all in vain.

Why is there no option to execute operations one after another, one by one, without trying to optimize anything? I do not want to use a soft CPU for such a simple task.

I even tried this, with no success:

while (!wrdone) {
	wrdone = m2i.write_nb(i);   // this is non-blocking!
	if (wrdone) i2m.read();
	}
wrdone = 0;
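
One mechanism that might be relevant to the question of strictly sequential execution, although nobody in this thread has confirmed it, is the protocol region described in the Vivado HLS user guide: a labelled block marked with `#pragma HLS protocol`, in which clock boundaries are placed explicitly with `ap_wait()`. The sketch below is untested in synthesis, and whether a protocol region behaves as hoped with `hls::stream` ports is an assumption. The stand-in definitions in the `#ifndef` block, and the names `spi_exchange` and `run_protocol_demo`, are mine and exist only so the sketch builds with a plain C++ compiler.

```cpp
#include <cassert>
#include <queue>

#if defined(__has_include)
#  if __has_include(<hls_stream.h>) && __has_include(<ap_utils.h>)
#    include <hls_stream.h>
#    include <ap_utils.h>   // provides ap_wait()
#    define HAVE_XILINX_HLS 1
#  endif
#endif

#ifndef HAVE_XILINX_HLS
// Stand-ins for a plain C++ build only (assumption, not Xilinx code).
namespace hls {
template <typename T>
class stream {
    std::queue<T> q_;
public:
    void write(const T &v) { q_.push(v); }
    T read() { T v = q_.front(); q_.pop(); return v; }
    bool empty() const { return q_.empty(); }
};
}  // namespace hls
inline void ap_wait() {}  // no-op in software
#endif

void spi_exchange(hls::stream<short> &i2m, hls::stream<short> &m2i) {
#pragma HLS INTERFACE ap_fifo port=i2m
#pragma HLS INTERFACE ap_fifo port=m2i
    for (int i = 0; i < 10; i++) {
        io_section: {
#pragma HLS protocol fixed
            m2i.write(static_cast<short>(i));  // TX word goes out first
            ap_wait();                         // explicit clock boundary
            i2m.read();                        // RX word is dropped afterwards
        }
    }
}

// Hypothetical C-simulation helper: returns how many TX words came out
// in the expected order 0, 1, 2, ...
int run_protocol_demo() {
    hls::stream<short> i2m, m2i;
    for (int i = 0; i < 10; i++) i2m.write(static_cast<short>(100 + i));
    spi_exchange(i2m, m2i);
    int in_order = 0;
    while (!m2i.empty()) {
        if (m2i.read() == in_order) in_order++;
    }
    return in_order;
}
```

As with any scheduling directive, C simulation gives the same result with or without the pragma, so the only meaningful check is the synthesized schedule or RTL co-simulation.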

Miklos

 

 

 

mbence76
Explorer
25 Views
Registered: ‎01-18-2019

Dear @bchebrol

please see my post of 02-25-2021 02:51 AM above.

Could you please give a concrete pragma line showing how to enforce the dependence?

Apparently I am not smart enough to work this out.

Thank you.
