Observer witxilinx
Observer
524 Views
Registered: ‎12-26-2018

C/RTL Co-Simulation freezes (Not support hls::stream?)

Dear Developers,

According to the ug902 document at page 220:

C/RTL Co-Simulation Support
The Vivado HLS C/RTL cosimulation-feature does not support structures or classes containing
hls::stream<> members in the top-level interface. Vivado HLS supports these structures or
classes for synthesis

So it seems that the `hls::stream` class cannot be used at all if we want to run C/RTL co-simulation. I tried it on my design, whose functions are declared as follows:

void load_module(hls::stream<vec_T> &load_queue, ...) {
#pragma HLS INTERFACE axis port = load_queue
  ...
}

void compute_module(hls::stream<vec_T> &compute_queue, ...) {
#pragma HLS INTERFACE axis port = compute_queue
...
}

void store_module(hls::stream<vec_T> &store_queue, ...) {
#pragma HLS INTERFACE axis port = store_queue
...
}

void top(volatile vec_T * inst_buffer, ...) {
  hls::stream<vec_T> load_queue;
  hls::stream<vec_T> compute_queue;
  hls::stream<vec_T> store_queue;
#pragma HLS stream variable = load_queue depth = 10 dim = 1
#pragma HLS stream variable = compute_queue depth = 10 dim = 1
#pragma HLS stream variable = store_queue depth = 10 dim = 1
  ...
  // Load instructions into load/compute/store queue
  ...
  // Main loop
  while (true) {
    // By checking queue.empty() and other complicated logic
    // |
    // v
    while (load_module is not waiting for other module) {
      load_module(load_queue, ...);
      ...
    }
    while (compute_module is not waiting for other module) {
      compute_module(compute_queue, ...);
      ...
    }
    while (store_module is not waiting on other module) {
      store_module(store_queue, ...);
      ...
    }
    ...
    // Break after executing all instructions
  }
}

And the C/RTL co-simulation made no progress after an hour, something like this:

// RTL Simulation : "Inter-Transaction Progress" ["Intra-TransactionProgress"] @ "Simulation Time"
///////////////////////////////////////////////////////////////////////////////////
// RTL Simulation : 0 / 1 [0.00%] @ "110000"
// RTL Simulation : 0 / 1 [0.00%] @ "202000"
// RTL Simulation : 0 / 1 [0.00%] @ "404000"

Question

So if the hls::stream class cannot be used in the top-level interface, what alternatives are there for a blocking FIFO queue that can be passed through the top-level interface and still works with C/RTL co-simulation?

 

8 Replies
Voyager
Voyager
510 Views
Registered: ‎03-28-2016

Re: hls::stream C/RTL Co-Simulation Support

I don't have an answer for your question, but I have seen situations where the C/RTL Co-Simulation does not work when the "RTL Selection" is set to "Verilog", but does work when set to "VHDL". If you haven't done so already, try running the C/RTL Co-Simulation once with Verilog and then again with VHDL.

And of course, make sure your C simulation is working properly. Also check the log output of the C Synthesis for anything that might cause an issue. Sometimes you will see warnings that a particular section of the code could cause a C/RTL Co-Simulation to lock up.

Ted Booth - Tech. Lead FPGA Design Engineer
www.designlinxhs.com
Scholar u4223374
Scholar
504 Views
Registered: ‎04-26-2015

Re: hls::stream C/RTL Co-Simulation Support

I think that is poorly worded. You can definitely use streams in cosimulation; I've written 20+ modules which have all used AXI Streams and have all (eventually) passed cosimulation. What you can't do is pass a class/structure to the top function, where that class/structure contains a stream. You're not doing this, so it should work.

 

Can you post your code, or at least your testbench code? That may assist with troubleshooting your problem. Also run synthesis and find the expected run-time.

 

At a guess, I've seen a fair few people try to write a HLS testbench like HDL: create a stream, pass that to the top function, then fill the stream with data for the top function to process. This doesn't work in HLS because the testbench is single-threaded. If you pass the stream to the top function, HLS is going to wait until the top function returns before it moves on to the next step (loading data into the stream). Instead you need to create the stream, load all the data required for the simulation run into it, and then call the top function.
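To make the ordering concrete, here is a minimal sketch of a testbench that loads the stream before calling the top function. Everything in it is an assumption for illustration: `stream_sketch` is a stand-in so the code compiles without Vivado HLS (a real project would include `<hls_stream.h>` and use `hls::stream<T>`), and `top_sketch` and `run_testbench` are hypothetical names, not the poster's design.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Stand-in for hls::stream<T> (assumption: only write/read/empty are needed).
template <typename T>
class stream_sketch {
public:
    void write(const T &v) { q_.push(v); }
    T read() {
        // Reading an empty stream is a testbench bug here: nothing else can
        // refill the stream while the top function is running.
        assert(!q_.empty());
        T v = q_.front();
        q_.pop();
        return v;
    }
    bool empty() const { return q_.empty(); }
private:
    std::queue<T> q_;
};

// Hypothetical top function: reads n values, doubles them, writes them out.
void top_sketch(stream_sketch<int> &in, stream_sketch<int> &out, int n) {
    for (int i = 0; i < n; ++i) {
        out.write(2 * in.read());
    }
}

// Testbench body: load everything, call the top function once, then drain.
std::vector<int> run_testbench(int n) {
    stream_sketch<int> in, out;
    for (int i = 0; i < n; ++i) in.write(i);  // 1. load all input first
    top_sketch(in, out, n);                   // 2. one blocking call
    std::vector<int> results;
    while (!out.empty()) results.push_back(out.read());  // 3. collect output
    return results;
}
```

In an actual HLS testbench the body of `run_testbench` would sit in `main()`. Swapping steps 1 and 2 would make the first `read()` hit an empty stream, which is the situation described above.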

Xilinx Employee
Xilinx Employee
489 Views
Registered: ‎09-05-2018

Re: hls::stream C/RTL Co-Simulation Support

Hey @witxilinx ,

You can use hls::stream in the top level interface without error. That passage from UG902 is saying you should not do this:

typedef struct {
  hls::stream<vec_T> load_queue;
  hls::stream<vec_T> compute_queue;
  hls::stream<vec_T> store_queue;
} mystruct;

void module(
  mystruct &queues
) {
  ...
}

There is an issue with the project, as you noted from the C/RTL Cosimulation not completing, but your interface is fine.

Nicholas Moellers

Xilinx Worldwide Technical Support
Observer witxilinx
Observer
474 Views
Registered: ‎12-26-2018

Re: hls::stream C/RTL Co-Simulation Support

@tedbooth Thanks for the suggestion, I will try the VHDL option soon. Regarding the C simulation, it works fine, but there are several warnings in the C Synthesis, listed below. The last one, about "#pragma HLS INTERFACE axis", seems the most relevant to me. (I've edited the post to show my declared pragma.) I will look into the details of the `axis` mode.

- Estimated clock period (9.275ns) exceeds the target...
- The II Violation in module '...': Unable to enforce a carried dependence constraint (II = 14, distance = 1, offset = 1)
- Unable to satisfy pipeline directive: Loop contains subloop(s) not being unrolled or flattened.
- Unable to perform loop rewinding: Function '...' (file.cc:183) contains multiple loops.
- ...
- WARNING: [XFORM 203-803] Cannot specify interface mode 'axis' (file.cc:23:1) on argument 'store_queue.V.V' (file.cc:18).
This interface directive will be discarded. Please apply it on an argument of top module.

By the way, could you tell me from your experience roughly how long the C/RTL co-simulation takes to run? If it is relevant, the design file contains ~1500 lines of code.

Observer witxilinx
Observer
461 Views
Registered: ‎12-26-2018

Re: hls::stream C/RTL Co-Simulation Support

@u4223374 I'm not sure if I can post all the code here, but I will try to post everything necessary to make it helpful for this forum in case someone comes across the same issue.

The following is my code simplified. Please let me know if there is any part unclear.

void load_module(hls::stream<vec_T> &load_queue, ...) {
#pragma HLS INTERFACE axis port = load_queue
  ...
}

void compute_module(hls::stream<vec_T> &compute_queue, ...) {
#pragma HLS INTERFACE axis port = compute_queue
...
}

void store_module(hls::stream<vec_T> &store_queue, ...) {
#pragma HLS INTERFACE axis port = store_queue
...
}

void top(volatile vec_T * inst_buffer, ...) {
  hls::stream<vec_T> load_queue;
  hls::stream<vec_T> compute_queue;
  hls::stream<vec_T> store_queue;
#pragma HLS stream variable = load_queue depth = 10 dim = 1
#pragma HLS stream variable = compute_queue depth = 10 dim = 1
#pragma HLS stream variable = store_queue depth = 10 dim = 1
  ...
  // Load instructions into load/compute/store queue
  ...
  // Main loop
  while (true) {
    // By checking queue.empty() and other complicated logic
    // |
    // v
    while (load_module is not waiting for other module) {
      load_module(load_queue, ...);
      ...
    }
    while (compute_module is not waiting for other module) {
      compute_module(compute_queue, ...);
      ...
    }
    while (store_module is not waiting on other module) {
      store_module(store_queue, ...);
      ...
    }
    ...
    // Break after executing all instructions
  }
}

// Test bench
int main(void) {
  // Prepare instruction buffer -------------------------
  vec_T *inst_buffer = malloc(...);
  // Fill data ------------------------------------------
  ...
  // Invoke the top module ------------------------------
  top((volatile vec_T *) inst_buffer,
      (volatile vec_T *) ...,
      (volatile vec_T *) ...,
      ...);
  // Check the output -----------------------------------
  ...
}

 

Scholar u4223374
Scholar
426 Views
Registered: ‎04-26-2015

Re: hls::stream C/RTL Co-Simulation Support

@witxilinx The first thing I notice is that you're specifying interface types for all your sub-functions. There is no need to do this, and I expect this is what HLS is complaining about.

 

You need to specify interface types on your top-level module so that HLS knows what you're going to connect to it. For all the sub-functions, HLS will figure out appropriate interfacing (since it knows what's happening at both ends) with no pragmas required.
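As a rough sketch of that layout, the example below puts the `INTERFACE` pragmas only in the top function and leaves the sub-function bare. The names `stream_sketch`, `add_one_sketch` and `top_sketch` are made up for illustration, and `stream_sketch` is only a stand-in so the code builds with an ordinary C++ compiler (which ignores the unknown `#pragma HLS` lines); in Vivado HLS you would use `hls::stream<T>` from `<hls_stream.h>`.

```cpp
#include <cassert>
#include <queue>

// Stand-in for hls::stream<T>, for illustration only.
template <typename T>
class stream_sketch {
public:
    void write(const T &v) { q_.push(v); }
    T read() {
        assert(!q_.empty());  // the caller must have loaded data beforehand
        T v = q_.front();
        q_.pop();
        return v;
    }
    bool empty() const { return q_.empty(); }
private:
    std::queue<T> q_;
};

// Sub-function: no INTERFACE pragma. The tool sees both ends of these
// arguments and picks the internal connection itself.
void add_one_sketch(stream_sketch<int> &in, stream_sketch<int> &out, int n) {
    for (int i = 0; i < n; ++i) {
        out.write(in.read() + 1);
    }
}

// Top function: the only place where interface modes are declared.
void top_sketch(stream_sketch<int> &in, stream_sketch<int> &out, int n) {
#pragma HLS INTERFACE axis port = in
#pragma HLS INTERFACE axis port = out
    add_one_sketch(in, out, n);
}
```

Note that in the posted design the streams are local to `top`, so there is no top-level stream argument for an `axis` pragma to attach to, which would be consistent with the XFORM 203-803 warning.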

 

After that:

- "volatile" worries me. Keep in mind that a HLS C simulation must not rely on "volatile" to work, because it's single-threaded (and therefore "volatile" has no effect). If your function is only expected to work if another task is modifying that memory in parallel, that's where your problem lies.

- Is it possible that your queues are filling up and causing a deadlock? Maybe try just setting the size to 10,000 or so.
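
A small sketch of why the depth matters: as far as I know, `hls::stream` in C simulation behaves like an unbounded queue, while the synthesized FIFO holds only `depth` entries, so a producer that writes more than `depth` items before any consumer runs passes C simulation but would stall in RTL. The class `bounded_fifo_sketch` and the function `count_stalled_writes` below are hypothetical helpers written for this illustration, not part of any Xilinx library.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical depth-limited FIFO, used only to illustrate the stall.
template <typename T>
class bounded_fifo_sketch {
public:
    explicit bounded_fifo_sketch(std::size_t depth) : depth_(depth) {}

    // Returns false where a hardware FIFO of this depth would block the writer.
    bool try_write(const T &v) {
        if (q_.size() >= depth_) return false;
        q_.push(v);
        return true;
    }

    std::size_t size() const { return q_.size(); }

private:
    std::size_t depth_;
    std::queue<T> q_;
};

// Counts how many of n_writes back-to-back writes would stall when no
// consumer runs in between (producer and consumer called one after another).
std::size_t count_stalled_writes(std::size_t depth, std::size_t n_writes) {
    bounded_fifo_sketch<int> fifo(depth);
    std::size_t stalled = 0;
    for (std::size_t i = 0; i < n_writes; ++i) {
        if (!fifo.try_write(static_cast<int>(i))) ++stalled;
    }
    return stalled;
}
```

With a depth of 10 and 16 queued instructions, `count_stalled_writes(10, 16)` reports 6 writes that would block; with a depth of 10,000 it reports none, which is the point of temporarily raising the depth as a test.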

 

 

I've never actually used streams inside a block (only on the interface), and I've also never used "empty()" on a stream - my blocks all just wait until appropriate data does arrive.

Observer witxilinx
Observer
308 Views
Registered: ‎12-26-2018

Re: hls::stream C/RTL Co-Simulation Support

In case anyone has the same issue, I found this debug guide, although I'm still not able to get the C/RTL co-simulation up and running.

Observer witxilinx
Observer
289 Views
Registered: ‎12-26-2018

Re: hls::stream C/RTL Co-Simulation Support

@u4223374 I used the volatile keyword because those ports connect to DRAM through an AXI Master interface, where the data changes during execution. And I don't think it's a deadlock, because I can generate the bitstream and run it on the FPGA board. What I really want is to check my design at the clock-cycle level, which is why I would like to run the C/RTL co-simulation.

Recently, I have found this in the ug902 document at page 179:

In order for automatic verification to be performed, arrays on the function interface, or
array inside structs on the function interface, can use any of the following optimizations,
but not two or more:
• Vertical mapping on arrays of the same size.
• Reshape.
• Partition.
• Data Pack on structs.

Am I correct that C/RTL co-simulation won't work in the following case? One of my arrays is declared with two of these pragmas:

void sub_module(ap_uint<...> array[...][...])
{
#pragma HLS INTERFACE bram port = out_mem
#pragma HLS array_reshape variable = array complete dim = 2
#pragma HLS array_partition variable = array cyclic factor = 2 dim = 1
...
}

 
