e_ensafi
Explorer
Registered: ‎08-13-2020

AIE x86simulator crashes when using streams and windows

I have written an application that uses GMIO in the top-level simulation::platform, with an in-graph PL kernel that replicates and moves data between AIE kernels. The PL kernel uses HLS streams, and the AIE kernels use windows. An earlier version of the application, which ran on AIE tiles only and used either streams or windows exclusively, ran flawlessly in x86simulator. Introducing a stream-based PL kernel between the window-based AIE kernels appears to have exposed an x86simulator bug related to CDNOx86Sim::Window2StreamsAdapter. This is how the simulation crashes:

Thread 9 "sim.out" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x155553158700 (LWP 1736844)]
0x00001555551a952a in CDNOx86Sim::Window2StreamsAdapter::execute() ()
   from /tools/Xilinx/Vitis/2020.2/aietools/lib/lnx64.o/libCDNOx86Sim.so
(gdb) where
#0  0x00001555551a952a in CDNOx86Sim::Window2StreamsAdapter::execute() ()
   from /tools/Xilinx/Vitis/2020.2/aietools/lib/lnx64.o/libCDNOx86Sim.so
#1  0x0000155554affcbf in std::execute_native_thread_routine (__p=0x6e4b80)
    at ../../../../../../src/lnx64/libstdc++-v3/src/c++11/thread.cc:83
#2  0x00001555554eb609 in start_thread (arg=<optimized out>)
    at pthread_create.c:477
#3  0x0000155554610293 in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) quit

Before posting any code, I need to recreate the issue in a non-proprietary test application.  However, there is always a chance that the issue will not manifest itself in a simplified system.

10 Replies
florentw
Moderator
Registered: ‎11-09-2015

Hi @e_ensafi 

Unfortunately, x86simulator is not yet as mature as aiesimulator. This should improve in 2021.1.

If you can send a test case, that would be great, so we can make sure this is fixed in 2021.1.

I will also try to reproduce on my side.


Florent
Product Application Engineer - Xilinx Technical Support EMEA
**~ Don't forget to reply, give kudos, and accept as solution.~**
e_ensafi
Explorer
Registered: ‎08-13-2020

@florentw I tried but was unable to reproduce the error in a simplified application. One thing that is unclear is whether window sizes need to be adjusted to a multiple of the stream width. That is, if an HLS stream reads or writes 32-, 64- or 128-bit wide data, does the sending or receiving window of the connected AIE kernel need to be a multiple of 4, 8 or 16 bytes, respectively, or does the compiler handle this automatically? For example, suppose a 128-bit PL data mover feeds an AIE kernel whose window is smaller and not a multiple of 128 bits:

 

// Internally, pl_kernel moves 64 ap_int<128> values = 1024 bytes
void pl_kernel(const ap_int<128>* mem, dir::out<hls::stream<ap_axis<128, 0, 0, 0>>&> s);

// The window expects 510 int16 values = 1020 bytes
connect<stream, window<510 * sizeof(int16)>>(pl_kernel, aie_kernel);

 

In my implementation, I added padding with additional window_read/write calls to consume or produce the extra bytes, just to be on the safe side, but I don't know whether this was necessary. The compiler sees that a 128-bit HLS stream is connected to a 1020-byte window, but it cannot know how much data will flow from the stream to the window. My assumption was that the window-based AIE kernel has to handle any extra data so that the stream-based PL kernel does not stall while trying to push the last few bytes into the window. Therefore, I must process 510 int16 values and discard 2 int16 values per invocation, and my window size needs to be 512 * sizeof(int16) instead of 510 * sizeof(int16). Is this correct?
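The padding arithmetic described above can be sketched in plain C++, independent of the ADF headers. This is only an illustration of the rounding, not a statement of what the toolchain requires: the helper names round_up and padding_elements and the constants are hypothetical, and the 16-byte stream beat is taken from the 128-bit example in this post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round `bytes` up to the next multiple of `align` (align must be non-zero).
constexpr std::size_t round_up(std::size_t bytes, std::size_t align) {
    return (bytes + align - 1) / align * align;
}

// Assumed values from the example: a 128-bit stream and 510 int16 samples.
constexpr std::size_t kStreamBytes  = 128 / 8;                               // 16 bytes per beat
constexpr std::size_t kPayloadBytes = 510 * sizeof(std::int16_t);            // 1020 bytes
constexpr std::size_t kWindowBytes  = round_up(kPayloadBytes, kStreamBytes); // 1024 bytes

static_assert(kWindowBytes == 1024, "padded window covers whole stream beats");
static_assert(kWindowBytes / sizeof(std::int16_t) == 512, "512 int16 slots");

// Number of trailing elements the kernel would have to read and discard.
inline std::size_t padding_elements(std::size_t payload_bytes,
                                    std::size_t stream_bytes,
                                    std::size_t elem_bytes) {
    assert(elem_bytes != 0 && stream_bytes % elem_bytes == 0);
    return (round_up(payload_bytes, stream_bytes) - payload_bytes) / elem_bytes;
}
```

Under these assumptions, padding_elements(1020, 16, 2) yields the 2 discarded int16 values mentioned above, and kWindowBytes is the size that would go into the window<...> template argument.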

florentw
Moderator
Registered: ‎11-09-2015

Hi @e_ensafi

I do not know how this is handled, and I am not sure the compiler takes care of it. Let me check internally.


Florent
florentw
Moderator
Registered: ‎11-09-2015

Hi @e_ensafi

I confirmed that the compiler does not adapt the data to fit the window.

If you send too much data, the extra data should be written to the pong buffer.


Florent
e_ensafi
Explorer
Registered: ‎08-13-2020

@florentw I now have a test case that can be shared via SR.  You will receive a separate message shortly.

florentw
Moderator
Registered: ‎11-09-2015

Hi @e_ensafi

You do not need to create an SR. We can do the debugging through the forums (or at least share the updates here) so that the next user facing this issue can see the outcome. When your test case is ready, just let me know and I will send you an EZmove link so you can upload your project privately.

Regards


Florent
e_ensafi
Explorer
Registered: ‎08-13-2020

@florentw Sorry, I should have explained a bit more. The SR was already open for another issue, and the test case demonstrates both issues. In any case, you're welcome to post the outcome here for the x86 crash.

florentw
Moderator
Registered: ‎11-09-2015

Hi @e_ensafi

I can reproduce a segmentation fault with the test case you sent, and I have reported this to the development team.


Florent
e_ensafi
Explorer
Registered: ‎08-13-2020

@florentw While debugging another issue, it became apparent that windows between AIE kernels must be at least 32 bytes. This requirement was stated in the 2020.1 version of UG1076 but is omitted from the Vitis 2020.2 version dated 11/24/2020. Indeed, my example contained windows smaller than 32 bytes, and after I increased the window size, x86simulator stopped crashing. Of course, crashing is not a good way to handle this. I think the compiler should at the very least refuse to compile the ADF graph if a window is too small or its size is not a multiple of 16 bytes, which is another requirement omitted from the latest UG1076.
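Until the tools reject such graphs, a guard along these lines could catch an undersized or misaligned window in user code. This is a minimal sketch: the 32-byte minimum and the 16-byte multiple are the limits as I read them in the 2020.1 UG1076, and is_valid_window_size, require_valid_window_size and checked_window_size are hypothetical helpers, not part of the ADF API.

```cpp
#include <cassert>
#include <cstddef>

// Limits as described in this thread (assumed from the 2020.1 UG1076).
constexpr std::size_t kMinWindowBytes   = 32;
constexpr std::size_t kWindowAlignBytes = 16;

// True if `bytes` meets both the minimum size and the 16-byte multiple rule.
constexpr bool is_valid_window_size(std::size_t bytes) {
    return bytes >= kMinWindowBytes && bytes % kWindowAlignBytes == 0;
}

// Run-time variant for sizes that are only known while the host code runs.
inline std::size_t require_valid_window_size(std::size_t bytes) {
    assert(is_valid_window_size(bytes));
    return bytes;
}

// Compile-time variant: instantiating it with a bad size stops the build.
template <std::size_t Bytes>
struct checked_window_size {
    static_assert(is_valid_window_size(Bytes),
                  "window must be at least 32 bytes and a multiple of 16 bytes");
    static constexpr std::size_t value = Bytes;
};
```

In a graph, the intended use would be to write checked_window_size<N>::value wherever a window size is given, so that a size such as 16 or 1020 fails at compile time instead of crashing the simulator.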

florentw
Moderator
Registered: ‎11-09-2015

Hi @e_ensafi

Thanks for the update.

Yes, I did not investigate the test case in much depth, but this should not fail with a segmentation fault.

I had an internal discussion. I do not think this information should have been removed from UG1076. We will work on adding it back to the 2020.2 version, along with any other limitations.


Florent