01-20-2021 07:58 AM
Is there some sort of buffering of printf() output from AIE kernels in aiesimulator? The order of my printf() outputs seems to suggest that some kernels have returned before the kernels feeding them data have returned. This should be impossible, because a consuming kernel cannot return until it has processed all of the output from its producing kernel. Is the output being buffered and then emitted either randomly or under specific conditions?
01-20-2021 08:17 AM - edited 01-20-2021 08:17 AM
Hi @e_ensafi
There is no specific condition. The printf output might appear in a random order.
Printf should be used only for functional verification, not for event verification.
01-21-2021 10:55 AM
Hi @e_ensafi
You can use a trick to identify which kernel is producing the printf if you add a template parameter to the function call. This way you can assign an integer ID for each kernel.
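The template-parameter trick can be sketched in plain C++ as follows. This is only a host-compilable illustration, not actual AIE kernel code: `kernel_tag` and `passthrough_kernel` are hypothetical names, and a real kernel would take window or stream arguments rather than a single `int`.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds a log prefix that carries the kernel's
// compile-time integer ID, so every printf line names its source.
template <int ID>
std::string kernel_tag(const char* stage) {
    char buf[64];
    std::snprintf(buf, sizeof(buf), "[kernel %d] %s", ID, stage);
    return std::string(buf);
}

// Hypothetical stand-in for a kernel body. Each instantiation
// (passthrough_kernel<0>, passthrough_kernel<1>, ...) prints its own ID.
template <int ID>
int passthrough_kernel(int sample) {
    std::printf("%s\n", kernel_tag<ID>("enter").c_str());
    int result = sample + 1;  // placeholder for the real processing
    std::printf("%s\n", kernel_tag<ID>("return").c_str());
    return result;
}
```

Instantiating the same source with different IDs, for example `passthrough_kernel<0>` and `passthrough_kernel<1>`, gives each instance a distinct prefix. Note that this identifies which kernel printed a line; it does not by itself say anything about the order in which the simulator emits the lines.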
If you are using different window sizes for your kernels, you can call the get_window_size() function on either your input or output window and pass the result as a printf argument.
A third option (not tested) is to pass a dummy RTP that sets a variable used in the printf.
The last resort is to copy the kernel source and hardcode a message for that specific kernel.
01-21-2021 12:08 PM
@derekh My kernels are already identified uniquely using template parameters. The problem is that I am seeing printf output from the kernel at the end of the pipeline (i.e. the one returning output to the PS program), which indicates that it is done executing. This should be impossible until all data from the kernels near the beginning of the pipeline has been consumed, in which case I would expect a printf statement from those kernels as well.
01-22-2021 06:50 AM
Hi @e_ensafi
By the way, are you using aiesimulator or x86simulator?
I guess the behaviour will be even more accentuated in x86 simulation, as it is not cycle-accurate at all.
01-22-2021 08:50 AM - edited 01-22-2021 08:51 AM
@florentw Both, but x86simulator runs to completion in the order expected, whereas aiesimulator deadlocks at some point in the manner described in some of my other posts. In aiesimulator, it gets pretty far, apparently all the way to the end of one iteration because my final AIE kernel does a printf() right before returning, but then it hangs forever and graph::wait() never returns. I am seeing similar behavior on the VCK190 hardware, but I don't know how to find out which kernel is hanging, PL or AIE, as all event tracing examples rely on a successful return from graph::wait(). Is there a way to trace what's happening on the hardware, which kernel is executing, etc?
01-22-2021 08:57 AM
Hi @e_ensafi
Can you create a new topic for the hang itself (and how to check the status in HW), as this is different from the printf question?
If you get the correct printf order in x86simulator but not in aiesimulator, then it might be worth checking. Is there any test case you could share (outside the forums, of course) to reproduce it?
01-22-2021 10:15 AM - edited 01-22-2021 10:19 AM
@florentw I will have to put something together, which may require more time than I have on my current task. However, the hanging issue is directly related to my topic regarding HLS/AIE communication using streams and async windows with ping-pong buffers, fifo_depth, etc. This exact same hanging issue is what I'm trying to resolve in that thread.