Registered: ‎07-23-2019

xil_printf much slower than printf

I noticed something strange. I have a Zynq PS with the APU cores clocked at 1 GHz. In software, I have a loop that calls a number of functions. So far they are mostly empty, so I expect each of them to take microseconds at most.

The loop is synchronized to an interrupt so it starts every 10 ms.

There is a free-running counter for timestamping purposes incremented every ms.

For test purposes, some of the functions in the loop print some data with xil_printf, including the time counter above.

Because those functions are almost empty, I would expect them to run in microseconds. In practice, each one seems to take about 3 ms, essentially all of it spent in the xil_printf call.

I also have a series of printouts made with printf (not xil_printf), and for those the timestamp counter is the same to the millisecond. So it looks like printf is faster than xil_printf.

I know (and not much more than that, tbh) that xil_printf is a lightweight version of printf. It looks like it is also slower. Is this something Xilinx can confirm?

3 Replies
Registered: ‎07-23-2019

Update: after replacing those xil_printf calls with printf, they are just as slow, taking 3 precious ms each.

The difference in what they print is not much: the slow ones have '%s' and print text (8-12 chars) from a char *table[].

Can printing text with %s take milliseconds? At 1 GHz, 1 ms is a million clock cycles. Does it really take three million cycles to do that?

Xilinx Employee
Registered: ‎11-30-2007

UART output is usually pretty slow - like 115200 Baud.

Depending on how much is printed - this will take some time.

UART IPs have buffers, but if the buffer is full, the CPU core has to wait until buffer space is available.

So you might not be measuring the time to process the printf. Because the UART buffers are full, you might be measuring the transmission time.
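
As a rough back-of-the-envelope check of the point above, the sketch below estimates how long the UART needs to shift out a given number of characters. It assumes 115200 baud and 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits per character); the actual settings in the design may differ. The helper name `uart_tx_time_us` is made up for illustration and is not part of any Xilinx library.

```c
#include <stdint.h>

/* Assumption: 8N1 framing, i.e. 1 start + 8 data + 1 stop bit per character. */
#define UART_BITS_PER_CHAR 10u

/* Hypothetical helper: time in microseconds (rounded to nearest) needed to
 * transmit n_chars characters at the given baud rate. */
static uint32_t uart_tx_time_us(uint32_t n_chars, uint32_t baud)
{
    uint64_t bits = (uint64_t)n_chars * UART_BITS_PER_CHAR;
    return (uint32_t)((bits * 1000000u + baud / 2u) / baud);
}
```

Under these assumptions one character takes roughly 87 us on the wire, so a line of about 35 characters already accounts for about 3 ms once the transmit buffer is full, regardless of how fast the CPU formats the string.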

Registered: ‎07-23-2019

 

Yes, I found that they are actually equally slow because both are blocking functions: execution doesn't move on to the next instruction until the Tx buffer is empty. I learnt that the alternatives are either to create a non-blocking print pseudo-function that writes to a buffer which some interrupt processes in spare time, or to use the multicore architecture and have "another core for that chore". I prefer the second, but it will take some time to implement...
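
For the first alternative, a minimal sketch of the buffering part could look like the following. It is only an illustration under stated assumptions: one producer (the main loop) and one consumer (a UART TX interrupt handler or idle task), with all names (`log_write`, `log_pop`, `LOG_BUF_SIZE`) made up here. The code that actually feeds the popped byte to the UART is hardware specific and deliberately left out, and a real system would also need to consider memory barriers between producer and consumer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer size; must be a power of two so the index mask works. */
#define LOG_BUF_SIZE 256u

static volatile uint8_t  log_buf[LOG_BUF_SIZE];
static volatile uint32_t log_head; /* advanced only by the producer */
static volatile uint32_t log_tail; /* advanced only by the consumer */

/* Producer side: queue up to len bytes without ever waiting.
 * Returns the number of bytes accepted; the rest is dropped by the caller. */
static size_t log_write(const char *s, size_t len)
{
    size_t n = 0;
    while (n < len && (uint32_t)(log_head - log_tail) < LOG_BUF_SIZE) {
        log_buf[log_head & (LOG_BUF_SIZE - 1u)] = (uint8_t)s[n++];
        log_head = log_head + 1u;
    }
    return n;
}

/* Consumer side: meant to be called when the UART can take another byte.
 * Returns 1 and stores the byte in *out, or 0 if the buffer is empty. */
static int log_pop(uint8_t *out)
{
    if (log_head == log_tail) {
        return 0;
    }
    *out = log_buf[log_tail & (LOG_BUF_SIZE - 1u)];
    log_tail = log_tail + 1u;
    return 1;
}
```

The design choice here is that `log_write` drops characters when the buffer is full instead of blocking, which keeps the 10 ms loop deterministic at the cost of possibly losing debug output.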
