floriane1
Observer
Registered: ‎05-15-2017

UartLite IP core latencies

Hi, 

I am currently using the UartLite IP core with a Zybo board. When I receive a complete UART message, an interrupt is raised, and my code reads the message and sends a dummy reply. My problem is the time between the last character received and the first character my interrupt handler writes: I would like it to be under 100 µs, but I measure 271 µs. Why is the simple process of receiving a message and sending an immediate response so slow?

Is it because the UartLite is a very simplified UART IP? Should I try the UART 16550 IP instead? Or is the problem the way I handle the interrupt?

 

Hereafter is how I coded the interrupt:

in my Main.cpp:

Status = UartLiteIntrIitialize(&IntcInstance, &UartLiteInst, UARTLITE_DEVICE_ID, UARTLITE_IRPT_INTR);

XUartLite_SetSendHandler(&UartLiteInst, UartLiteSendHandler, &UartLiteInst);
XUartLite_SetRecvHandler(&UartLiteInst, UartLiteRecvHandler, &UartLiteInst);

XUartLite_EnableInterrupt(&UartLiteInst);
XUartLite_Recv(&UartLiteInst, ReceiveBufferPtr, 1);

In the UartLite driver:

void UartLiteRecvHandler(void *CallBackRef, unsigned int EventData)
{
// Update the number of bytes received
TotalReceivedCount += EventData;
XUartLite_Recv(&UartLiteInst, RecvBuffer, EventData);
CompleteBuffer[TotalReceivedCount-1] = RecvBuffer[0]; // when byte by byte

/* When max message size reached, request to send the answer */
if (TotalReceivedCount == 11) {

// Send dummy answer

XUartLite_Send(&UartLiteInst, &BufferValueTEST[0], nbreByteTEST);

TotalReceivedCount = 0;

}

}

xilinx1.PNG

Accepted Solutions
derekm_
Voyager
Registered: ‎01-16-2019

@floriane1, I think you must be doing something odd in your code. I ran a quick test here, and I can get a very quick response using the Uart Lite in a Zynq system. See PNG below. I have a program on the host PC sending 16 bytes to the Uart, and the interrupt handler sends four bytes back. The time between the receive phase ending and the transmit phase starting is about 16 us. 

I think you should try using a 16-byte receive frame. That's what the Uart Lite works with, and it might be a bit dumb with any other frame size.  Also, I do not know if you need to call XUartLite_Recv in main(). Just do everything in your interrupt handlers.

uart_lite_response.png

Some code fragments below which might help. (Some of this is very specific to my design, but you should get the idea.)

1. UART initialisation

 

// ===== Uart header file

/* Device ID */
#define AXI_UART1_DEVICE_ID			XPAR_UARTLITE_0_DEVICE_ID

/* Uart1 Settings */
#define UART_RX_BUFFER_SIZE			16U		// 16 byte command frame from host
#define UART_TX_BUFFER_SIZE			4U		// 4 byte response frame to host



// ===== Uart source file

/* === Buffers === */
/* Uart Buffer for receiving data from host */
static uint8_t RxBuffer [UART_RX_BUFFER_SIZE];
/* Uart Buffer for sending data to host */
static uint8_t TxBuffer [UART_TX_BUFFER_SIZE];



status = XUartLite_Initialize(p_AxiUart1Inst, AXI_UART1_DEVICE_ID);
if (status != XST_SUCCESS)
{
  return status;
}

XUartLite_SetSendHandler(p_AxiUart1Inst, UartIntrSendHandler, p_AxiUart1Inst);
XUartLite_SetRecvHandler(p_AxiUart1Inst, UartIntrRecvHandler, p_AxiUart1Inst);

XUartLite_EnableInterrupt(p_AxiUart1Inst);

XUartLite_ResetFifos(p_AxiUart1Inst);

/* ---------------------------------------------------------------------
* ------------ BUFFER INITIALISATION ------------
* -------------------------------------------------------------------- */

unsigned int idx;

/* Clear every element, indices 0 .. SIZE-1 */
for (idx = 0; idx < UART_RX_BUFFER_SIZE; idx++)
{
  RxBuffer[idx] = 0;
}

for (idx = 0; idx < UART_TX_BUFFER_SIZE; idx++)
{
  TxBuffer[idx] = 0;
}

 

 

 

2. Uart Interrupt Handlers:

 

void UartIntrRecvHandler(void *CallBackRef, unsigned int event_data)
 {


	/* === RX FROM HOST === */
	/* Get the data received from the host */
	XUartLite_Recv(p_AxiUart1Inst, RxBuffer, UART_RX_BUFFER_SIZE);

	/* Call function to handle the data */
	handleCommand(RxBuffer, TxBuffer); // SPECIFIC TO MY CODE !!!!!

	/* === TX TO HOST === */
	/* Send the response data to the host. */
	n_bytes_sent = XUartLite_Send(p_AxiUart1Inst, TxBuffer, UART_TX_BUFFER_SIZE);


 }


void UartIntrSendHandler(void *CallBackRef, unsigned int event_data)
{
	/* Nothing to do. */
}
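
The receive callback can fire before a whole frame has arrived (the original question collects its 11-byte message one byte at a time). Below is a minimal, hardware-independent sketch of collecting bytes across several callbacks. The names `frame_acc_t`, `frame_acc_push`, `frame_acc_reset` and `FRAME_SIZE` are hypothetical and are not part of the XUartLite driver. In a real handler the bytes would come from `XUartLite_Recv`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE 16U  /* hypothetical; chosen to match UART_RX_BUFFER_SIZE */

/* Accumulates received bytes until one full frame is available. */
typedef struct {
    uint8_t data[FRAME_SIZE];
    size_t  count;              /* bytes collected so far */
} frame_acc_t;

/* Discard any partially collected frame. */
void frame_acc_reset(frame_acc_t *acc)
{
    assert(acc != NULL);
    acc->count = 0;
}

/*
 * Append up to len bytes from src. Bytes that do not fit into the
 * current frame are ignored. Returns 1 once the frame is complete,
 * 0 while more bytes are still needed.
 */
int frame_acc_push(frame_acc_t *acc, const uint8_t *src, size_t len)
{
    assert(acc != NULL);
    assert(src != NULL || len == 0);

    for (size_t i = 0; i < len && acc->count < FRAME_SIZE; i++) {
        acc->data[acc->count++] = src[i];
    }
    return acc->count == FRAME_SIZE;
}
```

A receive handler would call `frame_acc_push` with whatever `XUartLite_Recv` returned. It would only process the command and call `XUartLite_Send` when the result is 1, then call `frame_acc_reset` for the next frame.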

 

 

3. Add Uart to interrupt system:

 

status = XScuGic_Connect(p_XScuGicInst,
  AXI_UART1_INTR_ID,
  (Xil_ExceptionHandler) XUartLite_InterruptHandler,
  (void *) p_AxiUart1Inst);
if (status != XST_SUCCESS)
{
  return XST_FAILURE;
}

/* Set/Get priorities */
XScuGic_SetPriorityTriggerType(p_XScuGicInst, AXI_UART1_INTR_ID, 0x0A, 0x03);

/* Enable the interrupt for UART 1 */
XScuGic_Enable(p_XScuGicInst, AXI_UART1_INTR_ID);

 

 

 

 


9 Replies
derekm_
Voyager
Registered: ‎01-16-2019

I can't help with the UARTlite core, but I will ask: why are you not using one of the UARTs on the processing system side? On the Zybo board you already have direct access to UART1 (via the micro USB cable connected to J12). If you use this option, you will see really fast return times, even less than 1us. Or you could route UART0 through EMIO to one of the Pmod ports. I don't know the response time on that side, but it should be almost as fast as using UART 1.

I can help with the SCUGIC/UART interrupt code if you need.

floriane1
Observer
Registered: ‎05-15-2017

Hi derekm_, 

Thanks a lot for your reply. I am using the UartLite IP because in the future I would like to implement up to 8 UART devices with different baud rates, and there are only 2 UARTs on the processing-system side. If I can't find a way to improve the speed, a solution could be to use 4 Zybo boards (or another board), but I would prefer not to do that.

derekm_
Voyager
Registered: ‎01-16-2019

Unfortunately I haven't used the UARTlite a lot, so I don't know how to optimize that code. Hopefully someone else can help. (The only quick change I can think of is to increase the speed of FCLK_CLK0 and see if that helps.)

It's definitely worth trying the 16550 as well, to see if that is better.

Your other option is to generate the 8x UART in verilog or VHDL. That would be a very neat solution. But if you have no experience in HDLs then I understand you might not like to go that way either.

dpaul24
Scholar
Registered: ‎08-07-2014

@floriane1 ,

"Why is the simple process of receiving a message and sending an immediate response so slow?"

Have you measured the latency (in terms of the system clocks) after you receive the UART characters, raising an interrupt to the Zynq and finally Zynq responding back?

If you have a simulation model of your design this latency can be measured. The latency of the Zynq processing system is likely the cause of this delay, but I am not very sure.

As someone above has pointed out: "I can't help with the UARTlite core, but I will ask: why are you not using one of the UARTs on the processing system side?"

This might reduce the latency!

 


derekm_
Voyager
Registered: ‎01-16-2019

@dpaul24, just a comment on your point about the Zynq interrupt latency; from experience it is very low, even for an interrupt that originates in the PL. In one design I have, it is on the order of 3-4 us for an external interrupt. For interrupts that originate in the processing system (TTC, for example), the latency is on the order of 400 ns.

So I think the most likely cause is the Uart Lite code.


floriane1
Observer
Registered: ‎05-15-2017

Thank you all for taking the time to answer. derekm_, can you tell me which baud rate you used for your test? I changed my code to do exactly the same thing as yours (with the same buffer sizes) and I still measured the same result (in that case I was using 9600 baud). Then I tried the same code at 115200 baud, and the time between the last received character and the first sent character dropped from 271 µs to 40 µs. I am now wondering why this time depends on the baud rate. The UartLite clock is independent of the baud rate, so the time between the last and first character should just correspond to the acquisition and response time, or am I missing something?

About the Zynq interrupt latency: I ran the same test without using interrupts and got about the same results, so apparently the interrupt latency does not play a big role in the delay I am observing.

derekm_
Voyager
Registered: ‎01-16-2019

The baud rate is 115200, and I have a standard Zynq clock set-up: 33.333 MHz ext clock; 6:2:1 ratio; 667 MHz CPU clock, 533 MHz DDR clock, 1000 MHz IO clock. FCLK_CLK0  in the programmable logic is 50 MHz. If you have identical settings, then I would expect the results to be very close. It might be to do with how we test. In the logic analyzer capture above, I am directly probing the UART Tx/Rx pins on a Pmod USB-Uart.

 

floriane1
Observer
Registered: ‎05-15-2017

Thank you derekm_, my parameters are almost the same: Zynq clock 50 MHz, CPU clock ratio 6:2:1, CPU clock 650 MHz, DDR clock 525 MHz, FCLK_CLK0 100 MHz, and I am reading the delay with an oscilloscope connected to one of the high-speed Pmod ports. In any case, I can accept the 40 µs delay I observe at 115200 baud. I suppose the UartLite latency increases as the baud rate decreases, which would be why I observed that large delay at 9600 baud. Thank you all for helping me out.
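
The baud-rate dependence discussed in this thread can be put into numbers. The sketch below only does the arithmetic: it converts a baud rate into a bit time and expresses a measured delay as a number of bit times. The function names are hypothetical. The 10-bit frame length assumes an 8N1 configuration (start bit, 8 data bits, stop bit), which the thread does not state.

```c
#include <assert.h>

/* Duration of one bit on the wire, in microseconds. */
double uart_bit_time_us(unsigned long baud)
{
    assert(baud > 0);
    return 1.0e6 / (double)baud;
}

/*
 * Duration of one character frame, in microseconds.
 * bits_per_frame is 10 for 8N1 (an assumption, see above).
 */
double uart_frame_time_us(unsigned long baud, unsigned bits_per_frame)
{
    return uart_bit_time_us(baud) * (double)bits_per_frame;
}

/* Express a measured delay as a number of bit times at this baud rate. */
double delay_in_bit_times(double delay_us, unsigned long baud)
{
    return delay_us / uart_bit_time_us(baud);
}
```

With the figures reported above, one bit lasts about 104 µs at 9600 baud, so the 271 µs gap is roughly 2.6 bit times. At 115200 baud one bit lasts about 8.7 µs, so the 40 µs gap is roughly 4.6 bit times. This is consistent with a delay that grows as the bit period grows. The exact share depends on where the measurement starts (for instance, whether the stop bit is counted as part of the gap), which the thread does not specify.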