stephan_hochmueller
Participant
2,200 Views
Registered: ‎06-30-2017

XSDK OpenAMP echo-test: Blocking "platform_poll" function (Standalone)

Hello,

We started our development with the 2018.1 toolchain. We have now switched to 2018.3 and ran into a problem with the OpenAMP echo test example.

There are big changes between the two versions. In 2018.2 the function "hil_poll" was called in main. This function could be configured as blocking or non-blocking. In 2018.3 the whole "hil" library has been removed, and the function "platform_poll" is used instead. This function is defined in platform_info.c; it is blocking and cannot be configured as non-blocking.

Because we also poll other things in the main loop, we can no longer run the remote firmware. Is there a way to overcome this problem, or any documentation? I noticed that in https://www.xilinx.com/support/documentation/sw_manuals/xilinx2018_3/ug1186-zynq-openamp-gsg.pdf the chapter "OpenAMP Xilinx SDK Key Source Files" still describes the old structure.

Please help!

Regards,

Stephan 

 

5 Replies
ahavens
Observer
2,066 Views
Registered: ‎08-04-2016

I ran into the same issue, but it looks to me like the platform_info.c file is intended to be an example, or at least customizable, so I modified platform_poll to be non-blocking, and that works for me.

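#include <errno.h> /* for EAGAIN, if platform_info.c does not already include it */

int platform_poll(void *priv)
{
	struct remoteproc *rproc = priv;
	struct remoteproc_priv *prproc;
	unsigned int flags;

	prproc = rproc->priv;

	/* Check the kick flag with interrupts disabled. */
	flags = metal_irq_save_disable();
	if (!(atomic_flag_test_and_set(&prproc->ipi_nokick))) {
		/* The flag was clear, so a kick is pending: process it. */
		metal_irq_restore_enable(flags);
		remoteproc_get_notification(rproc, RSC_NOTIFY_ID_ANY);
		return 0;
	}

	metal_irq_restore_enable(flags);

	/* Nothing pending: return immediately instead of waiting. */
	return -EAGAIN;
}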

It looks like it is not used internally at all so you could probably change the signature to match the old hil_poll and have it be an option to block or not.  
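A minimal sketch of that idea, with the blocking behaviour selected by a parameter in the spirit of the old hil_poll. To stay self-contained it models only the flag logic with C11 atomics. The names `kick_state`, `kick_init`, `kick_notify`, `kick_poll`, and the `wait_hook` callback are hypothetical and not part of OpenAMP or libmetal, and the interrupt save/restore calls and `remoteproc_get_notification` from the function above are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the ipi_nokick flag in remoteproc_priv:
 * set = no kick pending, clear = the IPI handler saw a kick. */
struct kick_state {
	atomic_flag nokick;
};

/* Start with no kick pending. */
static void kick_init(struct kick_state *ks)
{
	assert(ks != NULL);
	atomic_flag_test_and_set(&ks->nokick);
}

/* What the IPI interrupt handler would do when the other core kicks us. */
static void kick_notify(struct kick_state *ks)
{
	atomic_flag_clear(&ks->nokick);
}

/*
 * Returns 0 if a kick was pending (and consumes it).
 * If nonblocking is nonzero, returns -EAGAIN when nothing is pending.
 * Otherwise it keeps calling wait_hook (for example a WFI wrapper)
 * until a kick arrives.
 */
static int kick_poll(struct kick_state *ks, int nonblocking,
		     void (*wait_hook)(void *), void *arg)
{
	for (;;) {
		if (!atomic_flag_test_and_set(&ks->nokick))
			return 0;
		if (nonblocking)
			return -EAGAIN;
		if (wait_hook != NULL)
			wait_hook(arg);
	}
}
```

In a main loop that also polls other things, the non-blocking form would be called once per iteration, treating -EAGAIN as "nothing to do" rather than as an error.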

1,136 Views
Registered: ‎01-10-2020

I am trying to establish APU-RPU communication via rpmsg and OpenAMP. In our project, we need to do the following steps:

1- The APU sends an rpmsg to the RPU.

2- The RPU jumps via its platform_poll function to the rpmsg_callback.

3- In the callback we implemented a while loop that should run until a new rpmsg is received.

4- After receiving the new APU rpmsg, the RPU leaves this while loop, executes platform_poll, and runs the rpmsg_callback again with the newly received data.

I edited the platform_poll function so that it works in non-blocking mode.

For the first rpmsg, the IPI and the polling work correctly. But when I call platform_poll inside my rpmsg_callback, the RPU doesn't detect any new rpmsg.

Is it not possible to check the IPI flags and do the polling inside an rpmsg_callback?

 

I would be grateful if you could help me solve this issue.

BR

Iheb 

ahavens
Observer
1,107 Views
Registered: ‎08-04-2016

It has been a while since I worked with this, but off the cuff I don't think you want to call platform_poll from within the rpmsg_callback. If I recall correctly, the rpmsg_callback is an interrupt handler, and platform_poll changes the interrupt masking. I don't think either the original platform_poll (https://github.com/Xilinx/embeddedsw/blob/master/lib/sw_apps/openamp_echo_test/src/machine/zynqmp_r5/platform_info.c#L232) or my modification above is reentrant, so unpredictable things might happen with the interrupt masking. If I had to guess, in your case the interrupt is getting stuck in a masked state and never gets unmasked.

The echo_test example just calls the poll in the main app loop. https://github.com/Xilinx/embeddedsw/blob/master/lib/sw_apps/openamp_echo_test/src/system/generic/rpmsg-echo.c#L73

In our case the callback copies the incoming command to a queue, which is then processed in the main loop rather than in interrupt context. If I recall correctly, this is similar to what the rpmsg driver does on the Linux side (https://elixir.bootlin.com/linux/v5.10-rc7/source/drivers/rpmsg/rpmsg_char.c#L111). We also had situations in both directions where data could be sent faster than it could be consumed, which caused the queues to grow, so we added a throttling mechanism on top to keep that under control.
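To illustrate the copy-to-queue pattern, here is a rough sketch of a bounded single-producer/single-consumer ring buffer in C11. This is not our actual code: `cmd`, `cmd_queue`, `cmd_queue_push`, `cmd_queue_pop`, and the two size constants are names and values made up for the example. The callback would call the push function and return right away, and the main loop would call the pop function after each poll.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define CMD_MAX_LEN   64u /* assumed maximum payload size */
#define CMD_QUEUE_LEN 8u  /* number of slots, must be a power of two */

static_assert((CMD_QUEUE_LEN & (CMD_QUEUE_LEN - 1u)) == 0u,
	      "CMD_QUEUE_LEN must be a power of two");

struct cmd {
	size_t len;
	unsigned char data[CMD_MAX_LEN];
};

/* One producer (the callback) and one consumer (the main loop). */
struct cmd_queue {
	struct cmd slots[CMD_QUEUE_LEN];
	atomic_uint head; /* advanced only by the producer */
	atomic_uint tail; /* advanced only by the consumer */
};

static void cmd_queue_init(struct cmd_queue *q)
{
	atomic_init(&q->head, 0u);
	atomic_init(&q->tail, 0u);
}

/* Producer side: copy the payload in. Returns 0, or -1 if the queue is
 * full or the payload is too long. */
static int cmd_queue_push(struct cmd_queue *q, const void *data, size_t len)
{
	unsigned int head = atomic_load_explicit(&q->head, memory_order_relaxed);
	unsigned int tail = atomic_load_explicit(&q->tail, memory_order_acquire);

	if (len > CMD_MAX_LEN || head - tail == CMD_QUEUE_LEN)
		return -1;

	struct cmd *slot = &q->slots[head % CMD_QUEUE_LEN];
	memcpy(slot->data, data, len);
	slot->len = len;
	/* Publish the slot only after its contents are written. */
	atomic_store_explicit(&q->head, head + 1u, memory_order_release);
	return 0;
}

/* Consumer side: copy the oldest entry out. Returns 0, or -1 if empty. */
static int cmd_queue_pop(struct cmd_queue *q, struct cmd *out)
{
	unsigned int tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
	unsigned int head = atomic_load_explicit(&q->head, memory_order_acquire);

	if (head == tail)
		return -1;

	*out = q->slots[tail % CMD_QUEUE_LEN];
	atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
	return 0;
}
```

Because the queue is bounded, a failed push is the point where some form of throttling or back-pressure has to kick in; the sketch only reports the condition and leaves the policy to the caller.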

1,033 Views
Registered: ‎01-10-2020

I implemented it again the way you described. In the rpmsg callback on the RPU side, the received data is copied to a local buffer. Then, in every iteration of the main loop, the non-blocking platform_poll is called once, and data is received from the FPGA to the RPU via DMA interrupts.

The first time, the APU could send an rpmsg to the RPU (so the IPI interrupt was handled successfully). Then the DMA controller is initialized, the FPGA starts generating DMA interrupts, and the RPU starts receiving data via DMA. The DMA generates an interrupt every 144 us, which sets a variable to 1. In the main loop, we first execute a non-blocking platform_poll to check whether an IPI interrupt from the APU was generated and a new rpmsg was received. Then we check the DMA-receive flag. If there is new DMA data, we send it back to the FPGA via DMA and send it to the APU via rpmsg send.
When I try to send a second rpmsg from the APU to the RPU at run time (during the DMA send/receive activity), platform_poll does not detect an IPI interrupt.
The IPI mailbox shows the following warning:

[ 160.016745] zynqmp-ipi-mbox mailbox@ff90000: Try increasing MBOX_TX_QUEUE_LEN
[ 160.023897] r5@0: Failed to kick remote.

Does this mean that Linux tried to generate an IPI interrupt, but something on the RPU side is disabling it? Is it possible that the DMA interrupts are preventing any other interrupt from being handled at run time?

 

ahavens
Observer
1,017 Views
Registered: ‎08-04-2016

If I recall correctly, we saw that message when trying to send too many messages from the APU to the RPU. I think at that time we were sending a bunch of "throttle" messages from the APU to the RPU to indicate that the APU was ready to read and the RPU could send data. Ultimately we had to clean that up to send only one throttle message so that it would not spam. We don't use DMA much, but I think it is fairly common for interrupt handlers to mask other interrupts.

If I recall correctly, the message actually means that it could not send a message from the APU to the RPU.

https://elixir.bootlin.com/linux/v5.10.1/source/drivers/mailbox/mailbox.c#L259

By default there are 20 of those. I am pretty sure they are separate from the virtio buffers, which default to 512 buffers of 512 bytes each (we have modified this on our system to use fewer, larger buffers).

https://elixir.bootlin.com/linux/v5.10.1/source/include/linux/mailbox_controller.h#L102

If I recall correctly, we tried a patch to increase MBOX_TX_QUEUE_LEN, and it just delayed the problem without solving the root cause.
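As a rough illustration of the "only send one throttle message" cleanup mentioned above, the reader side can remember that a "ready" message is already outstanding and suppress repeats until data actually arrives. This is only a sketch of the idea, not the code we run: `ready_gate`, `send_ready_fn`, and `count_send` are hypothetical names, and the send hook stands in for whatever actually transmits the rpmsg.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transmit hook; a real one would wrap the rpmsg send call.
 * Returns 0 on success. */
typedef int (*send_ready_fn)(void *ctx);

struct ready_gate {
	bool outstanding; /* a "ready" message was sent and not yet answered */
	send_ready_fn send;
	void *ctx;
};

static void ready_gate_init(struct ready_gate *g, send_ready_fn send, void *ctx)
{
	assert(g != NULL && send != NULL);
	g->outstanding = false;
	g->send = send;
	g->ctx = ctx;
}

/* Announce readiness at most once until data arrives.
 * Returns 1 if a message was sent, 0 if suppressed, -1 if the send failed. */
static int ready_gate_signal(struct ready_gate *g)
{
	if (g->outstanding)
		return 0;
	if (g->send(g->ctx) != 0)
		return -1;
	g->outstanding = true;
	return 1;
}

/* Call when the peer's data arrives, so the next signal is sent again. */
static void ready_gate_data_received(struct ready_gate *g)
{
	g->outstanding = false;
}

/* Stand-in sender for experiments: only counts how often it is called. */
static int count_send(void *ctx)
{
	++*(int *)ctx;
	return 0;
}
```

With a gate like this, calling the signal function on every main-loop iteration results in a single message per round trip, which keeps the mailbox queue from filling up with duplicates.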
