Registered: ‎08-01-2017

Kernel oops when testing openAMP throughput

I'm trying to get an idea of the throughput on the openAMP channel between P1 and P0 using a ZedBoard.

P0 is running PetaLinux 4.4 (2016.2) and P1 is running an RTOS with openAMP and a proprietary application.

 

The test is as follows:

After Linux boots, I'll manually start P1 (via remoteproc).

P1 will set itself up and start all tasks. The throughput task will sleep for 30 seconds before starting the tests; the openAMP task is separate. The test consists of sending the biggest payload possible as fast as it can. My messages contain a 12-byte header (type, subtype, payload length) followed by the payload (just ASCII text), and that's on top of the 16-byte RPMSG header. The biggest payload possible is thus 512 - 16 - 12 = 484 bytes.

I have a circular buffer owned by the openAMP task that buffers requests until the task gets CPU time. The messages are then sent back-to-back, one rpmsg_send() call per message.

 

When I run the test like this, almost as soon as the flooding starts I get a kernel oops in drivers/virtio/virtio_ring.c.

To debug, I added a couple of printk() calls to that file, in virtqueue_get_buf() and detach_buf():

Code in virtqueue_get_buf():

    /* detach_buf clears data, so grab it now. */
    ret = vq->data[i];

    /******************************/
    printk("last_used: %d, i: %u, *len: %u, ret %d\n",
           last_used, i, (*len), *(int *)ret);
    print_hex_dump(KERN_DEBUG, "virtqueue_get_buf data[i]: ",
                   DUMP_PREFIX_NONE, 16, 1, vq->data[i], 16, true);
    /******************************/

    detach_buf(vq, i);

Code in detach_buf():

    /* Put back on free list: find end */
    i = head;

    /* desc[i].addr is 64-bit, so cast it for the printk format */
    printk("head: %u, freeing addr: %llu\n", i,
           (unsigned long long)vq->vring.desc[i].addr);

    /* Free the indirect table */

And this is the result:

last_used: 232, i: 230, *len: 67372036, ret 1
head: 230, freeing addr: 134597632
virtio_rpmsg_bus virtio0: inbound msg too big: (67372036, 41)
last_used: 233, i: 233, *len: 512, ret 1
head: 233, freeing addr: 134599168
last_used: 234, i: 234, *len: 512, ret 1
head: 234, freeing addr: 134599680
last_used: 235, i: 235, *len: 512, ret 1
head: 235, freeing addr: 134600192
last_used: 236, i: 236, *len: 512, ret 1
head: 236, freeing addr: 134600704
last_used: 237, i: 237, *len: 512, ret 1
head: 237, freeing addr: 134601216
last_used: 238, i: 236, *len: 67372036, ret 1
head: 236, freeing addr: 134600704
virtio_rpmsg_bus virtio0: inbound msg too big: (67372036, 78)
virtio_rpmsg_bus virtio0: input:id 236 is not a head!

 

During a normal run, i.e. one that doesn't crash, last_used and i are always in sync. Only when i differs (for reasons unknown to me) do I see the crash.

When I use smaller payloads or add a delay (essentially a context switch of the throughput task) after each message, the crash takes much longer to happen, or doesn't happen at all.

 

Has anyone ever seen this before? Has it been identified and fixed in a later PetaLinux release?
