10-21-2020 03:09 AM - edited 10-29-2020 09:50 AM
[   88.188616] Configured vdma with YUV frame addresses
[  152.819701] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
[  152.833423] Mem abort info:
[  152.836200]   ESR = 0x96000005
[  152.839238]   Exception class = DABT (current EL), IL = 32 bits
[  152.845140]   SET = 0, FnV = 0
[  152.848178]   EA = 0, S1PTW = 0
[  152.851303] Data abort info:
[  152.854167]   ISV = 0, ISS = 0x00000005
[  152.857986]   CM = 0, WnR = 0
[  152.860941] user pgtable: 4k pages, 39-bit VAs, pgdp = 000000003f6595a2
[  152.867543] pgd=0000000000000000, pud=0000000000000000
[  152.874325] Internal error: Oops: 96000005 [#1] SMP
[  152.879184] Modules linked in: dlnx(O) al5d(O) al5e(O) allegro(O) xlnx_vcu_clk xlnx_vcu xlnx_vcu_core mali(O) uio_pdrv_genirq [las]
[  152.892398] CPU: 0 PID: 5979 Comm: vcu-app Tainted: G O 4.19.0-xilinx-v2019.1 #1
[  152.901083] Hardware name: xlnx,zynqmp (DT)
[  152.905252] pstate: 00000085 (nzcv daIf -PAN -UAO)
[  152.910030] pc : idr_find+0x8/0x20
[  152.913421] lr : find_vpid+0x44/0x50
[  152.916985] sp : ffffff8008003d90
[  152.920284] x29: ffffff8008003d90 x28: 0000000000000001
[  152.925587] x27: ffffff8008d66928 x26: ffffff800921ac10
[  152.930891] x25: ffffffc87bba1600 x24: ffffff8009198648
[  152.936194] x23: 0000000000000038 x22: ffffff8008003f04
[  152.941497] x21: 0000000000000000 x20: ffffff8009198648
[  152.946801] x19: ffffff8000c24588 x18: ffffff80091a92c8
[  152.952105] x17: 0000000000000000 x16: 0000000000000000
[  152.957408] x15: 0000000000000000 x14: ffffffc875c43100
[  152.962712] x13: ffffffc875c43000 x12: ffffffc875c43028
[  152.968015] x11: ffffffc875c43101 x10: 0000000000000040
[  152.973319] x9 : ffffff80091aafc8 x8 : ffffffc87b400268
[  152.978622] x7 : 0000000000000000 x6 : ffffffc87b400240
[  152.983926] x5 : ffffffc87b400428 x4 : 000000000000002c
[  152.989230] x3 : 00000000ffffffff x2 : 0000000000000000
[  152.994533] x1 : 000000000000082f x0 : 0000000000000008
[  152.999838] Process vcu-app (pid: 5979, stack limit = 0x000000007e9b0f63)
[  153.006607] Call trace:
[  153.009040]  idr_find+0x8/0x20
[  153.012077]  find_vpid+0x44/0x50
[  153.015292]  irq_handler+0x70/0xd8 [dlnx]
[  153.019293]  __handle_irq_event_percpu+0x6c/0x168
[  153.023987]  handle_irq_event_percpu+0x34/0x88
[  153.028414]  handle_irq_event+0x40/0x98
[  153.032233]  handle_fasteoi_irq+0xc0/0x198
[  153.036313]  generic_handle_irq+0x24/0x38
[  153.040305]  __handle_domain_irq+0x60/0xb8
[  153.044385]  gic_handle_irq+0x5c/0xb8
[  153.048030]  el1_irq+0xb0/0x140
[  153.051156]  release_task.part.3+0x34c/0x478
[  153.055417]  do_exit+0x61c/0x980
[  153.058629]  __arm64_sys_exit+0x14/0x18
[  153.062450]  el0_svc_common+0x84/0xd8
[  153.066103]  el0_svc_handler+0x68/0x80
[  153.069834]  el0_svc+0x8/0xc
[  153.072701] Code: a8c17bfd d65f03c0 a9bf7bfd 910003fd (b9401002)
[  153.078784] ---[ end trace 3817f32f6d49de58 ]---
[  153.083383] Kernel panic - not syncing: Fatal exception in interrupt
[  153.089722] SMP: stopping secondary CPUs
[  153.093636] Kernel Offset: disabled
[  153.097107] CPU features: 0x0,20802004
[  153.100838] Memory Limit: none
[  153.103879] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
10-22-2020 04:05 PM
>Should we enable the "prefetch-buffer" option (using the encoder buffer in the VCU design) in order to run the pipeline with the main profile? The VCU documentation shows pipelines where the prefetch buffer is enabled.
I would guess yes.
BTW, did you set the QoS parameters before launching your encoding?
At least, it seems that the frame corruption is related to the QoS parameters.
10-29-2020 10:07 AM
1. I tried setting the QoS values for the connected AXI HP ports.
These are the values being written to the AFIFM registers
for the encoder-connected AXI HP0 & HP1 ports:
devmem 0xFD380008 w 0x3
devmem 0xFD38001C w 0x3
devmem 0xFD390008 w 0x3
devmem 0xFD39001C w 0x3
devmem 0xFD380004 w 0xF
devmem 0xFD380018 w 0xF
devmem 0xFD390004 w 0xF
devmem 0xFD390018 w 0xF
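For repeatability, the eight writes above can be generated from one small script. This is only a sketch: the base addresses 0xFD380000 and 0xFD390000 are the AFIFM blocks serving HP0 and HP1, and the offsets and values are copied verbatim from the commands above; check their meaning against the ZynqMP register reference before relying on them.

```shell
#!/bin/sh
# Emit the AFIFM QoS writes from the post for the HP0/HP1 AFI ports.
# Pipe the output to sh to actually apply them (requires root):
#   ./vcu_qos.sh | sh
qos_cmds() {
    for base in 0xFD380000 0xFD390000; do          # AFIFM for HP0, HP1
        # "offset value" pairs as used in the post
        for off_val in "0x008 0x3" "0x01C 0x3" "0x004 0xF" "0x018 0xF"; do
            set -- $off_val
            printf 'devmem 0x%X w %s\n' $(( base + $1 )) "$2"
        done
    done
}
qos_cmds
```

Printing the commands instead of executing them directly makes it easy to review the full register list before touching the hardware.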
2. Enabled the optional encoder buffer and tried with prefetch-buffer=TRUE.
With both these changes we are still facing frame tearing (older frames appearing after some time) in the encoding pipeline with the Main profile.
Our frames are sourced from the appsrc GStreamer element. A custom module programs the VDMA and maps the 8 cyclic buffers to userspace; a handshaking mechanism based on the interrupt received from the VDMA reads one frame behind the write position in the 8 cyclic buffers, which the VDMA fills periodically at 16.667 ms intervals.
With this implementation, the issue does not appear with the baseline profile, only with the main profile.
This is the pipeline we are trying out:
appsrc ! rawvideoparse width=1920 height=1080 format=nv12 framerate=60/1 ! queue ! omxh264enc prefetch-buffer=true b-frames=0 target-bitrate=60000 gop-mode=basic control-rate=constant num-slices=8 initial-delay=250 cpb-size=500 ! video/x-h264,profile=main,alignment=au ! queue ! h264parse ! matroskamux ! filesink
We have verified there is no issue with the handshaking mechanism; frames are properly read from the VDMA in the appsrc.
Query: After applying both the QoS port settings and enabling the prefetch-buffer as per your suggestion, we are still getting the frame corruption. Can you suggest what change is required? Does any parameter need to be changed?
10-29-2020 04:21 PM
>Query: After applying both the QoS port settings and enabling the prefetch-buffer as per your suggestion, we are still getting the frame corruption. Can you suggest what change is required? Does any parameter need to be changed?
From your first post, I suspected a transaction issue on the internal bus.
So, I suggested the QoS settings and the prefetch-buffer to improve the transaction rate.
However, you are still facing the frame corruption issue.
I guess this is caused by the handshake procedure, especially the performance of appsrc.
Would you try a performance analysis to investigate the root cause on the AXI4 bus and the internal crossbar switch?
10-29-2020 11:43 PM
Can you please go through the procedure we are following here and suggest anything that stands out?
BTW, we are very thankful for your involvement with our issue.
>Would you try a performance analysis to investigate the root cause on the AXI4 bus and the internal crossbar switch?
We will try to figure this out. Could you give some more context? I am somewhat new to these areas.
> I guess this is caused by the handshake procedure, especially the performance of appsrc.
We also don't think it is a transaction issue, but we are a bit confused by the results of the experiments we tried.
The procedure that we are following is:
We allocate 8 buffers for the 1080p NV12 format in the PS DDR and configure the VDMA with the start addresses of these buffers to start cyclic DMA transfers into them. On every frame completion the VDMA triggers an interrupt, which the custom driver handles and uses to notify the appsrc to push frames one by one to the encoder plugin. We can also read a VDMA register to know which frame it has finished writing. Based on these inputs, the appsrc reads one frame behind the VDMA write index to prevent any overlap; we have verified that the appsrc stays one frame behind the VDMA write pointer while traversing the cyclic buffers.
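The "one frame behind" rule in the handshake above boils down to a single index computation. A minimal sketch (hypothetical names; 8 cyclic buffers as described):

```shell
#!/bin/sh
# Sketch of the read-behind rule: with NBUF cyclic buffers, appsrc always
# consumes the slot one position behind the slot the VDMA is currently
# writing, so reader and writer never touch the same buffer.
NBUF=8

read_index() {   # $1 = index of the buffer the VDMA is writing (0..NBUF-1)
    echo $(( ($1 + NBUF - 1) % NBUF ))
}

read_index 0   # prints 7: while VDMA writes slot 0, appsrc reads slot 7
read_index 3   # prints 2
```

The modulo wrap-around matters only at slot 0; everywhere else the reader is simply the previous slot.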
NOTE: With this implementation we did not see the issue with the baseline profile; we were able to run the encoding pipeline for more than 5 minutes using the baseline profile with the same handshaking mechanism in place. But with the main profile it shows this frame corruption/ordering issue after barely 1 minute or even less.
If it were a basic handshaking issue, shouldn't we see the same issue with the baseline profile as well?
The artifact we are seeing: after some time, the output jumps backwards by several frames and proceeds again from that point, and this repeats, as if old frame data were being stored somewhere. We also observe that after some time it gets back to normal. (At the instant it recovers, it jumps directly to the latest frame position; we experimented with a stopwatch in the image scene.)
FYI: from 0 to 1 min it is okay; at 1 min 5 s the issue starts, where it jumps 8 frames back and proceeds, and this fallback happens every 300 ms. This continues until 1 min 33 s, after which it suddenly jumps to 1 min 35 s; it seems some frames are dropped to get to the current position.
From the appsrc we printed the write index from the VDMA and the read index passed from the appsrc to the encoder, and these are in sync. We are running this at 60 fps without any frame index overlap or misses.
11-03-2020 02:19 PM
>>Would you try a performance analysis to investigate the root cause on the AXI4 bus and the internal crossbar switch?
>We will try to figure this out. Could you give some more context? I am somewhat new to these areas.
>> I guess this is caused by the handshake procedure, especially the performance of appsrc.
>We also don't think it is a transaction issue, but we are a bit confused by the results of the experiments we tried.
I suggest you first observe the system transactions with the System ILA IP.
That way you can confirm the performance of the internal bus.
>FYI: from 0 to 1 min it is okay; at 1 min 5 s the issue starts, where it jumps 8 frames back and proceeds, and this fallback happens every 300 ms. This continues until 1 min 33 s, after which it suddenly jumps to 1 min 35 s; it seems some frames are dropped to get to the current position.
Also, I suggest you analyze this phenomenon with GstShark.
Would you try these?
11-04-2020 03:25 AM
> I suggest you observe system transactions with System ILA IP first. You can confirm performance on the internal bus.
Sure, I will check on this.
> Also, I suggest you analyze this phenomenon with GstShark, too.
Yes, we are yet to analyze with GstShark. In our latest test we increased the AXI memory-mapped clock from 227 MHz to 250 MHz.
The current observation is that with sync=true, appsrc pushes at 60 fps, but after about a 40-second run, all the frame drops happen within a 4-5 second window, which suggests a bottleneck in the pipeline. And copying the 3.1 MB 2K frames into the OMX encoder plugin using CPU cycles shouldn't become a bottleneck, right?
We will run the performance analysis with GstShark as well as the System ILA to measure the performance of the internal bus.
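For reference, a GstShark run for a pipeline like this could look as follows. This is a sketch, not a verified command for this board: the tracer names come from the GStreamer tracing framework / GstShark documentation, and videotestsrc stands in for the custom appsrc (which cannot be reproduced here) so that the source path can be ruled in or out.

```shell
# Enable the framerate and interlatency tracers: framerate logs per-pad
# throughput, interlatency logs how long buffers take between elements,
# which should show where the stall around the 40 s mark originates.
GST_DEBUG="GST_TRACER:7" GST_TRACERS="framerate;interlatency" \
gst-launch-1.0 videotestsrc num-buffers=3600 \
    ! video/x-raw,format=NV12,width=1920,height=1080,framerate=60/1 \
    ! queue \
    ! omxh264enc prefetch-buffer=true b-frames=0 target-bitrate=60000 \
        gop-mode=basic control-rate=constant num-slices=8 \
    ! video/x-h264,profile=main,alignment=au \
    ! queue ! h264parse ! matroskamux ! filesink location=/tmp/trace_test.mkv
```

If the drops disappear with videotestsrc, the bottleneck is in the appsrc/VDMA path rather than in the encoder.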