08-16-2020 10:59 PM
We succeeded in encoding 10-bit 4:2:2 image data (4128x2192 at 7.5 fps) with the VCU of the MPSoC, using a GStreamer pipeline.
Now we want to reduce the resolution from 4128x2192 to 4096x2160, and we implemented the cropping with a simple hardware circuit. However, the frame rate of the H.265-encoded MP4 file has dropped to 4.7 fps. We use the Xilinx Video Frame Buffer Write IP, and when we check the interrupt output of that IP, it fires at 7.5 fps as expected.
So we suspect there is some problem somewhere around GStreamer, but we have no clue how to proceed.
The environment is PetaLinux 2019.1.
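For reference, our pipeline has roughly the following shape (a sketch only: the device node, caps, and encoder settings below are placeholders, not our exact values):

```shell
# Sketch of the capture/encode pipeline (device node, caps and
# encoder settings are placeholders, not the exact production values).
# framerate=15/2 corresponds to 7.5 fps; NV16_10LE32 is the
# 10-bit 4:2:2 format used by the Xilinx VCU stack.
gst-launch-1.0 -e \
  v4l2src device=/dev/video0 io-mode=4 \
  ! "video/x-raw, width=4096, height=2160, format=NV16_10LE32, framerate=15/2" \
  ! omxh265enc \
  ! h265parse \
  ! qtmux \
  ! filesink location=/media/out.mp4
```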
Any help is appreciated.
08-16-2020 11:06 PM
08-17-2020 12:02 AM
Hi @watari ,
Thank you for the information. The original poster of the thread you mentioned seems to have worked around the problem by using a software encoder instead of the VCU.
When I checked the CPU usage during encoding with the top command, only one CPU core was in use, at nearly 90%. That is far above my expectation. I have not yet checked the CPU usage at the previous resolution (4128x2192), so I'll check that next.
08-17-2020 01:08 AM - edited 08-17-2020 04:29 PM
Hi @watari ,
I checked the CPU usage during encoding at the resolution of 4128x2192 (where the resulting frame rate is correctly 7.5 fps) and found that all CPU cores were BELOW 5%.
As I wrote earlier, the usage is nearly 90% when I encode 4096x2160 (and the frame rate doesn't reach 7.5 fps). I don't know why there is such a huge difference.
The only things that come to mind are that the H.265 level of the MP4 bitstream is 6 (4128x2192) vs. 5 (4096x2160), and that the clock rate for the VCU is somewhat low (150 MHz). But considering that the higher resolution works correctly, I don't think the clock rate is the cause of the problem...
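The level difference itself is expected, by the way: the HEVC specification caps the luma picture size for Level 5.x at 8,912,896 samples, which 4096x2160 fits under but 4128x2192 does not, so the larger picture must be signalled as Level 6. A quick arithmetic check:

```shell
# Compare both picture sizes against the maximum luma picture size
# (MaxLumaPs) for HEVC Level 5.x, which is 8,912,896 samples.
max_luma_ps_level5=8912896

old=$((4128 * 2192))   # 9048576 -> exceeds the Level 5 limit, needs Level 6
new=$((4096 * 2160))   # 8847360 -> fits within Level 5

echo "$old $new $max_luma_ps_level5"
```

So the level change is a side effect of the crop, not in itself evidence of a fault.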
#EDIT on Aug. 18, 2020
I thought AR #72324 (especially AR #72460) might solve the problem, but applying the patch didn't improve the situation.
08-17-2020 10:49 PM
By the way, I am still suffering from this problem; my issue is not resolved. Using a software encoder even though the device has a VCU is definitely not a solution.
08-20-2020 10:39 PM - edited 08-20-2020 10:41 PM
Hi @so-lli1 ,
I didn't have time to thoroughly investigate the problem this week. My plan is to enable debug output in GStreamer and use the perf command to find out which part of the program consumes so much time. So far I have found that the C library's memcpy function (called from v4l2src) consumes most of the time, but because I haven't enabled debug output in GStreamer yet, I could only trace the calls below the C library, not up into GStreamer itself.
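The plan above, sketched as commands (the debug categories and the 10-second profiling window are just examples of what I intend to run, not final):

```shell
# 1. Run the pipeline with GStreamer V4L2 debug output enabled,
#    capturing the log to a file ("..." stands for the actual pipeline):
GST_DEBUG=v4l2src:6,v4l2bufferpool:6 gst-launch-1.0 ... 2> gst.log

# 2. In another shell, profile the running pipeline with call graphs
#    for 10 seconds, then inspect where the cycles go:
perf record -g -p "$(pidof gst-launch-1.0)" -- sleep 10
perf report
```

With call graphs enabled, perf report should show which GStreamer element is the caller of the dominant memcpy.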
08-21-2020 02:48 AM
I tried this already, as you suggested. At least for me, it did not make any difference.
08-23-2020 09:38 PM
Hello @watari ,
I tried some of the other io-modes and found that io-mode=5 (dmabuf-import) works well.
I don't know why the original io-mode=4 (dmabuf) fails depending on the frame resolution, but since my time is limited, I'll stick with io-mode=5 for now.
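For anyone hitting the same symptom, the change amounts to a single property on v4l2src (again a sketch; the device node, caps, and sink are placeholders):

```shell
# Hypothetical sketch: switch v4l2src from io-mode=4 (dmabuf) to
# io-mode=5 (dmabuf-import), so v4l2src imports buffers allocated
# downstream instead of exporting its own.
gst-launch-1.0 -e \
  v4l2src device=/dev/video0 io-mode=5 \
  ! "video/x-raw, width=4096, height=2160, format=NV16_10LE32, framerate=15/2" \
  ! omxh265enc ! h265parse ! qtmux \
  ! filesink location=/media/out.mp4
```

Importing downstream buffers avoids an extra copy of each frame, which is consistent with the memcpy hot spot I saw with perf.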
Thank you for the advice.