radimsosty
Visitor
4,004 Views
Registered: ‎01-19-2015

ZCU102: How to pass GPU output to the PL

Hello everybody,

I am having trouble passing GPU output to the PL and the HDMI output on the ZCU102 board.

What I need to do is run an OpenGL application on the Mali GPU and transfer the output video stream into the PL.

I have access to the HDMI FrameBuffer Example Design, and I have also looked at the Zynq UltraScale MPSoC Base TRD and the Zynq UltraScale+ MPSoC: Embedded Design Tutorial (EDT).

My first attempt was to add the X11 libraries to the PetaLinux system from the HDMI FrameBuffer Example Design and run the tricube application from the EDT on this design. This fails with "can't open display".

So I did some research, and I think the solution is to use DRM to initialize EGL. But every tutorial I found uses the GBM library, and it looks like GBM can't be added to the rootfs through petalinux-config.

Could someone give me some advice, please?

0 Kudos
15 Replies
sandeepg
Moderator
3,927 Views
Registered: ‎04-24-2017

Hi @radimsosty,

 

Take a look at the TRD design: http://www.wiki.xilinx.com/Zynq+UltraScale+MPSoC+Base+TRD+2017.4

This example has HDMI Tx:

http://www.wiki.xilinx.com/Zynq%20UltraScale%20MPSoC%20Base%20TRD%202017.4%20-%20Design%20Module%206

Thanks,
Sandeep
PetaLinux Yocto | Embedded SW Support

---------------------------------------------------------------------------
Don’t forget to Reply, Kudo, and Accept as Solution.
---------------------------------------------------------------------------
0 Kudos
radimsosty
Visitor
3,871 Views
Registered: ‎01-19-2015

Hello @sandeepg,

Thanks for your answer. I know about the TRD; I mentioned it in the original post. But I was unable to find a solution in it either. The design is a bit complicated and, if I understand it correctly, the GPU is used only by Qt. I need a way to make it work with a pure OpenGL application.

Right now I'm using a different approach: the DP live video output interface. With it I'm able to get the video stream into the PL. But it needs a monitor connected to the DP connector, so I guess I will have to edit some drivers or fake the EDID on the DP's DDC lines on my final PCB.

So I have a (partial) solution, but I would prefer the approach shown in the TRD.

0 Kudos
chrisar
Xilinx Employee
3,840 Views
Registered: ‎08-01-2007

In theory, what you are trying to do should be possible.

Maybe you need to use modetest to get the HDMI register and then set the proper $DISPLAY variable.
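
As a rough companion to modetest (from libdrm), here is a minimal sketch for checking which DRM connectors the kernel has registered and whether each one reports a sink. It assumes the standard Linux sysfs layout, where each connector appears as a directory such as card0-HDMI-A-1 under /sys/class/drm with a status file containing connected, disconnected or unknown. The function name list_drm_connectors and its sysfs_root parameter are made up for illustration, and the connector names on a ZCU102 image may differ.

```python
from pathlib import Path


def list_drm_connectors(sysfs_root="/sys/class/drm"):
    """Return {connector_name: status} for every DRM connector under sysfs_root."""
    connectors = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return connectors
    for entry in sorted(root.iterdir()):
        status_file = entry / "status"
        # Connector directories look like "card0-HDMI-A-1"; a plain "card0"
        # directory is the device itself and has no status file.
        if "-" in entry.name and status_file.is_file():
            connectors[entry.name] = status_file.read_text().strip()
    return connectors


if __name__ == "__main__":
    for name, status in list_drm_connectors().items():
        print(f"{name}: {status}")
```

If no HDMI connector shows up here at all, the DRM pipeline for the PL output is probably not registered, and no display variable setting will help until it is.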

A few comments on the TRD: the Zynq UltraScale+ MPSoC ZCU102 design might look complicated, but it becomes manageable if you break it down into the essential parts you need.  You are using HDMI Tx, so look at what is feeding it.  The input is the Video Mixer, which gets its input from 3 possible places. These are all memory interfaces, and one of them comes from the Zynq UltraScale+ MPSoC block, which should give you access to the output of the GPU, which is being written to DDR memory.  So you should be able to prototype the software side using the ZCU102 TRD and then, once you get it working, try to make a simpler hardware setup that includes the key IP in the video data path.

 

Another idea is to take a look at the HDMI FrameBuffer Example Design.  This is a somewhat simpler design, but again you could review the data path, probably do some software development there to see if you can accomplish what you want, and then simplify the design to something closer to what your end application will need.

 

 

Last, I'd also like to point out a few things about the DisplayPort live interface:

  • Using the DisplayPort Controller live interface is a somewhat complex solution with the current drivers.  As you pointed out, you will also need something connected to the DisplayPort for this to work.  The current Linux DisplayPort Controller driver always assumes that you are connecting to a monitor.  If you don't, it won't enable.  I'm not exactly sure what triggers this, but it is likely the HPD signal.
  • Just providing the EDID is probably not sufficient, as the EDID would be read via the AUX channel.  You might be able to "trick" the driver into thinking a monitor is connected, but that will probably require modifying the driver.
Chris
Versal ACAP: AI Engines | Embedded SW Support

0 Kudos
jdefields
Explorer
3,797 Views
Registered: ‎12-02-2014

@chrisar, what needs to happen to escalate this use case to the point where Xilinx will dedicate some engineering time to creating a true GPU-accelerated PL framebuffer solution?  I think there are enough requests in the forum to justify it at this point.

0 Kudos
tschesnok
Observer
1,913 Views
Registered: ‎07-23-2019

Wow.. how nice and helpful would that be.. but they can't even respond to this post after 2 full years. I really don't get it.. their hardware guys must have worked VERY hard to get that GPU working.. probably for some key customers.. but startups.. they get NO support. Somehow someone seems to not realize that startups are 50% of their potential future business...  I guess I'm just going to use a plain FPGA and a dedicated ARM.

0 Kudos
tschesnok
Observer
1,912 Views
Registered: ‎07-23-2019

All this thread tells me is that IT CAN NOT BE DONE. It is saying STAY AWAY from GPU programming on the UltraScale. Don't build products using this product, since it means a world of hurt.

I mean, even the people helping are not sure...

0 Kudos
1,776 Views
Registered: ‎05-18-2019

Hello,

Please, I would like to know whether you resolved this problem?

Thanks

0 Kudos
jrhtech
Voyager
1,726 Views
Registered: ‎10-04-2017

I got a version of this working by defining a fake display with a custom EDID (we have a very special display) and by modifying the GPU driver to DMA directly into the PL.  This really has a limited use case because you can't do offscreen rendering or even run multiple apps, since all the GPU output is sent directly to the PL.  Also, the GPU does scattered writes across the "DMA buffer", and we discovered that the DMA transfer gets slower the more triangles we try to draw.

Ultimately, I don't think passing the GPU output directly to the PL will work very well. Modifying the display driver to DMA to the PL at the point where it normally outputs to the display controller (like DP) is more likely to work.

 

jeff

0 Kudos
tschesnok
Observer
1,713 Views
Registered: ‎07-23-2019

Thanks for your feedback. That is what I was thinking: replace the DP logic. Is the DisplayPort logic driven by the GPU (feeding row by row), or does it pull from the buffer at its own pace and with its own addressing? Sorry, I'm new to this and trying to find out how best to invest my time to find a solution, if there is one at all. I'm working on a custom, non-linear / non-traditional display and need GPU buffer access. I would set this up as a traditional double-buffered display: one buffer written to by the GPU (next frame) and one being read by the PL (current frame).
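
The double-buffered arrangement described here can be modelled in a few lines. This is only a software sketch of the idea, not driver code: the class name PingPongBuffers is hypothetical, the two bytearrays stand in for two frame buffers in DDR, and the assumption that the swap happens once per completed frame (for example during vertical blanking) is mine, not something stated in the thread.

```python
class PingPongBuffers:
    """Two frame buffers: a writer fills the back one while a reader scans the front one."""

    def __init__(self, frame_bytes):
        self._buffers = [bytearray(frame_bytes), bytearray(frame_bytes)]
        self._front = 0  # index of the buffer the reader (the PL) may read

    @property
    def front(self):
        """Buffer holding the current frame, read by the PL."""
        return self._buffers[self._front]

    @property
    def back(self):
        """Buffer holding the next frame, written by the GPU."""
        return self._buffers[1 - self._front]

    def swap(self):
        """Exchange roles; call only after the writer has finished a frame."""
        self._front = 1 - self._front


if __name__ == "__main__":
    bufs = PingPongBuffers(frame_bytes=4)
    bufs.back[:] = b"\x01\x02\x03\x04"    # "GPU" renders the next frame
    assert bytes(bufs.front) == bytes(4)  # "PL" still sees the old frame
    bufs.swap()
    assert bytes(bufs.front) == b"\x01\x02\x03\x04"
```

The point of the model is that the reader never sees a half-written frame, provided the swap is the only moment the roles change.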

0 Kudos
jrhtech
Voyager
1,650 Views
Registered: ‎10-04-2017

I haven't done the work yet, so I can't tell you for sure.  However, the memory that the GPU DMAs into is always passed in as buffers (or addresses, I should say), and as far as I could tell it's memory, not a memory-mapped address.  My guess is that when the GPU has written all its data to a buffer, it notifies another layer/driver/whatever, and that piece of software deals with getting the data to the final destination.  For example, the DP pipeline provides a buffer, and when the GPU is done the DP pipeline handles the final DMA to the DP controller in HW.

Like I said, I haven't done the work, so this is just based on what I learned hacking the GPU driver to DMA directly into a PL BRAM.  That is just good enough for where we are right now, but long term it will have to change, especially because of the latency of the DMA into the PL.

 

jeff

0 Kudos
mrbietola
Scholar
1,566 Views
Registered: ‎05-31-2012

We used the live output of the DP to get graphics into the PL. But it is not plain graphics; it is already mixed with video from the DP live input.

We would like to have different, independent graphics. Is there a way to do this?

0 Kudos
tschesnok
Observer
1,498 Views
Registered: ‎07-23-2019

I had posted this idea elsewhere, but as I get up to speed with Zynq development I had the following idea that may work:

OpenGL is memory mapped (you get a handle). The only way to get buffer access is via glReadPixels. That will always work as a way to get at the final render target of the GPU, BUT making that call stalls the pipeline and creates a CPU memory copy, which is very, very slow. Some have tried this and got 15 fps, down from 50-60 fps, or worse.
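
To put a rough number on what a per-frame readback has to move, here is a small back-of-the-envelope calculation. The resolution (1920x1080), pixel format (4 bytes per pixel) and frame rate (60 fps) are assumptions chosen for illustration; none of them come from this thread, and the function name readback_bandwidth is made up.

```python
def readback_bandwidth(width, height, bytes_per_pixel, fps):
    """Return (bytes per frame, bytes per second) for a full-frame readback."""
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes, frame_bytes * fps


if __name__ == "__main__":
    # Assumed values: 1080p, RGBA8888, 60 frames per second.
    frame_bytes, bytes_per_second = readback_bandwidth(1920, 1080, 4, 60)
    print(f"{frame_bytes / 1e6:.1f} MB per frame")
    print(f"{bytes_per_second / 1e6:.1f} MB/s sustained")
```

Under those assumptions the copy alone is about 8.3 MB per frame, or roughly 498 MB/s, before counting the pipeline stall the call causes.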

I'm new to this, but I believe AXI memory access is "trusted" and can reach any of the DDR space when initiated from the PL. Why not upload a specific pattern after allocation and then search for that pattern? Once you find the pattern, you have the hard (physical) address of the OpenGL render target. Keep in mind that this target will be in GPU format (probably just byte aligned in some way, and RGBA may not be in the same order), but it should be accessible without the GPU / CPU having a clue. Moreover, it will probably be at the same address every time, or in a similar area, so finding it in the future would be very fast. It is clearly a hack, but one that should hold up.

There should be no performance hit other than the extra read overhead at the DDR controller. It would be very similar to the hit you get from attaching a DP display. 
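
The pattern search itself is easy to sketch in software. The example below scans a simulated memory image (a bytearray standing in for DDR) for a known marker. It assumes the render target starts on a page boundary (4096 bytes here) and is stored linearly and contiguously, neither of which is guaranteed for a real Mali render target. The function name find_marker and all sizes are hypothetical.

```python
import os


def find_marker(memory, marker, alignment=4096):
    """Return the lowest aligned offset where marker occurs in memory, or -1."""
    last_start = len(memory) - len(marker)
    for offset in range(0, last_start + 1, alignment):
        # Only aligned offsets are checked, on the assumption that the
        # buffer was allocated on a page boundary.
        if memory[offset:offset + len(marker)] == marker:
            return offset
    return -1


if __name__ == "__main__":
    marker = os.urandom(64)            # unlikely to occur by accident
    ddr = bytearray(1 << 20)           # 1 MiB stand-in for the DDR space
    target = 37 * 4096                 # where the "render target" lives
    ddr[target:target + len(marker)] = marker
    print(hex(find_marker(ddr, marker)))
```

A longer random marker reduces the chance of a false match, and checking only aligned offsets keeps the scan short. If the GPU writes the buffer in a scattered or tiled layout, a linear search like this would miss it.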

 

Any thoughts?

 

Still, would it not be nice if Xilinx posted a solution, since probably half of all UltraScale users need this? Who does not want to use the PL as part of the graphics pipeline?

0 Kudos
ksloatdesignlinx
Explorer
1,401 Views
Registered: ‎02-24-2020

Hi @tschesnok 

There is a Xilinx-supported way to pass GPU output to the PL, and most of it is straight from the TRD, as the original reply mentioned.

The key thing you need is a Linux DRM display device/pipeline that represents/controls your display PL. In the case of the TRD, they are using HDMI Tx in the PL as a display output. This would need to be replaced by your custom PL. If you are using a piece of completely custom IP, Xilinx actually has a base Linux DRM device driver called "pl-disp" that represents a DRM CRTC and plane. You may need to implement the additional DRM pieces (encoder/connector) to get the system to work fully, but once your device appears as a display, the Mali can render to it using any of the regularly supported methods (fbdev, X11, Wayland, etc.).

In addition, the TRD also mentions that DMABUF is supported by the GPU:
"The libMali user-space library implements the OpenGLES 2.0 API which is used by the Qt toolkit for hardware-accelerated graphics rendering. The Mali driver also supports DMABUF which provides a mechanism for sharing buffers between devices and frameworks through file descriptors without expensive memory copies (0-copy sharing)."

Since it's supported by both the GPU and the DRM framework, this could be another, lower-level method of doing the same thing (and some of the previously mentioned display servers/compositors might already be doing this under the hood).

Ken Sloat - Embedded Software Engineer
https://www.designlinxhs.com
0 Kudos
tschesnok
Observer
1,382 Views
Registered: ‎07-23-2019

Thanks @ksloatdesignlinx ,

This makes sense. I will dig into this. I'm worried that I'll have to learn more about Linux displays than I wanted to. I wish someone had a demo for this.

0 Kudos
ksloatdesignlinx
Explorer
1,374 Views
Registered: ‎02-24-2020

Yes, unfortunately that's probably true. Luckily, though, there are lots of good examples and documentation about the DRM system, and Xilinx has done a lot of the more difficult parts by creating most of the display pipeline for you. Here are some good resources on the subject:

https://events.static.linuxfound.org/sites/events/files/slides/brezillon-drm-kms.pdf

https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/18842520/Xilinx+DRM+KMS+driver

https://github.com/Xilinx/linux-xlnx/blob/master/Documentation/devicetree/bindings/display/xlnx/xlnx%2Cpl-disp.txt

Ken Sloat - Embedded Software Engineer
https://www.designlinxhs.com
0 Kudos