The Xilinx Zynq UltraScale+ MPSoC is good for many applications including embedded vision. It’s APU with two or four 64-bit ARM Cortex-A53 processors, Mali GPU, DisplayPort interface, and on-chip programmable logic (PL) give the Zynq UltraScale+ MPSoC plenty of processing power to address exciting applications such as ADAS and vision-guided robotics with relative ease. Further, we can use the device’s PL and its programmable I/O to interface with a range of vision and video standards including MIPI, LVDS, parallel, VoSPI, etc. When it comes to interfacing image sensors, the Zynq UltraScale+ MPSoC can handle just about anything you throw at it.
Once we’ve brought the image into the Zynq UltraScale+ MPSoC’s PL, we can implement an image-processing pipeline using existing IP cores from the Xilinx library or we can develop our own custom IP cores using Vivado HLS (high-level synthesis). However, for many applications we’ll need to move the images into the device’s PS (processing system) domain before we can apply exciting application-level algorithms such as decision making or use the Xilinx reVISION acceleration stack.
I thought I would kick off the fourth year of this blog with a look at how we can use VDMA instantiated in the Zynq MPSoC’s PL to transfer images from the PL to the PS-attached DDR Memory without processor intervention. You often need to make such high-speed background transfers in a variety of applications.
To do this we will use the following IP blocks:
Zynq MPSoC core – Configured to enable both a Full Power Domain (FPD) AXI HP Master and FPD HPC AXI Slave, along with providing at least one PL clock and reset to the PL fabric.
VDMA core – Configured for write only operations, No FSync option and with a Genlock Mode of master
Test Pattern Generator (TPG) – Configurable over the AXI Lite interface
AXI Interconnects – Implement the Master and Slave AXI networks
Once configured over its AXI Lite interface, the Test Pattern Generator outputs test patterns which are then transferred into the PS-attached DDR memory. We can demonstrate that this has been successful by examining the memory locations using SDK.
Enabling the FPD Master and Slave Interfaces
For this simple example, we’ll clock both the AXI networks at the same frequency, driven by PL_CLK_0 at 100MHz.
For a deployed system, an image sensor would replace the TPG as the image source and we would need to ensure that the VDMA input-channel clocks (Slave-to-Memory-Map and Memory-Map-to-Slave) were fast enough to support the required pixel and frame rate. For example, a sensor with a resolution of 1280 pixels by 1024 lines running at 60 frames per second would require a clock rate of at least 108MHz. We would need to adjust the clock frequency accordingly.
Block Diagram of the completed design
To aid visibility within this example, I have included three ILA modules, which are connected to the outputs of the Test Pattern Generator, AXI VDMA, and the Slave Memory Interconnect. Adding these modules enables the use of Vivado’s hardware manager to verify that the software has correctly configured the TPG and the VDMA to transfer the images.
With the Vivado design complete and built, creating the application software to configure the TPG and VDMA to generate images and move them from the PL to the PS is very straightforward. We use the AXIVDMA, V_TPG, Video Common APIs available under the BSP lib source directory to aid in creating the application. The software itself performs the following:
Initialize the TPG and the AXI VDMA for use in the software application
Configure the TPG to generate a test pattern configured as below
Set the Image Width to 1280, Image Height to 1080
Set the color space to YCRCB, 4:2:2 format
Set the TPG background pattern
Enable the TPG and set it for auto reloading
Configure the VDMA to write data into the PS memory
Set up the VDMA parameters using a variable of the type XAxiVdma_DmaSetup – remember the horizontal size and stride are measured in bytes not pixels.
Configure the VDMA with the setting defined above
Set the VDMA frame store location address in the PS DDR
Start VDMA transfer
The application will then start generating test frames, transferred from the TPG into the PS DDR memory. I disabled the caches for this example to ensure that the DDR memory is updated.
Examining the ILAs, you will see the TPG generating frames and the VDMA transferring the stream into memory mapped format:
TPG output, TUSER indicates start of frame while TLAST indicates end of line
VDMA Memory Mapped Output to the PS
Examining the frame store memory location within the PS DDR memory using SDK demonstrates that the pixel values are present.
Test Pattern Pixel Values within the PS DDR Memory
You can use the same approach in Vivado when creating software for a Zynq Z-7000 SoC iinstead of a Zynq UltraScale+ MPSoC by enabling the AXI GP master for the AXI Lite bus and AXI HP slave for the VDMA channel.
Should you be experiencing trouble with your VDMA based image processing chain, you might want to read this blog.