The typical embedded vision system must process video frames, extract features from those frames, and then make decisions based on the extracted features. Pixel-level tasks can require hundreds of operations per pixel, which adds up to hundreds of GOPS (giga operations/sec) when you're talking about HD or 4K2K video. Contrast that with the frame-based tasks, which "only" require millions of operations per second but involve more complex algorithms. You need a hardware implementation for the pixel-level tasks, while fast processors can handle the more complex frame-based tasks. That's how Mario Bergeron, a technical marketing engineer from Avnet, launched into his presentation at last week's Embedded Vision Summit 2015 in Santa Clara, California.
Bergeron then showed a slide of just such an application (face recognition and gaze tracking) as implemented using Xylon's logicBRICKS IP core library for Xilinx All Programmable devices, and he showed how the system distributed the application tasks between hardware and software, as shown below. One important factor to note: the hardware reduces the image data rate by 54x before sending it to the microprocessor.
One of the most interesting points Bergeron made during his talk was the wide range of interfaces that embedded vision designers must deal with. As an example, Bergeron showed a slide summarizing four image sensors in the PYTHON image-sensor family from ON Semiconductor. The sensor family includes devices with image sizes ranging from 640x480 pixels to 5120x5120 pixels, frame rates ranging from 80 to 840 frames/sec, and bandwidth requirements ranging from 2.88 to 19.84Gbps over multiple LVDS I/O pins. It would be extremely difficult to develop one piece of hardware that could handle the full breadth of this one image-sensor family, but it's easily handled by one Xilinx All Programmable device combined with ready-made IP cores from vendors such as Xylon and Auviz Systems.
Another tool that Bergeron specifically mentioned was the new Xilinx SDSoC Development Environment for Xilinx Zynq SoCs and MPSoCs, which offers a software-centric, system-optimizing compiler that accepts system-level descriptions written in C or C++ and generates both the software application and the hardware configuration needed to implement the system as described. The SDSoC Development Environment employs software compilers, HLS (high-level synthesis), and prebuilt hardware infrastructure to assemble such systems.
Later in the day, Bergeron demonstrated the above application at the Avnet booth at the expo connected with the Embedded Vision Summit. The demo showed the face-detection application running in real time on a Xilinx Zynq SoC on a MicroZed SOM (system on module) plugged into a special video carrier card. Together, the MicroZed SOM and the carrier card make up the Avnet Embedded Vision Development Kit, which can accept imagers from multiple vendors as discussed in this video: