One of the great things about Zynq and Zynq MPSoC devices and the MicroBlaze microprocessor is that we can run embedded Linux operating systems on them. This gives us the ability to easily work with networking and communications and to leverage high-level open source frameworks.
Of course, the great thing about using SoCs or FPGA-based processors is their highly flexible nature, which allows new peripherals to be added as needed. For embedded Linux solutions, this flexibility can be an issue, as the kernel needs to know the hardware it is running on, its configuration, and what peripherals are available.
DesignLinx and its customers have been early adopters of the Xilinx® SDAccel™ development environment for both cloud and on-premises applications, using it to target both Amazon AWS F1 and Xilinx Alveo™ data center accelerator cards with accelerated software. Along with SDSoC and the Xilinx SDK, the SDAccel flow is now part of the Vitis™ unified software platform in version 2019.2, allowing developers to use a single platform for all software tasks on Xilinx devices.
I must admit, I have not worked with the board that this series of blogs is named after since I created the Vitis Acceleration Platform. However, in the past week, I have had two clients reach out to me for help regarding controlling and communicating with custom IP they developed in the programmable logic.
This got me thinking that what would help them is a PYNQ image for the MicroZed board. This helps in several ways:
PYNQ comes with drivers for most PL peripherals and IP. As such, we can focus on the configuration and behavior of the IP.
The PL design will be undergoing changes in development, and PYNQ enables the new overlay to be uploaded and tested with ease.
Of course, to be able to provide this image, I first need to create a PYNQ image for the MicroZed 7020. Doing so is quite straightforward, and it is what I am going to demonstrate. To do this, we will need a virtual machine with the latest PYNQ repository cloned. You can see how to do this here.
Just a week ago, Xilinx announced the arrival of Radiation Tolerant (RT) Kintex UltraScale (XQRKU060), our newest addition to the Space-Grade (XQR) portfolio. The new device enables a broader array of functionality and delivers a significant increase in performance compared to our previous generations and other FPGA vendors. By adding an UltraScale device, we are skipping three process nodes from 65nm to 20nm. The product table below shows how RT Kintex® UltraScale™ stacks up to our previous XQR devices, Virtex®-4QV (V4QV) and Virtex-5QV (V5QV).
I do a lot of work for clients using the PYNQ framework on a wide range of different boards. I have noticed a few great things about it, especially when creating custom overlays, so I thought they would make for a good blog. A few weeks ago I wrote about the benefit of using PYNQ in development, so in this blog, we are going to examine a few more interesting aspects of PYNQ.
Understanding Overlays – We can determine what IP blocks are included in an overlay and which driver is being used by simply using the command print(<overlay>.__doc__). This is especially useful when we are working with custom overlays and we want to understand what they contain and how to use them.
One of the key elements of any embedded system is the ability to perform operations at specific intervals, such as reading a sensor or updating calculations. The best way to do this is to use a timer which triggers a periodic interrupt, indicating to the system that it is time to perform the required action.
In the Arm Cortex-M1 core we have been examining, this timer is called the system timer and is controlled by the SysTick registers. The SysTick timer uses a total of four registers.
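As a rough illustration of how a periodic tick is derived, the sketch below computes the value that would be loaded into the SysTick reload register for a desired interrupt rate. The 24-bit counter width and the "N minus 1" reload convention follow the Arm architecture documentation. The 100 MHz clock and 1 kHz tick rate are assumptions chosen for illustration, not values from this design, and the function name is hypothetical.

```python
# The four SysTick registers in the Armv6-M architecture are
# SYST_CSR (control/status), SYST_RVR (reload value),
# SYST_CVR (current value) and SYST_CALIB (calibration).

SYSTICK_MAX_RELOAD = (1 << 24) - 1  # the SysTick counter is 24 bits wide


def systick_reload(clock_hz, tick_hz):
    """Return the reload value that gives tick_hz interrupts per second."""
    if clock_hz <= 0 or tick_hz <= 0:
        raise ValueError("clock and tick rates must be positive")
    cycles_per_tick = clock_hz // tick_hz
    # The counter counts from the reload value down to zero,
    # so N clock cycles per tick needs a reload value of N - 1.
    reload = cycles_per_tick - 1
    if not 0 < reload <= SYSTICK_MAX_RELOAD:
        raise ValueError("tick rate not reachable with a 24-bit reload value")
    return reload


# Assumed example: 100 MHz processor clock, 1 ms tick.
print(systick_reload(100_000_000, 1_000))
```

The range check matters in practice: with a 24-bit counter, a 100 MHz clock cannot produce ticks slower than roughly six per second without a software divider.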
Xilinx is partnering with Monolithic Power Systems (MPS) to hold an informational webinar on getting up to speed with the new Zynq® UltraScale+™ RFSoC ZCU216 evaluation kit. This webinar reviews various topics from ground level knowledge of the Zynq UltraScale+ RFSoC device to getting started on the ZCU216 evaluation kit, to an ultra-low noise power solution from MPS, and much more.
Last week we examined how to implement an Arm Cortex-M1 processor core from scratch using the IP provided by the Arm DesignStart program. In this week’s blog, I am going to demonstrate how to configure the software element of the build.
Having used Vivado Design Suite to generate the bitstream containing the processor and its tightly coupled instruction and data memories, we now need to create a board support package and the actual application. Once these have been created, we will update the tightly coupled instruction and data memories in the bitstream with the new application.
One of the more popular online and in-person classes I present at conferences and in webinars discusses how to implement Arm Cortex-M1 and Cortex-M3 processors in Xilinx programmable logic devices.
In this class, we start with an existing reference design, learn about the tool flow, and implement an application based on the provided reference design. While providing an excellent introduction to working with the Arm Cortex-M1 and Cortex-M3 processors, it does not show how to create Arm Cortex-M1 and Cortex-M3 solutions from scratch.
Over the next few blogs, I am going to explain this beginning with how to implement an Arm Cortex-M1 processor within the programmable logic.
At DesignLinx Hardware Solutions, we use PetaLinux to create custom Linux images in support of our customers' custom Xilinx-based products. When I first heard about PetaLinux, I will admit it: I was skeptical. I come from an embedded Linux background and have done numerous projects involving pure Yocto/Bitbake/OE and integrating Linux within different SoC platforms. Yocto is a great way to create a custom embedded Linux distribution. From building everything from source to its super extensible interface, Yocto allows users to create a custom Linux distribution for their products.
The problem is that Yocto is hard. There is quite a steep learning curve that can make adopting it tough if not painful. Additionally, without a speedy build machine, full images can often take many hours to build (depending on the number of packages). When I finally tried to use PetaLinux, I was pleasantly surprised. It seemed to have many of the advantages of Yocto without the learning curve and build time.
Achieving higher resolution is a never-ending race for camera, TV, and display manufacturers. After the emergence of 4K ultra high definition (Ultra HD) imaging in the market, it became the main standard for today’s multimedia products. 4K consumers are everywhere, from live sports broadcasting to video conferencing on our mobile devices.
4K Ultra HD brings us bigger screens, which give the viewer an immersive experience. With this standard, the pixelation problem for big screens was solved. There are, however, many technical challenges in developing systems to process 4K Ultra HD resolution data. As an example, a 4K frame is 3840 x 2160 pixels (8.3 Mpixel) and is refreshed at 60 Hz, equating to about 500 Mpixel/sec. This requires a high-performance system to process 4K frames in real time. Another bottleneck is power consumption, particularly for embedded devices where power is critical. Being low power yet high performance, Xilinx® Zynq® UltraScale+™ MPSoC has shown a strong potential to tackle these challenges. In this blog, you’ll learn all you need to know to start developing a 4K video conferencing project using Zynq UltraScale+ MPSoC.
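The pixel-rate arithmetic for a 3840 x 2160 frame at 60 Hz can be checked with a few lines of Python. The frame size and refresh rate come from the text. The 3 bytes per pixel used for the bandwidth estimate is an assumption (uncompressed 8-bit RGB), and the function name is hypothetical.

```python
def pixel_rate(width, height, fps):
    """Return (pixels per frame, pixels per second) for a video format."""
    pixels_per_frame = width * height
    return pixels_per_frame, pixels_per_frame * fps


frame_pixels, pixels_per_second = pixel_rate(3840, 2160, 60)
print(f"{frame_pixels / 1e6:.1f} Mpixel per frame")    # 8.3 Mpixel per frame
print(f"{pixels_per_second / 1e6:.0f} Mpixel/s")       # 498 Mpixel/s

# Assumption for illustration only: uncompressed 8-bit RGB, 3 bytes per pixel.
BYTES_PER_PIXEL = 3
bytes_per_second = pixels_per_second * BYTES_PER_PIXEL
print(f"{bytes_per_second / 1e9:.2f} GB/s of raw pixel data")
```

Under that assumption, a single uncompressed stream needs roughly 1.5 GB/s of memory bandwidth before any processing is applied, which is why throughput and power dominate the design.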
High Level Synthesis is great for implementing algorithms. However, there are times as we develop our HLS IP that we need to think about how it interfaces with the rest of the system beyond the AXI Interfaces which are our main interfaces.
This can be challenging in HLS, as it often means we need to be able to wait on external signals or wait for several clock cycles.
In this blog we are going to look at how we can implement structure in our HLS algorithms.
The COVID-19 pandemic forces many FPGA designers to work from home while still facing challenging engineering deadlines. Taking boards or lab setups home may not be an option, particularly when a group needs to collaborate through shared hardware or devices under test (DUT). Having a physically distributed team use the hardware from a central location presents challenges such as usage administration, swapping SD cards, power cycling boards, and handling GPIOs, UARTs, etc.
Xilinx® Versal™ silicon architecture and software tools provide a way to drastically improve image quality, speed, and accuracy in medical ultrasound systems using advanced imaging techniques. This greatly improves ultrasound-based diagnostic ability in complicated procedures.
High-speed real-time data acquisition and processing form the fundamental part of all innovative design developments. Taking this into account, iWave Systems has successfully developed and demonstrated a high-speed analog data acquisition and processing system over the JESD204B serial interface on our Zynq® UltraScale+™ MPSoC Development Platform. The JESD204B interface offers seamless connectivity between the AD-FMCDAQ2-EBZ data converters and the Zynq UltraScale+ MPSoC platform, accelerating the development of analog-based designs.
Over the last few blogs (P1, P2, P3), we have looked in depth at High-Level Synthesis (HLS) and its use in image processing.
HLS provides real advantages for image processing as it allows us to focus on our algorithm. We can also achieve very high frame rates when working with HLS, with a little thought about the optimizations we apply.
A few weeks ago we looked at reading a line of data from DDR memory such that we could create a simple test pattern.
For many applications, however, we want the ability to inject a two-dimensional image into the image processing stream. This gives us the ability to test our image processing algorithms' performance using synthetic images.
In markets across the world, continuous demand for higher bandwidth scales beyond what today's technologies and form factors can support. The demand is for more efficient, pervasive compute that scales beyond what CPU and GPU technologies can match.
The Versal™ Premium series provides breakthrough heterogeneous integration, high-performance compute, connectivity, and security in an adaptable platform with a minimized power and area footprint. This highly integrated platform allows users to focus on their unique core competencies and novel algorithms, rather than designing connectivity and memory infrastructure, to achieve the earliest possible time to market.
One of the great things about image processing is we can layer video streams on top of each other. This gives us the ability to do picture in picture and the ability to overlay text and graphics on the screen.
When it comes to displaying information, typically the information we want to display will result from sensor data which has been gathered by a processor, for example temperature, pressure, altitude, and navigation information.
Displaying this information can be achieved in several different ways. If the processor is capable enough, it can read the sensors, process the information, and then create its own frame buffer in DDR memory, which can be applied as an overlay to the output image.
One of the great and often unknown elements of the Vivado Design Suite and Vitis Unified Software Platform is how easy it can be to create a complex application.
The SP701 board is designed for industrial applications, including image processing. To support this, it contains a MIPI CSI interface as well as MIPI DSI and HDMI outputs.
Creating image processing systems can be complex: you need to configure the input image capture stream, implement image recovery (e.g. demosaic to convert raw data to RGB pixels), set up frame buffers using VDMA, and finally create the image output path.
Architecting this image processing pipeline and configuring the settings can be time consuming.
However, if you have an SP701 board, there is a much faster way to get an image processing system up and running. Simply open the example design which is provided with the MIPI CSI-2 RX subsystem.
Xilinx® Alveo™ data center accelerator cards provide programmability, scalability, and performance across any server deployment. These products provide a low latency, power efficient solution that can be easily installed throughout a data center. Adaptable accelerator cards can be deployed to unlock dramatic throughput and latency improvements for demanding compute, network, and storage workloads. From machine learning inference, video transcoding, and data analytics to computational storage, electronic trading, and financial risk modeling, the Alveo cards bring programmability, flexibility, and high throughput while allowing low latency performance advantages to any server deployment.
Did Adam Taylor’s recent article make you curious about the new SP701 Evaluation Kit with a cost-effective and very capable Spartan®-7 XC7S100 FPGA on it? Then you probably noticed that there are two Ethernet ports and wondered what you can do with the board. Here’s an idea: Use it as an EtherCAT Slave!
Xilinx has released the world-class Zynq® UltraScale+™ RFSoC ZCU216 Evaluation Kit, specially built for system architects and RF designers. This revolutionary platform delivers the power of an adaptable radio platform in a power-efficient, high-performance development system with full software programmability.
We have looked at the Vitis embedded flow a couple of times in the blog; however, to date we have not created a real application using it.
As I talked about last week, I recently received a Spartan-7-based SP701 board and wanted to create a simple application that demonstrates some industrial and embedded vision use cases.
One very important use case in the industrial arena is interfacing with sensors to understand the equipment's environment. This enables prognostics to be implemented based on the environmental conditions, perhaps detecting failure or other deviations from expectations. Of course, this information can also be provided to a digital twin.
The popular Ultra96-V2 development board from Avnet is now available in an industrial temperature-grade version. The Ultra96-V2 I-grade single board computer (SBC) joins the Ultra96-V2 C-grade SBC, which features a Xilinx® Zynq® UltraScale+™ MPSoC ZU3EG device and supports a temperature range of 0°C to +60°C. The functionality of the I-grade SBC is identical to the Ultra96-V2 C-grade SBC, but offers a wider operating temperature range from –40°C to +85°C (–40°F to +185°F).
When I am not creating blogs or projects, I spend most of my time consulting with clients on FPGA and SoC design. I am lucky to be working on several exciting projects from autonomous driving to satellite payloads.
One thing that I notice I keep mentioning to clients using Zynq and Zynq MPSoC devices is how the PYNQ framework can help during commissioning and testing of the IP and overall system.
With PYNQ, we can load in the bit file and get started quickly and easily accessing the design using the Jupyter environment.
This means we can start testing the IP and overall application without the need in many cases to write a lot of embedded Linux drivers, thanks to the PYNQ Lib.
Let's take a look at a simple example for the Ultra96-V2 where we want to be able to access a BRAM in the PL.
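A minimal sketch of what such an access could look like is given below. It assumes the BRAM sits behind an AXI BRAM controller and is reached through PYNQ's MMIO class, using only its read and write calls. The base address, the address range, the helper function, and the FakeMMIO stand-in (included so the sketch also runs away from the board) are all hypothetical and not taken from the original design.

```python
# Hypothetical values: take the real ones from the Vivado address editor.
BRAM_BASE = 0xA0000000
BRAM_SIZE = 0x2000  # 8 KB


class FakeMMIO:
    """In-memory stand-in offering the read/write calls used below."""

    def __init__(self, base_addr, length):
        self.base_addr = base_addr
        self.length = length
        self._words = {}

    def _check(self, offset):
        if offset % 4 or not 0 <= offset < self.length:
            raise ValueError("offset must be word aligned and in range")

    def write(self, offset, data):
        self._check(offset)
        self._words[offset] = data & 0xFFFFFFFF

    def read(self, offset=0):
        self._check(offset)
        return self._words.get(offset, 0)


def write_then_read(mmio, values):
    """Write 32-bit words to consecutive addresses, then read them back."""
    for index, value in enumerate(values):
        mmio.write(index * 4, value)
    return [mmio.read(index * 4) for index in range(len(values))]


# On the board, once the overlay is loaded, the real object would be:
#   from pynq import MMIO
#   bram = MMIO(BRAM_BASE, BRAM_SIZE)
bram = FakeMMIO(BRAM_BASE, BRAM_SIZE)
readback = write_then_read(bram, [0xDEADBEEF, 0x12345678, 42])
print([hex(value) for value in readback])
```

Keeping the test routine independent of how the memory object is created means the same write and read-back check can be exercised on a desktop before moving to the hardware.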
Over the years we have looked a lot at embedded vision systems, as they form a cornerstone for many exciting applications, e.g. vision-guided robotics, autonomous vehicles, etc.
However, as I was creating a recent embedded vision project targeting the Genesys ZU platform, I noticed several of the key building blocks have been obsoleted and replaced with more capable IP blocks post 2019.1.