It’s interesting how often I seem to go through similar development projects with different clients. Within a short period of time recently, I had several clients who were looking at DDR3/4 implementations or struggling with their implementation. As it just so happens, I now have several clients who are doing interesting things with MicroBlaze.
These projects range from Triple Modular Redundant versions flying in space to implementing TinyML and machine learning to analyze sensors on an IoT board.
Back on 29th September 2013, I sat down and wrote what would become the first blog in this series. It was amazingly simple, looking at how to bring up the Zynq 7010 on the MicroZed. 400 weeks later, I find myself sitting down again, once more on a Sunday, to write the 400th installment.
It has been quite a journey, and when writing the first blog I never expected where it would lead and what it would eventually enable. However, over all installments the blog has been focused on one element: helping people to understand how to design with modern FPGAs, and in particular Xilinx FPGAs and SoCs.
Last week we examined several techniques for generating non-integer clock divisions in our FPGA if no PLL was available or we couldn’t use one for that development. This week, we are going to look at another of Peter’s circuits that provides clock switching between two clocks without a glitch on the output.
For most developments, we would use a BUFGCTRL or BUFGMUX, which provide glitch-less outputs on the clock. However, as one of the LinkedIn comments on last week’s post reminded me, it is good to understand the principles.
Modern FPGA devices really spoil us. They include PLLs, DCM, DSP etc. and a range of interfaces that significantly ease our developments. Recently though, I faced a situation where the PLL could not be used for safety reasons. This got me scratching my head a little about how I was going to generate a 20 MHz clock from a 50 MHz source without a PLL and also without changing the oscillator frequencies since the 50 MHz is supplied by another satellite payload in this case.
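To make the arithmetic concrete, here is a minimal behavioral sketch in Python, not the circuit from the post. Going from 50 MHz to 20 MHz means dividing by 2.5, so one output period spans five half-periods of the source clock. The function names are illustrative, and the model assumes the output is allowed to change on both edges of the source clock.

```python
from fractions import Fraction


def division_ratio(f_in_hz: int, f_out_hz: int) -> Fraction:
    """Exact ratio between source and target clock frequencies."""
    return Fraction(f_in_hz, f_out_hz)


def divided_clock(ratio: Fraction, n_half_periods: int) -> list[int]:
    """Behavioral model of the divided clock, one sample per source half-period.

    Assumption: the output may change on both source clock edges, so the
    output period must be a whole number of source half-periods.
    """
    half_periods = ratio * 2
    if half_periods.denominator != 1:
        raise ValueError("ratio is not a multiple of 0.5")
    period = int(half_periods)
    high = period // 2  # high time in half-periods; low takes the remainder
    return [1 if (i % period) < high else 0 for i in range(n_half_periods)]


ratio = division_ratio(50_000_000, 20_000_000)  # 5/2, i.e. divide by 2.5
wave = divided_clock(ratio, 10)                 # [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]
```

With a 10 ns source half-period, this output is high for 20 ns and low for 30 ns, so the 50 ns period gives 20 MHz but the duty cycle is 40% rather than 50%. Whether that is acceptable depends on what the clock drives.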
The Kintex UltraScale+ family is considered to offer the best price/performance/watt balance of the Xilinx FPGA devices built on TSMC 16nm FinFET technology. Combined with new UltraRAM and new interconnect optimization technology (SmartConnect), this device can deliver the most cost-effective solution for applications that require high-end capabilities such as GTY transceivers for 100Gbps networking and PCIe® Gen4 connectivity, especially networking and data storage applications.
This article demonstrates a 100Gb/s solution that implements the TCP Offload Engine networking IP core and the NVMe PCIe Gen4 SSD Host IP core on Xilinx’s KCU116 Evaluation Kit. It is a CPU-less solution for 12GB/s TCP transmission over a 100GbE interface, with the NVMeG4-IP core achieving incredibly fast performance of ~4GB/s per SSD.
Over the last two months, I have had several clients approach me for help regarding DDR3 / DDR3L interfaces that they have connected to Xilinx FPGAs. These projects ranged from just starting a design to having a prototype / early production board and realizing that the DDR3 doesn’t function correctly. All these projects used traditional FPGAs from the UltraScale and 7 Series device families.
Last year, Xilinx added a one-of-a-kind speed-demon to our production-proven Virtex® UltraScale+™ HBM FPGA family, the VU57P, which integrates 16GB of HBM DRAM with 58G PAM4 transceivers, our high-performance FPGA fabric, and high-speed connectivity hard IP. Production versions of the VU57P devices are shipping now.
As a consultant, correctly estimating the time it will take to develop an FPGA or module can mean the difference between making a profit or a loss. If a development takes longer than I expected, it has a double impact. It not only delays payment but also means I can’t start on the next project.
Apparently, I am not alone in this challenge. Respondents to the AspenCore embedded survey indicated that meeting project deadlines was their major challenge.
To aid in this process, I have an established system for estimating the timescale, complexity, and cost of the development that I am being asked to quote. Some of the high-level things I consider are:
One aspect of FPGA design that we haven’t really examined is multi-gigabit transceivers (MGT). These transceivers are available in many Xilinx devices including ones in the 7 Series, UltraScale, UltraScale+ and Versal families.
I first worked with gigabit transceivers back in 2006 on an image processing application using Virtex-II Pro FPGAs. Incidentally it was also my first SoC design because we also used the PowerPC core. Back then, we thought 3.125 Gbps was lightning fast.
Of course, modern FPGAs provide us with a range of bandwidths across several types of gigabit transceiver.
Managers of Xilinx-based programs are under pressure to produce quality results on time and on budget. They don’t have time to reinvent the wheel or deal with avoidable problems. Recognizing and mitigating risk, knowing how to accelerate schedules, and handling project issues and additional costs are all facets that can make or break a project, especially with challenging designs.
When managers have a working knowledge of the range of Xilinx devices, hardware & software development tools, and emerging applications, they can help their team make better decisions. From device selection to understanding the challenges engineers face, developing a working knowledge of Xilinx devices and tools can help managers accelerate projects and take the mystery out of FPGAs and Adaptive SoCs.
One of the big advantages of the Kria SoM is the ease with which new applications can be created, either around the existing applications or with a new AI algorithm dropped in. Knowing how to do both is critical to getting the most out of the Kria SoM.
In our previous blog, we looked at setting up the Kria and running the smart camera application from the command line.
There is, however, another way to run the smart camera application. By using Jupyter Notebook labs, we can connect to the Kria SoM over an Ethernet network and create applications that use both existing and new AI networks.
One of the most common requests I get via my consultancy is about the RFSoC and how its capabilities can be leveraged. If you haven’t used the device before, its capabilities can appear a little daunting, especially when it comes to the configuration of the RF ADC / DAC and associated circuitry.
I’ve said several times that I’m a big fan of the PYNQ framework for rapid prototyping and accelerating production designs. Indeed, we have even previously touched on the PYNQ overlays available from the University of Strathclyde.
If you missed the announcements and my recent blog, Xilinx now offers a range of production-ready system-on-modules (SOMs) named Kria. These SOMs contain a custom UltraScale+ FPGA and are designed to enable you to hit the ground running on day one and take the same elements into production. This is one of the key benefits of using SOMs.
In April, Xilinx hit an exciting milestone as we announced full production shipments for the Versal™ AI Core series and Versal Prime series. For the Prime series, this means the first of the VM1xxx production devices are available; VM1xxx devices feature PCIe® Gen 4 support and 32G GTY transceivers along with the programmable network on chip (NoC) and other software-programmable silicon infrastructure common to all Versal ACAPs. With a wide range of device sizes, VM1xxx devices are ideal for inline storage acceleration, satellite and other avionics control applications, and more.
Big ideas don’t usually require a lot of words. In my experience, the more words you use, the more specific, and smaller, an idea becomes. When we started developing Kria™ system-on-modules (SOMs), we rallied behind the audacious concept of “Xilinx benefits without FPGA design.” What a rabbit-hole that led us down!
A couple of weeks ago, I was working to bring up the sensors connected to the I2C and SPI interfaces on the custom SensorsThink Smart Sensor IoT Board that Dan Binnun and I have designed. Connected to the I2C networks are several sensors, including accelerometers, magnetic sensors, linear accelerometers and temperature sensors.
When I ran the I2CDetect commands on the four I2C buses, I saw the expected responses on I2C-0, I2C-2 and I2C-3. However, I2C-1 did not return the expected response, which shows the addresses of any sensors connected to the network.
Not too long ago, the path from algorithm to fielded machine learning (ML) model was a long and complex undertaking. Those with access to an ML expert with a working knowledge of neural network deployment might have had options, but they were time-consuming to develop.
I use PYNQ a lot and one of the things I have been meaning to blog about is how it interacts with hardware.
Whether it’s a custom overlay or one from the community, PYNQ is great at abstracting away the challenges of working with the hardware in the overlay. Often, before we can start the main functionality of the overlay (which normally uses DMA), we need to configure the IP blocks in the overlay for the specific application at hand.
The simplest way to do this is to use the read or write methods provided by the default IP class, which wraps the PYNQ Memory Mapped IO (MMIO) class to support the selected driver.
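As a rough sketch of what that register-level configuration looks like, the Python below uses a small stand-in class so it can run without a board. `FakeIP`, the register offsets and `configure_ip` are all hypothetical and not part of PYNQ. On real hardware, the `ip` object would instead come from a loaded overlay, and its `read(offset)` and `write(offset, value)` calls would reach the IP block's memory-mapped registers.

```python
class FakeIP:
    """Hypothetical stand-in for a PYNQ IP driver object (no hardware needed).

    It mimics only the read(offset) / write(offset, value) interface.
    """

    def __init__(self, length: int = 0x1000) -> None:
        self.length = length
        self._regs: dict[int, int] = {}

    def _check(self, offset: int) -> None:
        # Registers are modeled as 32-bit words on 4-byte boundaries.
        if offset % 4 != 0 or not 0 <= offset < self.length:
            raise ValueError(f"invalid register offset: {offset:#x}")

    def write(self, offset: int, value: int) -> None:
        self._check(offset)
        self._regs[offset] = value & 0xFFFFFFFF

    def read(self, offset: int = 0) -> int:
        self._check(offset)
        return self._regs.get(offset, 0)


# Hypothetical register map; a real IP block documents its own offsets.
CTRL_REG = 0x00
THRESHOLD_REG = 0x10


def configure_ip(ip, threshold: int) -> int:
    """Set an application parameter, enable the block, and read back control."""
    ip.write(THRESHOLD_REG, threshold)
    ip.write(CTRL_REG, 1)
    return ip.read(CTRL_REG)


ip = FakeIP()
status = configure_ip(ip, 42)
```

Keeping the configuration in a function that only needs `read` and `write` means the same code can be exercised against a stand-in first and then handed the real IP object from the overlay.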
Next-generation network security implementation is under constant evolution and is going through an architectural shift from lookaside implementation to inline implementation. With the beginning of 5G deployments and a multifold increase in the number of connected devices, the architecture for security implementation needs to be revisited and modified. 5G throughput and latency requirements are changing the access networks while also requiring extra security.
I am a big fan of the system-on-module concept. This blog even inherited its name from the first SoM I started writing about back in 2013, the MicroZed. Used correctly, SoMs can really help reduce design risk and accelerate development, as the SoM is available from day one. This makes the Xilinx news from today even more exciting. Earlier today, Xilinx introduced their own portfolio of production-ready SoMs called Kria. Available for order are the Kria K26 SoM, targeted at Vision AI applications, and the Kria KV260 Vision AI Starter Kit, good for evaluation.
I recently hosted two webinars for Crowd Supply on how to create a fun, breakout Pong-type game on the Basys3 board. The class was not intended for FPGA experts; it was meant to be an introduction to the Basys3 board and the Xilinx ecosystem, and therefore included several different design aspects.
The sensors on the smart sensor IoT development board are connected to the programmable logic element of the Zynq-7020 device that is fitted on the board. These sensors are connected as shown below, using either an I2C or SPI interface, as is common for embedded sensors.
Did you catch the recent launch of our new Artix® UltraScale+™ FPGA and Zynq® UltraScale+ MPSoC products? These new devices represent a new level of compute density perfect for compact edge devices. And with new InFO packaging from TSMC, they offer up to 60% smaller and 70% thinner options compared to existing products.
The smart sensor IOT board we developed contains several interfaces that really require the use of an operating system to get them up and running. Along with the WiFi and Ethernet interfaces, there is also the USB interface. The USB interface on the smart sensor IOT board has been designed so that it can be configured to support USB-OTG or as a USB host.
In this blog, we are going to create a PetaLinux system that supports a USB host to prove that the USB and Ethernet interfaces are working correctly.
To get started, we first need to create a new Vivado project that contains the Zynq processing system configured for the custom Zynq board. Rather than having to enter this configuration from scratch, we can use the TCL script generated in my first blog that featured the smart sensor IOT board.
In a previous installment, we built the Vivado element of an image processing system to be deployed on a Trenz ZynqBerry Zero module. I like this module because it enables a small form factor while also providing HDMI and MIPI interfaces. Of course, as you can see from the previous implementation, the device is fairly full but there is still room to sneak in additional functionality if we wanted to do some image processing.
In a previous installment, we looked at the Trenz ZynqBerry Zero module, which implements a Zynq-7010 device in the same form factor as the popular Raspberry Pi Zero. This board also provides USB OTG, UART / JTAG over USB, MIPI, HDMI and an SD Card along with several GPIO.
As the board is quite tiny, I wanted to create a simple image processing system using the MIPI input. What is interesting about the design on the ZynqBerry Zero is that unlike the Zynq MPSoC, which has native support for the MIPI D-PHY in the IO, the Zynq-7000 SoC does not have native MIPI D-PHY support. As a result, board designers have a decision to make about how to implement the MIPI interface. An external MIPI D-PHY can be used to provide full MIPI compliance, or alternatively, a MIPI compatible approach can be used which provides a low-cost solution. Xilinx application note 894 provides a range of information on the MIPI interfacing solutions.
The Xilinx® Virtex® UltraScale+™ VU19P FPGA provides the highest logic density and I/O count on a single device ever built in 16nm FPGA, enabling emulation and prototyping, as well as test, measurement, compute, networking, aerospace, and defense-related applications. Your company just decided to use it in your next product. That’s great news for your company and your future. This giant will muscle in on some new market share and provide flexible upgrade paths for the next five years!
Last week, we looked at how we could create a custom Vivado configuration for a Zynq PS on a custom designed board. The fun begins once we have created that configuration and built the BIT file because we need to ensure the configuration is correct and the software is executable on the processing system.
The approach we took here was remarkably simple. The first thing I wanted to do was to run a simple “Hello, World” application. At this point, we are still unsure whether the DDR is going to work because of either the Vivado PS DDR configuration or signal integrity and layout.
Of course, the former can be corrected but the latter is more of a fundamental issue. However, my co-author Dan Binnun spent considerable time on the analysis of the DDR during the design.
The Ultra96-V2 is an Arm®-based, Xilinx Zynq® UltraScale+™ MPSoC single board computer based on the Linaro 96Boards Consumer Edition (CE) specification. Ultra96-V2 provides flexible connectors and peripheral expansion options to enable engineers to develop a range of applications, from AI/ML to robotics to Bluetooth demonstration platforms. The Ultra96-V2 is an entry-level Zynq UltraScale+ MPSoC development environment for any engineers who want to prototype and build a proof-of-concept platform. Additionally, Ultra96-V2 is available in both commercial and industrial temperature grade options.
You may have seen my posts referencing the book I’ve been writing over the last two years with Dan Binnun and Saket Srivastava on how to design embedded systems. When we set out on this journey, we decided we wouldn’t just write about how to do it, we would also design and implement the board at the same time.
The book will be out later this year and will cover the engineering life cycle, from requirements through component selection, schematics and layout to FPGA application development and reliability calculations. Of course, we will be making all the design information available, and there will be a dedicated site with significant content related to the board.
The board we designed is intended to be used for IOT smart sensor applications. It’s not a development board, but a board designed to real requirements, for a real application. So, let’s take a look at what we have on the board.