Jonathan Blanchard is a lead embedded software developer for real-time kernels at Micrium, the company responsible for the µC/OS RTOS that runs on a variety of processors including the dual-core ARM Cortex-A9 processor in the Xilinx Zynq-7000 SoC. Blanchard has just posted a step-by-step blog about getting Micrium’s µC/OS up and running on the low-cost, Zynq-based snickerdoodle board from krtkl. (For more information about the snickerdoodle, see “$55, Zynq-based, WiFi-enabled Snickerdoodle makes MAKE:’s Maker’s Guide to Boards.”)
Zynq-based snickerdoodle board from krtkl
Blanchard’s blog describes how he installed the µC/OS RTOS on the snickerdoodle and how he downloaded the appropriate BSP for the Xilinx SDK under Vivado 2016.2.
His blog ends happily: “And indeed, the snickerdoodle said hello.”
NGCodec is now offering its H.265/HEVC video codec through the Origami Ecosystem launched earlier this year by Image Matters (see “Image Matters launches Origami Ecosystem for developing advanced 4K/8K video apps using the FPGA-based Origami module”). NGCodec says that its H.265/HEVC video encoder is optimized for low latency and low bit rates and can support two 1080p30 channels or one 1080p60 channel on the Origami B20 board, which is based on a Xilinx Kintex UltraScale KU060 FPGA. (For more information on the Origami B20 board, see “Image Matters, inrevium America, and Tokyo Electron Device team up on agile FPGA platform for fast design of advanced 4K/8K video systems.”) NGCodec plans to sell the Origami B20 board pre-loaded with its H.265/HEVC codec IP in 4Q2016.
For more information about NGCodec’s H.265/HEVC video encoder, see “NGCodec demos HEVC H.265 Real-Time Encoder running on Kintex-7 FPGA at NAB 2015” and “NGCodec demos real-time HEVC/H.265 hardware encoder at NAB 2016 – new demo captured on video.”
And here’s the video demo of the NGCodec H.265/HEVC video encoder on Xilinx KCU105 dev board (based on a Kintex UltraScale KU040 FPGA) earlier this year at NAB 2016:
Here’s a 3-minute video by Andrew Powell showing you something pretty slick and useful—that you can connect a $13 SainSMART I2C 20-character by 4-line display to a Xilinx Zynq-7000 SoC on a Digilent ZYBO trainer board through a PMOD connector with no glue, graft Arduino LCD drivers to PetaLinux, and get “Hello World” up on the LCD. Easy Peasy.
Warning: Someone needs to introduce Mr. Powell to the concept of a camera tripod, but the video makes its point nevertheless.
Powell’s code is on github.
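For a sense of what those LCD drivers actually do at the wire level: these 20x4 character displays typically put an HD44780 controller behind a PCF8574 I2C expander, so each character crosses the bus as two 4-bit nibbles framed by an enable-pin pulse. Here's a rough Python sketch of that sequence—not Powell's code, and the pin mapping (P0=RS, P2=EN, P3=backlight, data on P4–P7) and address 0x27 are common backpack conventions that you should verify against your own module:

```python
# Hypothetical sketch of HD44780-behind-PCF8574 byte sequencing.
# Pin mapping and I2C address are assumptions -- check your backpack.
RS = 0x01   # register select: 1 = character data, 0 = command
EN = 0x04   # enable strobe: the LCD latches on EN's falling edge
BL = 0x08   # backlight on

def nibble_writes(byte, data=True):
    """Return the PCF8574 byte sequence that transfers one byte in 4-bit mode."""
    flags = BL | (RS if data else 0)
    seq = []
    for nibble in (byte & 0xF0, (byte << 4) & 0xF0):  # high nibble first
        seq.append(nibble | flags | EN)  # present the nibble with EN high
        seq.append(nibble | flags)       # drop EN to latch it
    return seq

def text_writes(text):
    """Byte sequence for a whole string of character data."""
    return [b for ch in text for b in nibble_writes(ord(ch))]

if __name__ == "__main__":
    # On the Zynq target you would push these bytes out over /dev/i2c-N,
    # e.g. with the smbus2 package (assumed installed):
    #   bus = smbus2.SMBus(1)
    #   for b in text_writes("Hello World"):
    #       bus.write_byte(0x27, b)
    print([hex(b) for b in nibble_writes(ord("H"))])
```

The real drivers also perform the initial mode-setting command sequence and inter-write delays, which this sketch omits.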
You can use Keysight Technologies’ U5340A FPGA Development Kit for High-Speed Digitizers to develop custom algorithms for high-speed Keysight digitizers with 8- to 12-bit resolution and sampling rates ranging from 1 to 4 Gsamples/sec. Keysight’s Giovanni Lucia has just published an article titled “Embedding custom real-time processing in a multi-gigasample high-speed digitizer” on Embedded.com that describes why you would want to embed custom processing algorithms directly in a high-speed digitizer and how to go about doing it.
Lucia explains that custom processing algorithms running on an on-board FPGA speed algorithmic execution and reduce the amount of data you’ll need to extract from the digitizer, cutting I/O and storage loads. A surprising revelation is that you can also insert encryption algorithms into the FPGA, which improves data security because unencrypted data never leaves the digitizer. Development projects targeting rapidly changing markets such as 5G may well benefit from such security measures.
The article also enumerates the minimum features you should expect from an FPGA development kit designed for digitizers:
Keysight’s U5340A FPGA Development Kit is built on Mentor Graphics’ HDL Designer and ModelSim along with Xilinx development tools and LogiCORE IP. The kit is compatible with several Keysight digitizers including:
All of these high-speed Keysight digitizers have Xilinx Virtex-6 FPGAs on board for signal processing, both Keysight-written and custom.
Note: For more information about the Keysight U5340A FPGA Development Kit, see “Keysight ups PCIe and AXIe game—lets you add signal processing to the FPGAs in its high-speed digitizers.”
By Adam Taylor
Over the last several blogs, we have looked at how we build a Linux system using both the RAM Disk and file system approaches. The most recent blog culminated with the ZedBoard functioning as a single board computer.
Over the coming weeks, I want to explore embedded vision. To do this we are going to be using the following for vision applications:
Zynq SBC running simple OpenCV demo
An increasing number of embedded applications use vision, ranging from simple security and monitoring systems to robotics, driver awareness, and medical imaging. We must also remember that embedded vision can cover a wider section of the electromagnetic spectrum, from ultraviolet, which is used in scientific imaging, to infrared, commonly used for night vision, security and safety, and thermography.
I think it’s sensible to dedicate a number of blogs to embedded vision topics to see the different approaches, the challenges, and how we can overcome the challenges.
No matter the area of the visual spectrum we are working in, in a typical embedded vision system we will want to:
The beauty of the Zynq-7000 SoC is that we can perform image-processing operations within the PS (processor system) or the PL (programmable logic), and we can use tools we have looked at previously (such as SDSoC) to accelerate PS performance. Or we can use HLS (High-Level Synthesis) to generate RTL modules used in the image-processing algorithm.
When we develop image-processing applications we normally use HLLs (high-level languages) and libraries of algorithms to save time. One such library is OpenCV, which provides a number of C++ algorithms for real-time computer-vision applications.
We can use OpenCV on Microsoft Windows or Linux machines. This means we can develop our algorithm on a development machine and then cross compile it to run on the Zynq SoC’s PS under Linux. Even more exciting, Xilinx Vivado HLS supports OpenCV. We can create AXI Streaming IP modules and drop them into the Zynq SoC’s PL within the image-processing chain. Now that is cool.
First, we’ll look at how we can use Vivado HLS and OpenCV on the ZedBoard SBC that we created last week. To use OpenCV on the Zynq SoC, we need to install both the include files and the libraries it uses. We can then develop the OpenCV code.
The first step for the Zynq SBC is to open a terminal window and download OpenCV. There are a number of ways you can do this, including building it from scratch from the source; however I opted for the simplest method and used a single command. In my defense, I am time limited when I write these blogs. You may be under a similar time crunch, so my approach has broad appeal.
I used this command to load OpenCV on the Zynq SBC:
sudo apt-get install libopencv-dev
Once the OpenCV files were loaded, I was ready to write my first OpenCV application, which opens and displays a specific file. You can find this on my github page. (See below for a link.)
When it comes to compiling the code we can use the built-in GCC compiler using the command line:
g++ `pkg-config --cflags opencv` <filename.cpp> `pkg-config --libs opencv` -o <output name>
When I ran this, I got the image appearing above.
So far, we have assumed that we wanted to develop the embedded-vision application on the Zynq SBC itself. However, the Xilinx SDK also comes with the OpenCV libraries, so if we wish we can develop our application using SDK on a workstation host and then upload the file to our SBC or to another Zynq implementation. To do this, though, we need to make SDK aware of the include directory and the library locations. These settings appear in the screenshots below for my installation of Xilinx SDK:
Setting Include Directory in SDK
Setting Libraries in Xilinx SDK
Now we have pipe-cleaned the process. Next week, we’ll look a little more at different image-processing applications using this SBC and the many different ways we can implement them.
The code is available on Github as always.
If you want ebook or hardback versions of previous MicroZed Chronicles blogs, you can get them below.
The Ethernet Alliance and NBASE-T Alliance have announced a collaborative effort to accelerate mainstream deployment of 2.5GBASE-T and 5GBASE-T Ethernet, which leverages the last 13 years of infrastructure construction based on Cat5e and Cat6 cabling (some 70 billion meters of cable!). The two organizations plan to validate multi-vendor interoperability at a plugfest scheduled for the week of October 10, 2016 at the University of New Hampshire InterOperability Laboratory (UNH-IOL) in Durham, NH. For more information on the Ethernet Alliance/NBASE-T Alliance plugfest, please contact email@example.com or firstname.lastname@example.org.
Note: Xilinx is a founding member of the NBASE-T Alliance. (See “NBASE-T aims to boost data center bandwidth and throughput by 5x with existing Cat 5e/6 cable infrastructure” and “12 more companies join NBASE-T alliance for 2.5 and 5Gbps Ethernet standards.”
For additional information about the PHY technology behind NBASE-T, see “Boost data center bandwidth by 5x over Cat 5e and 6 cabling. Ask your doctor if Aquantia’s AQrate is right for you” and “Teeny, tiny, 2nd-generation 1- and 4-port PHYs do 5 and 2.5GBASE-T Ethernet over 100m (and why that’s important).”)
I’m not sure that Inrevium named its ACDC Quattro Kintex UltraScale Development Platform after the Australian rock and roll band AC/DC, but the specs on the board certainly do recall the band’s albums High Voltage and Powerage. This is one high-powered dev board and much of that power comes from the Xilinx Kintex UltraScale KU115 FPGA, the largest member of the Kintex UltraScale FPGA family with 1.451 million system logic cells. The Kintex UltraScale KU115 is also a DSP monster with 5520 DSP48E2 slices on chip. (See “The UltraScale DSP48E2: More DSP in every slice.”) The FPGA is flanked by 4Gbytes of DDR4-2400 SDRAM and enough I/O to do what needs to be done. I/O ports on the board include one SFP+ optical cage, four FMC connectors, and 20 SMA connectors for various clocks. The board has already been proven to work with Inrevium HDMI4K, 12G-SDI, DP1.2, V-by-One, MIPI, and Zynq FMC cards.
Here’s a block diagram of the Inrevium ACDC Quattro Kintex UltraScale Development Platform:
Block diagram of the Inrevium ACDC Quattro Kintex UltraScale Development Platform
And here’s a photo of the board:
Inrevium ACDC Quattro Kintex UltraScale Development Platform
Please contact Inrevium for more information about the ACDC Quattro Kintex UltraScale Development Platform.
While browsing the Digilent Web site, I spotted a new blog posted by Quinn Sullivan titled “Embrace Your Geekness!” that featured cookies—not the HTML kind but the eating kind. In the same spirit as the Zynq UltraScale+ MPSoC cookie discussed in one of Xcell Daily’s posts last year (see “First Xilinx Zynq UltraScale+ MPSoC Cookie Ships”), Sullivan’s blog post shows four cookies made to look like four Digilent products, all based on Xilinx All Programmable devices:
Digilent’s All Programmable Cookies
From left to right, these Digilent products are:
The cookies look so good, it’s a shame to eat them. I don’t suppose they’ll power up.
Several low-cost Digilent dev boards based on Xilinx All Programmable devices have Digilent PMOD expansion connectors on them including:
Digilent ARTY Board with four PMOD connectors along the top
“PMOD” is a contraction of “Peripheral Modules” and Digilent offers a boatload of these small PMODs to get your prototype up and running quickly. Currently, Digilent is running a “Summer Sale” on PMODs “now through the end of Summer.” I counted 68 PMODs on sale—way too many to list here. However, I’ll list a few of my favorites with the list and sale prices:
This is your chance to stock up on some peripherals at pretty low prices.
Opsero Electronic Design’s FPGA Drive cleverly allows you to connect an M.2 NVMe SSD to an FPGA dev board using a PCIe or FMC connection. A picture is worth 100 words in a blog post:
Opsero Electronic Design’s FPGA Drive connects an M.2 SSD to your FPGA/SoC dev board
The image on the left shows an FPGA Drive board with the PCIe form factor plugged into a Xilinx KC705 Kintex-7 FPGA Eval Kit and the image on the right shows an FPGA Drive board with an FMC connector plugged onto an Avnet PicoZed SOM based on a Xilinx Zynq Z-7030 SoC. A standard M.2 NVMe SSD will plug into either FPGA Drive board.
However, a hardware connection alone is not sufficient to make an SSD work within a system. You need drivers and a file system. For information on that software, turn to Jeff Johnson’s FPGAdeveloper blog titled “Measuring the speed of an NVMe PCIe SSD in PetaLinux” where Johnson tests the FPGA Drive’s performance using Xilinx PetaLinux on the KC705 board—where it runs on a Xilinx MicroBlaze processor—and the Avnet PicoZed SOM—where it runs on the Zynq SoC’s dual-core ARM Cortex-A9 MPCore processor. (Johnson is an electronic design consultant and Opsero is his design services company.)
Here’s a pleasant surprise: About two years ago, Andy Brown purchased 40 obsolete but unused, waffle-packed Xilinx Virtex-E FPGAs (introduced in 1999, see page 5 in Xcell journal, Issue 34) on eBay for the bargain price of less than £3 each and decided to build an FPGA development board around the device. (Note: Please do not consider eBay as an authorized Xilinx distributor based on this Xcell Daily blog post. They’re not, at least not in the present space-time continuum.)
Now practically speaking, the Virtex-E XCV600E device that Brown uses in this tutorial has been obsolete for many years—so obsolete that you will have a hard time locating the data sheet on the Xilinx Web site (it’s easier using Google) and you’ll need a really old version of the Xilinx ISE Design Suite—ISE Version 10.1 was the last to support Virtex-E devices—and a PC or virtual PC running Microsoft Windows XP to create designs for these historic parts.
The result of Brown’s efforts, and the reason for this blog post, is a really nice, basic tutorial on FPGA design with a jam-packed, 24-minute video based on his experience designing this dev board.
Brown’s lightning-fast tutorial covers:
And it’s all explained in Brown’s exceptionally easy-to-understand, matter-of-fact manner.
Andy Brown’s FPGA Development Board based on an obsolete Xilinx Virtex-E FPGA
(Photo from Andy’s Workshop)
And now, here for your entertainment and amusement is a Virtex-E FPGA marketing video from the year 2000, complete with some exciting disco music and an FPGA memory hierarchy that has stood the test of time for nearly two decades, until Xilinx introduced its 16nm UltraScale+ device families (keep reading below the video for more details):
For comparison purposes, the largest Xilinx Virtex-E device circa 1999, the XCV3200E with 73,008 logic cells and 1,038,336 bits of BRAM, is functionally smaller than a contemporary, mid-sized, 28nm Xilinx Artix-7 A75T FPGA with 75,520 logic cells and 3,780,000 bits of BRAM. In addition, the Artix-7 FPGA has some very important programmable features that the Xilinx Virtex-E FPGA did not, like its 180 DSP48E1 slices and eight 6.6Gbps SerDes transceivers.
As amusing as this comparison of old versus new might be, the Virtex-E FPGA family was still quite capable for its day. For example, here’s a 16-year-old paper titled “20-GFLOPS QR processor on a Xilinx Virtex-E FPGA,” written by Richard L. Walke, Robert W. M. Smith, and Gaye Lightbody of the Defence Evaluation and Research Agency in the UK and published in the year 2000 in the SPIE Proceedings. The FPGA used for the work cited in this paper was a Xilinx Virtex-E XCV3200E.
Xilinx Virtex-E FPGAs were the “bigger, better, faster” versions of the original, groundbreaking Xilinx Virtex parts, made possible by jumping from a 0.22μm IC process technology to a 0.18μm IC process technology. These first Virtex and Virtex-E devices introduced BRAMs (block RAMs) with true dual-port capabilities. Much improved BRAMs are still available today on Xilinx All Programmable devices and they are now augmented with UltraRAMs in the latest Xilinx UltraScale+ FPGAs and Zynq UltraScale+ MPSoCs. For the first time in 17 years, UltraRAMs introduce a new level in the All Programmable device memory hierarchy. (For more information about UltraRAM, see “UltraRAM: a new tool in the memory hierarchy you’ll want because it fits so well into your system designs.”)
Brown had his own very valid educational reasons for designing his dev board using the Xilinx Virtex-E parts that he purchased through eBay. You may well want that design experience or you may want to start learning to use modern FPGAs more quickly. The Virtex-E XCV600E FPGA on Andy Brown’s dev board actually has fewer on-chip programmable-logic resources than today’s smallest Artix-7 FPGA, the A15T. So if you want to dive right into FPGA exploration, you might want to consider getting the $99 Digilent ARTY board based on a Xilinx Artix-7 A35T FPGA or the $189 Digilent ZYBO trainer board based on a Xilinx Zynq Z-7010 SoC. Both the Artix-7 A35T FPGA and the Zynq Z-7010 SoC have more on-chip FPGA resources than the 17-year-old Virtex-E XCV600E FPGA and you’ll get to use the latest Xilinx Vivado Design Suite tools with these newer devices.
(For more information about the Digilent ARTY board, see “ARTY—the $99 Artix-7 FPGA Dev Board/Eval Kit with Arduino I/O and $3K worth of Vivado software. Wait, What????” and for more information on the Digilent ZYBO trainer board, see “ZYBO has landed. Digilent’s sub-$200 Zynq-based Dev Board makes an appearance (with pix!)”)
IBM Power Systems recently placed a story on the Washington Post Web site titled “Powerful computing crunches genomic data at warp speed” that describes the contribution Edico Genome is making to medical research using hardware accelerators to speed genomic research. Edico Genome president and CEO Pieter van Rooyen and his team were analyzing blood work in South Africa, working on two diseases—HIV and tuberculosis—and they developed a custom hardware accelerator card to speed results. That accelerator card is the Dragen, which plugs into IBM’s Power Systems S822LC high-performance computing servers based on IBM’s POWER8 processor. The DRAGEN accelerator card is based on a Xilinx 28nm FPGA.
The result? “Previously, sequencing one genome would have taken around 30 hours,” van Rooyen said. “We have that down to 26 minutes. And soon, we’ll have that down to 10 minutes.” He foresees a time when a full sequence can be compiled in near real time.
Edico Genome Dragen Accelerator Card for Genome Analysis is based on a Xilinx 28nm FPGA
For more information about Edico Genome’s Dragen accelerator card, see “FPGA-based Edico Genome Dragen Accelerator Card for IBM OpenPOWER Server Speeds Exome/Genome Analysis by 60x.”
By Adam Taylor
In the build we used last week for our Linux example, we placed the ZedBoard’s required boot files on an SD Card. One of these files was a RAMdisk image of the file system needed for Linux. This file loads into RAM each time we boot the ZedBoard. Changes we make within the file system while the board is operating—e.g. transferred files, etc—are lost the next time we reboot.
If we actually want to keep these changes between power cycles, we need to put a file system on non-volatile memory such as the SD Card. Boards like the Snickerdoodle and Parallella use a file system in place of the RAMdisk. Although it is possible to update the RAMdisk contents, as this blog shows, it requires a number of steps.
Going forward I want to look at image processing and OpenCV, so it is important that we have a file system in which we can save changes. I also want to use the ZedBoard as a single board computer, so I need to use the HDMI output to generate a display of the desktop. That’s another reason we need a file system.
To do this we need a file system and we need to format the SD Card correctly. Looking first at the SD Card, we need to create two partitions. One partition is FAT formatted and that’s where we store the following files:
The second SD Card partition is where we store the file system. This is normally the larger of the two partitions and it’s formatted for a Linux file format (e.g. EXT2 or EXT4).
We will need to use a Linux-based OS to create the partitions and then install the file system. This is where our virtual machine comes in very handy.
With the Linux OS up and running, insert the SD Card into its reader, open a disk utility, and format the SD Card. Once it’s formatted, we create two partitions called boot and rootfs, sized and formatted as follows:
Typically, the file system will be distributed as a compressed archive, which you must extract onto the file system partition.
One of the most commonly used file systems for the Zynq-7000 SoC is Linaro, which can be obtained from linaro.org. This is the file system used for this example. I will be using the Zedboard.org desktop Ubuntu Linux example available here.
Should we need to build a system from scratch, we use U-Boot to ensure the kernel knows where the file system resides.
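For reference, pointing the kernel at that file system is done through U-Boot's bootargs. A typical set of kernel arguments for a Zynq board with the root file system on the SD Card's second partition looks something like the fragment below (exact arguments vary by board and kernel configuration):

```
console=ttyPS0,115200 root=/dev/mmcblk0p2 rw rootwait earlyprintk
```

Here `ttyPS0` is the Zynq PS UART, `mmcblk0p2` is the second SD partition (our rootfs), and `rootwait` makes the kernel wait for the SD controller before mounting root.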
While the example comes with the device tree, kernel image, and Boot.bin file necessary to boot the system, we need to download the filesystem.
With the file system downloaded, the next step is to unzip the contents of the filesystem to the SD card rootfs partition. We do this using the following command in a Linux terminal window:
sudo tar --strip-components=3 -C <path to rootfs>/rootfs -xzpf <path to downloaded fs>/linaro-precise-ubuntu-desktop-20121124-560.tar
This may take some time (5 to 15 minutes or even longer). The final step is to move the DTB, kernel image, and boot.bin files onto the boot partition of the SD Card. We should then be ready to boot the system.
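That final copy step can also be scripted. Here's a minimal Python sketch; the file names and directory paths are assumptions for a typical boot set and host mount point, so match them to your own files:

```python
import os
import shutil

# Assumed boot-set file names -- match these to your own build's output.
BOOT_FILES = ["BOOT.bin", "devicetree.dtb", "uImage"]

def copy_boot_files(src_dir, boot_mount):
    """Copy the Zynq boot files onto the FAT boot partition.

    src_dir:    directory holding the downloaded/built boot files
    boot_mount: where the host has mounted the SD Card's boot partition,
                e.g. /media/<user>/boot (an assumption -- check yours)
    """
    copied = []
    for name in BOOT_FILES:
        shutil.copy(os.path.join(src_dir, name),
                    os.path.join(boot_mount, name))
        copied.append(name)
    return copied
```

After the copy, unmount both partitions cleanly before pulling the card so the writes are flushed.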
When I did this I was presented with the following desktop on my HDMI Monitor:
The code is available on Github as always.
If you want ebook or hardback versions of previous MicroZed Chronicles blogs, you can get them below.
Note: The Amazon ebooks have been listed at $0.00 for a long time. That’s about to end on July 20 so you might want to download them today or tomorrow before the price changes.
VadaTech just announced three new PCIe carrier cards for FMC, which means that each of the PCIe boards has a VITA-57 FMC connector on it but they also have some pretty capable Xilinx FPGAs on board as well. The three new cards are:
Today’s press release says that these cards are “ideal for bringing COTS PCIe systems up to date with the latest FPGAs,” and they are. They’re also good for ASIC prototyping/emulation and for building 100G networking gear; the series of three PCIe carrier cards gives you a family of products to use as a broad foundation for a range of end products.
VadaTech PCI595-- PCIe FPGA Carrier for FMC, Virtex UltraScale VU440 FPGA
Adam Taylor, author of the long-running MicroZed Chronicles and all-round engineer’s engineer, thinks there are ten FPGA design techniques that every design engineer should know. He’s published the list in an article on the EETimes Web site. These ten design techniques represent the fundamentals of digital system design and it would be well worth your time to learn them.
The ten design techniques are:
That’s Adam Taylor’s top 10. Master ‘em all!
The MV1-D2048x1088-HS03-96-G2 GigE hyperspectral camera from Photonfocus images 2048x1088-pixel video at 42fps in 16 spectral bands from 470nm to 630nm (the blue through orange part of the visible spectrum) using an IMEC snapshot mosaic CMV2K-SM4X4-470-630-VIS CMOS hyperspectral image sensor. The camera images each spectral band at 512x256 pixels. This new Photonfocus GigE hyperspectral video camera complements the Photonfocus MV1-D2048x1088-HS02-96-G2 GigE hyperspectral video camera introduced last year, which images the 600nm to 975nm part of the visible spectrum in 25 spectral bands. (See “Hyperspectral GigE video cameras from Photonfocus see the unseen @ 42fps for diverse imaging applications.”)
Photonfocus MV1-D2048x1088-HS03-96-G2 GigE hyperspectral video camera and IMEC image sensor
Here’s a spectral sensitivity plot for this new hyperspectral camera:
Photonfocus MV1-D2048x1088-HS03-96-G2 GigE hyperspectral video camera spectral sensitivity plot
Not coincidentally, the new Photonfocus hyperspectral camera is based on a Xilinx Spartan-6 FPGA vision platform—the same vision platform the company developed and used in last year’s MV1-D2048x1088-HS02-96-G2 GigE hyperspectral video camera. The FPGA-based platform gives the company maximum ability to adapt to new image sensor types as they appear. The Spartan-6 FPGA allows the engineers at Photonfocus to more easily adapt to radically different sensor types with different pin-level interfaces, bit-level interface protocols, and overall image-processing requirements.
Here, the requirements of a 16-band hyperspectral image sensor versus a 25-band hyperspectral image sensor clearly call for significantly different image processing. Using an FPGA-based platform like the Spartan-6 platform that Photonfocus employs allows you to turn out new products quickly and to capitalize on developments such as the introduction of new image sensors. This latest camera in the growing line of Photonfocus cameras is proof.
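To see why the band counts change the processing: a 4x4 snapshot-mosaic sensor tiles its 16 band filters across the pixel array, so recovering one band is a strided subsample of the raw frame. The sketch below illustrates the idea in Python with NumPy; it is a generic illustration of mosaic band extraction, not Photonfocus's actual pipeline:

```python
import numpy as np

def extract_band(raw, row, col, tile=4):
    """Subsample one spectral band from a snapshot-mosaic frame.

    raw:      2-D array holding the raw mosaic image
    row, col: position of the band's filter within the tile (0..tile-1)
    tile:     mosaic tile size (4 for a 4x4 mosaic -> 16 bands)
    """
    return raw[row::tile, col::tile]

if __name__ == "__main__":
    # Tiny demo frame: each band image is 1/tile the size in each axis.
    frame = np.arange(64, dtype=np.uint16).reshape(8, 8)
    print(extract_band(frame, 1, 2))
```

A 25-band sensor uses a 5x5 tile instead, so the stride, the per-band geometry, and any downstream interpolation all differ, which is exactly the kind of change an FPGA platform absorbs more easily than fixed hardware.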
Contact Photonfocus for information about these GigE hyperspectral cameras.
According to an EETimes article titled “Virtual Nets Pave Road to Acceleration” written by Cavium’s Nabil Damouny, the ETSI (European Telecommunications Standards Institute) NFV work group has published several NFV (network functions virtualization) guides over the past six months relating to networking data-plane acceleration for VNFs (virtual network functions) requiring high performance and/or low latency. These documents address use cases, VNF interfaces, virtual switch benchmarks, and virtual network management. (Damouny is an ETSI NFV work group rapporteur, meaning he reports on the group’s activities.)
The documents include:
The ETSI Centre for Testing and Interoperability is organizing its first NFV Plugfest, currently scheduled for January 23 to February 3, 2017. It will be held in Leganes near Madrid, Spain. There’s also an ETSI NFV Webinar titled “ETSI NFV Interfaces and Architecture Overview,” scheduled for September 8, 2016.
Note: Xilinx Ireland is a member of the ETSI NFV work group.
OK, it’s a solitaire-playing robot based on a delta-configured 3D-printer chassis, but this demo of the Zynq UltraScale+ MPSoC thoroughly exercises the device’s capabilities, performing all of the following tasks on one chip:
The demo distributes these tasks across all of the computing and processing elements in the Zynq UltraScale+ MPSoC including:
The Zynq UltraScale+ MPSoC’s programmable-logic fabric handles input from the demo system’s camera, which requires high-speed processing. Letting the on-chip FPGA fabric handle the direct video processing is far more practical, with respect to power consumption and performance, than using software-driven microprocessors. However, the on-chip ARM Cortex-A53 processors are ideal for image recognition and for decision making based on the recognized objects.
The dual-core ARM Cortex-R5 real-time processors handle motor control. They operate in lockstep because this is a safety-critical operation. Any time there’s metal in motion that could cause accidental injury, you have a safety-critical task on your hands.
Finally, the Mali-400 handles the GUI’s graphics and the video inset overlay.
Quite a lot packed into that one chip. Maybe this system looks a lot like one you need to design.
Here’s the 4-minute video:
For more information on the Zynq UltraScale+ MPSoC and additional technical details about this demo, see Glenn Steiner's blog post "Heterogeneous Multiprocessing: What Is It and Why Do You Need It?"
The MANGO Project, funded from the European Union’s Horizon 2020 research and innovation program, has developed a research vehicle for exploring power, performance, and predictability for new manycore architectures with an emphasis on extreme resource efficiency through power-efficient, heterogeneous HPC (high-performance computing) architectures. This hardware research vehicle, the MANGO architecture, serves as both a computing platform and an emulation platform for new HPC architectures.
The current plan is to demonstrate three real-time applications running on the MANGO platform: transcoding, medical imaging, and security-related applications. In addition, researchers will use the MANGO platform for researching QoS strategies within a heterogeneous HPC context and hardware/software co-design for power and thermal management.
The MANGO platform combines general-purpose compute nodes (GNs) with heterogeneous acceleration nodes (HNs). MANGO GNs are standard blade servers that combine high-end CPUs and GPUs. MANGO HNs are based on reprogrammable hardware using Xilinx All Programmable devices, implemented using modular proFPGA quad motherboards and daughter cards from PRO DESIGN Electronic GmbH. Each proFPGA daughter card currently carries a Xilinx Virtex-7 2000T FPGA or a Xilinx Zynq-7000 SoC.
MANGO HNs implement tiles, defined as a basic computing unit with the needed communication support. The figure below shows the overall MANGO architecture with GNs on the left and HNs on the right.
MANGO Architecture Phase I
Tiles are replicated many times within the system, making them appear to be homogeneous components but internally, each tile can be customized to meet different computing needs and capabilities, resulting in a heterogeneous computing configuration as shown below:
MANGO Architecture Tile Detail
The figure shows a daughter card carrying a Xilinx Virtex-7 2000T FPGA implementing four HN tiles and a daughter card carrying a Zynq-7000 SoC implementing two HN tiles.
System architecture is developed using PRO DESIGN’s proFPGA Builder software, which automatically detects the physical layout of motherboards and daughter cards and generates a code framework for multi-FPGA HDL designs including scripts for simulation, synthesis, and for running the design. The Xilinx Vivado Design Suite then performs the synthesis, placement, and routing for the Xilinx FPGAs in the system.
For more information about the MANGO Project, download this paper: “The MANGO FET-HPC Project: An Overview.”
Today, National Instruments (NI) launched its 2nd-generation PXIe-5840 Vector Signal Transceiver (VST), which combines a 6.5GHz RF vector signal generator and a 6.5GHz vector signal analyzer in a 2-slot PXIe module. The instrument has 1GHz of instantaneous bandwidth and is designed for use in a wide range of RF test systems including 5G and IoT RF applications, ultra-wideband radar prototyping, and RFIC testing. Like all NI instruments, the PXIe-5840 VST is programmable with the company’s LabVIEW system-design environment and that programmability reaches all the way down to the VST’s embedded Xilinx Virtex-7 690T FPGA. (NI’s 1st-generation VSTs employed Xilinx Virtex-6 FPGAs.)
National Instruments uses this FPGA programmability to create varied RF test systems such as this 8x8 MIMO RF test system:
And this mixed-signal IoT test system:
For additional information on NI’s line of VSTs, see:
S2C has added eight new Prototype Ready interface cards and accessories to its growing library of off-the-shelf hardware and software products compatible with its Prodigy Complete Prototyping Platform. These new modules allow you to prototype SoC designs using a variety of pre-verified interfaces that work out of the box with S2C’s comprehensive family of Prodigy Logic Modules, which includes modules based on Xilinx Virtex UltraScale, Kintex UltraScale, Virtex-7, and Kintex-7 All Programmable devices. Interfaces in the library now include general-purpose ports (GPIO, USB 2.0, USB 3.0, PCIe and PCI, Gigabit Ethernet, GMII and RGMII, and RS-232); high-speed GT-based ports (PCIe Gen2 and Gen3, SFP+, and SATA); media-oriented peripherals (HDMI, DVI, MIPI, and VGA); and expansion ports (FMC and DDR2/3/4).
For more information about S2C’s Prodigy Logic Modules, see:
And don’t forget to get a copy of S2C’s new book on FPGA Prototyping, “PROTOTYPICAL: The Emergence of FPGA-based Prototyping for SoC Design.” (See “New S2C Book on FPGA Prototyping: Download it for free immediately before they change their minds!”)
When you’re first developing algorithms that use programmable logic to control a lot of power, things can go south very quickly, so sophisticated diagnostics are pretty handy. ELMG Digital Power, a New Zealand consultancy focused on power control, has developed a sophisticated monitoring capability in the form of a data-collection IP block called Dlog that you can use to collect data from within a Xilinx Zynq-7000 SoC and a companion application called ControlScope that allows you to visualize significant events occurring in your power-control design. There’s a new blog posted on ELMG’s Web site that describes these two products.
Combined, Dlog and ControlScope allow you to collect and analyze large amounts of data so that you can detect power-control problems such as:
Data can be logged to on-board Flash memory cards at a high data rate, or transferred over Ethernet to a PC at 25Mbytes/sec.
This is probably a good time to remind you of tomorrow’s free ELMG Webinar on Zynq-specific power control presented by Dr. Tim King, ELMG Digital Power’s Principal FPGA Engineer. Key digital power control questions to be answered in this Webinar include:
There are a limited number of spots for this Webinar and I received an email today from ELMG that said only 150 spots remained.
By Adam Taylor
Having completed the “hello world” program on our Zynq-based Snickerdoodle system last week (see “Adam Taylor’s MicroZed Chronicles Part 137: Getting the Snickerdoodle to say “hello world” and wireless transfer”), we’ll now look more deeply into how we can exploit the capabilities of the Zynq-7000 SoC’s PL (programmable logic) side with the Linux OS.
We have looked at Linux previously, including Xilinx’s PetaLinux. However, looking back I see that there are a few areas that I want to cover in more depth. So before showing you how to build a Zynq SoC PL configuration and an appropriate Linux OS for the Snickerdoodle, I am going to quickly review what we need to do for the general case.
We first look at what we need to do to create a Zynq design that exploits the PL’s hardware capabilities while running a Linux OS. To run this we need the following:
The procedures for generating the PL bit file and the first-stage bootloader are the same regardless of which operating system we wish to use. However, the remaining tasks will be new if we have not developed for Linux before.
Rather helpfully Xilinx provides everything we need prebuilt (kernel image, uboot, ramdisk, etc.) with each new version of Vivado. We can obtain prebuilt versions of these files from the Xilinx Linux Wiki for the Zedboard, ZC702, and ZC706 dev boards. Using these prebuilt kernel, ramdisk, and uboot files with an updated device tree that represents the PL design in the bit file is a good way to get our system up and running quickly. To use this approach, we need to check that the prebuilt kernel contains the drivers for our PL design. We can find the list of drivers here.
To make the Linux OS familiar with the hardware, we use the device tree blob, which details the memory, interrupts, locations, etc. of the hardware connected to the processor. When we develop a bare-metal application, we need to generate a Board Support Package (BSP) that contains details of the drivers required and address locations. The device tree blob does something similar to the BSP but does it for Linux. We can generate the raw device tree source (DTS) file within either Microsoft Windows or Linux. However, we can only compile the DTS into the DTB using a Linux installation.
We need to download the device tree plug-in for the Xilinx SDK from the Xilinx github to generate the DTS file. We can also get all of the other files to build the kernel, uboot, etc. from the same GitHub repository.
Once we have downloaded the device tree generator plug-in onto our computer, we need to add it to the SDK as a repository so that the SDK can generate the DTS from our hardware platform specification. In my system, I added this as a global repository. With the plug-in installed, we will see a new device_tree option under file type when we select File -> New -> Board Support Package. For Linux applications, we wish to generate a device tree file.
The example hardware platform specification for this demo is simple and connects the PS to the ZedBoard’s LEDs via an 8-bit AXI_GPIO module.
This process takes your hardware platform from Vivado (importing it as we previously have done for bare-metal builds) and creates the device tree source. As Vivado generates this file, it will ask you to define any boot arguments. For the pre-built Zedboard, these arguments are:
console=ttyPS0,115200 root=/dev/ram rw earlyprintk
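These boot arguments end up in the chosen node of the device tree source. A hypothetical fragment is shown below for orientation; the actual node contents in your generated system.dts will vary with the kernel version and board:

```dts
/ {
    chosen {
        /* Command line handed to the Linux kernel at boot */
        bootargs = "console=ttyPS0,115200 root=/dev/ram rw earlyprintk";
    };
};
```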
This process generates several files:
We then use the device tree compiler to convert these files into a compiled device tree blob that we can use in our system. We must use a Linux machine to do this.
The first thing to do within our Linux environment is to download the device tree compiler. If you do not already have it, use the command:
sudo apt-get install device-tree-compiler
Once this is installed, we can compile the device tree source using the command:
dtc -I dts -O dtb <path>/system.dts -o <path>/devicetree.dtb
The dtb (device tree blob) is data rather than executable code, so producing it does not require cross compilation for the ARM architecture.
With the device tree compiled, we can then create a boot image (boot.bin) using a first-stage bootloader based on the hardware platform and the prebuilt uboot.elf.
We can then put the boot.bin, devicetree.dtb, ramdisk, and kernel image files on an SD Card, insert it into the ZedBoard, and the Linux OS should boot successfully. For this example, the PL design has an AXI_GPIO module connected to the LEDs on the ZedBoard. If all is working properly, we will be able to toggle the LEDs on and off.
There are also other ways to check for successful mapping to the PL hardware. For example, we can connect to the ZedBoard using WinSCP and explore the file system of the Linux OS running on the ZedBoard. To see that our PL device has been correctly picked up, we can navigate to the directory /sys/firmware/devicetree/base/amba_pl/ where we’ll see the GPIO module and the address range that Vivado assigned to it.
If we wish to test the functionality of the GPIO driving the LEDs using SSH, we can access the ZedBoard and issue commands to control the LED status. We find these within the /sys/class/gpio/ directory, where there are two exported gpiochips. The LEDs are connected to the first one, which spans I/O 898 to 905. These I/O addresses correspond to the eight LEDs. We can work out the GPIO size by looking in the gpiochipXXX directories and examining the ngpio file.
We can quickly test the GPIO by turning on the LEDs using the commands in the screen shot below:
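In case the screen shot is hard to read, the sysfs sequence looks roughly like this. It's a sketch, not a verbatim transcript: the LED base number (898) is what this particular build reported, so check the base and ngpio files on your own board before running it (as root on the target):

```shell
# First LED line of the first exported gpiochip on this build --
# verify against /sys/class/gpio/gpiochipXXX/base on your board.
LED_BASE=898

led_on() {                      # usage: led_on <index 0..7>
    gpio=$((LED_BASE + $1))
    # Make the line visible in sysfs, set it as an output, drive it high
    echo "$gpio" > /sys/class/gpio/export
    echo out > "/sys/class/gpio/gpio$gpio/direction"
    echo 1   > "/sys/class/gpio/gpio$gpio/value"
}

# Example (run on the ZedBoard as root): light LED0 and LED7
# led_on 0
# led_on 7
```

Writing 0 to the value file turns the LED back off, and writing the line number to /sys/class/gpio/unexport releases it.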
The code is available on Github as always.
If you want e-book or hardback versions of previous MicroZed Chronicles blogs, you can get them below.
There’s a new memory in the system-design hierarchy and it’s called UltraRAM. For as long as we’ve had digital electronic computing machines—since 1945—we’ve used memory hierarchy to handle varied storage requirements. At the small, fast end are registers. At the other end, where there are massive storage needs, there’s disk and tape. Focusing exclusively on semiconductor memory, DRAM and Flash EPROM occupy the big, slow end of today's memory hierarchy. System designers strongly want to keep data on chip as long as possible because of the power and delay penalties associated with going off chip. UltraRAM, a new and larger on-chip SRAM technology incorporated into all Xilinx UltraScale+ device families (Virtex UltraScale+ FPGAs, Kintex UltraScale+ FPGAs, and Zynq UltraScale+ MPSoCs), adds a new and very useful level in the memory hierarchy for systems based on All Programmable FPGAs and SoCs.
You will want to consider using UltraRAMs for your next design. Here’s why:
As discussed in previous Xcell Daily blog posts, UltraRAM blocks are dual-ported, 288Kbit synchronous SRAMs that blow away old on-chip memory limits. You can easily stitch together several UltraRAM blocks to create larger, multi-Mbit, on-chip memories without using additional programmable logic.
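To get a feel for that granularity, here's a back-of-envelope calculation using the 288Kbit block size mentioned above (each block is 4K deep by 72 bits wide); the 4Mbit target is just an illustrative example, not a figure from Xilinx:

```shell
# Each UltraRAM block holds 288 Kbits (4096 x 72 bits).
BLOCK_BITS=$((288 * 1024))
# Hypothetical target: a 4-Mbit on-chip buffer.
TARGET_BITS=$((4 * 1024 * 1024))
# Round up to whole blocks.
BLOCKS=$(( (TARGET_BITS + BLOCK_BITS - 1) / BLOCK_BITS ))
echo "$BLOCKS"   # -> 15 blocks
```

Fifteen blocks for a multi-Mbit memory, with no programmable logic spent on the stitching, is the point of the feature.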
UltraRAMs do not replace the bulk storage of off-chip SDRAMs or the non-volatile storage of Flash EPROM. Instead, they postpone the need to go off chip with your data. You want to do that for at least three reasons:
Instead of replacing SDRAM and Flash memory, you can use the UltraRAMs’ speed and capacity to replace intermediate-capacity storage requirements currently served by off-chip SRAMS, RL-DRAMs, and CAMs in applications including:
Absorbing those external memories into an FPGA or MPSoC can reduce your BOM cost in addition to boosting performance, cutting power consumption, and decreasing the amount of pcb real estate consumed by external memory chips.
You might think that UltraRAMs eliminate the need for BRAMs, which FPGAs have incorporated for more than a decade. Not so. BRAMs have at least three advantages over UltraRAM:
For these reasons, all UltraScale+ devices incorporate BRAMs in addition to any on-chip UltraRAMs.
There’s an excellent, new 20-minute video that gives you the technical details you need to help you judge the value of UltraRAM to your system-level design. Here it is:
And don’t forget the UltraRAM White Paper, available here.
For previous Xcell Daily UltraRAM coverage, see
I designed workstation boards at Cadnetix based on VERSAmodule bus protocols back in 1981 and I used PLDs to do so. That was three years before Xilinx was founded and four years before the birth of the FPGA, so I used bipolar PALs (mostly 16L8s) for my designs. VME (VERSAmodule Eurocard) board designers later used FPGAs. A new article published this week on Electronic Design’s Web site titled “VME Interfaces Return to FPGAs” discusses the full-circle history of VMEbus design from FPGAs to ASSPs and now back to FPGAs. Author Michael Slonosky, Senior Product Marketing Manager for Power Architecture SBCs at Curtiss-Wright Defense Solutions, writes:
“Originally handled with discrete components, the VME interface moved to FPGAs to reduce board real-estate usage and increase performance and features. FPGA implementations then migrated into integrated devices, such as Tundra/IDT’s Universe, Universe II, and more recently, the Tempe device. Now, following the announcement in 2014 that the popular Tempe Tsi148 VMEbus bridge chip was being end-of-life’d (EOL), many leading COTS vendors have returned to the FPGA approach.”
The author then cites Curtiss-Wright’s Helix, an FPGA-based replacement for the Tempe Tsi148 VMEbus bridge chip with superset features. In addition to allowing Curtiss-Wright to add features, basing Helix on an FPGA essentially future-proofs the interface design. Curtiss-Wright is a major VMEbus supplier so the company is necessarily committed to making sure that there’s a future roadmap for its existing board designs and for new complementary boards yet to be designed.
Curtiss-Wright Helix VMEbus Interface Chip based on Artix-7 FPGA
This example underscores the FPGA’s ability to act like a Fountain of Youth for still-viable system designs facing oblivion simply because of component obsolescence. FPGAs like the Artix-7 device used by Curtiss-Wright to implement Helix extend a product’s design life. Unlike rather specialized ASSPs like a VMEbus interface chip, FPGAs tend to have extremely long life cycles because their universal nature allows them to be designed into a very wide range of long-lived systems.
Proof? Here’s what Curtiss-Wright’s Web page says:
“Going forward, Helix will be implemented in our flagship VME boards. Several current products (Power Architecture VME-194 and Intel VME-1908) will be updated to use Helix, thereby significantly extending their lifecycle.”
The VMEbus dates back to 1981 and the legendary 32-bit Motorola 68000 microprocessor yet it’s still used today for advanced military and telecom systems. The “VM” in VME stands for VERSAmodule, which was Motorola’s first card-level bus for the 68000 processor and the “E” stands for Eurocard, a European standard pc board format that used vastly superior 2-piece DIN connectors in place of the VERSAmodule’s card-edge connectors. VMEbus cards became extremely popular, especially for military and telecom applications, because they were rugged and they provided a reasonable amount of board space for electronics. Instead of fading away with technology like many early microprocessor bus standards, the VMEbus has tracked electronic advances with speed and capability extensions standardized under VITA (the VMEbus International Trade Association), which became an official standardization body way, way back in 1993. Now that VME design has reverted to using FPGAs as interface chips, perhaps the ever-evolving VME standard will see another 35 years of longevity.
Existing system architectures aren't keeping up with the performance demands of data-intensive applications. What can you do? Accelerate! But coupling a powerful server processor with hardware acceleration technology isn't easy. That’s why the OpenPOWER Foundation is sponsoring the Developer Challenge, split into “courses.” The two courses that apply to FPGA acceleration are:
The grand prize for each course is an all-expenses-paid trip to Supercomputing 2016 for one team member and an Apple Watch Sport for each of as many as five team members. Second prize for each course is an Apple iPad Air 16GB for as many as five team members. Third prize for each course is an Apple Watch Sport for as many as five team members.
You’ll be working in the OpenPOWER Foundation’s SuperVessel development environment, which now includes the Xilinx SDAccel Development Environment, which delivers a GPU-like and CPU-like programming experience for data center workload acceleration. (See “Google and Rackspace to develop servers based on IBM’s POWER architecture while IBM and Xilinx bring SDAccel to SuperVessel.”) If you would like to develop your Open Road entry using SDAccel, sign up for the Developer Challenge and then request access to SDAccel from Xilinx.
Slightly confused? You can watch the recorded Orientation Hangout here to meet the experts and get a tour through useful resources and demos to help you get going on your challenge submission. There’s also a Hangout on using AccDNN for Deep Learning on FPGAs scheduled for Thursday 6/7 at 9PM EST. (That’s today!)
With the Zynq UltraScale+ MPSoC, there’s never an excuse for running shy of processing power. If as many as four 64-bit ARM Cortex-A53 application processors, two ARM Cortex-R5 real-time processors, a hardened and embedded ARM Mali-400 GPU, and an embedded H.265/H.264 video codec don’t get you there, then you’ll find big chunks of 16nm FPGA programmable logic fabric on these devices to build whatever hardware you need. Accelerators built with programmable logic can boost performance by as much as 100x while consuming significantly less power than the same function implemented in software.
Xilinx along with EEJournal.com has just posted a new 14-minute Chalk Talk video covering these topics. If you look really quickly, you’ll also see a futures roadmap hinting at a 7nm Zynq family fly by as well, making the Zynq architecture a 3-node family with lots of options for your next design, and your next design, and your next.
Here’s the video:
Here are two additional Zynq UltraScale+ MPSoC infobits to consider:
There’s a new Xilinx video online that discusses the power and performance advantages of the new 16nm Kintex UltraScale+ FPGAs versus the extremely successful 28nm Kintex-7 All Programmable devices. Essentially, you get a fortunate choice:
Both are great choices to have and you can make the choice at any time because all you’re doing is selecting the core operating voltage. The Kintex UltraScale+ FPGAs will run with core operating voltages of either 0.85V or 0.72V. Which you pick depends on your power and performance requirements.
Here’s the 5-minute video:
HiTech Global’s new HTG-847 Emulation/Prototyping Platform provides ASIC and SoC design teams with a fast way to test their designs. The platform accommodates two or four Xilinx Virtex UltraScale VU440 FPGAs for a maximum capacity of 22.164M system logic cells, 11,500 DSP slices, and 354.4Mbits of fast on-chip SRAM. Each Virtex UltraScale VU440 FPGA communicates with each of the other three Virtex UltraScale VU440 FPGAs on the platform using four 16.3Gbps GTH serial transceivers and 168 single-ended I/O pins as shown in this block diagram:
HiTech Global’s HTG-847 Emulation/Prototyping Platform – Block Diagram
The HTG-847 Emulation/Prototyping Platform features 12 FPC expansion connectors carrying an aggregate of 144 GTH serial transceivers, each capable of bidirectional 16.3Gbps communications, and 1920 single-ended select I/O pins. The company offers several FMC mezzanine boards to expand the HTG-847 Emulation/Prototyping Platform and FMC cables for connecting each HTG-847 board to two adjacent boards.
Here’s a photo of the board:
HiTech Global’s HTG-847 Emulation/Prototyping Platform
Mentor Embedded along with Xilinx will present a free Webinar on July 13 titled “Developing Multiple OSes on a Xilinx Zynq UltraScale+ MPSoC.” You might want to register for this event if you are using or considering the Xilinx UltraScale+ MPSoC in your latest design because successfully running multiple OSes on an MPSoC that combines multiple asymmetric microprocessor cores (ARM Cortex-A53s and Cortex-R5s in the case of the Zynq UltraScale+ MPSoC), graphics processing units, DSPs, FPGA-based offload engines, and programmable I/O is no simple feat. This Webinar promises to help you gain a deeper understanding of the key issues and challenges encountered when developing and debugging software on complex systems based on Xilinx Zynq UltraScale+ MPSoCs. You will learn how to realize the full potential of the device.
In particular, the Webinar will cover Mentor Embedded’s OpenAMP-compatible Multicore Framework and its extensions. You will learn about various use cases and configurations that allow applications on the various cores in the Zynq UltraScale+ MPSoC to run natively, either supervised or in trusted secure environments, or both. These use cases will allow you to design robust and secure devices that can be certified to ISO 26262, IEC, or DO-178C standards, if needed.