Meanwhile, there seems to be a new pledge level that I don’t recall: a $179 level that includes a 1080p video camera. That’s in addition to the touch screen and voice input, which gives the Mycroft Mark II an even more interesting user interface. There are only a limited number of $179 pledge options, with 177 remaining as of the posting of this blog.
Aldec recently published a very descriptive example design for a high-performance, re-programmable network router/switch based on the company’s TySOM-2A-7Z030 embedded development board and an FMC-NET daughter card. The TySOM-2A-7Z030 incorporates a Xilinx Zynq Z-7030 SoC. The Zynq SoC’s dual-core Arm cortex-A9 MPCore processor runs the OpenWrt Linux distribution for embedded devices in this design. It’s a favorite for network switch developers. The design employs the programmable logic (PL) in the Zynq SoC to create four 1G/2.5G Ethernet MACs that connect to the FMC-NET card’s four Ethernet PHYs and a 10G Ethernet subsystem that connects to the FMC-NET card’s QSFP+ card cage.
We recently looked at how we could use the Zynq SoC’s XADC streaming output with DMA. For that example, I demonstrated only outputting one XADC channel over an AXI stream. However, it is important we understand how we can use multiple channels within an AXI stream to transfer them to processor memory whether we are using the XADC as the source or not.
To demonstrate this, I will be updating the XADC design that we used for the previous streaming example. Upgrading the software is simple. All we need to do is enable another XADC channel when we configure the sequencer and update the API we use. Updating the hardware is a little more complicated.
To upgrade the Vivado hardware design, the first thing we need to do is replace the DMA IP module with the multi-channel DMA (MCDMA) IP core. The MCDMA IP core supports as many as 16 different input channels. DMA channels are mapped to AXI Stream contents using the TDest bus, which is part of the AXIS standard.
As with the previous XADC streaming example, we’ll configure the MCDMA for uni-directional operation (write only) and support for two channels:
Configuration of the MCDMA IP core.
Vivado design with the MCDMA IP Core (Tcl BD available on GitHub)
TDest is the AXI signal used for routing AXI Stream contents. In addition, when we configure the XADC for AXI Streaming, the different XADC channels output on the stream are identified by the TId bus.
To be able to use the MCDMA in conjunction with the XADC, we need to remap the XADC TId channel to the MCDMA TDest channel. We also need to packetize data by asserting TLast on the MCDMA AXIS input.
In the previous example we used a custom IP core to generate the TLast signal. A better solution however is to remap the TId and generate the TLast signal using the AXIS subset converter.
AXIS Subset Converter Configuration
The eagle-eyed will at this point notice that the XADC uses channel numbers up to 31 with the auxiliary inputs using channel ID’s (16 to 31), which are outside the channel range of the MCDMA. If we are using the auxiliary inputs, we can also use the AXIS subset convertor to remap these higher channel numbers into the MCDMA range by remapping the lower four bits of the XADC TId to the MCDMA TDest channel. When using this method the lower XADC channels cannot be used otherwise there would be conflict.
Output of the XADC with multiple channels (Temperature & VPVN Channels)
Output of the AXIS subset block following remapping and TLast Generation
When it comes to updating the software application, we need to use the XMCDMA.h header APIs to configure the MCDMA and set up the buffer descriptors for each of the channels. The software performs the following steps:
Allocate memory areas for the receive buffer and the buffer descriptors.
For each Channel, create the buffer descriptors.
Populate the buffer descriptors and the receive buffer address.
Reset the receive buffer memory contents to zero.
Invalidate the caching in on the receive buffer to ensure the values can be seen in DDR memory.
Commit the channels to the hardware.
Start the MCDMA transfers for each channel.
The software application defines several buffer descriptors for each channel. When it comes to the receive buffer for this example, I have used a single receive buffer so the received data for both channels shares the same address space. This can be seen below. Halfwords starting 0x4yyy relate to the VPVN input while the device temperature half words start 0x9yyy.
Memory Contents showing the two channels
This is a simple adaption to the existing software to use multiple receive buffers in memory. For many applications, separate receive buffers are more useful.
Being able to move AXI Data streams to memory-mapped locations is a vital requirement for many applications—for example signal processing, communication, and sensor interfacing. Using the AXI subset convertor allows us to correctly remap and format the AXIS stream data into a compliant format for the MCDMA IP core.
Today, Digilent announced a $299 bundle including its Zybo Z7-20 dev board (based on a Xilinx Zynq Z-7020 SoC), a Pcam 5C 5Mpixel (1080P) color video camera, and a Xilinx SDSoC development environment voucher. (That’s the same price as a Zybo Z7-20 dev board without the camera.) The Zybo Z7 dev board includes a new 15-pin FFC connector that allows the board to interface with the Pcam 5C camera over a 2-lane MIPI CSI-2 and I2C interfaces. (This connector is pin-compatible with the Raspberry Pi’s FFC camera port.) The Pcam 5C camera is based on the Omnivision OV5640 image sensor.
Digilent has created the Pcam 5C + Zybo Z7 demo project to get you started. The demo accepts video from the Pcam 5C camera and passes it out to a display via the Zybo Z7’s HDMI port. All IP used in the demo including a D-PHY receiver, CSI-2 decoder, Bayer to RGB converter and gamma correction is free and open-source so you can study exactly how the D-PHY and CSI-2 decoding works and then develop you own embedded vision products.
If you want this deal, you’d better hurry. The offer expires February 23—three weeks from today.
Rigol’s new RSA5000 real-time spectrum analyzer allows you to capture, identify, isolate, and analyze complex RF signals with a 40MHz real-time bandwidth over either a 3.2GHz or 6.5GHz signal span. It’s designed for engineers working on RF designs in the IoT and IIot markets as well as industrial, scientific, and medical equipment. Rigol was demonstrating the RSA5000 real-time spectrum analyzer at this week’s DesignCon being held at the Santa Clara Convention Center. I listened to a presentation from Rigol’s North American General Manager Mike Rizzo and then a demo by Rigol’s Director of Product Marketing & Software Applications Chris Armstrong, both captured in the 2.5-minute video below.
Rigol RSA5000 Real-Time Spectrum Analyzer
Based on what I saw in the demo, this is an extremely responsive instrument—far more responsive than a swept spectrum analyzer—with several visualization display modes to help you isolate the significant signal in a sea of signals and noise, in real time. It’s capable of continuously executing 146,484 FFTs/sec, which results in a minimum 100% POI (probability of intercept) of 7.45μsec. You need some real DSP horsepower to achieve that sort of performance and the Rigol RSA5000 real-time spectrum analyzer gets this performance from a pair of Xilinx Zynq Z-7015 SoCs. (You'll find many more details about real-time spectrum analysis and the RSA5000 Real-Time Spectrum Analyzer in the Rigol app note "Realtime Spectrum Analyzer vs Spectrum Analyzer," attached at the end of this post. See below.)
Here’s the short presentation and demo of the Rigol RSA5000 real-time spectrum analyzer from DesignCon 2018:
Mike Rizzo told me that the Rigol design engineers selected the Zynq Z-7015 SoCs for three main reasons:
High-bandwidth access between the Zynq SoC’s PS (processing system) and PL (programmable logic)
Excellent development tools including Xilinx’s Vivado HLS
If you’re looking for a very capable spectrum analyzer, give the Rigol RSA5000 a look. If you’re designing your own real-time system and need high-speed computation coupled with fast user response, take a look at the line of Xilinx Zynq SoCs and Zynq UltraScale+ MPSoCs.
The Avnet MiniZed is an incredibly low-cost dev board based on the Xilinx Zynq Z7007S SoC with WiFi and Bluetooth built in. It currently lists for $89 on the Avnet site. If you’d like a fast start to using this dev board, Avnet is ready to help. As of now, it’s placed four MiniZed Speedway Design Workshops online so that you can learn at your own convenience and your own pace. The four workshops are:
In the Developing Zynq Hardware Speedway, you will be introduced to the single ARM Cortex –A9 Processor core as you explore its robust AXI peripheral set. Doing so you will utilize the Xilinx embedded systems tool set to design a Zynq AP SoC system, add Xilinx IP as well as custom IP, run software applications to test the IP, and finally debug your embedded system.
From within an Ubuntu OS running within a virtual machine, learn how to install PetaLinux 2017.1 and build embedded Linux targeting MiniZed. In the hands-on labs learn about Yocto and PetaLinux tools to import your own FPGA hardware design, integrate user space applications, and configure/customize PetaLinux.
Using proven flows for SDSoC, the student will learn how to navigate SDSoC. Through hands-on labs, we will create a design for a provided platform and then also create a platform for the Avnet MiniZed. You will see how to accelerate an algorithm in the course lab.
Here are two reasons you might want to participate in this Kickstarter campaign:
The Mycroft Mark II is a hands-free, privacy-oriented, open-source smart speaker with a touch screen. It has advanced far-field voice recognition and multiple wake words for voice-based cloud services such as Amazon’s Alexa and Google Home, courtesy of Aaware’s technology. (See “Looking to turbocharge Amazon’s Alexa or Google Home? Aaware’s Zynq-based kit is the tool you need.”) The finished smart speaker requires a pledge of $129 (or $299 for three of them) but the dev kit version of the Mycroft Mark II requires a pledge of only $99, which is cheap as dev kits go. (Note: there are only 88 of these kits left, as of this writing.)
You could look at the Mycroft Mark II as a general-purpose, $99 Zynq UltraScale+ MPSoC open-source dev kit with a touch screen that’s also been enabled for voice control, which you can use as a platform for a variety of IIoT, cloud computing, or embedded projects. That in itself is a very attractive offer. As the Mycroft Mark II Kickstarter project page says: “The Mark II has special features that make hacking and customizing easy, not to mention thorough documentation and a community to lean on when building. Support for our community is central to the Mycroft mission.” That’s a lot for a sub-$100 dev kit, don’t you think?
If you’d like some intense training on the Xilinx Zynq UltraScale+ MPSoC—one of the most powerful embedded application processor (plus programmable logic) families that you can throw at an embedded-processing application—then Hardent’s 3-day class titled “Embedded System Design for the Zynq UltraScale+ MPSoC” might just be what you’re looking for. There’s a live, E-Learning version kicking off February 7 with live, in-person classes scheduled for North America from February 21 (in Ottawa) through August. The schedule’s on the referenced Web page.
You certainly might want a comprehensive course outline before you decide, so here it is:
Zynq UltraScale+ MPSoC Overview – Overview of the Zynq UltraScale+ MPSoC All Programmable device.
Application Processing Unit – Introduction to the members of the APU (based on 64-bit Arm Cortex-A53 processors) and how to configure and manage the APU cluster.
Real-Time Processing Unit – Introduction to the various elements within the RPU including the dual-core Arm Cortex-R5 processor and different modes of configuration.
QEMU – Introduction to the Quick Emulator: an emulation tool for the Zynq UltraScale+ MPSoC device that lets you run software whenever, wherever without the actual hardware.
Platform Management Unit –Tools and techniques for debugging your Zynq UltraScale+ MPSoC design.
Booting – Learn how to implement an embedded system including the boot process and boot-image creation.
AXI – Discover how the Zynq UltraScale+ MPSoC’s PS (processing system) and PL (programmable logic) connect to permit designers to create very high performance embedded systems with hardware-speed processing where needed.
Clocks and Resets – Overview of the Zynq UltraScale+ MPSoC’s clocking and reset functions, focusing more on capabilities than specific implementations.
DDR SDRAM and QoS – Learn how to configure the system’s DDR SDRAM to maximize system performance.
System Protection – Covers all the hardware elements that support the separation of software domains within the the Zynq UltraScale+ MPSoC’s PS.
Security and Software – Shows you how to use the safety and security features of the the Zynq UltraScale+ MPSoC in the context of embedded system design and introduces several standards.
ARM TrustZone Technology – Presents the use of the Arm TrustZone technology.
Linux – Discussion and examples showing you how to configure Linux to manage multiple processors.
Yocto – Compares kernel-building methods between a "pure" Yocto build and the Xilinx PetaLinux build (which uses Yocto "under-the-hood").
OpenAMP – Introduction to the concept of the Multicore Association’s OpenAMP framework for asymmetric multiprocessing on heterogeneous processor architectures like the the Zynq UltraScale+ MPSoC.
Hardware/Software Virtualization – Covers the hardware and software elements of virtualization. A lab shows you how to use hypervisors.
Xen Hypervisor – Starts with a description of generic hypervisors and then discusses the details of implementing a hypervisor based on Xen.
Ecosystem Support – Overview of the Zynq UltraScale+ MPSoC’s supported operating systems, software stacks, hypervisors, etc.
FreeRTOS – Overview of FreeRTOS with examples of how to use it.
Software Stack – Introduces the concept of a software stack and discusses the many available stacks for the Zynq UltraScale+ MPSoC.
Curtiss-Wright’s VPX3-535 3U OpenVPX transceiver module implements a single-slot, dual-channel, 6Gsamples/sec analog data-acquisition and processing system using two 12-bit, 6Gsamples/sec ADCs and two 12-bit, 6Gsamples/sec DACs. This is the type of capability you need for demanding applications such as radar, Signal Intelligence (SIGINT), Electronic Warfare (EW), and Software Defined Radio (SDR). This amount of analog-to-digital and digital-to-analog conversion capability demands wicked-fast digital processing and on the VPX3-535 transceiver module, that digital processing comes in the form of two of Xilinx’s most powerful All Programmable devices: a Virtex UltraScale+ VU9P and a Zynq UltraScale+ ZU4 MPSoC.
Here’s a block diagram of the Curtiss-Wright VPX3-535 module:
The VPX3-535 is Curtiss-Wright’s first publicly announced module to feature full compliance to the VITA 48.8 Air-Flow-Through (AFT) cooling standard, which ensures optimal performance in the harshest conditions. VITA 48.8 provides a low-cost, effective means to cool high-power COTS 3U and 6U VPX modules that dissipate ~150W+.
At the same time, Curtiss-Wright is also introducing a conduction-cooled variant, called the VPX3-534, which designed for applications that do not require the performance of the VPX3-535. The VPX3-534 supports the same dual-channel, 12-bit, 6Gsamples/sec ADC and DAC channels as the VPX3-535 but it replaces the Virtex UltraScale+ FPGA with a Xilinx Kintex UltraScale KU115 FPGA. This module also supports an option for four 3Gsamples/sec ADC channels.
Please contact Curtiss-Wright directly for more information about the VPX3-535 and VPX3-534 OpenVPX transceiver modules.
Keysight published a 14-minute video back in 2015 that gives you the basics behind RF beamforming and its use in 5G applications. The video also invites you to download a free, 30-day trial of Keysight’s SystemVue with Keysight’s 5G simulation library to try out some of the concepts discussed in the video and the link appears to be active still.
Here’s the video:
Meanwhile, should you need an implementation technology for RF beamforming (5G or otherwise), allow me to suggest that the new Xilinx Zynq UltraScale+ RFSoC with its many integrated RF ADCs and DACs be at the top of your technology choices. There is literally no other device like the Zynq UltraScale+ RFSoC. It’s in a category of one.
For more information about the Zynq UltraScale+ RFSoC, see:
The Zynq UltraScale MPSoC is a complex system on chip containing as many as four Arm Cortex-A53 application processors, a dual-core Arm Cortex-R5 real-time processor, a Mali GPU, and of course programmable logic. When it comes to generating our software application, we want to use the A53-based Application Processing Unit (APU) and R5 Real-Time Processing Unit (RPU) cores appropriately. This means we want to use the APU for computationally intensive, high-level applications or virtualization while using the RPU for real-time control and monitoring.
This means the APU will likely be running an operating system such as Linux while the real-time needs are addressed by the RPU using bare-metal software or a simplified OS such as FreeRTOS. Often an overall system solution requires communication between the APU and RPU to achieve the desired solution functionality but communication between different processors running different applications has previously been challenging and ad-hoc with inter-processor communications (IPC) using shared memory, mail boxes, or even networks for IPC. As a result, IPC solutions differed from implementation to implementation and device to device, which increased development time and hence time to market.
This is inefficient engineering.
To best leverage the capabilities of the UltraScale+ Zynq MPSoC, we need an open framework that allows us to abstract device-specific interfaces and enables the implementation of AMP (asymmetric multi-processing) with greater ease across multiple projects.
OpenAMP developed by the Multicore Association provides everything we need to run different operating systems on the APU and RPU. Of course, for OpenAMP to function from processor to processor, we need an abstraction layer that provides device-specific interfaces (e.g. interrupt handlers, memory requests, and device access). The libmetal library provides these for Xilinx devices through several APIs that abstract the processor.
For our Zynq UltraScale+ MPSoC designs, the provided OpenAMP frameworks enable messaging between the master processor and remote processor and lifecycle management of the remote processor using the following structures:
Remoteproc – enables lifecycle management of the remote processor. This includes downloading the application to the remote processor, stopping and starting the remote processor, and system resource allocation as required.
RPMsg - supports IPC between different processors in the system.
OpenAMP remoteproc and RPMsg concepts
For this example, we are going to run Linux on the APU and a bare-metal application on the RPU using RPMsg within the kernel space. When we run the RPMsg from within the kernel space, the remote application lifecycle must be managed by Linux. This means the remote processor application does not run independently. If we develop the RPMsg application to run within the Linux user space, the remote processor can run independently.
To create this example first we need to enable remote-processor support within our Linux build. This requires that we rebuild the petalinux project, customising the kernel and root fs. If you are not familiar with building petalinux you might want to read this blog.
Within our petalinux project, the first thing we need to do is enable the remoteproc driver. Using a terminal application within the petalinux project, issue the command:
petalinux-config -c kernel
This will open the kernel configuration menu. Here we can enable the remote-processor drivers which are located under:
Device Drivers -> Remoteproc drivers
Enabling the Remoteproc drivers
The second step is to include the OpenAMP examples within the file system. Again inside the project, issue the command:
petalinux-config -c rootfs
Within the configuration menu, navigate to Filesystem Packages -> misc and enable the packagegroup-petalinux-openamp:
Enabling the package group
The final step before we can rebuild the petalinux image is to update the device tree. We can find an OpenAMP template dtsi file at the location:
Once the kernel, filesystem, and device tree have been updated, rebuild the petalinux image using the command below:
This will generate an updated Linux build that we can copy onto the boot medium of choice and run on our Zynq UltraScale+ MPSoC design.
Using a terminal connected to our preferred development board (in my case the UltraZed), we can test the OpenAMP examples we included within the Linux file System. There are three examples provided: echo test, matrix multiplication, and proxy server.
I ran the matrix-multiply example because it demonstrates the remote processor performing mathematical calculations.
Using the terminal, I entered the following commands:
Following the on-screen menu and commands, I ran the example which provided the results below:
Executing the Matrix Multiply Example
Matrix Multiply example running
This example shows that the OpenAMP framework is running correctly on the Zynq UltraScale+ MPSoC petalinux build and that we can begin to create our own applications. If you want to run the other two examples refer to UG1186.
If we wish to create our own OpenAMP-based application for the RPU, which uses the kernel space RPMsg, we can create this using the SDK and install the generated elf as an app within petalinux. Although it does mean we need to rebuild the petalinux image again, we will look at how we do this in another blog. There is a lot more for us to explore here.
Note: We have looked at OpenAMP before for the Zynq 7000 series of devices in blogs 169 & 171.
Do you need an extremely powerful yet extremely tiny SOM to implement a challenging embedded design? Enclustra’s credit-card sized Mercury+ XU1 is worth your consideration. It packs a Xilinx Zynq UltraScale+ MPSoC, as much as 8Gbytes of DDR4 SDRAM with ECC, 16Gbytes of eMMC Flash memory, two Gbit Ethernet PHYs, two USB 2.0/3.0 PHYs, and 294 user I/O pins on three 168-pin Hirose FX10 connectors into a mere 74x54mm. That’s a lot of computational horsepower in a teeny, tiny package.
Here’s a block diagram of the Mercury+ XU1 SOM:
By itself, the Zynq UltraScale+ MPSoC gives the SOM a tremendous set of resources including:
A quad-core Arm Cortex-A53 64-bit application processor
A dual-core Arm Cortex-R5 32-bit real-time processor
An Arm Mali-400 MP2 GPU
As many as 747K system logic cells (in a Zynq UltraScale+ ZU15EG MPSoC)
As much as 26.2Mbits of BRAM and 31.5Mbits of UltraRAM (in a Zynq UltraScale+ ZU15EG MPSoC)
As many as 3528 DSP48E2 slices (in a Zynq UltraScale+ ZU15EG MPSoC)
As with many product family designs, Enclustra is able to tailor the price/performance of the Mercury+ XU1 SOM by offering multiple versions based on different pin-compatible members of the Zynq UltraScale+ MPSoC family.
Please contact Enclustra directly for more information about the Mercury+ XU1 SOM.
Today marks the launch of Joshua Montgomery’s Mycroft Mark II open-source Voice Assistant, a hands-free, privacy-oriented smart speaker with a touch screen that also happens to be based on a 6-microphone version of Aaware’s Sound Capture Platform. In fact, according to today’s article on EEWeb written by my good friend and industry gadfly Max Maxfield, Aaware is designing the pcb for the Mycroft Mark II Voice Assistant, which will be based on a Xilinx Zynq UltraScale+ MPSoC according to Max’s article. (It’s billed as a “Xilinx quad-core processor” in the Kickstarter project listing.) According to Max’s article, “This PCB will be designed to support different microphone arrays, displays, and cameras such that it can be used for follow-on products that use the Mycroft open-source voice assistant software stack.”
To repeat: That’s an open-source, consumer-level product based on one of the most advanced MPSoC’s on the market today with at least two 64-bit Arm Cortex-A53 processors and two 32-bit Arm Cortex-R5 processors plus a generous chunk of the industry’s most advanced programmable logic based on Xilinx’s 16nm UltraScale+ technology.
Aaware’s technology starts with an array of six individual microphones. The outputs of these microphones are combined and processed with several Aaware-developed algorithms including acoustic echo cancellation, noise reduction and beamforming that allow the Mycroft Mark II smart speaker to isolate the voice of a speaking human even in noisy environments. (See “Looking to turbocharge Amazon’s Alexa or Google Home? Aaware’s Zynq-based kit is the tool you need.”) The combination of Aaware’s Sound Capture Platform, Mycroft’s Mark II smart speaker open-source code, and the immensely powerful Zynq UltraScale+ MPSoC give you an incredible platform for developing your own end products.
Here’s a 3-minute video demo of the Mycroft Mark II smart speaker’s capabilities:
Pledge $99 on Kickstarter and you’ll get a DIY dev kit that includes the pcbs, an LCD, speakers, and cables but no handsome plastic housing. Pledge $129—thirty bucks more—and you get a built unit in an elegant housing. There are higher pledge levels too.
What’s the risk? As of today, the first day of the pledge campaign, the project is 167% funded, so it’s already a “go.” There are 28 days left to jump in. Also, Mycroft delivered the Mark I speaker, a previous Kickstarter project, last July so the company has a track record of successful Kickstarter project completion.
Korea-based ATUS has just published a 4-minute video of its Zynq-based CNN (convolutional neural network) performing real-time object recognition on a 416x234-pixel dashcam video stream at 46.7fps. Reliable, real-time object recognition is essential to the development of autonomous driving and ADAS systems. ATUS’ design is based on a Xilinx Zynq Z-7020 SoC running a YOLO (you only look once) object-detection system. In the video below, the system recognizes cars, trucks, buses, and pedestrians.
Without a doubt, how we use the Zynq SoC’s XADC in our developments is the one area I receive the most emails about. Just before Christmas, I received two questions that I thought would make for a pretty good blog. The first question asked how to use the AXI streaming output with DMA while the follow up question was about how to output the captured data for further analysis.
The AXI streaming output of the XADC is very useful when we want to create a signal-processing path, which might include filters, FFTs, or our own custom HLS processing block for example. It is also useful if we want to transfer many samples as efficiently as possible to the PS (processing system) memory for output or higher-level processing and decision making within the PS.
The XADC outputs samples on the AXI Stream for each of its channels when it is enabled. The AXI Streaming interface implements the optional TID bus that identifies the channel currently streaming on the AXI stream data output to allow downstream IP cores to correlate the AXI Stream data with an input channel. If we output only a single XADC channel, we do not need to monitor the TID signal information. However if we are outputting multiple channels in the AXI stream, we need to pay attention to the TID information to ensure that our processing blocks use the correct samples.
We need a more complex DMA architecture to support multi-channel XADC operation. The DMA IP Core must have its scatter-gather engine enabled to provide multi-channel support. Of course, this level of complexity is not required if we’re only using a single XADC channel.
For the following example, we will be using a single XADC output channel so that I can demonstrate these concepts simply. I will return to this example in a later blog and expand the design for multiple output channels.
We will use a DMA transfer from the PL to the PS to move XADC samples into the PS memory. However, we cannot connect the XADC and DMA AXI stream interfaces directly. That design won’t function correctly because the DMA IP Core requires the assertion of the optional AXI Stream signal TLast to signal AXI transfer completion. Unfortunately, the XADC’s AXI Streaming output does not contain this signal so we need to add a block to drive the TLast signal between the XADC and DMA.
This interface block is very simple. It should allow the user to define the size of the AXI transfer and it needs to assert the TLast signal once the AXI transfer size is reached. Rather helpfully, an IP block called Tlast_gen that implements this functionality has been provided here. All we need to do is add this IP core to the project IP repository and include it in our design.
We can use an AXI GPIO in the PL to control the size of the transfer dynamically at run time.
Creating the block diagram within Vivado for this example is very simple. In fact, most of the design can be automatically created using Vivado, as shown in this video:
The final block diagram is below. I have uploaded the TCL BD description to my GitHub to enable more detailed inspection and recreation of the project.
Once the project has been completed in Vivado, we can build the design and export it to SDK to create the application.
The software application performs the following functions:
Initialize the AXI GPIO
Initialize and configure the XADC to read a single channel (VP/VN on the Zedboard)
Reset the DMA channel
Loop forever performing the following
Simple DMA transfer from the PL to the PS
Flush the cache at the PS transfer address to ensure that we can see the data in the PS DDR memory
Flushing the cache is very important. If we don’t flush the cache, we will not see the captured ADC values in memory when we examine the PS DDR memory.
We also need to take care when setting the number of ADC samples transferred. The output Stream is 16 bits wide. The tlast_gen block counts these 16-bit (2-byte) transfers while the DMA transfer counts bytes. So we need we need to set the tlast_gen transfer size to be half the number of bytes the DMA is configured to transfer. If we fail to set this correctly, we will only be able to perform the transfer once. Then the DMA will hang because the tlast signal will not be generated.
Generation of the tlast signal
When I ran this software on the ZedBoard, I could see the ADC values changing as the DMA transfer occurred in a memory watch window.
Memory watch window showing the 256-byte DMA capture
Now that we can capture the ADC values in the PS memory, we may want to extract this information for further analysis—especially if we are currently verifying or validating the design. The simplest way to do this is to write out the values over an RS-232 port. However, this can be a slow process and it requires modification to the application software.
Another method we can use is the XSCT Console within the debug view in SDK. Using XSCT we can:
Read out a memory address range
Read out a memory address range as a TCL list
Read out a memory address range to a binary file
The simplest approach is to output a memory address range. To do this, we use the command:
mrd <address> <number of words>
Reading out 256 words from the address 0x00100000
While this technique outputs the data, the output format is not the easiest for an analysis program to work with because it contains both address and data values.
We can obtain a more useful data output by requesting the data to be output as a TCL list using the command:
mrd -value -size h 0x00100000 128
Reading out a TCL list
We can then use this TCL list with a program like Microsoft Excel, MATLAB, or Octave to further analyze the captured signal:
Captured Data analyzed and plotted in Excel.
Finally, if we want to download a binary file containing the memory contents we can use the command:
mrd -bin -file part233.bin 0x00100000 128
We can then read this binary file into an analysis program like Octave or MATLAB or into custom analysis software.
Hopefully by this point I have answered the questions posed and shared the answers more widely, enabling you to get your XADC designs up and running faster.
Embedded-vision applications present many design challenges and a new ElectronicsWeekly.com article written by Michaël Uyttersprot, a Technical Marketing Manager at Avnet Silica, and titled “Bringing embedded vision systems to market” discusses these challenges and solutions.
First, the article enumerates several design challenges including:
Meeting hi-res image-processing demands within cost, size, and power goals
Handling a variety of image-sensor types
Handling multiple image sensors in one camera
Real-time compensation (lens correction) for inexpensive lenses
Distortion correction, depth detection, dynamic range, and sharpness enhancement
Next, the article discusses Avnet Silica’s various design offerings that help engineers quickly develop embedded-vision designs. Products discussed include:
First, the YouTube video demonstrates the IoT design interacting with an app on a mobile phone. Then video takes you step-by-step through the creation process using the Xilinx Vivado development environment.
The YouTuber writes:
“I implemented a web server using Python and bottle framework, which works with another C++ application. The C++ application controls my custom IPs (such as PWM) implemented in PL block. A user can control LEDs, 3-color LEDs, buttons and switches mounted on ZYBO board.”
The YouTube video’s Web page also lists the resources you need to recreate the IoT design:
A quick look at the latest product table for the Xilinx Zynq UltraScale+ RFSoC will tell you that the sample rate for the devices’ RF-class, 14-bit DAC has jumped to 6.554Gsamples/sec, up from 6.4Gsamples/sec. I asked Senior Product Line Manager Wouter Suverkropp about the change and he told me that the increase supports “…an extra level of oversampling for DOCSIS3.1 [designers]. The extra oversampling gives them 3dB processing gain and therefore simplifies the external circuits even further.”
Zynq UltraScale+ RFSoC Conceptual Diagram
For more information about the Zynq UltraScale+ RFSoC, see:
Xilinx has announced availability of automotive-grade Zynq UltraScale+ MPSoCs, enabling development of safety critical ADAS and Autonomous Driving systems. The 4-member Xilinx Automotive XA Zynq UltraScale+ MPSoC family is qualified according to AEC-Q100 test specifications with full ISO 26262 ASIL-C level certification and is ideally suited for various automotive platforms by delivering the right performance/watt while integrating critical functional safety and security features.
The XA Zynq UltraScale+ MPSoC family has been certified to meet ISO 26262 ASIL-C level requirements by Exida, one of the world's leading accredited certification companies specializing in automation and automotive system safety and security. The product includes a "safety island" designed for real-time processing functional safety applications that has been certified to meet ISO 26262 ASIL-C level requirements. In addition to the safety island, the device’s programmable logic can be used to create additional safety circuits tailored for specific applications such as monitors, watchdogs, or functional redundancy. These additional hardware safety blocks effectively allow ASIL decomposition and fault-tolerant architecture designs within a single device.
Bitmain’s Antminer S9 Bitcoin Mining Machine uses a Zynq Z-7010 SoC as a main control processor
The Powered by Xilinx program has just published a 3-minute video containing an interview with Yingfei Li, Bitmain’s Marketing Director, and Wenguo Zhang, Bitmain’s Hardware R&D Director. In the video, Zhang explains that the Zynq Z-7010 solved multiple hidden problems with the company’s previous-generation control panel, thanks to the Zynq SoC’s dual-core Arm Cortex-A9 MPCore processor and the on-chip programmable logic.
Due to the success that Bitmain has had with Xilinx Zynq SoCs in it’s Antminer S9 Bitcoin mining machine, the company is now exploring the use of Xilinx 20nm and 16nm devices (UltraScale and UltraScale+) for future, planned AI platforms and products.
DornerWorks is one of only three Xilinx Premier Alliance Partners in North America offering design services, so the company has more than a little experience using Xilinx All Programmable devices. The company has just launched a new learn-by-email series with “interesting shortcuts or automation tricks related to FPGA development.”
The series is free but you’ll need to provide an email address to receive the lessons. I signed up and immediately received a link to the first lesson titled “Algorithm Implementation and Acceleration on Embedded Systems” written by DornerWorks’ Anthony Boorsma. It contains information about the Xilinx Zynq SoC and Zynq UltraScale+ MPSoC and the Xilinx SDSoC development environment.
Need a tiny-but-powerful SOM for your next embedded project? The iWave iW-RainboW-G28M SOM based on a Xilinx Zynq Z-7007S, Z-7014S, Z-7010, or Z-7020 SoC is certainly tiny—it’s a 67.6x37mm plug-in SoDIMM—and with one or two Arm Cortex A9 MPCore processors, 512Mbytes of DDR3 SDRAM, 512Mbytes of NAND Flash, Gigabit Ethernet and USB 2.0 ports, and an optional WiFi/Bluetooth module it certainly qualifies as powerful and it’s offered in an industrial temp range (-40°C to +85°C).
iWave’s iW-RainboW-G28M SoDIMM SOM is based on any one of four Xilinx Zynq SoCs
iWave’s SOM design obviously takes advantage of the pin compatibility built into the Xilinx Zynq Z-7000S and Z-7000 device families.
You’ll find the announcement for the iW-RainboW-G28M SoDIMM SOM here and the data sheet here.
Please contact iWave directly for more information about the iW-RainboW-G28M SoDIMM SOM.
The recent introduction of the groundbreaking Xilinx Zynq UltraScale+ RFSoC means that there are big changes in store for the way advanced RF and comms systems will be designed. With as many as 16 RF-class ADCs and DACs on one device along with a metric ton or two of other programmable resources, the Zynq UltraScale+ RFSoC makes it possible to start thinking about single-chip Massive MIMO systems. A new EDN.com article by Paul Newson , Hemang Parekh, and Harpinder Matharu titled “Realizing 5G New Radio massive MIMO systems” teases a few details for building such systems and includes this mind-tickling photo:
A sharp eye and keen memory will link that photo to a demo from last October’s Xilinx Showcase demo at the Xilinx facility in Longmont, Colorado. Here’s Xilinx’s Lee Hansen demonstrating a similar system based on the Xilinx Zynq UltraScale+ RFSoC:
For more details about the Zynq UltraScale+ RFSoC, contact your friendly neighborhood Xilinx or Avnet sales rep and see these previous Xcell Daily blog posts:
As a Xilinx employee I would like to contribute on the Pros ... and the Cons.
Let start with the Cons: if there is a processor that suits all your needs in terms of cost/power/performance/IOs just go for it. You won't be able to design the same thing in an FPGA at the same price.
Now if you need some kind of glue logic around (IOs), or your design need multiple processors/GPUs due to the required performance then it's time to talk to your local FPGA dealer (preferably Xilinx distributor!). I will try to answer a few remarks I saw throughout this thread:
FPGA/SoC: In the majority of the FPGA designs I’ve seen during my career at Xilinx, I saw some kind of processor. In pure FPGAs (Virtex/Kintex/Artix/Spartan) it is a soft-processor (Microblaze or Picoblaze) and in a [Zynq SoC or Zynq Ultrascale+ MPSoC], it is a hard processor (dual-core Arm Cortex-A9 [for Zynq SoCs] and Quad-A53+Dual-R5 [for Zynq UltraScale+ MPSoCs]). The choice is now more complex: Processor Only, Processor with an FPGA aside, FPGA only, Integrated Processor/FPGA. The tendency is for the latter due to all the savings incurred: PCB, power, devices, ...
Power: Pure FPGAs are making incredible progress, but if you want really low power in stand-by mode you should look at the Zynq Ultrascale+ MPSoC, which contains many processors and particularly a Power Management Unit that can switch on/off different regions of the processors/programmable logic.
Analog: Since Virtex-5 (2006), Xilinx has included ADCs in its FPGAs, which were limited to internal parameter measurements (Voltage, Temperature, ...). [These ADC blocks are] called the System Monitor. With 7 series (2011) [devices], Xilinx included a dual 1Msamples/sec@12-bits ADC with internal/external measurement capabilities. Lately Xilinx [has] announced very high performance ADCs/DACs integrated into the Zynq UltraScale+ RFSoC: 4Gsamples/sec@12 bits ADCs / 6.5Gsamples/sec@14 bits DACs. Potential applications are Telecom (5G), Cable (DOCSYS) and Radar (Phased-Array).
Security: The bitstream that is stored in the external Flash can be encoded [encrypted]. Decoding [decrypting] is performed within the FPGA during bitstream download. Zynq-7000 SoCs and Zynq Ultrascale+ MPSoCs support encoded [encrypted] bitstreams and secured boot for the processor[s].
Ease of Use: This is the big part of the equation. Customers need to take this into account to get the right time to market. Since 2012 and [with] 7 series devices, Xilinx introduced a new integrated tool called Vivado. Since then a number of features/new tools have been [added to Vivado]:
IP Integrator(IPI): a graphical interface to stitch IPs together and generate bitstreams for complete systems.
Vivado HLS (High Level Synthesis): a tool that allows you to generate HDL code from C/C++ code. This tool will generate IPs that can be handled by IPI.
SDSoC (Software Defined SoC): This tool allows you to design complete systems, software and hardware on a Zynq SoC/Zynq UltraScale+ MPSoC platform. This tool with some plugins will allow you to move part of your C/C++ code to programmable logic (calling Vivado HLS in the background).
SDAccel: an OpenCL (and more) implementation. Not relevant for this thread.
There are also tools related to the MathWorks environment [MATLAB and Simulink]:
System Generator for DSP (aka SysGen): Low-level Simulink library (designed by Xilinx for Xilinx FPGAs). Allows you to program HDL code with blocks. This tools achieves even better performance (clock/area) than HDL code as each block is an instance of an IP (from register, adder, counter, multiplier up to FFT, FIR compiler, and VHLS IP). Bit-true and cycle-true simulations.
Xilinx Model Composer (XMC): available since ... yesterday! Again a Simulink blockset but based on Vivado HLS. Much faster simulations. Bit-true but not cycle-true.
All this to say that FPGA vendors have [expended] tremendous effort to make FPGAs and derivative devices easier to program. You still need a learning curve [but it] is much shorter than it used to be…
One of life’s realities is that the most advanced semiconductor devices—including the Xilinx Zynq UltraScale+ MPSoCs—require multiple voltage supplies for proper operation. That means that you must devote a part of the system engineering effort for a product based on these devices to the power subsystem. Put another way, it’s been a long, long time since the days when a single 5V supply and a bypass capacitor were all you needed. Fortunately, there’s help. Xilinx has a number of vendor partners with ready, device-specific power-management ICs (PMICs). Case in point: Dialog Semiconductor.
If you need to power a Zynq UltraScale+ ZU3EG, ZU7EV, or ZU9CG MPSoC, you’ll want to check out Dialog’s App Note AN-PM-095 titled “Power Solutions for Xilinx Zynq Ultrascale+ ZU9EG.” This document contains reference designs for cost-optimized, PMIC-based circuits specifically targeting the power requirements for Zynq UltraScale+ MPSoCs. According to Xilinx Senior Tech Marketing Manager for Analog and Power Delivery Cathal Murphy, Dialog Semi’s PMICs can be used for low-cost power-supply designs because they generate as many as 12 power rails per device. They also switch at frequencies as high as 3MHz, which means that you can use smaller, less expensive passive devices in the design.
It also means that your overall power-management design will be smaller. For example, Dialog Semi’s power-management ref design for a Zynq UltraScale+ ZU9 MPSoC requires only 1.5in2 of board space—or less for smaller devices in the MPSoC family.
You don’t need to visualize that in your head. Here’s a photo and chart supplied by Cathal:
The Dialog Semi reference design is hidden under the US 25-cent piece.
As the chart notes, these Dialog Semi PMICs have built in power sequencing and can be obtained preprogrammed for Zynq-specific power sequences from distributors such as Avnet.
Cathal also pointed out that Dialog Semi has long been supplying PMICs to the consumer market (think smartphones and tablets) and that the power requirements for Zynq UltraScale+ MPSoCs map well into the existing capabilities of PMICs designed for this market, so you reap the benefit of the company’s volume manufacturing expertise.
Late last week, Avnet announced that it’s now offering the Aaware Sound Capture Platform paired with the MiniZed Zynq SoC development platform as a complete dev kit for voice-based cloud services including Amazon Alexa and Google Home. It’s listed on the Avnet site for $198.99. Avnet and Aaware are demonstrating the new kit at CES 2018, being held this week in Las Vegas. You’ll find them at the Eureka Park booth #50212 in the Sands Expo.
The Aaware Sound Capture Platform coupled to a Zynq-based Avnet MiniZed dev board
The Aaware Sound Capture Platform couples as many as 13 MEMS microphones (you can use fewer in a 1D linear or 2D array) with a Xilinx Zynq Z-7010 SoC to pre-filter incoming voice, delivering a clean voice data stream to local or cloud-based voice recognition hardware. The system has a built-in wake word (like “Alexa” or “OK, Google”) that triggers the unit’s filtering algorithms.
Avnet’s MiniZed dev board is usually based on a single-core Zynq 7Z007S but the MiniZed board included in this kit is actually based on a dual-core Zynq Z-7010 SoC. This board offers you outstanding wireless I/O in the form of a WiFi 802.11b/g/n module and a Bluetooth 4.1 module.
For more information about the Aaware Sound Capture Platform, see:
Adam Taylor has been writing about the use of Xilinx All Programmable devices for image-processing platforms for quite a while and he has wrapped up much of what he knows into a 44-minute video presentation, which appears below. Adam is presenting tomorrow at the Xilinx Developer Forum being held in Frankfurt, Germany.
My good friend Jack Ganssle has long published The Embedded Muse email newsletter and the January 2, 2018 issue (#341!) includes an extensive review of the new $759, Zynq-based Siglent SDS1204X-E 4-channel DSO. Best of all, he’s giving one of these bad boys away at the end of January. (Contest details below.)
Siglent’s Zynq-based SDS1204X-E 4-channel DSO. Photo credit: Jack Ganssle
“I'm blown away by the advanced engineering and quality of manufacturing exhibited by this and some other Chinese test equipment. Steve Leibson wrote a piece about how the unit works, and it's clear that the innovation and technology in this unit are world-class.”
In my own review of the Siglent SDS1202X-E last November, I wrote:
“Siglent’s SDS-1202X-E and SDS-1104X-E DSOs once again highlight the Zynq SoC’s flexibility and capability when used as the foundation for a product family. The Zynq SoC’s unique combination of a dual-core Arm Cortex-A9 MPCore processing subsystem and a good-sized chunk of Xilinx 7 series FPGA permits the development of truly high-performance platforms.”
Last April, I wrote:
“The new SDS1000X-E DSO family illustrates the result of selecting a Zynq SoC as the foundation for a system design. The large number of on-chip resources permit you to think outside of the box when it comes to adding features. Once you’ve selected a Zynq SoC, you no longer need to think about cramming code into the device to add features. With the Zynq SoC’s hardware, software, and I/O programmability, you can instead start thinking up new features that significantly improve the product’s competitive position in your market.
“This is precisely what Siglent’s engineers were able to do. Once the Zynq SoC was included in the design, the designers of this entry-level DSO family were able to think about which high-performance features they wished to migrate to their new design.”
All of that is equally true for the Siglent SDS1204X-E 4-channel DSO, which is further proof of just how good the Zynq SoC is when used as a foundation for an entire product-family.
Now if you want to win the Siglent SDS1204X-E 4-channel DSO that Jack’s giving away at the end of January, you first need to subscribe to The Embedded Muse. The subscription is free, Jack’s an outstanding engineer and a wonderful writer, and he’s not going to sell or even give your email address to anyone else so consider the Embedded Muse subscription a bonus for entering the drawing. After you subscribe, you can enter the contest here. (Note: It’s Jack’s contest, so if you have questions, you need to ask him.)
In the video, KORTIQ CEO Harold Weiss discusses using low-end Zynq SoCs (up to the Z-7020) and Zynq UltraScale+ MPSoCs (the ZU2 and ZU3) to create low-power solutions that deliver “just enough” performance for target industrial applications such as video processing, which requires billions of operations per second. The Zynq SoCs and Zynq UltraScale+ MPSoCs consume far less power than competing GPUs and CPUs while accelerating multiple CNN layers including convolutional layers, pooling layers, fully connected layers, and adding layers.