
Adam Taylor’s MicroZed Chronicles Part 222, UltraZed Edition Part 20: Zynq Watchdogs

by Xilinx Employee, 10-30-2017


By Adam Taylor


As engineers we cannot assume the systems we design will operate as intended 100% of the time. Unexpected events and failures do occur. Depending upon the application, the protections implemented against these unexpected failures vary. For example, a safety-critical system used for an industrial or automotive application requires considerably more failure-mode analysis and much better protection mechanisms than a consumer application. One simple protection mechanism that can be implemented quickly in any application is a watchdog, and this blog post describes the three watchdogs in the Zynq UltraScale+ MPSoC.


Watchdogs are intended to protect the processor against the software application crashing and becoming unresponsive. A watchdog is essentially a counter. In normal operation, the application software prevents this counter from reaching its terminal count by resetting it periodically. Should the application software crash and allow the watchdog to reach its terminal count by failing to reset the counter, the watchdog is said to have expired.
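The behavior just described amounts to a down-counter. A minimal, hardware-free C model (an illustration of the concept, not the Zynq SWDT itself) makes the kick/expire interplay concrete:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal model of a watchdog: a down-counter that the application
 * must reload ("kick") before it reaches its terminal count.
 * Illustration only -- not the actual Zynq SWDT hardware. */
typedef struct {
    uint32_t reload;  /* value loaded on each kick */
    uint32_t count;   /* current counter value */
    bool expired;     /* latched when the terminal count is reached */
} watchdog_t;

void wdt_init(watchdog_t *w, uint32_t reload)
{
    w->reload = reload;
    w->count = reload;
    w->expired = false;
}

/* Called periodically by healthy application software. */
void wdt_kick(watchdog_t *w)
{
    w->count = w->reload;
}

/* Advanced by the (hardware) clock; expiry would raise a reset or NMI. */
void wdt_tick(watchdog_t *w)
{
    if (w->count > 0 && --w->count == 0)
        w->expired = true;
}
```

As long as `wdt_kick` runs more often than `reload` ticks elapse, the watchdog never expires; stop kicking and it latches the expired state.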


Upon expiration, the watchdog generates a processor reset or a non-maskable interrupt so that the system can recover. Because the watchdog protects against software failures, it must be implemented physically in silicon: a software counter could crash along with the very application it is meant to monitor.


Preventing the watchdog’s expiration must be done carefully. We do not want crashes to be masked, resulting in the watchdog’s failure to trigger. It is good practice for the application software to restart the watchdog in the main body of the application. Using an interrupt service routine to restart the watchdog opens the possibility that the main application crashes while the ISR continues to be serviced. In that situation, the watchdog is restarted even though the application has crashed, and no recovery takes place.


The Zynq UltraScale+ MPSoC provides three watchdogs in its processing system (PS):


  • Full Power Domain (FPD) Watchdog protecting the APU and its interconnect
  • Low Power Domain (LPD) Watchdog protecting the RPU and its interconnect
  • Configuration and Security Unit (CSU) Watchdog protecting the CSU and its interconnect


The FPD and LPD watchdogs can be configured to generate a reset, an interrupt, or both should a timeout occur. The CSU watchdog can only generate an interrupt, which can then be actioned by the APU, RPU, or PMU. We use the PMU to manage the effects of a watchdog timeout, configuring it to act via its global registers.


The FPD and LPD watchdogs are clocked either from an internal 100MHz clock or from an external source connected to the MIO or EMIO. The FPD and LPD watchdogs can output a reset signal via MIO or EMIO. This is helpful if we wish to alert functions in the PL that a watchdog timeout has occurred.


Each watchdog is controlled by four registers:


  • Zero Mode Register – This control register enables the watchdog and enables generation of reset and interrupt signals along with the ability to define the reset and interrupt pulse duration.
  • Counter Control Register – This counter configuration register sets the counter reset value and clock pre-scaler.
  • Restart Register – This write-only register takes a specific key to restart the watchdog.
  • Status Register – This read-only register indicates if the watchdog has expired.
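As a rough illustration of how the Counter Control settings determine the timeout period, the expiry time is the reload value times the prescaler divided by the clock rate. The arithmetic below is illustrative only; consult the TRM for the SWDT’s exact counter width and the prescaler options it supports:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative timeout arithmetic for a prescaled down-counter:
 * expiry time = reload * prescale / clock, here returned in
 * milliseconds. Example values only -- check the TRM for the real
 * SWDT counter width and legal prescaler settings. */
uint32_t wdt_timeout_ms(uint32_t clk_hz, uint32_t prescale, uint32_t reload)
{
    return (uint32_t)(((uint64_t)reload * prescale * 1000u) / clk_hz);
}
```

With the internal 100MHz clock, for example, a prescale of 4096 and a reload of 24414 give a timeout of roughly one second.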


To ensure that writes to the Zero Mode, Counter Control, and Restart registers are intentional and not the result of an errant software operation, each write to these registers must include a specific key, different for each register, in the written data word; writes without the correct key have no effect.
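The key mechanism can be sketched in a few lines. The key value below is invented for illustration and is not necessarily the silicon value (the real SWDT uses a different key per register):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of a key-protected register write: the write only takes
 * effect when the expected key accompanies the data. The key value
 * here is illustrative, not the documented silicon value. */
#define DEMO_RESTART_KEY 0x1999u

bool protected_write(uint32_t *reg, uint32_t key, uint32_t value)
{
    if (key != DEMO_RESTART_KEY)
        return false;  /* stray or corrupted write: silently ignored */
    *reg = value;
    return true;
}
```

A runaway pointer or corrupted store is very unlikely to contain the correct key, so accidental writes are rejected.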





Zynq UltraScale+ MPSoC Timer and Watchdog Architecture




To include the FPD or LPD watchdogs in our design, we need to enable them in Vivado. You do so using the I/O configuration tab of the MPSoC customization dialog.






Enabling the SWDT (System Watchdog Timer) in the design




For my example in this blog post, I enabled the external resets and connected them to an ILA within the PL so that I can capture the reset signal when it’s generated.





Simple Test Design for the Zynq UltraScale+ MPSoC watchdog




To configure and use the watchdog, we use SDK and the API defined in xwdtps.h. This API allows us to configure, initialize, start, restart, and stop the selected watchdog with ease.


To use the watchdog to its fullest extent, we also need to configure the PMU to respond to the watchdog error. This is simple and requires writes to the PMU Error Registers (ERROR_SRST_EN_1 and ERROR_EN_1), enabling the PMU to respond to a watchdog timeout. This will also cause the PS to assert its Error OUT signal, which illuminates LED D3 on the UltraZed SOM when the timeout occurs.


For this example, I also used the PMU persistent global registers, which are cleared only by a power-on reset, to keep a record of the fact that a watchdog event has occurred. This count increments each time a watchdog timeout occurs. After the example design has reset six times, the code finally begins restarting the watchdog and the system stays alive.


Because the watchdog causes a reset event each time the processor reboots, we must take care to clear the previously recorded error. Failure to do so will result in a permanent reset cycle because the timeout error is only cleared by a power-on reset or a software action that clears the error. To prevent this from happening, we need to clear the watchdog fault indication in the PMU’s global ERROR_STATUS_1 register at the start of the program.
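The boot-time bookkeeping described above can be modeled in a few lines of C. The struct fields stand in for the PMU’s ERROR_STATUS_1 register and a persistent global register; the names are invented, and the six-reset threshold mirrors the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the boot sequence: clear the latched fault
 * (or the part never leaves reset), bump a persistent counter, and
 * only begin servicing the watchdog after the sixth timeout. */
typedef struct {
    uint32_t error_status;     /* stands in for PMU ERROR_STATUS_1 */
    uint32_t persistent_count; /* survives watchdog resets */
} pmu_model_t;

/* Returns true once the software decides to start servicing the watchdog. */
bool boot_handler(pmu_model_t *pmu)
{
    pmu->error_status = 0;  /* always clear the fault, or we reset forever */
    if (pmu->persistent_count < 6) {
        pmu->persistent_count++;  /* record this timeout... */
        return false;             /* ...and deliberately let the watchdog expire */
    }
    return true;  /* from the seventh boot on: kick the watchdog, stay alive */
}
```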






Reset signal being issued to the programmable logic




Observing the terminal output from my example application, I could see that the timeout was occurring and that the number of occurrences was incrementing. The screen shot below shows the initial error status register and the error mask register. The error status register is shown twice: the first instance shows the watchdog error in the system and the second confirms that it has been cleared. The reset reason also indicates the cause of the reset; in this case it’s a PS reset.







Looping Watchdog and incrementing count




We have touched lightly on the PMU’s error-handling capabilities. We will explore these capabilities more in a future blog.


Meanwhile, you can find the example source code on GitHub.


If you want e-book or hardback versions of previous MicroZed Chronicles blogs, you can get them below.




First Year E Book here

First Year Hardback here.




MicroZed Chronicles hardcopy.jpg 



Second Year E Book here

Second Year Hardback here



MicroZed Chronicles Second Year.jpg 



The Xilinx Zynq UltraScale+ MPSoC incorporates two to four 64-bit Arm Cortex-A53 processors. If you’re trying to figure out why you want those in your next design and how you will harness them, there’s a new, free, 1-hour Webinar just for you on November 2 titled “Shift Gears with a 64-bit ARM-Powered Zynq UltraScale+ MPSoC.”


Topics to be covered:


  • The possibilities that open up with the 64-bit Arm Cortex-A53 processor architecture
  • What you need to understand to fully exploit this technology in your design
  • A thorough look at the programmer’s model enhancements, including exception model, execution states, memory management, and cache coherency


Doulos, a Xilinx Authorized Training Provider and ARM Approved Training Center, is presenting the Webinar.


It’s next week, so register here. Now.




Now that Amazon has made the FPGA-accelerated AWS EC2 F1 instance based on multiple Xilinx Virtex UltraScale+ VU9P FPGAs generally available, the unbound imaginations of some really creative people have been set free. Case in point: the cloud-based FireSim hardware/software co-design environment and simulation platform for designing and simulating systems based on the open-source RocketChip RISC-V processor. The Computer Architecture Research Group at UC Berkeley is developing FireSim. (See “Bringing Datacenter-Scale Hardware-Software Co-design to the Cloud with FireSim and Amazon EC2 F1 Instances.”)


Here’s what FireSim looks like:



FireSim diagram.jpg 



According to the AWS blog cited above, FireSim addresses several hardware/software development challenges. Here are some direct quotes from the AWS blog:


1: “FPGA-based simulations have traditionally been expensive, difficult to deploy, and difficult to reproduce. FireSim uses public-cloud infrastructure like F1, which means no upfront cost to purchase and deploy FPGAs. Developers and researchers can distribute pre-built AMIs and AFIs, as in this public demo (more details later in this post), to make experiments easy to reproduce. FireSim also automates most of the work involved in deploying an FPGA simulation, essentially enabling one-click conversion from new RTL to deploying on an FPGA cluster.”


2: “FPGA-based simulations have traditionally been difficult (and expensive) to scale. Because FireSim uses F1, users can scale out experiments by spinning up additional EC2 instances, rather than spending hundreds of thousands of dollars on large FPGA clusters.”


3: “Finding open hardware to simulate has traditionally been difficult. Finding open hardware that can run real software stacks is even harder. FireSim simulates RocketChip, an open, silicon-proven, RISC-V-based processor platform, and adds peripherals like a NIC and disk device to build up a realistic system. Processors that implement RISC-V automatically support real operating systems (such as Linux) and even support applications like Apache and Memcached. We provide a custom Buildroot-based FireSim Linux distribution that runs on our simulated nodes and includes many popular developer tools.”


4: “Writing hardware in traditional HDLs is time-consuming. Both FireSim and RocketChip use the Chisel HDL, which brings modern programming paradigms to hardware description languages. Chisel greatly simplifies the process of building large, highly parameterized hardware components.”



Using high-speed FPGA technology to simulate hardware isn’t a new idea. Using an inexpensive, cloud-based version of that same FPGA technology to develop hardware and software from your laptop while sitting in a coffee house in Coeur d’Alene, Timbuktu, or Ballarat—now that is something new.





National Instruments’ (NI’s) PXI FlexRIO modular instrumentation product line has been going strong for more than a decade and the company has just revamped its high-speed PXI analog digitizers and PXI digital signal processors by upgrading to high-performance Xilinx Kintex UltraScale FPGAs. According to NI’s press release, “With Kintex UltraScale FPGAs, the new FlexRIO architecture offers more programmable resources than previous Kintex-7-based FlexRIO modules. In addition, the new mezzanine architecture fits both the I/O module and the FPGA back end within a single, integrated 3U PXI module. For high-speed communication with other modules in the chassis, these new FlexRIO modules feature PCIe Gen 3 x8 connectivity for up to 7 GB/s of streaming bandwidth.” (Note that all Kintex UltraScale FPGAs incorporate hardened PCIe Gen1/2/3 cores.)


The change was prompted by a tectonic shift in analog converter interface technology—away from parallel LVDS converter interfaces and towards newer, high-speed serial protocols like JESD204B. As a result, NI’s engineers re-architected their PXI module architecture by splitting it into interface-compatible Mezzanine I/O modules and a trio of back-end FPGA carrier/PCIe-interface cards—NI calls them “FPGA back ends”—based on three pin-compatible Kintex UltraScale FPGAs: the KU035, KU040, and KU060 Kintex UltraScale devices. These devices allow NI to offer three different FPGA resource levels with the same pcb design.



NI FlexRIO Digitizers based on Kintex UltraScale FPGAs.jpg




Modular NI PXI FlexRIO Module based on Xilinx Kintex UltraScale FPGAs




The new products in the revamped FlexRIO product line include:


  • Digitizer Modules – New PXI FlexRIO Digitizers deliver higher-speed sample rates and wide bandwidth without compromising dynamic range. The 16-bit, 400MHz PXIe-5763 and PXIe-5764 operate at 500 Msamples/sec and 1 Gsamples/sec, respectively. (Note: NI previewed these modules earlier this year at NI Week. See “NI’s new FlexRIO module based on Kintex UltraScale FPGAs serves as platform for new modular instruments.”)



NI FlexRIO Digitizers based on Kintex UltraScale FPGAs.jpg 



  • Coprocessor Modules – Kintex UltraScale PXI FlexRIO Coprocessor Modules add real-time signal processing capabilities to a system. A chassis full of these modules delivers high-density computational resources—the most computational density ever offered by NI.


  • Module Development Kit – You can use NI’s LabVIEW to program these PXI modules, and you can use the Xilinx Vivado Project Export feature included with LabVIEW FPGA 2017 to develop, simulate, and compile custom I/O modules for more performance or to meet unique application requirements. (More details available from NI here.)


Here’s a diagram showing you the flow for LabVIEW 2017’s Xilinx Vivado Project Export capability:




 NI Vivado Project Export for LabVIEW 2017.jpg




NI’s use of three Kintex UltraScale FPGA family members to develop these new FlexRIO products illustrates several important benefits associated with FPGA-based design.


First, NI has significantly future-proofed its modular PXIe FlexRIO design by incorporating the flexible, programmable I/O capabilities of Xilinx FPGAs. JESD204B is an increasingly common analog interface standard, easily handled by the Kintex UltraScale FPGAs. In addition, those FPGA I/O pins driving the FlexRIO Mezzanine interface connector are bulletproof, fully programmable, 16.3Gbps GTH/GTY ports that can accommodate a very wide range of high-speed interfaces—nearly anything that NI’s engineering teams might dream up in years to come.


Second, NI is able to offer three different resource levels based on three pin-compatible Kintex UltraScale FPGAs using the same pcb design. This pin compatibility is not accidental. It’s deliberate. Xilinx engineers labored to achieve this compatibility just so that system companies like NI could benefit. NI recognized and took advantage of this pre-planned compatibility. (You'll find that same attention to detail in every one of Xilinx's All Programmable device families.)


Third, NI is able to leverage the same Kintex UltraScale FPGA architecture for its high-speed Digitizer Modules and for its Coprocessor Modules, rather than using two entirely dissimilar chips—one for I/O control in the digitizers and one as the programmable computing engine in the Coprocessor Modules. The same programmable Kintex UltraScale FPGA architecture suits both applications well. The benefit here is the ability to develop common drivers and other common elements for both types of FlexRIO module.



For more information about these new NI FlexRIO products, please contact NI directly.



Note: Xcell Daily has covered the high-speed JESD204B interface repeatedly in the past. See:












Envious of all the cool FPGA-accelerated applications showing up on the Amazon AWS EC2 F1 instance like the Edico Genome DRAGEN Genome Pipeline that set a Guinness World Record last week, the DeePhi ASR (Automatic Speech Recognition) Neural Network announced yesterday, Ryft’s cloud-based search and analysis tools, or NGCodec’s RealityCodec video encoder?






Well, you can shake off that green monster by signing up for the free, live, half-day Amazon AWS EC2 F1 instance and SDAccel dev lab being held at SC17 in Denver on the morning of November 15 at The Studio Loft in the Denver Performing Arts Complex (1400 Curtis Street), just across the street from the Denver Convention Center where SC17 is being held. Xilinx is hosting the lab and technology experts from Xilinx, Amazon Web Services, Ryft, and NGCodec will be available onsite.



Here’s the half-day agenda:


8:00 AM               Doors open, Registration, and Continental Breakfast

9:00 AM               Welcome, Technology Discussion, F1 Developer Use Cases and Demos

9:35 AM               Break

9:45 AM               Hands-on Training Begins

12:00 PM             Developer Lab Concludes



A special guest speaker from Amazon Web Services is also on the agenda.


Lab instruction time includes:


  • Step-by-step instructions to connect to an F1 instance
  • Interactive walkthrough of the SDAccel Development Environment
  • Highlights of SDAccel IDE features: compile, debug, profile
  • Instruction for how to develop a sample framework acceleration app



Seats are necessarily limited for a lab like this, so you might want to get your request in immediately. Where? Here.




One of the several products announced at DeePhi’s event held yesterday in Beijing was the DP-S64 ASR (Automatic Speech Recognition) Acceleration Solution, a Neural Network (NN) application that runs on Amazon’s FPGA-Accelerated AWS EC2 F1 instance. The AWS EC2 F1 instances’ FPGA acceleration is based on multiple Xilinx Virtex UltraScale+ VU9P FPGAs. (See “AWS makes Amazon EC2 F1 instance hardware acceleration based on Xilinx Virtex UltraScale+ FPGAs generally available.”)




DeePhi AWS EC2 F1 ASR Announcement.jpg 



For details and to apply for a free trial of DeePhi’s DP-S64, send an email to deephi-lstm-aws@deephi.tech.


For information about the other NN products announced yesterday by DeePhi, see “DeePhi launches vision-processing dev boards based on a Zynq SoC and Zynq UltraScale+ MPSoC, companion Neural Network (NN) dev kit.”





Yesterday, DeePhi Tech announced several new deep-learning products at an event held in Beijing. All of the products are based on DeePhi’s hardware/software co-design technologies for neural network (NN) and AI development and use deep compression and Xilinx All Programmable technology as a foundation. Central to all of these products is DeePhi’s Deep Neural Network Development Kit (DNNDK), an integrated framework that permits NN development using popular tools and libraries such as Caffe, TensorFlow, and MXNet to develop and compile code for DeePhi’s DPUs (Deep Learning Processor Units). DeePhi has developed two FPGA-based DPUs: the Aristotle Architecture for convolutional neural networks (CNNs) and the Descartes Architecture for Recurrent Neural Networks (RNNs).



DeePhi DNNDK Design Flow.jpg 


DeePhi’s DNNDK Design Flow




DeePhi Aristotle Architecture.jpg 


DeePhi’s Aristotle Architecture




DeePhi Descartes Architecture.jpg 


DeePhi’s Descartes Architecture




DeePhi’s approach to NN development using Xilinx All Programmable technology uniquely targets the company’s carefully optimized, hand-coded DPUs instantiated in programmable logic. In the new book “FPGA Frontiers,” published by Next Platform Press, DeePhi’s co-founder and CEO Song Yao describes using his company’s DPUs: “The algorithm designer doesn’t need to know anything about the underlying hardware. This generates instruction instead of RTL code, which leads to compilation in 60 seconds.” The benefits are rapid development and the ability to concentrate on NN code development rather than the mechanics of FPGA compilation, synthesis, and placement and routing.


Part of yesterday’s announcement included two PCIe boards oriented towards vision processing that implement DeePhi’s Aristotle Architecture DPU. One board, based on the Xilinx Zynq Z-7020 SoC, handles real-time CNN-based video analysis including facial detection for more than 30 faces simultaneously for 1080p, 18fps video using only 2 to 4 watts. The second board, based on a Xilinx Zynq UltraScale+ ZU9 MPSoC, supports simultaneous, real-time video analysis for 16 channels of 1080p, 18fps video and draws only 30 to 60 watts.


DeePhi Zynq SoC PCIe card.jpg 


DeePhi PCIe NN board based on a Xilinx Zynq Z-7020 SoC




DeePhi PCIe NN card based on Zynq UltraScale Plus MPSoC .jpg 


DeePhi PCIe NN board based on a Xilinx Zynq UltraScale+ ZU9 MPSoC




For more information about these products, please contact DeePhi Tech directly.




Here’s a great, very short video from Samtec showing its Firefly QSFP28 copper “flyover” cable carrying four lanes of 28Gbps serial data error free over six feet of copper Twinax cable with plenty of margin. (It’s a 3-foot cable with the signals looped back at one end in a passive loop with no repeaters.)


That’s an amazing feat! The 28Gbps data streams are both transmitted from and received by the same Xilinx VCU118 Dev Kit board, which is based on a Xilinx Virtex UltraScale+ VU9P FPGA. Samtec’s video uses its FMC+ Active Loopback Card for the demo.


If you do this sort of thing on a pcb—even using a super-advanced, super-high-speed, ultra-low-loss pcb dielectric material like Panasonic’s MEGTRON6—your on-board reach is only five to eight inches. So Samtec’s Firefly system really extends your reach for these very high-speed data streams.


There are two main reasons why this high-speed signal-transmission system works:


  1. The superior performance of Samtec’s Firefly copper Twinax interconnect.
  2. The bulletproof GTY SerDes transceivers in the Virtex UltraScale+ VU9P FPGA.



Here’s the 2.5-minute Samtec video demo:





And here’s Samtec’s original 3-minute Firefly/UltraScale+ demo video, mentioned in the above video:






Last month, I described BrainChip’s neuromorphic BrainChip Accelerator PCIe card, a spiking neural network (SNN) based on a  Xilinx Kintex UltraScale FPGA. (See “FPGA-based Neuromorphic Accelerator board recognizes objects 7x more efficiently than GPUs on GoogleNet, AlexNet.”) Now, BrainChip Holdings has announced that it has shipped a BrainChip Accelerator to “a major European automobile manufacturer” for evaluation in ADAS (Advanced Driver Assist Systems) and autonomous driving applications. According to BrainChip’s announcement, the company’s BrainChip Accelerator “provides a 7x improvement in images/second/watt, compared to traditional convolutional neural networks accelerated by Graphics Processing Units (GPUs).”



BrainChip FPGA Board.jpg 


BrainChip Accelerator card with six SNNs instantiated in a Kintex UltraScale FPGA



Do you need to pack the maximum amount of embedded resources into a minimum space? Then take a look at the just-announced Aldec TySOM-3-ZU7EV Embedded Prototyping Board. Aldec’s TySOM-3-ZU7EV board jams a Xilinx Zynq UltraScale+ ZU7EV MPSoC, DDR4 SoDIMM, WiFi, Bluetooth, two FMC sites, HDMI input and output ports, a DisplayPort connection, a QSFP+ optical cage, and a Pmod connector into a dense, 100x144mm footprint. Don’t let the word “Prototyping” in the product name fool you; in my opinion, this board looks like a good production foundation for many embedded designs.


The on-board Zynq UltraScale+ ZU7EV MPSoC itself provides you with a quad-core, 64-bit ARM Cortex-A53 MPCore processor; a dual-core, 32-bit ARM Cortex-R5 MPCore processor capable of lockstep operation that’s especially useful for safety-critical designs; an ARM Mali-400 MP2 GPU; four tri-mode Ethernet ports; two USB3.0 ports; a variety of other useful peripherals; and a very big chunk of Xilinx UltraScale+ programmable logic with 1720 DSP48E2 slices and more than 30Mbits of high-speed, on-chip SRAM in three forms (distributed RAM, BRAM, and UltraRAM). That’s an immense amount of processing capability—ample for a wide variety of embedded designs.


The board also includes a SoDIMM socket capable of holding 4Gbytes of DDR4 SDRAM, a MicroSD card socket capable of holding SD cards with as much as 32Gbytes of Flash memory, 256Mbits of on-board QSPI Flash memory, and 64Kbits of EEPROM.


Here’s a photo of this packed board:




Aldec TySOM-3 Board.jpg


Aldec TySOM-3-ZU7EV Embedded Prototyping Board





Here’s a diagram showing you the board’s diminutive dimensions:




Aldec TySOM-3 Dimension Diagram.jpg 


Aldec TySOM-3-ZU7EV Embedded Prototyping Board Dimension Diagram




And finally, here’s the TySOM-3-ZU7EV board’s block diagram:



Aldec TySOM-3 Board Block Diagram.jpg



Aldec TySOM-3-ZU7EV Embedded Prototyping Board Block Diagram





If you’re visiting Arm TechCon this week in the Santa Clara Convention Center, you’ll find Aldec in the Expo Hall.





Earlier this month, I described Aaware’s $199 Far-Field Development Platform for cloud-based, voice controlled systems such as Amazon’s Alexa and Google Home. (See “13 MEMS microphones plus a Zynq SoC gives services like Amazon’s Alexa and Google Home far-field voice recognition clarity.”) This far-field, sound-capture technology exhibits some sophisticated abilities including:


  1. The ability to cancel interfering noise without a reference signal. (Competing solutions focus on AEC—acoustic echo cancellation—which cancels noise relative to a required audio reference channel.)
  2. Support for non-uniform 1D and 2D microphone array spacing.
  3. Scaling to more microphones for noisier environments.
  4. A one-chip solution for sound capture, multiple wake words, and customer applications. (Today this is a two-chip solution.)
  5. A “software-ready” environment: just log in to the Ubuntu Linux environment and use Aaware’s streaming audio API to begin application development.



Aaware Far Field Development PLatform.jpg 


Aaware’s Far-Field Development Platform




These features are layered on top of a Xilinx Zynq SoC or Zynq UltraScale+ MPSoC and Aaware’s CTO Chris Eddington feels that the Zynq devices provide “well over” 10x the performance of an embedded processor thanks to the devices’ on-chip programmable logic, which offloads a significant amount of processing from the on-chip ARM Cortex processor(s). (Aaware can squeeze its technology into a single-core Zynq Z-7007S SoC and can scale up to larger Zynq SoC and Zynq UltraScale+ MPSoC devices as needed by the customer application.)


Aaware’s algorithm development is based on a unique tool chain:


  • Algorithm development in MathWork’s MATLAB.
  • Hand-coding of an equivalent application in C++.
  • Initial hardware-accelerator synthesis from the C++ specification using Vivado HLS.
  • Use of Xilinx SDSoC to connect the hardware accelerators to the AXI bus and memory.



This tool chain allows Aaware to fit the features it wants into the smallest Zynq Z-7007S SoC or to scale up to the largest Zynq UltraScale+ MPSoC.








The $89 Avnet MiniZed board is such a cool tool for fast-paced embedded design with its on-board Zynq Z-7007S SoC, WiFi, Bluetooth, Arduino shield header, and USB 2.0 host interface. Now, you can get a no-cost head start using this board through a 1-hour Webinar scheduled for November 15 titled “$89 MiniZed: Up Your Game in Embedded Design.”



MiniZed Block Diagram.jpg 


Avnet’s $89 MiniZed Dev Board Block Diagram




For more information about the Avnet MiniZed board, see “Avnet’s $89 MiniZed dev board based on Zynq Z-7000S SoC includes WiFi, Bluetooth, Arduino—and SDSoC!”


Amazon AWS’ re:Invent 2017 takes place in Las Vegas on November 27 through December 1. (Tickets nearly sold out as of today.)  CMP402, a class session during the event, is titled “Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances.” Here’s the verbatim class description:


“The newly introduced Amazon EC2 F1 OpenCL development workflow helps software developers with little to no FPGA experience supercharge their applications with Amazon EC2 F1.  Join us for an overview and demonstration of how to accelerate your C/C++ applications in the cloud using OpenCL with Amazon EC2 F1 instances.  We walk you through the development flow for creating a custom hardware acceleration for a software algorithm.  Attendees get hands-on and creative by optimizing an algorithm for maximum acceleration on Amazon EC2 F1 instances.”


The Amazon AWS EC2 F1 instance gets its acceleration from Xilinx UltraScale+ VU9P FPGAs and the C/C++/OpenCL programming facility is based on SDAccel—Xilinx’s development environment for accelerating cloud-based applications using C, C++, or OpenCL—which became available for the AWS EC2 F1 instance just last month. (See “SDAccel for cloud-based application acceleration now available on Amazon’s AWS EC2 F1 instance.”)


For more information about the Amazon AWS EC2 F1 instance, see:









Last week, Edico Genome and Children’s Hospital of Philadelphia (CHOP) set a new world record in genome processing by analyzing 1,000 pediatric genomes through Edico Genome’s DRAGEN Genome Pipeline in two hours and 25 minutes. They accomplished this feat by harnessing 1,000 Amazon AWS EC2 F1 instances—cloud acceleration instances that are based on Xilinx UltraScale+ VU9P FPGAs. The team analyzed a pediatric cohort of 1,000 whole genomes supplied by the Center for Applied Genomics (CAG), one of CHOP’s specialized Centers of Emphasis.


CAG’s primary goal is translating basic research findings to medical innovations and it’s dedicated to the development of new and better ways to diagnose and treat children affected by rare and complex medical disorders. Results from the rapid analysis will be utilized by the CAG with the hope of uncovering genetic links to common childhood diseases, including asthma, autism, diabetes, epilepsy, obesity, schizophrenia, pediatric cancer and a range of rare diseases. 


“Today’s speed test is a culmination of two years of collaboration between CAG and Edico Genome, including beta-testing their product in our center. We utilize DRAGEN as part of our genomic workflow to achieve our mission of translating basic research findings to medical innovations.  The speed of this technology in processing vast amounts of raw data in a matter of minutes will allow us to deliver actionable results in hours—an important capability as we go forward in realizing the benefits of precision medicine for children and families,” said Hakon Hakonarson, M.D., Ph.D., director of CAG at CHOP.


For more information about Edico Genome’s DRAGEN Genome Pipeline, see:











By Adam Taylor


In all of our previous MicroBlaze soft processor examples, we have used JTAG to download and execute the application code via SDK. You can’t do that in the real world. To deploy a real system, we need the MicroBlaze processor to load and boot its application code from non-volatile memory without our intervention. I thought that showing you how to do this would make for a pretty interesting blog. So using the $99 Digilent Arty A7 board, which is based on a Xilinx Artix-7 FPGA, I am going to demonstrate the steps you need to take using the Arty’s on-board QSPI Flash memory. We will store both the bitstream configuration file and the application software in the QSPI flash.


The QSPI therefore has two roles:


  • Configure the Artix FPGA
  • Store the application software


For the first role, we do not need to include a QSPI interface in our Vivado design. All we need to do is update the Vivado configuration settings to QSPI, provided the QSPI flash memory is connected to the FPGA’s configuration pins. However, we need to include a QSPI interface in our design to interface with the QSPI Flash memory once the FPGA is configured and the MicroBlaze soft processor is instantiated. This addition allows a bootloader to copy the application software from the QSPI Flash memory to the Arty’s DDR SDRAM where it actually executes.
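The bootloader’s copy step can be sketched as a few lines of C. The buffers below stand in for QSPI flash and DDR, the offset is a parameter, and the final jump is shown only as a comment (on hardware the SREC bootloader also parses record framing, which this sketch omits):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Minimal model of the bootloader's second role: copy the application
 * image from an offset within QSPI flash into DDR, then jump to it.
 * The arrays stand in for the real memories. */
void boot_copy(const uint8_t *flash, uint32_t app_offset,
               uint8_t *ddr, uint32_t app_len)
{
    memcpy(ddr, flash + app_offset, app_len); /* image sits past the bitstream */
    /* On hardware the bootloader would then branch to the copied image,
     * e.g. ((void (*)(void))ddr)(); */
}
```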


Of course, this raises the question:


Where does the MicroBlaze bootloader come from?


The process flowchart for developing the bootloader looks like this:






Development Flow Chart



Our objective is to create a single MCS image containing both the FPGA bitstream and the application software that we can burn into the QSPI Flash memory. To do this, we need to perform the following steps in Vivado and SDK:



  • Include a QSPI Interface within the existing Vivado MicroBlaze design


  • Edit the device settings in Vivado to configure the device using Master SPI_4 and to compress the bit file; then build the design and export the hardware to SDK







  • In SDK, create a new application project based on the exported hardware design. During the project-creation dialog, select the SREC SPI Bootloader template. This selection creates an SREC bootloader application that will load the main application code from QSPI Flash memory. Before we build the bootloader ELF, we first need to define the offset from the base address of the QSPI Flash to the location of the application software, which in this case is 0x600000. We define this offset within blconfig.h. We also need to update the SREC Bootloader BSP to identify the correct serial Flash memory device family by reconfiguring the BSP. The family identification numbers are defined within xilisf.h, available under the BSP's libsrc directory. For this application, we select type 5 because the Arty board uses a Micron QSPI device.
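To make the offset step concrete, here is a minimal sketch of the bootloader's configuration header. The macro name follows the SDK template's convention, but treat it as an assumption and check the blconfig.h that SDK generates for you; the value must match the address at which the application S-record image is later written into the QSPI flash.

```c
/* blconfig.h -- minimal sketch of the SREC SPI bootloader's
 * configuration header (macro name assumed from the SDK template).
 * The value must match the offset at which the application S-record
 * image is placed in the QSPI flash. */
#ifndef BLCONFIG_H
#define BLCONFIG_H

/* Offset from the base of the QSPI flash to the application image */
#define FLASH_IMAGE_BASEADDR 0x600000

#endif /* BLCONFIG_H */
```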







  • Now create a second application project in SDK. This is the application we are going to load using the bootloader. For this example, I created a simple “hello world” application, ensuring in the linker file that the program runs from DDR SDRAM. To create the single MCS file, we need the application software to be in the S-record format, which stores binary information as ASCII text. (This format is now 40 years old; it was originally developed for the 8-bit Motorola 6800 microprocessor.) We can use SDK to convert the generated ELF into S-record format. To generate the S-record file in SDK, open a bash shell and enter the following command in the directory that contains the application ELF:


cmd /c mb-objcopy -O srec <app>.elf <app>.srec
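To see the structure of the file this command emits: each S-record line holds a type field, a byte count, an address, the data bytes, and a one's-complement checksum, all as ASCII hex. The helper below is purely illustrative (it is not part of the SDK flow); it recomputes the checksum the bootloader relies on when parsing records:

```c
/* Illustrative only: recompute the checksum byte of a Motorola
 * S-record line. The checksum is the one's complement of the low
 * byte of the sum of every byte after the type field (count,
 * address, and data bytes). */
#include <stdio.h>

/* Parse two ASCII hex characters into a byte value. */
static unsigned hex_byte(const char *p)
{
    unsigned v = 0;
    sscanf(p, "%2x", &v);
    return v;
}

/* rec points at a complete record, e.g. "S00F000068656C6C6F2020..." */
unsigned srec_checksum(const char *rec)
{
    unsigned count = hex_byte(rec + 2); /* byte count field */
    unsigned sum = 0;

    /* Sum the count byte itself plus the address and data bytes
     * (count - 1 of them); the stored checksum byte is excluded. */
    for (unsigned i = 0; i < count; i++)
        sum += hex_byte(rec + 2 + 2 * i);

    return ~sum & 0xFFu; /* one's complement, low byte only */
}
```

For the standard header record S00F000068656C6C6F202020202000003C, the function returns 0x3C, matching the record's final byte.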



  • With the bootloader ELF created, we now need to merge the bitstream with the bootloader ELF in Vivado. This step allows the bootloader to be loaded into and run from the MicroBlaze processor’s local memory following configuration. Because this memory is small, the bootloader application must also be small. If you have trouble fitting the bootloader, enable compiler optimization before resorting to increasing the local memory size.







  • With the merged bit file created and the S-record file available, use Vivado’s hardware manager to add the configuration memory:






  • The final step is to generate the unified MCS file containing the merged bitstream and the application software. When generating this file, we need to remember to load the application software using the same offset used in the SREC bootloader.







Once the file is built and burned into the QSPI memory, we can verify that the MCS file works by connecting the Arty board to a terminal and pressing the board’s reset button. After a few seconds, you should see the Arty board’s “done” LED illuminate, followed by the SREC bootloader’s output in the terminal window. This output should show that the S-records have been loaded from QSPI memory into DDR SDRAM before the program executes.


We now have a working MicroBlaze system that we can deploy in our designs.




The project, as always, is on GitHub.




If you want e-book or hardback versions of previous MicroZed Chronicles blogs, you can get them below.




First Year E Book here

First Year Hardback here.




MicroZed Chronicles hardcopy.jpg 



Second Year E Book here

Second Year Hardback here



MicroZed Chronicles Second Year.jpg 








Earlier this year, Shahriar Shahramian, who regularly posts in-depth electronics videos on his YouTube channel “The Signal Path,” reviewed and then tore down a Rohde & Schwarz Spectrum Rider FPH. This hand-held, battery-powered, portable spectrum analyzer has been on the market for a couple of years now and covers 5kHz to 2, 3, or 4GHz, depending on options. (Shahriar Shahramian is department head for millimeter-Wave ASIC Research at Nokia Bell Labs.)







Rohde & Schwarz Spectrum Rider FPH portable spectrum analyzer




Shahriar’s extensive user review and teardown video of the Spectrum Rider FPH lasts an hour, so here are the salient points in the video:



1:10 – Model comparison of R&S portable spectrum analyzers


5:44 – Instrument hardware overview and GUI characteristics


22:24 – Using the FPH Spectrum Rider to analyze unknown wireless signals including a multi-tone, QPSK modulated signal, AM/FM demodulation analysis and frequency hopping


46:46 – FPH teardown and analysis


55:33 – Overview of the Instrument View remote connection software


57:42 – Concluding remarks




Most important for Xcell Daily, at 48 minutes into the video Shahriar finally cracks the instrument apart and finds a Zynq SoC handling essentially all of the instrument’s RF digital signal processing; I/O control; its clean, responsive, elegant, and well-thought-out user interface based on hard buttons and a pinch-sensitive touch screen; and its Instrument View remote front-panel interface that operates over the USB port or Ethernet.


The Zynq SoC is a perfect fit for such an application. The Zynq Processing System (PS) handles the user interface and general supervision. The Zynq Programmable Logic (PL) section with its high-speed programmable logic and DSP slices handles the instrument’s high-speed control and signal processing.


Here’s a photo clipped from the video. Shahriar is pointing to the Zynq SoC in this photo:






A Zynq SoC manages the overall operation and digital signal processing for the Rohde & Schwarz Spectrum Rider FPH




You should note that this portable, hand-held instrument has an 8-hour battery life despite the rather sophisticated RF and digital electronics. I strongly suspect the high level of integration made possible by the Zynq SoC has something to do with this.


Because of its All Programmable flexibility, the Zynq SoC makes a terrific foundation for entire product families. You can see this here because Rohde & Schwarz has also used the Zynq SoC as the digital heart of its Scope Rider, a multi-function, hand-held, 2/4-channel, 500MHz DSO (as reported by the Powered by Xilinx Web page). The family resemblance with the Spectrum Rider FPH is quite strong:







Rohde & Schwarz Scope Rider



I’ll go out on a limb and guess that there’s a lot of shared code (not to mention case tooling) between the company’s Scope Rider and Spectrum Rider FPH. The Zynq SoC’s PS and PL along with its broadly programmable I/O pins combine to create a very flexible design platform for a product family or multiple product families. That kind of leverage allows you to create an assembly line for new-product development with competition-beating time to market.




Here’s The Signal Path’s YouTube video review of the Rohde & Schwarz 4.0GHz Spectrum Rider FPH:






For more information about the Spectrum Rider FPH and the Scope Rider, please contact Rohde & Schwarz directly.




There’s a new tutorial on Digilent’s Web site that tells you how to use its Digital Discovery high-speed logic analyzer and pattern generator to get a detailed look at the boot sequence of a Zynq Z-7000 SoC by monitoring the SPI interface between the Zynq SoC and its QSPI boot flash. You only need seven connections. The tutorial uses a custom interpreter script to decode the SPI traffic; the script is installed in WaveForms, the analyzer’s PC-based control software, and the tutorial page gives you the script.


The entire boot transfer sequence takes 700msec and the entire boot-sequence acquisition consumes a lot of memory: 268,435,456 samples in this case. The Digital Discovery doesn’t store that many samples—it doesn’t need to do so because it sends the acquired data to the attached PC over the connecting USB cable.







Digilent’s Digital Discovery logic analyzer captures the entire boot sequence for a Zynq Z-7000 SoC over SPI




There’s nothing particularly unusual about a logic analyzer capturing a processor’s boot sequence. However, in this case, Digilent’s Digital Discovery is based on a Xilinx Spartan-6 LX25 FPGA (see “$199.99 Digital Discovery from Digilent implements 800Msample/sec logic analyzer, pattern generator. Powered by Spartan-6”) and it’s monitoring the boot sequence of a Xilinx Zynq SoC. Good tools all around.



Wireless, handheld, 30MHz, $379 IkaScope DSO based on Spartan-3 FPGA gets a distributor: Saelig

by Xilinx Employee ‎10-19-2017 10:02 AM - edited ‎10-19-2017 10:03 AM (2,318 Views)


Last month, I wrote about the wireless, 30MHz, handheld IkaScope WS200 DSO created by IkaLogic. It’s based on a Spartan-3 FPGA, which handles almost all of the instrument’s internal functions and is fast enough to support the WS200 DSO’s 200MHz sample rate. (See “Wireless, €299, 30MHz, 200Msamples/sec DSO fits entirely inside of a probe thanks to the integration power of a Spartan FPGA.”) Late last month, the Saelig Company announced that it was carrying the IkaScope WS200. It lists for $379. (Click here for the Saelig Web page.)






The 30MHz, 200Msamples/sec WS200 IkaScope DSO from IkaLogic




The IkaScope WS200 DSO’s internal 450mAh battery lasts about one week with regular daily use before you need to recharge it. That’s because the instrument uses smart power management to turn itself on when it detects pressure on the probe tip, signaling that you’re taking a measurement, and to turn itself off when it observes that no measurements have been taken for a period of time. The IkaScope's Automatic History feature saves a captured signal waveform when it detects a pressure release on the DSO’s probe tip.


The IkaScope WS200 DSO has no display. Instead, it uses WiFi to communicate with a Windows/Mac/Linux/Android/iOS app running on a PC or tablet. That means you can use finger gestures to control the display if the PC or tablet has a touch screen, another innovation built into this instrument.




If you need a ready-to-plug module for building big systems, take a look at VadaTech’s new AMC580 FPGA Carrier module, which features a Xilinx Zynq UltraScale+ ZU19EG MPSoC connected to 8Gbytes of 64-bit DDR4 SDRAM with ECC, 64Gbytes of Flash memory plus 128Mbytes of boot Flash memory, an SD card slot for even more Flash storage, and two FMC connector sites. The Zynq UltraScale+ MPSoC interfaces directly to the AMC FCLKA, TCLKA-D, FMC DP0-9, and all FMC LA/HA/HB pairs.


The Zynq UltraScale+ ZU19EG MPSoC incorporates a tremendous number of on-chip system resources for your system design including four 64-bit Arm Cortex-A53 application processors running as fast as 1.3GHz; two 32-bit Arm Cortex-R5 lockstep-capable, real-time processors running as fast as 533MHz; an Arm Mali-400 MP2 GPU running as fast as 667MHz; plus 1143K system logic cells, 1968 DSP48E2 slices, 34.6Mbits of BRAM, 36Mbits of UltraRAM, four 100G Ethernet MACs, and four 150G Interlaken ports.


Here’s a photo of VadaTech’s new AMC580 FPGA Carrier module:





The VadaTech AMC580 FPGA Carrier module incorporates a Zynq UltraScale+ ZU19EG MPSoC





And here’s a block diagram of the AMC580 module:






VadaTech AMC580 FPGA Carrier module block diagram




VadaTech provides reference VHDL developed using the Xilinx Vivado Design Suite for testing basic hardware functionality along with a Linux BSP, build scripts, and device drivers for the AMC580.



Please contact VadaTech directly for more information about the AMC580 FPGA Carrier module.




RedZone Robotics’ Solo—a camera-equipped, autonomous sewer-inspection robot—gives operators a detailed, illuminated view of the inside of a sewer pipe by crawling the length of the pipe and recording video of the conditions it finds inside. A crew can deploy a Solo robot in less than 15 minutes and then move to another site to launch yet another Solo robot, thus conducting several inspections simultaneously and cutting the cost per inspection. The treaded robot traverses the pipeline autonomously and then returns to the launch point for retrieval. If the robot encounters an obstruction or blockage, it attempts to negotiate the problem three times before aborting the inspection and returning to its entry point. The robot fits into pipes as small as eight inches in diameter and even operates in pipes that contain some residual waste water.






RedZone Robotics Autonomous Sewer-Pipe Inspection Robot




Justin Starr, RedZone’s VP of Technology, says that the Solo inspection robot uses its on-board Spartan FPGA for image processing and for AI. Image-processing algorithms compensate for lens aberrations and also perform a level of sensor fusion for the robot’s multiple sensors. “Crucial” AI routines in the Spartan FPGA help the robot keep track of where it is in the pipeline and tell the robot what to do when it encounters an obstruction.


Starr also says that RedZone is already evaluating Xilinx Zynq devices to extend the robot’s capabilities. “It’s not enough for the Solo to just grab information about what it sees, but let’s actually look at those images. Let’s have the Solo go through that inspection data in real time and generate a preliminary report of what it saw. It used to be the stuff of science fiction but now it’s becoming reality.”


Want to see the Solo in action? Here’s a 3-minute video:










WP460, “Reducing System BOM Cost with Xilinx's Cost-Optimized Portfolio,” was recently updated. This White Paper contains a wealth of helpful information if you’re trying to cut manufacturing costs in your system. Updates to the White Paper include new information about the new Spartan-7 FPGA family and the Zynq Z-7000S SoCs with single-core Arm Cortex-A9 processors that run at clock rates as fast as 766MHz. Both of these Xilinx device families help drive BOM costs down while still delivering plenty of system performance—just what you’d expect from Xilinx All Programmable devices.


The White Paper also contains additional information about new MicroBlaze soft microprocessor presets, which you can read about in the new MicroBlaze Quick Start Guide. You can drop a 32-bit RISC MicroBlaze soft processor into any current Xilinx device very efficiently. The microcontroller preset version of the MicroBlaze processor consumes fewer than 2000 logic cells, yet runs at a much higher clock rate than most low-cost microcontrollers. There’s plenty of performance there when you need it.





Nutaq has just posted information about an intense demo where four of the company’s PicoSDR 8x8 7MHz-6GHz software-defined radio systems—based on Xilinx Virtex-6 FPGAs—are ganged to create a 32-antenna, massive-MIMO basestation that can communicate wirelessly with six UEs (user equipment systems) simultaneously. The UEs are simulated using three Xilinx ZC702 eval kits based on Zynq Z-7020 SoCs.






Nutaq’s 32-antenna massive-MIMO array




Here’s a Nutaq video of the demo:







All of the signal processing in this demo is performed on a laptop PC using MathWorks’ MATLAB, which generates waveforms for transmission by the simulated UEs and decodes received signals from the PicoSDR 8x8 receiver. As explained in the video, the transmission waveforms are downloaded to the BRAMs in the Zynq SoCs on the ZC702 boards, wirelessly transmitted to the massive-MIMO receiving antenna, captured by the PicoSDR 8x8 systems, and then sent back to MATLAB for decoding into separate UE constellations.


For more information about this demo and the PicoSDR 8x8 systems, contact Nutaq directly.


For more information in Xcell Daily, see “Nutaq LTE demo shows FPGA-based PicoSDR 8x8 working with Matlab LTE System Toolbox running 16-element MIMO.”




Once again, the two LIGO and the Advanced Virgo gravity-wave observatories have jointly recorded a cataclysmic event, but this time it’s two neutron stars colliding instead of a pair of black holes—and this time, the event was visible! A paper published yesterday in Physical Review Letters by the LIGO Scientific Collaboration and the Virgo Collaboration titled “GW170817: Observation of Gravitational Waves from a Binary Neutron Star Inspiral” starts out by saying “On August 17, 2017, the LIGO-Virgo detector network observed a gravitational-wave signal from the inspiral of two low-mass compact objects consistent with a binary neutron star (BNS) merger.” Although this new ripple in spacetime was recorded on August 17, it occurred about 130 million years ago. Two seconds after the gravity waves were detected, NASA’s Fermi satellite and ESA’s INTEGRAL satellite both detected a gamma-ray burst from the same direction. Astrophysicists have theorized that neutron-star collisions would produce gamma-ray bursts and the INTEGRAL and Fermi observations tend to support that theory.


Using triangulation data from these observations, the LIGO team quickly alerted astronomers around the world and told them where to look. Optical, ground-based observatories were able to image the event visibly and for days after. The Hubble Space Telescope was brought to bear on the event and it captured this image:






Neutron Binary Star Merger Captured by the Hubble Space Telescope

Photo credit: NASA and ESA




This event doesn’t just confirm the gravity-wave observatories’ ability to capture heretofore undetected astronomical phenomena; it has also produced new data that tends to confirm some theoretical models of the universe, including the production of heavier elements such as silver, gold, platinum, and uranium in the collision. Here’s Dan Kasen, a theoretical physicist at UC Berkeley, to explain this in a riveting 3-minute video:






And why discuss this huge leap in astrophysics in an Xcell Daily blog post? Because Xilinx Spartan FPGAs are incorporated into the design of the gravity-wave observatories and therefore play a role in the discoveries. For more information about this, see:












Here’s a hot-off-the-camera, 3-minute video showing a demonstration of two ZCU106 dev boards, each based on a Xilinx Zynq UltraScale+ ZU7EV MPSoC with integrated H.265 hardware encoding and decoding. The first ZCU106 board in this demo processes an input stream from a 4K MIPI video camera by encoding it, packetizing it, and then transmitting it over a GigE connection to the second board, which depacketizes, decodes, and displays the video stream on a 4K monitor. Simultaneously, the second board performs the same encoding, packetizing, and transmission of another video stream from a second 4K MIPI camera to the first ZCU106 board, which displays the second video stream on another 4K display.


Note that the integrated H.265 hardware codecs in the Zynq UltraScale+ ZU7EV MPSoC can handle as many as eight simultaneous video streams in both directions.


Here’s the short video demo of this system in action:









For more information about the ZCU106 dev board and the Zynq UltraScale+ EV MPSoCs, contact your friendly, neighborhood Xilinx or Avnet sales representative.




The free, Web-based airhdl register file generator from noasic GmbH, an FPGA design and coaching consultancy and EDA tool developer, uses a simple, online definition tool to create register definitions from which the tool then automatically generates HDL, a C header file, and HTML documentation. The company’s CEO Guy Eschemann has been working with FPGAs for more than 15 years, so he’s got considerable experience in the need to create bulletproof register definitions to achieve design success. His company noasic is a member of the Xilinx Alliance Program.


What’s the big deal about creating registers? Many complex FPGA-based designs now require hundreds or even thousands of registers to operate and monitor a system, and keeping these register definitions straight and properly documented, especially in the face of engineering changes, is a tough challenge for any design team.


The best way I’ve seen to put register definition in context comes from the book “Hardware/Firmware Interface Design: Best Practices for Improving Embedded Systems Development” written by my friend Gary Stringham:



“The hardware/firmware interface is the junction where hardware and firmware meet and communicate with each other. On the hardware side, it is a collection of addressable registers that are accessible to firmware via reads and writes. This includes the interrupts that notify firmware of events. On the firmware side, it is the device drivers or the low-level software that controls the hardware by writing values to registers, interprets the information read from the registers, and responds to interrupt requests from the hardware. Of course, there is more to hardware than registers and interrupts, and more to firmware than device drivers, but this is the interface between the two and where engineers on both sides must be concerned to achieve successful integration.”



The airhdl EDA tool from noasic is designed to help your hardware and software/firmware teams “achieve successful integration” by creating a central nexus for defining the critical, register-based hardware/firmware interface. It uses a single register map (with built-in version control) to create the HDL register definitions, the C header file for firmware’s use of those registers, and the HTML documentation that both the hardware and software/firmware teams will need to properly integrate the defined registers into a design.
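To make the idea concrete, here is the flavor of C header such a generator produces. The register names, offsets, and field macros below are invented for illustration and are not airhdl's actual output format:

```c
/* Illustrative register-map header in the style emitted by register
 * generators. All names, offsets, and field layouts here are invented. */
#include <stdint.h>

#define DEMO_CTRL_OFFSET        0x0000u      /* control register  */
#define DEMO_CTRL_START_MASK    0x00000001u  /* bit 0: start      */

#define DEMO_STATUS_OFFSET      0x0004u      /* status register   */
#define DEMO_STATUS_DONE_MASK   0x00000001u  /* bit 0: done       */
#define DEMO_STATUS_ERR_MASK    0x0000001Eu  /* bits 4:1: errcode */
#define DEMO_STATUS_ERR_SHIFT   1u

/* Extract the error-code field from a raw status-register value. */
static inline uint32_t demo_status_err(uint32_t reg)
{
    return (reg & DEMO_STATUS_ERR_MASK) >> DEMO_STATUS_ERR_SHIFT;
}
```

Because the firmware header, the HDL, and the documentation are all generated from one register map, a renamed field or a moved offset propagates everywhere at once instead of drifting out of sync between teams.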



Here’s an 11-minute video made by noasic to explain the airhdl EDA tool:






Consider signing up for access to this free tool. It will very likely save you a lot of time and effort.



For more information about airhdl, use the links above or contact noasic GmbH directly.



By Adam Taylor


One ongoing area we have been examining is image processing. We’ve looked at the algorithms and how to capture images from different sources. A few weeks ago, we looked at the different methods we could use to receive HDMI data and followed up with an example using an external CODEC (P1 & P2). In this blog we are going to look at using internal IP cores to receive HDMI images in conjunction with the Analog Devices AD8195 HDMI buffer, which equalizes the line. Equalization is critical when using long HDMI cables.





Nexys board, FMC HDMI and the Digilent PYNQ-Z1




To do this I will be using the Digilent FMC HDMI card, which provisions one of its channels with an AD8195. The AD8195 on the FMC HDMI card needs a 3v3 supply, which is not available on the ZedBoard unless I break out my soldering iron. Instead, I turned to my Digilent Nexys Video trainer board, which comes fitted with an Artix-7 FPGA and an FMC connector. This board has built-in support for HDMI RX and TX, but the HDMI RX path on this board supports only 1m of HDMI cable while the AD8195 on the FMC HDMI card supports cable runs of up to 20m—far more useful in many distributed applications. So we’ll add the FMC HDMI card.


First, I instantiated a MicroBlaze soft microprocessor system in the Nexys Video card’s Artix-7 FPGA to control the simple image-processing chain needed for this example. Of course, you can implement the same approach to the logic design that I outline here using a Xilinx Zynq SoC or Zynq UltraScale+ MPSoC. The Zynq PS simply replaces the MicroBlaze.


The hardware design we need to build this system comprises:


  • MicroBlaze controller with local memory, AXI UART, MicroBlaze Interrupt controller, and DDR Memory Interface Generator.
  • DVI2RGB IP core to receive the HDMI signals and convert them to a parallel video format.
  • Video Timing Controller, configured for detection.
  • ILA connected between the VTC and the DVI2RGB cores, used for verification.
  • Clock Wizard used to generate a 200MHz clock, which supplies the DDR MIG and DVI2RGB cores. All other cores are clocked by the MIG UI clock output.
  • Two 3-bit GPIO modules. The first module sets VADJ to 3v3 on the HDMI FMC. The second module enables the AD8195 and provides hot-plug detection.







The final step in this hardware build is to map the interface pins from the AD8195 to the FPGA’s I/O pins through the FMC connector. We’ll use the TMDS_33 SelectIO standard for the HDMI clock and data lanes.


Once the hardware is built, we need to write some simple software to perform the following:



  • Disable the VADJ regulator using pin 2 on the first GPIO port.
  • Set the desired output voltage on VADJ using pins 0 & 1 on the first GPIO port.
  • Enable the VADJ regulator using pin 2 on the first GPIO port.
  • Enable the AD8195 using pin 0 on the second GPIO port.
  • Enable pre-equalization using pin 1 on the second GPIO port.
  • Assert the Hot-Plug Detection signal using pin 2 on the second GPIO port.
  • Read the registers within the VTC to report the modes and status of the video received.
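The sequence above (steps 1 to 6; the VTC register read is omitted) can be sketched as bit operations on the two 3-bit ports. The masks and the 3v3 select encoding below are assumptions for illustration, and the ports are modelled as plain variables so the patterns can be checked off-target; on the hardware these writes would go through the Xilinx GPIO driver instead:

```c
/* Off-target sketch of the GPIO sequence. Pin assignments follow the
 * list above; the 3v3 select encoding is an assumption. On the target,
 * each assignment below would be a write through the Xilinx GPIO
 * driver rather than to a plain variable. */
#include <stdint.h>

/* First GPIO port: VADJ regulator control */
#define VADJ_SEL_MASK 0x3u   /* pins 0-1: output-voltage select   */
#define VADJ_SEL_3V3  0x3u   /* assumed encoding for a 3v3 output */
#define VADJ_EN       0x4u   /* pin 2:   regulator enable         */

/* Second GPIO port: AD8195 control */
#define HDMI_EN       0x1u   /* pin 0: AD8195 enable              */
#define HDMI_PRE_EQ   0x2u   /* pin 1: pre-equalization enable    */
#define HDMI_HPD      0x4u   /* pin 2: hot-plug-detect assert     */

uint32_t vadj_port;  /* stand-in for writes to the first GPIO port  */
uint32_t hdmi_port;  /* stand-in for writes to the second GPIO port */

void configure_fmc_hdmi(void)
{
    vadj_port &= ~VADJ_EN;                                   /* 1. disable VADJ     */
    vadj_port = (vadj_port & ~VADJ_SEL_MASK) | VADJ_SEL_3V3; /* 2. select 3v3       */
    vadj_port |= VADJ_EN;                                    /* 3. enable VADJ      */
    hdmi_port |= HDMI_EN;                                    /* 4. enable AD8195    */
    hdmi_port |= HDMI_PRE_EQ;                                /* 5. pre-equalization */
    hdmi_port |= HDMI_HPD;                                   /* 6. assert HPD       */
}
```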



To test this system, I used a Digilent PYNQ-Z1 board to generate different video modes. The first step in verifying that this interface is working is to use the ILA to check that the pixel clock is received and that its DLL is locked, along with generating horizontal and vertical sync signals and the correct pixel values.


Provided the sync signals and pixel clock are present, the VTC will be able to detect and classify the video mode. The application software will then report the detected mode via the terminal window.
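The reporting step can be as simple as mapping the detected active resolution to a mode name. Here is a minimal lookup, with the mode table truncated to a few common timings; this is a sketch of the idea, not the Xilinx VTC driver API:

```c
/* Map a detected active-video resolution to a display-mode name.
 * Table deliberately truncated to a few common timings. */
#include <stdint.h>
#include <stddef.h>

const char *video_mode_name(uint32_t h_active, uint32_t v_active)
{
    static const struct { uint32_t h, v; const char *name; } modes[] = {
        {  640,  480, "VGA"   },
        {  800,  600, "SVGA"  },
        { 1024,  768, "XGA"   },
        { 1280,  720, "720p"  },
        { 1920, 1080, "1080p" },
    };

    for (size_t i = 0; i < sizeof modes / sizeof modes[0]; i++)
        if (modes[i].h == h_active && modes[i].v == v_active)
            return modes[i].name;

    return "unknown";
}
```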





ILA Connected to the DVI to RGB core monitoring its output







Software running on the Nexys Video detecting SVGA mode (800 pixels by 600 lines)




With the correct video mode being detected by the VTC, we can now configure a VDMA write channel to move the image from the logic into a DDR frame buffer.



You can find the project on GitHub



If you are working with video applications you should also read these:



PL to PS using VDMA

What to do if you have VDMA issues  

Creating a MicroBlaze System Video

Writing MicroBlaze Software  








Previously, I wrote about MicroCore Labs’ cycle-exact MCL65 soft-core version of the venerable 6502 microprocessor, which found its way into many significant microprocessor-based designs in the 1970s, 1980s, and beyond. When I wrote about the core, instantiated in less than 1% of a Xilinx Spartan-7 S50 FPGA, it was operating a Commodore VIC-20 personal computer. (See “Cycle-exact 6502 processor clone fits in 0.77% of a Spartan-7 (and Spartan-3) FPGA, powers VIC-20 PC.”) Now, there’s further evidence of the core’s cycle-exactness: it’s running an Atari 2600 VCS (Video Computer System) and an Apple II personal computer. Both of these machines were introduced in 1977 and both are significant in the context of the MCL65 processor core because both of them rely entirely on instruction-level timing loops for critical functions.


The Atari 2600 VCS intimately ties its video to the NTSC timing of analog TV because it only had 128 bytes (not mega-, not kilo-, just plain old bytes) of RAM. That’s too small for a frame buffer so the processor had to generate a new screen 30 times a second and stay in sync with the TV’s horizontal line drawing and vertical refresh rate. (See Wired.com’s fascinating “Racing the Beam: How Atari 2600's Crazy Hardware Changed Game Design.”)


Here’s a photo of MicroCore Labs’ MCL65 processor core running on a $109 Digilent Arty S7 Spartan-7 FPGA dev board and operating an Atari 2600 VCS:






MicroCore Labs’ MCL65 processor core running an Atari 2600 VCS, using a Spartan-7 FPGA




Then there’s the Apple II personal computer. The legendary Steve Wozniak designed the Disk II floppy disk drive for the Apple II. After studying existing floppy-controller designs based on discrete TTL chips (North Star Computers) and the WD1771 floppy controller, he decided that he could do better with less hardware. His controller design was simpler in terms of hardware, relying on precise, instruction-level processor timing to encode data to be written to the disk and to decode the signals coming off the disk. According to Wikipedia, Wozniak called the Disk II project “my most incredible experience at Apple and the finest job I did,” and credited it and VisiCalc with the Apple II's success.


Here’s a photo of MicroCore Labs’ MCL65 processor core on a $109 Digilent Arty S7 Spartan-7 FPGA dev board operating an Apple II personal computer and booting Apple DOS 3.3 from a Disk II drive:






MicroCore Labs’ MCL65 processor core booting Apple DOS 3.3 from a Disk II drive for an Apple II computer, using a Spartan-7 FPGA




So far, this is all somewhat interesting on an intellectual level but what’s in it for you? How about downloadable source code for MicroCore Labs’ instruction-cycle and bus-cycle accurate MCL65 processor core? Click here.







PrecisionFDA, a cloud-based platform created by the US government’s FDA to benchmark genomic analysis technologies and advance regulatory science, recently conducted a challenge called “Hidden Treasures – Warm Up.” The challenge tested the ability and accuracy of genomic analysis pipelines to find in silico injected variants in FASTQ files from exome sequencing of reference cell lines. (FASTQ files are text-based files used to store biological sequences using ASCII encoding.) PrecisionFDA announced the results of this challenge at the Festival of Genomics, held in Boston on October 4, 2017. There were 86 valid entries from 30 participants. Out of the 86 entries, 45 found all 50 injected variants. Among entries catching all 50 injected variants, Edico Genome’s DRAGEN V2 Germline Pipeline received the highest score in five of the six accuracy metrics: SNP recall and SNP F-score, and indel precision, indel recall and indel F-score. Edico’s entry placed second on the sixth metric: SNP precision.


DRAGEN V2 is the second iteration of the DRAGEN Germline Pipeline, which employs improved sample-specific calibration of the sequencer and sample prep error models, as well as an improved mapper and aligner algorithm. The DRAGEN Genome pipeline is capable of ultra-fast analysis of Next Generation Sequencing (NGS) data, reducing the time required for analyzing a whole genome at 30x coverage from about 10 hours to approximately 22 minutes. This pipeline harnesses the tremendous power of the DRAGEN Bio-IT Platform and includes highly optimized algorithms for mapping, aligning, sorting, duplicate marking, haplotype variant calling, compression and decompression.


Edico Genome’s DRAGEN pipeline is available as an FPGA-based hardware platform for on-site use and in Amazon’s AWS EC2 F1 instance in the AWS cloud. In both cases, the extreme acceleration comes from running the pipeline on Xilinx All Programmable devices. In the case of the AWS EC2 F1 instance, the Xilinx device used is the Virtex UltraScale+ VU9P FPGA.



For more information about Edico Genome’s DRAGEN pipeline, see:









According to this just-posted Xilinx press release: “… Alibaba Cloud, the cloud computing arm of Alibaba Group, has chosen Xilinx for next-generation FPGA acceleration in their public cloud.” Alibaba Cloud is calling this FPGA-accelerated service its “F2” instance and has seen acceleration factors as large as 30x with the F2 instance versus the same applications running on cloud-based CPUs. The company expects its customers to use this new capability to accelerate applications including data analytics, genomics, video processing, and machine learning.


Alibaba Cloud is the largest cloud provider in China and its F2 instances are available as of today.




Avnet’s MiniZed SpeedWay Design Workshops are designed to help you jump-start your embedded design capabilities using Xilinx Zynq Z-7000S All Programmable SoCs, which meld a processing system based on a single-core, 32-bit, 766MHz Arm Cortex-A9 processor with plenty of Xilinx FPGA fabric. Zynq SoCs are just the thing when you need to design high-performance embedded systems or need to use a processor along with some high-speed programmable logic. Even better, these Avnet workshops focus on using the Avnet MiniZed—a compact, $89 dev board packed with huge capabilities including built-in WiFi and Bluetooth wireless connectivity. (For more information about the Avnet MiniZed dev board, see “Avnet’s $89 MiniZed dev board based on Zynq Z-7000S SoC includes WiFi, Bluetooth, Arduino—and SDSoC! Ships in July.”)


These workshops start in November and run through March of next year. There are four full-day workshops in the series:


  • Developing Zynq Software
  • Developing Zynq Hardware
  • Integrating Sensors on MiniZed with PetaLinux
  • A Practical Guide to Getting Started with Xilinx SDSoC


You can mix and match the workshops to meet your educational requirements. Here’s how Avnet presents the workshop sequence:



[Image: Avnet MiniZed SpeedWay workshop sequence]




These workshops are taking place in cities all over North America including Austin, Dallas, Chicago, Montreal, Seattle, and San Jose, CA. All cities will host the first two workshops. Montreal and San Jose will host all four workshops.


A schedule for workshops in other countries has yet to be announced. The Web page says “Coming soon,” so please contact Avnet directly for more information.


Finally, here’s a 1-minute YouTube video with more information about the workshops:





For more information on and to register for the Avnet MiniZed SpeedWay Design Workshops, click here.


About the Author
Be sure to join the Xilinx LinkedIn group to get an update for every new Xcell Daily post!

Steve Leibson is the Director of Strategic Marketing and Business Planning at Xilinx. He started as a system design engineer at HP in the early days of desktop computing, then switched to EDA at Cadnetix, and subsequently became a technical editor for EDN Magazine. He has served as Editor in Chief of EDN Magazine, Embedded Developers Journal, and Microprocessor Report, and has extensive experience in computing, microprocessors, microcontrollers, embedded systems design, design IP, EDA, and programmable logic.