

Every device family in the Xilinx UltraScale+ portfolio (Virtex UltraScale+ FPGAs, Kintex UltraScale+ FPGAs, and Zynq UltraScale+ MPSoCs) has members with 28Gbps-capable GTY transceivers. That’s likely to be important to you as the number and forms of small, 28Gbps interconnect grow. You have many choices in such interconnect these days, including:



  • QSFP28 Optical
  • QSFP28 Direct-Attach Copper
  • SFP28 Optical
  • SFP28 Direct-Attach Copper
  • Samtec FireFly AOC (Active Optical Cable or Twinax ribbon cable)



The following 5.5-minute video demonstrates all of these interfaces operating with 25.78Gbps lanes on Xilinx VCU118 and KCU116 Eval Kits, as concisely explained (as usual) by Xilinx’s “Transceiver Marketing Guy” Martin Gilpatric. Martin also discusses some of the design challenges associated with these high-speed interfaces.


But first, as a teaser, I could not resist showing you the wide-open IBERT eye on the 25.78Gbps Samtec FireFly AOC:




Kintex Ultrascale Firefly AOC IBERT Eye.jpg 




Now that’s a desirable eye.
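As a rough illustration of where a figure like 25.78Gbps comes from, the sketch below assumes that it is the standard 25.78125Gbps Ethernet lane rate: a 25Gbps payload rate scaled by 64b/66b line coding. That assumption and all of the names in the code are mine and are not taken from the video.

```python
from fractions import Fraction

# Assumption: each lane carries 25 Gbps of payload with 64b/66b line coding,
# as in 25G/100G Ethernet. The post only quotes the rounded 25.78 Gbps figure.
PAYLOAD_GBPS = Fraction(25)
CODING_OVERHEAD = Fraction(66, 64)


def lane_line_rate_gbps(payload_gbps=PAYLOAD_GBPS):
    """Serial line rate of one lane after 64b/66b encoding."""
    return payload_gbps * CODING_OVERHEAD


def module_line_rate_gbps(lanes):
    """Aggregate line rate for a module with the given number of lanes."""
    return lanes * lane_line_rate_gbps()


sfp28 = module_line_rate_gbps(1)    # SFP28 is a single-lane form factor
qsfp28 = module_line_rate_gbps(4)   # QSFP28 carries four lanes
print(float(sfp28), float(qsfp28))  # 25.78125 103.125
```

Under that assumption, a four-lane QSFP28 link runs at 103.125Gbps on the wire to deliver 100Gbps of payload.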


Here’s the new video:







Avnet’s $89 MiniZed dev board based on the Xilinx single-core Zynq Z7007S SoC is a bargain with its on-board WiFi and Bluetooth interface modules. Perhaps all you need is a little help getting started?


Wish granted.



Avnet MiniZed 3.jpg


Avnet MiniZed Dev Board




Avnet’s Technical Marketing team has just published a bunch of tutorials to help you spin up quickly. The tutorials are:


  • Build a Zynq Hardware Platform
  • First Application - Hello World
  • Generate and Run Test Applications
  • FSBL and Boot from QSPI
  • PL SPI Controller


You can download those tutorials and a few more things here.


Is Opal Kelly’s SYZYGY the new "Goldilocks" high-speed, mezzanine-board and peripheral I/O standard?

by Xilinx Employee ‎08-14-2017 11:41 AM - edited ‎08-14-2017 11:43 AM (1,353 Views)


Today, Opal Kelly announced the SYZYGY open I/O standard, designed to connect peripheral devices and mezzanine boards to the high-speed SerDes ports of FPGAs and SoCs like the Zynq SoC and Zynq UltraScale+ MPSoC. The SYZYGY standard sits somewhere between the inexpensive, low-performance Pmod interface and the higher-pin-count, high-speed FMC interface standard in both speed and cost. In creating the SYZYGY open standard, Opal Kelly is shooting for a “Goldilocks” I/O standard that’s “just right.”


From today’s announcement:



“The SYZYGY specification defines two connector types: the Standard SYZYGY connector offers up to 28 single-ended, impedance-controlled signals, 16 of which may be defined as differential pairs for interface standards such as LVDS. The Transceiver SYZYGY connector boasts four lanes of Gigabit-class transceiver connections and also offers up to 18 single-ended signals. The Transceiver connector is intended for use with JESD204B data acquisition, SFP+ transceivers, and other devices requiring high-speed SERDES. Both Standard and Transceiver connectors have optional low-cost, high-performance coaxial or twinaxial cable assemblies…


“SYZYGY is intended to fit the sweet spot of peripheral connectivity between the existing low-performance, low pin-count [Digilent] Pmod and the expensive, high-performance ultra-high pin-count of FMC,” said Jake Janovetz, President of Opal Kelly Incorporated. “We envision SYZYGY occupying the space between present standards where pin economy, low cost, and high performance converge. Carriers could offer multiple connectivity options to provide additional flexibility to system implementers…


“Opal Kelly will release their upcoming SYZYGY Compatible carrier, the Hub, an open-source board incorporating a Xilinx Zynq SoC. As an open-source hardware design, the Hub will serve as a SYZYGY reference platform for adopters and manufacturers. Opal Kelly also plans to add SYZYGY support to several future FPGA integration products.”



Here’s a preview photo of Opal Kelly’s SYZYGY-compatible Hub carrier board based on the Xilinx Zynq SoC. The Hub board has four SYZYGY ports on its periphery, marked Ports A, B, C, and D in the photo.



Opal Kelly Syzygy-HubPhoto.jpg


Opal Kelly’s Hub, a SYZYGY-compatible carrier board based on the Xilinx Zynq SoC




The SYZYGY Standard connectors are Samtec 40-pin QTE/QSE connectors with 0.8mm pin pitch. The SYZYGY Transceiver connectors are Samtec 40-pin QTH-DP/QSH-DP connectors with 0.5mm pin pitch. The Transceiver connectors are optimized for differential-pair signaling, as you can see from this drawing:



Opal Kelly SYZYGY Connectors.jpg


SYZYGY Standard (top) and Transceiver (bottom) Connectors



Samtec also provides pre-assembled cables for these connections: EQCD-020 impedance-controlled cables for the Standard SYZYGY interface and HQDP-20 twinax differential cables for the Transceiver SYZYGY interface.


Here’s a graph of throughput versus pin count for various I/O standards and protocols to give you an idea of where the SYZYGY I/O standard fits in the I/O spectrum:



Opal Kelly SYZYGY IO Performance vs Pin Count Chart.jpg 




You’ll find the SYZYGY specification here.


Please contact Opal Kelly directly for more information about the SYZYGY I/O standard and the Zynq-based Hub board.




Adam Taylor’s MicroZed Chronicles, Part 211: Working with HDMI using Zynq SoC and MPSoC Dev Boards

by Xilinx Employee ‎08-14-2017 10:12 AM - edited ‎08-14-2017 10:16 AM (1,580 Views)


By Adam Taylor


Throughout this series we have looked at numerous image-processing applications. One of the simplest ways to capture or display an image in these applications is using HDMI (High Definition Multimedia Interface). HDMI is a proprietary standard that carries HD digital video and audio data. It is a widely adopted standard supported by many video displays and cameras. Its widespread adoption makes HDMI an ideal interface for our Zynq-based image processing applications.


In this blog, I am going to outline the different options for implementing HDMI in our Zynq design, using the different boards we have looked at as targets. This exploration will also provide ideas for us when we are designing our own custom hardware.





Arty Z7 HDMI In and Out Example




The several Zynq boards we have used in this series so far support HDMI using one of two methods: an external or internal CODEC.






Zynq-based boards with HDMI capabilities




If the board uses an external CODEC, it is fitted with an Analog Devices ADV7511 or ADV7611 for transmission and reception respectively. The external HDMI CODEC interfaces directly with the HDMI connector and generates the TMDS (Transition-Minimized Differential Signalling) signals containing the image and audio data.


The interface between the CODEC and Zynq PL (programmable logic) consists of an I2C bus, pixel-data bus, timing sync signals, and the pixel clock. We route the pixel data, sync signals, and clock directly into the PL. We use the I2C controller in the Zynq PS (processing system) for the I2C interface, with the Zynq SoC’s I2C IO signals routed via the EMIO to the PL IO.

To ease integration between CODEC and PL, Avnet has developed two IP cores. They are available on the Avnet GitHub. In the image-processing chain, these IP blocks will be located at the very front and end of the chain if you are using them to interface to external CODECs.


The alternate approach is to use an internal CODEC located within the Zynq PL. In this case, the HDMI TMDS signals are routed directly to the PL IO and the CODEC is implemented with programmable logic. To save having to write such complicated CODECs from scratch, Digilent provides two CODEC IP cores. They are available from the Digilent GitHub. Using these cores within the design means the TMDS signals’ IO standard within the constraints file is set to TMDS_33 IO.


Note: This IO standard is only available on the High Range (HR) IO banks.





 HDMI IP Cores mentioned in the blog




Not every board I have discussed in the MicroZed Chronicles series can both receive and transmit HDMI signals. The ZedBoard and TySOM only provide HDMI output. If we are using one of these boards and the application must receive HDMI signals, we can use the FMC connector with an FMC HDMI input card.


The Digilent FMC-HDMI provides two HDMI inputs with the ability to receive HDMI data using both external and internal CODECs. Of its two inputs, the first uses the ADV7611, while the second equalizes and passes the HDMI signals through to be decoded directly in the Zynq PL.







This provides us with the ability to demonstrate how both internal and external CODECs can be implemented on the ZedBoard when using an external CODEC for image transmission.


However, first I need to get my soldering iron out to fit a jumper to J18 so that we can set VADJ on the ZedBoard to 3.3V, as required for the FMC-HDMI.


We should also remember that while I have predominantly talked about the Zynq SoC here, the same discussion applies to the Zynq UltraScale+ MPSoC, although that device family also incorporates DisplayPort capabilities.



Code is available on GitHub as always.



If you want E book or hardback versions of previous MicroZed chronicle blogs, you can get them below.




  • First Year E Book here
  • First Year Hardback here.



 MicroZed Chronicles hardcopy.jpg



  • Second Year E Book here
  • Second Year Hardback here


MicroZed Chronicles Second Year.jpg 





Two new papers, one about hardware and one about software, describe the Snowflake CNN accelerator and accompanying Torch7 compiler developed by several researchers at Purdue U. The papers are titled “Snowflake: A Model Agnostic Accelerator for Deep Convolutional Neural Networks” (the hardware paper) and “Compiling Deep Learning Models for Custom Hardware Accelerators” (the software paper). The authors of both papers are Andre Xian Ming Chang, Aliasger Zaidy, Vinayak Gokhale, and Eugenio Culurciello from Purdue’s School of Electrical and Computer Engineering and the Weldon School of Biomedical Engineering.


In the abstract, the hardware paper states:



“Snowflake, implemented on a Xilinx Zynq XC7Z045 SoC is capable of achieving a peak throughput of 128 G-ops/s and a measured throughput of 100 frames per second and 120 G-ops/s on the AlexNet CNN model, 36 frames per second and 116 Gops/s on the GoogLeNet CNN model and 17 frames per second and 122 G-ops/s on the ResNet-50 CNN model. To the best of our knowledge, Snowflake is the only implemented system capable of achieving over 91% efficiency on modern CNNs and the only implemented system with GoogLeNet and ResNet as part of the benchmark suite.”



The primary goal of the Snowflake accelerator design was computational efficiency. Efficiency and bandwidth are the two primary factors influencing accelerator throughput. The hardware paper says that the Snowflake accelerator achieves 95% computational efficiency and that it can process networks in real time. Because it is implemented on a Xilinx Zynq Z-7045, power consumption is a miserly 5W according to the software paper, well within the power budget of many embedded systems.


The hardware paper also states:



“Snowflake with 256 processing units was synthesized on Xilinx's Zynq XC7Z045 FPGA. At 250MHz, AlexNet achieved in 93.6 frames/s and 1.2GB/s of off-chip memory bandwidth, and 21.4 frames/s and 2.2GB/s for ResNet18.”



Here’s a block diagram of the Snowflake machine architecture from the software paper, from the micro level on the left to the macro level on the right:



Snowflake CNN Accelerator Block Diagram.jpg 



There’s room for future performance improvement, notes the hardware paper:



“The Zynq XC7Z045 device has 900 MAC units. Scaling Snowflake up by using three compute clusters, we will be able to utilize 768 MAC units. Assuming an accelerator frequency of 250 MHz, Snowflake will be able to achieve a peak performance of 384 G-ops/s. Snowflake can be scaled further on larger FPGAs by increasing the number of clusters.”
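The quoted peak figures can be reproduced with a line of arithmetic. The sketch below assumes that each MAC unit completes one multiply and one add per clock cycle, which counts as two operations. That assumption is inferred from the quoted numbers rather than stated in the excerpts, and the function name is only illustrative.

```python
# Assumption: one multiply plus one add per MAC unit per clock = 2 operations.
# This is inferred from the quoted figures, not stated in the paper excerpts.
OPS_PER_MAC_PER_CYCLE = 2


def peak_gops(mac_units, clock_mhz):
    """Peak throughput in G-ops/s for a given MAC count and clock rate."""
    return mac_units * OPS_PER_MAC_PER_CYCLE * clock_mhz / 1000


print(peak_gops(256, 250))  # 128.0, the current single-cluster design
print(peak_gops(768, 250))  # 384.0, the proposed three-cluster design
```

Both results match the figures in the paper: 128 G-ops/s peak today and 384 G-ops/s with three clusters.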



This is where I point out that a Zynq Z-7100 SoC has 2020 “MAC units” (actually, DSP48E1 slices)—which is a lot more than you find on the Zynq Z-7045 SoC—and the Zynq UltraScale+ ZU15EG MPSoC has 3528 DSP48E2 slices—which is much, much larger still. If speed and throughput are what you desire in a CNN accelerator, then either of these parts would be worthy of consideration for further development.


Feeling like it’s time to go wireless with a Zynq SoC or Zynq UltraScale+ MPSoC? The new, $59 AES-PMOD-MUR-1DX-G WiFi/Bluetooth Pmod from Avnet.com (in stock now) is a fast way to get your Zynq on the air. It’s based on the ultra-small Murata Type 1DX module and it’s compatible with any development board that has access to a dual 2x6 Pmod connection. The product includes example guidelines for Avnet’s ZedBoard and UltraZed-EG Development Kits to demonstrate use of its wireless functions from PetaLinux.



Avnet WiFi Bluetooth Pmod.jpg



Please contact Avnet directly for more information about the AES-PMOD-MUR-1DX-G WiFi/Bluetooth Pmod.






Got a problem getting enough performance out of your processor-based embedded system? You might want to watch a 14-minute video that does a nice job of explaining how you can develop hardware accelerators directly from your C/C++ code using the Xilinx SDK.


How much acceleration do you need? If you don’t know for sure, the video gives an example of an autonomous drone with vision and control tasks that need real-time acceleration.


What are your alternatives? If you need to accelerate your code, you can:


  • Increase your processor’s clock speed, likely requiring a faster speed grade
  • Add more processor cores to share the load
  • Switch to a higher-end, code-compatible processor


Unfortunately, each of these three alternatives increases power consumption. There’s another alternative, however, that can actually cut power consumption. That alternative’s based on the use of Xilinx All Programmable Zynq SoCs and Zynq UltraScale+ MPSoCs. By moving critical code into custom hardware accelerators implemented in the programmable logic incorporated into all Zynq family members, you can relieve the processor of the associated processing burden and actually slow the processor’s clock speed, thus reducing power. It’s quite possible to cut overall power consumption using this approach.


Ah, but implementing these accelerators. Aye, there’s the rub!


It turns out that implementation of these hardware accelerators might not be as difficult as you imagine. The Xilinx SDK is already a C/C++ development environment based on familiar IDE and compiler technology. Under the hood, the SDK serves as a single cockpit for all Zynq-based development work—software and hardware. It also includes SDSoC, the piece of the puzzle you need to convert C/C++ code into acceleration hardware using a 3-step process:



  • Code profiling to identify time-consuming tasks that are critical to real-time operation
  • Software/hardware partitioning based on the profiling data
  • Software/hardware compilation based on the system partitioning
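To get a feel for the partitioning decision in the second step, Amdahl's law gives a quick upper bound on what offloading one profiled function can buy. The sketch is illustrative only: the function names, the profile fractions, and the 20x accelerator speedup are made-up numbers, not figures from the video or from SDSoC.

```python
def overall_speedup(fraction_accelerated, accelerator_speedup):
    """Amdahl's law: whole-program speedup when only part of it gets faster."""
    remaining = 1.0 - fraction_accelerated
    return 1.0 / (remaining + fraction_accelerated / accelerator_speedup)


# Hypothetical profile: share of total run time spent in each function.
profile = {"edge_detect": 0.60, "control_loop": 0.25, "logging": 0.15}

# Assumption: the programmable logic runs an offloaded function 20x faster.
HYPOTHETICAL_ACCEL_SPEEDUP = 20.0

# Rank candidates by the whole-program speedup each would deliver on its own.
for name, share in sorted(profile.items(), key=lambda item: -item[1]):
    gain = overall_speedup(share, HYPOTHETICAL_ACCEL_SPEEDUP)
    print(f"{name}: {gain:.2f}x")
```

The ranking shows why profiling comes first: accelerating a function that accounts for a small share of run time barely moves the overall result, however fast the hardware version is.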


One development platform, SDK, serves all members of the Zynq SoC and Zynq UltraScale+ MPSoC device families, giving you a huge price/performance range.


Here’s that 14-minute video:






A new article on the Avnet Web site titled “Zero Downtime Industrial IoT Using Programmable SoCs” discusses an IP design from SoC-e for the Xilinx Zynq-7000 SoC that provides a flexible solution for equipment that will be connected to HSR (High-availability Seamless Redundancy) rings and PRP (Parallel Redundancy Protocol) LANs. This IP will also work as a network bridge in the context of IEC 61850. The article also discusses a demo of this IP using Avnet’s Zynq-based MicroZed Industry 4.0 Ethernet Kit (MicroZed I4EK).


The first part of the article gives a detailed description of the HSR and PRP protocols. PRP is implemented in the network nodes rather than in the network. PRP nodes have two Ethernet ports and are called Doubly Attached Nodes (DANs). Each DAN Ethernet port connects to one of two independent Ethernet networks (LAN A and LAN B), implementing a dual-redundant network topology. DANs send the same frames over both networks. HSR redundancy relies on sending packets in both directions through a ring network.
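A minimal sketch of the PRP idea follows, assuming that each frame carries a per-source sequence number and that the receiving node delivers the first copy and discards the second. The class and function names are hypothetical, and a real node would age out old sequence numbers rather than keep them all in an unbounded set.

```python
class PrpReceiver:
    """Delivers the first copy of each frame and drops the redundant copy."""

    def __init__(self):
        # (source, sequence number) pairs that have already been delivered.
        self._seen = set()

    def receive(self, source, seq, payload):
        key = (source, seq)
        if key in self._seen:
            return None  # duplicate that arrived over the other LAN
        self._seen.add(key)
        return payload


def send_redundant(receiver, source, seq, payload, lan_a_up=True, lan_b_up=True):
    """Send one frame over LAN A and LAN B; return what the application sees."""
    delivered = []
    for lan_up in (lan_a_up, lan_b_up):
        if lan_up:
            result = receiver.receive(source, seq, payload)
            if result is not None:
                delivered.append(result)
    return delivered


rx = PrpReceiver()
print(send_redundant(rx, "node1", 1, "hello"))                  # ['hello']
print(send_redundant(rx, "node1", 2, "world", lan_a_up=False))  # ['world']
```

With both LANs up, the application sees each frame once. With one LAN down, it still sees each frame once, with no switchover delay.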


Here’s a diagram from the article showing an example of an HSR ring-based network topology:



HSR Ring-Based Topology.jpg 




As with PRP, each HSR network node again has two Ethernet ports and connects to the network as a Doubly Attached Node with HSR (DANH). Packets travel through the nodes in both directions in the HSR ring so a single break anywhere in the network can be detected while packet traffic continues to reach all destinations. The Red Box in the diagram is a DANH adapter for conventional Ethernet equipment that lacks DANH network connectivity. (The PRP protocol also supports the Red Box concept for equipment with only one Ethernet port.)


IIoT systems that implement both HSR and PRP protocols increase network system reliability and provide greater safety. Both of these characteristics are highly desirable in IIoT network systems.


The rest of the article describes SoC-e’s HSR/PRP Switch IP, which is implemented in the PL (programmable logic) of a Zynq SoC contained on the Avnet MicroZed SOM that’s part of the Avnet MicroZed I4EK.


For more information about SoC-e’s HSR/PRP reference design and IP, click here.

Adam Taylor’s MicroZed Chronicles, Part 209: UltraZed Edition Part 16 – The PMU

by Xilinx Employee ‎07-31-2017 10:25 AM - edited ‎07-31-2017 10:26 AM (4,528 Views)


By Adam Taylor


When I introduced the Zynq UltraScale+ MPSoC’s PS (Processing System), I explained that it contained more processors than just those within the APU, RPU, and GPU. It also includes a Platform Management Unit (PMU) and Configuration Security Unit (CSU). The PMU is responsible for initialization during boot and platform monitoring during operation. The CSU is responsible for secure boot once the PMU releases it. The CSU also provides anti-tamper capabilities, key management and storage, and cryptographic acceleration.


Here’s a diagram of the PMU taken from the Zynq UltraScale+ MPSoC Technical Reference Manual:



Zynq MPSoC PMU.jpg 




In this blog post, we are going to take a more in-depth look at the Zynq UltraScale+ MPSoC’s PMU because we really need to understand how it works, how we interact with it, and, if necessary, how we develop our own more complex platform-management program using it.


The PMU has several roles in the operation of the MPSoC. These roles can be summarized as platform management. In more detail, the PMU:


  • Performs initialization during boot. This process uses Sysmon to check the power supplies, initializes the PLLs, runs the Built in Test, and checks for errors before releasing the CSU.
  • Performs power management during operation. The PMU can shut down power domains or individual power islands or enter deep-sleep mode. Once in deep-sleep mode, the PMU is also suspended. Only the PMU can receive a wake-up trigger.
  • Monitors the system for errors and is capable of reporting these both internally and externally via the PS_ERROR_STATUS pin on the dedicated MIO.
  • Provides support for higher-level system management as may be required for functional-safety applications. It is possible for the user to upload their own more advanced PMU software, for instance to run a software test library (STL).


To reliably provide this platform-management function, the PMU has been implemented using triple-modular-redundant processors, which utilize voting along with ECC-protected RAM. The PMU ROM stores the initial application required to perform the initialization functions and transition between power schemes as necessary during operation of the Zynq UltraScale+ MPSoC.


Both the RPU and the APU are defined as power masters, which means they can ask the PMU to power down domains or islands. The APU and RPU can interact with the PMU through its global register space or via inter-processor interrupts (IPI), which is how we control the power-management functions within the PMU. To support this, there is a Xilinx Power Management Framework library that application software can use. It complies with the IEEE P2415 Standard for Unified Hardware Abstraction and Layer for Energy Proportional Electronic Systems. This framework library allows multiple processors running different operating systems to interact with the PMU.


As different power modes are entered and with a mix of powered and unpowered domains, it is possible to isolate powered domains from unpowered domains to prevent crowbarring. This also provides a very useful benefit: the PMU can isolate domains without powering them down. This ability to isolate domains from each other is very useful when it comes to functional-safety and secure applications.


The PMU has general-purpose I/O that is connected to the Zynq UltraScale+ MPSoC’s MIO, used for interfacing with the outside world. There are also a number of dedicated PS I/O pins such as the POR signal and PS_ERROR_STATUS. Internally, there are error signals provided in both directions between the PS and the PL. The PMU also has four periodic interval timers and the ability to process interrupts, which makes it a very capable platform for monitoring device status. A 32-bit AXI interface connects to the low-power domain switch, which allows the PMU to access other PS resources.


As I mentioned earlier, we may wish to execute a more complex platform-management program on the PMU in some applications, for example to run a software test library. To do this, we upload our own program into the PMU RAM, first checking to ensure that the PMU is in its sleep mode. We can command the PMU to enter sleep mode by issuing the inter-processor interrupt to PMU channel 0. This also enables branching to the user code loaded into the RAM once the PMU exits sleep mode.


The PMU is a very important element of the Zynq UltraScale+ MPSoC. We will return to a discussion of the PMU several times in future blog posts, including an example that shows you how to load your own PMU application. However, we first need to examine a few more building blocks and the power management framework.



Code is available on GitHub as always.








Step-by-Step instructions for getting up and running with a Zynq-based Digilent ZYBO trainer board

by Xilinx Employee ‎07-27-2017 04:26 PM - edited ‎07-27-2017 04:29 PM (5,476 Views)


Digilent’s Alex Wong has just published a blog post on RS-online’s DesignSpark with step-by-step instructions for getting your first program (“hello world” of course) running on a Digilent ZYBO trainer board, based on a Xilinx Zynq Z7010 SoC. It doesn’t get any simpler than this.



Digilent ZYBO.jpg



A new and yet-to-be-published 12-page paper submitted to the IEEE Transactions on VLSI Systems titled “Efficient FPGA Mapping of Pipeline SDF FFT Cores” (available on IEEE Xplore) contains a thorough, detailed discussion of ways to map SDF (single-path delay feedback) FFT implementations into the DSP48 slices, programmable logic, and memory resources available on Xilinx All Programmable devices. The paper deals with Virtex-4 and Virtex-6 FPGAs but the authors note: "[7 series] and UltraScale/UltraScale+ FPGAs from Xilinx use virtually the same slice architecture as Virtex-6, so… the results should be very easy to generalize.”


There’s been a steady evolution of the DSP48 slice in the multiple generations of Xilinx All Programmable devices starting with the Virtex-4 FPGAs. The Virtex-4 FPGA series included XtremeDSP (DSP48) slices with 18x18-bit multipliers and 48-bit accumulators; the Virtex-6 FPGAs included DSP48E1 slices with 25x18-bit multipliers and 48-bit accumulators; 7 series FPGAs and the Zynq-7000 SoCs include DSP48E1 slices with 25x18-bit multipliers and 48-bit accumulators; and UltraScale/UltraScale+ devices include DSP48E2 slices with 27x18-bit multipliers and 48-bit accumulators. There have been many additional improvements to Xilinx’s DSP48 slice along the way including steady clock-rate improvements with each new process generation, making the current DSP48E2 slice quite capable.



DSP48E2 Detailed Block Diagram.jpg 



The IEEE paper discusses transformations to map butterflies to fewer LUTs, transformations that efficiently enable the use of DSP48 preadders for implementing butterfly adders, efficient mapping of data and twiddle-factor storage to BRAMs and distributed resources, efficient sharing of twiddle-factor memories for radix-2^k algorithms, and ways to improve timing through retiming and pipelining.
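For readers who want to see what a butterfly and a twiddle factor actually are, here is a textbook radix-2 decimation-in-frequency FFT in software. It is not one of the paper's transformations and not a hardware description. It only shows the add/subtract butterfly and the twiddle multiplication that, as I understand it, each SDF pipeline stage evaluates. The function names are illustrative.

```python
import cmath


def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # Butterfly: the sum feeds the upper branch, the difference feeds the lower.
    upper = [x[i] + x[i + half] for i in range(half)]
    # The lower branch is multiplied by the twiddle factor W_n^i.
    lower = [(x[i] - x[i + half]) * cmath.exp(-2j * cmath.pi * i / n)
             for i in range(half)]
    out = [0j] * n
    out[0::2] = fft_dif(upper)  # even-indexed outputs X[0], X[2], ...
    out[1::2] = fft_dif(lower)  # odd-indexed outputs X[1], X[3], ...
    return out


def dft(x):
    """Direct O(n^2) DFT, used here only as a reference for checking."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


samples = [1, 2, 3, 4, 0, -1, -2, -3]
worst = max(abs(a - b) for a, b in zip(fft_dif(samples), dft(samples)))
print(worst < 1e-9)  # True
```

The adders in `upper` and `lower` correspond to the butterfly adders, and the `cmath.exp` values correspond to the stored twiddle factors, which are the resources the paper maps onto LUTs, DSP48 preadders, and BRAMs.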


It would be unfair of me to reveal any of the paper’s techniques in this blog (you need to get the IEEE paper for that), but I’m not shy about reporting some of the conclusions to tempt you into reading the paper: “The reported implementation results show an increase of throughput per slice of up to 350% and 400% compared with the best previously published work, for Virtex-4 and Virtex-6, respectively. In addition, a higher maximal clock frequency is obtained and fewer memory resources are needed. As the previously best reported results are using exactly the same architecture, and for Virtex-4 exactly the same algorithm, this clearly shows the benefit of the transformations proposed to improve the mapping from architecture to FPGA hardware structure in this paper.”




In a very recent guest blog on Mentor Graphics’ Web site, Dan Driscoll discussed the ways you can isolate functions in secure and safety-critical systems using multiple processors on heterogeneous devices. He uses the Xilinx Zynq UltraScale+ MPSoC as a target example. Driscoll writes:


“In many systems, isolation is achieved by placing software performing safe or secure functions on a separate chip which has been dedicated exclusively for this purpose. While this design paradigm is certainly suitable for many scenarios and definitely achieves strong isolation, it does contrast with the one goal that many companies have today as they update or redesign existing systems – achieving cost savings through system consolidation.


“So in a design with two or three processors with each processor responsible for implementing a different aspect of the system (HMI/connectivity, secure transactions, safety-critical, etc.) consolidation efforts focus on moving to a single chip design where all of these functions reside together on a more capable processor (Figure 1). This consolidation reduces BOM costs, hardware design costs and complexities, and can likely produce a more power-efficient system.


“While this concept definitely looks and sounds straight-forward, the devil is in the details.”



Mentor Safety Security MPSoC Diagram.jpg



This diagram from Driscoll's blog post shows how you can move code running on three separate processors to the multiple processors on the Xilinx Zynq UltraScale+ MPSoC (the ARM Cortex-A53 application processors and the ARM Cortex-R5 real-time processors).


From there, Driscoll’s blog post dives into the details of creating such isolated systems using the Zynq UltraScale+ MPSoC using the following device features:



  • ARM TrustZone
  • Virtualization
  • System MMU
  • Xilinx Memory Protection Unit (XMPU)
  • Xilinx Peripheral Protection Unit (XPPU)


If you are developing safety-critical and secure systems—and who isn’t these days—then Dan’s blog is a fast must-read.


Korea-based ATUS (Across The Universe) has developed a working automotive vision sensor that recognizes objects such as cars and pedestrians using a 17.53 frames/sec video stream. A CNN (convolutional neural network) performs the object recognition on 20 different object classes and runs in the programmable logic fabric on a Xilinx Zynq Z7045 SoC. The programmable logic clocks at 200MHz and the entire design draws 10.432W. That’s about 10% of the power required by CPUs or GPUs to implement this CNN.
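Those two figures combine into an energy-per-frame number, which is often the more useful metric for an embedded vision budget. The short sketch below does that arithmetic and also works out the CPU/GPU power implied by the "about 10%" comparison. The function names are illustrative, and the implied figure is only as precise as that rough percentage.

```python
POWER_W = 10.432        # total design power quoted by ATUS
FRAME_RATE_FPS = 17.53  # video frame rate quoted by ATUS


def energy_per_frame_joules(power_w, fps):
    """Energy spent on each processed frame, in joules."""
    return power_w / fps


def implied_cpu_gpu_power_w(power_w, fraction=0.10):
    """Power implied by the 'about 10%' comparison; a rough estimate only."""
    return power_w / fraction


print(f"{energy_per_frame_joules(POWER_W, FRAME_RATE_FPS):.3f} J/frame")
print(f"{implied_cpu_gpu_power_w(POWER_W):.1f} W")
```

That works out to roughly 0.6 joules per frame, against a CPU or GPU implementation drawing on the order of 100W.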


Here’s a block diagram of the recognition engine in the Zynq SoC’s programmable logic fabric:






ATUS’ Object-Recognition CNN runs in the programmable logic fabric of a Zynq Z7045 SoC




Here’s a short video of ATUS’ Automotive Vision Sensor in action, running on a Xilinx ZC706 eval kit:






Please contact ATUS for more information about their Automotive Vision Sensor.




Samtec introduces 140Gbps Optical FMC module based on two 14Gbps FireFly micro-flyover optical modules

by Xilinx Employee ‎07-24-2017 03:58 PM - edited ‎07-24-2017 04:06 PM (5,190 Views)


Perhaps you’ve been intrigued by Samtec’s FireFly optical micro-flyover communications technology—which is capable of carrying as many as ten 14Gbps serial data streams over low-cost optical ribbon cable—but you didn’t want to try out the technology by designing the FireFly sites into a board. Well, Samtec’s just fixed that problem for you by introducing its VITA 57.1-compliant, 14Gbps FireFly FMC Module and Development Kit with 140Gbps of full-duplex bandwidth distributed over two 10-fiber, multi-mode optical ribbon cables connected to two on-board FireFly optical modules that link an FMC HPC connector to an industry-standard, 24-fiber MTP/MPO optical connector. Snap one into an appropriate Xilinx dev board, for example, and you have an instant 140Gbps, full-duplex optical link.



Samtec 140Gbps FireFly Optical FMC Module.jpg


14Gbps FireFly FMC Module with 140Gbps of full-duplex bandwidth




This type of interconnect pairs well, for example, with the 16.3Gbps GTH transceivers found on various Xilinx All Programmable UltraScale devices including Virtex UltraScale, Kintex UltraScale, and Kintex UltraScale+ FPGAs and Zynq UltraScale+ MPSoCs. Dev boards for these devices feature FMC connectors compatible with the 14Gbps FireFly FMC Module.


For more information about the FireFly Optical Flyover system, see:







Here’s an inspiring short video from National Instruments (NI) where educators from Georgia Tech, the MIT Media Lab, the University of Manchester, and the University of Waterloo discuss using a variety of NI products to inspire students, pique their curiosity, and foster deeper understanding of many complex engineering concepts while thoroughly disguising all of it as fun. Among the NI products shown in this 2.5-minute video are several products based on Xilinx All Programmable devices including:





Here’s the video:







For more information about these Xilinx-based NI products, see:










The latest “Powered by Xilinx” video, published today, provides more detail about the Perrone Robotics MAX development platform for developing all types of autonomous robots—including self-driving cars. MAX is a set of software building blocks for handling many types of sensors and controls needed to develop such robotic platforms.


Perrone Robotics has MAX running on the Xilinx Zynq UltraScale+ MPSoC and relies on that heterogeneous All Programmable device to handle the multiple, high-bit-rate data streams from complex sensor arrays that include lidar systems and multiple video cameras.


Perrone is also starting to develop with the new Xilinx reVISION stack and plans to both enhance the performance of existing algorithms and develop new ones for its MAX development platform.


Here’s the 4-minute video:




By Adam Taylor


Connecting the low-cost, Zynq-based Avnet MiniZed dev board to our WiFi network allows us to transfer files between the board and our development environment quickly and easily. I will use WinSCP (a free, open-source SFTP, FTP, WebDAV, and SCP client for Windows) to do this because it provides an easy-to-use, graphical method to upload files.


If we have power cycled or reset our MiniZed between enabling the WiFi as in the previous blog and connecting to it using WinSCP, we will need to rerun the WiFi setup script. LED D10 on the MiniZed board will be lit when WiFi is enabled. Once we are connected to the WIFI network, we can use WinSCP to remotely log in. In the example below, the MiniZed had the address of on my network. The username and password to log in are the same as for the log in over the terminal. Both are set to root.





Connecting the MiniZed to the WiFi network



Once we are connected with WinSCP, we can see the file systems on both our host computer and the MiniZed. We can simply drag and drop files between the two file systems to upload or download files. It can’t get much easier than this until we develop mind-reading capabilities for Zynq-based products. What we need now is a simple program we can use to prove the setup.






WinSCP connected and able to upload and download files



To create a simple program, we can use SDK targeting the Zynq SoC’s A9 processor. There is also a “hello world” program template that we can use as the basis for our application. Within SDK, create a new project (File -> New -> Application Project) as shown in the images below. This will create a simple “hello world” application.









Opening the helloworld.c file within the created application allows you to customize the program if you so desire.


Once you are happy with your customization, your next step is to build the project, which will result in an ELF file. We can then upload this ELF file to the MiniZed using WinSCP and use the terminal to run our first example. Make sure to set the permissions for read, write, and execute when uploading the file to the MiniZed dev board.
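If the permission settings are unfamiliar, this short sketch shows how "read, write, and execute" map onto the Linux mode bits that the upload has to set. It assumes that only the file's owner (root, in our case) needs access. Whether you also grant access to group and others is your choice, and the helper name is purely illustrative.

```python
import stat


def rwx_mode(owner: bool = True, group: bool = False, other: bool = False) -> int:
    """Build a mode value granting read, write, and execute to the chosen classes."""
    mode = 0
    if owner:
        mode |= stat.S_IRWXU   # 0o700
    if group:
        mode |= stat.S_IRWXG   # 0o070
    if other:
        mode |= stat.S_IRWXO   # 0o007
    return mode


mode = rwx_mode()
print(oct(mode))                           # 0o700
print(stat.filemode(stat.S_IFREG | mode))  # -rwx------
```

The second line of output is the same notation that `ls -l` prints on the board, which makes it easy to confirm that the upload set the bits you intended.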


Within the terminal window, we can then run the application by executing it using the command:




When I executed this command, I received the following in response that proved everything was working as expected:






Once we have this simple program running successfully, we can create more complex programs for various applications, including ones that use the MiniZed dev board’s WiFi networking capabilities. To do this we need to use sockets, which we will explore in a future blog.


Having gotten the MiniZed board’s WiFi up and running and loaded a simple “hello world” program, we now turn our attention to the board’s Bluetooth wireless capability, which we have not yet enabled. We enable Bluetooth networking in a similar manner to WiFi networking. Navigate to /usr/local/bin/ and run the ls command. In the results, you will see not only the script we used to turn on WiFi (WIFI.sh) but also a script file named BT.sh for turning on Bluetooth. Running this script turns on the Bluetooth. You will see the blue LED D9 illuminate on the MiniZed board when Bluetooth is enabled and, within the console window, you will notice that the Bluetooth feature configures and starts scanning. If there is a discoverable Bluetooth device in the area, then you will see it listed. In the example below, you can see my TV.






If we have another device that we wish to communicate with, re-running the same script will cause an issue. Instead, we use the command hcitool scan:







Running this command after making my mobile phone discoverable resulted in my Samsung S6 Edge phone being added to the list of Bluetooth devices.


Now we know how to enable both the WiFi and Bluetooth on the MiniZed board, how to write our own program, and how to upload it to the MiniZed.



In future blogs, we will look at how we can transfer data using both the Bluetooth and WiFi in our applications.



Code is available on Github as always.




If you want E book or hardback versions of previous MicroZed chronicle blogs, you can get them below.




  • First Year E Book here
  • First Year Hardback here.



MicroZed Chronicles hardcopy.jpg 



  • Second Year E Book here
  • Second Year Hardback here


MicroZed Chronicles Second Year.jpg 





Green Hills Software has announced that it has been selected by a US supplier of guidance and navigation equipment for commercial and military aircraft to provide its DO-178B Level A-compliant real-time multicore operating system for next-generation equipment based on the Xilinx Zynq UltraScale+ MPSoC. The Zynq UltraScale+ MPSoC’s four 64-bit ARM Cortex-A53 processor cores will run Green Hills Software's INTEGRITY-178 Time-Variant Unified Multi Processing (tuMP) safety-critical operating system. The Green Hills INTEGRITY-178 tuMP RTOS has been shipping to aerospace and defense customers since 2010. INTEGRITY-178 tuMP supports the ARINC-653 Part 1 Supplement 4 standard (including section 2.2.1 – SMP operation), as well as the Part 2 optional features including Sampling Port Data Structures, Sampling Port Extensions, Memory Blocks, Multiple Module Schedules, and File System, and offers advanced options such as a DO-178B Level A-compliant network stack.


Linux provides a number of mechanisms that allow you to interact with FPGA bitstreams without using complex kernel device drivers. This feature allows you to develop and test your programmable hardware using simple Linux user-space applications. This free training Webinar by Doulos will review your options and examine their pros and cons.


Webinar highlights:


  • Find out how programmable logic is represented in the device tree
  • Explore the Linux user space mechanisms for FPGA I/O
  • Understand the best use of these methods


The concepts will be explored in the context of Xilinx Zynq SoCs and Zynq UltraScale+ MPSoCs.


Doulos’ Senior Member of Technical Staff Simon Goda will present this webinar on August 4 and will moderate live Q&A throughout the broadcast. There are two Webinar broadcasts to accommodate different time zones.


Register here.






Last month, I wrote about Perrone Robotics’ Autonomous Driving Platform based on the Zynq UltraScale+ MPSoC. (See “Linc the autonomous Lincoln MKZ running Perrone Robotics' MAX AI takes a drive in Detroit without puny humans’ help” and “Perrone Robotics builds [Self-Driving] Hot Rod Lincoln with its MAX platform, on a Zynq UltraScale+ MPSoC.”) That platform runs on a controller box supplied by iVeia. In the 2-minute video below, iVeia’s CTO Mike Fawcett describes the attributes of the Zynq UltraScale+ MPSoC that make it a superior implementation technology for autonomous driving platforms. The Zynq UltraScale+ MPSoC’s immense, heterogeneous computing power supplied by six ARM processors plus programmable logic and a few more programmable resources flexibly delivers the monumental amount of processing required for vehicular sensor fusion and real-time perception processing while consuming far less power and generating far less heat than competing solutions involving CPUs or GPUs.


Here’s the video:







If you have read Adam Taylor’s 200+ MicroZed Chronicles here in Xcell Daily, you already know Adam to be an expert in the design of systems based on programmable logic, Zynq SoCs, and Zynq UltraScale+ MPSoCs. But Adam has significant expertise in the development of mission-critical systems based on his aerospace engineering work. He gave a talk about this topic at the recent FPGA Kongress held in Munich and he’s kindly re-recorded his talk, combined with slides in the following 67-minute video.


Adam spends the first two-thirds of the video talking about the design of mission-critical systems in general and then spends the rest of the time talking about Xilinx-specific mission-critical design including the design tools and the Xilinx isolation design flow.


Here’s the video:






Adam Taylor’s MicroZed Chronicles, Part 207: Setting up MiniZed WIFI and Bluetooth Connectivity

by Xilinx Employee ‎07-17-2017 10:34 AM - edited ‎07-18-2017 02:54 PM (6,699 Views)


By Adam Taylor


So far on our journey, every Zynq SoC and Zynq UltraScale+ MPSoC we have looked at has had two or more ARM microprocessor cores. However, I recently received the new Avnet MiniZed dev board based on a Zynq Z-7007S SoC. This board is really exciting for several reasons. It is the first board we’ve looked at that’s based on a single-core Zynq SoC. (It has one ARM Cortex-A9 processor core that runs as fast as 667MHz in the speed grade used on the board.) And like the snickerdoodle, it comes with support for WIFI and Bluetooth. This is a really interesting board and it sells for a mere $89 in the US.


Xilinx designed the single-core Zynq for cost-optimized and low-power applications. In fact, we have been using just a single core for most of the Zynq-based applications we have looked at over this series unless we have been running Linux, exploring AMP, or looking at OpenAMP. One processor core is still sufficient for many, many applications.


The MiniZed dev board itself comes with 512Mbytes of DDR3L SDRAM, 128Mbits of QSPI flash memory, and 8Gbytes of eMMC flash memory. When it comes to connectivity, in addition to the wireless links, the MiniZed board also provides two PMOD interfaces and an Arduino/ChipKit Shield connector. It also provides an on-board temperature sensor, accelerometer and microphone.


Here’s a block diagram of the MiniZed dev board:







Its connectivity, capabilities, and low cost make the MiniZed board ideal for a range of applications, especially those that fall within the Internet of Things and Industrial Internet of Things domains.


When we first open the box, the MiniZed board comes preinstalled with a PetaLinux image loaded into the QSPI flash memory. This has a slight limitation as the QSPI flash is not large enough to host a PetaLinux image with both a Bluetooth and WIFI stack. Only the WIFI stack is present in the out-of-the-box condition. If we want to use the Bluetooth—and we do—we need to connect over WIFI and upload a new boot loader so that we can load a full-featured PetaLinux image from the eMMC flash. The first challenge of course is to connect over WIFI. We will look at that in the rest of this blog.


The first step is to download the demo application files from the MiniZed Website. This provides us with the following files which we need to use in the demo:


  • bin – a boot loader used to load the boot image from eMMC Flash
  • ub – PetaLinux with the Bluetooth stack
  • conf – Configuration file where we can define the WIFI SSID and Key


To correctly set up the MiniZed for our future adventures, we will also need a USB memory stick. On our host PC, we need to open the file wpa_supplicant.conf using a program like notepad++. We then add our network’s SSID and PSK so that the MiniZed can connect to our network. Once this is done, we save the file to the USB memory stick’s root.





Setting the WIFI SSID and PSK
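For readers who have not edited a wpa_supplicant.conf file before, the part that carries the SSID and PSK is a `network={...}` block. The sketch below generates one. It assumes a plain WPA passphrase (8 to 63 characters) and uses placeholder credentials. The file supplied with the MiniZed demo may contain other settings that should be left alone, so treat this as an illustration of the format rather than a replacement for that file.

```python
def network_block(ssid: str, psk: str) -> str:
    """Render a wpa_supplicant network block for a WPA passphrase network."""
    if not 8 <= len(psk) <= 63:
        raise ValueError("a WPA passphrase must be 8 to 63 characters long")
    if '"' in ssid or '"' in psk:
        raise ValueError("double quotes are not handled by this simple sketch")
    return 'network={\n\tssid="%s"\n\tpsk="%s"\n}\n' % (ssid, psk)


# Placeholder credentials, not real ones.
print(network_block("MyNetwork", "my-secret-passphrase"))
```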



The next step is to power on the MiniZed board and connect it to a PC using a USB cable from the computer’s USB port to the MiniZed board’s JTAG UART connector. Connect a second USB cable to the MiniZed’s auxiliary input connector for power. We need to do this because of the USB port’s current supply limits. Without the auxiliary USB cable, we can’t be sure that the memory stick can be powered correctly when plugged into the MiniZed board.


Press the MiniZed board’s reset button and you should see the Linux OS boot in your terminal screen. Once booted, log in with the password and username of root.


We can then plug in the USB memory stick. The MiniZed board should discover the USB memory stick and you should see it reported in the terminal window:





Memory Stick detection




To log on to our WIFI network, we need to copy this file to the eMMC. To do this, we issue the following commands via the terminal.




These commands change the directory to the eMMC and erase anything within it before changing directory to the USB memory stick and listing the contents, where we should see our wpa_supplicant.conf file.


The next step is to copy the file from the USB memory stick to the eMMC and check that it has been copied correctly:




We are then ready to start the WIFI. We can do this by navigating to





You should see this:







Now that we are connected to the WIFI network, we can enable Bluetooth and transfer files wirelessly, which we will look at next time.




Code is available on Github as always.







Today, Mentor announced that it is making the Android 6.0 (Marshmallow) OS available for the Xilinx Zynq UltraScale+ MPSoC, along with pre-compiled binaries for the ZCU102 Eval Kit (currently on sale for half off, or $2495). This Android implementation includes the Mentor Android 6.0 board support package (BSP) built on the Android Open Source Project. The Android software is available for immediate, no-charge download directly from the Mentor Embedded Systems Division.


You need to file a download request with Mentor to get access.

Free Webinar on “Any Media Over Any Network: Streaming and Recording Design Solutions.” July 18

by Xilinx Employee ‎07-11-2017 11:21 AM - edited ‎07-11-2017 12:44 PM (4,999 Views)


On July 18 (that’s one week from today), Xilinx’s Video Systems Architect Alex Luccisano will be presenting a free 1-hour Webinar on streaming media titled “Any Media Over Any Network: Streaming and Recording Solution.” He’ll be discussing key factors such as audio/video codecs, bit rates, formats, and resolutions in the development of OTT (over-the-top) and VOD (video-on-demand) boxes and live-streaming equipment. Alex will also be discussing the Xilinx Zynq UltraScale+ MPSoC EV device family, which incorporates a hardened, multi-stream AVC/HEVC simultaneous encode/decode block that supports UHD-4Kp60. That’s the kind of integration you need to develop highly differentiated pro AV and broadcast products (and any other streaming-media or recording products) that stand well above the competition.


Register here.



By Adam Taylor


With the Vivado design for the Lepton thermal imaging IR camera built and the breakout board connected to the Arty Z7 dev board, the next step is to update the software so that we can receive and display images. To do this, we can also use the HDMI-out example software application as this correctly configures the board’s VDMA output. We just need to remove the test-pattern generation function and write our own FLIR control and output function as a replacement.


This function must do the following:



  1. Configure the I2C and SPI peripherals using the XIICPS and XSPI APIs provided when we generated the BSP. To ensure that we can communicate with the Lepton camera, we need to set the I2C address to 0x2A and configure the SPI for CPOL=1, CPHA=1, and master operation.
  2. Once we can communicate over the I2C interface, we need to read the status register to determine that the Lepton camera module is ready. If the camera is correctly configured and ready when we read this register, the Lepton camera will respond with 0x06.
  3. With the camera module ready, we can read out an image and store it within memory. To do this we execute several SPI reads.
  4. Having captured the image, we can move the stored image into the memory location being accessed by VDMA to display the image.



To successfully read out an image from the Lepton camera, we need to synchronize the VoSPI output to find the start of the first line in the image. The camera outputs each line as a 160-byte block (Lepton 2) or two 160-byte blocks (Lepton 3), and each block has a 2-byte ID and a 2-byte CRC. We can use this ID to capture the image, identify valid frames, and store them within the image store.
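To make the synchronization step more concrete, here is a host-side Python sketch of how one VoSPI block might be decoded and slotted into a frame. It is a model of the idea, not the code running on the Zynq. It assumes that the 2-byte ID and 2-byte CRC sit in front of a 160-byte payload of 80 big-endian 16-bit pixels, that the low 12 bits of the ID carry the line number, and that an ID with bits 11:8 all set marks a packet to discard. Please verify all three assumptions against the FLIR datasheet for your module.

```python
import struct

HEADER_BYTES = 4       # 2-byte ID followed by 2-byte CRC
PIXELS_PER_LINE = 80   # Lepton 2 line width
PAYLOAD_BYTES = PIXELS_PER_LINE * 2


def parse_vospi_packet(packet: bytes):
    """Return (line_number, crc, pixels), or None for a discard packet."""
    if len(packet) != HEADER_BYTES + PAYLOAD_BYTES:
        raise ValueError("unexpected packet length")
    packet_id, crc = struct.unpack(">HH", packet[:HEADER_BYTES])
    if (packet_id & 0x0F00) == 0x0F00:   # assumed discard marker
        return None
    line_number = packet_id & 0x0FFF     # assumed line-number field
    pixels = list(struct.unpack(">%dH" % PIXELS_PER_LINE, packet[HEADER_BYTES:]))
    return line_number, crc, pixels


def assemble_frame(packets, lines: int = 60):
    """Place valid packets into a frame by line number; None marks a missing line."""
    frame = [None] * lines
    for packet in packets:
        parsed = parse_vospi_packet(packet)
        if parsed is not None and parsed[0] < lines:
            frame[parsed[0]] = parsed[2]
    return frame
```

The CRC is returned but not checked here. A real capture loop would also want to validate it and to resynchronize if line numbers arrive out of order.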


Performing steps 3 and 4 allows us to increase the size of the displayed image on the screen. The Lepton 2 camera used for this example has a resolution of only 80 horizontal pixels by 60 vertical pixels. This image would be very small when displayed on a monitor, so we can easily scale the image to 640x480 pixels by outputting each pixel and line eight times. This scaling produces a larger image that’s easier to recognize on the screen, although it may look a little blocky.
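The scaling described above is plain pixel replication. Here is a small Python model of it, using a list of rows to stand in for the image store. The function name and the test pattern are illustrative only.

```python
def upscale(image, factor: int = 8):
    """Repeat every pixel and every line `factor` times (nearest-neighbour scaling)."""
    scaled = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Copy the widened row so each output line is an independent list.
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


# An 80x60 test pattern standing in for a captured Lepton 2 frame.
frame = [[x + 80 * y for x in range(80)] for y in range(60)]
big = upscale(frame)
print(len(big[0]), len(big))  # 640 480
```

Because each source pixel becomes an 8x8 block of identical values, the result is easy to recognize on the screen but shows the blockiness mentioned above.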


However, scaling alone will not present the best image quality as we have not configured the Lepton camera module to optimize its output. To get the best quality image from the camera module, we need to use the I2C command interface to enable parameters such as AGC (automatic gain control), which affects the contrast and quality of the output image, and flat-field correction to remove pixel-to-pixel variation.


To write or read back the camera module’s settings, we need to create a data structure as shown below and write that structure into the camera module. If we are reading back the settings, we can then perform an I2C read to read back the parameters. Each 16-bit access requires two 8-bit commands:


  • Write to the command word at address 0x00 0x04.
  • Generate the command-word data formed from the Module ID, Command ID, Type, and Protection bit. This word informs the camera module which element of the camera we wish to address and if we wish to read, write, or execute the command.
  • Write the number of words to be read or written to the data-length register at address 0x00 0x06.
  • Write the data words to the data registers at addresses 0x00 0x08 to 0x00 0x26.
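The steps above can be sketched in Python to show how a command word might be packed and how one 16-bit register write breaks down into 8-bit transfers. The register addresses come from the list above. The bit positions of the Module ID, Command ID, Type, and Protection fields, the Type encoding, and the most-significant-byte-first ordering are all assumptions made for illustration, so confirm them against FLIR's interface documentation before relying on them.

```python
COMMAND_REG = 0x0004       # command word register
LENGTH_REG = 0x0006        # data-length register
DATA_REG_FIRST = 0x0008    # first data register
DATA_REG_LAST = 0x0026     # last data register

GET, SET, RUN = 0, 1, 2    # assumed encoding of the 2-bit Type field


def command_word(module_id: int, command_id: int, cmd_type: int,
                 protected: bool = False) -> int:
    """Pack the fields. Assumed layout: bit 14 protection, bits 11:8 module,
    bits 7:2 command, bits 1:0 type."""
    if not (0 <= module_id < 16 and 0 <= command_id < 64 and 0 <= cmd_type < 4):
        raise ValueError("field out of range")
    word = (module_id << 8) | (command_id << 2) | cmd_type
    if protected:
        word |= 1 << 14
    return word


def register_write_bytes(address: int, value: int) -> bytes:
    """16-bit address then 16-bit value, most significant byte first (assumed)."""
    return bytes([address >> 8, address & 0xFF, value >> 8, value & 0xFF])


def data_register_count() -> int:
    """Number of 16-bit data registers between the first and last addresses."""
    return (DATA_REG_LAST - DATA_REG_FIRST) // 2 + 1


# Hypothetical module 1, command 0, issued as a "set".
word = command_word(module_id=1, command_id=0, cmd_type=SET)
print(hex(word))                                     # 0x101
print(register_write_bytes(COMMAND_REG, word).hex())  # 00040101
print(data_register_count())                          # 16
```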


This sequence allows us to configure the Lepton camera so that we get the best performance. When I executed the updated program, I could see the image that appears below, which shows me taking a picture of the monitor screen. The image has been scaled up by a factor of 8.






Now that we have this image on the screen, I want to integrate this design with the MiniZed dev board and configure the camera to transfer images over a wireless network.


Code is available on Github as always.









YouTube teardown and repair videos are one way to uncover previously unknown applications of Xilinx components. Today I found a new-this-week video teardown and repair of a non-operational Agilent (now Keysight) 53152A 46GHz Microwave Frequency Counter that uncovers a pair of vintage Xilinx parts: an XC3042A FPGA (with 144 CLBs!) and an XC9572 CPLD with 72 macrocells. Xilinx introduced the XC3000 FPGA family in 1987 and the XC9500 CPLD family appeared a few years later, so these are pretty vintage examples of early programmable logic devices from Xilinx—still doing their job in an instrument that Agilent introduced in 2001. That’s a long-lived product!


Looking at the pcb, I’d say that the XC3042A FPGA implements a significant portion of the microwave counter’s instrumentation logic and the XC9572 CPLD connects all of the LSI components to the adjacent microprocessor. (These days, I could easily see replacing the board’s entire upper-left quadrant’s worth of ICs with one Zynq SoC. Less board space, far more microprocessor and programmable-logic performance.)





Agilent 53152A Microwave Frequency Counter main board with vintage Xilinx FPGA and CPLD

(seen in the upper left)




A quick look at the Keysight Web site shows that the 53152A counter is still available and lists for $19,386. If you look at it through my Xilinx eyeglasses, that’s a pretty good multiplier for a couple of Xilinx parts that were designed twenty to thirty years ago. The 42-minute video was made by YouTube video makers Shahriar and Shayan Shahramian, who call their Patreon-supported channel “The Signal Path.” In this video, Shahriar manages to repair this 53152A counter that he bought for about $650—so he’s doing pretty well too. (Spoiler alert: The problem's not with the Xilinx devices, they still work fine.)


I really enjoy watching well-made repair videos of high-end equipment and always learn a trick or two. This video by The Signal Path is indeed well made and takes its time explaining each step and why they’re performed. Other than telling you that the Xilinx parts are not the problem, I’m not going to give the plot away (other than to say, as usual, that the butler did it).  



Here’s the video:






I’m sure you realize that Xilinx continues to sell FPGAs—otherwise, you wouldn’t be on this blog page—although today’s FPGAs are a lot more advanced with many hundreds of thousands or millions of logic cells. But perhaps you didn’t realize that Xilinx is still in the CPLD business. If that’s a surprise to you, I recommend that you read this: “Shockingly Cool Again: Low-power Xilinx CoolRunner-II CPLDs get new product brief.” Xilinx CoolRunner-II CPLDs aren't offered in the 72-macrocell size, but you can get them with as many as 384 macrocells if you wish.




Freelance documentary cameraman, editor, and producer/director Johnnie Behiri has just published a terrific YouTube video interview with Sebastian Pichelhofer, acting Project Leader of Apertus’ Zynq-based AXIOM Beta open-source 4K video camera project. (See below for more Xcell Daily blog posts about the AXIOM open-source 4K video camera.) This video is remarkable in the amount of valuable information packed into its brief, 20-minute duration. This video is part of Behiri’s cinema5D Web site and there’s a companion article here.


First, Sebastian explains the concept behind the project: develop a camera with features in demand, with development funded by a crowd-funding campaign. Share the complete, open-source design with community members so they can hack it, improve it, and give these improvements and modifications back to the community.


A significant piece of news: Sebastian says that the legendary Magic Lantern team (a group dedicated to adding substantial enhancements to the video and imaging capabilities of Canon dSLR cameras) is now on board as the project’s color-science experts. As a result, says Sebastian, the camera will be able to feature push-button selection of different “film stocks.” Film selection was one way for filmmakers to control the “look” of a film, back in the days when they used film. These days, camera companies devote a lot of effort to developing their own “film” look, but the AXIOM Beta project wants flexibility in this area, as in all other areas. I think Sebastian’s discussion of camera color science from end to end is excellent and worth watching just by itself.


I also appreciated Sebastian’s very interesting discussion of the challenges associated with a crowd-funded, open-source project like the AXIOM Beta. The heart of the AXIOM Beta camera’s electronic package is a Zynq SoC on an Avnet MicroZed SOM and that design choice strongly supports the project team’s desire to be able to quickly incorporate the latest innovations and design changes into systems in the manufacturing process. Here's a photo captured from the YouTube interview:




AXIOM Beta Interview Screen Capture 1.jpg 




At 14:45 in the video, Sebastian attempts to provide an explanation of the FPGA-based video pipeline’s advantages in the AXIOM Beta 4K camera—to the non-technical Behiri (and his mother). It’s not easy to contrast the sequential processing of microprocessor-based image and video processing with the same processing on highly parallel programmable logic when talking to a non-engineering audience, especially on the fly in a video interview, but Sebastian makes a valiant effort. By the way, the image-processing pipeline’s design is also open-source and Sebastian suggests that some brave souls may well want to develop improvements.


At the end of the interview, there are some video clips captured by a working AXIOM prototype. Of course, they are cat videos. How appropriate for YouTube! The videos are nearly monochrome (gray cats) and shot wide open so there’s a very shallow depth of field, but they still look very good to me for prototype footage. (There are additional video clips including HDR clips here on Apertus’ Web site.)




Here’s the cinema5D video interview:







Additional Xcell Daily posts about the AXIOM Beta open-source video camera project:








By Adam Taylor


Over this blog series, I have written a lot about how we can use the Zynq SoC in our designs. We have looked at a range of different applications and especially at embedded vision. However, some systems use a pure FPGA approach to embedded vision, as opposed to an SoC like the members in the Zynq family, so in this blog we are going to look at how we can get a simple HDMI input-and-output video-processing system using the Artix-7 XC7A200T FPGA on the Nexys Video Artix-7 FPGA Trainer Board. (The Artix-7 A200T is the largest member of the Artix-7 FPGA device family.)


Here’s a photo of my Nexys Video Artix-7 FPGA Trainer Board:






Nexys Video Artix-7 FPGA Trainer Board




For those not familiar with it, the Nexys Video Trainer Board is intended for teaching and prototyping video and vision applications. As such, it comes with the following I/O and peripheral interfaces designed to support video reception, processing, and generation/output:



  • HDMI Input
  • HDMI Output
  • Display Port Output
  • Ethernet
  • UART
  • USB Host
  • 512 MB of DDR SDRAM
  • Line In / Mic In / Headphone Out / Line Out
  • FMC



To create a simple image-processing pipeline, we need to implement the following architecture:







The supervising processor (in this case, a Xilinx MicroBlaze soft-core RISC processor implemented in the Artix-7 FPGA) monitors communications with the user interface and configures the image-processing pipeline as required for the application. In this simple architecture, data received over the HDMI input is converted from its parallel format of Video Data, HSync and VSync into an AXI Streaming (AXIS) format. We want to convert the data into an AXIS format because the Vivado Design Suite provides several image-processing IP blocks that use this data format. Being able to support AXIS interfaces is also important if we want to create our own image-processing functions using Vivado High Level Synthesis (HLS).
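To illustrate what the conversion to AXIS involves, here is a small behavioral model in Python that turns a frame of pixels into a sequence of stream beats. It assumes the usual convention for AXI4-Stream video as I understand it, where TUSER flags the first pixel of a frame and TLAST flags the last pixel of each line. It ignores the TVALID/TREADY handshake and blanking intervals entirely, so it is a teaching aid rather than a model of the Vivado IP.

```python
def frame_to_axis(frame):
    """Yield (tdata, tuser, tlast) beats for one frame, one pixel per beat."""
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            tuser = (x == 0 and y == 0)   # start of frame
            tlast = (x == len(row) - 1)   # end of line
            yield pixel, tuser, tlast


# A tiny 3x2 frame makes the markers easy to see.
for beat in frame_to_axis([[1, 2, 3], [4, 5, 6]]):
    print(beat)
```

With the sync information carried in-band like this, downstream blocks no longer need separate HSync and VSync wires, which is what lets the stream pass through generic image-processing IP.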


The MicroBlaze processor needs to be able to support the following peripherals:



  • AXI UART – Enables communication and control of the system
  • AXI Timer – Enables the MicroBlaze to time events
  • MicroBlaze Debugging Module – Enables the debugging of the MicroBlaze
  • MicroBlaze Local Memory – Connected to DLMB and ILMB (Data & Instruction Local Memory Bus)


We’ll use the memory interface generator to create a DDR interface to the board’s SDRAM. This interface and the SDRAM create a common frame store accessible to both the image-processing pipeline and the supervising processor using an AXI interconnect.


Creating a simple image-processing pipeline requires the use of the following IP blocks:



  • DVI2RGB – HDMI input IP provided by Digilent
  • RGB2DVI – HDMI output IP provided by Digilent
  • Video In to AXI4-Stream – Converts a parallel-video input to AXI Streaming protocol (Vivado IP)
  • AXI4-Stream to Video Out – Converts an AXI Stream to a parallel-video output (Vivado IP)
  • Video Timing Controller Input – Detects the incoming video parameters (Vivado IP)
  • Video Timing Controller Output – Generates the output video timing parameters (Vivado IP)
  • Video Direct Memory Access – Enables images to be written to and read from the DDR SDRAM



The core of this video-processing chain is the VDMA, which we use to move the image into the DDR memory.







The diagram above demonstrates how the IP block converts from streamed data to memory-mapped data for the read and write channels. Both VDMA channels provide the ability to convert between streaming and memory-mapped data as required. The write channel supports Stream-to-Memory-Mapped conversion while the read channel provides Memory-Mapped-to-Stream conversion.
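One way to picture "memory-mapped" video is as simple address arithmetic: once a frame is in DDR memory, each pixel lives at an address derived from the frame's base address, the line stride, and the pixel size. The sketch below shows that arithmetic. The base address, resolution, and pixel size are hypothetical values chosen for illustration, not settings taken from the Nexys Video example.

```python
def pixel_address(base: int, x: int, y: int,
                  stride_bytes: int, bytes_per_pixel: int) -> int:
    """Byte address of pixel (x, y) in a linear frame buffer."""
    return base + y * stride_bytes + x * bytes_per_pixel


def frame_size_bytes(height: int, stride_bytes: int) -> int:
    """Memory occupied by one frame, including any per-line padding in the stride."""
    return height * stride_bytes


# Hypothetical 1280x720 frame with 3 bytes per pixel and no line padding.
WIDTH, HEIGHT, BPP = 1280, 720, 3
STRIDE = WIDTH * BPP
BASE = 0x80000000

print(hex(pixel_address(BASE, 0, 1, STRIDE, BPP)))  # 0x80000f00
print(frame_size_bytes(HEIGHT, STRIDE))             # 2764800
```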


When all this is put together in Vivado to create the initial base system, we get the architecture below, which is provided by the Nexys Video HDMI example.







All that is required now is to look at the software required to configure the image-processing pipeline. I will explain that next time.




Code is available on Github as always.








True audiophiles often know no bounds on the energy they’ll put into their quest for “perfect” sound and Patrick Cazeles, who goes by “patc” online, is no exception. His attraction to high-end audio started in 2002, veered into FPGAs as early as 2004, and his quest to develop a high-end digital-audio playback system has been intertwined with the Zynq SoC since 2014, not too long after Xilinx started shipping the first Zynq devices. First, Patrick bought an Avnet MicroZed dev board; then he switched to a Zynq-based Parallela board; and now he’s using the low-cost snickerdoodle board from krtkl, which is based on a Zynq Z-7020 SoC. "What an amazing piece of hardware the Zynq is!" he writes.


Here’s a photo of Patrick’s Zyntegrated Digital Audio system in its current incarnation with an inset photo of his complete audio system:



Zyntegrated Audio System by patc.jpg 



And here’s a block diagram showing all of the Zyntegrated Digital Audio system’s capabilities:




Zyntegrated Audio System block diagram by patc.jpg 




As you can see, the Zynq SoC implements nearly everything in the Zyntegrated Digital Audio System from interfacing to the audio sources (a SATA CD player and an SD card) to controlling the touch-panel LCD user interface, receiving remote IR commands, accepting SPDIF digital-audio input, driving eight class-D audio amps in a quad-amped arrangement (four amps per channel driving separate audio frequency bands, where the Zynq SoC’s programmable logic performs the bandpass and low-pass filtering for all four bands—for each stereo channel), and taking room acoustic measurements from a digitized microphone for digital room correction (again over a SPDIF interface).


Whew! This guy knows how to drive a Zynq SoC!


So, how does it work? Here’s a new, 4-minute video of the Zyntegrated Digital Audio System in action, playing audio from a digitized vinyl-record turntable, a standard audio CD, and WAV files stored on an SD card:





Isn’t that touch-screen interface amazing? The audio sounds nice too, but I suspect there’s considerable compression taking place to pack the quad-amped audio into YouTube’s teeny, tiny sound channel.


Here’s a recent 3-minute video, in which Patrick provides a detailed walkthrough of the current Zyntegrated Digital Audio system’s design:






Interested in even more details? Here’s a detailed, chronological forum message stream detailing the development of this amazing audio system back to the year 2014 on the Parallela Community forum.



By Adam Taylor


We have looked at embedded vision several times throughout this series; however, it has always been within the visible portion of the electromagnetic (EM) spectrum. Infra-red (IR) is another popular imaging band of the EM spectrum that allows us to see thermal emissions from objects in the world around us. For this reason, IR is very popular if we want to see in low-light conditions or at night, and in a range of other very exciting applications from wildfire detection to defense.


Therefore, over the next two blogs we are going to look at getting the FLIR Lepton IR camera up and running with the Zynq-based Arty Z7 dev board from Digilent. I selected Digilent’s Arty Z7 because it has an HDMI output port so we can output the image from the Lepton IR camera to a display. The Arty Z7 board also has the Arduino/chipKIT Shield connector, which we can use to connect directly to the Lepton camera itself.





Digilent’s Arty Z7 dev board with the Lepton IR camera plugged into the board’s Arduino/chipKIT Shield connector




The Lepton IR camera from FLIR is an 80x60-pixel (Lepton 2) or 160x120-pixel (Lepton 3) long-wave infra-red (LWIR) camera module. As a microbolometer-based thermal sensor, it operates without the need for cryogenic cooling, unlike HgCdTe-based sensors. Instead, each pixel in a microbolometer changes resistance when IR radiation strikes it, and this resistance change is used to determine the temperatures in the scene. Typically, microbolometer-based thermal imagers have much lower resolution than cooled imagers. They do, however, make thermal-imaging systems simpler to create.


To get the Lepton camera up and running with the Arty Z7 board, we need a breakout board for mounting the Lepton camera module. This breakout board simplifies the power connection, breaks out the camera’s control and video interfaces, and allows us to connect directly into the Arty Z7’s Shield connector.


The Lepton is controlled using a 2-wire interface, which is remarkably similar to I2C. This similarity allows us to use the Zynq I2C controller over EMIO to issue commands to the camera. The camera supplies 14-bit video output using Video over SPI (VoSPI). This video interface uses the SCLK, CS, and MISO signals. The camera module is assumed to be the slave. However, as we need to receive 16 bits of data for each pixel in the VoSPI transaction, we cannot use the SPI peripheral in the Zynq SoC’s PS (processing system), which only works with 8-bit data.


Instead, we will use an AXI QSPI IP block instantiated in the Zynq SoC’s PL (programmable logic), correctly configured to work with standard SPI. This is a simple example of why Zynq SoCs are so handy for I/O-centric embedded designs. You can accommodate just about any I/O requirement you encounter with a configurable IP block or a little HDL code.
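To make the 16-bit VoSPI words a little more concrete, here is a minimal Python sketch of how one received packet might be unpacked on a host for testing. It is not the code that runs on the Zynq SoC. The 164-byte packet layout (a 16-bit ID, a 16-bit CRC, then 80 pixel words), the big-endian word order, and the discard-packet marker are assumptions based on my reading of the Lepton 2 documentation, and `parse_vospi_packet` is a hypothetical helper name. Check all of these against the FLIR datasheet before relying on them.

```python
import struct

PACKET_BYTES = 164      # assumed: 4-byte header + 80 pixels x 2 bytes (Lepton 2)
WORDS_PER_PACKET = 82   # 2 header words + 80 pixel words


def parse_vospi_packet(packet: bytes):
    """Return (line_number, pixels), or None for a discard packet."""
    if len(packet) != PACKET_BYTES:
        raise ValueError("unexpected VoSPI packet length")

    # Assumption: every field is a big-endian 16-bit word.
    words = struct.unpack(">%dH" % WORDS_PER_PACKET, packet)
    packet_id = words[0]  # words[1] is the CRC, ignored in this sketch

    # Assumption: discard packets carry 0xF in bits 11..8 of the ID word.
    if (packet_id & 0x0F00) == 0x0F00:
        return None

    line_number = packet_id & 0x0FFF
    # Each pixel arrives as a 16-bit word; keep only the low 14 bits.
    pixels = [w & 0x3FFF for w in words[2:]]
    return line_number, pixels
```

Because each pixel is transferred as a full 16-bit word, the receiver simply masks off the top two bits to recover the 14-bit video value.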


Implementing the I2C and SPI interfaces described above will enable us to control the camera module on the breakout board and receive the video into the PS memory space. To display the received image, we need to create a video pipeline that reads the image from the PS DDR SDRAM and outputs it over HDMI.


The simplest way to do this is to update the HDMI output reference design, which is available on the Digilent GitHub:









To update this design, we are going to do the following:


  1. Add an AXI QSPI configured for 16-bit standard SPI
  2. Enable the PS I2C controller, routing its signals via the EMIO
  3. Map both the I2C and the SPI I/O to the Arty Z7 board’s Shield connector



We can then update the software running on the Zynq processor core to control the camera module, receive the VoSPI, and configure the HDMI output channel.
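One software step we will probably need is converting the camera’s 14-bit pixel values into something a display can show. The sketch below is an illustration only and is not taken from the reference design: it assumes a simple linear stretch to 8-bit grayscale using the frame’s own minimum and maximum, and `scale_to_8bit` is a hypothetical helper name.

```python
def scale_to_8bit(frame):
    """Linearly stretch a frame of 14-bit values to the range 0..255.

    `frame` is a list of rows, each row a list of integer pixel values.
    """
    flat = [p for row in frame for p in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A completely flat scene has no contrast to stretch.
        return [[0 for _ in row] for row in frame]
    return [[(p - lo) * 255 // span for p in row] for row in frame]
```

Stretching each frame to its own range makes small temperature differences visible, at the cost of the gray levels no longer mapping to fixed temperatures from one frame to the next.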


For this design, I have plugged in the breakout board with the camera module so that the SDA and SCL pins on the Shield connector and the breakout board align. This means we can use the Shield connector’s IO10 to IO13 pins for the VoSPI. We do not use IO11, which would be the SPI interface’s MOSI, because VoSPI needs only the SCLK, CS, and MISO signals.


However, if we use this approach, we must also provide an additional power connection to the breakout board and camera module, because the Shield connector on the Arty Z7 is not able to supply the 5V required on the A pin. Instead, that pin is connected to a Zynq I/O pin. Therefore, I used a wire from the Shield connector’s 5V pin on the opposite side to supply power to the Lepton breakout board’s 5V power input.


With the hardware up and running and the Vivado design rebuilt, we can then open SDK and update the software as required to display the image. We will look at that next time.




Code is available on GitHub as always.


If you want e-book or hardback versions of previous MicroZed Chronicles blogs, you can get them below.




  • First Year E Book here
  • First Year Hardback here






  • Second Year E Book here
  • Second Year Hardback here








About the Author
  • Steve Leibson is the Director of Strategic Marketing and Business Planning at Xilinx. He started as a system design engineer at HP in the early days of desktop computing, then switched to EDA at Cadnetix, and subsequently became a technical editor for EDN Magazine. He's served as Editor in Chief of EDN Magazine, Embedded Developers Journal, and Microprocessor Report. He has extensive experience in computing, microprocessors, microcontrollers, embedded systems design, design IP, EDA, and programmable logic.