

By Adam Taylor



So far on this journey of looking at the Zynq UltraScale+ MPSoC (a journey that is only just beginning), we have mostly explored the ARM Cortex-A53 processors within the Application Processing Unit (APU). However, we must not overlook the Real-Time Processing Unit (RPU), which contains two 32-bit ARM Cortex-R5 RISC processors and operates within the Low Power Domain of the Zynq MPSoC’s PS (processing system).






R5 RPU Architecture



The RPU executes real-time processing applications, including safety-critical applications. As such, you can use it for applications that must comply with IEC 61508 or ISO 26262. We will be looking at this capability in more detail in a future blog. To support this, the RPU can operate in two distinct modes:


  • Split (or Performance) mode: both cores operate independently
  • Lock-step mode: both cores operate in lockstep


Lock-step mode is, of course, the one we select as one step toward implementing a safety application (see chapter 8 of the TRM for the full safety and security capabilities). To provide deterministic processing times, each ARM Cortex-R5 core includes 128KB of Tightly Coupled Memory (TCM) in addition to the caches and OCM (on-chip memory). How the TCMs are used depends upon the operating mode. In split mode, each processor has 128KB of TCM (divided into A and B TCMs). In lock-step mode, there is one 256KB TCM.





RPU in Lock Step Mode



At reset, the default setting configures the RPU to operate in lock-step mode. However, we can change between the operating modes while the processor group is in reset. We do this by updating two bits in the RPU Global Control Register: the SLCLAMP bit, which clamps the outputs of the redundant processor, and the SLSPLIT bit, which sets the operating mode. We cannot change the RPU’s operating mode during operation, so we need to decide upfront during the architectural phase which mode we desire for a given application.
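To make the read-modify-write on those two bits concrete, here is a minimal Python sketch that only models the register value; it does not touch hardware. The bit positions (SLSPLIT at bit 3, SLCLAMP at bit 4) and the polarity (lock-step sets the clamp, split clears it) are assumptions that should be checked against the Zynq UltraScale+ register reference, and the function name is hypothetical.

```python
# Assumed bit masks for the RPU Global Control register; verify them
# against the device register reference before relying on them.
SLSPLIT_MASK = 1 << 3   # assumed: 1 = split mode, 0 = lock-step
SLCLAMP_MASK = 1 << 4   # assumed: 1 = clamp the redundant core's outputs


def rpu_mode_value(current: int, split: bool) -> int:
    """Return the register value with only the two mode bits updated."""
    value = current & ~(SLSPLIT_MASK | SLCLAMP_MASK) & 0xFFFFFFFF
    if split:
        value |= SLSPLIT_MASK   # split: both cores drive their own outputs
    else:
        value |= SLCLAMP_MASK   # lock-step: clamp the redundant core
    return value
```

All other bits in the register are passed through unchanged, which is the point of doing a read-modify-write rather than writing a constant.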


However, we do not have to worry about setting these bits ourselves when we use the debugger or generate a boot image. Instead, we can use those tools to configure the operating mode. In the rest of this blog, I want to look at how we configure the RPU operating mode both in our debug applications and during boot-image generation.


The first way that we verify many of our designs is to use the System Debugger within SDK, which allows us to connect over JTAG or Ethernet and download our application. Using this method, we can of course use breakpoints and step through the code as it operates, to get to the bottom of any issues in the design. Within the debug configuration tab, we can also enable the RPU to operate in split mode if that’s the mode we want after system reset.





Debug Configuration to enable RPU Split Mode



When you download the code and run it on the Zynq MPSoC’s RPU, you will be able to see the operating mode within the debug window. This should match with your debug configuration setting.





Debug Window showing Lock-Step Mode



Once we are happy with the application, we will want to create a boot image, and we will want to set the RPU operating mode when we create that boot image. We can add the RPU ELF to the FSBL, FPGA, and APU files using the boot-image dialog. To select the RPU mode, we choose the edit option and then select the destination CPU: either both ARM Cortex-R5 cores in lockstep or, if we are using split mode, the ARM Cortex-R5 core we wish it to run on.
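The dialog ultimately writes a boot image format (BIF) file for bootgen. As a rough sketch of what the destination-CPU choice looks like in that file, the Python helper below renders a small BIF. The attribute spellings (`destination_cpu=r5-lockstep`, `r5-0`, `r5-1`, `destination_device=pl`) reflect my understanding of the bootgen syntax, and every file name is a placeholder, so compare the output against the .bif that the SDK dialog generates for your project.

```python
def make_bif(rpu_elf: str, rpu_cpu: str = "r5-lockstep") -> str:
    """Render an illustrative BIF with the RPU partition on the chosen CPU."""
    allowed = {"r5-lockstep", "r5-0", "r5-1"}
    if rpu_cpu not in allowed:
        raise ValueError(f"unsupported RPU destination: {rpu_cpu}")
    lines = [
        "the_ROM_image:",
        "{",
        "  [bootloader, destination_cpu=a53-0] fsbl.elf",   # placeholder FSBL
        "  [destination_device=pl] design.bit",             # placeholder bitstream
        f"  [destination_cpu={rpu_cpu}] {rpu_elf}",          # RPU application
        "  [destination_cpu=a53-0] apu_app.elf",            # placeholder APU app
        "}",
    ]
    return "\n".join(lines) + "\n"
```

Calling `make_bif("rpu_app.elf")` gives the lock-step variant, while `make_bif("rpu_app.elf", "r5-1")` targets the second core in split mode.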






Selecting the R5 Mode of operation when generating a boot image



Of course if we want to be sure we are in the correct mode in this operation, we need to read the RPU Global Control register and ensure the correct mode is selected as expected.
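A check of this kind boils down to decoding two bits from the value we read back. The sketch below assumes SLSPLIT is bit 3 and SLCLAMP is bit 4 of the RPU Global Control register, and that split mode means SLSPLIT set with SLCLAMP clear (and the reverse for lock-step). Those positions and the function name are assumptions to be confirmed against the register reference; the register read itself is left to the caller.

```python
SLSPLIT_MASK = 1 << 3   # assumed bit position
SLCLAMP_MASK = 1 << 4   # assumed bit position


def decode_rpu_mode(glbl_cntl: int) -> str:
    """Classify a register value as 'split', 'lock-step' or 'inconsistent'."""
    split = bool(glbl_cntl & SLSPLIT_MASK)
    clamp = bool(glbl_cntl & SLCLAMP_MASK)
    if split and not clamp:
        return "split"
    if clamp and not split:
        return "lock-step"
    # Any other combination does not match either mode as assumed here.
    return "inconsistent"
```

An application could call this once at start-up and refuse to continue if the result is not the mode it was designed for.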


Now that we understand the different operating modes of the Zynq UltraScale+ MPSoC’s RPU, we can come back to these modes when we look at the security and safety capabilities provided by the Zynq MPSoC.



Code is available on Github as always.


If you want E book or hardback versions of previous MicroZed chronicle blogs, you can get them below.




  • First Year E Book here
  • First Year Hardback here.



MicroZed Chronicles hardcopy.jpg 



  • Second Year E Book here
  • Second Year Hardback here


MicroZed Chronicles Second Year.jpg 


CCIX Tech Demo Proves 25Gbps Performance over PCIe

by Xilinx Employee on 05-24-2017 12:58 PM

By:  Gaurav Singh



CCIX was just announced last year and already things are getting interesting.


The approach of CCIX as an acceleration interconnect is to work within the existing volume server infrastructure while delivering improvements in performance and cost.


We’ve reached a major milestone.  CCIX members Xilinx and Amphenol FCI have recently revealed the first public CCIX technology demo and what it means for the future of data center system design is exciting to consider.


In the demo video below, you’ll see the transferring of a data pattern at 25 Gbps between two Xilinx FPGAs, across a channel comprised of an Amphenol/FCI PCI Express CEM connector and a trace card. The two devices contain Xilinx Transceivers electrically compliant with CCIX. By using the PCI Express infrastructure found in every data center today, we can achieve this 25 Gig performance milestone. The total insertion loss in the demo is greater than 35dB, die pad to die pad, which allows flexibility in system design. We’re seeing excellent margin, and a BER of less than 1E-12.


At 25 Gbps, this is the fastest data transfer between accelerators over PCI Express connections ever achieved. It’s three times faster than the top transfer speed of PCI Express Gen3 solutions available today.  The application benefits of communicating three times faster between accelerators are significant in data centers, and CCIX is designed to excel in multi-accelerator configurations.
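The arithmetic behind those figures is easy to check. The short Python sketch below assumes the comparison is between per-lane line rates (25 Gbps for the demo against the 8 GT/s of PCI Express Gen3) and uses the quoted BER bound of 1E-12 to estimate how often a bit error could occur at worst; the function names are just for illustration.

```python
CCIX_DEMO_GBPS = 25.0    # line rate quoted for the demo
PCIE_GEN3_GTPS = 8.0     # PCI Express Gen3 per-lane signalling rate
BER_BOUND = 1e-12        # "less than 1E-12" from the demo


def rate_ratio(new_rate: float, old_rate: float) -> float:
    """How many times faster the new line rate is."""
    return new_rate / old_rate


def seconds_per_error(rate_gbps: float, ber: float) -> float:
    """Mean time between bit errors at a given line rate and bit error rate."""
    return 1.0 / (rate_gbps * 1e9 * ber)


if __name__ == "__main__":
    print(rate_ratio(CCIX_DEMO_GBPS, PCIE_GEN3_GTPS))
    print(seconds_per_error(CCIX_DEMO_GBPS, BER_BOUND))
```

Under these assumptions the ratio is a little over three, and a BER of exactly 1E-12 would correspond to roughly one bit error every 40 seconds, so the measured link does better than that.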


CCIX will enable seamless system integration between processors such as X86, POWER and ARM and all accelerator types, including FPGAs, GPUs, network accelerators and storage adaptors.  Even custom ASICs can be incorporated into a CCIX topology.  And CCIX gives system designers the flexibility to choose the right combination of heterogeneous components from many different vendors to deliver optimized configurations for the data center. 


We’re looking forward to the first products with CCIX sampling later this year.






This week at its annual NI Week conference in Austin, Texas, National Instruments (NI) announced a new FlexRIO PXIe module, the PXIe-7915, which is available with any of three Xilinx Kintex UltraScale FPGAs. NI’s PXIe FlexRIO modules serve two purposes in NI-based systems: flexible, programmable, high-speed I/O and high-speed computation (usually DSP). NI’s customers access these FlexRIO resources using the company’s LabVIEW FPGA software, part of NI’s LabVIEW graphical development environment. Thanks to the Kintex UltraScale FPGA, the new FlexRIO PXIe-7915 module contains significantly more programmable resources and delivers significantly more performance than previous FlexRIO modules, which are all based on earlier generations of Xilinx FPGAs. The set of graphs below shows the increased resources and performance delivered by the PXIe-7915 FlexRIO module in NI systems relative to previous-generation FlexRIO modules based on Xilinx Kintex-7 FPGAs:




FlexRIO UltraScale Graphs.jpg 



However, the UltraScale-based FlexRIO modules are not simply standalone products. They serve as design platforms for NI’s design engineers, who will use these platforms to develop many new, high-performance instruments. In fact, NI introduced the first two of these new instruments at NI Week 2017: the PXIe-5763 and PXIe-5764 high-speed, quad-channel, 16-bit digitizers. Here are the specs for these two new digitizers from NI:



NI FlexRIO Digitizers based on Kintex UltraScale FPGAs.jpg 



Previous digitizers in this product family employed parallel LVDS signals to couple high-speed ADCs to an FPGA. However, today’s fastest ADCs employ high-speed serial interfaces, particularly JESD204B, which necessitated a new design. The resulting new design uses the FlexRIO PXIe-7915 card as a motherboard and the JESD204B ADC card as a mezzanine board, as shown in this photo:



NI FlexRIO PXIe-5764 Digitizer.jpg 




NI’s design engineers took advantage of the pin compatibility among various Kintex UltraScale FPGAs to maximize the flexibility of their design. They can populate the FlexRIO PXIe-7915 card with a Kintex UltraScale KU035, KU040, or KU060 FPGA depending on customer requirements. This flexibility allows them to create multiple products using one board layout—a hallmark of a superior, modular platform design.


Normally, you access the programmable-logic features of a Xilinx FPGA or Zynq SoC inside of an NI product using LabVIEW FPGA, and that’s certainly still true. However, NI has added something extra in its LabVIEW 2017 release: a Xilinx Vivado Project Export feature that provides direct access to the Xilinx Vivado Design Suite tools for hardware engineers experienced with writing HDL code for programmable logic. Here’s how it works:



LabVIEW Vivado Export Design Flow.jpg 




You can export all the necessary hardware files from LabVIEW 2017 to a Vivado project that is pre-configured for your specific deployment target. Any LabVIEW signal-processing IP in the LabVIEW design is included in the export as encrypted IP cores. As an added bonus, you can use the new Xilinx Vivado Project Export on all of NI’s FlexRIO and high-speed-serial devices based on Xilinx Kintex-7 or newer FPGAs.



NI has published a White Paper describing all of this. You’ll find it here.


Please contact National Instruments directly for more information about the new FlexRIO modules and LabVIEW 2017.




TI has a new design example for a 2-device power converter to supply multiple voltage rails to a Xilinx Zynq UltraScale+ MPSoC for Remote Radio Heads and wireless backhaul applications, but the design looks usable across the board for many applications of the Zynq MPSoC. The two TI power-control and -conversion devices in this reference design are the TPS6508640 configurable, multi-rail PMIC for multicore processors and the TPS544C25 high-current, single-channel dc-dc converter. Here’s a simplified diagram of the design:




TI Remote Radio Head Power Supply Design Example.jpg



Please contact TI for more information about these power-control and –conversion devices.

Adam Taylor’s MicroZed Chronicles, Part 196: SDSoC and Levels of Abstraction

by Xilinx Employee 05-22-2017 09:40 AM - edited 05-22-2017 10:28 AM


By Adam Taylor



We have looked at SDSoC several times throughout this series. However, I recently organized and presented at the NMI FPGA Machine Vision event, and during the coffee breaks and lunch, attendees showed considerable interest in SDSoC, not only for its use in the Xilinx reVISION acceleration stack but also for its use in a range of other developments. As such, I thought it would be worth spending some time looking at what SDSoC is and the benefits we have previously gained using it. I also want to discuss a new use case.





SDSoC Development Environment




SDSoC is an Eclipse-based, system-optimizing compiler that allows us to develop our Zynq SoC or Zynq UltraScale+ MPSoC design in its entirety using C or C++. We can then profile the application to find aspects that cause performance bottlenecks and move them into the Zynq device’s Programmable Logic (PL). SDSoC does this using HLS (High Level Synthesis) and a connectivity framework that’s transparent to the user. What this means is that we are able to develop at a higher level of abstraction and hence reduce the time to market of the product or demonstration.


To do this, SDSoC needs a hardware platform, which can be pre-defined or custom. Typically, these platforms within the PL provide the basics: I/O interfaces and DMA transfers to and from the DDR SDRAM attached to the Zynq device’s PS (Processing System). This frees up most of the PL resources and PL/PS interconnects to be used by SDSoC when it accelerates functions.


This ability to develop at a higher level and accelerate performance by moving functions into the PL enables us to produce very flexible and responsive systems. This blog has previously looked at acceleration examples including AES encryption, matrix multiplication, and FIR Filters. The reduction in execution time has been significant in these cases. Here’s a table of these previously discussed examples:





Previous Acceleration Results with SDSoC. Blogs can be found here




To aid us in the optimization of the final application, we can use pragmas to control the HLS optimizations. We can use SDSoC’s tracing and profiling capabilities while optimizing these accelerated functions and the interaction between the PS and PL.


Here’s an example of a trace:





Results of tracing an example application

(Orange = Software, Green = Accelerated function and Blue = Transfer)



Let us take a look at a simple use case to demonstrate SDSoC’s abilities.


Frequency Modulated Continuous Wave (FMCW) RADAR is used for a number of applications that require the ability to detect objects and gauge their distance. FMCW applications make heavy use of FFT and other signal-processing techniques such as windowing, Constant False Alarm Rate (CFAR), and target velocity and range extraction. These algorithms and models are ideal for description using a high-level language such as C / C++. SDSoC can accelerate the execution of functions described this way and such an approach allows you to quickly demonstrate the application.
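To give a feel for the range-extraction step, here is a minimal, software-only Python sketch: it simulates the beat tone from one stationary target, applies a Hann window, finds the strongest bin, and converts that beat frequency back to range using f_beat = 2·R·B / (c·T). A plain DFT is used in place of an FFT to keep the example short, and the sweep bandwidth, chirp time, and sample rate are illustrative values of my own, not those of any particular hardware. In an SDSoC flow, the transform is the kind of function we would look to accelerate.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def simulate_beat(range_m, bandwidth_hz, chirp_s, sample_rate_hz, n):
    """Ideal beat tone for one stationary target under a linear chirp."""
    f_beat = 2.0 * range_m * bandwidth_hz / (C * chirp_s)
    return [math.cos(2.0 * math.pi * f_beat * i / sample_rate_hz) for i in range(n)]


def peak_bin(samples):
    """Apply a Hann window, then return the strongest positive-frequency DFT bin."""
    n = len(samples)
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k


def estimate_range(samples, sample_rate_hz, bandwidth_hz, chirp_s):
    """Convert the peak beat frequency back to a range in metres."""
    f_beat = peak_bin(samples) * sample_rate_hz / len(samples)
    return C * f_beat * chirp_s / (2.0 * bandwidth_hz)
```

With a 150 MHz sweep over 1 ms, sampled at 256 kS/s for 256 samples, each bin corresponds to roughly one metre, so `estimate_range(simulate_beat(50.0, 150e6, 1e-3, 256e3, 256), 256e3, 150e6, 1e-3)` should land within a metre of the simulated 50 m target.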


It is possible to create a simple FMCW receive demo using a ZedBoard and an AD9467 FPGA Mezzanine Card (FMC). At the simplest level, the hardware element of the SDSoC platform needs to be able to transfer samples received from the ADC into the PS memory space and then transfer display data from the PS memory space to the display, which in most cases will be connected with DVI or HDMI interfaces.






Example SDSoC Platform for FMCW application



This platform permits development of the application within SDSoC at a higher level. It also provides a platform that we can use for several different applications, not just FMCW. Rather helpfully, the AD9467 FMC comes with a reference design that can serve as the hardware element of the SDSoC Platform. It also provides drivers, which can be used as part of the software element.


With a platform in hand, it is possible to write the application within SDSoC using C or C++, where we can make use of the acceleration libraries and stacks including matrix multiplication, math functions, and the ability to wrap bespoke HDL IP cores and use them within the development.


Developing in this manner provides a much faster development process and a more responsive solution, as it leverages the Zynq PL for inherently parallel or pipelined functions. It also makes it easier to upgrade designs. As the majority of the development will use C or C++ and because SDSoC is a system-optimizing compiler, the application developer does not need to be an HDL specialist.




Code is available on Github as always.









Enea just announced that it has added a BSP (board support package) for the Zynq UltraScale+ MPSoC and ZCU102 Eval Kit to its POSIX-compliant, multicore OSE operating system. OSE offers embedded developers extremely low latency, low jitter, and minimal processing overhead to deliver bare-metal performance that extracts maximum performance from heterogeneous processors like the Zynq UltraScale+ MPSoC. According to Enea, OSE supports both SMP (symmetric multiprocessing) and AMP (asymmetric multiprocessing) and delivers linear performance scalability for MPSoCs with as many as 24 cores, so it should be able to easily handle the four or more 64- and 32-bit ARM Cortex-A53 and –R5 processor cores in the various Zynq UltraScale+ MPSoC family members.



ZCU102 Board Photo.jpg 


Xilinx ZCU102 Eval Kit for the Zynq UltraScale+ MPSoC




Enea’s carrier-grade OSE has long been used in the telecom industry and is incorporated into more than half of the world's radio base stations. In addition, OSE is used in automotive, medical, and avionics designs.



DFC Design’s Xenie FPGA module product family pairs a Xilinx Kintex-7 FPGA (a 70T or a 160T) with a Marvell Alaska X 88X3310P 10GBASE-T PHY on a small board. The module breaks out six of the Kintex-7 FPGA’s 12.5Gbps GTX transceivers and three full FPGA I/O banks (for a total of 150 single-ended I/O or up to 72 differential pairs) with configurable I/O voltage to two high-speed, high-pin-count, board-to-board connectors. A companion Xenie BB Carrier board accepts the Xenie FPGA board and breaks out the high-speed GTX transceivers into a 10GBASE-T RJ45 connector, an SFP+ optical cage, and four SDI connectors (two inputs and two outputs).


Here’s a block diagram and photo of the Xenie FPGA module:





DFC Design Xenia FPGA Module.jpg 



Xenie FPGA module based on a Xilinx Kintex-7 FPGA




And here’s a photo of the Xenie BB Carrier board that accepts the Xenie FPGA module:




DFC Design Xenia BB Carrier Board.jpg 


Xenie BB Carrier board



These are open-source designs.


DFC Design has developed a UDP core for this design, available on OpenCores.org and has published two design examples: an Ethernet example and a high-speed camera design.


Here’s a block diagram of the Ethernet example:





DFC Design Ethernet Example System.jpg 



Please contact DFC Design directly for more information.



Shockingly Cool Again: Low-power Xilinx CoolRunner-II CPLDs get new product brief

by Xilinx Employee 05-18-2017 11:51 AM - edited 05-19-2017 04:33 PM


This may shock you, but Xilinx continues to be in the CPLD business. I was reminded of that point this week when I got an email blast about the Xilinx CoolRunner-II CPLD family—first introduced about 15 years ago—which is still being made, sold, and supported. In fact, said the email blast, Xilinx has committed to 7+ years of supply for these devices. They also have a slick, updated product brief:



Xilinx CoolRunner-II Product Brief.jpg




Despite their age, Xilinx’s inexpensive, reprogrammable CoolRunner-II CPLDs are still pretty useful devices. They carry their own configuration in an on-chip EEPROM. Because CoolRunner-II CPLDs sip power—quiescent current can be a mere handful of μA for an XC2C32A device and there are some unique power-saving features in all of the devices including DataGATE and CoolCLOCK with DualEDGE flip-flops—they are often used for power sequencing and supervisory tasks. With maximum system toggle frequencies in the low hundreds of MHz, they’re capable of implementing fairly fast state machines as well. Yes, they’re used for glue logic too. They’re handy things to have in your design toolbox.


Need a very small programmable logic device for use on a tight PCB? The CoolRunner-II XC2C32A CPLD with 32 macrocells and 21 I/O pins is available in a 5x5mm QFG32 package and the CoolRunner-II XC2C64A CPLD with 64 macrocells and 37 I/O pins is available in a 7x7mm QFG48 package. Got an 8x8mm spot on your board? An XC2C256 CPLD can drop more than 100 I/O pins into that space.


You can still create designs for CoolRunner-II CPLDs using the Xilinx ISE Design Suite and, if you’ve not used these devices before, you can get a Digilent CoolRunner-II CPLD Starter Board for $39.99. The Starter Board incorporates a CoolRunner-II XC2C256 CPLD.



Digilent CoolRuner-II Starter Board.jpg 


$39.99 Digilent CoolRunner-II CPLD Starter Board







After telegraphing its intent for more than a year, Xilinx has now added the P4_16 language to its SDNet Development Environment for high-speed (1Gbps to 100Gbps) packet processing. SDNet release 2017.1 includes a generally accessible, front-end P4-to-SDNet translator. P4_16 is the latest version of the P4 language, and the SDNet workflow compiles packet-processing descriptions into data-plane switching algorithms instantiated in high-speed Xilinx FPGAs. Xilinx debuted the new SDNet release at this week’s P4 Developer Day and P4 Workshop held at Stanford U. in Palo Alto, CA. (There was a beta version of the translator in the prior SDNet 2016.4 release.)


There’s information about the new Xilinx P4-to-SDNet translator in the latest version of the SDNet Packet Processor User Guide (UG1012) and the P4-SDNet Translator User Guide (UG1252). If you’re up on recent developments with the P4_16 language, you might want to jump to these user guides directly. Otherwise, you might want to take a look at this Linley Group White Paper titled “Xilinx SDNet: A New Way to Specify Network Hardware”, written by Senior Analyst Loring Wirbel, or watch this short video first:






And if you have a couple of hours to devote to learning a lot more about the P4 language, try this video from the P4 Language Consortium, which includes presentations from Vladimir Gurevich from Barefoot Networks, Ben Pfaff from VMware, Johann Tonsing from Netronome, and Gordon Brebner from Xilinx:








Never at a loss for words, Adam Taylor has just published some additional thoughts on designing with Xilinx All Programmable devices over at the EEWeb.com site. His post, titled “Make Something Awesome with the $99 FPGA-Based Arty Development Board,” serves as a reminder or an invitation to attend the free May 31 Xilinx Webinar titled “Make Something Awesome with the $99 Arty Embedded Kit.”


Here’s what Adam has to say about FPGA design today:


“Both the maker and hobby communities are increasingly using FPGAs within their designs. This is thanks to the provision of boards at the right price point for the market, coupled with the availability of easy-to-use development tools that include simulation and High-Level Synthesis (HLS) capabilities.


“Let's be honest; compared to the reputation FPGAs have had historically, developing FPGA-based designs in this day-and-age is much simpler. This is largely thanks to a wide range of IP modules that are supplied with the development tools from board vendors and places like OpenCores.”



Adam’s article discusses two low-cost Digilent boards:






Digilent Arty Z7.jpg 


Digilent Arty Z7 Development Board




Adam concludes his article with this: “Overall, if you are looking to take your first steps into the world of FPGAs, then the Arty (Artix-based) or the Arty Z7 (Zynq 7000-based) should be high on your list of development boards to consider.”



High-Frequency Trading on Xilinx FPGAs? Aldec demos Kintex UltraScale board at Trading Show 2017, Chicago

by Xilinx Employee 05-17-2017 04:39 PM - edited 05-17-2017 05:07 PM


You’ve probably heard that “time equals money.” That’s especially true with high-frequency trading (HFT), which seeks high profits based on super-short portfolio holding periods driven by quant (quantitative) modeling. Microseconds make the difference in the HFT arena. As a result, a lot of high-frequency trading companies use FPGA-based hardware to make decisions and place trades and a lot of those companies use Xilinx FPGAs. No doubt that’s why Aldec is showing its HES-HPC-DSP-KU115 FPGA accelerator board at the Trading Show 2017 being held in Chicago, starting today.




 Aldec HES-HPC-DSP-KU115 Board.jpg


Aldec HES-HPC-DSP-KU115 FPGA accelerator board




This board is based on two Xilinx All Programmable devices: the Kintex UltraScale KU115 FPGA and the Zynq Z-7100 SoC (the largest member of the Zynq SoC family). This board has been optimized for High Performance Computing (HPC) applications and prototyping of DSP algorithms thanks to the Kintex UltraScale KU115 FPGA’s 5520 DSP blocks. This board partners the Kintex UltraScale FPGA with six simultaneously accessible external memories—two DDR4 SODIMMs and four low-latency RLDRAMs—providing immense FPGA-to-memory bandwidth.


The Zynq Z-7100 SoC can operate as an embedded Linux host CPU and it can implement a PCIe host interface and multiple Gigabit Ethernet ports.


In addition, the Aldec HES-HPC-DSP-KU115 FPGA accelerator board has two QSFP+ optical-module sockets for 40Gbps network connections.




Amazon Web Services (AWS) has just posted a 35-minute deep-dive video discussing the Amazon EC2 F1 Instance, a programmable cloud accelerator based on Xilinx Virtex UltraScale+ FPGAs. (See “AWS makes Amazon EC2 F1 instance hardware acceleration based on Xilinx Virtex UltraScale+ FPGAs generally available.”) This fresh, new video talks about the development process and the AWS SDK.


Rather than have me filter this interesting video, here it is:








By Adam Taylor


When I demonstrated how to boot the ZedBoard using the TFTP server, there was one aspect I did not demonstrate: configuring the Zynq SoC’s PL (programmable logic) over the TFTP. It’s very simple to do. We can include the PL bin file along with the Kernel, RAM Disk, and Device Tree blob on the server and then allow U-Boot to configure the PL as it boots, just as we did for the other elements.


We can also configure the Zynq SoC’s PL at any time we want using either Linux or bare-metal applications. To do this we use the DevC (Device configuration)/PCAP (Processor Configuration Access Port) within the Zynq SoC’s PS (processing system). There are three methods through which we can configure the PL: the most obvious is JTAG, followed by PCAP under PS control, and the final method is the ICAP (Internal Configuration Access Port). It is through the DevC interface that we configure the PL when the device boots using the FSBL or U-Boot. The ICAP path is the least-used method and requires a configured PL prior to its use. One example where you might use the ICAP path would be to allow a MicroBlaze soft-core processor to reconfigure the PL.







When the device is running, we can replace the contents of the PL with an updated design using the same interface. All that we need to do is to have generated the new bit file and ensure that it is accessible to the program running on the ARM Cortex-A9 processors in the Zynq SoC’s PS so that they can download it via the DevC interface.


If we are using Linux, we can upload the file into the file system using FTP. We can then use the built-in DevC driver within the Linux Kernel to download the bit file.








From a command prompt, we can enter the command:



cat {filename} > /dev/xdevcfg



to download the bit file. When I did this for a simple ZedBoard design, as shown below (which includes the ability to drive the LEDs connected to the PL), the “Done” LED lit. Of course, to ensure correct operation we need to have the device tree blob correctly configured to support the PL design.
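If we would rather trigger the download from an application than from the shell, the same copy can be sketched in a few lines of Python. This is only an illustration of what the `cat` redirect does: the function name and chunk size are my own choices, and it assumes the `/dev/xdevcfg` node is present and writable on the target.

```python
def load_bitstream(bitstream_path: str,
                   device_path: str = "/dev/xdevcfg",
                   chunk_size: int = 64 * 1024) -> int:
    """Stream a PL image into the DevC device node; return bytes written."""
    total = 0
    with open(bitstream_path, "rb") as src, open(device_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break               # end of the image file
            dst.write(chunk)
            total += len(chunk)
    return total
```

On the board this would be called as `load_bitstream("design.bin")`, where the file name is a placeholder; passing a different `device_path` lets us try the function on a desktop against an ordinary file.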








If we want to configure the Zynq SoC’s PL using bare-metal software, we can use a similar approach. The BSP comes with an example file that downloads a PL image using the DevC interface provided that we have the PL file loaded into the Zynq SoC’s attached DDR memory. We can access the example and include it within our design using the System.MSS file, which is provided when we generate a BSP.







To correctly use the example provided, we need to have a PL bit file loaded in the DDR Memory. For a production-ready system, we would have to store the PL configuration file within a non-volatile memory and then load it into the DDR at a known address before running the DevC example code. However, to demonstrate the concept, we can use the debugger to download the configuration file into the DDR at the desired memory location.


Within the application example, all we need to do is define the location of the configuration file and the size of the file:
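As a hedged illustration of those two definitions, the Python helper below works out the values from a load address and an image size and prints them as C `#define` lines. The macro names `BIT_STREAM_LOCATION` and `BIT_STREAM_SIZE_WORDS`, and the assumption that the size is expressed in 32-bit words, are my recollection of the BSP example rather than something confirmed here, so check them against the example source in your own BSP. The address and size used in the usage note are arbitrary.

```python
def devcfg_defines(load_address: int, image_size_bytes: int) -> str:
    """Render the two definitions, with the size converted to 32-bit words."""
    if load_address % 4 or image_size_bytes % 4:
        raise ValueError("address and size are expected to be word aligned")
    words = image_size_bytes // 4
    return (
        f"#define BIT_STREAM_LOCATION   0x{load_address:08X}\n"   # assumed macro name
        f"#define BIT_STREAM_SIZE_WORDS 0x{words:X}\n"            # assumed macro name
    )
```

For example, `devcfg_defines(0x00400000, 4096)` reports a location of 0x00400000 and a size of 0x400 words.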







Having demonstrated how we can reconfigure the PL in its entirety, we can also use a similar approach to partially reconfigure regions within the PL, which we will look at in future blogs.




Code is available on Github as always.






LMI Technologies’ Gocator 3210 is a smart, metrology-grade, stereo-imaging snapshot sensor that produces 3D point clouds of scanned objects with 35μm accuracy over fields as large as 100x154mm at 4fps. The diminutive (190x142x49mm) Gocator 3210 pairs a 2Mpixel stereo camera with an industrial LED-based illuminator that projects structured blue light onto the subject to aid precise measurement of object width, height, angles, and radii. An integral Xilinx Zynq SoC accelerates these measurements so that the Gocator 3210 can scan objects at 4Hz, which LMI says is 4x the speed of such a sensor setup feeding raw data to a host CPU for processing. This fast scanning speed means that parts can pass by the Gocator for inspection on a production line without stopping for the measurement to be made. The Gocator uses a GigE interface for host connection.



LMI Technologies Gocator 3210.jpg


LMI Technologies Gocator 3210 3D Smart Stereo Vision Sensor



LMI provides a browser-based GUI to process the point clouds and 3D models generated by the Gocator. That means the processing—which includes the calculation of object width, height, angles, and radii—all takes place inside of the Gocator. No additional host software is required.


Here’s a photo of LMI’s GUI showing a 3D scan of an automotive cylinder head (a typical application for this type of sensor):




LMI Gocator GUI.jpg



LMI also offers an SDK so that you can develop sophisticated inspection programs that run on the Gocator. The company has also produced an extensive series of interesting training videos for the Gocator sensor family.


Finally, here’s a short (3 minutes) but information-dense video explaining the Gocator’s features and capabilities:






LMI’s VP of Sales Len Chamberlain has just published a blog titled “Meeting the Demand for Application-Specific 3D Solutions” that further discusses the Gocator 3210’s features and applications.



A paper titled “Evaluating Rapid Application Development with Python for Heterogeneous Processor-based FPGAs” recently won the Best Short Paper award at the 25th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM 2017) held in Napa, CA. The paper discusses the advantages and efficiencies of Python-based development using the PYNQ development environment (based on the Python programming language and Jupyter Notebooks) and the Digilent PYNQ-Z1 board, which is based on the Xilinx Zynq SoC. The paper’s authors (Senior Computer Scientist Andrew G. Schmidt, Computer Scientist Gabriel Weisz, and Research Director Matthew French from the USC Viterbi School of Engineering’s Information Sciences Institute) evaluated the impact, the performance implications, and the bottlenecks associated with using PYNQ for application development on Xilinx Zynq devices. The authors then compared their Python-based results against existing C-based and hand-coded implementations.



The authors do a really nice job of describing what PYNQ is:



“The PYNQ application development framework is an open source effort designed to allow application developers to achieve a “fast start” in FPGA application development through use of the Python language and standard “overlay” bitstreams that are used to interact with the chip’s I/O devices. The PYNQ environment comes with a standard overlay that supports HDMI and Audio inputs and outputs, as well as two 12-pin PMOD connectors and an Arduino-compatible connector that can interact with Arduino shields. The default overlay instantiates several MicroBlaze processor cores to drive the various I/O interfaces. Existing overlays also provide image filtering functionality and a soft-logic GPU for experimenting with SIMT [single instruction, multiple threads] -style programming. PYNQ also offers an API and extends common Python libraries and packages to include support for Bitstream programming, directly access the programmable fabric through Memory-Mapped I/O (MMIO) and Direct Memory Access (DMA) transactions without requiring the creation of device drivers and kernel modules.”



They also do a nice job of explaining what PYNQ is not:



“PYNQ does not currently provide or perform any high-level synthesis or porting of Python applications directly into the FPGA fabric. As a result, a developer still must use create a design using the FPGA fabric. While PYNQ does provide an Overlay framework to support interfacing with the board’s IO, any custom logic must be created and integrated by the developer. A developer can still use high-level synthesis tools or the aforementioned Python-to-HDL projects to accomplish this task, but ultimately the developer must create a bitstream based on the design they wish to integrate with the Python [code].”



Consequently, the authors did not simply rely on the existing PYNQ APIs and overlays. They also developed application-specific kernels for their research based on the Redsharc project (see “Redsharc: A Programming Model and On-Chip Network for Multi-Core Systems on a Programmable Chip”) and they describe these extensions in the FCCM 2017 paper as well.




Redsharc Project.jpg




So what’s the bottom line? The authors conclude:


“The combining of both Python software and FPGA’s performance potential is a significant step in reaching a broader community of developers, akin to Raspberry Pi and Ardiuno. This work studied the performance of common image processing pipelines in C/C++, Python, and custom hardware accelerators to better understand the performance and capabilities of a Python + FPGA development environment. The results are highly promising, with the ability to match and exceed performances from C implementations, up to 30x speedup. Moreover, the results show that while Python has highly efficient libraries available, such as OpenCV, FPGAs can still offer performance gains to software developers.”


In other words, there’s a vast and unexplored territory—a new, more efficient development space—opened to a much broader system-development audience by the introduction of the PYNQ development environment.


For more information about the PYNQ-Z1 board and PYNQ development environment, see:







Today, IBM and Xilinx announced PCIe Gen4 interoperation at 16 Gtransfers/sec/lane between an IBM Power9 processor and a Xilinx UltraScale+ All Programmable device. (FYI: That’s double the performance of a PCIe Gen3 connection.) IBM expects this sort of interface to be particularly important in the data center for high-speed, processor-to-accelerator communications, but the history of PCIe evolution clearly suggests that PCIe Gen4 is destined for wide industry adoption across many markets, just like PCIe generations 1 through 3. The thirst for bit rate exists everywhere, in every high-performance design.



IBM Xilinx PCIe Gen4 Interoperability.jpg




All Xilinx Virtex UltraScale+ FPGAs, many Zynq UltraScale+ MPSoCs, and some Kintex UltraScale+ FPGAs incorporate one or more PCIe Gen3/4 hardened, integrated blocks, which can operate as PCIe Gen4 x8 or Gen3 x16 Endpoints or Roots. In addition, all UltraScale+ MGT transceivers (except the PS-GTR transceivers in Zynq UltraScale+ MPSoCs) support the data rates required for PCIe Gen3 and Gen4 interfaces. (See “DS890: UltraScale Architecture and Product Data Sheet: Overview” and “WP458: Leveraging UltraScale Architecture Transceivers for High-Speed Serial I/O Connectivity” for more information.)
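As a rough check on the “double the performance” remark, and on why a Gen4 x8 block pairs naturally with a Gen3 x16 block, here is a minimal sketch that computes raw link bandwidth per direction. The 8 Gtransfers/sec Gen3 rate and the 128b/130b line encoding are assumptions taken from the PCIe specifications rather than from this post, and the function name is purely illustrative.

```python
# Raw PCIe link bandwidth per direction, before packet (TLP/DLLP) overhead.
# Assumption: Gen3 and Gen4 both use 128b/130b line encoding.

TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0}  # gigatransfers/sec per lane
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gbytes(gen: int, lanes: int) -> float:
    """Return raw bandwidth in Gbytes/sec for one direction of the link."""
    gbits = TRANSFER_RATE_GT_S[gen] * ENCODING_EFFICIENCY * lanes
    return gbits / 8


if __name__ == "__main__":
    for gen, lanes in [(3, 16), (4, 8), (4, 16)]:
        bw = link_bandwidth_gbytes(gen, lanes)
        print(f"Gen{gen} x{lanes}: {bw:.2f} Gbytes/sec")
```

Under these assumptions a Gen4 x8 link and a Gen3 x16 link carry the same raw bandwidth, which is why the same hardened block can serve either configuration without changing the data path width behind it.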



With four, six, or more hardened, embedded programmable microprocessor cores in Xilinx Zynq UltraScale+ MPSoCs and a nearly unlimited number of soft MicroBlaze processor cores possible in the devices’ programmable logic, you need to start thinking pretty hard about how you’re going to harness all of that software programmability. Hardent would like to help, so it’s offering a free Webinar on June 20 titled “Leveraging The OpenAMP Framework for Heterogeneous Software Architecture.”


The OpenAMP framework for the Zynq UltraScale+ MPSoC virtualizes the Zynq MPSoC processors and makes that consolidated computing power available to software developers in a more familiar form. Hardent’s webinar will discuss the OpenAMP framework and will then outline how designers can leverage the framework to run different operating systems—such as Linux and an RTOS—concurrently, using different processors within the same MPSoC.


In this webinar, you will:


  • Learn about Linux Asymmetric Multiprocessing (AMP) on multi-core and heterogeneous devices
  • Discover what the OpenAMP framework is and how you can use it to manage firmware across a multi-processor system
  • Learn how to get started with the OpenAMP framework (topology, start-up process, API, and vendor support)



Register here.


Adam Taylor’s MicroZed Chronicles Part 194: A Zynq UltraScale+ MPSoC Interrupt & GPIO example

by Xilinx Employee 05-15-2017 09:05 AM - edited 05-16-2017 11:23 AM


By Adam Taylor


I have previously discussed the Zynq UltraScale+ MPSoC’s interrupt architecture, so this blog will show you how to use these interrupts in a simple example. To do this we are going to use the push button and the LED on the Avnet UltraZed Starter Kit. These peripherals are connected to the Zynq MPSoC’s PS MIO. We will configure the system so that pressing the button generates an interrupt, causing the Zynq MPSoC to toggle the LED on and off.


We are using the UltraZed SoM on the UltraZed IOCC (I/O Carrier Card), so the push button is connected to MIO 26 while the LED is connected to MIO 31. Within Vivado, you can see what each MIO is used for and, if necessary, configure it on the IO configuration tab of the MPSoC Customization wizard. Both of the MIO signals used in this example are on MIO bank 1 and, because we used Vivado’s board automation when we instantiated the MPSoC in our block diagram, the MIO and PS are already configured correctly for both the SoM and the IOCC.






MIO configuration on the MPSoC PS




Because we are using the MIO for this example, we can use the existing Vivado MPSoC design that we’ve been using to date. The main work to get this example up and running will be within SDK, where we need to do the following:


  • Initialize and configure the GPIO Controller – MIO pin 26 is configured as an input while MIO pin 31 is configured and enabled as an output


  • Initialize and configure the Interrupt Controller – After we have initialized the GIC, we need to configure the GPIO to generate an interrupt when the button is pushed. Within this function, we also need to identify which function is to be called when the interrupt occurs.


  • Create an Interrupt Service Routine – This is the function that is executed when a GPIO interrupt is detected. This function reads the status of the interrupt pin, and toggles the LED state. As it is toggled, the LED state will be echoed to a local terminal. There is a delay within this ISR to de-bounce the switch, which prevents rapidly changing values on the switch input from changing the LED status multiple times.
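The toggle-and-debounce behavior of the ISR can be tried out on a host PC before touching the hardware. The sketch below is a pure-Python model and not Xilinx driver code: the class name, method name, and 50 ms window are all hypothetical. It also swaps the in-ISR delay described above for a timestamp comparison, which gives the same visible behavior without blocking.

```python
# Host-side model of the button ISR: toggle an LED on each press and ignore
# switch bounce. This is NOT Xilinx driver code; the names and the 50 ms
# debounce window are assumptions made for illustration.

DEBOUNCE_MS = 50


class LedToggler:
    def __init__(self) -> None:
        self.led_on = False
        self._last_accepted_ms = None  # time of the last accepted press

    def on_button_interrupt(self, now_ms: int) -> bool:
        """Handle one interrupt; return True if the LED state changed."""
        if (self._last_accepted_ms is not None
                and now_ms - self._last_accepted_ms < DEBOUNCE_MS):
            return False  # bounce: too soon after the previous accepted press
        self._last_accepted_ms = now_ms
        self.led_on = not self.led_on
        return True


if __name__ == "__main__":
    toggler = LedToggler()
    # One press that bounces for a few milliseconds, then a clean second press.
    for t in [0, 1, 3, 4, 300]:
        changed = toggler.on_button_interrupt(t)
        print(f"t={t} ms changed={changed} led_on={toggler.led_on}")
```

In the real ISR the current time would come from a hardware timer, and the LED state would be written to the GPIO output pin and echoed to the terminal.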





Example running on the MPSoC



To implement this example and write the elements identified above, we will need to use functions contained within the Xilinx PS GPIO, PS Generic Interrupt Controller, and Exception drivers. These were created when we established our BSP.


I have uploaded the source code and the bin file to the GitHub repository. If you are using a different board than the UltraZed IOCC, you can modify this example very simply. To do this, you need to change the input and output pin and bank mapping to match the MIO allocations used on your board, assuming there is a switch and an LED connected to the MIO.





GPIO Bank and MIO Pin numbers to be updated in the source code for different boards




Once you have updated the source code example for your board, all you need to do is rebuild the project and run it on your hardware.



Code is available on Github as always.


If you want E book or hardback versions of previous MicroZed chronicle blogs, you can get them below.




  • First Year E Book here
  • First Year Hardback here.



MicroZed Chronicles hardcopy.jpg 



  • Second Year E Book here
  • Second Year Hardback here



MicroZed Chronicles Second Year.jpg 



The new PALTEK DS-VU3P-PCIE Data Brick places a Xilinx Virtex UltraScale+ VU3P FPGA along with 8Gbytes of DDR4-2400 SDRAM, two VITA57.1 FMC connectors, and four Samtec FireFly Micro Flyover ports on one high-bandwidth PCIe Gen3 card with a x16 host connector. The card aims to provide FPGA-based hardware acceleration for applications including 2K/4K video processing, machine learning, big data analysis, financial analysis, and high-performance computing.





PALTEK Data Brick packs Virtex UltraScale+ VU3P FPGA onto a PCIe card




The Samtec Micro Flyover ports accept both ECUE copper twinax and ECUO optical cables. The ECUE twinax cables are for short-reach applications and have a throughput of 28Gbps per channel. The ECUO optical cables operate at a maximum data rate of 14Gbps per channel and are available with as many as 12 simplex or duplex channels (with 28Gbps optical channels in development at Samtec).


For broadcast video applications, PALTEK also offers companion 12G-SDI Rx and 12G-SDI-Tx cards that can break out eight 12G-SDI video channels from one FireFly connection.


Please contact PALTEK directly for more information about these products.




 For more information about the Samtec FireFly system, see:








Epiq Solutions has announced the Matchstiq S12 SDR transceiver, an expansion to the Matchstiq transceiver family, which also includes the Matchstiq S10 and S11. All three Matchstiq family members pair a Freescale i.MX6 quad-core CPU, used for housekeeping and interfacing (Ethernet, HDMI, and USB), with a Xilinx Spartan-6 LX45T FPGA installed on the company’s Sidekiq MiniPCIe card, which performs the RF signal processing for SDR. These two devices, located on separate boards, communicate over a single PCIe lane and form a reusable SDR platform for the Matchstiq transceiver family. The Matchstiq S12 employs a Dropkiq frequency-extension board to take the bottom of its tuning frequency range below 1MHz. All three Matchstiq transceiver tuners top out at 6GHz and have 50MHz of channel bandwidth. The Matchstiq S10 and S11 SDR tuners go down to 70MHz.


Here are the block diagrams of all three Matchstiq transceivers, which illustrate the platform nature of the basic Matchstiq design:



Epiq Solutions Matchstiq SDR Transceiver Block Diagrams




And here’s a family photo:




Epiq Solutions Matchstiq SDR Transceiver Family









On May 16, David Pellerin, Business Development Manager at AWS (Amazon Web Services) will be presenting two 1-hour Webinars with a deep dive into Amazon’s EC2 F1 Instance. (The two times are to accommodate different time zones worldwide.) The Amazon EC2 F1 Instance allows you to create custom hardware accelerators for your application using cloud-based server hardware that incorporates multiple Xilinx Virtex UltraScale+ VU9P FPGAs. Each Amazon EC2 F1 Instance can include as many as eight FPGAs, so you can develop extremely large and capable, custom compute engines with this technology. Applications in diverse fields such as genomics research, financial analysis, video processing, security/cryptography, and machine learning are already using the FPGA-accelerated EC2 F1 Instance to improve application performance by as much as 30x over general-purpose CPUs.


Topics include:


  • How to design hardware accelerations to maximize the benefits of F1 instances
  • Design tools available with F1 instances as part of the Developer AMI, Hardware Development Kit
  • How to package and deploy your hardware acceleration code and offer it on the AWS Marketplace


Register for Amazon’s Webinar here.



“Xilinx All Programmable FPGAs and SoCs are playing a pivotal role in building 5G systems that can be easily and rapidly updated and enhanced to align with emerging standards and opportunities. The majority of the industry’s 5G proof of concepts, test beds and early commercialization trials for eMBB, URLLC, and mMTC use cases are leveraging Xilinx technology,” because “merchant silicon does not exist and ASICs are not viable this early in the 5G standardization phase. … The first wave of commercial 5G system deployments are likely to rely on these prototypes.”


That’s the premise of a new blog written by Xilinx’s Director Communications Strategic & Technical Marketing Harpinder Matharu and posted on the knect365.com Web site. Follow the link to read Matharu’s full blog post.




For more 5G coverage in Xcell Daily, see:









The huge number of low-cost Arduino peripheral shields from dozens of vendors makes the Arduino form factor extremely attractive. Now you can take advantage of that shield library using the Zynq SoC with the inexpensive, €89 Trenz ArduZynq, which puts a single-core Xilinx Zynq Z-7007S SoC along with 512Mbytes of DDR3L SDRAM, 16Mbytes of SPI Flash memory, and a MicroSD card socket into an Arduino form factor.


Here’s a photo of the Trenz ArduZynq board:






€89 Trenz ArduZynq puts a single-core Xilinx Zynq Z-7007S SoC into the Arduino form factor




Here’s a pinout diagram of the ArduZynq:





Trenz ArduZynq Pinout





Hardware and software design for this board is supported by the Xilinx Vivado Design Suite HL WebPACK Edition, downloadable at no cost.



Note: For more information about the single-core Zynq SoC family, see “One-ARM Zynq family joins the Zynq parade. Now you can choose from 31 devices with one, two, four, or six ARM microprocessors.”



By Adam Taylor


One of the great things about many of the Zynq SoC’s PS (processing system) peripherals is that we can break them out via the PL (programmable logic) I/O. This capability provides us with great flexibility at the system level as we can implement more peripherals than can be supported over the Zynq SoC’s MIO on its own. However, during the years of writing the MicroZed Chronicles, I have been asked questions by a few readers about mapping from the PS to the PL I/O using EMIO and how to map when using PS GPIO. So in this post, I am going to address those questions and provide a nice simple reference for how to do it.


We can break out many of the PS peripherals into the PL using EMIO. The exceptions are the USB ports, the SMC (static memory controller), and the QSPI Flash controller. There may be some performance degradation when the EMIO is used. For example, SDcard controller I/O operates at 50MHz when routed to the MIO and 25MHz when routed to the EMIO.





Peripheral and Routing to the EMIO



When we route signals to the EMIO, we will see the appropriate port appear at the top level of the Zynq IP block within Vivado. Taking the SPI as an example, we configure the peripheral to use the EMIO on the MIO configuration tab of the Zynq IP Configuration Wizard: we enable the SPI and select EMIO from the IO drop-down. This will create a SPI port at the top level of the design.





Selecting the EMIO for SPI




Resultant SPI port on the Zynq Block with port added





We can then use the standard XDC constraints file to route the I/O to any of the PL pins as we would for a normal element within the PL design.


Where it gets slightly more complicated is when we are using the GPIO and decide to extend that using EMIO. Suddenly, we need to understand GPIO banks and GPIO Numbers and IO pins.


The Zynq-7000 series provides 54 GPIO signals in two banks dedicated to the MIO (although if you use all 54 pins you cannot use any other peripherals). These banks consist of a 32-bit bank 0 and a 22-bit bank 1. Additionally, there are two EMIO-only banks. Both are 32 bits wide. Within the EMIO, these banks provide 64 inputs, 64 outputs, and another 64 output enables that can be used as outputs, giving us a total of 192 I/O signals (64 inputs, 128 outputs).


These GPIO signals are numbered from 0 to 53 for the banks within the PS MIO, and from 54 to 117 for GPIO within the EMIO region. When we break these signals out into the EMIO, they appear as a port on the Zynq IP block. Note that GPIO 0 on the Zynq port is pin 54 for the ARM cores. These IO signals can then be routed to the PL IO as we would any other signal and tied to a specific IO pin and standard using the XDC file. The diagram below shows the relationship between the different elements:
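The numbering scheme described above can be captured in a few lines. This is a minimal sketch rather than driver code: the function names are hypothetical, and numbering the two EMIO banks as banks 2 and 3 is an assumption that follows on from MIO banks 0 and 1.

```python
# Zynq-7000 PS GPIO numbering as described in the text.
# Each entry: (bank number, first pin number, width in bits, region).
# Assumption: the two EMIO banks are numbered 2 and 3.
BANKS = [
    (0, 0, 32, "MIO"),
    (1, 32, 22, "MIO"),
    (2, 54, 32, "EMIO"),
    (3, 86, 32, "EMIO"),
]

EMIO_BASE_PIN = 54  # GPIO 0 on the Zynq block's EMIO port is pin 54 in software


def emio_to_pin(emio_index: int) -> int:
    """Convert an index on the EMIO GPIO port (0-63) to a software pin number."""
    if not 0 <= emio_index < 64:
        raise ValueError("EMIO GPIO index must be in the range 0-63")
    return EMIO_BASE_PIN + emio_index


def decode_pin(pin: int):
    """Return (bank, bit within bank, region) for a software pin number."""
    for bank, first, width, region in BANKS:
        if first <= pin < first + width:
            return bank, pin - first, region
    raise ValueError("pin number must be in the range 0-117")


if __name__ == "__main__":
    for pin in (0, 53, emio_to_pin(0), emio_to_pin(63)):
        print(pin, decode_pin(pin))
```

The same lookup is handy when moving a design between boards, because only the pin number passed to the GPIO driver changes while the XDC file handles the physical pin assignment.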







It does get slightly confusing, however, when we use software to drive the GPIO signals within SDK. To drive the desired GPIO pin, we must use either the bank or the pin number. For the MIO signals, the pin numbers range from 0 to 53. For the EMIO signals, pin numbers range from 54 to 117. Once we understand this, and that we route the signals in the PL just like we do any other signal, we can quickly put the EMIO to use with the XGpioPs driver provided with the BSP.


Hopefully this makes the relationships among the Zynq software, the Zynq SoC’s PL, and the XDC file a little clearer to those just starting out.




Code is available on Github as always.







Face it, you use PCIe to go fast. That’s the whole point of the interface. So when you move data over your PCIe buses, you likely want to go as fast as possible. Perhaps you’d like some tips on getting maximum PCIe performance when designing with Xilinx’s most advanced FPGAs. You’re in luck: there’s a new 13-minute video that discusses that topic.


The video covers these contributors to PCIe performance:


  • Selecting the appropriate link speed and width
  • Maximum payload size
  • Largest possible transfer size
  • Enabling the maximum number of DMA channels
  • Polling versus interrupts (polling is faster)


The video explores a PCIe design for the KCU105 Kintex UltraScale FPGA Evaluation Kit using the Vivado Design Suite’s graphical IP Integrator (IPI) tool. The design took only about 20 to 30 minutes to create using IPI.


The video then discusses the results of various performance experiments using this design. Results like this:




PCIe results.jpg 




Here’s the video:







Xilinx today announced that its Spartan-7 FPGA family is now available for order entry and shipping to standard lead times. Spartan-7 devices are low-cost, low-power FPGAs designed for cost-sensitive systems. They are available in packages as small as 8x8mm (with 100 user I/O pins!) for designs where real estate for circuitry is limited (small video cameras, for example) while still offering huge I/O capabilities.


Here’s a table showing the key features of Spartan-7 device family members:



Spartan-7 Family Table with 1Q devices.jpg 



Spartan-7 family members deliver 50% better embedded performance than previous generations when using Xilinx 32-bit MicroBlaze soft processor IP. These devices also incorporate layered security features including AES-256 bitstream decryption, SHA-256 bitstream authentication, and on-chip eFUSE key storage.


For more information about the Xilinx Spartan-7 FPGA family, see:








Iain Mosely, founder of Converter Technology, has a new project that’s particularly interesting to me: developing a class-D audio amp using the Xilinx Zynq SoC (in the form of an Avnet MicroZed SOM) and GaN (gallium-nitride) power transistors. Mosely has been working with Zynq SoCs to develop dc/dc power converters and a class-D audio amp is certainly a close cousin.


Based on this LinkedIn blog post, Mosely appears to have designed a motherboard that accepts the Avnet MicroZed SOM and includes a stereo pair of GaN transistors. Here’s a photo of the board:





Class-D Audio Amp based on Zynq SoC and GaN Transistors.jpg 




If you’re not familiar with class-D amplifiers, they’re a riff on PWM used for audio applications. (Here’s a nice 8-minute tutorial video with a clear explanation of class-D amplification.) Class-D amplifiers were first proposed in 1958 (although the Internet appears to be mute on who first proposed the design, at least for the ten minutes I spent searching for the answer—surprise update, see below!). I remember learning about class-D amps in the late 1960s, when digital technology was still too immature to make the idea practical for consumer audio. The class-D amp’s advantage over linear audio amps is that class-D operation runs the output transistors like digital switches: on or off. That means little of the amp’s input power is wasted as heat and the output transistors won’t need big heat sinks or fans. The latter are anathema to the overall audio experience.


Another key factor in class-D amplifier design is that audio distortion comes from temporal errors and nonlinearities in the PWM (among other things).




In other words, clock jitter equals distortion for class-D amps.




That means that if you rely on a microprocessor’s software timing loop to perform the PWM conversion, you’re automatically and immediately in trouble. Not so if you implement the PWM in hardware, which is exactly what the Zynq SoC offers—in the form of high-speed programmable logic.
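The link between timing error and distortion can be put into numbers with a simple model of one PWM period. This is a sketch under stated assumptions: ideal switches, an output normalized to swing between +1 and -1, and hypothetical function names. The 400 kHz switching frequency and 1 ns edge error in the demo are made-up figures, not values from Mosely’s design.

```python
# Model of one class-D PWM period: the output is +1 during the "on" time and
# -1 for the rest, so the average over the period reproduces the audio sample.
# Assumptions: ideal switches, output normalized to +/-1, illustrative names.


def pwm_on_ticks(sample: float, ticks_per_period: int) -> int:
    """Quantize a sample in [-1, 1] to a whole number of 'on' clock ticks."""
    if not -1.0 <= sample <= 1.0:
        raise ValueError("sample must lie in [-1, 1]")
    return round((sample + 1.0) / 2.0 * ticks_per_period)


def period_average(on_time: float, period: float) -> float:
    """Average output over one period for a given 'on' time (same units)."""
    return (2.0 * on_time - period) / period


def jitter_error(edge_error: float, period: float) -> float:
    """Change in the period average caused by one edge arriving late."""
    return 2.0 * edge_error / period


if __name__ == "__main__":
    ticks = 256
    on = pwm_on_ticks(0.5, ticks)
    print("on ticks:", on, "average:", period_average(on, ticks))
    # Hypothetical figures: 400 kHz switching (2.5 us period), 1 ns edge error.
    print("error as fraction of full scale:", jitter_error(1e-9, 2.5e-6))
```

Because the error scales with the edge error divided by the switching period, a timing loop that wanders by microseconds is hopeless, while edges generated from a stable hardware clock keep the error small.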


It appears that Mosely is just starting out with this project, so it will be interesting to see what he discovers as he brings his power-conversion expertise to bear on the problem.







Update on the inventor of the Class D amplifier. More determined searching produced this 2012 obituary for Hans Camenzind from the Santa Clara Magazine, published by Santa Clara University:


“Hans Camenzind MBA ’71, the Swiss emigre analog guru who invented one of the most successful circuits in electronics history and introduced the concept of phase-locked loop to IC design, passed away in his sleep at the age of 78.

“Camenzind came to the United States in 1960 and worked for several years at some of the storied names of the newly developing semiconductor industry: Transitron, Tyco Semiconductor, and Signetics. 

“In 1971 he joined the ranks of entrepreneurs by founding InterDesign, a company specializing in semi-custom integrated circuit design. It was there, working under a contract with Signetics, that he invented the 555 timer.  Signetics commercialized the device in 1972, and it went on to become one of the most successful in the industry's history. The device, used in oscillator, pulse-generation and other applications, is still widely used today. Versions of the device have been or are still made by dozens of major semiconductor vendors, including Texas Instruments, Intersil, Maxim, Avago, Exar, Fairchild, NXP and STMicroelectronics. 

“Camenzind also introduced the idea of phase-locked loop to design and invented the first class D amplifier.

“Camenzind was a prolific author with interests as diverse as electronics textbooks and the history of the industry ("Much Ado About Almost Nothing") to a book on God and religion ("Circumstantial Evidence"). He wrote under the pen name John Penter. He received an MSEE from Northeastern University and an MBA from the University of Santa Clara, and, during his career secured 20 patents.”





MathWorks has just scheduled five dates with worldwide venues for its new “Software-Defined Radio with Zynq using Simulink” course. The full-day, hands-on class covers design and modeling of SDR systems using MATLAB and Simulink, targeting Xilinx Zynq SoCs. Here’s an overview of the course:



  • Model and simulate RF signal chain and communications algorithms.
  • Verify the operation of baseband transceiver algorithm using real data
  • Generate HDL and C code targeting the programmable logic (PL) and processing system (PS) on the Zynq SoC to implement TX/RX.



The course costs $750 or €700, depending on the venue. The venues and dates are:


  • Munich, Germany – June 30 and July 27, 2017
  • Natick, Massachusetts – October 20, 2017
  • San Diego, California – November 17, 2017
  • San Jose, California – December 8, 2017



Please contact MathWorks directly for more information about this SDR course.



Adam Taylor’s MicroZed Chronicles, Part 192: Pmod – What if there is no Driver?

by Xilinx Employee 05-08-2017 10:32 AM - edited 05-08-2017 10:33 AM


By Adam Taylor



We recently examined how we could use Pmods in our system. There are a lot of Pmods available from many vendors but drivers are not necessarily available for all of them. If there is no driver available, we can use the Pmod bridge in the Zynq SoC’s PL (programmable logic), which enables us to correctly map Pmod ports on our selected development board and to create our own Zynq PS (processing system) driver. If we were to explore one of the provided drivers, we would find these also use the Pmod bridge, coupled with an AXI IIC or SPI component.






Pmod AD2 PL Driver Components



In this example, I am going to be using Digilent’s DA4 octal DAC Pmod. I’ll integrate it with Digilent’s dual ADC AD2 Pmod, which we previously used in the driver example. We will develop our own driver with the Pmod bridge, generate an analog signal using the DA4 Pmod, and then receive the signal using the AD2 Pmod.






Pmod DA4 test set up



The Pmod bridge allows us to define the input types for both the top and bottom row of the Pmod connector. We can select from either GPIO, UART, IIC, or SPI interfaces. We make this selection for each of the Pmod rows in line with the Pmod we wish to drive. Selecting the desired type makes the pinout of the Pmod connector align with the standard for the interface type.


For the DA4, we need to use a SPI interface on the top row only. With this selected, we need to provide the actual SPI communication channel. As we are using the Zynq SoC, we have two options. The first would be to use an AXI SPI IP block within the PL and connected to the bridge. The second approach—and the one I am going to use—is to connect the bridge to the Zynq PS’ SPI using EMIO. This choice provides us with the ability to wire the pins from the PS SPI ports to the bridge inputs directly.


To do this we need to read the standard to ensure we can map from the SPI pins to the input pins on the bridge in the correct order (e.g. which PS SPI signal is connected to IN_0?). As these pins on the bridge represent different interface types, they are named generically. The diagram below shows how I did it for the DA4. Once we have mapped the pins for this example, we can build the project, export it to SDK, and then write the software to drive our DA4.






We can use the SPI drivers created by the BSP within SDK to drive the DA4. To interact with the DA4, the first thing we need to do is initialize the SPI controller. Once we have set the SPI Options for clock phase and master operation, we can then define a buffer and use the polled-transfer mode to transfer the required information to the DA4. A more complex driver would use an interrupt-driven approach as opposed to a polled one.
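The buffer that gets handed to the polled transfer is just a few bytes per DAC update, so its construction can be prototyped off-target. The sketch below is not the code from the repository. The 32-bit frame layout and the command value are assumptions based on my reading of the data sheet for the 12-bit octal DAC on the DA4, so check them against the data sheet before relying on them. The function names are hypothetical.

```python
# Build 4-byte SPI frames for a 12-bit octal DAC, plus a ramp test pattern.
# Assumed frame layout (verify against the DAC data sheet):
#   bits [31:28] don't care, [27:24] command, [23:20] channel,
#   bits [19:8]  12-bit DAC code, [7:0] don't care.

CMD_WRITE_AND_UPDATE = 0x3  # assumed command code: write and update a channel


def dac_frame(channel: int, code: int,
              command: int = CMD_WRITE_AND_UPDATE) -> bytes:
    """Pack one DAC update into the 4 bytes to send, most significant first."""
    if not 0 <= channel <= 7:
        raise ValueError("channel must be in the range 0-7")
    if not 0 <= code <= 0xFFF:
        raise ValueError("code must fit in 12 bits")
    word = (command << 24) | (channel << 20) | (code << 8)
    return word.to_bytes(4, "big")


def ramp_frames(channel: int, step: int = 256) -> list:
    """Return frames that step one channel from 0 up towards full scale."""
    return [dac_frame(channel, code) for code in range(0, 0x1000, step)]


if __name__ == "__main__":
    for frame in ramp_frames(channel=0):
        print(frame.hex())
```

On the target, each 4-byte frame would be copied into the send buffer and passed to the SPI driver’s polled-transfer call, one frame per chip-select assertion.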






I have uploaded the file I created to drive the DA4 to the GitHub repository. To test it, I drove a simple ramp output and used the scope feature in the Digilent Analog Discovery module to monitor the DAC output. I received the following signal:






With this completed and the DA4 known to be working as expected, I connected the DA4 and the AD2 together so that the Zynq SoC could receive the signal:






When doing this, we need to be careful to ensure the signal output by the DA4 is within the AD2 Pmod’s operating range.


Having completed this and shown that the DA4 is working on the hardware, we now understand how we can create drivers if there is no driver available.



Code is available on Github as always.





A Tale of Two Cameras: You’re gonna need a bigger FPGA

by Xilinx Employee on 05-05-2017 04:02 PM


Cutting-edge industrial and medical camera maker NET (New Electronic Technology) had a table at this week’s Embedded Vision Summit where the company displayed two generations of GigE cameras: the GigEPRO and CORSIGHT. Both of these camera families include multiple cameras that accommodate a wide range of monochrome and color image sensors. There are sixteen different C- and CS-mount cameras in the GigEPRO family with sensors ranging from WVGA (752x480 pixels) to WQUXGA (3664x2748 pixels) and a mix of global and rolling shutters. The CORSIGHT family includes eleven cameras with sensors ranging from WVGA (752x480 pixels) to QSXGA (2592x1944 pixels), or one of two line-scan sensors (2048 or 4096 pixels), with a mix of global and rolling shutters. In addition to its Gigabit Ethernet interface, the CORSIGHT cameras have WiFi, Bluetooth, USB 2.0, and optional GSM interfaces. Both the GigEPRO and CORSIGHT cameras are user-programmable and have on-board, real-time image processing, which can be augmented with customer-specific image-processing algorithms.



NET GigEPRO Camera.jpg 



GigEPRO Camera from NET







CORSIGHT Camera from NET




You program both cameras with NET’s GUI-based SynView Software Development Kit, which generates code for controlling the NET cameras and for processing the acquired images. When you create a program, SynView automatically determines if the required functionality is available in camera hardware. If not, SynView will do the necessary operations in software (although this increases the host CPU’s load). NET’s GigEPRO and CORSIGHT cameras are capable of performing significant on-board image processing right out of the box including Bayer decoding for color cameras, LUT (Lookup Table) conversion, white balance, gamma, brightness, contrast, color correction, and saturation.


Which leads to the question: What’s performing all of these real-time, image-processing functions in NET’s GigEPRO and CORSIGHT cameras?


Xilinx FPGAs, of course. (This should not be a surprise. After all, you’re reading a post in the Xilinx Xcell Daily blog.)


The GigEPRO cameras are based on Spartan-6 FPGAs—an LX45, LX75, or LX100 depending on the family member. At the Embedded Vision Summit, Dr. Thomas Däubler, NET’s Managing Director and CTO, explained to me that “the FPGAs are what give the GigEPRO cameras their PRO features.” In fact, there is user space reserved in the larger FPGAs for customer-specific algorithms to be performed in real time inside of the camera itself. What sort of algorithms? Däubler gave me two examples: laser triangulation and Q-code recognition. In fact, he said, some of NET’s customers perform all of the image processing and analysis in the camera and never send the image to the host—just the results of the analysis. Of course, this distributed-processing approach greatly reduces the host CPU’s processing load and therefore allows one host computer to handle many more cameras.


Here’s a photo from the Summit showing a NET GigEPRO camera inspecting a can on a spinning platform while reading the label on the can:







NET GigEPRO Camera Inspects Object on Spinning Table and Reads Label



There’s a second important reason for using the FPGA in NET’s GigEPRO cameras: the FPGAs create a hardware platform that allowed NET to develop the sixteen GigEPRO family members that handle many different image sensors with varied hardware interfaces and timing requirements. NET relied on the Spartan-6 FPGAs’ I/O programmability to help with this aspect of the camera family’s design.


So when it came time for NET to develop a new intelligent camera family—the recently introduced CORSIGHT smart vision system—with even more features, did NET’s design engineers continue to use FPGAs for real-time image processing?


Of course they did. For the newer camera, and for the same reasons, they chose the Xilinx Artix-7 FPGA family.


And here’s the CORSIGHT camera in action:







NET CORSIGHT Camera Inspects Object on Spinning Table




Note: For more information about the GigEPRO and CORSIGHT camera families, and the SynView software, please contact NET directly.



About the Author

Be sure to join the Xilinx LinkedIn group to get an update for every new Xcell Daily post!

Steve Leibson is the Director of Strategic Marketing and Business Planning at Xilinx. He started as a system design engineer at HP in the early days of desktop computing, then switched to EDA at Cadnetix, and subsequently became a technical editor for EDN Magazine. He's served as Editor in Chief of EDN Magazine, Embedded Developers Journal, and Microprocessor Report. He has extensive experience in computing, microprocessors, microcontrollers, embedded systems design, design IP, EDA, and programmable logic.