
Arty Z7: Digilent’s new Zynq SoC trainer and dev board—available in two flavors for $149 or $209

by Xilinx Employee ‎03-22-2017 11:18 AM - edited ‎03-22-2017 04:09 PM (2,120 Views)


I’ve known this was coming for more than a week, but last night I got double what I expected. Digilent’s Web site has been teasing the new Arty Z7 Zynq SoC dev board for makers and hobbyists for a week—but with no listed price. Last night, prices appeared. That’s right, there are two versions of the board available:


  • The $149 Arty Z7-10 based on a Zynq Z-7010 SoC
  • The $209 Arty Z7-20 based on a Zynq Z-7020 SoC





Digilent Arty Z7 dev board for makers and hobbyists




Other than that, the board specs appear identical.


The first thing you’ll note from the photo is that there’s a Zynq SoC in the middle of the board. You’ll also see the board’s USB, Ethernet, Pmod, and HDMI ports. On the left, you can see double rows of tenth-inch headers in an Arduino/chipKIT shield configuration. There are a lot of ways to connect to this board, which should make it a student’s or experimenter’s dream board considering what you can do with a Zynq SoC. (In case you don’t know, there’s a dual-core ARM Cortex-A9 MPCore processor on the chip along with a hearty serving of FPGA fabric.)


Oh yeah. The Xilinx Vivado HL Design Suite WebPACK tools? Those are available at no cost. (So is Digilent’s attractive cardboard packaging, according to the Arty Z7 Web page.)


Although the Arty Z7 board has now appeared on Digilent’s Web site, the product’s Web page says the expected release date is March 27. That’s five whole days away!


As they say, operators are standing by.



Please contact Digilent directly for more Arty Z7 details.





By Adam Taylor


In looking at the Zynq UltraScale+ MPSoC’s AMS capabilities so far, we have introduced the two slightly different Sysmon blocks residing within the Zynq UltraScale+ MPSoC’s PS (processing system) and PL (programmable logic). In this blog, I am going to demonstrate how we can get the PS Sysmon up and running when we use both the ARM Cortex-A53 and Cortex-R5 processor cores in the Zynq UltraScale+ MPSoC’s PS. There is little difference between the two types of processor, but I think it is important to show you how to use both.


The process to use the Sysmon is the same as it is for many of the peripherals we have looked at previously with the MicroZed Chronicles:


  1. Look Up the configuration of the Sysmon Peripheral (XSysMonPsu_LookupConfig)
  2. Initialize the Sysmon Peripheral (XSysMonPsu_CfgInitialize)
  3. Reset the Sysmon (XSysMonPsu_Reset)
  4. Set the Sequencer to safe mode while we update its configuration (XSysMonPsu_SetSequencerMode)
  5. Disable the alarms (XSysMonPsu_SetAlarmEnables)
  6. Set the Sequencer Enables for the channels we want to sample (XSysMonPsu_SetSeqChEnables)
  7. Set the ADC Clock Divisor (XSysMonPsu_SetAdcClkDivisor)
  8. Set the Sequencer Mode (XSysMonPsu_SetSequencerMode)


The function names in parentheses are those which we use to perform the operation we desire, provided we pass the correct parameters. In the simplest case, as in this example, we can then poll the output registers using the XSysMonPsu_GetAdcData() function. All of these functions are defined within the file xsysmonpsu.h, which is available under the board Support Package Lib Src directory in SDK.


Examining the functions, you will notice that each of the functions used in steps 4 to 8 requires an input parameter called SysmonBlk. This parameter is how we select which Sysmon (within the PS or the PL) we want to address. For this example, we will be specifying the PS Sysmon using XSYSMON_PS, which is also defined within xsysmonpsu.h. If we want to address the PL, we use the XSYSMON_PL definition, which we will be looking at next time.


There is also another header file which is of use and that is xsysmonpsu_hw.h. Within this file, we can find the definitions required to correctly select the channels we wish to sample in the sequencer. These are defined in the format:






This simple example samples the following within the PS Sysmon:


  1. Temperature
  2. Low Power Core Supply Voltage
  3. Full Power Core Supply Voltage
  4. DDR Supply Voltage
  5. Supply voltage for PS IO banks 0 to 3


We can use the conversion functions provided within xsysmonpsu.h to convert the raw value supplied by the ADC into temperature and voltage. However, the PS IO banks are capable of supporting 3.3V logic. As such, the standard conversion macro from raw reading to voltage is not correct for these IO banks or for the HD banks in the PL. (We will look at different IO bank types in another blog.)


The full-scale voltage is 3V for most of the voltage conversions. However, in line with UG580 (page 43), we need to use a full scale of 6V for the PS IO. Otherwise we will see a value only half of what we are expecting for that bank’s supply voltage setting. With this in mind, my example contains a conversion function at the top of the source file to be used for these IO banks, to ensure that we get the correct value.
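To illustrate why the full-scale value matters, here is a minimal Python sketch of the arithmetic. It is not the conversion function from my source file and not the macro from xsysmonpsu.h. The function names are hypothetical, and the sketch assumes the raw reading is treated as a 16-bit code, so that a code of 65536 would correspond to full scale.

```python
ADC_CODE_RANGE = 65536  # assumption: raw readings are treated as 16-bit codes


def raw_to_voltage(raw, full_scale_volts=3.0):
    """Convert a raw ADC code to volts for a given full-scale voltage."""
    if not 0 <= raw < ADC_CODE_RANGE:
        raise ValueError("raw code out of range")
    return raw * full_scale_volts / ADC_CODE_RANGE


def raw_to_voltage_ps_io(raw):
    """Hypothetical helper for the PS IO banks, which need a 6V full scale."""
    return raw_to_voltage(raw, full_scale_volts=6.0)


# The same mid-scale code reads as 1.5V with the 3V scale
# and as 3.0V with the 6V scale, which is the factor of two noted above.
mid_scale = 0x8000
standard = raw_to_voltage(mid_scale)
ps_io = raw_to_voltage_ps_io(mid_scale)
```

Using the 3V conversion on a PS IO bank would therefore report half of the real supply voltage.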


The Zynq UltraScale+ MPSoC architecture permits both the APU (the ARM Cortex-A53 processors) and the RPU (the ARM Cortex-R5 processors) to address the Sysmon. To demonstrate this, the same file was used in applications first targeting an ARM Cortex-A53 processor in the APU and then targeting the ARM Cortex-R5 processor in the RPU. I used Core 0 in both cases.


The only difference between these two cases was the need to create new applications that select the core to be targeted and then updating the FSBL to load the correct core. (See “Adam Taylor’s MicroZed Chronicles, Part 172: UltraZed Part 3—Saying hello world and First-Stage Boot” for more information on how to do this.)






Results when using the ARM Cortex-A53 Core 0 Processor






Results when using the ARM Cortex-R5 Core 0 Processor




When I ran the same code, which is available in the GitHub directory, I received the results shown above in the terminal program, which show it working on both the ARM Cortex-A53 and ARM Cortex-R5 cores.


Next time we will look at how we can use the PL Sysmon.




Code is available on Github as always.


If you want E book or hardback versions of previous MicroZed chronicle blogs, you can get them below.




  • First Year E Book here
  • First Year Hardback here.



MicroZed Chronicles hardcopy.jpg 



  • Second Year E Book here
  • Second Year Hardback here



MicroZed Chronicles Second Year.jpg






By Adam Taylor


Embedded vision is one of my many FPGA/SoC interests. Recently, I have been doing some significant development work with the Avnet Embedded Vision Kit (EVK). (For more info on the EVK and its uses, see Issues 114 to 126 of the MicroZed Chronicles.) As part of my development, I wanted to synchronize the EVK display output with an external source, which is also useful if we want to synchronize multiple image streams.


Implementing this is straightforward provided we have the correct architecture. The main element we need is a buffer between the upstream camera/image sensor chain and the downstream output-timing and -processing chain. VDMA (Video Direct Memory Access) provides this buffer by allowing us to store frames from the upstream image-processing pipeline in DDR SDRAM and then read out the frames into a downstream processing pipeline with different timing.


The architectural concept appears below:






VDMA buffering between upstream and downstream with external sync
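To make the buffering idea concrete, here is a minimal Python sketch of a frame store shared by a writer and a reader that run with different timing. It is a toy model only and does not describe the internal behaviour of the VDMA IP. The class name FrameStore, the default of three buffers, and the policy that the reader always takes the newest complete frame are all assumptions made for illustration.

```python
class FrameStore:
    """Toy model of frame buffers in memory shared by a writer and a reader."""

    def __init__(self, num_buffers=3):
        if num_buffers < 2:
            raise ValueError("need at least two buffers to decouple the two sides")
        self.buffers = [None] * num_buffers
        self.write_index = 0   # slot the upstream side fills next
        self.latest = None     # slot holding the newest complete frame

    def write_frame(self, frame):
        """Upstream side: store a complete frame and move to the next slot."""
        self.buffers[self.write_index] = frame
        self.latest = self.write_index
        self.write_index = (self.write_index + 1) % len(self.buffers)

    def read_frame(self):
        """Downstream side: fetch the newest complete frame, if any."""
        if self.latest is None:
            return None
        return self.buffers[self.latest]


store = FrameStore()
store.write_frame("frame 0")
store.write_frame("frame 1")       # upstream runs ahead of downstream
first_read = store.read_frame()    # downstream sees the newest frame
second_read = store.read_frame()   # no new frame yet, so it is repeated
```

The point of the model is that neither side waits for the other: the downstream chain can be re-timed to an external sync while the upstream chain keeps its own timing.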



For most downstream chains, we use a combination of the video timing controller (VTC) and AXI Stream to Video Out IP blocks, both provided in the Vivado IP library. These two IP blocks work together. The VTC provides output timing and generates signals such as VSync and HSync. The AXI Stream to Video Out IP Block synchronizes its incoming AXIS stream with the timing signals provided by the VTC to generate the output video signals. Once the AXI Stream to Video Out block has synchronized with these signals, it is said to be locked and it will generate output video and timing signals that we can use.


The VTC itself is capable of both detecting input video timing and generating output video timing. These can be synchronized if you desire. If no video input timing signals are available to the VTC, then the input frame sync pulse (FSYNC_IN) serves to synchronize the output timing.  






Enabling Synchronization with FSYNC_IN or the Detector




If FSYNC_IN alone is used to synchronize the output, we need to use not only FSYNC_IN but also the VTC-provided frame sync out (FSYNC_OUT) and GEN_CLKEN to ensure correct synchronization. GEN_CLKEN is an input enable that allows the VTC generator output stage to be clocked.


The FSYNC_OUT pulse can be configured to occur at any point within the frame. For this application, it has been configured to be generated at the very end of the frame. This configuration can take place in the VTC re-configuration dialog within Vivado for a one-time approach or, if an AXI Lite interface is provided, it can be positioned using that interface at run time.


The algorithm used to synchronize the VTC to an external signal is:


  • Generate a 1-clock-wide pulse on FSYNC_IN reception
  • Enable GEN_CLKEN
  • Wait for the FSYNC_OUT to be received
  • Disable GEN_CLKEN
  • Repeat from the first step


Should GEN_CLKEN not be disabled, the VTC will continue to run freely and will generate the next frame sequence. Issuing another FSYNC_IN while this is occurring will not result in re-synchronisation but will result in the AXI Stream to Video Out IP block being unable to synchronize the AXIS video with the timing information and losing lock.


Therefore, to control the enabling of the GEN_CLKEN we need to create a simple RTL block that implements the algorithm above.
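Before writing the RTL, it can help to model the algorithm above in software. The following is a minimal cycle-based Python sketch of that gating logic, not the RTL block from my project. The class name SyncGate and its step() method are hypothetical, and suppressing the sync pulse while a frame is still being generated is my assumption for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SyncGate:
    """Behavioural model of the GEN_CLKEN gating logic (illustrative only)."""

    gen_clken: bool = False       # enable driven to the VTC generator
    prev_fsync_in: bool = False   # previous sample, for rising-edge detection

    def step(self, fsync_in: bool, fsync_out: bool) -> Tuple[bool, bool]:
        """Advance one clock cycle and return (fsync_pulse, gen_clken)."""
        rising = fsync_in and not self.prev_fsync_in
        self.prev_fsync_in = fsync_in
        pulse = False
        if not self.gen_clken:
            if rising:
                # External sync received: issue a 1-clock pulse and start the generator.
                pulse = True
                self.gen_clken = True
        elif fsync_out:
            # End of frame reached: stop the generator until the next external sync.
            self.gen_clken = False
        return pulse, self.gen_clken


gate = SyncGate()
trace = [
    gate.step(fsync_in=True, fsync_out=False),    # sync arrives: pulse, generator enabled
    gate.step(fsync_in=True, fsync_out=False),    # input still high: no second pulse
    gate.step(fsync_in=False, fsync_out=True),    # frame ends: generator disabled
]
```

Because the pulse is derived from a rising edge, a long external sync signal still produces only a single 1-clock-wide pulse per frame.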





Vivado Project Demonstrating the concept



When simulated, this design resulted in the VTC synchronizing to the FSYNC_IN signal as intended. It also worked the same when I implemented it in my EVK kit, allowing me to synchronize the output to an external trigger.





Simulation Results




Code is available on Github as always.







Everspin announces MRAM-based NVMe accelerator board and a new script for adapting FPGAs to MRAMs

by Xilinx Employee ‎03-08-2017 10:18 AM - edited ‎03-08-2017 01:40 PM (830 Views)


MRAM (magnetic RAM) maker Everspin wants to make it easy for you to connect its 256Mbit DDR3 ST-MRAM devices (and its soon-to-be-announced 1Gbit ST-MRAMs) to Xilinx UltraScale FPGAs, so it now provides a software script for the Vivado MIG (Memory Interface Generator) that adapts the MIG DDR3 controller to the ST-MRAM’s unique timing and control requirements. Everspin has been shipping MRAMs for more than a decade and, according to this EETimes.com article by Dylan McGrath, it’s still the only company to have shipped commercial MRAM devices.


Nonvolatile MRAM’s advantage is that it has no wearout failure, as opposed to Flash memory for example. This characteristic gives MRAM huge advantages over Flash memory in applications such as server-class enterprise storage. MRAM-based storage cards require no wear leveling and their read/write performance does not degrade over time, unlike Flash-based SSDs.


As a result, Everspin also announced its nvNITRO line of NVMe storage-accelerator cards. The initial cards, the 1Gbyte nvNITRO ES1GB and 2Gbyte nvNITRO ES2GB, deliver 1,500,000 IOPS with 6μsec end-to-end latency. When Everspin's 1Gbit ST-MRAM devices become available later this year, the card capacities will increase to 4 to 16Gbytes.


Here’s a photo of the card:





Everspin nvNITRO Storage Accelerator




If it looks familiar, perhaps you’re recalling the preview of this board from last year’s SC16 conference in Salt Lake City. (See “Everspin’s NVMe Storage Accelerator mixes MRAM, UltraScale FPGA, delivers 1.5M IOPS.”)


If you look at the photo closely, you’ll see that the hardware platform for this product is the Alpha Data ADM-PCIE-KU3 PCIe accelerator card, loaded with 1 or 2Gbyte Everspin ST-MRAM DIMMs. Everspin has added its own IP to the Alpha Data card, based on a Kintex UltraScale KU060 FPGA, to create an MRAM-based NVMe controller.


As I wrote in last year’s post:


“There’s a key point to be made about a product like this. The folks at Alpha Data likely never envisioned an MRAM-based storage accelerator when they designed the ADM-PCIE-KU3 PCIe accelerator card but they implemented their design using an advanced Xilinx UltraScale FPGA knowing that they were infusing flexibility into the design. Everspin simply took advantage of this built-in flexibility in a way that produced a really interesting NVMe storage product.”


It’s still an interesting product, and now Everspin has formally announced it.



Adam Taylor’s MicroZed Chronicles Part 175 Analog Mixed Signal UltraZed Edition Part 5

by Xilinx Employee ‎03-06-2017 11:11 AM - edited ‎03-06-2017 11:12 AM (1,447 Views)


By Adam Taylor


Without a doubt, some of the most popular MicroZed Chronicles blogs I have written about the Zynq 7000 SoC explain how to use the Zynq SoC’s XADC. In this blog, we are going to look at how we can use the Zynq UltraScale+ MPSoC’s Sysmon, which replaces the XADC within the MPSoC.







The MPSoC contains not one but two Sysmon blocks. One is located within the MPSoC’s PS (processing system) and another within the MPSoC’s PL (programmable logic). The capabilities of the PL and PS Sysmon blocks are slightly different. While the processors in the MPSoC’s PS can access both Sysmon blocks through the MPSoC’s memory space, the different Sysmon blocks have different sampling rates and external interfacing abilities. (Note: the PL must be powered up before the PL Sysmon can be accessed by the MPSoC’s PS. As such, we should check the PL Sysmon control register to ensure that it is available before we perform any operations that use it.)


The PS Sysmon samples its inputs at 1Msamples/sec while the PL Sysmon has a reduced sampling rate of 200Ksamples/sec. However, the PS Sysmon does not have the ability to sample external signals. Instead, it monitors the Zynq MPSoC’s internal supply voltages and die temperature. The PL Sysmon can sample external signals and it is very similar to the Zynq SoC’s XADC, having both a dedicated VP/VN differential input pair and the ability to interface to as many as sixteen auxiliary differential inputs. It can also monitor on-chip voltage supplies and temperature.







Sysmon Architecture within the Zynq UltraScale+ MPSoC




Just as with the Zynq SoC’s XADC, we can set upper and lower alarm limits for ADC channels within both the PL and PS Sysmon in the Zynq UltraScale+ MPSoC. You can use these limits to generate an interrupt should the configured bound be exceeded. We will look at exactly how we can do this in another blog once we understand the basics.


The two diagrams below show the differences between the PS and PL Sysmon blocks in the Zynq UltraScale+ MPSoC:





Zynq UltraScale+ MPSoC’s PS System Monitor (UG580)







Zynq UltraScale+ MPSoC’s PL Sysmon (UG580)




Interestingly, the Sysmone4 block in the MPSoC’s PL provides direct register access to the ADC data. This will be useful if using either the VP/VN or Aux VP/VN inputs to interface with sensors that do not require high sample rates. This arrangement permits downstream signal processing, filtering, and transfer functions to be implemented in logic.


Both MPSoC Sysmon blocks require 26 ADC clock cycles to perform a conversion. Therefore, if we are sampling at 200Ksamples/sec using the PL Sysmon, we require a 5.2MHz ADC clock. For the PS Sysmon to sample at 1Msamples/sec, we need to provide a 26MHz ADC clock.


We set the AMS modules’ clock within the MPSoC Clock Configuration dialog, as shown below:






Zynq UltraScale+ MPSoC’s AMS clock configuration




The eagle-eyed will notice that I have set the clock to 52MHz and not 26MHz. This is because the PS Sysmon’s clock divisor has a minimum value of 2, so setting the clock to 52MHz results in the desired 26MHz clock. The minimum divisor is 8 for the PL Sysmon, although in this case it would need to be divided by 10 to get the desired 5.2MHz clock. You also need to pay careful attention to the actual frequency and not just the requested frequency to get the best performance. The actual frequency determines the sample rate, and you may not always get the exact frequency you want, as is the case here.
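The clock arithmetic is easy to check with a few lines of Python. This is just a sketch of the numbers quoted in this blog (26 ADC clock cycles per conversion, minimum divisors of 2 for the PS and 8 for the PL). The function names are hypothetical and are not part of any Xilinx tool or driver.

```python
CYCLES_PER_CONVERSION = 26         # ADC clock cycles per conversion (from the text)
MIN_DIVISOR = {"PS": 2, "PL": 8}   # minimum clock divisors (from the text)


def required_adc_clock_hz(sample_rate_hz):
    """ADC clock needed to reach a given sample rate."""
    return sample_rate_hz * CYCLES_PER_CONVERSION


def achieved_sample_rate_hz(ams_clock_hz, divisor, block):
    """Sample rate obtained from the AMS clock after the Sysmon clock divisor."""
    if divisor < MIN_DIVISOR[block]:
        raise ValueError("divisor below the minimum for the %s Sysmon" % block)
    return ams_clock_hz / divisor / CYCLES_PER_CONVERSION


ps_adc_clock = required_adc_clock_hz(1_000_000)               # 26MHz for the PS Sysmon
pl_adc_clock = required_adc_clock_hz(200_000)                 # 5.2MHz for the PL Sysmon
ps_rate = achieved_sample_rate_hz(52_000_000, 2, "PS")        # 52MHz divided by 2
pl_rate = achieved_sample_rate_hz(52_000_000, 10, "PL")       # 52MHz divided by 10
```

Passing the actual AMS clock frequency reported by the tools, rather than the requested 52MHz, shows the sample rate you will really get.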


Next time in the UltraZed Edition of the MicroZed Chronicles, we will look at the software required to communicate with both the PS and PL Sysmon in the Zynq UltraScale+ MPSoC.





UltraScale Architecture System Monitor User Guide, UG580


Zynq UltraScale+ MPSoC Register Reference





Code is available on Github as always.





Ettus Research accepts 7 teams to compete in the $10K RFNoC & Vivado HLS challenge for SDR designs

by Xilinx Employee ‎02-28-2017 11:13 AM - edited ‎03-03-2017 08:42 AM (802 Views)


Last September at the GNU Radio Conference (GRCon16) in Boulder, CO, Ettus Research announced its RFNoC & Vivado HLS Challenge with a $10,000 grand prize for developing “innovative and useful open-source RF Network on Chip (RFNoC) blocks that highlight the productivity and development advantage of Xilinx Vivado High-Level Synthesis (HLS) for FPGA programming using C, C++, or System C. The new RFNoC blocks generated during the challenge will add to the rapidly growing library of available open-source blocks for programming FPGAs in SDR development and production.”


Based on formal proposals, the company has now accepted seven teams for the challenge:



  • Team Guerrieri – Self
  • Team MarmotE – Vanderbilt University & Budapest University of Technology
  • Team WINLAB – Rutgers University
  • Team Waveform Benders – Karlsruhe Institute of Technology
  • Team Rabbit Ears – UC San Diego & SPAWAR Systems Center Pacific
  • Team Signum – Tennessee Tech University
  • Team E to the J Omega – HawkEye 360



The final challenge competition will take place in May or June 2017 (venue to be announced) and the teams are required to submit technical papers for publication in the GRCon17 technical proceedings outlining their design’s contribution, implementation, results, and lessons learned. (GRCon17 takes place on September 11-15, 2017 in San Diego, CA.)




For more information about the challenge, see “Matt Ettus of Ettus Research wants you to win $10K. All you have to do is meet his RFNoC & Vivado HLS challenge for SDR.”




If you’re still uncertain as to what System View’s Visual System Integrator hardware/software co-development tool for Xilinx FPGAs and Zynq SoCs does, the following 3-minute video should make it crystal clear. Visual System Integrator extends the Xilinx Vivado Design Suite and makes it a system-design tool for a wide variety of embedded systems based on Xilinx devices.


This short video demonstrates System View’s tool being used for a Zynq-controlled robotic arm:






For more information about System View’s Visual System Integrator hardware/software co-development tool, see:






After five years and a dozen prototypes, the Haddington Dynamics development team behind Dexter—a $3K, trainable, 5-axis robotic arm kit for personal manufacturing—launched the project on Kickstarter just yesterday and are already 41.6% of the way to meeting the overall $100K project funding goal with 28 days left in the funding period. Dexter is designed to be a personal robot arm with the ability to make a wide variety of goods. Think of Dexter as your personal robotic factory with additive (2.5D/3D printing) and subtractive (drilling and milling) capabilities.


Dexter incorporates a 6-channel motor controller but the arm itself uses five stepper motors for positioning. Adding a gripper or other end-effector to the end of the arm adds a 6th degree of freedom.





Dexter Robotic Arm 3D CAD Drawing




You need some hefty, high-performance computation to precisely coordinate five axes of motion and the current Dexter prototype employs programmable logic in the form of a Xilinx Zynq Z-7000 SoC on an Avnet MicroZed dev board for this task. (The Kickstarter page even shows an IP block diagram from the Vivado Design Suite.)


The Dexter team calls the Zynq SoC an FPGA supercomputer:


“By using a(n) FPGA supercomputer to solve the precision control problem, we were able to optimize the physical and electrical architecture of the robot to minimize the mass and therefore the power requirements. All 5 of the stepper motors are placed at strategic locations to lower the center of mass and to statically balance the arm. This way almost all of the torque of the motors is used to move the payload not the robot.”


The prototype design achieves 50-micron repeatability!


Here’s a video of the prototype Dexter robotic arm in development, including a shot of the robotic arm threading a needle:





There are several more videos on the Dexter Kickstarter page.





Adam Taylor’s MicroZed Chronicles, Part 172: UltraZed Part 3—Saying hello world and First-Stage Boot

by Xilinx Employee ‎02-16-2017 10:55 AM - edited ‎02-16-2017 11:14 AM (1,410 Views)


By Adam Taylor


We have now built a basic Zynq UltraScale+ MPSoC hardware design for the UltraZed board in Vivado that got us up and running. We’ve also started to develop software for the cores within the Zynq UltraScale+ MPSoC’s PS (processing system). The logical next step is to generate a simple “hello world” program, which is exactly what we are going to do for one of the cores in the Zynq UltraScale+ MPSoC’s APU (Application Processing Unit).


As with the Zynq Z-7000 SoC, we need three elements to create a simple bare-metal program for the Zynq UltraScale+ MPSoC:


  • Hardware Platform Definition – This defines the underlying hardware platform configuration, address spaces, and IP modules within the design.
  • Board Support Package – This uses the hardware platform to create a hardware abstraction layer (HAL) that provides the necessary drivers for the IP within the system. We need those drivers to use these hardware resources in an application.
  • Application – This is the application we will be writing. In this case it will be a simple “hello world” program.



To create a new hardware platform definition, select:



File -> New -> Other -> Xilinx – Hardware Platform Specification



Provide a project name and select the hardware definition file, which was exported from Vivado. You can find the exported file within the SDK directory if you exported it local to the project.





Creating the Hardware platform



Once the hardware platform has been created within SDK, you will see the hardware definition file open within the file viewer. Browsing through this file, you will see the address ranges of the Zynq UltraScale+ MPSoC’s ARM Cortex-A53 and Cortex-R5 processors and PMU (Platform Management Unit) cores within the design. A list of all IP within the processors’ address space appears at the very bottom of the file.





 Hardware Platform Specification in SDK file browser



We then use the information provided within the hardware platform to create a BSP for our application. We create a new BSP by selecting:



File -> New -> Board Support Package



Within the create BSP dialog, we can select the processor this BSP will support, the compiler to be used, and the selected OS. In this case, we’ll use bare metal or FreeRTOS.


For this first example, we will be running the “hello world” program from the APU on processor core 0. We must be sure to target the same core as we create the BSP and application if everything is to function correctly.





 Board Support Package Generation



With the BSP created, the next step is to create the application using this BSP. We can create the application in a similar manner to the BSP and hardware platform:



File -> New -> Application Project



This command opens a dialog that allows us to name the project, select the BSP, specify the processor core, and select the operating system. On the first tab of the dialog, configure these settings for APU core 0, bare metal, and the BSP just created. On the second tab of the dialog box, select the pre-existing “hello world” application.





Configuring the application





Selecting the Hello World Application



At this point, we have the application ready to run on the UltraZed dev board. We can run the application using either the debugger within SDK or we can boot the device from a non-volatile memory such as an SD card.


To boot from an SD Card, we need to first create a first-stage boot loader (FSBL). To do this, we follow the same process as we do when creating a new application. The FSBL will be based on the current hardware platform but it will have its own BSP with several specific libraries enabled.



Select File -> New -> Application Project



Enter a project name and select the core and OS to support the current build as previously done for the “hello world” application. Click the “Create New” radio button for the BSP and then on the next page, select the Zynq MP FSBL template.






Configuring the FSBL application






 Selecting the FSBL template



With the FSBL created, we now need to build all our applications to create the required ELF files for the FSBL and the application. If SDK is set to build automatically, these files will have been created following the creation of the FSBL. If not, then select:



Project -> Build All



Once this process completes, the final step is to create a boot file. The Zynq UltraScale+ MPSoC boots from a file named boot.bin, created by SDK. This file contains the FSBL, FPGA programming file, and the applications. We can create this file by hand and indeed later in this series we will be doing so to examine the more advanced options. However, for the time being we can create a boot.bin by right-clicking on the “hello world” application and selecting the “Create Boot Image” option.





 Creating the boot image from the hello world application




This will populate the “create boot image” dialog correctly with the FSBL, FPGA bit file, and our application, provided the ELF files are available.





Boot Image Creation Dialog correctly populated



Once the boot file is created, copy the boot.bin onto a microSD card and insert it into the SD card holder on the UltraZed IOCC (I/O Carrier Card). The final step, before we apply power, is to set SW2 on the UltraZed card to boot from the SD Card. The setting for this is 1 = ON, 2 = OFF, 3 = ON, and 4 = OFF. Now switch the power on, connect to a terminal window, and you will see the program start and execute.


When I booted this on my UltraZed and IOCC combination, the following appeared in my terminal window:





Hello World Running



Next week we will look a little more at the architecture of the Zynq UltraScale+ MPSoC’s PS.




Code is available on Github as always.






The Koheron SDK and Linux distribution, based on Ubuntu 16.04, allows you to prototype working instruments for the Red Pitaya Open Instrumentation Platform, which is based on a Xilinx Zynq All Programmable SoC. The Koheron SDK outputs a configuration bitstream for the Zynq SoC along with the requisite Linux drivers, ready to run under the Koheron Linux Distribution. You build the FPGA part of the Zynq SoC design by writing the code in Verilog using the Xilinx Vivado Design Suite and assembling modules using TCL.


The Koheron Web site already includes several instrumentation examples based on the Red Pitaya including an ADC/DAC exerciser, a pulse generator, an oscilloscope, and a spectrum analyzer. The Koheron blog page documents several of these designs along with many experiments designed to be conducted using the Red Pitaya board. If you’re into Python as a development environment, there’s a Koheron Python library as well.


There’s also a quick-start page on the Koheron site if you’re in a hurry.





 The Red Pitaya Open Instrumentation Platform




For more articles about the Zynq-based Red Pitaya, see:





By Adam Taylor


Having introduced the Zynq UltraScale+ MPSoC last week, this week it is time to look at the Avnet UltraZed-EG SOM and its carrier card and to start building our first “hello world” program.


Like its MicroZed and PicoZed predecessors, the UltraZed-EG is a System on Module (SOM) that contains all of the necessary support functions for a complete embedded processing system. As a SOM, this module is designed to be integrated with an application-specific carrier card. In this instance, our application-specific card is the Avnet UltraZed IO Carrier Card.


The specific Zynq UltraScale+ MPSoC contained within the UltraZed SOM is the XCZU3EG-SFVA625, which incorporates a quad-core ARM Cortex-A53 APU (Application Processing Unit), dual ARM Cortex-R5 processors in an RPU (Real-Time Processing Unit), and an ARM Mali-400 GPU. Coupled with a very high performance programmable-logic array based on the Xilinx UltraScale+ FPGA fabric, suffice it to say that exploring how to best use all of these resources will keep us very, very busy. You can find the 36-page product specification for the device here.


The UltraZed SOM itself, shown in the diagram below, provides us with 2GBytes of DDR4 SDRAM, while non-volatile storage for our application(s) is provided by both dual QSPI and eMMC Flash memory. Most of the Zynq UltraScale+ MPSoC’s PS and PL I/O are broken out to one of three headers to provide maximum flexibility on the application-specific carrier card.






Avnet UltraZed-EG SOM Block Diagram




The UltraZed IO Carrier Card (IOCC) breaks out the I/O pins from the SOM to a wide variety of interface and interconnect technologies including Gigabit Ethernet, USB 2/3, UART, PMOD, Display Port, SATA, and Arduino Shield. This diverse set of I/O connections gives us wide latitude in developing all sorts of systems. The IOCC also provides a USB to JTAG interface allowing us to program and debug our system. You’ll find more information on the IOCC here.


Having introduced the UltraZed and its IOCC, it is time to write a simple “hello world” application and to generate our first Zynq UltraScale+ MPSoC design.


The first step on this journey is to make sure we have used the provided voucher to generate a license and downloaded the Design Edition of the Vivado Design Suite.


The next step is to install the board files to provide Vivado with the necessary information to create designs targeting the UltraZed SOM. You can download these files using this link. These board-definition files include information such as the actual Zynq UltraScale+ MPSoC device populated on the SOM, connections to the PS on the IOCC, and a preset configuration for the SOM. We can of course create an example without using these files; however, it requires a lot more work.


Once you have downloaded the zip file, extract the contents into the following directory:



<Vivado Install Root>/data/boards/boardfiles



When this is complete, you will see that the UltraZed board definitions are now in the directory and we can use them within our design.






I should point out here that some of the UltraZed boards (including mine) use ES1 silicon. To alert Vivado to this, we need to create an init.tcl file in the scripts directory that will enable us to use ES1 silicon. Doing so is very simple. Within the directory:


<Vivado root>/scripts


Create a file called init.tcl. Enter the line “enable_beta_device*” into this file to enable the use of ES1 silicon within your toolchain.
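If you set up more than one machine, the two preparation steps above (unpacking the board files and creating init.tcl) are easy to script. What follows is only a minimal Python sketch, not part of the Avnet or Xilinx flow. The helper names, the archive name, and the throwaway install root used in the dry run are all assumptions made for illustration; the only things taken from the text are the two directory layouts and the line written into init.tcl.

```python
import tempfile
import zipfile
from pathlib import Path


def install_board_files(zip_path, vivado_root):
    """Extract the board-definition zip into <vivado_root>/data/boards/boardfiles."""
    target = Path(vivado_root) / "data" / "boards" / "boardfiles"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


def write_init_tcl(vivado_root, line="enable_beta_device*"):
    """Create <vivado_root>/scripts/init.tcl holding the line quoted in the text.

    Note: this overwrites an existing init.tcl; append by hand if you already have one.
    """
    scripts = Path(vivado_root) / "scripts"
    scripts.mkdir(parents=True, exist_ok=True)
    init_file = scripts / "init.tcl"
    init_file.write_text(line + "\n")
    return init_file


def demo():
    """Dry run against a temporary directory instead of a real Vivado install."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "fake_vivado_root"   # hypothetical install root
        zip_path = Path(tmp) / "boards.zip"     # hypothetical archive name
        with zipfile.ZipFile(zip_path, "w") as archive:
            # Hypothetical archive layout, just enough to exercise the helper.
            archive.writestr("ultrazed_demo/1.0/board.xml", "<board/>")
        boards = install_board_files(zip_path, root)
        init_file = write_init_tcl(root)
        return sorted(p.name for p in boards.iterdir()), init_file.read_text()
```

Calling demo() exercises both helpers without touching a real installation. To use them for real, pass the downloaded zip file and your own Vivado install root.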







With this completed, we can open Vivado and create a new RTL project. After entering the project name and location, click Next on the add sources, IP, and constraints tabs. This should bring you to the part selection tab. Click on Boards and you should see our UltraZed IOCC board. Select that board and then finish the new-project dialog.







This will create a new project.


For this project, I am just going to use the Zynq UltraScale+ MPSoC’s PS to print “hello world.” I usually like to do this with new boards to ensure that I have pipe-cleaned the tool chain. To do this, we need a hardware-definition file to export to SDK to define the hardware platform.


The first step in this sequence is within Flow Navigator. On the left-hand side of the Vivado screen, select the Create Block Diagram option. This will provide a dialog box allowing you to name your block design (or you can leave it default). Click OK and this will create a blank block diagram (in the example below mine is called design_1).







Within this block diagram, we need to add an MPSoC system. Click on the “add IP” button as indicated in the block diagram. This will bring up an IP dialog. Within the search box, type in “MPSoC” and you will see the Zynq UltraScale+ MPSoC IP block. Double click on this and it will be added to the diagram automatically.







Once the block has been added, you will notice a designer assistance notification across the top of the block diagram. For the moment, do not click on that. Instead, double click on the MPSoC IP in your block diagram and it will open up the customization screen for the MPSoC, just like any other IP block.







Looking at the customization screen, you will see it is not yet configured for the target board. For instance, the IOU block has no MIO configuration. Had we not downloaded the board definition, we would now have to configure this manually. But why do that when we can use the shortcut?





We have the board-definition files, so all we need to do to correctly configure this for the IOCC is close the customization dialog and click on the Run Block Automation notification at the top of the block diagram. This will configure the MPSoC for our use on the IOCC. Within the block automation dialog, check to make sure that the “apply pre-sets” option is selected before clicking OK.






Re-open the MPSoC IP block and you will see a different configuration of the MPSoC, one that is ready to use with our IOCC.






Do not change anything. Close the dialog box. Then, on the block diagram, connect the pl_clk0 pin to the maxihpm0_lpd_aclk pin. Once that is complete, click on “validate” to ensure that the design has no errors.







The next step is very simple. We’ll create an RTL wrapper file for the block diagram. This will allow us to implement the design. Under the sources tab, right-click on the block diagram and select “create HDL wrapper.” When prompted, select the option that allows Vivado to manage the file for you and click OK.







To generate the bitstream, click on the “Generate Bitstream” icon on the menu bar. If you are prompted about any stages being out of date, re-run them first by clicking on “yes.”







Depending on the speed of your system, generating the bitstream may take a few minutes or longer. Once it has completed, select the “open implementation” option. Having the implementation open allows us to export the hardware definition file to SDK, where we will develop our software.







To export the hardware definition, select File -> Export -> Export Hardware. Select “include bit file” and export it.







To those familiar with the original Zynq SoC, all of this should look pretty familiar.


We are now ready to write our first software program—next time.



You can find links to previous installments of the MPSoC edition here.




Code is available on Github as always.


If you want E book or hardback versions of previous MicroZed chronicle blogs, you can get them below.




  • First Year E Book here
  • First Year Hardback here.



MicroZed Chronicles hardcopy.jpg 




  • Second Year E Book here
  • Second Year Hardback here



 MicroZed Chronicles Second Year.jpg



Xilinx launched the UltraFast Design Methodology more than three years ago. It’s designed to get you from project start to a successful, working design using the Xilinx Vivado Design Suite in the least amount of time using hand-picked best practices from industry experts. There’s a 364-page methodology manual titled “UltraFast Design Methodology Guide for the Vivado Design Suite” (UG949) that describes the methodology in detail and you can download and read it for free. (And you should.)


Don’t have time right now to read the 364-page version? Maybe you just want to get the ideas to see if it’s worth your time to read the full manual. (Hint: It is.)


OK, so if you’re that pressed for time there’s a 2-page Quick Reference Guide (UG1231) that you can read to see what the ideas are all about. You can download that guide here, also for free.



UltraFast Design Methodology Quick Reference Guide.jpg



Note: If you’re looking for something in between take a look at “UltraFast: Hand-picked best practices from industry experts, distilled into a potent Design Methodology.”


A Very Short Conversation with Ximea about Subminiature Video Cameras and Very Small FPGAs

by Xilinx Employee ‎02-02-2017 01:00 PM - edited ‎02-02-2017 09:54 PM (1,803 Views)


Yesterday at Photonics West, my colleague Aaron Behman and I stopped by the Ximea booth and had a very brief conversation with Max Larin, Ximea’s CEO. Ximea makes a very broad line of industrial and scientific cameras and a lot of them are based on several generations of Xilinx FPGAs. During our conversation, Max removed a small pcb from a plastic bag and showed it to us. “This is the world’s smallest industrial camera,” he said while palming a 13x13mm board. It was one of Ximea’s MU9 subminiature USB cameras based on a 5Mpixel ON Semiconductor (formerly Aptina) MT9P031 image sensor. Ximea’s MU9 subminiature camera is available as a color or monochrome device.


Here are front and back photos of the camera pcb:





Ximea 5Mpixel MU9 subminiature USB camera



As you can see, the size of the board is fairly well determined by the 10x10mm image sensor, its bypass capacitors, and a few other electronic components mounted on the front of the board. Nearly all of the active electronics and the camera’s I/O connector are mounted on the rear. A Cypress CY7C68013 EZ-USB Microcontroller operates the camera’s USB interface and the device controlling the sensor is a Xilinx Spartan-3 XC3S50 FPGA in an 8x8mm package. FPGAs with their logic and I/O programmability are great for interfacing to image sensors and for processing the video images generated by these sensors.


Our conversation with Max Larin at Photonics West got me to thinking. I wondered, “What would I use to design this board today?” My first thought was to replace both the Spartan-3 FPGA and the USB microcontroller with a single- or dual-core Xilinx Zynq SoC, which can easily handle all of the camera’s functions including the USB interface, reducing the parts count by one “big” chip. But the Zynq SoC family’s smallest package size is 13x13mm—the same size as the camera pcb—and that’s physically just a bit too large.


The XC3S50 FPGA used in this Ximea subminiature camera is the smallest device in the Spartan-3 family. It has 1728 logic cells and 72Kbits of BRAM. That’s a lot of programmable capability in an 8x8mm package even though the Spartan-3 FPGA family first appeared way back in 2003. (See “New Spartan-3 FPGAs Are Cost-Optimized for Design and Production.”)


There are two newer Spartan FPGA families to consider when creating a design today, Spartan-6 and Spartan-7, and both device families include multiple devices in 8x8mm packages. So I decided to see how much I might pack into a more modern FPGA with the same pcb real-estate footprint.


The simple numbers from the data sheets tell part of the story. A Spartan-3 XC3S50 provides you with 1728 logic cells, 72Kbits of BRAM, and 89 I/O pins. The Spartan-6 XC6SLX4, XC6SLX9, and XC6SLX16 provide you with 3840 to 14,579 logic cells, 216 to 576Kbits of BRAM, and 106 I/O pins. The Spartan-7 XC7S6 and XC7S15 provide 6000 to 12,800 logic cells, 180 to 360Kbits of BRAM, and 100 I/O pins. So both the Spartan-6 and Spartan-7 FPGA families provide nice upward-migration paths for new designs.
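To put those data-sheet numbers in perspective, here is a small Python sketch that expresses each newer family's range as a multiple of the Spartan-3 XC3S50. The figures are the ones quoted above; the dictionary layout and the function name ratios are just illustrative choices of mine.

```python
# Data-sheet figures exactly as quoted in the paragraph above.
BASELINE = {"logic_cells": 1728, "bram_kbits": 72}  # Spartan-3 XC3S50

# (smallest device, largest device) in the 8x8mm package options listed above.
FAMILIES = {
    "Spartan-6": {"logic_cells": (3840, 14579), "bram_kbits": (216, 576)},
    "Spartan-7": {"logic_cells": (6000, 12800), "bram_kbits": (180, 360)},
}


def ratios(family):
    """Return (low, high) growth factors over the XC3S50 for each figure."""
    figures = FAMILIES[family]
    return {
        key: tuple(value / BASELINE[key] for value in figures[key])
        for key in BASELINE
    }


for name in FAMILIES:
    for key, (low, high) in ratios(name).items():
        print(f"{name} {key}: {low:.2f}x to {high:.2f}x the XC3S50")
```

In round numbers, that is about 2.2x to 8.4x the logic cells for Spartan-6 and about 3.5x to 7.4x for Spartan-7, with 2.5x to 8x the BRAM across the two families.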


However, the simple data-sheet numbers don’t tell the whole story. For that, I needed to talk to Jayson Bethurem, the Xilinx Cost Optimized Portfolio Product Line Manager, and get more of the story. Jayson pointed out a few more things.


First and foremost, the Spartan-7 FPGA family offers a 2.5x performance/watt improvement over the Spartan-6 family. That’s a significant advantage right there. The Spartan-7 FPGAs are significantly faster than the Spartan-6 FPGAs as well. Spartan-6 devices in the -1L speed grade have a 250MHz Fmax versus 464MHz for Spartan-7 -1 or -1L parts. The fastest Spartan-6 devices in the -3 speed grade have an Fmax of 400MHz (still not as fast as the slowest Spartan-7 speed grade) and the fastest Spartan-7 FPGAs, the -2 parts, have an Fmax of 628MHz. So if you feel the need for speed, the Spartan-7 FPGAs are the way to go.
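The same quick arithmetic works for the Fmax figures Jayson quoted. This is only a sketch: the numbers come from the paragraph above, while the table keys and the speedup helper are made up for illustration.

```python
# Fmax figures in MHz, as quoted in the paragraph above.
FMAX_MHZ = {
    ("Spartan-6", "-1L"): 250,
    ("Spartan-6", "-3"): 400,
    ("Spartan-7", "-1/-1L"): 464,
    ("Spartan-7", "-2"): 628,
}


def speedup(new, old):
    """Ratio of two Fmax entries from the table above."""
    return FMAX_MHZ[new] / FMAX_MHZ[old]


# Slowest grade against slowest grade, and fastest against fastest.
slowest = speedup(("Spartan-7", "-1/-1L"), ("Spartan-6", "-1L"))
fastest = speedup(("Spartan-7", "-2"), ("Spartan-6", "-3"))
print(f"slowest grades: {slowest:.2f}x, fastest grades: {fastest:.2f}x")
```

So the slowest Spartan-7 grade runs at roughly 1.86x the Fmax of the slowest Spartan-6 grade, and the fastest grades differ by a factor of about 1.57.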


I’d be remiss not to mention tools. As Jayson reminded me, the Spartan-7 family gives you entrée into the world of Vivado Design Suite tools. That means you get access to the Vivado IP catalog and Vivado’s IP Integrator (IPI) with its automated integration features. These are two major benefits.


Finally, some rather sophisticated improvements to the Spartan-7 FPGA family’s internal routing architecture mean that the improved placement and routing tools in the Vivado Design Suite can pack more of your logic into Spartan-7 devices and get more performance from that logic due to reduced routing congestion. So directly comparing logic cell numbers between the Spartan-6 and Spartan-7 FPGA families from the data sheets is not as exact a science as you might assume.


The nice thing is: you have plenty of options.



For previous Xcell Daily blog posts about Ximea industrial and scientific cameras, see:






The short video below captured at the recent SDS Drives show in Germany shows two recent safety-related certifications for Xilinx development tools. The first is a TÜV SÜD certification for the Vivado Design Suite for functional-safety applications and the second is for the Xilinx MicroBlaze processor GNU compiler tool chain, certified to SIL 4.


The video also shows a Zynq SoC being used to implement a functional-safety application using two different processor architectures—an ARM Cortex-A9 and a Xilinx MicroBlaze soft processor core—running the same code. This demonstration shows the functional-safety flexibility you get when you design a Zynq SoC into your design.






The Xilinx version of QEMU handles ARM Cortex-A53, Cortex-R5, Cortex-A9, and MicroBlaze

by Xilinx Employee ‎01-19-2017 02:32 PM - edited ‎01-22-2017 09:22 AM (4,032 Views)


Xilinx has a version of QEMU—a fast, open-source, just-in-time functional simulator—for the ARM processors in the Zynq SoC and the Zynq UltraScale+ MPSoC and for the company’s MicroBlaze soft processor core. QEMU accelerates code development by giving embedded software developers an enhanced execution environment long before hardware is available and they can continue to use QEMU as a software-development platform even after the hardware is ready. (After all, it’s a lot easier to distribute QEMU to 300 software developers than to ship hardware units to each of them.)


Although QEMU was already available through the open-source community, Xilinx has added several innovations over time to match the multi-core, heterogeneous devices available in the two distinct Zynq device families, augmented by additional MicroBlaze processors instantiated in programmable logic.


The latest version of Xilinx QEMU, available on github at https://github.com/Xilinx/qemu, includes extended features including:




  • Multi-architecture simulation for heterogeneous, multicore systems: The Xilinx Zynq UltraScale+ MPSoC incorporates embedded ARM processors including a quad-core ARM Cortex-A53 application processor, a dual-core ARM Cortex-R5 MPCore real-time processor, and a hardened version of the Xilinx MicroBlaze processor acting as a platform management unit (PMU). The Xilinx Zynq SoC incorporates a single- or dual-core ARM Cortex-A9 MPCore processor. The Xilinx version of QEMU can handle simulations of software running on all of these processor architectures so that your team can handle the associated integration challenges of such a complex, heterogeneous, multicore architecture early in the design cycle. (See http://www.wiki.xilinx.com/QEMU+-+Zynq+UltraScalePlus)


  • Yocto support: Your software development team can use its existing build and configuration flows through the Yocto infrastructure to build and simulate code that runs on Xilinx devices on the ARM processor cores available in the Zynq device families and on Xilinx MicroBlaze cores. (See http://www.wiki.xilinx.com/QEMU+Yocto+Flow)


  • Non-Intrusive Fault Injection: This feature allows you to identify and troubleshoot really difficult and costly security or safety problems by injecting errors from an external interface without stopping the simulation. In addition, you can stress test your software using corner-case scenarios. (See https://github.com/Xilinx/qemu/blob/master/docs/fault_injection.txt)


  • Xilinx SDK Integration: You can launch QEMU from the Xilinx SDK just as you would a hardware target, which means that if you’re an experienced SDK user, you already know how to launch and use QEMU.



Xilinx is actively developing QEMU enhancements, which means more features are on the way. Meanwhile, you’ll find the Xilinx QEMU Wiki here.




By Adam Taylor


Having looked at how we can optimize the Zynq SoC’s PS (processor system) for power during operation and when we wish for the Zynq SoC to enter sleep mode, I now want to round off our look at power-reduction techniques by looking at how we reduce power consumption within the Zynq SoC’s PL (programmable logic) using design techniques. Obviously, one of the first things we should do is enable power optimization within the implementation flow, which optimizes the design for power efficiency. However, Vivado tools can only optimize a design as presented. So let’s see what we can do to ensure that we present the best design possible.





Setting Power Optimization within Vivado



One of the first places to start is to ensure that we are familiar with the structure of the CLBs and slices used to implement our creations within the Zynq SoC’s PL. If you are not as familiar as you should be, the detail of these PL components is provided in the 7 Series CLB User Guide (UG474).

Each CLB contains two slices. These slices provide the LUTs (look-up tables), storage elements, etc. used to implement the logic in your design. The first thing we can do to optimize power consumption in our programmable logic design is to consider the polarity, synchronicity, and grouping of control signals to these CLBs and slices. When we talk about a control signal, we mean the clock, clock enable, set/reset, and distributed-RAM write enables used within a slice.





Storage elements in a Programmable Logic Slice



Looking at the storage elements shown above, you can see that except for the CLK control signal, which has a mux to enable its inversion, all other signals are active high. If we declare them as active low or asynchronous, we will require an extra LUT to invert the signal and additional routing resources to connect the inverter. These extra logic and routing resources increase power consumption.


Grouping of control signals relates to how a specific group of control signals—e.g. the clock, reset and clock enable—behave. Creating many different control groups within a design or module makes it more difficult for the placer to locate elements within different control groups close together. The end result will require more routing which makes timing closure more difficult and increases power consumption.
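One way to get a feel for control grouping is to count how many distinct (clock, reset, clock enable) combinations a set of registers uses. The Python sketch below is only an illustration of that bookkeeping; it is not how the Vivado placer works, and the register and signal names are hypothetical. Within Vivado itself, the report_control_sets Tcl command gives you the real numbers for a design.

```python
from collections import Counter
from typing import NamedTuple


class ControlSet(NamedTuple):
    """One combination of control signals driving a storage element."""
    clock: str
    reset: str
    clock_enable: str


def count_control_sets(registers):
    """Map each distinct control set to the number of registers that use it."""
    return Counter(registers.values())


# Hypothetical module in which every register has its own reset/enable mix.
fragmented = {
    "r0": ControlSet("clk", "rst_a", "en_a"),
    "r1": ControlSet("clk", "rst_b", "en_a"),
    "r2": ControlSet("clk", "rst_a", "en_b"),
    "r3": ControlSet("clk", "rst_b", "en_b"),
}

# The same registers rewritten to share one reset and one clock enable.
shared = {name: ControlSet("clk", "rst", "en") for name in fragmented}

print(len(count_control_sets(fragmented)), "control sets before grouping")
print(len(count_control_sets(shared)), "control set after grouping")
```

Storage elements that share a control set are candidates for packing into the same slice, so fewer distinct sets gives the placer more freedom to keep related registers close together.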


We also need to consider how we use and configure the PL’s I/O resources. For instance, we must give proper consideration to limiting drive strength and slew rate. We should also consider using the lowest I/O voltage supported by the receiving device. For example, we may be able to use reduced-swing LVDS in place of LVDS.


More advanced design techniques that we can use relate to the use of hard macros within the PL and how the tools use this logic. One of the biggest savings can be achieved by using a smaller device, which clearly reduces overall power. There are two main techniques we can use to reduce the size of the required device. The first of these is resource time sharing, which uses the same on-chip logic resources for different functions at different times. A second approach is to use a common core for processing multiple inputs and outputs if possible. However, this technique increases complexity during design capture because we must consider multiplexing and sequencing needs.


Once we have completed our design, we can run the XPE tool within Vivado to estimate power consumption and predict junction temperature (very important!). Hopefully, we’ll get the power reduction we require. However, if we do not, we can perform “what if” scenarios as detailed by UG907, which also contains other low-power design techniques.







All of Adam Taylor’s MicroZed Chronicles are cataloged here.











You can now download the latest version of the Vivado Design Suite HLx Editions, release 2016.4, which adds support for multiple Xilinx UltraScale+ devices including the Virtex UltraScale+ XCVU11P and XCVU13P FPGAs and board support packages for the Zynq UltraScale+ MPSoC ZCU102-ES2 and Virtex UltraScale+ VCU118-ES1 boards.


Download the latest version here.


New to FPGA-based design? Buy this book. Now

by Xilinx Employee on ‎12-14-2016 02:18 PM (8,314 Views)

Designing with Xilinx FPGAs.jpg 

For the last three years, I’ve searched for a book on designing with FPGAs—specifically Xilinx FPGAs—that I can recommend to people who are just starting out. The introduction of the Vivado Design Suite in April 2012 complicated things because existing books on FPGA-based design did not discuss Vivado.


Now, there’s a 260-page book from Springer that I can recommend. The book title is “Designing with Xilinx FPGAs: Using Vivado.” That’s pretty self-explanatory, isn’t it?


You might be able to teach yourself all of the topics in this book by collecting dozens of different documents on the Xilinx.com Web site. You’ll find it a lot easier just to get this book.


The simplest way to explain the book’s contents is to list the clearly labeled chapters, each written by a different author:


     1.   State-of-the-Art Programmable Logic

     2.   Vivado Design Tools

     3.   IP Flows

     4.   Gigabit Transceivers

     5.   Memory Controllers

     6.   Processor Options

     7.   Vivado IP Integrator

     8.   SysGen for DSP

     9.   Synthesis

     10. C-Based Design

     11. Simulation

     12. Clocking

     13. Stacked Silicon Interconnect (SSI)

     14. Timing closure

     15. Power Analysis and Optimization

     16. System Monitor

     17. Hardware Debug

     18. Emulation using FPGAs

     19. Partial Reconfiguration and Hierarchical Design



This newly published book covers extremely current topics including Stacked Silicon Interconnect (the Xilinx designation for 3D ICs) and the Xilinx Zynq UltraScale+ MPSoC. Of course, just as it says in the book title, the text heavily discusses the use of the Vivado Design Suite to develop designs with Xilinx devices.



By Adam Taylor


As I described last week, we need a platform to fuse Python and SDSoC. In the longer term, I want to perform some image processing with this platform. So although I am going to remove most of the logic from the base design, we need to keep the following in the hardware to ensure that we can correctly boot up the Pynq board:


  1. SWSled GPIO
  2. Btns GPIO
  3. RGBLeds GPIO
  4. Block Memory – Added in the MicroZed Chronicles, Part 158


We will leave the block memory within the design to demonstrate that the build produced by SDSoC is unique and different when compared to the original boot.bin file. Doing so will enable us to use the notebook we previously used to read and write the Block RAM. However, this time we will not need to load the overlay first.






Stripped Down Vivado Platform



As we know by now, we need two elements to create an SDSoC platform: a hardware definition and a software definition. We can create the hardware definition within Vivado itself. This is straightforward: we declare the available AXI ports, clocks, and interrupts. I have created a script to do this. It’s available on the GitHub repository. You can run it from the command line of the TCL console within Vivado.


The software definition will take a little more thought. Because we are using a Linux-based approach, we need the following:


  • uImage – The Pynq Kernel
  • dtb – The device tree blob
  • elf – The first-stage boot loader
  • elf – The second-stage boot loader
  • bif – Used to determine the boot order



We can obtain most of these items from the Pynq Master that we downloaded from GitHub previously under the file location:






Within this directory, you can find the FSBL, device tree, Uboot, and a boot.bif. What is missing however is the actual Linux kernel: the uImage. We already have this image on the SD card we have been running the PYNQ from recently. I merely copied this file into the SDSoC platform directory.


With the platform defined, we can create a simple program that does not have any accelerators and we can use SDSoC to build the contents of the SD Card. Once built, we can copy the contents to the SD Card and boot the PYNQ. We should see the LEDs flash as normal when the Pynq is ready for use.


We should be able to access the BRAM we have left within the design using the same notebook as before, but with the overlay section commented out. You should be able to read from and write to the memory. You should also check that if you change the base address from the correct address, the notebook no longer works correctly.


Having proved that we can build a design without accelerating a function, the next step is to ensure that we can build a design that does accelerate a function. I therefore used the matrix multiply example to generate a simple example that shows how to correctly use the platform to accelerate a function in hardware. This is the final confirmation we need that we have defined the platform correctly.


Creating a new project, targeting the same platform as before, with the example code, and targeting the generation of a shared library produced the following hardware build in Vivado:







MMult hardware example as created by SDSoC




Clearly, we can see the addition of the accelerated hardware.


All that is needed now is to upload the bit, tcl, and so files to the PYNQ and then write a notebook to put them to work.












I wrote about MicroCore Labs’ MCL51, a microsequencer-based 8051 processor core, back in June (see “How to stuff a wild FPGA: Quad-core 8051 in a $99 Artix-7 Arty dev board multitasks by driving display, printer, music, and sound”) and now, MicroCore has harnessed four instances of these cores running on an Artix-7 FPGA in lockstep to create an n-modular redundant system. This system can detect a variety of soft errors and rebuilds itself when an error is detected. Each of the four processor modules contains independent voting logic that detects errors, so that the failed module can be removed and possibly rebuilt. Successful rebuilding and rejoining the 4-processor gestalt takes 700μsec. (You can get more details in this Microcore app note.)
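To make the voting idea concrete, here is a generic majority-vote sketch in Python. It is not MicroCore's implementation, which lives in FPGA logic and is described in the app note. The function name vote and the sample byte values are made up for illustration; the sketch only shows how comparing the outputs of four lockstep modules identifies the one that disagrees.

```python
from collections import Counter


def vote(outputs):
    """Return (majority value, indices of modules that disagree with it).

    Raises ValueError when no value is held by a strict majority,
    for example a 2-2 split among four modules.
    """
    value, votes = Counter(outputs).most_common(1)[0]
    if votes * 2 <= len(outputs):
        raise ValueError("no majority among module outputs")
    failed = [index for index, out in enumerate(outputs) if out != value]
    return value, failed


# Four lockstep modules; module 2 has suffered a (simulated) soft error.
value, failed = vote([0x5A, 0x5A, 0x13, 0x5A])
print(f"agreed output 0x{value:02X}, modules to rebuild: {failed}")
```

With four modules, a single failed module is always outvoted three to one, which is what allows it to be removed and rebuilt while the other three carry on.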


Here’s a video of the system in operation with some heavy-handed error injection to demonstrate the error-recovery capability of the design:






MicroCore’s MCL51 processor core consumes only about 300 LUTs, so four instances easily fit in the Artix-7 A35T FPGA on the $99 Digilent ARTY dev board used in the video. This board is a real bargain because it includes a voucher for a device-locked copy of the Xilinx Vivado HL Design Edition, which lists for $2995.



By Adam Taylor


One of the benefits of the PYNQ system is that we can integrate hardware overlays for the PYNQ’s Zynq Z-7000 SoC and use them with ease in a Python programming environment. As we have seen over the last few weeks, it is pretty simple to create and integrate a hardware overlay. However, we still need to be able to develop an overlay with the functions we desire. Ideally, to continue to leverage the benefits of the high-level PYNQ system, we want to develop the overlays using a similar high-level approach.


The traditional way to develop hardware overlays for the FPGA fabric in the Zynq SoC is to use Vivado as we’ve done previously, perhaps combined with Vivado HLS to implement complex functions defined in C or C++. The Xilinx SDSoC development environment allows us to create applications that run on the Zynq SoC’s ARM Cortex-A9 processors (the PS or processor system) and the programmable logic (the PL). We can move functions between them as we desire to accelerate parts of the design. If we do this using a high-level language like C or C++, SDSoC combines the capabilities of Vivado HLS with a connectivity framework.







How SDSoC and Pynq can be combined



What this means for the PYNQ system is that we can use SDSoC to create a hardware overlay using Vivado HLS and then interface to it using Python’s C Foreign Function Interface (CFFI). Using CFFI is very similar to the approach we undertook last week. In theory, this approach allows us to create hardware overlays without the need to write a line of HDL.


The first step in using SDSoC is to create an SDSoC platform. As we have discussed before, an SDSoC platform requires both a hardware definition and a software definition. We can create the hardware definition from within Vivado. For the software definition, we can use a template for a Linux operating system.  The base PYNQ design will serve as our foundation because we want to ensure that the PS settings are correct. However to free up resources in the PL for SDSoC, we may want to prune out some of the logic functions.


Once the platform has been created within SDSoC, we can take advantage of the support for high-level frameworks like OpenCV and the other supported HLS libraries to create the application we want. SDSoC will automatically generate the required bit file and TCL file for a build. However, in this case, we also need the C files generated by SDSoC to interface with the accelerated function in the Zynq PL. We do this using a shared library, which we can call from within the Python environment. We can create a shared library by ticking the option when we create a new SDSoC project, like so:







Setting the shared library option



To make use of the shared library, we will need to know the names of the functions contained within it. These functions will be renamed by SDSoC during the build process and we will need to use these modified names within the Python CFFI interface because that is what is included within the shared library.


For example, using the matrix multiply example in SDSoC, the name of the accelerated function becomes:



mmult_accel  -> _p0_mmult_accel_0



The generated C files will be available under the <project>/<build config>/_sds/swstubs directory, while the hardware files are under <project>/<build config>/_sds/p0/ipi.
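As a rough idea of what the Python side might look like, here is a minimal CFFI sketch that declares the renamed function and opens a shared library. Treat it as a sketch under stated assumptions: the C prototype and the library file name are hypothetical, and you should copy the real prototype from the stub sources SDSoC generates for your project. Only the renamed symbol comes from the example above.

```python
# Renamed symbol quoted in the text for the matrix multiply example.
ACCEL_NAME = "_p0_mmult_accel_0"

# ASSUMPTION: illustrative prototype only; take the real one from the
# generated stub sources for your own build.
CDEF = "void {}(float *a, float *b, float *c);".format(ACCEL_NAME)

# ASSUMPTION: hypothetical shared-library file name.
LIBRARY_PATH = "./libmmult.so"


def load_accelerator(library_path=LIBRARY_PATH):
    """Open the shared library with CFFI and return (ffi, accelerated function)."""
    from cffi import FFI  # imported here so this module loads without cffi

    ffi = FFI()
    ffi.cdef(CDEF)                  # declare the renamed function to CFFI
    lib = ffi.dlopen(library_path)  # load the SDSoC-built shared library
    return ffi, getattr(lib, ACCEL_NAME)
```

On the board, you would call load_accelerator(), allocate the three buffers with ffi.new("float[]", size), and pass them to the returned function. Calling the original name, mmult_accel, would fail because that symbol is not what the shared library exports.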


This is how the previous example we ran, the Sobel filter (and the FIR filter), was designed.


Over the next few weeks, we will look in more depth at how we create our own SDSoC platform and how we implement it within the PYNQ environment.









The Vivado Design Suite provides an IP-Centric development environment for FPGA-based designs and this November 30 webinar taught by Doulos, a Xilinx Authorized Training Provider, will teach you how to customize IP from the Vivado IP Catalog, generate output products, and instantiate that IP in your design using Verilog or VHDL.


Topics include:


  • Vivado: an IP-Centric development environment
  • The IP Catalog
  • Alternative IP Flows: Out-Of-Context and Global Synthesis
  • IP Output Products
  • Simulating IP
  • Managing IP both within and outside projects
  • Using IP with revision control systems


The Webinar will occur twice on November 30 to accommodate different time zones around the world. Register here.




By Adam Taylor



Having done the easy part and got the Pynq all set up and running a simple “hello world” program, I wanted to look next at the overlays which sit within the PL, how they work, and how we can use the base overlay provided.


What is an overlay? The overlay is a design that’s loaded into the Zynq SoC’s programmable logic (PL). The overlay can be designed to accelerate a function in the programmable logic or provide an interfacing capability using the PL. In short, overlays give Pynq its unique capabilities.


What is important to understand about the overlay is that there is not a Python-to-PL high-level synthesis process involved. Instead, we develop the overlay using one of the standard Xilinx design methodologies (SDSoC, Vivado, or Vivado HLS). Once we’ve created the bit file for the overlay, we then integrate it within the Pynq architecture and establish the required parameters to communicate with it using Python.


Like all things with the Zynq SoC that we have looked at to date, this is very simple. We can easily integrate with the Python environment using the bit file and other files provided with the Vivado build. We do this with the Python MMIO class, which allows us to interact with designs in the PL through memory-mapped reads and writes.  The memory map of the current overlay in the PL is all we need. Of course, we can change the contents of the PL on the fly as our application requires to accelerate functions in the PL.
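Here is a minimal sketch of what memory-mapped access looks like from Python. On the board, it uses the MMIO class from the pynq package; elsewhere, it falls back to a small dictionary-backed stand-in so that you can try the read/write pattern on a desktop. The stand-in class, the open_window helper, and the base address and window size in the usage note are my own assumptions for illustration and are not part of the Pynq API; take the real values from the memory map of your overlay.

```python
try:
    from pynq import MMIO  # available on the PYNQ board image
except ImportError:
    MMIO = None            # off-board: use the stand-in defined below


class SimulatedMMIO:
    """Dictionary-backed stand-in with the same read/write shape as pynq.MMIO."""

    def __init__(self, base_addr, length=4):
        self.base_addr = base_addr
        self.length = length
        self._words = {}

    def _check(self, offset):
        if offset % 4 or not 0 <= offset < self.length:
            raise ValueError("offset must be word-aligned and inside the window")

    def write(self, offset, data):
        self._check(offset)
        self._words[offset] = data & 0xFFFFFFFF  # registers are 32 bits wide

    def read(self, offset=0):
        self._check(offset)
        return self._words.get(offset, 0)


def open_window(base_addr, length, simulate=(MMIO is None)):
    """Return an object offering read(offset) and write(offset, data)."""
    if simulate:
        return SimulatedMMIO(base_addr, length)
    return MMIO(base_addr, length)
```

For example, window = open_window(0x40000000, 0x1000) followed by window.write(8, 0xDEADBEEF) and window.read(8) writes and reads back one 32-bit word. The address 0x40000000 here is hypothetical, so do not run this against real hardware without substituting an address from your own design.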


We will be looking more at how we can create our own overlay over the next few weeks. However, if you want to know more in the short term, I suggest you read the Pynq manual here. If you are thinking of developing your own overlay, be sure that you base it on the base overlay Vivado design to ensure that the configuration of the Zynq SoC’s Processing System (PS) and the PS/PL interfaces are correct.


The supplied base overlay provides support for several interfaces including the HDMI port and a wide range of PMODs.


The real power of the Pynq system comes from the open source community developing and sharing overlays. I want to look at a couple of these in the remainder of this blog. These overlays are available via GitHub and provide a Sobel Filter for the HDMI input and output and a FIR filter. You’ll find them here:





The first thing we need to do is install the packages. For this example, I am going to install the Sobel filter. To do this we need to use a terminal program to download and install the overlay and its associated files.



We can do this using PuTTY and log in easily with the user name and password, which are both xilinx. The command to install the overlay is then:



sudo -H pip install --upgrade 'git+https://github.com/beja65536/pz1_sobelfilter'





Installing the Sobel Filter



Once this has been downloaded, the next step is to download the zip file containing the Jupyter notebook from GitHub and upload it under the examples directory. This is simple to do. Just select Upload and navigate to the location of the notebook you wish to upload.





This notebook also performs the installation of the overlay if you have not done this via the terminal. However, you need to do this only once.



Once this is uploaded, we can connect the Pynq to an HDMI source and an HDMI monitor and run the example. For this example, I am going to connect the Pynq between the Embedded Vision Kit and the display and then run the notebook.






When I did this, the notebook produced the image below showing the result of the Sobel Filter. Overall, this was very easy to get up and running using a different overlay that is not the base overlay.






Code is available on GitHub as always.


If you want E Book or hardback versions of previous MicroZed Chronicles blogs, you can get them below.




  • First Year E Book here
  • First Year Hardback here



MicroZed Chronicles hardcopy.jpg 




  • Second Year E Book here
  • Second Year Hardback here




MicroZed Chronicles Second Year.jpg 




All of Adam Taylor’s MicroZed Chronicles are cataloged here.




Time borrowing, a clock-frequency-boosting design technique long available to ASIC designers through automated clock-tree synthesis tools, is now available to designers using Xilinx Virtex UltraScale+ and Kintex UltraScale+ FPGAs and Zynq UltraScale+ MPSoCs with the latest versions of the Xilinx Vivado HLx Design Suite. That was one of Parivallal Kannan’s key messages in yesterday’s ICCAD 2016 presentation titled “Performance-Driven Routing for FPGAs.” The key issue here is balancing logic delays in successive pipeline stages to maximize clock frequency. Usually, that means trying to exactly balance logic delays between pipeline flip-flops. That’s quite a trick and it’s often just not possible to do this, especially in a short amount of time. Time borrowing turns this problem on its head by injecting clock delays in a stage to “borrow” time from the stage that follows it. It looks like this:



Time Borrowing .jpg

The clock buffer driving FFj in the above figure permanently “borrows” half a nanosecond from the logic stage on the right (between FFj and FFk) using a programmable UltraScale+ clock-buffer delay element, effectively creating a 2.5nsec time period between FFi and FFj on the left side of the figure. This design technique maintains the pipeline’s overall 2nsec clock period and avoids the need to use a slower, 2.5nsec period to accommodate the additional logic in the slower pipeline stage.
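The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the timing budget, assuming ideal flip-flops with zero setup time and zero clock-to-Q delay. The function name and its parameters are hypothetical and are not part of Vivado.

```python
def stage_budgets(clock_period_ns, borrowed_ns):
    """Return the logic-delay budgets (in ns) of two successive pipeline
    stages when the clock of the middle flip-flop (FFj) is delayed.

    Delaying FFj's clock by borrowed_ns gives the first stage (FFi -> FFj)
    that much extra time and takes the same amount away from the second
    stage (FFj -> FFk). The clock period itself does not change.
    """
    if not 0 <= borrowed_ns < clock_period_ns:
        raise ValueError("borrowed time must be in [0, clock period)")
    first_stage = clock_period_ns + borrowed_ns
    second_stage = clock_period_ns - borrowed_ns
    return first_stage, second_stage


# Numbers from the example above: a 2 ns clock with 0.5 ns borrowed.
first, second = stage_budgets(2.0, 0.5)
print(f"FFi -> FFj budget: {first} ns, FFj -> FFk budget: {second} ns")

# Clock frequency with time borrowing (2 ns period) versus the slower
# 2.5 ns period that the long stage would otherwise force.
fmax_with_borrowing_mhz = 1000.0 / 2.0
fmax_without_borrowing_mhz = 1000.0 / 2.5
print(fmax_with_borrowing_mhz, fmax_without_borrowing_mhz)
```

The sketch also makes the cost visible: the stage on the right is left with only 1.5 ns, so the technique helps only when that stage has slack to give.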


A graph showing 89 results of this time-borrowing technique applied to Xilinx customer designs appears in the figure below:



Time Borrowing Results.jpg



As you can see, the resulting Fmax increase ranges from a low of about 1% to a high of nearly 14% with an average of about 5.5%. That’s some serious potential performance improvement from a relatively simple and automated design technique.


Again, these sorts of optimizations are made possible by the innovations incorporated into the programmable-logic fabric in Xilinx UltraScale+ FPGAs and Zynq UltraScale+ MPSoCs.




Xilinx now has four families of cost-optimized All Programmable devices to help you build systems:




  • Artix-7: The cost-optimized Xilinx FPGA family with 6.6Gbps serial transceivers for designs that need compatibility with high-speed serial I/O protocols such as PCIe Gen2 and USB 3.0. The Artix-7 family now has two smaller members: the Artix-7 A12T and A25T with 12,300 and 23,360 logic cells respectively.




Here’s a 12-minute video that further clarifies the options you now have:





Agnisys IDesignSpec and ISequenceSpec can generate synthesizable HDL for Vivado from plain-text specs

by Xilinx Employee ‎10-20-2016 10:23 AM - edited ‎10-20-2016 03:49 PM (5,763 Views)



Here’s the system engineer’s dream: Write a concise specification for a system, feed it to the right tool, and get a working design out of the other end of the tool. As I said, it’s a dream. But Agnisys seems bound and determined to make this dream a reality. The company offers two design tools—IDesignSpec and ISequenceSpec—that perform this feat for system and test designs. Here’s a diagram of the IDesignSpec flow:



Agnisys IDesignSpec.jpg 


Agnisys IDesignSpec Design Flow



Note that IDesignSpec accepts specifications in a variety of text-centric formats including IP-XACT and XML and emits a number of files including synthesizable Verilog or VHDL, which slips right into the Xilinx Vivado HLx Design Suite.


Now this may sound complex and I’m sure that whatever’s going on under the hood is indeed complicated; but perhaps your job isn’t, as demonstrated in the following 5-minute demo video:






Contact Agnisys directly for more information about IDesignSpec and ISequenceSpec.









Xilinx introduced the UltraFast Design Methodology, which uses hand-picked best design practices to speed you on your way to a successful design, nearly two years ago. If you’re not yet using the UltraFast Design Methodology but are curious as to why you might want to board this train, here’s an 11-minute video that tells all:






The most requested piece of IP for Xilinx All Programmable devices is DMA for PCIe and the Xilinx DMA for PCIe subsystem is now included—free—in the Xilinx Vivado HL Design Suite. This DMA block supports as many as four PCIe Gen3 x16 channels with transfer sizes as large as 256Mbytes (and infinitely long transfers with linked descriptors). The DMA engine in the IP supports scatter/gather operations.
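To illustrate how linked descriptors lift the per-transfer size limit, here is a small Python sketch. It is a generic model, not the descriptor format used by the Xilinx IP: the Descriptor fields, the build_chain helper, the example addresses, and the reading of 256Mbytes as 256 × 2^20 bytes are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed interpretation of the 256Mbyte per-transfer limit.
MAX_BYTES_PER_DESCRIPTOR = 256 * 1024 * 1024


@dataclass
class Descriptor:
    """Hypothetical DMA descriptor: one contiguous piece of a transfer."""
    src_addr: int
    dst_addr: int
    length: int
    next_index: Optional[int]  # index of the next descriptor, None ends the chain


def build_chain(src_addr: int, dst_addr: int, total_bytes: int,
                max_bytes: int = MAX_BYTES_PER_DESCRIPTOR) -> List[Descriptor]:
    """Split one long transfer into a linked chain of descriptors."""
    if total_bytes <= 0 or max_bytes <= 0:
        raise ValueError("sizes must be positive")
    chain: List[Descriptor] = []
    offset = 0
    while offset < total_bytes:
        length = min(max_bytes, total_bytes - offset)
        chain.append(Descriptor(src_addr + offset, dst_addr + offset, length, None))
        offset += length
    # Link each descriptor to its successor; the last one keeps None.
    for i in range(len(chain) - 1):
        chain[i].next_index = i + 1
    return chain


# A 600 MiB transfer needs more than one descriptor under the assumed limit.
chain = build_chain(0x1000_0000, 0x8000_0000, 600 * 1024 * 1024)
for i, d in enumerate(chain):
    print(i, hex(d.src_addr), hex(d.dst_addr), d.length, d.next_index)
```

In a scatter/gather setting, the same chain structure lets each descriptor point at a different, non-contiguous buffer instead of consecutive offsets.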


As of the new 2016.3 release of Vivado, this PCIe DMA IP core supports PCIe Gen3 x16 operation in Xilinx UltraScale+ FPGAs and MPSoCs. There’s beta support for Artix-7 and Kintex-7 FPGAs and the Zynq Z-7000 SoC as well in this latest Vivado release.


Here’s a 14-minute video to give you all of the technical details and a demo:






The folks at Hackaday are holding a SuperConference in Pasadena, California on November 5 and 6 and Digilent’s Sam Bobrowicz is running a 4-hour, hands-on, $79 workshop starting at noon on Saturday titled “FPGAs: Beyond Digital Logic with Microblaze and Arty.” (The registration fee includes a $99 Digilent ARTY board, so the workshop’s a bargain!) Sam’s going to be teaching advanced FPGA applications using the Arty board and Xilinx’s Vivado Design Suite. Participants will use a Xilinx MicroBlaze soft-core processor along with a library of pre-built IP blocks to design a custom microcontroller and implement it inside a Xilinx Artix-7 FPGA. Graphical design tools and a standard C-programming environment will be used. This workshop will not involve writing HDL.


Register Here.



The $99 Digilent ARTY dev board


 ARTY Board v2 White.jpg




For more information about the ARTY board, see: ARTY—the $99 Artix-7 FPGA Dev Board/Eval Kit with Arduino I/O and $3K worth of Vivado software. Wait, What????





Programmable logic control of power electronics—where to start? What dev boards to use?

by Xilinx Employee ‎10-18-2016 10:24 AM - edited ‎10-18-2016 10:28 AM (4,177 Views)


A great new blog post on the ELMG Web site discusses three entry-level dev boards you can use to learn about controlling power electronics with FPGAs. (This post follows a Part 1 post that discusses the software you can use—namely Xilinx Vivado HLS and SDSoC—to develop power-control FPGA designs.)


And what are those three boards? They should be familiar to any Xcell Daily reader:



The $99 Digilent ARTY dev board (Artix-7 FPGA)


ARTY Board v2 White.jpg 




The Avnet ZedBoard (Zynq Z-7000 SoC)


ZedBoard V2.jpg






The Avnet MicroZed SOM (Zynq Z-7000 SoC)



MicroZed V2.jpg





Who is ELMG? They’ve spent the last 25 years developing digitally controlled power converters in motor drives, industrial switch-mode power supplies, reactive power compensation, medium-voltage systems, power quality systems, motor starters, appliances, and telecom switch-mode power supplies.



For more information about the ARTY board, see: ARTY—the $99 Artix-7 FPGA Dev Board/Eval Kit with Arduino I/O and $3K worth of Vivado software. Wait, What????



For more information about the MicroZed and the ZedBoard, see the 150+ blog posts in Adam Taylor’s MicroZed Chronicles.



About the Author
  • Be sure to join the Xilinx LinkedIn group to get an update for every new Xcell Daily post!
  • Steve Leibson is the Director of Strategic Marketing and Business Planning at Xilinx. He started as a system design engineer at HP in the early days of desktop computing, then switched to EDA at Cadnetix, and subsequently became a technical editor for EDN Magazine. He's served as Editor in Chief of EDN Magazine, Embedded Developers Journal, and Microprocessor Report. He has extensive experience in computing, microprocessors, microcontrollers, embedded systems design, design IP, EDA, and programmable logic.