Today, Digilent announced a $299 bundle including its Zybo Z7-20 dev board (based on a Xilinx Zynq Z-7020 SoC), a Pcam 5C 5Mpixel (1080P) color video camera, and a Xilinx SDSoC development environment voucher. (That’s the same price as a Zybo Z7-20 dev board without the camera.) The Zybo Z7 dev board includes a new 15-pin FFC connector that allows the board to interface with the Pcam 5C camera over 2-lane MIPI CSI-2 and I2C interfaces. (This connector is pin-compatible with the Raspberry Pi’s FFC camera port.) The Pcam 5C camera is based on the Omnivision OV5640 image sensor.
Digilent has created the Pcam 5C + Zybo Z7 demo project to get you started. The demo accepts video from the Pcam 5C camera and passes it out to a display via the Zybo Z7’s HDMI port. All IP used in the demo, including a D-PHY receiver, CSI-2 decoder, Bayer-to-RGB converter, and gamma correction, is free and open-source, so you can study exactly how the D-PHY and CSI-2 decoding work and then develop your own embedded-vision products.
If you want this deal, you’d better hurry. The offer expires February 23—three weeks from today.
Rigol’s new RSA5000 real-time spectrum analyzer allows you to capture, identify, isolate, and analyze complex RF signals with a 40MHz real-time bandwidth over either a 3.2GHz or 6.5GHz signal span. It’s designed for engineers working on RF designs in the IoT and IIoT markets as well as industrial, scientific, and medical equipment. Rigol was demonstrating the RSA5000 real-time spectrum analyzer at this week’s DesignCon being held at the Santa Clara Convention Center. I listened to a presentation from Rigol’s North American General Manager Mike Rizzo and then a demo by Rigol’s Director of Product Marketing & Software Applications Chris Armstrong, both captured in the 2.5-minute video below.
Rigol RSA5000 Real-Time Spectrum Analyzer
Based on what I saw in the demo, this is an extremely responsive instrument—far more responsive than a swept spectrum analyzer—with several visualization display modes to help you isolate the significant signal in a sea of signals and noise, in real time. It’s capable of continuously executing 146,484 FFTs/sec, which results in a minimum 100% POI (probability of intercept) of 7.45μsec. You need some real DSP horsepower to achieve that sort of performance and the Rigol RSA5000 real-time spectrum analyzer gets this performance from a pair of Xilinx Zynq Z-7015 SoCs. (You'll find many more details about real-time spectrum analysis and the RSA5000 Real-Time Spectrum Analyzer in the Rigol app note "Realtime Spectrum Analyzer vs Spectrum Analyzer," attached at the end of this post. See below.)
Rigol RSA5000 Real-Time Spectrum Analyzer Display Modes
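To put the FFT rate and POI numbers quoted above in perspective, here is a small Python sketch of the arithmetic. The FFT rate and the 7.45μsec figure come from the text above. The worst-case model (one hop between FFT frames plus one complete window) is a common simplification that I am assuming here, and the implied window length is my own inference from the two quoted numbers, not a published Rigol parameter.

```python
def fft_interval_us(ffts_per_sec):
    """Average time between the starts of consecutive FFT frames, in microseconds."""
    return 1e6 / ffts_per_sec


def min_duration_for_full_poi_us(window_us, hop_us):
    """Simplified worst-case model (an assumption, not a Rigol specification).

    A signal that appears just after one FFT window has started must persist
    until the next window has been captured in full: one hop plus one window.
    """
    return hop_us + window_us


QUOTED_FFT_RATE = 146_484   # FFTs/sec, from the text above
QUOTED_POI_US = 7.45        # microseconds, from the text above

interval = fft_interval_us(QUOTED_FFT_RATE)

# Inference only: the window length that would make the model match the quote.
implied_window = QUOTED_POI_US - interval

print(f"FFT interval: {interval:.3f} us")
print(f"Implied window under this model: {implied_window:.3f} us")
print(f"Model check: {min_duration_for_full_poi_us(implied_window, interval):.2f} us")
```

The point of the sketch is simply that the quoted POI is slightly longer than the time between FFT frames, which is what you would expect if a signal has to fill at least one complete analysis window to be captured at full amplitude.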
Here’s the short presentation and demo of the Rigol RSA5000 real-time spectrum analyzer from DesignCon 2018:
Mike Rizzo told me that the Rigol design engineers selected the Zynq Z-7015 SoCs for three main reasons:
If you’re looking for a very capable spectrum analyzer, give the Rigol RSA5000 a look. If you’re designing your own real-time system and need high-speed computation coupled with fast user response, take a look at the line of Xilinx Zynq SoCs and Zynq UltraScale+ MPSoCs.
In a new report titled “Hitting the accelerator: the next generation of machine-learning chips,” Deloitte Global predicted that “by the end of 2018, over 25 percent of all chips used to accelerate machine learning in the data center will be FPGAs and ASICs.” The report then continues: “These new kinds of chips should increase dramatically the use of ML, enabling applications to consume less power and at the same time become more responsive, flexible and capable, which is likely to expand the addressable market.” And later in the Deloitte Global report:
“There will also be over 200,000 FPGA and 100,000 ASIC chips sold for ML applications.”
“…the new kinds of chips may dramatically increase the use of ML, enabling applications to use less power and at the same time become more responsive, flexible and capable, which is likely to expand the addressable market…”
“Total 2018 FPGA chip volume for ML would be a minimum of 200,000. The figure is almost certainly going to be higher, but by exactly how much is difficult to predict.”
These sorts of statements are precisely why Xilinx has rapidly expanded its software offerings for machine-learning development from the edge to the cloud. That includes the reVISION stack for developing responsive and reconfigurable vision systems and the Reconfigurable Acceleration stack for developing and deploying platforms at cloud scale.
Check out the Xilinx Machine Learning Web page for more in-depth information.
Xcell Daily has covered the FPGA-accelerated AWS EC2 F1 instances from Amazon Web Services several times. The AWS EC2 F1 instances allow AWS customers to develop accelerated code in C, C++, OpenCL, Verilog, or VHDL and run it on Amazon servers augmented with hardware-acceleration cards based on multiple Xilinx Virtex UltraScale+ VU9P FPGAs. (See below.)
A new AWS case study titled “Xilinx Speeds Testing Time, Increases Developer Productivity Using AWS” turns the tables. It discusses Xilinx’s use of AWS services to speed development of Xilinx development software such as the Vivado and SDx development environments. Xilinx employs extensive regression testing when developing new releases of these complex tools and the resulting demand spikes called for more “elastic” server resources. (Amazon’s “EC2” designation stands for “Elastic Compute Cloud.”)
As the case study states:
“Xilinx addressed its infrastructure-scaling problem by migrating to a high-performance computing (HPC) cluster running on Amazon Web Services (AWS). ‘We evaluated several cloud providers and chose AWS because it had the best tools and most mature solution,’ says [Ambs] Kesavan, [software engineering and DevOps director at Xilinx].”
For more information about Amazon’s AWS EC2 F1 instance in Xcell Daily, see:
Earlier this month at the Xilinx Developers Forum (XDF) in Frankfurt, Huawei’s Principal Hardware Architect Craig Davies gave a half-hour presentation about Huawei Cloud’s FaaS (FPGAs as a Service). His primary mission: to enlist new Huawei Cloud partners to expand the company’s FACS (FPGA Accelerated Cloud Server) FaaS ecosystem. (Huawei announced the FACS offering at HUAWEI CONNECT 2017 last September, see “Huawei bases new, accelerated cloud service and FPGA Accelerated Cloud Server on Xilinx Virtex UltraScale+ FPGAs.”)
Huawei’s FACS cloud offering is based on a PCIe server card that incorporates a Xilinx Virtex UltraScale+ VU9P FPGA. (Huawei also offers the board for on-premise installations.) In addition to the hardware, Huawei offers three major development tools for FACS:
With these offerings, Davies said, Huawei is looking to add partners to expand its ecosystem and is particularly interested in talking to companies that offer:
There’s a Huawei Cloud Marketplace that serves as an outlet for FACS applications. The company is also welcoming end users to try the service.
Here’s a video of Davies’ 32-minute presentation at XDF:
Amazon’s Senior Director of Business Development and Product, Gadi Hutt, gave an in-depth presentation at the recent Xilinx Developers Forum in Frankfurt, Germany where he detailed the specifics, advantages, and the nuts-and-bolts “how to” with respect to using the FPGA-based AWS EC2 F1 instances to accelerate your business.
First, Hutt gave one of the most succinct definitions of “the cloud” I’ve heard: “the on-demand delivery of compute, storage, networking, etc. services.” This definition is free of the niggling details such as hardware, networking, power, and cooling that you are now free to ignore.
Then Hutt listed the advantages of cloud-based services:
From there, Hutt provided a deep explanation of the steps you need to take to distribute cloud-based services globally. He also quoted a Gartner estimate, which said that AWS (Amazon Web Services) has more compute capacity than all of the other cloud providers combined. Certainly, this Gartner report puts AWS far in the upper right corner of the Gartner Magic Quadrant for Cloud Infrastructure as a Service, Worldwide.
Using AWS allows your company to “get out of IT” and focus on providing specialized services where you can add value, said Hutt. “You can focus on your core business,” he continued.
Then he turned to the specifics of the AWS EC2 F1 instances, which are based on multiple Xilinx Virtex UltraScale+ VU9P FPGAs. Two of the many points Hutt made include:
“There’s pretty good maturity in the software ecosystem, today,” Hutt observed.
One of Hutt’s conclusions with respect to AWS EC2 F1 instances:
“There’s a tremendous opportunity for FPGAs to shine in a number of areas.”
If you’re interested in FPGA-based cloud acceleration, here’s the 48-minute video with Gadi Hutt’s full presentation at XDF:
For more information about Amazon’s AWS EC2 F1 instance in Xcell Daily, see:
Earlier this month, Xilinx held a developers forum in Frankfurt, Germany and Xilinx’s Senior Director for Software and IP Ramine Roane discussed the growing role of Xilinx All Programmable devices in his opening remarks, which appear in a New Electronics article written by Neil Tyler titled “Resurgence of interest in FPGAs helped by new services via the Cloud.” Roane started by stating something that any design team already knows: CPU architectures are failing to meet the demand of increasing workloads because Dennard frequency and power scaling essentially died several years ago after several decades of robust health. (Dennard scaling is often erroneously lumped into Moore’s Law, which is really about transistor and density scaling.) The current workaround, multicore architectures, rapidly hits its own limits in most embedded systems where there just aren’t enough tasks to distribute to dozens of processor cores.
The article then quotes Roane:
“There are too many transistors switching at the same time and current leakage at lower geometries is hitting power constraint limits, and this is all happening at a time when workload demand is growing exponentially both in the Cloud and at the edge.”
One solution, hardware application accelerators, only makes sense if the production volumes are justified. For that you need a killer app, said Roane.
Problem: there just aren’t that many killer apps.
The current situation plays to the strengths of Xilinx All Programmable devices, which can be reconfigured for a truly wide range of applications. “They provide configurable processor sub-systems and hardware that can be reconfigured dynamically,” said Roane.
The problem, of course, is that taking advantage of the programmable hardware resources in Xilinx devices has not been as easy as it might be. In the past, you needed specialized hardware-design skills; you needed to know Verilog or VHDL; you needed to wade into possibly unfamiliar hardware waters.
Roane emphasized that things are very different today. As the article states, “Xilinx and its growing ecosystem of partners are now delivering a much richer development stack so that hardware, embedded and application software developers can program them more easily by using higher level programming options, like C, C++ and OpenCL.”
“We are now able to deliver a development stack that designers are increasingly familiar with and which is also available on the Cloud via secure cloud services platforms,” added Roane, referring to Xilinx-based cloud acceleration offerings from Amazon Web Services (AWS EC2 F1 instances) and Alibaba Cloud.
For more information about Amazon’s AWS EC2 F1 instance in Xcell Daily, see:
For more information about the Xilinx-based Alibaba Cloud F2 offering in Xcell Daily, see:
Embedded-vision applications present many design challenges and a new ElectronicsWeekly.com article written by Michaël Uyttersprot, a Technical Marketing Manager at Avnet Silica, and titled “Bringing embedded vision systems to market” discusses these challenges and solutions.
First, the article enumerates several design challenges including:
Next, the article discusses Avnet Silica’s various design offerings that help engineers quickly develop embedded-vision designs. Products discussed include:
The Avnet PicoZed Embedded Vision Kit is based on the Xilinx Zynq SoC
If you’re about to develop any sort of embedded-vision design, it might be worth your while to read the short article and then connect with your friendly neighborhood Avnet or Avnet Silica rep.
For more information about the Avnet PicoZed Embedded Vision Kit, see “Avnet’s $1500, Zynq-based PicoZed Embedded Vision Kit includes Python-1300-C camera and SDSoC license.”
DornerWorks is one of only three Xilinx Premier Alliance Partners in North America offering design services, so the company has more than a little experience using Xilinx All Programmable devices. The company has just launched a new learn-by-email series with “interesting shortcuts or automation tricks related to FPGA development.”
The series is free but you’ll need to provide an email address to receive the lessons. I signed up and immediately received a link to the first lesson titled “Algorithm Implementation and Acceleration on Embedded Systems” written by DornerWorks’ Anthony Boorsma. It contains information about the Xilinx Zynq SoC and Zynq UltraScale+ MPSoC and the Xilinx SDSoC development environment.
Sign up here.
Last month, a user on EmbeddedRelated.com going by the handle stephaneb started a thread titled “When (and why) is it a good idea to use an FPGA in your embedded system design?” Olivier Tremois (oliviert), a Xilinx DSP Specialist FAE based in France, provided an excellent, comprehensive, concise, Xilinx-specific response worth repeating in the Xcell Daily blog:
As a Xilinx employee I would like to contribute on the Pros ... and the Cons.
Let[’s] start with the Cons: if there is a processor that suits all your needs in terms of cost/power/performance/IOs just go for it. You won't be able to design the same thing in an FPGA at the same price.
Now if you need some kind of glue logic around (IOs), or your design need[s] multiple processors/GPUs due to the required performance then it's time to talk to your local FPGA dealer (preferably Xilinx distributor!). I will try to answer a few remarks I saw throughout this thread:
FPGA/SoC: In the majority of the FPGA designs I’ve seen during my career at Xilinx, I saw some kind of processor. In pure FPGAs (Virtex/Kintex/Artix/Spartan) it is a soft-processor (MicroBlaze or PicoBlaze) and in a [Zynq SoC or Zynq UltraScale+ MPSoC], it is a hard processor (dual-core Arm Cortex-A9 [for Zynq SoCs] and Quad-A53+Dual-R5 [for Zynq UltraScale+ MPSoCs]). The choice is now more complex: Processor Only, Processor with an FPGA aside, FPGA only, Integrated Processor/FPGA. The tendency is for the latter due to all the savings incurred: PCB, power, devices, ...
Power: Pure FPGAs are making incredible progress, but if you want really low power in stand-by mode you should look at the Zynq UltraScale+ MPSoC, which contains many processors and particularly a Power Management Unit that can switch on/off different regions of the processors/programmable logic.
Analog: Since Virtex-5 (2006), Xilinx has included ADCs in its FPGAs, which were limited to internal parameter measurements (Voltage, Temperature, ...). [These ADC blocks are] called the System Monitor. With 7 series (2011) [devices], Xilinx included a dual 1Msamples/sec@12-bits ADC with internal/external measurement capabilities. Lately Xilinx [has] announced very high performance ADCs/DACs integrated into the Zynq UltraScale+ RFSoC: 4Gsamples/sec@12 bits ADCs / 6.5Gsamples/sec@14 bits DACs. Potential applications are Telecom (5G), Cable (DOCSIS) and Radar (Phased-Array).
Security: The bitstream that is stored in the external Flash can be encoded [encrypted]. Decoding [decrypting] is performed within the FPGA during bitstream download. Zynq-7000 SoCs and Zynq UltraScale+ MPSoCs support encoded [encrypted] bitstreams and secured boot for the processor[s].
Ease of Use: This is the big part of the equation. Customers need to take this into account to get the right time to market. Since 2012 and [with] 7 series devices, Xilinx introduced a new integrated tool called Vivado. Since then a number of features/new tools have been [added to Vivado]:
There are also tools related to the MathWorks environment [MATLAB and Simulink]:
All this to say that FPGA vendors have [expended] tremendous effort to make FPGAs and derivative devices easier to program. You still need a learning curve [but it] is much shorter than it used to be…
If you want your design to run at maximum speed at the lowest possible power consumption (and who does not?), then you want to run your algorithms using fixed-point hardware. With that in mind, MathWorks has just published an extensive guide to “Best Practices for Converting MATLAB Code to Fixed Point” for MATLAB-based designs with a nearly hour-long companion video.
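To make the idea of fixed-point conversion concrete, here is a minimal Python sketch of what quantizing a floating-point value to a signed fixed-point word involves. It is not MathWorks’ workflow and not MATLAB code. The function names, the 16-bit word length, and the 12 fractional bits are all hypothetical choices made for illustration.

```python
def to_fixed(value, word_length=16, frac_length=12):
    """Quantize a float to a signed fixed-point integer code.

    The value is scaled by 2**frac_length, rounded to the nearest integer
    (Python's round() breaks exact ties toward the even integer), and then
    saturated to the range of a two's-complement word of word_length bits.
    """
    scale = 1 << frac_length
    lowest = -(1 << (word_length - 1))
    highest = (1 << (word_length - 1)) - 1
    code = int(round(value * scale))
    return max(lowest, min(highest, code))


def to_float(code, frac_length=12):
    """Convert a fixed-point integer code back to a float."""
    return code / (1 << frac_length)


for x in (0.5, 0.1, -1.25, 100.0):
    code = to_fixed(x)
    print(x, code, to_float(code))
```

Two effects show up even in this toy version: values such as 0.1 pick up a small rounding error (at most half of one least-significant bit), and values outside the representable range saturate. Choosing word and fraction lengths that keep both effects acceptable is the core of any float-to-fixed conversion.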
MathWorks has been advocating model-based design using its MATLAB and Simulink development tools for some time because the design technique allows you to develop more complex software with better quality in less time. (See the MathWorks White Paper: “How Small Engineering Teams Adopt Model-Based Design.”) Model-based design employs a mathematical and visual approach to developing complex control and signal-processing systems through the use of system-level modeling throughout the development process, from initial design through design analysis, simulation, automatic code generation, and verification. These models are executable specifications that consist of block diagrams, textual programs, and other graphical elements. Model-based design encourages rapid exploration of a broader design space than other design approaches because you can iterate your design more quickly, earlier in the design cycle. Further, because these models are executable, verification becomes an integral part of the development process at every step. Hopefully, this design approach results in fewer (or no) surprises at the end of the design cycle.
Xilinx supports model-based design using MATLAB and Simulink through the new Xilinx Model Composer, a design tool that integrates into the MATLAB and Simulink environments. The Xilinx Model Composer includes libraries with more than 80 high-level, performance-optimized, Xilinx-specific blocks including application-specific blocks for computer vision, image processing, and linear algebra. You can also import your own custom IP blocks written in C and C++, which are subsequently processed by Vivado HLS.
Here’s a block diagram that shows you the relationship among MathWorks’ MATLAB, Simulink, and Xilinx Model Composer:
Finally, here’s a 6-minute video explaining the benefits and use of Xilinx Model Composer:
Good machine learning depends heavily on large training-data sets, which are not always available. There’s a solution to this problem called transfer learning, which allows a new neural network to use an already trained neural network as a starting point. Kaan Kara at ETH Zurich has published an example of transfer learning as a Jupyter Notebook for the Zynq-and-Python-based PYNQ development environment on GitHub. This demo uses the ZipML-PYNQ overlay to analyze astronomical images of galaxies and put each image into one of two classes: images that show merging galaxies and images that don’t.
The work is discussed further in a paper presented at the IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2017. The paper is titled “FPGA-Accelerated Dense Linear Machine Learning: A Precision-Convergence Trade-Off.”
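For readers new to transfer learning, here is a minimal, pure-Python sketch of the basic idea: keep an existing feature extractor frozen and train only a small new “head” on a handful of labeled samples. This is not Kaan Kara’s notebook and it does not use the ZipML-PYNQ overlay. The frozen weights, the tiny data set, and all function names are made up for illustration.

```python
import math

# Hypothetical "pretrained" layer. These weights stand in for a network that
# was trained elsewhere; they are frozen and never updated below.
FROZEN_WEIGHTS = [[1.0, 1.0], [1.0, -1.0]]


def features(x):
    """Frozen feature extractor: a fixed linear layer followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def predict(head, x):
    """Probability of class 1 from the small trainable head."""
    z = head[0] + sum(h * f for h, f in zip(head[1:], features(x)))
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit only the head (bias plus one weight per feature) with SGD."""
    head = [0.0] * (1 + len(FROZEN_WEIGHTS))
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict(head, x) - y      # log-loss gradient w.r.t. z
            head[0] -= lr * err
            for i, f in enumerate(features(x)):
                head[i + 1] -= lr * err * f
    return head


# Tiny made-up training set: class 1 when x[0] + x[1] > 0.
samples = [(1.0, 0.5), (-1.0, -0.2), (0.3, 0.4), (-0.5, -0.6)]
labels = [1, 0, 1, 0]
head = train_head(samples, labels)

print([round(predict(head, x), 3) for x in samples])
```

Because the frozen layer already produces a useful feature, four training samples are enough to fit the head. That is the appeal of transfer learning when labeled data is scarce.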
Designing SDRs (software-defined radios)? MathWorks and Analog Devices have joined together to bring you a free Webinar titled “Radio Deployment on SoC Platforms.” It’s a 45-minute class that discusses hardware and software development for SDR designs using MathWorks’ MATLAB, Simulink, and HDL Coder to:
Analog Devices’ Zynq-based RF SOM on a Carrier Card
There will be three broadcasts of the Webinar on December 13 to accommodate viewers around the world. Register here. Register even if you cannot attend and you’ll receive a link to a recording of the session.
RS-Online has a running series of design articles and a new one written by Adam Taylor titled “Software Defined SoC on Arty Z7-20, Xilinx ZYNQ evaluation board” tells you how to get started with the Xilinx SDSoC Development Environment using the Digilent Arty Z7-20, which is based on a Zynq Z-7020 SoC. What’s great about SDSoC is that it lets you program code on the Zynq SoC’s dual-core ARM Cortex-A9 MPCore processors and accelerate specific tasks using the Zynq SoC’s programmable logic, all while using C, C++, or OpenCL.
The other great thing is that the Digilent Arty Z7-20 with a voucher for SDSoC costs a mere $219 from Digilent.
You can’t find a more cost-effective way of learning how to use FPGA-based hardware acceleration to break performance bottlenecks in your embedded designs.
For more information about the Digilent Arty Z7, see “Arty Z7: Digilent’s new Zynq SoC trainer and dev board—available in two flavors for $149 or $209.”
If you’ve got some high-speed RF analog work to do, VadaTech’s new AMC598 and VPX598 Quad ADC/Quad DAC modules appear to be real workhorses. The four 14-bit ADCs (using two AD9208 dual ADCs) operate at 3Gsamples/sec and the quad 16-bit DACs (using four AD9162 or AD9164 DACs) operate at 12Gsamples/sec. You’re not going to drive those sorts of data rates over the host bus so the modules have local memory in the form of three DDR4 SDRAM banks for a total of 20Gbytes of on-board SDRAM. A Xilinx Kintex UltraScale KU115 FPGA manages all of the on-board resources (memory, analog converters, and host bus) and handles the blazingly fast data-transfer rates. (The KU115, aka the DSP Monster, is the largest Kintex UltraScale FPGA family member, with 5520 DSP slices that give you an immense amount of digital signal processing power to bring to bear on those RF analog signals.) That combination allows you to create RF waveform generators and advanced RF-capture systems for applications including communications and signal intelligence (COMINT/SIGINT), radar, and electronic warfare. You develop these designs using Xilinx tools including the Vivado Design Suite HLx Editions and the Xilinx Vivado System Generator for DSP, which can be used in conjunction with MathWorks’ MATLAB and the Simulink model-based design tool.
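A quick back-of-the-envelope calculation shows why those converter rates call for local memory. The Python sketch below multiplies out the channel counts, resolutions, and sample rates quoted above. It is upper-bound arithmetic that assumes every converter sample moves at full resolution with no interpolation, decimation, or packing overhead, which is my simplification and not a VadaTech specification.

```python
def aggregate_gbytes_per_sec(channels, bits_per_sample, gsamples_per_sec):
    """Raw converter data rate in Gbytes/sec, ignoring any framing overhead."""
    return channels * bits_per_sample * gsamples_per_sec / 8.0


# Figures quoted in the text above.
adc_rate = aggregate_gbytes_per_sec(channels=4, bits_per_sample=14, gsamples_per_sec=3.0)
dac_rate = aggregate_gbytes_per_sec(channels=4, bits_per_sample=16, gsamples_per_sec=12.0)

# How long the 20Gbytes of on-board SDRAM would last at the raw ADC rate.
sdram_gbytes = 20.0
adc_capture_seconds = sdram_gbytes / adc_rate

print(f"ADC side: {adc_rate:.1f} Gbytes/sec")
print(f"DAC side: {dac_rate:.1f} Gbytes/sec")
print(f"Raw ADC capture time in SDRAM: {adc_capture_seconds:.2f} sec")
```

Under these assumptions, the raw numbers land in the tens of Gbytes/sec, and even the full on-board SDRAM holds less than a second of raw four-channel ADC data. In practice, the FPGA’s job is to reduce, buffer, or generate data locally so that the host bus never sees the raw rate.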
Here’s a block diagram of the AMC598 module:
VadaTech AMC598 Quad ADC/Quad DAC Block Diagram
And here’s a photo of the AMC598 Quad ADC/Quad DAC module:
VadaTech AMC598 Quad ADC/Quad DAC
Note: Please contact VadaTech directly for more information about the AMC598 and VPX598 Quad ADC/Quad DAC modules.
Digilent has announced a major upgrade to the Zynq-based Zybo dev board, now called the Zybo Z7. The original board was based on a Xilinx Zynq Z-7010 SoC with the integrated Arm Cortex-A9 MPCore processors running at 650MHz. The new Zybo Z7-10 and -20 dev boards are based on the Zynq Z-7010 and Z-7020 SoCs, respectively, and the processors now run at 667MHz. The Zybo Z7-10 sells for $199 (currently, you can get a voucher for the Xilinx SDSoC development environment for $10 more) and the Zybo Z7-20 board with triple the programmable logic resources sells for $299 (and currently includes the SDSoC voucher).
Digilent Zybo Z7-20 Dev Board based on Zynq Z-7020 SoC
In addition to the faster processors, there are several additional upgrades made to the Zybo Z7 versus the Zybo dev board. SDRAM capacity has increased from 512Mbytes on the original Zybo board to 1Gbyte on the Zybo Z7. The new boards now have two HDMI ports to support “bump-in-the-wire” HDMI applications. Both boards now also include a connector with a MIPI CSI-2 interface for video camera connections. You can plug a Raspberry Pi Camera Module directly into this connector and Digilent also plans to offer a camera module for this port.
Here’s a video explaining some of the highlights of the new Zybo Z7.
Note: For more information about the Zybo Z7 dev board, please contact Digilent directly.
Programmable logic is proving to be an excellent, flexible implementation medium for neural networks that gets faster and faster as you go from floating-point to fixed-point representation, which makes it ideal for embedded AI and machine-learning applications. The latest proof point is a recently published paper written by Yufeng Hao and Steven Quigley in the Department of Electronic, Electrical and Systems Engineering at the University of Birmingham, UK. The paper is titled “The implementation of a Deep Recurrent Neural Network Language Model on a Xilinx FPGA” and it describes a successful implementation and training of a fixed-point Deep Recurrent Neural Network (DRNN) using the Python programming language; the Theano math library and framework for multi-dimensional arrays; the open-source, Python-based PYNQ development environment; the Digilent PYNQ-Z1 dev board; and the Xilinx Zynq Z-7020 SoC on the PYNQ-Z1 board. Using a Python DRNN hardware-acceleration overlay, the two-person team achieved 20GOPS of processing throughput for an NLP (natural language processing) application with this design and outperformed earlier FPGA-based implementations by factors ranging from 2.75x to 70.5x.
Most of the paper discusses NLP and the LM (language model), “which is involved in machine translation, voice search, speech tagging, and speech recognition.” The paper then discusses the implementation of a DRNN LM hardware accelerator using Vivado HLS and Verilog to synthesize a custom overlay for the PYNQ development environment. The resulting accelerator contains five Process Elements (PEs) capable of delivering 20 GOPS in this application. Here’s a block diagram of the design:
DRNN Accelerator Block Diagram
There are plenty of deep technical details embedded in this paper but this one sentence sums up the reason for this blog post about the paper: “More importantly, we showed that a software and hardware joint design and simulation process can be useful in the neural network field.” This statement is doubly true considering that the PYNQ-Z1 dev board sells for $229.
Twelve student and industry teams competed for 30 straight hours in the Xilinx Hackathon 2017 competition in early October and the 3-minute wrap video just appeared on YouTube. The video shows a lot of people having a lot of fun with the Zynq-based Digilent PYNQ-Z1 dev board and Python-based PYNQ development environment:
In the end, the prizes:
For detailed descriptions of the Hackathon entries, see “12 PYNQ Hackathon teams competed for 30 hours, inventing remote-controlled robots, image recognizers, and an air keyboard.”
And a special “Thanks!” to Sparkfun for supplying much of the Hackathon hardware. Sparkfun is headquartered just down the road from the Xilinx facility in Longmont, Colorado.
Xilinx has a terrific tool designed to get you from product definition to working hardware quickly. It’s called SDSoC. Digilent has a terrific dev board to get you up and running with the Zynq SoC quickly. It’s the low-cost Arty Z7. A new blog post by Digilent’s Alex Wong titled “Software Defined SoC on Arty Z7-20, Xilinx ZYNQ evaluation board” posted on RS Online’s DesignSpark site gives you a detailed, step-by-step tutorial on using SDSoC with the Digilent Arty Z7. In particular, the focus here is on the ease of moving functions from software running on the Zynq SoC’s Arm Cortex-A9 processors to the Zynq SoC’s programmable hardware using Vivado HLS, which is embedded in SDSoC. Moving those functions gives you the performance benefit of hardware-based task execution.
Digilent’s Arty Z7 dev board
Envious of all the cool FPGA-accelerated applications showing up on the Amazon AWS EC2 F1 instance like the Edico Genome DRAGEN Genome Pipeline that set a Guinness World Record last week, the DeePhi ASR (Automatic Speech Recognition) Neural Network announced yesterday, Ryft’s cloud-based search and analysis tools, or NGCodec’s RealityCodec video encoder?
Well, you can shake off that green monster by signing up for the free, live, half-day Amazon AWS EC2 F1 instance and SDAccel dev lab being held at SC17 in Denver on the morning of November 15 at The Studio Loft in the Denver Performing Arts Complex (1400 Curtis Street), just across the street from the Denver Convention Center where SC17 is being held. Xilinx is hosting the lab and technology experts from Xilinx, Amazon Web Services, Ryft, and NGCodec will be available onsite.
Here’s the half-day agenda:
8:00 AM Doors open, Registration, and Continental Breakfast
9:00 AM Welcome, Technology Discussion, F1 Developer Use Cases and Demos
9:35 AM Break
9:45 AM Hands-on Training Begins
12:00 PM Developer Lab Concludes
A special guest speaker from Amazon Web Services is also on the agenda.
Lab instruction time includes:
Seats are necessarily limited for a lab like this, so you might want to get your request in immediately. Where? Here.
Earlier this month, I described Aaware’s $199 Far-Field Development Platform for cloud-based, voice controlled systems such as Amazon’s Alexa and Google Home. (See “13 MEMS microphones plus a Zynq SoC gives services like Amazon’s Alexa and Google Home far-field voice recognition clarity.”) This far-field, sound-capture technology exhibits some sophisticated abilities including:
Aaware’s Far-Field Development Platform
These features are layered on top of a Xilinx Zynq SoC or Zynq UltraScale+ MPSoC and Aaware’s CTO Chris Eddington feels that the Zynq devices provide “well over” 10x the performance of an embedded processor thanks to the devices’ on-chip programmable logic, which offloads a significant amount of processing from the on-chip ARM Cortex processor(s). (Aaware can squeeze its technology into a single-core Zynq Z-7007S SoC and can scale up to larger Zynq SoC and Zynq UltraScale+ MPSoC devices as needed by the customer application.)
Aaware’s algorithm development is based on a unique tool chain:
This tool chain allows Aaware to fit the features it wants into the smallest Zynq Z-7007S SoC or to scale up to the largest Zynq UltraScale+ MPSoC.
Amazon AWS’ re:Invent 2017 takes place in Las Vegas on November 27 through December 1. (Tickets nearly sold out as of today.) CMP402, a class session during the event, is titled “Accelerate Your C/C++ Applications with Amazon EC2 F1 Instances.” Here’s the verbatim class description:
“The newly introduced Amazon EC2 F1 OpenCL development workflow helps software developers with little to no FPGA experience supercharge their applications with Amazon EC2 F1. Join us for an overview and demonstration of how to accelerate your C/C++ applications in the cloud using OpenCL with Amazon EC2 F1 instances. We walk you through the development flow for creating a custom hardware acceleration for a software algorithm. Attendees get hands-on and creative by optimizing an algorithm for maximum acceleration on Amazon EC2 F1 instances.”
The Amazon AWS EC2 F1 instance gets its acceleration from Xilinx UltraScale+ VU9P FPGAs and the C/C++/OpenCL programming facility is based on SDAccel—Xilinx’s development environment for accelerating cloud-based applications using C, C++, or OpenCL—which became available for the AWS EC2 F1 instance just last month. (See “SDAccel for cloud-based application acceleration now available on Amazon’s AWS EC2 F1 instance.”)
For more information about the Amazon AWS EC2 F1 instance, see:
Exactly a week ago, Xilinx introduced the Zynq UltraScale+ RFSoC family, which is a new series of Zynq UltraScale+ MPSoCs with RF ADCs and DACs and SD-FECs added. (See “Zynq UltraScale+ RFSoC: All the processing power of 64- and 32-bit ARM cores, programmable logic plus RF ADCs, DACs.”) This past Friday at the Xilinx Showcase held in Longmont, Colorado, Senior Marketing Engineer Lee Hansen demonstrated a Zynq UltraScale+ ZU28DR RFSoC with eight 12-bit, 4Gsamples/sec RF ADCs, eight 14-bit, 6.4Gsamples/sec RF DACs, and eight SD-FECs connected through an appropriate interface to National Instruments’ LabVIEW Systems Engineering Development Environment.
The demo system was generating signals using the RF DACs, receiving the signals using the RF ADCs, and then displaying the resulting signal spectrum using LabVIEW.
Here’s a 3-minute video of the demo:
Over the past weekend, Xilinx held a Showcase and PYNQ Hackathon in its Summit Retreat Center in Longmont, Colorado. About 100 people from tech companies all over Colorado attended the Showcase and twelve teams—about 40 people including students from local universities and engineers from industry—competed in the Hackathon.
Here are a few images from the Xilinx Showcase and PYNQ Hackathon:
Xilinx Summit Retreat Center in Longmont, Colorado
Xilinx CTO Ivo Bolsens welcomes everyone to the Xilinx Showcase
Xilinx Showcase Attendees
More Xilinx Showcase Attendees
Xilinx VP of Interactive Design Tools and Xilinx Longmont Site Manager Dan Gibbons Welcomes the Hackers to the PYNQ Hackathon
Abo’s Pizza (Boulder’s Finest) for Friday Night Dinner
Handing out Digilent PYNQ-Z1 boards
The PYNQ Hackathon Work Begins
Giving a little help to the participants
Focus, Focus, Focus
A Mini Lecture about the PYNQ Logictools Overlay for Hackathon Attendees
Getting a little shuteye
A view of Longs Peak from the Xilinx Longmont Summit Retreat Center
The event was organized by an internal Xilinx team including:
In addition, a team of helpers was on hand over the 30-hour duration of the event to answer questions:
For more information about the PYNQ Hackathon, see “12 PYNQ Hackathon teams competed for 30 hours, inventing remote-controlled robots, image recognizers, and an air keyboard.”
For more information about the Python-based, open-source PYNQ development environment and the Zynq-based Digilent PYNQ-Z1 dev board, see “Python + Zynq = PYNQ, which runs on Digilent’s new $229 pink PYNQ-Z1 Python Productivity Package.”
Twelve student and industry teams competed for 30 straight hours in the Xilinx Hackathon 2017 competition over the weekend at the Summit Retreat Center in the Xilinx corporate facility located in Longmont, Colorado. Each team member received a Digilent PYNQ-Z1 dev board, which is based on a Xilinx Zynq Z-7020 SoC, and then used their fertile imaginations to conceive of and develop working code for an application using the open-source, Python-based PYNQ development environment, which is based on self-documenting Jupyter Notebooks. The online electronics and maker retailer Sparkfun, located just down the street from the Xilinx facility in Longmont, supplied boxes of compatible peripheral boards with sensors and motor controllers to spur the team members’ imaginations. Several of the teams came from local universities including the University of Colorado at Boulder and the Colorado School of Mines in Golden, Colorado. At the end of the competition, eleven of the teams presented their results using their Jupyter Notebooks. Then came the prizes.
For the most part, team members had never used the PYNQ-Z1 boards and were not familiar with using programmable logic. In part, that was the intent of the Hackathon—to connect teams of inexperienced developers with appropriate programming tools and see what develops. That’s also the reason that Xilinx developed PYNQ: so that software developers and students could take advantage of the improved embedded performance made possible by the Zynq SoC’s programmable hardware without having to use ASIC-style (HDL) design tools to design hardware (unless they want to do so, of course).
Here are the projects developed by the teams, in the order presented during the final hour of the Hackathon (links go straight to the teams’ Github repositories with their Jupyter notebooks that document the projects with explanations and “working” code):
Team John Cena’s Voice-Controlled Mobile Robot
Team “Joy of Pink” developed an emoji generator that interprets facial expressions using Microsoft’s cloud-based Azure Emotion API
Team Caffeine’s Audio Fiend Tone-Based Robotic Controller
After the presentations, the judges deliberated for a few minutes using multiple predefined criteria and then awarded the following prizes:
Congratulations to the winners and to all of the teams who spent 30 hours with each other in a large room in Colorado to experience the joy of hacking code to tackle some tough problems. (A follow-up blog will include a photographic record of the event so that you can see what it was like.)
For more information about the PYNQ development environment and the Digilent PYNQ-Z1 board, see “Python + Zynq = PYNQ, which runs on Digilent’s new $229 pink PYNQ-Z1 Python Productivity Package.”
Late last month, I wrote about an announcement by DNAnexus and Edico Genome that described a huge reduction in the cost and time to analyze genomic information, enabled by Amazon’s FPGA-accelerated AWS EC2 F1 instance. (See “Edico Genome and DNAnexus announce $20, 90-minute genome analysis on Amazon’s FPGA-accelerated AWS EC2 F1 instance.”) The AWS Partner Network blog has just published more details in an article written by Amazon’s Aaron Friedman, titled “How DNAnexus and Edico Genome are Powering Precision Medicine on Amazon Web Services (AWS).”
The details are exciting to say the least. The article begins with this statement:
“Diagnosing the medical mysteries behind acutely ill babies can be a race against time, filled with a barrage of tests and misdiagnoses. During the first few days of life, a few hours can save or seal the fate of patients admitted to the neonatal intensive care units (NICUs) and pediatric intensive care units (PICUs). Accelerating the analysis of the medical assays conducted in these hospitals can improve patient outcomes, and, in some cases, save lives.”
Then, if you read far enough into the post, you find this statement:
“Rady Children’s Institute for Genomic Medicine is one of the global leaders in advancing precision medicine. To date, the institute has sequenced the genomes of more than 3,000 children and their family members to diagnose genetic diseases. 40% of these patients are diagnosed with a genetic disease, and 80% of these receive a change in medical management. This is a remarkable rate of change in care, considering that these are rare diseases and often involve genomic variants that have not been previously observed in other individuals.”
This example is merely a road sign, pointing the way to even more exciting developments in FPGA-accelerated, cloud-based computing to come. Well-known Silicon Valley venture capitalist Jim Hogan directly addressed these developments in a speech at San Jose State University just a couple of weeks ago. (See “Four free training videos (two hour's worth) on using Xilinx SDAccel to create apps for Amazon AWS EC2 F1 instances.”)
The Amazon AWS EC2 F1 instance is a cloud service that’s based on multiple Xilinx Virtex UltraScale+ VU9P FPGAs installed in Amazon’s Web servers. For more information on the AWS EC2 F1 Instance in Xcell Daily, see:
By Adam Taylor
The Xilinx Zynq UltraScale+ MPSoC is good for many applications including embedded vision. Its APU with two or four 64-bit ARM Cortex-A53 processors, Mali GPU, DisplayPort interface, and on-chip programmable logic (PL) give the Zynq UltraScale+ MPSoC plenty of processing power to address exciting applications such as ADAS and vision-guided robotics with relative ease. Further, we can use the device’s PL and its programmable I/O to interface with a range of vision and video standards including MIPI, LVDS, parallel, VoSPI, etc. When it comes to interfacing image sensors, the Zynq UltraScale+ MPSoC can handle just about anything you throw at it.
Once we’ve brought the image into the Zynq UltraScale+ MPSoC’s PL, we can implement an image-processing pipeline using existing IP cores from the Xilinx library or we can develop our own custom IP cores using Vivado HLS (high-level synthesis). However, for many applications we’ll need to move the images into the device’s PS (processing system) domain before we can apply exciting application-level algorithms such as decision making or use the Xilinx reVISION acceleration stack.
I thought I would kick off the fourth year of this blog with a look at how we can use a VDMA (video direct memory access) core instantiated in the Zynq UltraScale+ MPSoC’s PL to transfer images from the PL to the PS-attached DDR memory without processor intervention. Many applications need this kind of high-speed background transfer.
To do this we will use the following IP blocks:
Once configured over its AXI Lite interface, the Test Pattern Generator outputs test patterns which are then transferred into the PS-attached DDR memory. We can demonstrate that this has been successful by examining the memory locations using SDK.
Enabling the FPD Master and Slave Interfaces
For this simple example, we’ll clock both the AXI networks at the same frequency, driven by PL_CLK_0 at 100MHz.
For a deployed system, an image sensor would replace the TPG as the image source and we would need to ensure that the VDMA channel clocks (Stream-to-Memory-Map, or S2MM, and Memory-Map-to-Stream, or MM2S) were fast enough to support the required pixel and frame rate. For example, a sensor with a resolution of 1280 pixels by 1024 lines running at 60 frames per second would require a clock rate of at least 108MHz, which is the standard pixel clock for that video mode with blanking intervals included. We would need to adjust the clock frequency accordingly.
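To see where a figure like 108MHz comes from, here is a minimal Python sketch of the arithmetic. It assumes one pixel per clock and the standard VESA timing for 1280x1024 at 60Hz (1688 total pixels per line and 1066 total lines, blanking included). The function name and constants are illustrative only. A real sensor may use different blanking, so check its datasheet.

```python
def pixel_clock_hz(h_total: int, v_total: int, fps: int) -> int:
    """Pixel clock needed to move one pixel per clock cycle.

    h_total and v_total are the TOTAL line length and line count,
    i.e. active video plus horizontal/vertical blanking.
    """
    return h_total * v_total * fps


# Active video area from the example in the text.
H_ACTIVE, V_ACTIVE, FPS = 1280, 1024, 60

# Assumption: standard VESA timing for 1280x1024 @ 60Hz.
H_TOTAL, V_TOTAL = 1688, 1066

# Rate if only active pixels counted (no blanking): about 78.6MHz.
active_only = pixel_clock_hz(H_ACTIVE, V_ACTIVE, FPS)

# Rate with blanking included: about 108MHz.
with_blanking = pixel_clock_hz(H_TOTAL, V_TOTAL, FPS)

print(f"active pixels only: {active_only / 1e6:.2f} MHz")
print(f"with blanking:      {with_blanking / 1e6:.2f} MHz")
```

The gap between the two numbers is why a 100MHz clock, as used with the TPG in this example, would not be enough for such a sensor.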
Block Diagram of the completed design
To aid visibility within this example, I have included three ILA modules, which are connected to the outputs of the Test Pattern Generator, AXI VDMA, and the Slave Memory Interconnect. Adding these modules enables the use of Vivado’s hardware manager to verify that the software has correctly configured the TPG and the VDMA to transfer the images.
With the Vivado design complete and built, creating the application software to configure the TPG and VDMA to generate images and move them from the PL to the PS is very straightforward. We use the AXIVDMA, V_TPG, and Video Common APIs, available under the BSP lib source directory, to aid in creating the application. The software itself performs the following:
The application will then start generating test frames, transferred from the TPG into the PS DDR memory. I disabled the caches for this example to ensure that the DDR memory is updated.
Examining the ILAs, you will see the TPG generating frames and the VDMA transferring the stream into memory-mapped format:
TPG output, TUSER indicates start of frame while TLAST indicates end of line
VDMA Memory Mapped Output to the PS
Examining the frame store memory location within the PS DDR memory using SDK demonstrates that the pixel values are present.
Test Pattern Pixel Values within the PS DDR Memory
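The path from the video stream to the frame store can be illustrated with a small Python model. This is only a sketch under simplifying assumptions, not the behavior of the actual VDMA IP: it carries one pixel per beat, treats memory as a word-addressed list, and uses hypothetical names (`Beat`, `tpg_frame`, `s2mm_write`). It shows how TUSER (start of frame) and TLAST (end of line) are enough to place each pixel at the right frame-buffer address.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Beat:
    tdata: int   # pixel value
    tuser: bool  # asserted on the first pixel of a frame
    tlast: bool  # asserted on the last pixel of each line


def tpg_frame(width: int, height: int) -> Iterator[Beat]:
    """Generate one frame; the pixel value encodes (row, column)."""
    for y in range(height):
        for x in range(width):
            yield Beat(tdata=(y << 8) | x,
                       tuser=(x == 0 and y == 0),
                       tlast=(x == width - 1))


def s2mm_write(beats: Iterable[Beat], memory: List[int],
               base: int, stride: int) -> int:
    """Write a stream into memory line by line; return lines written."""
    line, col, started = 0, 0, False
    for beat in beats:
        if beat.tuser:            # start of frame: reset to the base address
            line, col, started = 0, 0, True
        if not started:           # ignore data until a frame start is seen
            continue
        memory[base + line * stride + col] = beat.tdata
        col += 1
        if beat.tlast:            # end of line: jump to the next stride
            line, col = line + 1, 0
    return line


# A 4x3 frame stored with a stride of 8 words, starting at word 16.
memory = [-1] * 64
lines = s2mm_write(tpg_frame(4, 3), memory, base=16, stride=8)
print(lines, [hex(v) for v in memory[16:20]])
```

Keeping the stride separate from the line width mirrors how frame buffers are commonly laid out, with each line starting on an aligned address and any padding words left untouched.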
You can use the same approach in Vivado when creating software for a Zynq Z-7000 SoC instead of a Zynq UltraScale+ MPSoC by enabling the AXI GP master for the AXI Lite bus and AXI HP slave for the VDMA channel.
Should you be experiencing trouble with your VDMA based image processing chain, you might want to read this blog.
The project, as always, is on GitHub.
If you want e-book or hardback versions of previous MicroZed Chronicles blogs, you can get them below.
First Year E Book here
First Year Hardback here.
Second Year E Book here
Second Year Hardback here
MathWorks has just published a 4-part mini course that teaches you how to develop vision-processing applications using MATLAB, HDL Coder, and Simulink, then walks you through a practical example targeting a Xilinx Zynq SoC using a lane-detection algorithm in Part 4.
Click here for each of the classes:
PDF Solutions provides yield-improvement technologies and services to the IC-manufacturing industry to lower manufacturing costs, improve profitability, and shorten time to market. One of the company’s newest solutions is the eProbe series of e-beam tools used for inline electrical characterization and process control. These tools combine an SEM (scanning electron microscope) and an optical microscope and have the unique ability to provide real-time image analysis of nanometer-scale features. The eProbe development team selected National Instruments’ (NI’s) LabVIEW to control the eProbe system and brought in JKI (a LabVIEW consulting company, Xilinx Alliance Program member, and NI Silver Alliance Partner) to help develop the system.
PDF Solutions eProbe e-beam tool combines an SEM with an optical microscope
In less than four months, JKI helped PDF Solutions attain a 250MHz pixel-acquisition rate from the prototype eProbe using a combination of NI’s FlexRIO module, based on a Xilinx Kintex-7 FPGA, and NI’s LabVIEW FPGA module. According to the PDF Solutions case study published on the JKI Web site, using NI’s LabVIEW allowed the PDF/JKI team to implement the required, real-time FPGA logic and easily integrate third-party FPGA IP in a fraction of the time required by alternative design platforms while still achieving the project’s image-throughput goals.
LabVIEW controls most of the functions within the eProbe that perform the wafer inspection including:
JKI contributed both to the eProbe’s software architecture design and the development of various high-level software components that coordinate and control the low-level hardware functions including data acquisition and image manipulation.
Although the eProbe’s control system runs within NI’s LabVIEW environment, the system’s user interface is based on a C# application from The PEER Group called the Peer Tool Orchestrator (PTO). JKI developed the interface between the eProbe’s front-end user interface and its LabVIEW-based control system using its internally developed tools. (Note: JKI offers several LabVIEW development tools and templates directly on this Web page.)
eProbe user interface screen
Once PDF Solutions started fielding eProbe systems, JKI sent people to work with PDF Solutions’ customers on site in a collaboration that helped generate ideas for future algorithm and tool improvements.
For more information about real-time LabVIEW development using the NI LabVIEW FPGA module and Xilinx-based NI hardware, contact JKI directly.