Displaying articles for: 02-05-2017 - 02-11-2017
As the BittWare video below explains, CPUs are simply not able to process 100GbE packet traffic without hardware acceleration. BittWare’s new StreamSleuth, to be formally unveiled at next week’s RSA Conference in San Francisco (Booth S312), adroitly handles blazingly fast packet streams thanks to a hardware assist from an FPGA. And as the subhead in the title slide of the video presentation says, StreamSleuth lets you program its FPGA-based packet-processing engine “without the hassle of FPGA programming.”
(Translation: you don’t need Verilog or VHDL proficiency to get this box working for you. You get all of the FPGA’s high-performance goodness without the bother.)
That said, as BittWare’s Network Products VP & GM Craig Lund explains, this is not an appliance that comes out of the box ready to roll. You need (and want) to customize it. You might want to add packet filters, for example. You might want to actively monitor the traffic. And you definitely want the StreamSleuth to do everything at wire-line speeds, which it can. “But one thing you do not have to do,” says Lund, “is learn how to program an FPGA.” You still get the performance benefits of FPGA technology without the hassle. That means that a much wider group of network and data-center engineers can take advantage of BittWare’s StreamSleuth.
As Lund explains, “100GbE is a different creature” than prior, slower versions of Ethernet. Servers cannot directly deal with 100GbE traffic and “that’s not going to change any time soon.” The “network pipes” are now getting bigger than the server’s internal “I/O pipes.” This much traffic entering a server this fast clogs the pipes and also causes “cache thrash” in the CPU’s L3 cache.
Sounds bad, doesn’t it?
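To put rough numbers on that claim, here is a back-of-the-envelope sketch in Python. It assumes worst-case minimum-size Ethernet frames (64 bytes) plus the standard 20 bytes of per-frame overhead on the wire (8-byte preamble and 12-byte inter-frame gap). The 3 GHz CPU clock is purely an illustrative assumption and does not come from the BittWare video.

```python
# Back-of-the-envelope packet-rate arithmetic for Ethernet links.
# Assumptions (not from the video): minimum-size frames and a 3 GHz CPU core.

MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including the FCS
WIRE_OVERHEAD_BYTES = 20    # 8-byte preamble/SFD + 12-byte inter-frame gap


def max_packets_per_second(link_bps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Worst-case packet rate for a link running at link_bps bits per second."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


def cycles_per_packet(link_bps: float, cpu_hz: float) -> float:
    """CPU cycles available per packet if a single core had to keep up."""
    return cpu_hz / max_packets_per_second(link_bps)


if __name__ == "__main__":
    for name, rate in (("10GbE", 10e9), ("100GbE", 100e9)):
        pps = max_packets_per_second(rate)
        budget = cycles_per_packet(rate, cpu_hz=3e9)
        print(f"{name}: {pps / 1e6:.1f} Mpackets/s, {budget:.1f} cycles per packet")
```

Under these assumptions, a 100GbE link can deliver roughly 148.8 million packets per second, which leaves a single 3 GHz core about 20 clock cycles per packet. That is not much room for a cache miss, let alone a filter.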
What you want is to reduce the network traffic of interest down to something a server can look at. To do that, you need filtering. Lots of filtering. Lots of sophisticated filtering. More sophisticated filtering than what’s available in today’s commodity switches and firewall appliances. Ideally, you want a complete implementation of the standard BPF/pcap filter language running at line rate on something really fast, like a packet engine implemented in a highly parallel FPGA.
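For readers who have not used pcap-style filters, here is a toy Python model of what an expression such as "tcp and dst port 443" selects. The Packet fields and helper names are hypothetical and exist only for illustration. This is not BittWare's API, and it evaluates the filter in software, which is exactly what the StreamSleuth avoids.

```python
# Toy model of what a pcap-style filter such as "tcp and dst port 443" selects.
# Packet fields and helper names are hypothetical, for illustration only.
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Packet:
    proto: str
    src: str
    dst: str
    sport: int
    dport: int


Filter = Callable[[Packet], bool]


def proto_is(name: str) -> Filter:
    """Match packets carrying the given protocol, e.g. "tcp"."""
    return lambda p: p.proto == name


def dst_port(port: int) -> Filter:
    """Match packets sent to the given destination port."""
    return lambda p: p.dport == port


def all_of(*filters: Filter) -> Filter:
    """Logical AND of several filters (the "and" in the filter expression)."""
    return lambda p: all(f(p) for f in filters)


def apply_filter(packets: Iterable[Packet], keep: Filter) -> List[Packet]:
    """Return only the packets of interest, dropping everything else."""
    return [p for p in packets if keep(p)]


if __name__ == "__main__":
    traffic = [
        Packet("tcp", "10.0.0.1", "10.0.0.9", 40000, 443),
        Packet("udp", "10.0.0.2", "10.0.0.9", 40001, 53),
        Packet("tcp", "10.0.0.3", "10.0.0.9", 40002, 80),
    ]
    https_only = all_of(proto_is("tcp"), dst_port(443))
    kept = apply_filter(traffic, https_only)
    print(f"kept {len(kept)} of {len(traffic)} packets")
```

The point of the sketch is the shape of the problem: every packet must be tested against every term of the expression, and only the survivors go on to the server.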
The same thing holds true for attack mitigation at 100GbE line rates. Commodity switching hardware isn’t going to do this for you at 100GbE (10GbE yes, but 100GbE, “no way”) and you can’t do it in software at these line rates. “The solution is FPGAs,” says Lund, and BittWare’s StreamSleuth with FPGA-based packet processing gets you there now.
Software-based defenses cannot withstand Denial of Service (DoS) attacks at 100GbE line rates. FPGA-accelerated packet processing can.
So what’s that FPGA inside of the BittWare StreamSleuth doing? It comes preconfigured for packet filtering, load balancing, and routing. (“That’s a Terabit router in there.”) To go beyond these capabilities, you use the BPF/pcap language to program your requirements into the StreamSleuth’s 100GbE packet processor using a GUI or APIs. That packet processor is implemented with a Xilinx Virtex UltraScale+ VU9P FPGA.
Here’s what the guts of the BittWare StreamSleuth look like:
And here’s a block diagram of the StreamSleuth’s packet processor:
The Virtex UltraScale+ FPGA resides on a BittWare XUPP3R PCIe board. If that rings a bell, perhaps you read about that board here in Xcell Daily last November. (See “BittWare’s UltraScale+ XUPP3R board and Atomic Rules IP run Intel’s DPDK over PCIe Gen3 x16 @ 150Gbps.”)
Finally, here’s the just-released BittWare StreamSleuth video with detailed use models and explanations:
For more information about the StreamSleuth, contact BittWare directly or go see the company’s StreamSleuth demo at next week’s RSA conference. For more information about the packet-processing capabilities of Xilinx All Programmable devices, click here. And for information about the new Xilinx Reconfigurable Acceleration Stack, click here.
Annapolis Micro Systems has adopted the Xilinx Zynq UltraScale+ MPSoC in a big way today by introducing three 6U and 3U OpenVPX boards and one PCIe board based on three of the Zynq UltraScale+ MPSoC family members. The four new COTS boards are:
Annapolis Micro Systems WILDSTAR UltraKVP ZP 3PE for 6U OpenVPX
Annapolis Micro Systems WILDSTAR UltraKVP ZP 2PE for 6U OpenVPX
Annapolis Micro Systems WILDSTAR UltraKVP ZP for 3U OpenVPX
Annapolis Micro Systems WILDSTAR UltraKVP ZP for PCIe
Why choose the Xilinx Zynq UltraScale+ MPSoC? Noah Donaldson, Annapolis Micro Systems VP of Product Development, explains: “Here at Annapolis we pride ourselves on being first to integrate the latest, cutting-edge components into our FPGA boards, all in pursuit of one goal: Delivering the highest-performing COTS boards and systems that have been proven in some of the most challenging environments on earth.”
That pretty much says it all, doesn’t it?
Amazon Web Services (AWS) rolled out the F1 instance for cloud application development based on Xilinx Virtex UltraScale+ VU9P FPGAs last November. (See “Amazon picks Xilinx UltraScale+ FPGAs to accelerate AWS, launches F1 instance with 8x VU9P FPGAs per instance.”) It appears from the following LinkedIn post that people are using it already to do some pretty interesting things:
If you’re interested in Cloud computing applications based on the rather significant capabilities of Xilinx-based hardware application acceleration, check out the Xilinx Acceleration Zone.
Last September, Xilinx announced the six members of the 28nm Spartan-7 FPGA family for “cost-sensitive” designs (that’s marketing-speak for “low-cost”) and for designs that require small-footprint devices. (The two smallest members of the Spartan-7 family will be offered in 8x8mm CPGA196 packages with 100 user I/O pins.)
There’s a new 15-minute video with a quick technical overview of the Spartan-7 family:
And you can download the 50-page Advance Product Specification here.
By Adam Taylor
Like its MicroZed and PicoZed predecessors, the UltraZed-EG is a System on Module (SOM) that contains all of the necessary support functions for a complete embedded processing system. As a SOM, this module is designed to be integrated with an application-specific carrier card. In this instance, our application-specific card is the Avnet UltraZed IO Carrier Card.
The specific Zynq UltraScale+ MPSoC contained within the UltraZed SOM is the XCZU3EG-SFVA625, which incorporates a quad-core ARM Cortex-A53 APU (Application Processing Unit), dual ARM Cortex-R5 processors in an RPU (Real-Time Processing Unit), and an ARM Mali-400 GPU. Coupled with a very high performance programmable-logic array based on the Xilinx UltraScale+ FPGA fabric, suffice it to say that exploring how to best use all of these resources will keep us very, very busy. You can find the 36-page product specification for the device here.
The UltraZed SOM itself, shown in the diagram below, provides us with 2GBytes of DDR4 SDRAM, while non-volatile storage for our application(s) is provided by both dual QSPI and eMMC Flash memory. Most of the Zynq UltraScale+ MPSoC’s PS and PL I/O are broken out to one of three headers to provide maximum flexibility on the application-specific carrier card.
Avnet UltraZed-EG SOM Block Diagram
The UltraZed IO Carrier Card (IOCC) breaks out the I/O pins from the SOM to a wide variety of interface and interconnect technologies including Gigabit Ethernet, USB 2/3, UART, PMOD, DisplayPort, SATA, and Arduino Shield. This diverse set of I/O connections gives us wide latitude in developing all sorts of systems. The IOCC also provides a USB-to-JTAG interface, allowing us to program and debug our system. You’ll find more information on the IOCC here.
Having introduced the UltraZed and its IOCC, it is time to write a simple “hello world” application and to generate our first Zynq UltraScale+ MPSoC design.
The first step on this journey is to make sure we have used the provided voucher to generate a license and downloaded the Design Edition of the Vivado Design Suite.
The next step is to install the board files to provide Vivado with the necessary information to create designs targeting the UltraZed SOM. You can download these files using this link. These board-definition files include information such as the actual Zynq UltraScale+ MPSoC device populated on the SOM, connections to the PS on the IOCC, and a preset configuration for the SOM. We can of course create an example without using these files; however, it requires a lot more work.
Once you have downloaded the zip file, extract the contents into the following directory:
<Vivado Install Root>/data/boards/board_files
When this is complete, you will see that the UltraZed board definitions are now in the directory and we can now use them within our design.
I should point out at this point that some of the UltraZed boards (including mine) use ES1 silicon. To alert Vivado about this, we need to create an init.tcl file in the scripts directory that will enable us to use ES1 silicon. Doing so is very simple. Within the directory:
Create a file called init.tcl. Enter the line “enable_beta_device*” into this file to enable the use of ES1 silicon within your toolchain.
With this completed we can open Vivado and create a new RTL project. After entering the project name and location, click next on the add sources, IP, and constraints tabs. This should bring you to the part selection tab. Click on boards and you should see our UltraZed IOCC board. Select that board and then finish the open project dialog.
This will create a new project.
For this project I am just going to use the Zynq UltraScale+ MPSoC’s PS to print “hello world.” I usually like to do this with new boards to ensure that I have pipe-cleaned the tool chain. To do this, we need a hardware-definition file to export to SDK to define the hardware platform.
The first step in this sequence is within Flow Navigator. On the left-hand side of the Vivado screen, select the Create Block Diagram option. This will provide a dialog box allowing you to name your block design (or you can leave it default). Click OK and this will create a blank block diagram (in the example below mine is called design_1).
Within this block diagram, we need to add an MPSoC system. Click on the “add IP” button as indicated in the block diagram. This will bring up an IP dialog. Within the search box, type in “MPSoC” and you will see the Zynq UltraScale+ MPSoC IP block. Double click on this and it will be added to the diagram automatically.
Once the block has been added, you will notice a designer assistance notification across the top of the block diagram. For the moment, do not click on that. Instead, double click on the MPSoC IP in your block diagram and it will open up the customization screen for the MPSoC, just like any other IP block.
Looking at the customization screen, you will see it is not yet configured for the target board. For instance, the IOU block has no MIO configuration. Had we not downloaded the board definition, we would now have to configure this manually. But why do that when we can use the shortcut?
We have the board-definition files, so all we need to do to correctly configure this for the IOCC is close the customization dialog and click on the Run Block Automation notification at the top of the block diagram. This will configure the MPSoC for our use on the IOCC. Within the block automation dialog, check to make sure that the “apply pre-sets” option is selected before clicking OK.
Re-open the MPSoC IP block and you will see a different configuration of the MPSoC, one that is ready to use with our IOCC.
Do not change anything. Close the dialog box. Then, on the block diagram, connect the PL_CLK0 pin to the maxihpm0_lpd_aclk pin. Once that is complete, click on “validate” to ensure that the design has no errors.
The next step is very simple. We’ll create an RTL wrapper file for the block diagram. This will allow us to implement the design. Under the sources tab, right-click on the block diagram and select “create HDL wrapper.” When prompted, select the option that allows Vivado to manage the file for you and click OK.
To generate the bitstream, click on the “Generate Bitstream” icon on the menu bar. If you are prompted about any stages being out of date, re-run them first by clicking on “yes.”
Depending on the speed of your system, this step may take a few minutes or longer to generate the bitstream. Once completed, select the “open implementation” option. Having the implementation open allows us to export the hardware definition file to SDK where we will develop our software.
To export the hardware definition, select File -> Export -> Export Hardware. Select “include bit file” and export it.
To those familiar with the original Zynq SoC, all of this should look pretty familiar.
We are now ready to write our first software program—next time.
You can find links to previous installments of the MPSoC edition here.
Code is available on GitHub as always.
If you want e-book or hardback versions of previous MicroZed Chronicles blogs, you can get them below.
Accolade’s newly announced ATLAS-1000 Fully Integrated 1U OEM Application Acceleration Platform pairs a Xilinx Kintex UltraScale KU060 FPGA on its motherboard with an Intel x86 processor on a COM Express module to create a network-security application accelerator. The ATLAS-1000 platform integrates Accolade’s APP (Advanced Packet Processor), instantiated in the Kintex UltraScale FPGA, which delivers acceleration features for line-rate packet processing including lossless packet capture, nanosecond-precision packet timestamping, packet merging, packet filtering, flow classification, and packet steering. The platform accepts four 10G SFP+ or two 40G QSFP pluggable optical modules. Although the ATLAS-1000 is designed as a flow-through security platform, especially for bump-in-the-wire applications, there’s also 1Tbyte worth of on-board local SSD storage.
Accolade Technology's ATLAS-1000 Fully Integrated 1U OEM Application Acceleration Platform
Here’s a block diagram of the ATLAS-1000 platform:
All network traffic enters the FPGA-based APP for packet processing. Packet data is then selectively forwarded to the x86 CPU COM Express module depending on the defined application policy.
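As a rough illustration of what flow classification and packet steering involve, here is a small Python sketch. It builds a direction-independent 5-tuple key so that both halves of a conversation land in the same flow, then hashes that key to pick a queue. The function names are hypothetical and CRC32 is only a stand-in hash. Nothing here describes how Accolade's APP actually classifies or steers packets.

```python
# Toy flow classifier: hash a direction-independent 5-tuple to pick a queue.
# Names are hypothetical and CRC32 is a stand-in; this is not Accolade's method.
import zlib
from typing import Tuple

FlowKey = Tuple[str, int, str, int, str]


def flow_key(src_ip: str, src_port: int, dst_ip: str, dst_port: int, proto: str) -> FlowKey:
    """Order the two endpoints so both directions of a flow share one key."""
    first, second = sorted(((src_ip, src_port), (dst_ip, dst_port)))
    return (first[0], first[1], second[0], second[1], proto)


def steer(key: FlowKey, num_queues: int) -> int:
    """Map a flow key to a queue index in the range 0..num_queues-1."""
    digest = zlib.crc32(repr(key).encode("utf-8"))
    return digest % num_queues


if __name__ == "__main__":
    forward = flow_key("192.168.1.10", 51000, "10.0.0.5", 443, "tcp")
    reverse = flow_key("10.0.0.5", 443, "192.168.1.10", 51000, "tcp")
    print(forward == reverse)                      # both directions, one flow
    print(steer(forward, 8) == steer(reverse, 8))  # so they share a queue
```

Keeping every packet of a flow on the same queue means whatever consumes that queue sees the whole conversation in order, which is what makes policy decisions per flow practical.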
Please contact Accolade Technology directly for more information about the ATLAS-1000.
A new LinkedIn post written by Iain Mosely, Founder of Converter Technology, and titled “Using the Xilinx Zynq SoC for Real Time Control of Power Electronics” details the author’s early experience in developing PWM controllers for power electronics converters using the Zynq SoC’s programmable logic. There are several reasons to note this post in Xcell Daily.
First, Mosely credits Xcell Daily author Adam Taylor for the help he’s provided through 169 installments of the MicroZed Chronicles (so far):
“Learning any new technology is great fun and challenging at the same time. We found the MicroZED Chronicles by Adam Taylor to be an excellent resource to help us get up and running and I really recommend looking at these if you want to learn about the Zynq technology.”
Then, Mosely gets to the heart of the matter:
“So, if you are used to the world of micro-controllers then a good way to think about the Zynq parts is to consider them as a micro-controller and FPGA on the same piece of Silicon. In fact the Zynq we used contains a dual-core ARM cortex A9 processor system (PS) so is a highly capable device surrounded by a huge amount of programmable logic (PL).”
“With the SoC approach used in the Zynq, the user now has the option to create their own custom hardware peripherals within the Zynq chip which can then be controlled by the on-chip ARM cores. What this really means is that the user has flexibility to create their own set of tightly coupled hardware peripherals (e.g. PWM block) using a hardware description language such as Verilog or VHDL. This gives incredible flexibility to the user to define exactly how they want their peripheral to behave and since it is implemented in physical gates on the device, timings can be highly deterministic.”
It’s reasonable to ask if this really is an efficient way to engineer a power converter. After all, many microcontrollers have PWM capabilities in their timer/counter peripherals. Mosely has an explanation:
“So, you might think that this is an awful lot of effort to go to to control a 30W flyback - and you would be correct! However, imagine the situation whereby you need to control multiple converters running out of phase (e.g. multiphase buck) or two converters in one system (e.g. PFC and downstream). What about multilevel converters for high voltage applications? In these more complex cases it can become increasingly difficult to control everything using a single micro-controller, especially if switching frequencies and loop bandwidths are being pushed to higher levels. Using a microcontroller is very much a 'sequential' approach to the design whereby at a certain point the processor cannot operate fast enough to implement the control algorithm. By offloading aspects of the real time control to the FPGA fabric in a device such as the Zynq, it becomes possible to run many operations in parallel which can bring significant speed advantages, especially in multi-phase systems.”
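To make the multiphase case a little more concrete, here is a Python sketch that computes counter settings for several interleaved PWM phases sharing one timebase. It assumes the common scheme of spacing N phases evenly across the switching period. The 100 MHz counter clock and 200 kHz switching frequency are illustrative assumptions only and do not come from Mosely's post.

```python
# Sketch: counter settings for N interleaved PWM phases sharing one timebase.
# Assumed example values: 100 MHz counter clock, 200 kHz switching frequency.
from typing import List, NamedTuple


class PhaseSetting(NamedTuple):
    phase: int
    start_count: int  # counter value at which this phase switches on
    stop_count: int   # counter value at which it switches off (modulo period)


def pwm_period_counts(clock_hz: int, switching_hz: int) -> int:
    """Number of counter ticks in one switching period."""
    return clock_hz // switching_hz


def interleaved_phases(period_counts: int, num_phases: int, duty: float) -> List[PhaseSetting]:
    """Evenly space num_phases PWM outputs across one period.

    Note: with duty == 1.0 the stop count wraps onto the start count, so a
    real design would need to treat 0% and 100% duty as special cases.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0.0 and 1.0")
    on_counts = round(period_counts * duty)
    settings = []
    for k in range(num_phases):
        start = (k * period_counts) // num_phases
        stop = (start + on_counts) % period_counts
        settings.append(PhaseSetting(k, start, stop))
    return settings


if __name__ == "__main__":
    period = pwm_period_counts(100_000_000, 200_000)
    for s in interleaved_phases(period, num_phases=4, duty=0.3):
        print(s)
```

In programmable logic, each phase would be its own comparator watching the shared counter, so adding phases costs logic rather than processor time. That is the parallelism Mosely describes.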
Note: If you’re not reading the frequent installments of Adam Taylor’s MicroZed Chronicles, you’re missing a lot of good help.
The Zynq-based Red Pitaya open instrumentation board gives you a programmable platform like an Arduino or a Raspberry Pi, but with the added kick of high-speed ADCs and DACs for analog instrumentation projects. The Red Pitaya organization always intended the Red Pitaya board to be a learning tool, and Anton Potočnik at ETH Zürich has started writing a blog series to help you learn how to program the board. So far, he’s published four projects:
Red Pitaya Frequency Counter Block Diagram
The latest blog post is nearly a month old, so let’s hope there’s another soon.
For previous Xcell Daily blog posts about the Red Pitaya, see:
The new VC Z series of industrial Smart Cameras from Vision Components incorporates a Xilinx Zynq Z-7010 SoC to give the camera programmable local processing. The VC nano Z camera series is available as a bare-board imaging platform called the VCSBC series or as a fully enclosed camera called the VC series. The VCSBC series is available with 752x480-pixel (WVGA), 1280x1024-pixel (SXGA), 1600x1200-pixel, or 2048x1536-pixel sensors. These camera modules acquire video at rates from 50 to 120 frames/sec depending on sensor size. All four of these modules are also available with remote sensor heads as the VCSBC nano Z-RH series to ease system integration. Thanks to the added video-processing horsepower of the Zynq SoC, these modules are also offered in dual-sensor, stereo-imaging versions called the VCSBC nano Z-RH-2 series.
Vision Components VCSBC nano Z-RH-2 industrial stereo smart camera module
These same cameras are also available from Vision Components with rugged enclosures and lens mounts as the VC nano Z series and the VC pro Z series. The VC pro Z versions can be equipped with IR LED illumination.
Vision Components VC pro Z enclosed industrial smart camera
The ability to create more than a dozen different programmable cameras and camera modules from one platform arises directly from the use of the integrated Xilinx Zynq SoC. The cameras use the Zynq SoC’s dual-core ARM Cortex-A9 MPCore processor to run Linux and to support the extensive programmability made possible by software tools such as Halcon Embedded from MVTec Software, which allows you to comfortably develop applications on a PC and then export them to Vision Components’ Smart Cameras. The Zynq SoC’s on-chip programmable logic is able to perform a variety of vision-processing tasks such as white-light interferometry, color conversion, and high-speed image recognition (such as OCR, bar-code reading, and license-plate recognition) in real time.
These cameras make use of the extensive, standard I/O capabilities in the Zynq SoC including high-speed Ethernet, I2C, and serial I/O while the Zynq SoC’s programmable I/O provides the interfacing flexibility needed to accommodate the four existing image sensors offered in the series or any other sensor that Vision Components might wish to add to the VC Z series of smart cameras in the future. According to Endre J. Tóth, Director of Business Development at Vision Components, these programmable capabilities give his company a real competitive advantage.
Here’s a 5-minute video detailing some of the applications you can address with these Smart Cameras from Vision Components:
Note: For more information about these Smart Cameras, please contact Vision Components directly.
By Adam Taylor
As we are going to be looking at both the Zynq Z-7000 SoC and the Zynq UltraScale+ MPSoC in this series moving forward, one important aspect we need to consider is how we can best leverage the processor cores provided within our chosen device. How we use these cores, of course, depends upon the system architecture we implement to achieve the overall system requirements. Increasingly, system designers use an asymmetric approach to obtain optimal performance and to address the system-design challenges. Of course, system-design challenges are usually application-specific.
At this point, those unfamiliar with the term may find themselves asking what an asymmetric approach is. Simply put, an asymmetric approach is one where different processing elements within a device perform different functions, and indeed some may be running different operating systems to achieve those functions. One example of this would be one of the two ARM Cortex-A9 processor cores of a Zynq SoC running Linux and handling system communications and other tasks that do not need to be addressed in real time, while the second processor core runs a bare-metal application or a FreeRTOS application to execute real-time processing tasks and communicate results to the other core.
When we implement systems in such a manner, we call this asymmetric multi-processing or AMP. We have looked at AMP before, briefly. However, we did not look at the OpenAMP framework developed by the Multicore Association. This open-source framework is supported by both the Zynq SoC and Zynq UltraScale+ MPSoC and provides the software elements necessary for us to establish an AMP system. As such, it is something we need to be familiar with as we develop our examples going forward.
The alternative is a symmetric multi-processing (SMP) system in which all the cores run the same operating system and are balancing the workload among themselves. An example of this would be running Linux on both cores of a Zynq SoC.
Creating an AMP system allows us to leverage the parallelism provided by having several processing cores available, i.e. we can set different cores to perform different tasks under the control of a master core. However, AMP does come with challenges such as how inter-process communication is implemented, how resources are shared, process control, and how the life cycle is managed. The OpenAMP framework is designed to address these issues and to enable reuse and portability at the same time.
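To give a feel for the master/remote relationship before we touch real hardware, here is a toy model in Python. Threads and queues stand in for processor cores and message channels, and the sum() call is a placeholder for the remote's real-time task. All names are hypothetical. This is emphatically not the OpenAMP API, just a sketch of the roles involved: the master starts the remote, exchanges messages with it, and shuts it down.

```python
# Toy model of an AMP master/remote pair. Threads and queues stand in for
# processor cores and message channels; this is NOT the OpenAMP API.
import queue
import threading
from typing import List

SHUTDOWN = None  # sentinel the master sends to end the remote's life cycle


def remote_core(rx: "queue.Queue", tx: "queue.Queue") -> None:
    """Remote: wait for a message, process it, and send back the result."""
    while True:
        msg = rx.get()
        if msg is SHUTDOWN:
            break
        tx.put(sum(msg))  # placeholder for the remote's processing task


def run_master(jobs: List[List[int]]) -> List[int]:
    """Master: start the remote, exchange messages, then shut it down."""
    to_remote: "queue.Queue" = queue.Queue()
    from_remote: "queue.Queue" = queue.Queue()
    remote = threading.Thread(target=remote_core, args=(to_remote, from_remote))
    remote.start()                       # life-cycle management: start
    results = []
    for job in jobs:
        to_remote.put(job)               # communication: master -> remote
        results.append(from_remote.get(timeout=5))  # remote -> master
    to_remote.put(SHUTDOWN)              # life-cycle management: stop
    remote.join(timeout=5)
    return results


if __name__ == "__main__":
    print(run_master([[1, 2, 3], [10, 20]]))
```

Even in this toy, the master has to own the channels, decide when the remote starts and stops, and agree with it on a message format. Those are the same concerns, at a much smaller scale, that the framework is meant to standardize.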
When working with the Zynq SoC and Zynq UltraScale+ MPSoC, we can implement AMP solutions which have the following configuration:
I should note at this point that while in the Zynq SoC, we can use one core to run Linux as the master, in the Zynq UltraScale+ MPSoC we can use the quad-core APU (based on ARM Cortex-A53 processors) to run Linux while using the dual-core RPU (based on ARM Cortex-R5 processors) as the remote to run the bare-metal or FreeRTOS applications.
The master core, running Linux, contains most of the OpenAMP framework within the kernel. There are three main components:
The diagram below (taken from UG1186, “OpenAMP Framework for Zynq Devices”) illustrates the process between master and remote processor using OpenAMP.
Example of OpenAMP flow
Of course, when we build our bare-metal and FreeRTOS applications, we also need to include the necessary libraries within the BSP to enable these to support OpenAMP. The libraries we need to enable are the OpenAMP library and the libmetal library.
Having introduced the OpenAMP concept, next week we will look at how we can get an example up and running on a Zynq device.
All of Adam Taylor’s MicroZed Chronicles are cataloged here.
The Pro AV industry’s transition from proprietary audio/video transport means to lower-cost, IP-based solutions is already underway but like any new field, the differing approaches make the situation somewhat chaotic. That’s why 14 leading vendors are launching the SDVoE (Software Defined Video Over Ethernet) Alliance at this week’s ISE 2017 show in Amsterdam (Booth 12-H55).
The SDVoE Alliance is a non-profit consortium that’s developing standards to provide “an end-to-end hardware and software platform for AV extension, switching, processing and control through advanced chipset technology, common control APIs and interoperability.” The consortium also plans to create an ecosystem around SDVoE technology.
An SDVoE announcement late last week said that fourteen new companies were joining the original six founding member companies (AptoVision, Aquantia, Christie Digital, NETGEAR, Sony, and ZeeVee). The new member companies are:
You might recognize Aquantia’s name on this list from the recent Xcell Daily blog post about the company’s new AQLX107 device, which packs an Ethernet PHY—capable of operating at 10Gbps over 100m of Cat 6a cable (or 5Gbps down to 100Mbps over 100m of Cat 5e cable)—along with a Xilinx Kintex-7 FPGA into one compact package.
The connection here is not at all coincidental. The Aquantia AQLX107 “FPGA-programmable PHY” makes a pretty nice device for implementing SDVoE and in fact, Aquantia and AptoVision announced such an implementation just today. According to this announcement, “Combined with AptoVision’s BlueRiver technology, the AQLX107 can be used to transmit true 4K60 video across off-the-shelf 10G Ethernet networks and standard category cable with zero frame latency. Audio and video processing, including upscaling, downscaling, and multi-image compositing are all realizable on the SDVoE hardware and software platform made possible by the AQLX107.”
The presence of Xilinx on this list of SDVoE Alliance members also should not be surprising. Xilinx has long worked with major Pro AV vendors to meet a wide variety of professional and broadcast-video challenges including any-to-any connectivity, all of the latest video-compression technologies, and video over IP.
In fact, Xilinx and its Xilinx Alliance Members will be demonstrating some of the most recent AV innovations and implementations in the Xilinx booth exhibiting at this week’s ISE show including:
Check out these demonstrations in booth 14-B132 at the ISE 2017 show.