VisualApplets SVDK from Silicon Software is a graphical, integrated FPGA development platform specifically for machine-vision applications using the Avnet Smart Vision Development Kit (SVDK), which is based on the Avnet PicoZed 7015T SOM incorporating the Xilinx Zynq Z-7015 SoC and a 1.2Mpixel Aptina camera module. One important aspect of the VisualApplets SVDK for machine-vision developers is that it uses the “language of image-processing people” rather than a hardware-centric sort of development environment. The VisualApplets SVDK development environment includes more than 200 image- and signal-processing operators. There’s a free, downloadable trial version of the development platform that doesn’t even require hardware for initial experimentation. If you don’t want to download the 150Mbytes, you can get a DVD from the company for €10.
Download the VisualApplets SVDK data sheet here.
Mercury Systems’ Ensemble LDS3506 rugged 3U OpenVPX embedded computing module melds an 8-core, 64-bit, Intel Xeon D-1540 processor with a Xilinx Kintex UltraScale KU040 or KU060 FPGA, sealing both devices in a large aluminum heat sink using the company’s 5th-generation server-class packaging technology for maximum thermal transfer. Here’s a photo of the module:
Mercury Systems’ Ensemble LDS3506 Rugged OpenVPX Embedded Computing Module
By Thomas Gage and Jonathan Morris, Marconi Pacific
ADAS makes safety and marketing sense. Whether it is Daimler, Toyota, Ford, Nissan, GM, another vehicle OEM or even Google, none are going to put vehicles on the road that can steer, brake or accelerate autonomously without having confidence that the technology will work. ADAS promises to first reduce accidents and assist drivers as a “copilot” before eventually taking over for them on some and eventually their entire journey as an “autopilot.”
As for how quickly the impacts of this technology will be felt, the adoption curves for any new technology look very similar to one another. For example, the first commercial mobile-phone network went live in the United States in 1983 in the Baltimore-Washington metropolitan area. At the time, phones cost about $3,000 and subscribers were scarce. Even several years later, coverage was unavailable in most of the country outside of dense urban areas. Today there are more mobile-phone subscriptions than there are people in the United States, and more than 300,000 mobile-phone towers connect the entire country. Low-end smartphones cost about $150. Vehicle technology is moving forward at a similar pace.
The Yamaha YMF262 OPL3 FM Sound Synthesis chip was popular as a sound generator for PC games back in the 1990s. Greg Taylor has reverse engineered the device and written a SystemVerilog RTL description targeting the low-cost Digilent ZYBO Development Board, which is based on a Xilinx Zynq Z-7010 SoC. The design consumes less than half of the programmable logic, very little on-chip RAM, and only one of the 80 available DSP48 slices:
The SystemVerilog code’s on GitHub.
How does it sound? Listen for yourself (and dance in your chair):
Adam Taylor publishes the MicroZed Chronicles weekly in the Xcell Daily blog so you are likely familiar with his writing. Adam has just published an article about SDSoC and Zynq titled “High-level synthesis comes of age with SDSoC” on the Embedded.com Web site. Here’s his motivation for writing this article:
“…Zynq is a device every embedded system designer should be familiar with and considering for their application. At its heart the Zynq is not a FPGA with embedded processors -- like previous generations of FPGA with Power PCs -- but a true embedded processor with very flexible interfacing capabilities (DDR, CAN, UART, USB, Giga Bit Ethernet, SPI and I2C to name a few). What separates the Zynq from other embedded processors is the attached programmable logic, and with SDSoC embedded system developers can exploit this pretty simply…
SDSoC takes the eclipse front end, Vivado HLS, Vivado and a lot of behind the scenes intelligence to create seamlessly the option to accelerate software functions in the attached programmable logic of the device.”
What follows is a simple explanation of what SDSoC brings to the party from a design engineer’s perspective.
By Adam Taylor
Before I start on a more in-depth SDSoC example, I want to touch upon how we can debug our SDSoC application using the TCF debugger within SDSoC to watch register values, add breakpoints, etc.
Previously, Xcell Daily covered the topic of BOM cost reduction in the form of a new White Paper titled “Reducing System BOM Cost with Xilinx's Low-End Portfolio.” (See “It’s da BOM: How to lower BOM costs using non-intuitive design techniques—a free White Paper.”) For people who prefer to listen and watch rather than read, the same topic is now covered in a 25-minute EEJournal Chalk Talk video titled “10 Secrets to Getting a Lower BOM Cost” featuring Maureen Smerdon and Darren Zacher.
Here’s the video:
By Andy Chang, Senior Manager, Academic Research National Instruments Corp
The IoT has the potential to impact our lives profoundly. NI customers play a critical role in inventing, deploying and refining the consumer and industrial products and systems at the center of the IoT, as well as the wired and wireless infrastructure connecting those products and systems together. Spanning well over a decade, the NI and Xilinx technology partnership has provided engineers and scientists with tools to create world-changing innovations. NI has delivered the latest Xilinx devices in successive generations of its most advanced products, ranging from NI FlexRIO modules to CompactRIO controllers, as well as NI System on Module (SOM) and myRIO devices. NI takes great pride in its role helping innovators to design, build and test these intelligent devices with integrated software and hardware platforms.
Industrial systems interfacing the digital world to the physical world through sensors and actuators that solve complex control problems are commonly known as cyber-physical systems. These systems are being combined with Big Analog Data solutions to gain deeper insight through data and analytics. Imagine industrial systems that can adjust to their own environments or even their own health. Instead of running to failure, machines schedule their own maintenance or, better yet, adjust their control algorithms dynamically to compensate for a worn part, and then communicate that data to other machines and the people who rely on those machines.
The Zynq UltraScale+ MPSoC just taped out (see below) but you can still see one operating while we’re waiting for the first silicon to come back from the fab. Mentor Graphics demonstrated its multicore operating system framework on a multi-chip, FPGA-based emulation of the Zynq UltraScale+ MPSoC at this year’s Embedded World 2015. The demo shows SMP Linux running on the Zynq MPSoC’s four ARM Cortex-A53 64-bit RISC cores and the company’s Nucleus RTOS running on the Zynq MPSoC’s two ARM Cortex-R5 cores.
Here’s the video:
Note: You can read about the recent tapeout announcement for the Xilinx Zynq UltraScale+ MPSoC here: “We have Tapeout! Xilinx Zynq UltraScale+ MPSoC ready for TSMC’s 16FF+ process.”
Last year, Analog Devices announced the 2.5Gsamples/sec AD9625-2.5 ADC and the $750 AD-FMCADC2-EBZ FMC module. (See “Analog Devices kicks sample rate up a notch on AD9625 ADC: now 2.5Gsamples/sec”.) The company demonstrated this FMC module at this year’s Embedded World 2015 with two ADCs running in interleaved mode for an aggregate sample rate of 5Gsamples/sec. The FMC card is plugged into a Xilinx Virtex-7 VC707 Eval Kit and communicates over 16 lanes of JESD204B each running at 6.5Gbps. The on-board Virtex-7 XC7VX485T FPGA’s DSPs and high-speed logic fabric are used to interleave the readings from the two ADCs on the FMC card.
Here’s a short video showing the demo:
Last year, Xilinx Fellow Dr. Steve Trimberger gave a talk at the University of Toronto titled “The Three Ages of the FPGA.” Trimberger has worked at Xilinx since 1988—just three or four years after the company was founded—and he has seen the entire evolution of the FPGA during his tenure. Few people can claim to have the perspective that he’s earned by actually being there. His talk walks you through the three ages of the FPGA:
If you want to get an in-depth but painless look at three decades of FPGA development history, this 40-minute video is an ideal place to start.
Last year at FPL 2014, Xilinx Corporate Vice President of FPGA Development and Silicon Technology Liam Madden gave an excellent, in-depth keynote about the last several decades of silicon and IC packaging development from a personal, career perspective. If you are wondering why certain design decisions get made in the world of Xilinx FPGAs and if you’d like to know a lot more about Xilinx’s multi-generational exploration of 3D packaging technology, Madden’s talk is a clear-eyed, data-driven look at the many technologies involved and exactly why they were used.
Here’s a 1-hour recording of Madden’s presentation:
This year’s FPL conference takes place September 2-5 at the Royal Institution in London. More information and registration here.
Last night, EETimes and EDN presented a number of ACE Awards including twelve “Ultimate Product” awards. The Xilinx SDAccel Development Environment for C, C++, and OpenCL won the Ultimate Product Award in the Development Kits category.
EETimes/EDN 2015 ACE Awards
From the SDAccel entry form:
“The SDAccel development environment for OpenCL, C, and C++, enables up to 25X better performance/watt for data center application acceleration leveraging FPGAs and combines the industry’s first architecturally optimizing compiler supporting any combination of OpenCL, C, and C++ kernels, along with libraries, development boards, and the first complete CPU/GPU-like development and run-time experience for FPGAs. SDAccel streamlines the development and deployment of critical algorithms such as Deep Neural Networks used in machine learning.
SDAccel includes the industry’s first architecturally optimizing compiler that makes efficient use of on-chip FPGA resources along with a familiar software-development flow based on an Eclipse integrated design environment (IDE) for code development, profiling and debugging, providing a CPU/GPU-like work environment.
SDAccel leverages Xilinx’s dynamically reconfigurable technology to enable accelerator kernels optimized for different applications to be swapped in and out on the fly. The applications can have multiple kernels swapped in and out of the FPGA during run-time without disrupting the interface between the server CPU and the FPGA for nonstop application acceleration. This functionality is ideal for swapping applications during peak loading periods.”
(Note: For more information about the Xilinx SDAccel design environment, see “CPU/GPU-like software development environment for OpenCL, C, C++ delivers FPGA-based app acceleration with 25x better performance/W” and “Latest SDAccel release adds 4 new hardware dev platforms, 4 new libraries, 6 new design services firms.”)
Vinay Singh accepts an ACE Award for the SDAccel design environment from Max Maxfield
The LabVIEW Communications System Design Suite from National Instruments (NI) won the ACE Ultimate Product Award in the Software category. NI’s LabVIEW Communications System Design Suite combines software defined radio (SDR) hardware with a comprehensive, unified software design flow to help engineers prototype 5G systems. The package includes built-in application frameworks for WiFi and LTE that enable wireless developers to focus on creating specific components based on existing standards rather than designing new algorithms from scratch.
The LabVIEW Communications System Design software is coupled with the company’s USRP software-defined radio development platform for 5G research, which is based on a Xilinx Kintex-7 All Programmable FPGA. Wireless engineers can use the NI USRP RIO and the NI LabVIEW Communications System Design software to rapidly prototype real-time wireless communications systems and test them under real-world conditions.
(Note: For more information about the NI LabVIEW Communications System Design Suite, see “LabVIEW Communications System Design Suite combines SDR hardware with a unified software design flow for 5G development.”)
Congratulations to all of the talented developers from both National Instruments and Xilinx who created these award-winning products.
Last week at SEMICON West held in San Francisco, Eye Vision Technology (EVT) CEO Michael Beising showed me two system-development platforms his company has based on the Xilinx Zynq SoC. We also discussed a new finished product—a Particle Inspector for clean-room particulate metrology—that employs one of these Zynq SoC development boards to operate a line-scan camera.
The new Xilinx VCU108 Eval Kit gives you fast access to a 20nm Virtex UltraScale XCVU095 FPGA (941K logic cells, 60.8Mbits of Block RAM, 768 DSP slices, 32 GTH 16Gbps SerDes ports, 32 GTY 30.5Gbps SerDes ports, four PCIe hard embedded blocks, six Interlaken blocks, and four 100G Ethernet hard MACs) and provides a great platform for prototyping systems that require massive data flow and packet processing such as 400+ Gbps systems, large-scale emulation, and high performance computing. The Eval Kit’s pc board also includes:
Here’s a photo of the board in the VCU108 Eval Kit:
Xilinx VCU108 Eval Kit
The kit also includes a full seat of Vivado Design Suite: Design Edition, device-locked to the XCVU095 FPGA.
Following closely upon the heels of this month’s recent tapeout announcement for its 16nm Zynq UltraScale+ MPSoC (see “We have Tapeout! Xilinx Zynq UltraScale+ MPSoC ready for TSMC’s 16FF+ process"), Xilinx has now created an early access release of the Vivado Tool Suite that supports its 16nm UltraScale+ device portfolio and has added support for the Zynq MPSoC to the Eclipse-based Xilinx Software Development Kit and PetaLinux tools. For more information about early access to tools for the UltraScale+ portfolio, contact your friendly neighborhood Xilinx sales representative.
Xilinx has announced the public access release of the SDSoC development environment for Zynq SoCs and MPSoCs. The SDSoC development environment provides a familiar embedded C/C++ application development experience including an easy to use Eclipse IDE and a comprehensive design environment for heterogeneous processing designs based on the Zynq SoC and MPSoC device families. Max Maxfield, who writes for embedded.com and EETimes, wrote about the SDSoC design environment back in March: “The thing that makes this so useful for system architects, platform architects, and software developers is that so much of the magic happens ‘under the hood.’”
By Adam Taylor
In conventional SoC design, we would implement our module using RTL and create (hopefully) a test bench to ensure our RTL functions as intended against the specification. This test bench will of course be used to test the module’s boundary and corner cases.
When we develop algorithms in high-level software, we may develop a test harness which itself calls the software function we are developing, subjects it to inputs, and checks that the function behaves as expected. In many ways this is similar to an RTL test bench, however it is typically quicker to execute a piece of compiled code than it is to run an RTL simulation. Checking functions in software allows us to demonstrate that we have implemented the correct algorithm (and implemented the algorithm correctly) prior to accelerating the function in hardware.
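As a rough illustration of such a software test harness, here is a minimal sketch in C. The function under test (sum_of_squares), the test vectors, and the run_harness name are all hypothetical and chosen only for this example; they do not come from any SDSoC project.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical function under development: sum of the squares of n samples. */
int32_t sum_of_squares(const int16_t *samples, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)samples[i] * samples[i];
    }
    return acc;
}

/* One test vector: the inputs plus the result we expect for them. */
struct test_case {
    const int16_t *input;
    size_t         length;
    int32_t        expected;
};

/* Subjects the function to each vector and returns the number of mismatches. */
int run_harness(void)
{
    static const int16_t typical[]  = {1, 2, 3, 4};
    static const int16_t negative[] = {-3, 4};
    static const int16_t extreme[]  = {INT16_MIN};

    const struct test_case cases[] = {
        {typical,  4, 30},          /* 1 + 4 + 9 + 16 */
        {negative, 2, 25},          /* 9 + 16 */
        {extreme,  1, 1073741824},  /* corner case: (-32768)^2 = 2^30 */
        {typical,  0, 0},           /* boundary case: empty input */
    };

    int failures = 0;
    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        int32_t got = sum_of_squares(cases[i].input, cases[i].length);
        if (got != cases[i].expected) {
            printf("case %lu: expected %ld, got %ld\n",
                   (unsigned long)i, (long)cases[i].expected, (long)got);
            failures++;
        }
    }
    return failures;
}
```

Calling run_harness() from main() and checking for a zero return gives a quick pass/fail answer before any hardware build is attempted.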
When using SDSoC to accelerate functions, we need to consider other aspects that may cause an issue when we accelerate them. These issues fall into three overarching groups: build errors, runtime errors, and performance issues.
Build errors occur with SDSoC as the tool attempts to create the executables and boot files. The cause of the error will be reported within the log and report files. Build-error causes include typos, syntax errors and failure to follow coding guidelines, and implementation issues. Implementation issues include aspects such as Vivado HLS or Vivado not being able to achieve the desired timing. If this is the case, you must again look at the defined clock frequencies and optimization options.
A final-build error can occur when the accelerated design does not fit within the PL side of the selected Zynq device. If this is the case all is not lost. You can look at accelerating fewer functions or modifying your hardware optimizations to produce more compact implementations. For example, you can prohibit loop unrolling.
Runtime errors are very similar to those experienced in traditional software development (e.g. incorrect results or premature completion of the program). Of course these problems can be debugged using the profiler and debugger within SDSoC. There is however another issue which can occur and that is the program “hanging.” This occurs when streaming data transfers are mismatched between the producer and the consumer. A “hang” occurs when the producer has stopped producing data while the consumer is waiting for more data. Here’s an example:
The code snippet above shows a simple example which will implement as a streaming connection and hang when executed. Of course the reason is very straightforward. A simple “<” operator in place of a “<=” operator results in only 19 reads while the consumer is expecting 20 reads and will hang waiting for the 20th.
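To make the off-by-one mismatch concrete, here is a minimal software-only model in C. It is a sketch under stated assumptions, not the SDSoC code discussed above: the function names and the array standing in for the stream are hypothetical, and no SDSoC pragmas or data movers are involved. It simply counts how many items the producer delivers against how many the consumer expects.

```c
#define N_SAMPLES 20   /* number of items the consumer expects */

/* Producer with the off-by-one bug: i runs 1..19, so only 19 items are
 * written. Returns the number of items written into stream[]. */
int producer_buggy(int stream[N_SAMPLES])
{
    int count = 0;
    for (int i = 1; i < N_SAMPLES; i++) {    /* BUG: "<" instead of "<=" */
        stream[count++] = i;
    }
    return count;
}

/* Corrected producer: i runs 1..20, so all 20 items are written. */
int producer_fixed(int stream[N_SAMPLES])
{
    int count = 0;
    for (int i = 1; i <= N_SAMPLES; i++) {
        stream[count++] = i;
    }
    return count;
}

/* Number of items the consumer would still be waiting for. In this model,
 * any nonzero value corresponds to the "hang" on a real streaming link. */
int consumer_shortfall(int produced)
{
    return N_SAMPLES - produced;
}
```

In this model the buggy producer leaves a shortfall of one item, which is the item the hardware consumer would wait for indefinitely.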
Performance issues arise when there are problems with the algorithm we have accelerated or with the time taken to transfer data to or from the accelerated function. The next blog in this series will look at how we can debug such applications.
Incidentally I am talking at the Embedded Systems Conference in Santa Clara this week on using lessons learned from VHDL/FPGA space development here on earth. If you see me about please come over and say hello.
Note: Xilinx has announced the public access release of the SDSoC development environment for Zynq SoCs and MPSoCs.
Now, you can have convenient, low-cost Kindle access to the first year of Adam Taylor’s MicroZed Chronicles for a mere $7.50. Click here.
Please see the previous entries in this MicroZed Chronicles series by Adam Taylor:
By David Squires, Vice President of Business Development, BEEcube, A National Instruments Company
As wireless operators continue their relentless march to be the first to provide consumers with new services and devices, additional bandwidth and service plans that yield higher profits, infrastructure companies are also racing to field the 5G equipment that will form the foundation of the next generation of wireless communications. To enable this 5G wireless infrastructure, BEEcube (recently acquired by National Instruments) leveraged Xilinx FPGAs and Zynq-7000 All Programmable SoCs to provide 5G equipment manufacturers with a new emulation system as well as a mobile-handset emulator. The BEE7 and nanoBEE are enabling design teams to be innovative and productive so they can bring 5G technologies to market ahead of the competition.
BEE7 PLATFORM ARCHITECTURE: The BEE7 platform is a state-of-the-art architecture that BEEcube designed from the ground up to meet the above requirements of next-generation communications systems. The amount of data that must be moved quickly and efficiently is enormous. The heart of the BEE7 prototyping system is the Xilinx XC7VX690T. This device combines 80 serial transceivers with 3,600 DSP slices, making the 690T a world-class engine for advanced wireless applications (both to prototype and for early field trials).
The Industrial Internet Consortium (IIC) was founded in March 2014 to bring together the organizations and technologies necessary to accelerate growth of the Industrial Internet by identifying, assembling, and promoting best practices. Members include small and large technology innovators, vertical market leaders, researchers, universities, and governments.
And now the IIC’s membership roster includes Xilinx.
The IIC’s goals include:
The IIC membership list already includes many Xilinx customers and partners and Xilinx has now joined them in this work. The IIoT is driving the fourth wave of the industrial revolution and is one of six key Megatrends identified by Xilinx as critical to shaping the future of many next-generation systems.
MIT researchers presented a new system called BlueDBM at the International Symposium on Computer Architecture in June. BlueDBM uses FPGAs, flash storage, in-store processing, and integrated networks for cost-effective analytics of large datasets, and it could make servers using flash memory as efficient as those using conventional RAM while cutting power consumption for several common big-data applications. According to the paper presented at the Symposium, “a rack-sized BlueDBM system is likely to be an order of magnitude cheaper and less power hungry than a cloud based system with enough DRAM to accommodate 10TB to 20TB of data.”
The prototype system design is based on a Xilinx VC707 Eval Kit (which incorporates a Xilinx Virtex-7 XC7VX485T FPGA) with two custom Artix-7 FPGA boards to control Flash memory arrays. Here’s a photo and block diagram of a BlueDBM storage node:
Photo and Block Diagram of a BlueDBM Storage Node
The paper referenced above contains a lot of performance and power-consumption data for various applications.
Note: Xcell Daily discussed an earlier version of the BlueDBM system last year. See “FPGA+Flash = Big-Data Analytics. Hadoop acceleration anyone?”
The following Xilinx video demonstrates a max-speed, 8-lane JESD204B interface operating between an Analog Devices AD-FMCDAQ2-EBZ high-speed analog FMC module and a Xilinx Kintex UltraScale KCU105 development board. The GTX SerDes ports of both the mid-range Kintex-7 and Kintex UltraScale devices support JESD204B’s maximum line rate of 12.5Gbps per channel and Kintex UltraScale devices support this line rate across all device speed grades.
Note: The Xilinx KCU105 Eval board is on sale at the moment. See “Xilinx Summer Special: Hot sale on KCU105 Eval Kit. $500 off!”
PCIe has been an intra-system connection interface of choice for quite a while now, offering a standardized and well-understood way to move large amounts of data quickly between, for example, a host CPU and an FPGA. Efficient, high-speed PCIe screams for DMA and if that’s what you need, the latest version of RIFFA 2.2 (Reusable Integration Framework For FPGA Accelerators) is now posted on GitHub and includes DMA IP you might want for your current or your next design.
RIFFA 2.2 employs communication channels between software threads running on a CPU and hardware user cores instantiated on an FPGA. A channel is similar to a network socket in that it must be opened before it can be read and written. Then it must be closed. However, unlike a network socket, channel reads and writes can be made simultaneous by using two threads. Each channel is independent and thread-safe. RIFFA 2.2 supports as many as twelve channels.
Here’s a block diagram of the RIFFA hardware/software architecture:
RIFFA Hardware/Software Architecture
But the real measure of a DMA controller is its ability to move data quickly and efficiently. According to the UCSD RIFFA Web page, the latest RIFFA versions are “able to saturate the PCIe link for nearly all link configurations supported.” The following chart shows the performance of designs based on RIFFA 2.1 using the 32 bit, 64 bit, and 128 bit interfaces:
For more detailed information about RIFFA, see the RIFFA 2.2 documentation and these two papers:
Face it. You’re itching to get your fingers on one of those new Xilinx UltraScale FPGAs and try one out. Right? Well, if you’ve waited this long and held your acquisitive nature in check, here’s your reward:
From now until the end of September, you can get the Kintex UltraScale KCU105 Eval Kit at a $500 discount.
Here’s a photo of the board in the kit:
Xilinx Kintex UltraScale KCU105 Evaluation Kit
By Bijan R. Rofoee, Mayur Channegowda, Shuping Peng, George Zervas, and Dimitra Simeonidou
By 2050, the human population will have reached 9 billion people, with 75 percent of the world’s inhabitants living in cities. With already around 80 percent of the United Kingdom’s population living in urban areas, the U.K. needs to ensure that cities are fit for purpose in the digital age. Smart cities can help deliver efficiency, sustainability, a cleaner environment, a higher quality of life and a vibrant economy. To this end, Bristol Is Open (BIO) is a joint venture between the University of Bristol and Bristol City, with collaborators from industry, universities, local communities, and local and national governments. Bristol Is Open (www.bristolisopen.com) is propelling this municipality of a half million people in southwest England to a unique status as the world’s first programmable city.
Bristol will become an open testing ground for the burgeoning new market of the Industrial Internet of Things—that is, the components of the smart-city infrastructure. The Bristol Is Open project leverages Xilinx All Programmable FPGA devices in many areas of development and deployment.
By Mike Santarini, Publisher, Xcell Journal
Six important emerging markets—video/vision, ADAS/autonomous vehicles, Industrial Internet of things, 5G wireless, SDN/NFV and cloud computing—will soon merge into an omni-interconnected network of networks that will have a far-reaching impact on the world we live in. This convergence of intelligent systems will enrich our lives with smart products that are manufactured in smart factories and driven to us safely in smart vehicles on the streets of smart cities—all interconnected by smart wired and wireless networks deploying services from the cloud.
Xilinx Inc.’s varied and brilliant customer base is leveraging Xilinx All Programmable devices and software-defined solutions to make these new markets and their convergence a reality. Let’s examine each of these emerging markets and take a look at how they are coming together to enrich our world. Then we’ll take a closer look at how customers are leveraging Xilinx devices and software-defined solutions to create smarter, connected and differentiated systems in these emerging markets to shape a brilliant future for us all.
IT STARTS WITH VISION: Vision systems are everywhere in today’s society. You can find cameras with video capabilities in an ever-growing number of electronic systems, from the cheapest mobile phones to the most advanced surgical robots to military and commercial drones and unmanned spacecraft exploring the universe. In concert, the supporting communications and storage infrastructure is quickly shifting gears from a focus on moving voice and data to an obsession with fast video transfer.
ADAS’ DRIVE TO AUTONOMOUS VEHICLES: If you own or have ridden in an automobile built in the last decade, chances are you have already experienced the value of ADAS technology. Indeed, perhaps some of you wouldn’t be here to read this article if ADAS hadn’t advanced so rapidly. The aim of ADAS is to make drivers more aware of their surroundings and thus better, safer drivers.
IIOT’S EVOLUTION TO THE FOURTH INDUSTRIAL REVOLUTION: The term Internet of Things has received much hype and sensationalism over the last 20 years—so much so that to many, “IoT” conjures up images of a smart refrigerator that notifies you when your milk supply is getting low and the wearable device that receives the “low-milk” notification from your fridge while also fielding texts, tracking your heart rate and telling time. These are all nice-to-have, convenience technologies. But to a growing number of people, IoT means a great deal more. In the last couple of years, the industry has divided IoT into two segments: consumer IoT for convenience technologies (such as nifty wearables and smart refrigerators), and Industrial IoT (IIoT), a burgeoning market opportunity addressing and enabling some truly major, substantive advances in society.
INTERCONNECTING EVERYTHING TO EVERYTHING ELSE: In response to the need for better, more economical network topologies that can efficiently and affordably address the explosion of data-based services required for online commerce and entertainment as well as the many emerging IIoT applications, the communications industry is rallying behind two related network topologies: software-defined networks and network function virtualization.
SECURITY EVERYWHERE: As systems from all of these emerging smart markets converge and become massively interconnected and their functionality becomes intertwined, there will be more entry points for nefarious individuals to do a greater amount of harm affecting a greater amount of infrastructure and greater number of people. The many companies actively participating in bringing these converging smart technologies to market realize the seriousness of ensuring that all access points in their products are secure. A smart nuclear reactor that can be accessed by a backdoor hack of a $100 consumer IoT device is a major concern. Thus, security at all points in the converging network will become a top priority, even for systems that seemingly didn’t require security in the past.
XILINX PRIMED TO ENABLE CUSTOMER INNOVATION: Over the course of the last 30 years, Xilinx’s customers have become the leaders and key innovators in all of these markets. Where Xilinx has played a growing role in each generation of the vision/video, ADAS, industrial, and wired and wireless communications segments, today its customers are placing Xilinx All Programmable FPGAs, SoCs and 3D ICs at the core of the smarter technologies they are developing in these emerging segments.
Note: This blog post has been excerpted from Mike Santarini’s far more detailed article in the special Megatrends issue of Xcell Journal (Issue 92) that has just been published. To read the full article, click here or download a PDF of the entire issue by clicking here.
By Mike Santarini, Publisher, Xcell Journal
The new special issue of Xcell Journal celebrates the ways in which Xilinx customers are enabling a new era of innovation in six key emerging markets: vision/video, ADAS/autonomous vehicles, Industrial IoT, 5G, SDN/NFV and cloud computing. Each of these segments is bringing truly radical new products to our society. And as the technologies advance over the next few years, the six sectors will converge into a network of networks that will bring about substantive changes in how we live our lives daily.
Vision systems are quickly becoming ubiquitous, having long since evolved beyond their initial niches in security, digital cameras and mobile devices. Likewise undergoing rapid and remarkable growth are advanced driver assistance systems (ADAS), which are getting smarter and expanding to enable vehicle-to-vehicle communications (V2V) for autonomous driving and vehicle-to-infrastructure (V2I) communications that will sync vehicles with smart transportation infrastructure to coordinate traffic for an optimal flow through freeways and cities.
These smart vision systems, ADAS and infrastructure technologies form the fundamental building blocks for emerging Industrial Internet of Things (IIoT) markets like smart factories, smart grids and smart cities—all of which will require an enormous amount of wired and wireless network horsepower to function. Cloud computing, 5G wireless and the twin technologies of software-defined networking (SDN) and network function virtualization (NFV) will supply much of this horsepower.
Converged, these emerging technologies will be much greater than the sum of their individual parts. Their merger will ultimately enable smart cities and smart grids, more productive and more profitable smart factories, and safer travel with autonomous driving.
By Adam Taylor
As we briefly examined last week, while HLS is very powerful, we must create the correct coding structures if the code is to synthesize efficiently. At the highest level, there are a number of rules that we must follow:
The first point is self-explanatory: system calls request services from the operating system, which clearly cannot be synthesized into hardware. Examples of system calls include printf(), fprintf(), time(), and sleep(). If we need to include system calls or other non-synthesizable constructs within the function for use during testing, we can use the __SYNTHESIS__ macro to exclude the non-synthesizable code from synthesis. Of course, we must be very careful not to use this macro in a way that alters the function’s behavior, so that the synthesized design behaves differently from the C simulation.
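As a minimal sketch of this technique, the function below keeps a printf() trace for C simulation and hides it from synthesis with `__SYNTHESIS__`. The function name `scale_samples` and the array size of 8 are hypothetical choices made for illustration only. The guarded code only prints, so the computed result is the same with or without the macro defined.

```c
#include <stdio.h>

/* Hypothetical function to be synthesized: scales each sample by a gain. */
void scale_samples(const int in[8], int out[8], int gain)
{
    for (int i = 0; i < 8; i++) {
        out[i] = in[i] * gain;
#ifndef __SYNTHESIS__
        /* Debug trace: compiled for C simulation, skipped during synthesis.
           It never writes to out[], so the function's behavior is unchanged. */
        printf("out[%d] = %d\n", i, out[i]);
#endif
    }
}
```

Calling `scale_samples(in, out, 3)` from a test bench fills `out` with each input multiplied by three and prints one trace line per sample.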
When writing traditional C functions, it is common to allocate and free system memory using functions such as malloc(), alloc(), and free(). This coding technique poses a problem for synthesis because the memory requirements of the function to be synthesized must be bounded. Unbounded memory cannot be implemented using finite hardware resources.
In this case it is often best to use a user-defined macro—as in the code below—to ensure that the code is the same when synthesized or not:
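A minimal sketch of such a macro follows. The macro name `USE_MALLOC`, the buffer size `N`, and the function `accumulate` are hypothetical names chosen for illustration. With the macro left undefined, the function uses a fixed-size local array; with it defined, the same body runs on heap memory for software-only experiments.

```c
#include <stdlib.h>

#define N 64  /* hypothetical fixed buffer size */

/* Hypothetical function: copies N inputs into a local buffer and sums them. */
long long accumulate(const int din[N])
{
#ifdef USE_MALLOC  /* hypothetical user macro; leave undefined for synthesis */
    int *buf = malloc(N * sizeof(int));
    if (buf == NULL) {
        return 0;
    }
#else
    int buf_storage[N];      /* fixed, bounded storage */
    int *buf = buf_storage;
#endif
    long long sum = 0;
    for (int i = 0; i < N; i++) {
        buf[i] = din[i];     /* the body is identical in both builds */
        sum += buf[i];
    }
#ifdef USE_MALLOC
    free(buf);
#endif
    return sum;
}
```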
The above code snippet demonstrates how we can avoid system calls and use constructs that are fixed and bounded, thus easing synthesis.
One of the most popular constructs in C is the pointer. We can use pointers within code we intend to synthesize; however, we do need to follow some basic rules if they are to be synthesized efficiently. The first and most important rule involves casting. HLS allows us to cast pointers between native C types, but we cannot cast between general types. It is permissible to use pointers within the function parameter list, to use pointer arithmetic, and to use pointers to arrays.
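The permitted pointer uses can be sketched as follows. Both functions and their names (`sum_words`, `byte_sum`) are hypothetical, and whether a particular tool version accepts each construct should be checked against its documentation. The first function takes a pointer parameter and walks it with pointer arithmetic; the second casts between two native C types.

```c
/* Hypothetical function: sums 16 values by walking a pointer. */
int sum_words(const int *data)          /* pointer in the parameter list */
{
    const int *p = data;                /* pointer to the start of an array */
    int sum = 0;
    for (int i = 0; i < 16; i++) {
        sum += *(p + i);                /* pointer arithmetic */
    }
    return sum;
}

/* Hypothetical function: adds up the bytes that make up one word. */
unsigned int byte_sum(unsigned int word)
{
    /* Cast between native C types: unsigned int * to unsigned char *. */
    unsigned char *bytes = (unsigned char *)&word;
    unsigned int sum = 0;
    for (unsigned int i = 0; i < sizeof(word); i++) {
        sum += bytes[i];
    }
    return sum;
}
```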
We also need to ensure that we do not use recursive functions within the function that we wish to accelerate using HLS. A recursive function is one that calls itself either a finite or infinite number of times.
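One common way around this restriction is to rewrite the recursion as a loop. The sketch below uses a factorial purely as an illustration; the function names and the `MAX_N` limit are hypothetical, and the iterative version assumes inputs no larger than `MAX_N`.

```c
#define MAX_N 12  /* 12! is the largest factorial that fits in 32 bits */

/* Recursive form: fine for software, but it calls itself. */
unsigned int factorial_recursive(unsigned int n)
{
    return (n <= 1) ? 1u : n * factorial_recursive(n - 1);
}

/* Iterative form: same result for n <= MAX_N, with a fixed loop bound. */
unsigned int factorial_iterative(unsigned int n)
{
    unsigned int result = 1;
    for (unsigned int i = 2; i <= MAX_N; i++) {
        if (i <= n) {
            result *= i;  /* multiply only while i is within range */
        }
    }
    return result;
}
```

The iterative version always runs the same number of iterations, which also gives the tool a known latency to work with.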
As well as coding styles that enable synthesis, there are also some coding styles that enable better optimization. For example, a loop with variable bounds can prevent optimization because HLS cannot determine the loop latency. We can address this problem using an assert macro in the C code to provide the maximum number of loop iterations. It is worth noting here that we cannot unroll a loop with variable bounds because HLS does not know how much hardware to create. In such cases, we may wish to look into ways to rewrite the function based on a fixed number of iterations.
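A minimal sketch of the assert approach is shown here. The names `sum_first` and `MAX_LEN` are hypothetical. The loop bound `len` is variable, and the assert states the largest value it may take, which is also checked during C simulation.

```c
#include <assert.h>

#define MAX_LEN 32  /* hypothetical upper bound on the loop count */

/* Hypothetical function: sums the first len entries of data. */
int sum_first(const int data[MAX_LEN], unsigned int len)
{
    /* States that len never exceeds MAX_LEN; fails in C simulation if it does. */
    assert(len <= MAX_LEN);

    int sum = 0;
    for (unsigned int i = 0; i < len; i++) {  /* variable-bound loop */
        sum += data[i];
    }
    return sum;
}
```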
Loops provide a great area for optimization. Remember in our previous example, we worked on optimizing a loop to increase the throughput. Using nested loops, we can pipeline either the inner loop (like we did in our example), which will result in the smallest logic footprint, or we can pipeline the outer loop, which will unroll the inner loop and create dedicated hardware for each iteration. Of course, increasing performance demands more resources, which may or may not be allowable depending upon resources available.
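The inner-loop option can be sketched as below, with hypothetical names (`row_sums`, `ROWS`, `COLS`) and an initiation interval of 1 chosen only as an example. An ordinary C compiler ignores the HLS pragma, so the same source still runs in simulation. Moving the pragma up into the outer loop body would request the second option, at the cost of more logic.

```c
#define ROWS 4  /* hypothetical image height */
#define COLS 8  /* hypothetical image width */

/* Hypothetical function: computes the sum of each image row. */
void row_sums(int img[ROWS][COLS], int sums[ROWS])
{
    /* Outer loop over rows. */
    for (int r = 0; r < ROWS; r++) {
        int acc = 0;
        /* Inner loop over columns: pipelined, smallest logic footprint. */
        for (int c = 0; c < COLS; c++) {
#pragma HLS PIPELINE II=1
            acc += img[r][c];
        }
        sums[r] = acc;
    }
}
```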
Loops can only be flattened if they are either perfect or semi-perfect. There’s only one difference between the two: a perfect loop has defined bounds, while a semi-perfect loop allows the outer loop bound to be variable. However, both perfect and semi-perfect loops must also comply with the following constructs:
Having now looked more at coding styles, I will be looking at verification and test benches in the next blog before we start looking at more SDSoC examples.
If you visited Xilinx.com today, you will have noticed a very different representation of Xilinx. The Web site change represents Xilinx’s latest step forward in an ongoing corporate transformation into a new era of offerings. The change also brings focus on six key “Megatrends” that are changing the world we live in:
Xilinx participates in all of these Megatrends and you’ll find a substantial amount of new material about them in the redesigned Xilinx.com Web site. You’ll also discover a significant amount of new information about the design and development solutions that are uniquely Xilinx, based on the company’s All Programmable (hardware, software, I/O programmability) device technology (FPGAs, SoCs, and MPSoCs) and a combination of industry-standard and unique software tools in the growing SDx family of development environments that support rapid, high-level development using Xilinx devices.
You will also discover extensive and intensely interesting coverage of these Megatrends in the latest, just-published edition of Xcell Journal. Click here to read the new edition of Xcell Journal online or here to download the PDF.
Note: If you usually access the Xcell Daily blog using the link on the Xilinx.com home page, it has moved. You’ll now find it under the “About” drop-down tab at the top of every Web page on Xilinx.com. So no matter where you are on the site, Xcell Daily is just a couple of clicks away.
The broadcast video market is in a transition period between the use of proprietary broadcast-specific networks and Ethernet. The proprietary broadcast video network protocols came into existence because, at the time, Ethernet speeds and standards were not up to the demands of live video production. Now they are. Ethernet-based broadcast networks promise to reduce video production costs by allowing the use of standard Ethernet cabling and switches to ship video from multiple cameras at an event to the production and editing suite. But there’s plenty of equipment already in the field that uses the older broadcast video network standards. Two new products developed by Xilinx Premier Alliance Member CoreEL in conjunction with BBC R&D can adapt existing broadcast equipment to Ethernet networks specifically for venues such as small- and medium-scale sporting events and multi-location live events (like elections).