Xilinx, as one of the creators of field-programmable gate array (FPGA) technology for integrated-circuit design, has long embraced high-level synthesis (HLS) as an automated design process that interprets a desired behavior in order to create hardware that delivers that behavior. Xilinx has just introduced a book that clearly explains the process of creating an optimized hardware design using HLS.
The book, “Parallel Programming for FPGAs,” by Stephen Neuendorffer, Principal Engineer at Xilinx, together with Ryan Kastner from UCSD and Janarbek Matai from Cognex, is a practical guide for anyone interested in building FPGA systems. It is of particular value to students in advanced undergraduate and graduate courses. But it can also be useful for system designers and embedded programmers already on the job.
The book assumes the reader has a working knowledge of C/C++ programming -- which is like assuming someone knows how to drive a car with an automatic transmission -- and is familiar with basic computer architecture concepts. The book also includes a significant amount of sample code. Readers are strongly encouraged to fire up Vivado HLS and try the sample code for themselves. Free licenses are available through the Vivado WebPack Edition or a free 30-day trial of the Vivado System Edition.
The book also includes several textbook-like features that make it particularly valuable in a classroom setting. For instance, each chapter poses questions that challenge readers and help solidify their understanding of the material as they read along. There are also associated projects that were developed and used in an HLS class taught at the University of California at San Diego (UCSD). UCSD will make the files for these projects available to instructors upon request. Each project is more or less associated with one chapter in the book and includes reference designs targeting FPGA boards that are distributed through the Xilinx University Program.
As you might expect, the complexity of each project increases as you read along, which means that the book is intended to be read sequentially. With this approach, the reader can see how HLS optimizations apply directly to a specific application, and each application in turn illustrates how to write HLS code. However, the teach-by-example approach has drawbacks. Most applications require some additional background to give the reader a better understanding of the computation being performed, and truly understanding the computation often requires an extensive discussion of the application's mathematical background. That may be off-putting to a reader who just wants to understand the basics of HLS, but Neuendorffer believes that such a deep understanding is necessary to master the code restructuring required to achieve the best design.
Although the chapters in “Parallel Programming for FPGAs” are arranged to be read sequentially and grow in complexity as the reader moves along, a more advanced HLS user who cares only about a particular application domain can read an individual chapter on its own. For example, a reader interested in generating a hardware-accelerated sorting engine can skip ahead to Chapter 10 without necessarily having to read all of the previous chapters.
Xilinx strongly embraces HLS as an effective design process for developing FPGA integrated circuits to build hardware that works smartly and effectively in the fields of automotive, aircraft, satellite and other emerging technology. “Parallel Programming for FPGAs” will be an effective and essential guide for developing such products going forward. Keep it within reach on the desk in your lab.
Figure: Matrix-vector multiplication architecture with a particular choice of array partitioning and pipelining. The pipelining registers have been elided and the behavior is shown at right.
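
The caption above refers to array partitioning and pipelining in a matrix-vector multiplication. As a rough illustration of what such a design might look like in HLS-style C++, here is a minimal sketch. It is not taken from the book: the function name `matrix_vector`, the size `N`, the `int` data type, and the specific partitioning and pipelining choices are all assumptions made for this example. The `#pragma HLS` directives are Vivado HLS directives; an ordinary C++ compiler ignores them, so the function can still be compiled and checked in software.

```cpp
// Hypothetical problem size, chosen only for illustration.
constexpr int N = 4;

// Computes y = A * x for an N x N matrix A and a length-N vector x.
void matrix_vector(const int A[N][N], const int x[N], int y[N]) {
// Assumption: split A along its second dimension and x completely,
// so that every element of one row of A and all of x can be read
// in the same clock cycle.
#pragma HLS ARRAY_PARTITION variable=A complete dim=2
#pragma HLS ARRAY_PARTITION variable=x complete dim=1

    for (int i = 0; i < N; i++) {
// Assumption: pipeline the row loop with a target initiation interval
// of 1. In Vivado HLS, pipelining a loop unrolls the loops nested
// inside it, so the dot product below becomes N parallel multiplies.
#pragma HLS PIPELINE II=1
        int acc = 0;
        for (int j = 0; j < N; j++) {
            acc += A[i][j] * x[j];
        }
        y[i] = acc;
    }
}
```

Under these assumptions the tool would aim to start one row's dot product per clock cycle. Other choices of partitioning factor or pipelined loop trade hardware resources against throughput, and the architecture in the figure reflects one particular such choice, not necessarily this one.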