As a winter-break project, Edward at MicroCore Labs ported his FPGA-based MCL86 8088 processor core to a vintage IBM PCjr. The MCL86 core combined with the minimum-mode 8088 BIU consumes 1.5% of the smallest Kintex-7 FPGA, the K70T. Four of the Kintex-7 FPGA’s 135 block RAMs hold the processor’s microcode. Edward disabled the cycle accuracy governor in the core and added 128Kbytes of internal RAM using the same Kintex-7 FPGA that he was using to implement the processor. The result: the world's fastest IBM PCjr running Microsoft DOS 2.1.
IBM PCjr sped up by a MicroCore Labs MCL86 processor core implemented in a Kintex-7 FPGA
Will the world beat a path to Edward’s door for fast antiquated personal computers? Probably not. The PCjr, code name “Peanut,” was IBM’s ill-fated attempt to enter the home market with a cost-down version of the original IBM PC. When it was announced on November 1, 1983, the entire market quickly developed a “Peanut” allergy. Adjectives such as “toylike,” “pathetic,” and “crippled” were used to describe the PCjr in the press.
The machine’s worst feature, the one to come under the most criticism, was its “Chiclet” keyboard (named after a chewing gum with a shape similar to the keyboard’s keys). IBM had gone from making the world’s best keyboard on the IBM PC to the world’s worst on the PCjr. After a year and a half of sales that dropped off a cliff as soon as the discounts ended, IBM killed the machine.
So what’s the point of MicroCore’s Franken-Peanut then?
It nicely demonstrates the vast implementation power of even the smallest Xilinx FPGAs. MicroCore Labs’ MCL86 processor core easily fits in low-cost FPGAs from the Spartan-6 and Spartan-7 product lines.
Finally, here’s a very short video of MicroCore’s jazzed-up IBM PCjr playing a little Bach:
It’s been fascinating to watch Apertus’ efforts to develop the crowd-funded Axiom Beta open 4K cinema camera over the past few years. It’s based on On Semi and CMOSIS image sensors and a MicroZed dev board sporting a Xilinx Zynq Z-7030 SoC. Apertus released a Team Talk video last November with Max Gurresch and Sebastian Pichelhofer discussing the current state of the project and focusing on the mechanical and housing aspects of the project.
Here’s a photo from the video showing a diagram of the camera’s electronic board stack including the MicroZed board:
Axiom Beta open-source 4K Cinema Camera Electronic Board Stack
And here’s a rendering of the current thinking for a camera enclosure, which is discussed at length in the video.
Axiom Beta open-source 4K Cinema Camera Housing Concept
If you’re interested in following the detailed thinking behind this complex imaging product, including many of the issues that come with a crowd-funded project like the Axiom Beta camera, watch this video:
Prodigy Kintex UltraScale Proto Package with DDR4, GPIO extension modules
The Kintex UltraScale KU115 FPGA is a DSP monster with 5520 DSP48E2 DSP slices, 1.451 million system logic cells, 75.9Mbits of BRAM, and 52 16.3Gbps GTH serial transceiver ports (48 of which are brought out to connectors on the S2C Prodigy Logic Module), and 832 I/O pins (656 of which are brought out to connectors on the S2C Prodigy Logic Module).
Want that S2C deal? (Of course you do!) Click here.
Better do it fast though, before S2C changes its mind.
Block Diagram of Atria Logic UHD H.264 Codec Solution
Atria Logic’s AL-H264E-4KI422-HW is a hardware-based, feature-rich, low-latency, high-quality, H.264 (AVC) UHD Hi422 Intra encoder IP core. The AL-H264E-4KI422-HW encoder pairs with the Atria Logic AL-H264D-4KI422-HW low-latency decoder IP.
The IP cores’ features include:
Complete modular implementation that you can customize and scale
H.264 Intra-only Hi422 Level 5.1 encoder and decoder
Integrated HDMI2.0 receiver and transmitter subsystems
YUV 4:2:2/4:4:4, RGB support
Very low latency at ~0.3msec
Variable bit rate (VBR) and constant bit rate (CBR) support
Video quality at 0.99 SSIM, or 50dB PSNR, or higher
Video processing subsystem for pre/post processing including color-space conversion, video scaling, and chroma subsampling
Gbps Ethernet streaming output support
When devising a plan for evaluating our UHD Encoder and Decoder IP cores and to meet 4K@60fps performance requirements, we needed a flexible, powerful platform. We settled on the Xilinx ZC706 evaluation kit based on the Zynq Z-7045 SoC because:
The Zynq Z-7045 SoC’s programmable logic can accommodate the encoder and decoder IP logic while meeting our stringent timing requirements to achieve the required performance.
The Zynq SoC’s processing system with its dual-core ARM Cortex-A9 MPCore processor gave us the ability to modify application driver software and to build customizations like an application-specific GUI.
The H.264 encoder supports the H.264 Hi422 (High-422) profile at Level 5.1 (3840x2160p30) for Intra-only coding. Support for 10-bit video content means that there is no grayscale or color degradation in terms of banding. Support for YUV 4:2:2 video content means that there is better color separation—especially noticeable for red colors—which makes images appear sharper. These video-quality attributes are especially important for medical-imaging applications.
Atria Logic UHD H.264 Encoder IP Block Diagram
Support for Intra-only encoding allows the H.264 encoder to operate at frame-rate latencies. A macroblock-line-level pipelined architecture further reduces the latency to the sub-frame level: about 0.3msec. Using a pipelined design that processes 8 pixels/clock allows the design to encode 4k@60fps in real time.
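As a rough sanity check on the 8-pixels-per-clock figure, the sketch below computes the minimum pipeline clock needed for 4K at 60fps and the time that one 16-row macroblock line occupies. It counts active pixels only and ignores blanking and pipeline overhead. The function names are illustrative, and the source does not state the core's actual clock frequency.

```python
MACROBLOCK_ROWS = 16  # H.264 macroblocks cover 16x16 luma samples


def min_pixel_clock_hz(width, height, fps, pixels_per_clock):
    """Lowest clock rate that moves every active pixel of every frame."""
    return width * height * fps / pixels_per_clock


def macroblock_line_time_s(height, fps):
    """Time budget for one macroblock line if lines are spread evenly over a frame."""
    lines_per_frame = height / MACROBLOCK_ROWS
    return 1.0 / (fps * lines_per_frame)


clock = min_pixel_clock_hz(3840, 2160, 60, 8)
mb_line = macroblock_line_time_s(2160, 60)
print(f"minimum clock: {clock / 1e6:.3f} MHz")
print(f"one macroblock line: {mb_line * 1e3:.3f} ms")
```

Under these assumptions a macroblock line lasts a small fraction of a millisecond, so a latency of about 0.3msec corresponds to a few macroblock lines rather than a whole frame.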
Implementation of the Atria Logic H.264 encoder consumes only 78% of the Zynq Z-7045 SoC’s programmable logic and DSP resources and 55% of the available RAM, leaving ample room for other required circuitry.
The H.264 decoder supports the H.264 Hi422 (High-422) profile at Level 5.1 (3840x2160p30) for Intra-only coding. As with the encoder, support for 10-bit video content means that there is no grayscale or color degradation in terms of banding. The decoder also supports YUV 4:2:2 video content. Support for Intra-only decoding using a pipelined architecture allows the decoder to operate at frame-rate latencies.
Atria Logic UHD H.264 Decoder IP Block Diagram
Low latency is important for any closed-loop man/machine application. When the Atria Logic AL-H264E-4KI422-HW encoder is connected to the Atria Logic AL-H264D-4KI422-HW low-latency decoder via an IP network, the glass-to-glass latency is about 0.6msec (excluding transmission latency). That’s about a 2-frame latency.
An efficient implementation of the Atria Logic H.264 decoder only takes up 68% of the Zynq Z-7045 SoC’s programmable logic resources, 35% of available DSP resources, and 45% of the available RAM, leaving ample room for implementation of any other required circuitry.
The design’s HDMI subsystem consists of two major modules: the Xilinx LogiCore HDMI TX and RX subsystems, configured as shown in the figure below:
The HDMI Transceiver (GTX) module employs the Zynq SoC’s high-speed GT transceivers as the HDMI PHY. It transmits and receives the serial HDMI TX and RX data and converts between these serial streams and on-chip parallel data streams as needed.
The TX subsystem consists of the transmitter core, AXI video bridge, video timing controller, and an optional HDCP module. An AXI video stream carries two or four pixels per clock into the HDMI TX subsystem and supports 8, 10, and 12 bits per component. This stream conforms to the video protocol defined in the Video IP chapter of the AXI Reference Guide (UG761). The TX subsystem’s video bridge converts the incoming video AXI-stream to native video and the video timing controller generates the native video timing. The audio AXI stream transports multiple channels of uncompressed audio data into the HDMI TX subsystem. The Zynq Z-7045 SoC’s ARM Cortex-A9 processor controls the HDMI TX subsystem’s transmitter blocks through the CPU interface.
The HDMI RX subsystem incorporates three AXI interfaces. A video bridge converts captured native video to AXI streaming video and outputs the video data through the AXI video interface using the video protocol defined in the Video IP chapter of the AXI Reference Guide (UG761). The video timing controller measures the video timing. Received audio is transmitted through the AXI streaming audio interface. A CPU interface provides processor access to the peripherals’ control and status data.
The HDCP module is optional and is not included in the standard deliverables.
I thought I would kick off the new year with a few blogs that look at the Zynq SoC’s power-management options. These options are important for many Zynq-based systems that are designed to run from battery power or other constrained power sources.
There are several elements of the design we can look at, from the system and board level down to the PS and PL levels inside of the Zynq SoC. At the system level, we can look at component selection. We can use low-voltage devices wherever possible because they will have a lower static power consumption. We can also use lower-power DRAM by selecting components like LPDDR in place of DDR2. One of the simpler choices would be selecting a single-core Zynq SoC as opposed to a dual-core device.
Within the Zynq SoC itself, there are several things we can do both within the PS and PL to reduce power. There are two categories we can consider when it comes to reducing power consumption in Zynq-based systems:
Entering a low-power standby mode in which application execution is stopped. This is achieved by placing the Zynq PS in sleep mode and powering down the PL.
Optimizing the PS and the PL to reduce power during operation.
The first option allows us to reduce the power consumption after we have detected that the system has been inactive for a period and should therefore enter a low-power mode to prolong operating life on a battery charge. The second option allows us to make the best use of the battery capacity while operating. I will demonstrate the savings to be had with the Zynq SoC’s sleep mode and how to enter it in a follow-up blog. For the moment, I want to look at what we can do within the Zynq SoC’s PS to reduce power consumption. Most of these techniques relate to how we configure the clocking architecture within the PS.
As you can see in the diagram below, the Zynq SoC’s clocking architecture is very flexible. We can use this flexibility to reduce the power consumption of the Zynq PS.
Zynq SoC Clocking Architecture
The first approach we can take is to trade off performance against power consumption. We can reduce the power consumption within the Zynq SoC’s PS simply by selecting a lower APU frequency. Of course, this also reduces APU performance. However, as engineers one of our roles is to understand the overall system requirements and balance them. CMOS power dissipation is frequency dependent so we reduce power consumption by reducing the APU frequency, which has the potential to significantly reduce PS power dissipation. We can also use the same trade-off with the DDR SDRAM, trading memory bandwidth for reduced power.
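The frequency dependence can be made concrete with the usual first-order model for CMOS dynamic power, in which power is proportional to switched capacitance, the square of the supply voltage, and clock frequency. The sketch below uses that model to compare two operating points. It is a textbook approximation that ignores static (leakage) power, it says nothing about the Zynq PS's real figures, and the function name and parameters are illustrative.

```python
def dynamic_power_ratio(f_new_hz, f_old_hz, v_new=1.0, v_old=1.0):
    """Ratio of new to old dynamic power, assuming P is proportional to V^2 * f
    with switched capacitance and activity factor unchanged."""
    return (v_new / v_old) ** 2 * (f_new_hz / f_old_hz)


# Example: dropping a clock from 666MHz to 250MHz at the same supply voltage.
ratio = dynamic_power_ratio(250e6, 666e6)
print(f"dynamic power falls to {ratio:.0%} of its original value")
```

Only the clocked portion of the load scales this way, so a measurement taken at board level will typically show a much smaller percentage change than the model suggests for the APU alone.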
Clock Configuration in the Zynq SoC – Reducing the APU frequency
Along with reducing the frequency of the APU, we can also implement a clocking scheme that reduces the number of PLLs used within the PS. The Zynq PS has three PLLs: the ARM, IO, and DDR PLLs. The clocking architecture allows downstream connections to use any one of the PLL sources, so a clocking scheme that uses fewer than three PLLs dissipates less power because the unused PLLs can be disabled and their power consumption eliminated.
In addition, the application being developed may not require the use of all peripherals within the PS. We can therefore use the Zynq SoC’s clock-gating facilities to reduce power consumption by not clocking unused peripherals, further reducing the power consumption of the PS within the Zynq.
I performed a very simple experiment with a MicroZed board by inserting an ammeter into the USB power port supplying power to the MicroZed. This is a simple way to monitor the board’s overall power consumption. Running the Zynq PS alone with no design in the programmable logic, the MicroZed drew a current of 364mA @ 5V (about 1.82W) with the default MicroZed configuration.
I ran a few simple experiments to see the effect on overall power consumption of reducing the APU clock frequency from 666MHz to 250MHz and then of using only one PLL (the DDR PLL) to clock the design. Reducing the frequency of the APU alone lowered the overall current consumption only to 345mA, a 6% reduction. Running from just the DDR PLL reduced the current consumption to 308mA, a 16% reduction. However, I did have to deactivate the unused PLLs myself in my application. So we see that turning off unused PLLs can have a big effect on power consumption.
If we want to gate the clocks to unused peripherals within the PS, we can use the Zynq SoC’s APER register to disable the clocks to those peripherals.
APER Control Register Bits
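A minimal sketch of the read-modify-write used for clock gating: clear the enable bit of each unused peripheral and leave the rest untouched. The bit positions in the table below are assumptions for illustration and must be checked against the APER register description in the Zynq technical reference manual. The helper name is hypothetical, and on real hardware the resulting value would be written back to the register rather than printed.

```python
# Assumed bit positions for a few peripheral clock enables (verify before use).
APER_BITS = {
    "USB0": 2,
    "GEM0": 6,
    "SPI0": 14,
    "CAN0": 16,
    "I2C0": 18,
    "UART0": 20,
    "UART1": 21,
    "GPIO": 22,
}


def gate_unused(reg_value, unused):
    """Return reg_value with the clock-enable bit cleared for each unused peripheral."""
    for name in unused:
        reg_value &= ~(1 << APER_BITS[name])
    return reg_value & 0xFFFFFFFF  # keep the result within 32 bits


current = 0x01FFCCCD  # example register contents, not a measured value
updated = gate_unused(current, ["USB0", "SPI0", "CAN0", "I2C0"])
print(f"0x{current:08X} -> 0x{updated:08X}")
```

Clearing bits with a mask, rather than writing a fixed constant, preserves the enables of any peripheral the application still needs.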
For a final experiment, I relocated the program to execute from the Zynq SoC’s on-chip RAM and disabled the DDR memory. For many applications, this may not be feasible but for some it may, so I thought it worthy of a test. Relocating the code further reduced the current consumption to 270mA (a 26% reduction) when combined with peripheral gating, APU frequency reduction, and running from one PLL alone.
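The current readings quoted in these experiments can be tabulated with a few lines of Python, which also makes it easy to add further data points. The currents and the 5V USB supply come from the text; the configuration labels are shorthand, and the helper names are illustrative. Expect the computed percentages to land within about a point of the rounded figures quoted.

```python
SUPPLY_V = 5.0       # USB supply voltage
BASELINE_MA = 364.0  # default MicroZed configuration

measurements_ma = {
    "APU frequency reduced": 345.0,
    "single PLL (DDR PLL)": 308.0,
    "on-chip RAM, DDR disabled, all techniques": 270.0,
}


def power_w(current_ma, volts=SUPPLY_V):
    """Board power in watts from a current reading in milliamps."""
    return current_ma / 1000.0 * volts


def reduction_pct(current_ma, baseline_ma=BASELINE_MA):
    """Percentage drop in current relative to the baseline reading."""
    return (baseline_ma - current_ma) / baseline_ma * 100.0


for label, ma in measurements_ma.items():
    print(f"{label}: {power_w(ma):.2f} W, {reduction_pct(ma):.1f}% below baseline")
```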
Next time we will look at how we can place the processor into sleep mode.
Be sure to join the Xilinx LinkedIn group to get an update for every new Xcell Daily post!
Steve Leibson is the Director of Strategic Marketing and Business Planning at Xilinx. He started as a system design engineer at HP in the early days of desktop computing, then switched to EDA at Cadnetix, and subsequently became a technical editor for EDN Magazine. He's served as Editor in Chief of EDN Magazine, Embedded Developers Journal, and Microprocessor Report. He has extensive experience in computing, microprocessors, microcontrollers, embedded systems design, design IP, EDA, and programmable logic.