Editor’s Note: This content is contributed by Thanaporn Sangpaithoon, General Manager of Design Gateway Co., Ltd.
The Xilinx® UltraScale+™ FPGAs and SoCs with GTY transceivers can support a PCI Express® Gen4 interface. Design Gateway’s NVMe Host Controller IP core uses the GTY transceivers to support the latest PCIe Gen4 NVMe SSDs. The IP core has been implemented on Xilinx’s Virtex® UltraScale+ FPGA VCU118 Evaluation Kit, where it achieves read/write performance of more than 4GB/s.
Implementation of NVMe Host Controller on UltraScale+ GTY Transceiver
Conventionally, an NVMe host is implemented with a host processor that works with a PCIe controller to transfer data to and from the NVMe SSD. The NVMe protocol is implemented in a device driver that communicates with the PCIe controller hardware, a CPU peripheral connected through a very high-speed bus. External DDR memory is required for data buffering and for the command queues used to transfer data between the PCIe controller and the SSD.
UltraScale+ devices with GTY transceivers are capable of PCIe Gen4 interface support. However, a PCIe Gen4 integrated block and Arm® processor are not available on some devices.
Design Gateway solved this problem by developing the NVMeG4-IP core, which runs as a stand-alone NVMe host controller with built-in PCIe soft IP and PCIe bridge logic in a single core. It enables access to NVMe PCIe Gen4 SSDs through a simplified user interface and standard features, so it can be used without knowledge of the NVMe protocol.
The NVMeG4-IP core has the following features:
Implements the application layer, transaction layer, data link layer, and parts of the physical layer to access the NVMe SSD without a CPU or external DDR memory
Operates with the Xilinx PCIe PHY IP configured as 4-lane PCIe Gen4 (256-bit bus interface)
Includes a 256Kb RAM data buffer
Supports six commands: Identify, Shutdown, Write, Read, SMART, and Flush (additional commands can be supported as an option)
User clock frequency must be greater than or equal to the PCIe clock frequency (250MHz for Gen4)
Available reference designs:
ZCU102 with AB17-M2FMC adapter board
KCU105 with AB18-PCIeX16/AB16-PCIeXOVR adapter board
VCU118 with AB18-PCIeX16 adapter board
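The 256-bit bus width and the 250MHz clock requirement listed above can be checked with some quick arithmetic. The sketch below is not part of the IP core. It assumes the standard PCIe Gen4 line rate of 16 GT/s per lane with 128b/130b encoding, and all function and constant names are illustrative. The figures are raw rates before packet and protocol overhead, so real SSD throughput will be lower.

```python
# Raw bandwidth check: 4-lane PCIe Gen4 link vs. a 256-bit bus at 250 MHz.
# All names here are illustrative and not taken from the NVMeG4-IP core.

GEN4_TRANSFERS_PER_S = 16e9   # PCIe Gen4 line rate per lane (16 GT/s)
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding
LANES = 4
BUS_WIDTH_BITS = 256
USER_CLK_HZ = 250e6


def link_bytes_per_s(lanes=LANES):
    """Raw link payload rate in bytes/s, before TLP/DLLP overhead."""
    return GEN4_TRANSFERS_PER_S * lanes * ENCODING_EFFICIENCY / 8


def bus_bytes_per_s(clk_hz=USER_CLK_HZ, width_bits=BUS_WIDTH_BITS):
    """Peak rate of the internal bus: one full-width word per clock cycle."""
    return clk_hz * width_bits / 8


if __name__ == "__main__":
    print(f"link: {link_bytes_per_s() / 1e9:.2f} GB/s")
    print(f"bus:  {bus_bytes_per_s() / 1e9:.2f} GB/s")
    print("bus keeps up with link:", bus_bytes_per_s() >= link_bytes_per_s())
```

Under these assumptions a 256-bit bus at 250MHz moves 8 GB/s, slightly more than the roughly 7.88 GB/s the four Gen4 lanes can carry. That margin is why a user clock slower than the PCIe clock would become the bottleneck.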
FPGA resources on the XCVU9P-FLGA2104-2L FPGA device are shown in the table below.
Example Implementation Statistics for UltraScale+ Devices
Because of its very low FPGA resource usage, the NVMeG4-IP core is also suitable for building a multi-channel RAID system with very high performance and the lowest possible FPGA resource consumption.
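To illustrate how a multi-channel design could spread data over several SSDs, here is a minimal sketch of RAID 0 style striping. The article does not specify a RAID level or describe how Design Gateway's RAID designs map addresses, so the striping scheme, the function name `map_stripe`, and its parameters are all assumptions made for illustration.

```python
# Hypothetical RAID 0 address mapping across several NVMe channels.
# Nothing here is taken from Design Gateway's implementation.

def map_stripe(lba, num_channels, stripe_blocks):
    """Map a logical block address to (channel, block address on that SSD).

    Consecutive stripes of `stripe_blocks` blocks are assigned to the
    channels in round-robin order.
    """
    if lba < 0 or num_channels < 1 or stripe_blocks < 1:
        raise ValueError("invalid striping parameters")
    stripe_index, offset = divmod(lba, stripe_blocks)
    channel = stripe_index % num_channels        # which SSD holds this stripe
    local_stripe = stripe_index // num_channels  # stripe number on that SSD
    return channel, local_stripe * stripe_blocks + offset


if __name__ == "__main__":
    # Four channels, eight blocks per stripe
    for lba in (0, 8, 35):
        print(lba, "->", map_stripe(lba, num_channels=4, stripe_blocks=8))
```

With striping like this, a long sequential transfer keeps every channel busy at once, so aggregate throughput can scale with the number of SSDs for as long as the FPGA has transceivers and logic to spare.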
Implementation and Performance Result on the VCU118
NVMeG4-IP demo environment set up on VCU118. (Image source: Design Gateway)
Example test results from running the demo system on the VCU118 with a 1 TB GIGABYTE AORUS NVMe PCIe Gen4 SSD are shown in the figure below.
NVMe SSD read/write performance on the VCU118 using the GIGABYTE AORUS NVMe PCIe Gen4 SSD. (Image source: Design Gateway)
The NVMeG4-IP core enables the NVMe PCIe Gen4 SSD interface on the VCU118 evaluation kit and on other Xilinx UltraScale+ devices that have GTY transceivers but no PCIe Gen4 integrated block. NVMeG4-IP delivers the highest possible performance with the lowest possible FPGA resource usage for NVMe SSD access without requiring a CPU. It is well suited to high-performance NVMe storage without CPU intervention, and it can implement multiple NVMe SSD interfaces using GTY transceivers without being limited by the number of PCIe integrated blocks available in the FPGA device.
For more details on NVMeG4-IP and the available reference designs, please visit Design Gateway’s website at