
Partner: The cost-effective 100Gb/s Network & NVMe PCIe Gen4 Storage FPGA development platform by Xilinx’s Kintex UltraScale+ device

xtech-blogs
Xilinx Employee

Editor’s Note: This content is contributed by Thanaporn Sangpaithoon, General Manager at Design Gateway Co., Ltd.

 

Overview

Xilinx® positions the Kintex® UltraScale+™ family, built on TSMC 16nm FinFET technology, as its FPGA with the best price/performance/watt balance. Combined with the new UltraRAM and new interconnect optimization technology (SmartConnect), the device offers a cost-effective solution for applications that require high-end GTY transceivers for 100Gb/s networking and PCIe® Gen4 connectivity, especially networking and data storage applications.

This article demonstrates a 100Gb/s solution that implements the TCP Offload Engine networking IP core (TOE100G-IP) and the NVMe PCIe Gen4 SSD host IP core (NVMeG4-IP) on Xilinx’s KCU116 Evaluation Kit. Both are CPU-less solutions: TOE100G-IP achieves ~12GB/s TCP transmission over a 100GbE interface, and NVMeG4-IP achieves ~4GB/s per SSD.

The KCU116 is ideal for evaluating key Kintex UltraScale+ features. It is equipped with onboard 32-bit DDR4-2666, FMC expansion ports for M.2 NVMe SSDs, and PCIe Gen4 x8 lanes. Its 16 x 28Gb/s GTY transceivers are available for both the PCIe Gen4 and 100GbE interfaces in our demo implementation.

Figure 1: KCU116 Evaluation Kit. (Image source: Xilinx Inc.)

 

Implementation of 100Gb/s Networking & Storage Solutions

Figure 2: 100Gbps networking & storage solution on KCU116

 

Complex networking and NVMe storage protocol processing can be implemented by leveraging Design Gateway’s IP core solutions:

  • TOE100G-IP: 100GbE full TCP protocol stack IP core that needs no CPU.
  • NVMeG4-IP: Standalone NVMe host controller with built-in PCIe Gen4 soft IP.

Both TOE100G-IP and NVMeG4-IP can operate without a CPU, OS, or driver. Designers can implement customized control and data paths for both IPs using pure hardware logic or a bare-metal MicroBlaze system.

Both IPs make the development of high-level applications and algorithms faster and easier, without the need to worry about complex networking and NVMe protocols.

 

Design Gateway’s TOE100G-IP for UltraScale+ Device

Figure 3: TOE100G-IP systems. (Image source: Design Gateway)

The TOE100G-IP core implements the TCP/IP stack in hardwired logic and connects to Xilinx’s 100Gb Ethernet Subsystem, which serves as the lower-layer hardware. The user interface of TOE100G-IP consists of a register interface for control signals and a FIFO interface for data signals. The 100Gb Ethernet Subsystem, provided by Xilinx, includes the EMAC, PCS, and PMA functions and presents a 512-bit AXI4-ST user interface clocked at 322.265625 MHz.
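The interface figures above can be sanity-checked with a little arithmetic. The sketch below multiplies the 512-bit bus width by the 322.265625 MHz user clock to get the raw bandwidth of the AXI4-ST interface. The 25GBASE-R lane rate (25 Gb/s with 64b/66b encoding) is not stated in this article; it is included as an assumption to suggest where the unusual clock value may come from.

```python
# Figures taken from the article text.
BUS_WIDTH_BITS = 512
USER_CLK_HZ = 322_265_625          # 322.265625 MHz
TARGET_LINE_RATE_BPS = 100_000_000_000

# Assumption (not from the article): standard 25GBASE-R lane rate,
# i.e. 25 Gb/s of data carried with 64b/66b line encoding.
LANE_RATE_BPS = 25_000_000_000 * 66 // 64


def raw_bus_bandwidth_bps(width_bits: int, clk_hz: int) -> int:
    """Raw bits per second if one full-width word moves every clock cycle."""
    return width_bits * clk_hz


bus_bps = raw_bus_bandwidth_bps(BUS_WIDTH_BITS, USER_CLK_HZ)
print(f"Raw bus bandwidth : {bus_bps / 1e9:.1f} Gb/s")
print(f"Headroom vs 100GbE: {bus_bps / TARGET_LINE_RATE_BPS:.2f}x")
print(f"Lane rate / clock : {LANE_RATE_BPS / USER_CLK_HZ:.0f}")
```

Under these assumptions the bus has more raw bandwidth than the 100Gb/s line rate, which leaves room for cycles in which a word is only partly filled or the interface is idle.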

TOE100G-IP’s Features

  • Full TCP/IP stack implementation
  • Supports one session per TOE100G-IP (multiple sessions can be implemented by using multiple TOE100G-IPs)
  • Supports both Server and Client modes (Passive/Active open and close)
  • Supports jumbo frames
  • Simple data interface by standard FIFO interface
  • Simple control interface by single-port RAM interface
  • Designed to connect with Xilinx’s 100Gb Ethernet Subsystem

FPGA resource usage on the XCKU5P-2FFVB676E FPGA device is shown in Table 1 below.

Table 1.png

 

Xilinx’s 100Gb Ethernet Subsystem

Xilinx’s 100G Ethernet Subsystem implements the MAC layer and physical layer for 100Gb Ethernet. Its user interface to TOE100G-IP is a 512-bit AXI4-Stream. Xilinx provides the 100G Ethernet Subsystem (Ethernet MAC and Ethernet PCS/PMA) with many features, which are described on the following website.
https://www.xilinx.com/products/intellectual-property/cmac_usplus.html

 

Design Gateway’s NVMe PCIe Gen4 Host Controller for GTY Transceivers

More details of the NVMeG4-IP for GTY transceivers are described on Xilinx’s Adaptable Advantage Blog.

https://forums.xilinx.com/t5/Adaptable-Advantage-Blog/Partner-NVMe-PCIe-Gen4-Host-Controller-Core-by-Leveraging-Xilinx/ba-p/1144836

FPGA resource usage for NVMeG4-IP implementation is shown in Table 2 below.

Table 2.png

 

Example TOE100G-IP implementation & performance result on KCU116

Figure 4 shows an overview of the reference design based on the KCU116 that demonstrates the TOE100G-IP implementation. The demo system includes a bare-metal MicroBlaze system, user logic, and Xilinx’s 100Gb Ethernet Subsystem.

Figure 4: TOE100G-IP demo systems block diagram. (Image source: Design Gateway)

The demo system is designed to evaluate TOE100G-IP operation in both Client and Server modes. The test logic sends and receives data with a test pattern at the highest possible data speed on the user interface side. For the 100GbE interface on the KCU116, 4 x SFP+ transceivers (25GBASE-R) and a fiber cable are required, as shown in Figure 5.

Figure 5: TOE100G-IP demo environment set up on KCU116. (Image source: Design Gateway)

An example FPGA-to-FPGA performance test result, comparing 100G with the other speeds (1G/10G/25G/40G), is shown in Figure 6.

Figure 6: TOE100G-IP performance comparison with 1G/10G/25G/40G on KCU116. (Image source: Design Gateway)

The test result demonstrates that TOE100G-IP is capable of achieving ~12GB/s TCP transmission speed. 
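To put the ~12GB/s figure in context, the following sketch estimates the upper bound on TCP payload throughput over a 100Gb/s link from per-frame protocol overhead alone. It is a rough model under stated assumptions, not a description of the test setup: the MTU values are illustrative (the article does not say which frame size was used), the headers are plain IPv4 and TCP with no options or VLAN tag, and acknowledgment traffic is ignored because it travels in the opposite direction of a full-duplex link.

```python
LINE_RATE_BPS = 100_000_000_000  # 100GbE line rate

# Per-frame overhead in bytes for standard Ethernet framing.
PREAMBLE_SFD = 8
ETH_HEADER = 14
FCS = 4
INTER_FRAME_GAP = 12

# Assumption: IPv4 and TCP headers without options.
IPV4_HEADER = 20
TCP_HEADER = 20


def tcp_payload_rate_gbytes(mtu: int) -> float:
    """Upper bound on TCP payload throughput in GB/s for a given MTU."""
    payload = mtu - IPV4_HEADER - TCP_HEADER
    on_wire = mtu + PREAMBLE_SFD + ETH_HEADER + FCS + INTER_FRAME_GAP
    return LINE_RATE_BPS / 8 * payload / on_wire / 1e9


# Hypothetical MTU values: a standard frame and a common jumbo-frame size.
for mtu in (1500, 9000):
    print(f"MTU {mtu}: up to {tcp_payload_rate_gbytes(mtu):.2f} GB/s of TCP payload")
```

With either frame size the bound lands close to 12GB/s, so the reported result is consistent with a link running near line rate.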

 

 

Example of NVMeG4-IP implementation & performance result on KCU116

Figure 7 shows an overview of the reference design based on the KCU116 that demonstrates a single-channel (1CH) NVMeG4-IP implementation. Multiple instances of NVMeG4-IP can be implemented to achieve higher storage performance if the customized design leaves enough FPGA resources available.

Figure 7: NVMeG4-IP reference design overview. (Image source: Design Gateway)
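The multi-instance scaling mentioned above can be estimated with a simple idealized model. The sketch below assumes that throughput adds linearly across NVMeG4-IP instances, each driving its own SSD at the ~4GB/s quoted in this article, and asks how many SSDs would be needed to keep up with the ~12GB/s network stream. The `efficiency` parameter is a hypothetical derating factor, and the model ignores FPGA resource, transceiver, and PCIe lane limits on the board.

```python
import math

# Approximate figures quoted in the article.
PER_SSD_GBYTES = 4.0    # ~4GB/s per SSD with NVMeG4-IP
NETWORK_GBYTES = 12.0   # ~12GB/s TCP throughput with TOE100G-IP


def aggregate_storage_rate(num_ssds: int, per_ssd: float = PER_SSD_GBYTES,
                           efficiency: float = 1.0) -> float:
    """Idealized striped throughput in GB/s across num_ssds drives."""
    return num_ssds * per_ssd * efficiency


def ssds_needed(target: float, per_ssd: float = PER_SSD_GBYTES,
                efficiency: float = 1.0) -> int:
    """Smallest number of SSDs whose combined rate reaches the target."""
    return math.ceil(target / (per_ssd * efficiency))


print("SSDs needed at ideal scaling:", ssds_needed(NETWORK_GBYTES))
print("SSDs needed at 90% efficiency:", ssds_needed(NETWORK_GBYTES, efficiency=0.9))
```

Real designs would need to confirm these numbers by measurement, since sustained SSD write speed and striping overhead vary by drive and workload.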

The demo system writes and verifies data with the NVMe SSD on the KCU116. The user controls the test operation through a Serial console. For the NVMe SSD to interface with the KCU116, an AB18-PCIeX16 adapter board is required, as shown in Figure 8.

Figure 8: NVMeG4-IP demo environment set up on KCU116. (Image source: Design Gateway)

The example performance test result is shown in Figure 9.

Figure 9: NVMe SSD read/write performance on KCU116 by using Aorus NVMe PCIe Gen4 SSD. (Image source: Design Gateway)

 

Conclusion

Both the TOE100G-IP and NVMeG4-IP cores use the 100Gb/s and PCIe Gen4 connectivity of the KCU116 board to implement networking and NVMe storage applications. TOE100G-IP is capable of ~12GB/s point-to-point TCP transmission over 100GbE, while NVMeG4-IP can provide very high-performance storage at ~4GB/s per SSD. Storage performance can be increased further by a RAID implementation.

Together, these capabilities open up new opportunities for advanced system-level solutions such as sensor data capture, onboard computation, and AI-based edge computing devices.

For more details on TOE100G-IP and NVMeG4-IP, including datasheets, available reference designs, and demo environment setup, please visit Design Gateway’s website at

https://dgway.com/TOE100G-IP_X_E.html

https://dgway.com/NVMeG4-IP_X_E.html