
Need to get a 100G Ethernet data stream into a host Intel CPU? PCIe bifurcation is the answer


Netcope Technologies and CESNET (the Czech Republic's National Research and Education Network) have demonstrated a method for sustaining data transfer from a 100Gbps Ethernet port to a host CPU using two of an FPGA's PCIe Gen3 x8 interfaces in parallel. The technique, called bifurcation, was introduced on Intel's Core i7 CPUs a couple of years ago. Intel's intent was to let the CPU's PCIe x16 port be split so that it could serve two separate tasks, but the technique works just as well in reverse: merging two external PCIe x8 ports into one x16 port. Using bifurcation to build a 100Gbps system with a single FPGA eliminates the need for an additional PCIe switch chip, which saves cost, board space, and approximately 6W of power.
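The arithmetic behind that claim is easy to check. Here is a rough, back-of-the-envelope C sketch based on public PCIe Gen3 parameters (not figures from the experiment): each Gen3 lane signals at 8 GT/s with 128b/130b encoding, so one x8 interface carries roughly 63 Gbps before transaction-layer overhead, and two of them together leave comfortable headroom above a 100Gbps Ethernet stream.

```c
#include <stdio.h>

/* Back-of-the-envelope check (not a measured result): why two PCIe Gen3 x8
 * interfaces are enough to carry a 100GbE stream. */
int main(void)
{
    const double gt_per_lane = 8.0;            /* PCIe Gen3 signaling rate: 8 GT/s per lane */
    const double encoding    = 128.0 / 130.0;  /* 128b/130b line coding                     */
    const int    lanes       = 8;              /* each FPGA hard block is an x8 interface   */
    const int    interfaces  = 2;              /* two hard blocks used in parallel          */

    double gbps_per_if = gt_per_lane * encoding * lanes;   /* ~63 Gbps per x8 link */
    double gbps_total  = gbps_per_if * interfaces;         /* ~126 Gbps for two    */

    printf("one Gen3 x8 interface : %.1f Gbps (before TLP/DLLP overhead)\n", gbps_per_if);
    printf("two Gen3 x8 interfaces: %.1f Gbps\n", gbps_total);
    return 0;
}
```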

 

CESNET and Netcope Technologies conducted a series of experiments to demonstrate the benefits of PCIe bifurcation. The test setup was a custom FPGA card equipped with a Xilinx Virtex-7 H580T 3D FPGA, with two of the FPGA's PCIe x8 hard blocks connected to the card's PCIe x16 connector. The FPGA firmware, working in concert with Linux device drivers, transferred data to a ring buffer located in the PC's RAM, using the two PCIe x8 interfaces in round-robin fashion to fill a single buffer (a software-side sketch of this scheme follows the diagram). The following block diagram illustrates the experimental setup:

 

 

Figure: PCIe bifurcation experiment block diagram
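The blog post doesn't show the actual firmware or driver source, but the round-robin idea can be sketched in a few lines of hypothetical C: two DMA channels (one per PCIe x8 interface) alternately fill fixed-size slots of a single ring buffer in host memory. All names and sizes below (dma_submit, RING_SLOTS, SLOT_BYTES) are illustrative assumptions, not the Netcope firmware or driver API.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

#define RING_SLOTS  8                  /* small ring, just for the printout      */
#define SLOT_BYTES  (64u * 1024u)      /* one grouped-packet burst per ring slot */

/* Stub standing in for "program one DMA descriptor on PCIe interface 'ch'". */
static void dma_submit(int ch, uint64_t dst_off, size_t len)
{
    printf("x8 interface %d -> ring offset 0x%08llx (%zu bytes)\n",
           ch, (unsigned long long)dst_off, len);
}

int main(void)
{
    for (unsigned slot = 0; slot < 2 * RING_SLOTS; slot++) {
        /* Even slots go to interface 0, odd slots to interface 1, so both
         * x8 links stay busy while their data lands in one contiguous ring. */
        int      ch      = slot & 1;
        uint64_t dst_off = (uint64_t)(slot % RING_SLOTS) * SLOT_BYTES;

        dma_submit(ch, dst_off, SLOT_BYTES);
    }
    return 0;
}
```

Because each interface always writes to its own alternating slots, the two links stay load-balanced without any synchronization between them beyond the shared write pointer.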

 

 

A random packet data generator instantiated in the FPGA generated traffic in excess of 100Gbps. The transfer-rate results are shown in the following graph:

 

 

Figure: PCIe bifurcation experiment throughput results

 

 

The DMA engine instantiated in the FPGA groups packets together so that the packet length doesn't affect the raw PCIe throughput; the achieved throughput was 107 Gbps. Note that at least eight CPU cores were needed to scale the software processing to the targeted 100Gbps for the smaller packet sizes.
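The grouping idea can be illustrated with a small, hypothetical C sketch: frames are packed back to back, each prefixed with its length, into a fixed-size burst that is then moved by a single DMA transfer, so PCIe efficiency no longer tracks the frame size. The burst layout and sizes below are assumptions for illustration, not the actual Netcope DMA descriptor format.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define BURST_BYTES 4096u             /* one DMA burst, independent of frame size */

struct burst {
    uint8_t  data[BURST_BYTES];
    uint32_t used;                    /* bytes packed into the burst so far */
};

/* Pack one frame (length header + payload) into the burst.
 * Returns 1 on success, 0 when the burst is full and should be DMA'd out. */
static int burst_pack(struct burst *b, const uint8_t *frame, uint16_t len)
{
    uint32_t need = (uint32_t)sizeof(len) + len;

    if (b->used + need > BURST_BYTES)
        return 0;
    memcpy(b->data + b->used, &len, sizeof(len));          /* frame length  */
    memcpy(b->data + b->used + sizeof(len), frame, len);   /* frame payload */
    b->used += need;
    return 1;
}

int main(void)
{
    struct burst  b = { .used = 0 };
    const uint8_t frame[64] = { 0 };   /* minimum-size Ethernet frame, as a stand-in */
    unsigned      packed = 0;

    while (burst_pack(&b, frame, sizeof(frame)))   /* pack until the burst is full */
        packed++;

    printf("packed %u 64-byte frames into one %u-byte DMA burst\n",
           packed, BURST_BYTES);
    return 0;
}
```

In this sketch, each full burst would correspond to one ring-buffer slot in the earlier round-robin example.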

 

For more information about this experiment, see the Netcope Technologies White Paper: “CESNET and Netcope Technologies demonstrate 100 Gbps transfers over PCIe with a single FPGA.”

 

For more information about the Xilinx Virtex-7 H580T 3D FPGA, here’s a video that discusses the device’s 28Gbps serial transceivers, another critical element in the development of 100Gbps Ethernet systems:

 

 

 

 
