
Brian Bailey writes about Machine Learning and the CCIX high-speed, cache-coherent, chip-to-chip I/O standard

Xilinx Employee


Brian Bailey has just posted an excellent tutorial article titled “CCIX Enables Machine Learning” on the Semiconductor Engineering Web site. The article discusses the CCIX high-speed, cache-coherent, chip-to-chip I/O standard and its use in machine-learning applications. As the CCIX Consortium Web site states:


“CCIX was founded to enable a new class of interconnect focused on emerging acceleration applications such as machine learning, network processing, storage off-load, in-memory data base and 4G/5G wireless technology. 


“The standard allows processors based on different instruction set architectures to extend the benefits of cache coherent, peer processing to a number of acceleration devices including FPGAs, GPUs, network/storage adapters, intelligent networks and custom ASICs.”


Bailey writes:



“Today, machine learning is based on tasks that have a very deep pipeline. ‘Everyone talks about the amount of compute required, and that is why GPUs are doing well,’ says [Vice President of architecture and verification at Xilinx and chair of the CCIX consortium Gaurav] Singh. ‘They have a lot of compute engines, but the bigger problem is actually the data movement. You may want to enable a model where the GPU is doing the training and the inference is being done by the FPGA. Now you have a lot of data sharing for all of the weights being generated by the GPU, and those are being transferred over to the FPGA for inference. You also may have backward propagation and forward propagation. Forward propagation could be done by the FPGAs, backward by the GPU, but the key thing is still that data movement. They can all work efficiently together if they can share the same data.’”




For more information about CCIX, see:







