Visitor dooheon0527
Registered: ‎09-17-2018

In my system, bandwidth to FPGA DRAM through a PCIe BAR is very low

Hi guys,

I implemented a PL memory module that is connected to a desktop host over PCIe.

The configuration is:

Host - PCIe - |  Xilinx PCIe core - AXI Interconnect - MIG - DRAM  |

I set up two BAR regions in the PCIe core. The second BAR region is mapped directly to the AXI address range. It is byte-addressable, and the access unit is 8 bytes.

That is the FPGA hardware setup.


On the software side, I wrote a cdev (character device) driver that accesses the PCIe BAR2 MMIO region via ioremap().



But when an application accesses the FPGA's memory through the cdev driver, the bandwidth is very low: reads run at about 4 MB/s and writes at about 7 MB/s.

Converted to latency, that is about 2 us per read and about 1.3 us per write.


Is this typical performance?

Or did I make some mistake?

If anyone has implemented this system, or a similar one, please share what performance you see.


Thank you.


Xilinx Employee
Registered: ‎08-06-2008

Re: In my system, bandwidth to FPGA DRAM through a PCIe BAR is very low

Performance can be limited by a variety of system-level bottlenecks, not only by the IP itself. Please refer to the following document, which describes possible causes of performance degradation.


For better performance, you might want to consider using the XDMA IP. The link below describes some performance numbers that can be achieved with this IP.


To investigate where exactly the bottleneck is, and to rule out whether it is coming from the FPGA, you could check the data read requests and the corresponding completions at different interfaces, such as the memory interface and the PCIe IP user interface. A protocol link analyzer, if you have one, would be helpful too. As a quick test, you could try BRAM instead of DDR and see whether performance changes.

As another test to narrow down whether the issue is in the hardware or at the software level, you could run a simulation of your design and measure the latency of the overall read path. Checking on a different system could also give some helpful clues.