06-18-2010 10:47 PM
I want to know the exact transfer rate of the PCIe core that CORE Generator produces.
I can only achieve a disappointing rate of about 6 MB/s!
Is this expected?
What should I do to get better throughput?
I am using an HTG-V5-PCIE board with a Xilinx SX95T FPGA. As I mentioned, I use the PCIe code generated by CORE Generator (ISE 11.2).
06-21-2010 06:57 AM
How are you measuring your throughput?
Performance always depends on the efficiency of both devices on a PCI Express link. Parameters like payload size, flow control credit availability and the various latencies strongly influence the overall result. The number of input factors makes it very difficult to give a precise estimate of real-life performance.
You need to fully understand the parameters mentioned above before you can begin to understand what your system is capable of.
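As a rough illustration of how payload size alone bounds throughput, here is a minimal sketch. It assumes a Gen1 link (2.5 GT/s per lane with 8b/10b encoding), a 3DW header and no ECRC, and it ignores DLLP traffic such as ACKs and flow-control updates, so real numbers will be lower. The function name is made up for this example.

```python
# Rough upper bound on PCIe Gen1 write throughput versus payload size.
# Assumes 8b/10b encoding (2 Gb/s usable per lane) and a 3DW header
# with no ECRC; DLLP traffic (ACKs, flow-control updates) is ignored.

LANE_BYTES_PER_SEC = 250_000_000  # 2.5 GT/s * 8/10, divided by 8 bits

# Per-TLP overhead in bytes: framing (2) + sequence number (2)
# + 3DW header (12) + LCRC (4)
TLP_OVERHEAD = 2 + 2 + 12 + 4


def max_write_throughput(lanes: int, payload_bytes: int) -> float:
    """Return an upper bound in MB/s for back-to-back memory writes."""
    efficiency = payload_bytes / (payload_bytes + TLP_OVERHEAD)
    return lanes * LANE_BYTES_PER_SEC * efficiency / 1e6


if __name__ == "__main__":
    for payload in (4, 64, 128, 256):
        print(f"x1, {payload:3d}-byte payload: "
              f"{max_write_throughput(1, payload):6.1f} MB/s")
```

Under these assumptions a 4-byte payload can use only about a sixth of the link, while a 128-byte payload can use roughly 86% of it.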
Hope this helps.
06-21-2010 09:08 AM
PIO is slow by nature, plus the host CPU needs to manage data transfer.
To get good throughput you need to perform DMA transfers. To do so, your board must act as a PCIe bus master. There is an application note with a free reference design: XAPP1052.
I managed to achieve about 1 Gb/s on an x1 PCIe design using a low-cost and fairly old motherboard.
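A rough model of why PIO is so slow, assuming the host issues 1-DW reads strictly one at a time and waits for each completion. The round-trip latencies are illustrative guesses, not measurements from any board in this thread.

```python
# Why PIO reads are slow: each 1-DW read is non-posted, so the CPU
# waits for the completion before issuing the next one.
# The round-trip latencies below are illustrative assumptions only.

def pio_read_throughput(round_trip_ns: float, bytes_per_read: int = 4) -> float:
    """Return MB/s when reads are issued strictly one at a time."""
    reads_per_sec = 1e9 / round_trip_ns
    return reads_per_sec * bytes_per_read / 1e6


if __name__ == "__main__":
    for latency_ns in (500, 700, 1000):
        print(f"{latency_ns} ns round trip: "
              f"{pio_read_throughput(latency_ns):.1f} MB/s")
```

With a round trip of several hundred nanoseconds the model lands in the single-digit MB/s range, the same order of magnitude as the roughly 6 MB/s reported at the start of the thread. The thread does not say whether that figure came from reads or writes.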
06-22-2010 02:24 AM
06-23-2010 01:04 AM
Thanks for your answer. We followed the XAPP1052 instructions, but we cannot get DMA to work properly.
Did you test your board with PIO too? What was the result?
We use ISE 11.2 and its CORE Generator. What should we do to get the DMA to work?
And finally, can you implement a design with the PCIe core and some other logic in its application block?
06-23-2010 01:32 AM
I also tested a PIO example on my board. It works, but very slowly: around the throughput that you mentioned previously.
I think ISE 11.2 is enough to build XAPP1052 (but I'm not sure; the minimum ISE version is specified in the application note).
There are some tricks to follow to get the design working (generate the top level as Verilog, not VHDL, and so on; you need to search this forum).
I got it working on a Linux PC; I did not try Windows. I used the Xilinx throughput application to verify that it works.
Then I modified the driver and the HDL design to build my own application.
Building an application from XAPP1052 is fairly simple, but you first need to understand the BMD design, and then what happens in the driver running on the host (setting the DMA addresses using PIO accesses, then launching the DMA, and so on). It takes time and skills in HDL and drivers.
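The host-side flow described above (program the DMA with PIO writes, start it, wait for completion) can be sketched roughly as follows. This is only an illustration: `FakeEndpoint` is an in-memory stand-in for the mapped BAR, and every register name, offset and bit position is a placeholder rather than the actual XAPP1052 register map.

```python
# Sketch of the host-side sequence: program the DMA through PIO register
# writes, start it, then poll for completion. FakeEndpoint stands in for
# the mapped BAR0; register names and offsets are placeholders, NOT the
# XAPP1052 register map.

REG_CTRL = 0x00       # hypothetical: bit 0 = start, bit 8 = done
REG_ADDR = 0x04       # hypothetical: host buffer bus address
REG_TLP_SIZE = 0x08   # hypothetical: payload size per TLP, in DWORDs
REG_TLP_COUNT = 0x0C  # hypothetical: number of TLPs to send
REG_PERF = 0x10       # hypothetical: clock cycles the transfer took

START, DONE = 0x1, 0x100


class FakeEndpoint:
    """In-memory model of the endpoint registers, for illustration."""

    def __init__(self):
        self.regs = {}

    def write32(self, offset, value):
        self.regs[offset] = value
        if offset == REG_CTRL and value & START:
            # Pretend the transfer completes at once, one DWORD per cycle.
            self.regs[REG_PERF] = (self.regs[REG_TLP_SIZE]
                                   * self.regs[REG_TLP_COUNT])
            self.regs[REG_CTRL] = value | DONE

    def read32(self, offset):
        return self.regs.get(offset, 0)


def run_dma(dev, bus_addr, tlp_size_dw, tlp_count, max_polls=1000):
    """Program, start and wait for one DMA transfer; return cycle count."""
    dev.write32(REG_ADDR, bus_addr)
    dev.write32(REG_TLP_SIZE, tlp_size_dw)
    dev.write32(REG_TLP_COUNT, tlp_count)
    dev.write32(REG_CTRL, START)
    for _ in range(max_polls):
        if dev.read32(REG_CTRL) & DONE:
            return dev.read32(REG_PERF)
    raise TimeoutError("DMA did not complete")


if __name__ == "__main__":
    cycles = run_dma(FakeEndpoint(), 0x1000_0000, tlp_size_dw=32, tlp_count=64)
    print("cycles reported:", cycles)
```

In a real driver the register accesses would go through the mapped BAR and the buffer address would come from the kernel's DMA allocation; the point here is only the ordering of the steps.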
Best regards and good luck
06-23-2010 01:45 AM
Many thanks for your answers. We tested the DMA on our board under Linux, but after setting the register values with PIO, the DMA process does not start: the dmacsr value does not change, and the performance register, which should show the number of trn_clk cycles, remains zero. We used the Verilog top-level files.
What should we do to get the DMA code running on the board?
We have an HTG-V5-PCIE board with an SX95T FPGA.
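For reference, a cycle count like the one the performance register is supposed to hold converts to throughput as sketched below. The 250 MHz clock and the example counts used in the note that follows are assumptions for illustration; the trn_clk frequency depends on how the core was configured.

```python
# Convert a transfer's clock-cycle count into throughput.
# The clock frequency is a parameter because it depends on the core
# configuration; no value here is taken from XAPP1052.

def throughput_mbps(tlp_count: int, tlp_size_dw: int,
                    cycles: int, clk_hz: float) -> float:
    """Return payload throughput in Mbit/s for one DMA transfer."""
    if cycles == 0:
        raise ValueError("cycle counter is zero: the DMA never ran")
    payload_bits = tlp_count * tlp_size_dw * 4 * 8  # 4 bytes per DWORD
    seconds = cycles / clk_hz
    return payload_bits / seconds / 1e6
```

For example, 1024 TLPs of 32 DWORDs completing in 20000 cycles of an assumed 250 MHz clock would give `throughput_mbps(1024, 32, 20000, 250e6)`, about 13107 Mbit/s. A counter that stays at zero means no transfer took place, which is why the function refuses to compute a rate in that case.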
06-23-2010 08:19 AM
Normally you don't have to do anything beyond what is explained in XAPP1052.
You can find a zip file containing the BMD design on the Xilinx website. Follow XAPP1052 to integrate the BMD design with the files generated by CORE Generator.
Then you have to compile the Linux driver and the test application.
Once these two things are done, the test application should work. If it does not, do not try to go any further.
For me this went quite quickly (one or two days, including reading this forum and XAPP1052, plus synthesis time).
If you still have problems, maybe you can post the log files generated by the Xilinx test application.
06-24-2010 03:49 AM
As I mentioned earlier, we use a Virtex-5 SX95T FPGA on an HTG-V5-PCIE board. XAPP1052 and the BMD files do not cover this device. What should we do to make this design work?
Which files should we add to the project? I mean, do we have to create the ISE project manually?
06-24-2010 05:04 AM
XAPP1052 targets the ML555 board. Normally you just have to modify the .ucf file to fit your board.
The XAPP1052 design files come with a Perl script that helps with the synthesis process.
But I made my own ISE project because, if I remember correctly, the script failed.
To get the design working I had to define some constants manually (Verilog `define).
Yes, it is indeed not really simple to do. But you should succeed if you understand the design and carefully read the synthesis error and warning messages in ISE.
06-27-2010 11:53 PM
Thanks again for your answers.
I have some questions.
1- Do I have to add the PIO files, which are already in the PCIe core design?
2- Which of these modules (PCI_exp_64b_app) is the app block in the design:
a) the app module in the CORE Generator code?
b) the app module in the DMA files that can be downloaded from the Xilinx site?
3- For a V5 (SX95T) FPGA, which file should be added for the app block? v5_blk_plus_exp_64b_app.v, or something else?
4- During the DMA process, where is the data written? Into a block RAM inside the core? Or do I have to add DDR2 memory or something like that?
5- If I want to save some data to a block RAM inside the PCIe core, what should I do? Can I have more BARs than just BAR0 when generating the PCIe core?
6- After all this, I cannot implement the whole design! How can I get a better understanding of the modules and the relations between them?
06-29-2010 11:42 PM
We finally got the BMD design to work!
We get 14000 Mbps on x8.
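To put that figure in context, a measured rate can be compared with the raw link bandwidth. A minimal sketch, assuming a Gen1 link (2.5 GT/s per lane, 8b/10b encoded, so 2000 Mbit/s usable per lane); the function name is made up for this example.

```python
# Compare a measured rate with the raw Gen1 link bandwidth.
# Assumes a Gen1 link (2.5 GT/s per lane, 8b/10b encoded).

GEN1_LANE_MBPS = 2000  # usable Mbit/s per lane after 8b/10b


def link_utilisation(measured_mbps: float, lanes: int) -> float:
    """Return the measured rate as a fraction of raw link bandwidth."""
    return measured_mbps / (lanes * GEN1_LANE_MBPS)


if __name__ == "__main__":
    print(f"x8 at 14000 Mbps: {link_utilisation(14000, 8):.1%}")
    print(f"x1 at 1000 Mbps:  {link_utilisation(1000, 1):.1%}")
```

Under that assumption, 14000 Mbps on x8 is 87.5% of the raw link rate, and the 1 Gb/s x1 result mentioned earlier in the thread is 50%.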
Now I have another question:
How can I get at the transferred data? I mean, where is the DMA-transferred data saved? In a buffer inside the endpoint? Where is it?
Can I have a BAR other than BAR0 to hold my PIO-transferred data?
07-13-2010 04:00 AM
03-28-2014 07:52 PM
I have a problem with XAPP1052: I want to add an SDRAM to this project, but I don't know how to add it.
Do you have any suggestions?
Thanks in advance.
03-29-2014 09:34 AM
Have a look at the below application note
The design illustrates how to create an 8-lane endpoint design with an interface to a DDR2 memory.
I hope the information helps.
03-31-2014 11:51 AM
Is there an equivalent of XAPP1052 for the 7 series parts, particularly the ZC706? I've been trying to modify the PCIE_TRD, but almost any change I make seems to make the design non-functional. I'm wondering whether they didn't intend for you to use the TRD as a baseline even if you've purchased the full NWL IP.
Any help is appreciated.
04-01-2014 12:00 AM
If you want to run a throughput test with a PCIe or similar core, you must have DMA logic inside the design.
PIO is just a simple design aimed at initial bring-up of the system; it demonstrates a few small (1 DW) reads and writes.
For 7 series, the Targeted Reference Designs are best suited for measuring throughput.
There is one for the VC709 as well.
You can make modifications, as all the wrappers are available.
I hope the above information helps.
04-01-2014 08:32 AM
This information both helps and doesn't help. Perhaps I'm doing something wrong in my procedure for modifying the TRD, but any change I make to the design causes the HDMI demo to stop working. I think this has something to do with the FSBL and U-Boot, which I haven't been updating along with my hardware changes. Is this correct, or should the only file I need to modify be zc706_pcie_trd.bin?