mpul
Visitor
2,858 Views
Registered: ‎04-18-2018

CDMA can't read data from PC over PCIe

Hi.

 

We are trying to transfer data between a board and a PC in both directions over PCIe. The board is 32-bit and the PC is 64-bit. Transfers from the board to the PC work properly, but we are having problems in the opposite direction. This is our setup.

 

The AXI Memory Mapped to PCI Express IP core doesn't support 64-bit addressing directly. Instead, the upper address bits must be written to a special translation register inside the core, and the lower bits are then supplied on the AXI bus. To work around this limitation, each entry in the original S/G list becomes two entries in the adapted S/G list. The first entry, called the address descriptor, only copies the upper address bits into the translation register. The second entry, the data descriptor, actually copies the 4 KB of data over the PCIe bus.
When building a descriptor pair, we set it up as follows, depending on whether we are sending data to or receiving data from PC RAM:

 

static void make_desc_pair(
    sg_desc_t*      addr_desc,
    pc_sg_desc_t*   pc_sg_desc,
    u32             data_src,
    u32             direction
    )
{
    sg_desc_t* data_desc;

    data_desc = addr_desc + 1;

    /* Copy the upper address bits */
    addr_desc->next_desc = (u32)data_desc;
    addr_desc->src_addr = (u32)&pc_sg_desc->addr_high;
    addr_desc->dest_addr = (u32)AXI_BAR1_HIGH;
    addr_desc->control = 8;

    /* Copy 4KB of actual data */
    data_desc->next_desc = (u32)(data_desc + 1);
    if (direction == FROM_BOARD_TO_PC) // From 32 bit (board) to 64 bit (PC).
    {
        data_desc->src_addr = data_src;
        data_desc->dest_addr = BAR1_START + (pc_sg_desc->addr_low & PHYS_ADDR_MASK1);
    }
    else // From 64 bit (PC) to 32 bit (board); data_src is the board-side buffer here.
    {
        data_desc->dest_addr = data_src;
        data_desc->src_addr = BAR1_START + (pc_sg_desc->addr_low & PHYS_ADDR_MASK1);
    }

    data_desc->control = 4096;
}

This is the only part of the code that determines whether we are sending data from or receiving data into the board's RAM. The rest of the mechanism remains the same. Unlike the official documentation, which suggests using BRAM for the SG descriptors, we store our descriptors in RAM. Initially we allocate and fill the memory required for them. The addresses used in the code above are defined as:

 

#define BAR1_START      XPAR_AXIPCIE_0_AXIBAR_1
#define BAR1_END        XPAR_AXIPCIE_0_AXIBAR_HIGHADDR_1

#define AXI_BAR1_HIGH ((volatile u32*)(XPAR_AXIPCIE_0_BASEADDR + 0x210))
#define AXI_BAR1_LOW ((volatile u32*)(XPAR_AXIPCIE_0_BASEADDR + 0x214))

#define PHYS_ADDR_MASK1 (BAR1_END - BAR1_START)

typedef struct {
    u32 next_desc;  /* 0x00 */
    u32 na1;        /* 0x04 */
    u32 src_addr;   /* 0x08 */
    u32 na2;        /* 0x0C */
    u32 dest_addr;  /* 0x10 */
    u32 na3;        /* 0x14 */
    u32 control;    /* 0x18 */
    u32 status;     /* 0x1C */
} __aligned(64) sg_desc_t;

typedef struct {
    u32 addr_high;
    u32 addr_low;
} __aligned(16) pc_sg_desc_t;
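
The pairing scheme above can be modelled on a host PC to check how a whole chain is linked. This is only a minimal sketch under assumptions: `build_chain`, `descs_bus`, `pc_bus` and `board_buf` are hypothetical names, the BAR1 window is the range quoted in this thread, and the value of `AXI_BAR1_HIGH` is a made-up placeholder for the translation register address. Bus addresses are computed as plain integers instead of pointer casts so that the model also runs on a 64-bit machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* BAR1 window as quoted in this thread; AXI_BAR1_HIGH is a hypothetical
 * placeholder for XPAR_AXIPCIE_0_BASEADDR + 0x210. */
#define BAR1_START       0x81000000u
#define BAR1_END         0x817FFFFFu
#define PHYS_ADDR_MASK1  (BAR1_END - BAR1_START)
#define AXI_BAR1_HIGH    0x50000210u
#define CHUNK_SIZE       4096u

enum direction { FROM_BOARD_TO_PC, FROM_PC_TO_BOARD };

typedef struct {
    _Alignas(64) uint32_t next_desc;  /* 0x00 */
    uint32_t na1;                     /* 0x04 */
    uint32_t src_addr;                /* 0x08 */
    uint32_t na2;                     /* 0x0C */
    uint32_t dest_addr;               /* 0x10 */
    uint32_t na3;                     /* 0x14 */
    uint32_t control;                 /* 0x18: byte count */
    uint32_t status;                  /* 0x1C */
} sg_desc_t;

typedef struct {
    _Alignas(16) uint32_t addr_high;
    uint32_t addr_low;
} pc_sg_desc_t;

/* Fills 2 * n descriptors for n PC entries and returns the bus address of
 * the last (tail) descriptor. descs_bus and pc_bus are the 32-bit bus
 * addresses at which the DMA engine sees descs[] and pc[]. The caller is
 * expected to zero descs[] beforehand and to fix up the final next_desc. */
static uint32_t build_chain(sg_desc_t *descs, uint32_t descs_bus,
                            const pc_sg_desc_t *pc, uint32_t pc_bus,
                            size_t n, uint32_t board_buf, enum direction dir)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        sg_desc_t *addr_desc = &descs[2 * i];
        sg_desc_t *data_desc = addr_desc + 1;
        uint32_t addr_bus = descs_bus + (uint32_t)(2 * i * sizeof(sg_desc_t));
        uint32_t data_bus = addr_bus + (uint32_t)sizeof(sg_desc_t);
        uint32_t window = BAR1_START + (pc[i].addr_low & PHYS_ADDR_MASK1);
        uint32_t local = board_buf + (uint32_t)i * CHUNK_SIZE;

        /* Address descriptor: copy 8 bytes (high and low word) into the
         * translation registers. */
        addr_desc->next_desc = data_bus;
        addr_desc->src_addr = pc_bus + (uint32_t)(i * sizeof(pc_sg_desc_t));
        addr_desc->dest_addr = AXI_BAR1_HIGH;
        addr_desc->control = 8;

        /* Data descriptor: move one 4 KB chunk through the BAR1 window. */
        data_desc->next_desc = data_bus + (uint32_t)sizeof(sg_desc_t);
        data_desc->src_addr = (dir == FROM_BOARD_TO_PC) ? local : window;
        data_desc->dest_addr = (dir == FROM_BOARD_TO_PC) ? window : local;
        data_desc->control = CHUNK_SIZE;
    }
    return descs_bus + (uint32_t)((2 * n - 1) * sizeof(sg_desc_t));
}
```

In this model every data descriptor sits directly behind its address descriptor, so a correct transfer depends on the engine finishing the 8-byte register write before it issues the first access of the following 4 KB transfer.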

 

Address list in Vivado:

memory map.jpg

 

PCIe block design in Vivado:

PCIe block design.jpg

 

For our debug test cases, we probed the read and write channels on the connection between pin M01_AXI of CDMA_INTERCONNECT and pin S_AXI of AXI_PCIE.

 

In the record direction (from board to PC), we captured the data and address fields, using the read and write handshakes as triggers (rvalid = ‘1’ AND rready = ‘1’; wvalid = ‘1’ AND wready = ‘1’). The wready/wvalid trigger fired and the captured values were as predicted, which means the data was transferred correctly.

 

In the playback direction (from PC to board), we captured the same data and address fields with the same triggers (rvalid = ‘1’ AND rready = ‘1’; wvalid = ‘1’ AND wready = ‘1’). The trigger relevant for this direction (rvalid AND rready) never fired, which means that no data is being read.

 

To conclude, in the playback direction we are unable to get the data into the board's RAM, which should be possible with the current configuration. We haven't been able to identify the underlying problem.

The transfer fails at the very first descriptor, where XAxiCdma_GetError returns error code 0x20.
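
For readers who want to interpret that error code: a minimal sketch of decoding it, assuming the value carries the CDMA status register error bits in the layout given in the AXI CDMA product guide (internal error 0x10, slave error 0x20, decode error 0x40). The macro and function names below are hypothetical, not the driver's own, and the bit values should be checked against xaxicdma_hw.h for the driver version in use.

```c
#include <stdint.h>

/* Assumed status-register error bits; verify against xaxicdma_hw.h. */
#define CDMA_ERR_INTERNAL 0x10u
#define CDMA_ERR_SLAVE    0x20u
#define CDMA_ERR_DECODE   0x40u

/* Returns a short name for the first data-path error bit that is set. */
static const char *cdma_error_name(uint32_t err)
{
    if (err & CDMA_ERR_INTERNAL) return "internal error";
    if (err & CDMA_ERR_SLAVE)    return "slave error";
    if (err & CDMA_ERR_DECODE)   return "decode error";
    return "no data-path error";
}
```

Under that assumption, 0x20 decodes to a slave error, meaning that a slave on the AXI bus answered one of the engine's reads or writes with an error response.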

 

I have also attached a document with ILA observations of mentioned parameters. Any help would be greatly appreciated.

 

Regards,

Matija

 

7 Replies
deepeshm
Xilinx Employee
2,641 Views
Registered: ‎08-06-2008

Hi,

Could you let us know which signal is not getting asserted? Is it rvalid or rready or both?

Thanks.
mpul
Visitor
2,604 Views
Registered: ‎04-18-2018

Hi,

 

it is rvalid only. It remains at 0. Hope it helps.

 

Thanks.

deepeshm
Xilinx Employee
2,563 Views
Registered: ‎08-06-2008

Hi,

Sounds like there is no data coming in from outside. Could you probe the PCIe hard block interface? You could pull out the RX interface signals of the PCIe hard block by using 'Set Up Debug' after synthesis. Also, do you have a PCIe protocol link analyzer to see whether there is incoming data from the PC?

Is it correct to assume there is no data coming in from the PC to the board at all? Or do you see some traffic at the start, after which it stops with rvalid permanently staying low?

Thanks.
mpul
Visitor
2,464 Views
Registered: ‎04-18-2018

Hi,

 

could you elaborate on "probe the PCIe hard block interface"?

 

We were not able to set up debug on the RX interface due to the following critical warning:
'PCI_EXPRESSAXI_PCIE_pcie_7x_mgt_rxn' is not ChipScope-debuggable; it is not accessible from the fabric routing.

 

We have tried adding the RX interface to an ILA, but received the following critical warning:
[BD 41-759] The input pins (listed below) are either not connected or do not have a source port, and they don't have a tie-off specified.
Please check your design and connect them if needed:
/PCI_EXPRESS/AXI_PCIE/pci_exp_rxp
/PCI_EXPRESS/AXI_PCIE/pci_exp_rxn
/PCI_EXPRESS/ila_0/probe0
/PCI_EXPRESS/ila_0/probe1

 

We have exposed the pcie_7x_mgt interface as an external interface.

We do not have a PCIe protocol link analyzer.

After the SG list has been transferred from the PC to the board via "XAxiCdma_SimpleTransfer", nothing else is received on the board once the transfer is started by writing the tail descriptor address to the CDMA.

Sending data from the PC to the board fails at the first data descriptor (the descriptor's status indicates DMA Slave Error).

 

Thanks.

mpul
Visitor
2,357 Views
Registered: ‎04-18-2018

Hi,

 

we have tried transferring data from the PC to the board using CDMA simple transfers (one transfer per data descriptor).
We transferred data in blocks of 4 KB, using the PC addresses from the address descriptors (PC high address) and data descriptors (PC low address).

In this case, the data from the PC is transferred without problems.

The PC driver and application are the same as those used for the SG transfer, which shows that the SG problem is not caused on the PC side.
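
The difference from the SG case is that each step completes before the next one starts. A minimal host-side sketch of that serialization follows, under stated assumptions: `transfer_t`, `blocking_copy_fn`, `run_serialized` and the logging test double are hypothetical names, not part of the Xilinx driver. On the board, the callback would presumably wrap XAxiCdma_SimpleTransfer followed by polling XAxiCdma_IsBusy until the engine is idle.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t src_addr;
    uint32_t dest_addr;
    uint32_t length;     /* byte count */
} transfer_t;

/* Hypothetical blocking copy: returns 0 only after the transfer has fully
 * completed, non-zero on error. */
typedef int (*blocking_copy_fn)(void *ctx, uint32_t src, uint32_t dst,
                                uint32_t len);

/* Runs the list strictly one entry at a time and returns how many entries
 * completed; a result smaller than n means entry [result] failed. */
static size_t run_serialized(const transfer_t *list, size_t n,
                             blocking_copy_fn copy, void *ctx)
{
    size_t done = 0;
    while (done < n) {
        const transfer_t *t = &list[done];
        if (copy(ctx, t->src_addr, t->dest_addr, t->length) != 0)
            break;
        done++;
    }
    return done;
}

/* Test double that records destinations instead of touching hardware. */
typedef struct {
    uint32_t dst[8];
    size_t count;
    size_t fail_at;      /* call index that should fail, SIZE_MAX for none */
} copy_record_t;

static int recording_copy(void *ctx, uint32_t src, uint32_t dst, uint32_t len)
{
    copy_record_t *rec = ctx;
    (void)src;
    (void)len;
    if (rec->count == rec->fail_at || rec->count >= 8)
        return -1;
    rec->dst[rec->count++] = dst;
    return 0;
}
```

With the address write and the data copy of each entry placed as consecutive list elements, the translation registers are guaranteed to be updated before the window is accessed, which is not guaranteed by descriptor order alone in SG mode.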

mpul
Visitor
2,262 Views
Registered: ‎04-18-2018

Hi,

we have compared the read bursts for simple transfer and scatter/gather transfer, using the same PC driver and PC application. The ARM code differs only in the switch between simple transfer and scatter/gather.

 

 

Simple transfer:

SimpleTransfer.png

 

Scatter gather:

ScatterGather.png

 

As can be seen, raddr is within the BAR1 range (0x81000000 - 0x817FFFFF) in both cases.

In scatter/gather mode, when rvalid is 1, rdata changes on every clock and bears no relation to the actual data stored in the PC's RAM (repeating sequences of 0xABAB).

 

Since it wasn't stated before: we are using Vivado Design Suite 2016.2.

Hopefully, this additional information complements the previous posts.

Looking forward to your reply!

mpul
Visitor
2,022 Views
Registered: ‎04-18-2018

Hi,

 

further debugging has shown that the data descriptor is processed before processing of the corresponding address descriptor has finished. The following ILA captures show:

write address of address descriptor with its ready and valid signals
write data of address descriptor with its ready and valid signals
read address of data descriptor with its ready and valid signals

 

wrong_order.png

 

wrong_order_2.png

 

It can be clearly seen that the masked low PC address in BAR1 (0x817c2000/0x814ad000) is read before the 64-bit address has been written to the address translation registers through the AXI_PCIe_CTL port (0x00000001/0x00000000 to the Upper Address Translation register and 0x8afc2000/0x244ad000 to the Lower Address Translation register).
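
To illustrate why this order matters, here is a small host-side model of the address arithmetic. It is a sketch under assumptions: the bridge is assumed to replace the upper address bits with the translation register contents and to pass the offset inside the BAR1 window through unchanged, and `window_addr`, `regs_for` and `resolve` are hypothetical helper names. The addresses are the ones from the captures above; the "stale" register contents are just an example of what the registers might still hold from another buffer.

```c
#include <stdint.h>

#define BAR1_START       0x81000000u
#define BAR1_END         0x817FFFFFu
#define PHYS_ADDR_MASK1  (BAR1_END - BAR1_START)

/* Contents of the upper/lower address translation registers. */
typedef struct {
    uint32_t high;
    uint32_t low;
} xlate_regs_t;

/* AXI address inside the BAR1 window used for a given PC physical address. */
static uint32_t window_addr(uint64_t pc_phys)
{
    return BAR1_START + ((uint32_t)pc_phys & PHYS_ADDR_MASK1);
}

/* Register values the address descriptor is supposed to write. */
static xlate_regs_t regs_for(uint64_t pc_phys)
{
    xlate_regs_t r = { (uint32_t)(pc_phys >> 32), (uint32_t)pc_phys };
    return r;
}

/* PC physical address that a window access resolves to, given whatever the
 * translation registers contain at the moment of the access (assumed model). */
static uint64_t resolve(xlate_regs_t regs, uint32_t axi_addr)
{
    uint32_t offset = (axi_addr - BAR1_START) & PHYS_ADDR_MASK1;
    uint32_t low = (regs.low & ~PHYS_ADDR_MASK1) | offset;
    return ((uint64_t)regs.high << 32) | low;
}
```

In this model, a read of 0x817c2000 resolves to 0x18afc2000 only once the registers hold 0x00000001 and 0x8afc2000. If the read is issued while they still hold, say, 0x00000000 and 0x244ad000, the same AXI address resolves to 0x247c2000, which is not the intended buffer.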

 

In the opposite, working direction (source is DDR and destination is PC), the descriptors are processed in the correct order.

 

We suspect that this causes our problem, but what could cause this premature processing of the data descriptor?

EDIT:

Manually writing the 64-bit address to the address translation registers before starting the CDMA results in a correct transfer for the first data descriptor!

The transfer of the next data descriptor fails because of the problem described above, but this confirms the cause of said problem.
