07-21-2017 08:58 AM
(Really) short version:
In my HLS design, I am trying a single AXI-DMA transfer of 126MB and the SDK function XAxiDma_SimpleTransfer fails with status 15 (max transfer size limit). How should I modify my design?
Detailed version:
In Vivado HLS, I am using an AXI DMA component to feed an HLS-home-made filter module which uses AXI stream interfaces. Here you can see the configuration of the AXI DMA and how the filter is connected to it.
I created a simple software test bench in the SDK to test the system in bare-metal:
// Pass the size of the image to the filter module.
XFilter_11x11_blur_axis_Set_height_V(&Filter_11x11_blur_axis, IMG_NROWS);
XFilter_11x11_blur_axis_Set_width_V(&Filter_11x11_blur_axis, IMG_NCOLS);
XFilter_11x11_blur_axis_Start(&Filter_11x11_blur_axis);

// Set DMA transfers (AXI DMA to/from the filter)
// The first call fails with status 15. See the function implementation in the snippet below.
status = XAxiDma_SimpleTransfer(&axiDMA, (INTPTR) input_data, IMG_SIZE * sizeof(u32), XAXIDMA_DMA_TO_DEVICE);
if (status != XST_SUCCESS) {
    printf("ERROR: Sending data from DMA to IP core (status = %d)\n", status);
    return XST_FAILURE;
}
status = XAxiDma_SimpleTransfer(&axiDMA, (INTPTR) output_data, IMG_SIZE * sizeof(u32), XAXIDMA_DEVICE_TO_DMA);
if (status != XST_SUCCESS) {
    printf("INFO: Receiving data from IP core to DMA (status = %d)\n", status);
    return XST_FAILURE;
}

// Wait for completion
while (XAxiDma_Busy(&axiDMA, XAXIDMA_DMA_TO_DEVICE)) { /* Wait */ }
while (XAxiDma_Busy(&axiDMA, XAXIDMA_DEVICE_TO_DMA)) { /* Wait */ }
while (!XFilter_11x11_blur_axis_IsDone(&Filter_11x11_blur_axis));
status = XFilter_11x11_blur_axis_Get_return(&Filter_11x11_blur_axis);
The test bench (i.e. the DMA transfer) works fine with medium-size images (HDPLUS 1600x900, FHD 1920x1080), but it fails at the first XAxiDma_SimpleTransfer (XAXIDMA_DMA_TO_DEVICE) when I try to transfer 4K and 8K UHD images (e.g., a 7680x4320 image, ~126MB). The return value is 15. In xaxidma.c, I can see the function implementation:
u32 XAxiDma_SimpleTransfer(XAxiDma *InstancePtr, UINTPTR BuffAddr, u32 Length, int Direction)
{
    //...
    if ((Length < 1) || (Length > InstancePtr->TxBdRing.MaxTransferLen)) {
        return XST_INVALID_PARAM; // 15
    }
    //...
}
At this point, how can I transfer a big image through the AXI DMA? How can I increase the maximum transfer length?
Thank you
07-21-2017 12:00 PM
The answer is - you can't just increase the size because it is set in hardware.
You have to break it down in blocks by setting up a ring of descriptors - OR issue one simple transfer at a time.
See the example here:
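Separately from that example, here is a rough sketch of the second option (one simple transfer at a time). It is plain C with no Xilinx headers so it can be tried on a host PC. The names chunked_transfer, transfer_fn and count_bytes are made up for illustration, and the 64-byte alignment in the usage note is an assumption, not a documented requirement. On the board, the callback would wrap XAxiDma_SimpleTransfer plus the XAxiDma_Busy wait loop from the test bench.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for "start one transfer and wait until it has finished".
 * Returns 0 on success. On hardware this would wrap
 * XAxiDma_SimpleTransfer() and the XAxiDma_Busy() polling loop. */
typedef int (*transfer_fn)(uintptr_t addr, uint32_t len, void *ctx);

/* Hypothetical helper: move `total` bytes starting at `base` in pieces of
 * at most `max_len` bytes. Every piece except possibly the last one is a
 * multiple of `align` bytes. Returns the number of transfers issued, or -1
 * as soon as one transfer fails. */
static int chunked_transfer(uintptr_t base, size_t total, uint32_t max_len,
                            uint32_t align, transfer_fn xfer, void *ctx)
{
    assert(align > 0 && max_len >= align);

    /* Largest chunk that respects both the length limit and the alignment. */
    uint32_t chunk = max_len - (max_len % align);
    size_t offset = 0;
    int count = 0;

    while (offset < total) {
        size_t remaining = total - offset;
        uint32_t len = (remaining > chunk) ? chunk : (uint32_t)remaining;

        if (xfer(base + offset, len, ctx) != 0) {
            return -1;
        }
        offset += len;
        count++;
    }
    return count;
}

/* Host-side test callback: only adds up how many bytes were requested. */
static int count_bytes(uintptr_t addr, uint32_t len, void *ctx)
{
    (void)addr;
    *(size_t *)ctx += len;
    return 0;
}
```

Calling chunked_transfer with total = 7680 * 4320 * 4, max_len = 0x7FFFFF and align = 64 issues 16 transfers. Two things to check before relying on this: the receive direction has to be split in the same way, and, as far as I know, each simple transfer is sent as its own stream packet, so a filter that expects the whole frame as a single packet may need to be adapted.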
07-21-2017 09:38 AM
@gdg It is fixed by the type of hardware configuration. Looking at xaxidma.c I see it initialized to either XAXIDMA_MAX_TRANSFER_LEN or XAXIDMA_MCHAN_MAX_TRANSFER_LEN
if ((InstancePtr->RxNumChannels > 1) || (InstancePtr->TxNumChannels > 1)) {
    MaxTransferLen = XAXIDMA_MCHAN_MAX_TRANSFER_LEN;
} else {
    MaxTransferLen = XAXIDMA_MAX_TRANSFER_LEN;
}
And they are defined in xaxidma.h as
#define XAXIDMA_MAX_TRANSFER_LEN 0x7FFFFF /* Max length hw supports */
#define XAXIDMA_MCHAN_MAX_TRANSFER_LEN 0x00FFFF /* Max length MCDMA hw supports */
So it seems that if you have just one channel on both RX and TX, the max transfer size is 0x7FFFFF, and 0xFFFF otherwise.
07-21-2017 10:25 AM
So, given my system configuration, the maximum size of an AXI DMA transfer in bytes is 0x7FFFFF = 8388607 bytes ~ 8MB. A Full HD image (1920 * 1080 * 4 bytes = 8294400 bytes) fits just below that limit, which explains why FHD works and anything larger fails.
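To make the arithmetic concrete, here is a minimal sketch in plain C (no Xilinx headers). The helper names image_bytes and transfers_needed are made up for illustration. It assumes 4 bytes per pixel, as in the test bench, and the 0x7FFFFF limit quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Limit quoted from xaxidma.h in this thread (single-channel case). */
#define MAX_TRANSFER_LEN 0x7FFFFFu

/* Hypothetical helper: size in bytes of an image with 32-bit pixels. */
static uint64_t image_bytes(uint32_t cols, uint32_t rows)
{
    return (uint64_t)cols * rows * sizeof(uint32_t);
}

/* Hypothetical helper: how many transfers of at most max_len bytes are
 * needed to cover total bytes (ceiling division). */
static uint64_t transfers_needed(uint64_t total, uint64_t max_len)
{
    assert(max_len > 0);
    return (total + max_len - 1) / max_len;
}
```

With these numbers, a 1920x1080 image fits in a single transfer, a 3840x2160 image needs at least 4, and a 7680x4320 image needs at least 16.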
The question is still open. How can I increase the maximum size of the transfer or (automatically) implement multiple bursts through the AXI DMA?
07-21-2017 10:29 AM
I spotted a typo in a sentence:
In Vivado HLS, I am using an AXI DMA component to feed an HLS-home-made filter module which uses AXI stream interfaces.
The correct one is "In Vivado, I am using an AXI DMA".
[ I am scared of using the edit post feature, which sometimes behaves weirdly :-) ]
Please, note that the problem is still open.
07-24-2017 12:29 PM
Do you have a reference or tutorial on the SDK software API of the AXI DMA with scatter gather?
The example that you pointed to is also part of this example package
https://www.xilinx.com/support/answers/58080.html
but none of them provide details about the API.
For example, given the following function, I would like the explanations of the function calls. What is BD? What is a TX ring? ... etc.
static int TxSetup(XAxiDma * AxiDmaInstPtr)
{
    XAxiDma_BdRing *TxRingPtr;
    XAxiDma_Bd BdTemplate;
    int Delay = 0;
    int Coalesce = 1;
    int Status;
    u32 BdCount;

    TxRingPtr = XAxiDma_GetTxRing(&AxiDma);

    /* Disable all TX interrupts before TxBD space setup */
    XAxiDma_BdRingIntDisable(TxRingPtr, XAXIDMA_IRQ_ALL_MASK);

    /* Set TX delay and coalesce */
    XAxiDma_BdRingSetCoalesce(TxRingPtr, Coalesce, Delay);

    /* Setup TxBD space */
    BdCount = XAxiDma_BdRingCntCalc(XAXIDMA_BD_MINIMUM_ALIGNMENT,
                                    TX_BD_SPACE_HIGH - TX_BD_SPACE_BASE + 1);

    Status = XAxiDma_BdRingCreate(TxRingPtr, TX_BD_SPACE_BASE, TX_BD_SPACE_BASE,
                                  XAXIDMA_BD_MINIMUM_ALIGNMENT, BdCount);
    if (Status != XST_SUCCESS) {
        xil_printf("ERROR: Failed create BD ring in txsetup\r\n");
        return XST_FAILURE;
    }

    /*
     * We create an all-zero BD as the template.
     */
    XAxiDma_BdClear(&BdTemplate);

    Status = XAxiDma_BdRingClone(TxRingPtr, &BdTemplate);
    if (Status != XST_SUCCESS) {
        xil_printf("ERROR: Failed bdring clone in txsetup %d\r\n", Status);
        return XST_FAILURE;
    }

    /* Start the TX channel */
    Status = XAxiDma_BdRingStart(TxRingPtr);
    if (Status != XST_SUCCESS) {
        xil_printf("ERROR: Failed start bdring txsetup %d\r\n", Status);
        return XST_FAILURE;
    }

    return XST_SUCCESS;
}
07-24-2017 12:34 PM
The best reference I could find is in the header files. They contain a huge amount of documentation, which is usually more than sufficient to understand the API.
Look inside xaxidma.h, xaxidma_bd.h and xaxidma_bdring.h
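As a rough orientation before reading those headers: a BD is a buffer descriptor, a small record in memory that tells the DMA engine where one buffer is and how long it is. The TX ring is a circular chain of BDs for the memory-to-stream channel, which the hardware walks on its own. In the TxSetup function quoted above, XAxiDma_BdRingCreate lays the ring out in the given memory region, XAxiDma_BdRingClone copies the template into every BD, and XAxiDma_BdRingStart starts the channel. The toy model below only illustrates the idea of splitting one large buffer across several descriptors. The toy_bd struct and toy_fill_ring function are made up and do NOT match the real XAxiDma_Bd layout or the driver API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy buffer descriptor. Illustrative only, not the hardware layout. */
struct toy_bd {
    uint32_t  next; /* index of the next BD; the last one wraps to 0 */
    uintptr_t addr; /* start address of the buffer piece */
    uint32_t  len;  /* length of the piece in bytes */
    int       sof;  /* 1 on the first BD of the packet */
    int       eof;  /* 1 on the last BD of the packet */
};

/* Hypothetical helper: describe `total` bytes at `base` with BDs of at most
 * `max_len` bytes each. Returns the number of BDs used, or -1 if the ring
 * is too small or the arguments are invalid. */
static int toy_fill_ring(struct toy_bd *ring, int ring_size, uintptr_t base,
                         uint64_t total, uint32_t max_len)
{
    if (ring_size <= 0 || max_len == 0 || total == 0) {
        return -1;
    }
    uint64_t needed = (total + max_len - 1) / max_len;
    if (needed > (uint64_t)ring_size) {
        return -1;
    }

    /* Link every BD to its successor so the chain forms a ring. */
    for (int i = 0; i < ring_size; i++) {
        ring[i].next = (uint32_t)((i + 1) % ring_size);
        ring[i].addr = 0;
        ring[i].len = 0;
        ring[i].sof = 0;
        ring[i].eof = 0;
    }

    uint64_t offset = 0;
    int used = 0;
    while (offset < total) {
        uint64_t remaining = total - offset;
        uint32_t len = (remaining > max_len) ? max_len : (uint32_t)remaining;

        ring[used].addr = base + (uintptr_t)offset;
        ring[used].len = len;
        ring[used].sof = (used == 0);
        ring[used].eof = (remaining <= max_len);
        offset += len;
        used++;
    }
    assert((uint64_t)used == needed);
    return used;
}
```

For the 7680x4320 image (132710400 bytes) and a per-descriptor limit of 0x7FFFFF bytes, this model uses 16 descriptors, with the start-of-frame flag on the first and the end-of-frame flag on the last. In the real driver, the corresponding steps are done through the BD ring functions documented in the headers.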
07-24-2017 03:01 PM
Ok, so I will go the hard way :-)
Thank you
07-24-2017 03:32 PM
This guy Mohammed Sadri actually has a very good set of tutorials on YouTube
ZYNQ Training - Lesson 10 Part I - Using AXI DMA In Scatter-Gather Mode
https://www.youtube.com/watch?v=rDGOVszjeKs
This friend of mine, Leonardo, also has some good tutorials (his accent is just adorable)
VIVADO HLS Training AXI Stream interface #07
https://www.youtube.com/watch?v=3So1DPe2_4s
07-24-2017 04:07 PM
Those are good tutorials indeed. I followed many of them.
Thank you
05-13-2020 11:47 AM
Hello, I am trying to understand the basics of image processing within the PL of an FPGA.
Can you explain or enlighten me on what your IP core looks like?
Are the inputs and outputs of your IP core AXI4-Stream interfaces? And is the data processed internally with hls::AXIvideo2Mat(src, img) or in a different way?
Thank you
Gilles