mrbietola
Scholar
Registered: ‎05-31-2012

AXI DMA or DataMover to transpose a matrix?

Hi, I want to do a 2D FFT, so I need to transpose rows and columns after the first FFT.

Since I don't have enough BRAM, I am thinking of writing the result of the first FFT to external DDR memory in a suitable order and reading the transposed image back from DDR to be processed by the second FFT.

 

FFT -> DMA to memory (writing every line as a row) -> DMA from memory (read lines) -> FFT
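One way to picture the write-side reordering described above is as an address calculation: sample (row, col) of the first FFT's output is stored where sample (col, row) would normally sit, so a plain linear read returns the transposed image. The sketch below models that in Python. The function names, the base address of 0, the 64-bit sample width and the dictionary standing in for DDR are all assumptions made for illustration, not part of any Xilinx IP.

```python
WORD_BYTES = 8  # assumed sample width: one 64-bit word

def transposed_write_addr(row, col, n, base=0, word_bytes=WORD_BYTES):
    """Byte address for sample (row, col) of an n x n matrix so that DDR holds the transpose."""
    return base + (col * n + row) * word_bytes

def transpose_via_memory(matrix, base=0, word_bytes=WORD_BYTES):
    """Write row by row using transposed addressing, then read back linearly."""
    n = len(matrix)
    memory = {}  # hypothetical stand-in for DDR: byte address -> sample
    for row in range(n):
        for col in range(n):
            addr = transposed_write_addr(row, col, n, base, word_bytes)
            memory[addr] = matrix[row][col]
    # Linear readback: addresses base, base + 8, base + 16, ...
    flat = [memory[base + i * word_bytes] for i in range(n * n)]
    return [flat[r * n:(r + 1) * n] for r in range(n)]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(transpose_via_memory(image))  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```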

I'm unsure which IP I should use (AXI DataMover or AXI DMA) and what timing is involved in this operation.


Accepted Solutions
florentw
Moderator
Registered: ‎11-09-2015

Hi @mrbietola

@mrbietola wrote:

> Hi Florent,
>
> I get your suggestion: break the data down (using a smaller BRAM as a buffer) into smaller chunks that I can transfer to DDR in bursts.
>
> For example, if I have a 512x512x64-bit matrix, I could buffer 64x512x64 bits in BRAM and transfer this chunk to DDR with 512 writes of 64x64 bits (bursts of 64).

In this case you will still need custom logic to do the transposing, because you need to read consecutive data.

I was thinking of custom logic because you need to send the data sample by sample, so you do the transposing while writing to the DDR.
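The chunked scheme quoted above can be sketched as follows: buffer a band of rows, then for every column emit one burst holding that column's samples from the band. This is only a software model under stated assumptions. The function names are hypothetical, a Python list stands in for DDR, and addresses are counted in 64-bit words rather than bytes.

```python
def tiled_transpose_bursts(matrix, tile_rows):
    """Return (start_word_index, samples) bursts that place the transpose in memory."""
    n = len(matrix)
    assert n % tile_rows == 0, "sketch assumes the band height divides the matrix size"
    bursts = []
    for r0 in range(0, n, tile_rows):
        band = matrix[r0:r0 + tile_rows]  # models the BRAM buffer
        for col in range(n):
            # One column of the band becomes tile_rows consecutive words in DDR.
            samples = [band[k][col] for k in range(tile_rows)]
            bursts.append((col * n + r0, samples))
    return bursts

def apply_bursts(bursts, n):
    """Replay the bursts into a flat list standing in for DDR and reshape to n x n."""
    memory = [None] * (n * n)
    for start, samples in bursts:
        memory[start:start + len(samples)] = samples
    return [memory[r * n:(r + 1) * n] for r in range(n)]

# Small check: 8 x 8 matrix, band of 4 rows.
small = [[r * 8 + c for c in range(8)] for r in range(8)]
result = apply_bursts(tiled_transpose_bursts(small, 4), 8)
print(result == [list(col) for col in zip(*small)])  # True

# Arithmetic for the sizes quoted in the post.
print((512 // 64) * 512)  # 4096 bursts of 64 words in total (512 per band)
print(64 * 512 * 64)      # 2097152 bits of buffer, i.e. 256 KiB
```

Each band still needs logic that reads the buffer column by column, which matches the point that some custom logic remains necessary.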

> Anyway, I need to transfer the data to DDR at some point, and I was wondering which IP would be best to use. Ideally, I would like these transfers to occur without software intervention.

I think a custom IP would be the better solution, because the AXI DataMover or the AXI DMA (which includes the DataMover) expects you to feed it aligned data (so already transposed), which it then buffers and sends to memory.

> I would also like to know: suppose I don't use a controller as you suggested, is it better to transpose during the write to DDR (so write without bursts and read with bursts) or during the read from DDR (so write with bursts and read without)?


I do not see which would be best. I would say it depends on your system. If you are already using the bandwidth for reads or for writes, maybe put the transposition on the other side.
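To make the trade-off concrete, the sketch below lists the word indices touched in each direction for both options and counts how many runs of consecutive addresses each one contains. It is a simplified model with hypothetical function names: a "run" is treated as something that could be one burst, and burst-length limits and DDR page behaviour are ignored.

```python
def count_runs(word_indices):
    """Count maximal runs of consecutive word indices in an access sequence."""
    runs = 0
    previous = None
    for index in word_indices:
        if previous is None or index != previous + 1:
            runs += 1
        previous = index
    return runs

def access_patterns(n, transpose_on_write):
    """Return (write order, read order) as word indices for an n x n matrix."""
    linear = list(range(n * n))
    strided = [col * n + row for row in range(n) for col in range(n)]
    if transpose_on_write:
        return strided, linear
    return linear, strided

for on_write in (True, False):
    writes, reads = access_patterns(4, on_write)
    print(on_write, count_runs(writes), count_runs(reads))
```

For the 4x4 case this prints 16 write runs and 1 read run when transposing on write, and the reverse when transposing on read. The cost is the same size either way and simply moves to whichever channel carries the strided accesses.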


Florent
Product Application Engineer - Xilinx Technical Support EMEA
**~ Don't forget to reply, give kudos, and accept as solution.~**

4 Replies
florentw
Moderator
Registered: ‎11-09-2015

Hi @mrbietola

You might want to write your own controller for this.

The main issue is that a transpose is usually really inefficient, because you cannot use bursts: you have to write each sample one by one. I do not think the AXI DMA or AXI DataMover is a good fit for this.
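A rough count shows how much the lack of bursts costs. The sketch below uses the 512x512 matrix of 64-bit samples and the 64-beat burst length mentioned in this thread; the function names are hypothetical, and the model only counts transactions without saying anything about actual DDR timing.

```python
def row_stride_bytes(n, word_bytes=8):
    """Distance in DDR between consecutive samples of one input row after transposed placement."""
    return n * word_bytes

def write_transactions(n, beats_per_burst):
    """Number of write transactions needed to move an n x n matrix."""
    total_samples = n * n
    return -(-total_samples // beats_per_burst)  # ceiling division

N = 512
print(row_stride_bytes(N))        # 4096 bytes between neighbouring samples of a row
print(write_transactions(N, 1))   # 262144 single-sample writes
print(write_transactions(N, 64))  # 4096 writes if 64-beat bursts were possible
```

Because neighbouring samples of a row land 4096 bytes apart, they cannot share a burst, so the sample-by-sample case issues 64 times as many transactions as the bursted case.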


Florent
mrbietola
Scholar
Registered: ‎05-31-2012

Hi Florent,

I get your suggestion: break the data down (using a smaller BRAM as a buffer) into smaller chunks that I can transfer to DDR in bursts.

For example, if I have a 512x512x64-bit matrix, I could buffer 64x512x64 bits in BRAM and transfer this chunk to DDR with 512 writes of 64x64 bits (bursts of 64).

Anyway, I need to transfer the data to DDR at some point, and I was wondering which IP would be best to use. Ideally, I would like these transfers to occur without software intervention.

I would also like to know: suppose I don't use a controller as you suggested, is it better to transpose during the write to DDR (so write without bursts and read with bursts) or during the read from DDR (so write with bursts and read without)?

joancab
Mentor
Registered: ‎05-11-2015

 

Sending rows to FIFOs, then clocking them out to get one column at a time, could be one way, but 512 FIFOs of 512x64 bits looks like a lot.
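A small software model of the FIFO idea, together with the storage it implies, might look like this. It is a sketch with hypothetical function names: Python deques stand in for hardware FIFOs, and the storage figure is just the product of the three sizes given above.

```python
from collections import deque

def fifo_transpose(matrix):
    """Load each row into its own FIFO, then pop one sample per FIFO to form a column."""
    fifos = [deque(row) for row in matrix]  # one FIFO per row
    depth = len(matrix[0])
    columns = []
    for _ in range(depth):
        columns.append([fifo.popleft() for fifo in fifos])
    return columns

def fifo_storage_bits(n_fifos, depth, width_bits):
    """Total storage needed by n_fifos FIFOs of the given depth and width."""
    return n_fifos * depth * width_bits

print(fifo_transpose([[1, 2], [3, 4]]))  # [[1, 3], [2, 4]]
print(fifo_storage_bits(512, 512, 64))   # 16777216 bits, i.e. 2 MiB
```

The 2 MiB total is the size of the whole matrix, so this approach needs as much on-chip storage as buffering the full image would.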
