anshpmrl
Adventurer
Adventurer
19,802 Views
Registered: ‎04-07-2014

Video rotation using axi_vdma


Hi All,

 

1. We need to rotate a 1920x1080 @ 60 input video (landscape) into a 1080x1920 @ 60 video (portrait) to display it on a portrait LCD.

2. We are planning to use the below architecture

 

                                                DDR
                                                 |
video_in(landscape) -> Video_in_to_Stream -> axi_vdma -> Stream_to_video_out -> LCD(portrait)
                                |                                 |
                          axi_vtc(det)                      axi_vtc(gen)

 

3. Is it possible to implement the rotation algorithm using axi_vdma?

    By reading pixel data from DDR memory in a different order?

    By controlling the read address of the axi_vdma read master?

 

 

Regards,

Akshay


14 Replies
gszakacs
Instructor
Instructor
19,775 Views
Registered: ‎08-14-2007

The problem with that approach is that when you read back the image, each pixel you need will be stored at a widely spaced address from the previous pixel.  For example, when you store the pixels in the original raster order, the first pixels of each line (which become the first line of the output video) will be spaced 1920 locations apart.  This sort of random read will go much slower than the successive-location read you normally get with streaming DMA.  You can speed up the read process a bit if you arrange it so that each successive input row goes into the next bank of memory.  Configure the memory interface to allow the maximum number of simultaneously open banks.  Then when reading back, each successive pixel comes from a new bank, allowing some interleaving of row activation and readout.  Still, that will be very slow compared to burst access.
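To make the address spacing concrete, here is a minimal sketch of the read pattern described above. It assumes a 90-degree clockwise rotation and one memory location per pixel; the function name `src_address` is hypothetical and not part of any Xilinx IP.

```python
W_IN, H_IN = 1920, 1080  # landscape input, stored in raster (row-major) order

def src_address(out_row, out_col, width=W_IN, height=H_IN):
    """Source location of output pixel (out_row, out_col) for a clockwise rotation."""
    in_row = height - 1 - out_col   # output columns walk up the input rows
    in_col = out_row                # output row index selects the input column
    return in_row * width + in_col

# Addresses read to produce the first output line (1080 pixels).
first_output_line = [src_address(0, c) for c in range(H_IN)]

# Distance between consecutive reads: every step jumps a full input line.
strides = {b - a for a, b in zip(first_output_line, first_output_line[1:])}
print(strides)
```

Every read in an output line lands 1920 locations away from the previous one, so no two consecutive pixels can share a burst.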

 

The right way to do a 90 degree rotation would be to buffer enough lines that the first pixel of all the buffered lines make up a full burst of data to the DDR memory.  If you don't have enough BRAM in the part to do this, then an external static memory is the next best thing.  Then write the data into DDR building each burst from a vertical stripe through the buffered lines.  Readback is still in a different order, but it will go much faster because you get a full burst of pixel data at a time, rather than just a single pixel.
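The buffering idea can be sketched in software as below. This is only an illustrative model under stated assumptions: the rotation is clockwise, the image height is a multiple of the burst length, and each burst is written straight to its final rotated position so that readback becomes plain raster order. That last choice is one possible layout, not necessarily the one intended in the post. The function name is hypothetical.

```python
def rotate_cw_with_line_buffers(image, burst_len):
    """Rotate a row-major image 90 degrees clockwise using burst_len buffered lines.

    Returns a flat list modelling DDR (rotated frame in raster order)
    and the number of burst writes that were issued.
    """
    height, width = len(image), len(image[0])
    assert height % burst_len == 0, "sketch assumes height is a multiple of burst_len"
    ddr = [None] * (width * height)  # rotated frame: `width` rows of `height` pixels
    bursts = 0
    for first_line in range(0, height, burst_len):
        line_buffers = image[first_line:first_line + burst_len]  # models the BRAM
        for x in range(width):
            # Vertical stripe through the buffered lines, bottom line first,
            # because input row y lands in output column height - 1 - y.
            burst = [line_buffers[burst_len - 1 - k][x] for k in range(burst_len)]
            start = x * height + (height - first_line - burst_len)
            ddr[start:start + burst_len] = burst  # one contiguous burst write
            bursts += 1
    return ddr, bursts

# Small example: 4 lines of 6 pixels, 2 buffered lines per burst.
image = [[y * 6 + x for x in range(6)] for y in range(4)]
ddr, bursts = rotate_cw_with_line_buffers(image, burst_len=2)
expected = [p for row in zip(*image[::-1]) for p in row]  # reference rotation
print(ddr == expected, bursts)
```

With this layout the writes jump between output rows from one burst to the next, but each write is a full burst and the readback is sequential.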

 

If you find you need to go to external static RAM, then it may make sense to do the whole image rotation that way instead of a few lines at a time.  This assumes the external SRAM is large enough to hold the entire image, or more likely two images since you will probably be writing the next frame while reading out the current frame.

-- Gabor
pupxxx
Visitor
Visitor
19,689 Views
Registered: ‎05-14-2014

Can you elaborate on this?

 

 

I think you are saying to have as many line buffers as needed by your DDR burst size (64 line buffers = burst of 64). Then burst write a vertical stripe of data into DDR. When you burst write the next vertical stripe into DDR is the start address in DDR the next sequential address? Or are you starting a new row? If you started a new row in memory, it is easier to keep track of where the pixels are located, but you have a jump in memory locations for each burst. On the other hand if you burst each 64 bit chunk into sequential addresses it is more difficult to keep track of what chunk is where, but the write should be much faster because it's not jumping around in memory. Which is better? Also, the line buffers should be implemented as some addressable memory (dual port RAM) and not a FIFO because you need to read out the last pixels of the lines first - right? Also, is it better to have less jumping around in memory on the reads or writes?

rotation.jpg
0 Kudos
anshpmrl
Adventurer
Adventurer
19,528 Views
Registered: ‎04-07-2014

Hi All,

 

We have successfully implemented the system using a custom frame buffer controller for DDR, reading data from random locations in DDR and writing it into output line buffers (dual-port RAM).

Now we are planning to port the system to an AXI bus interface model. Our custom frame buffer controller will need to support master write/read transactions to and from axi_ddr via axi_interconnect. Is it possible to achieve the required throughput, as before, with this system?

 

Any replies will be greatly appreciated

 

Thanks

Akshay 

 

 

 

0 Kudos
gszakacs
Instructor
Instructor
19,521 Views
Registered: ‎08-14-2007

Hi Akshay,

 

I'd suggest starting a new thread for this question.  I only noticed it because I subscribed to the thread, and I'm not an expert on AXI interconnect.

-- Gabor
0 Kudos
bwiec
Xilinx Employee
Xilinx Employee
19,512 Views
Registered: ‎08-02-2011
Hi Akshay,

Can you elaborate on your concern?

As long as you configure AXI Interconnect appropriately, it should not be a bottleneck.
www.xilinx.com
0 Kudos
anshpmrl
Adventurer
Adventurer
19,505 Views
Registered: ‎04-07-2014

Hi All,

 

With the previous architecture we have:

 

 Frame buffer controller (DDR controller): data width = 512 bits (a burst of 8 x 64 bits, the native width), clock frequency = 200 MHz

 

1. We have only 14.8 us (the hsync period) to perform 120 write cycles and 120 read cycles to and from DDR.

2. Since the writes go to consecutive locations, they take only 4 us, but the reads take 9 us because they come from random locations. The total time was therefore 13 us (< 14.8 us), so throughput was a major concern.

3. When porting the system to AXI, is it possible to achieve this by replacing the existing frame buffer controller (a wrapper around a coregen-generated DDR controller with a native interface) with an AXI master frame buffer controller which performs the reads and writes to axi_ddr via the AXI interconnect?

4. Does the AXI interconnect or axi_ddr introduce any bottleneck compared to the previous design?
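The budget in points 1 and 2 can be checked with a few lines of arithmetic. This is a rough sketch: the 1125 total lines per frame (standard 1080p60 timing) and the 32 bits per pixel are assumptions chosen because they reproduce the 14.8 us and 120-access figures quoted above. The 4 us and 9 us values are taken directly from the post.

```python
TOTAL_LINES = 1125      # assumption: standard 1080p total lines (active + blanking)
FRAME_RATE = 60
line_period_us = 1e6 / (FRAME_RATE * TOTAL_LINES)   # roughly 14.8 us

BITS_PER_PIXEL = 32     # assumption: 1920 px then fill exactly 120 x 512-bit accesses
ACCESS_BITS = 512
accesses_per_line = 1920 * BITS_PER_PIXEL // ACCESS_BITS

write_us, read_us = 4.0, 9.0          # measured figures quoted in the post
margin_us = line_period_us - (write_us + read_us)

CLK_NS = 5.0                           # 200 MHz controller clock
write_cycles = write_us * 1000 / accesses_per_line / CLK_NS
read_cycles = read_us * 1000 / accesses_per_line / CLK_NS

print(f"line period {line_period_us:.2f} us, {accesses_per_line} accesses per direction")
print(f"margin {margin_us:.2f} us")
print(f"cycles per access: write {write_cycles:.1f}, read {read_cycles:.1f}")
```

Under these assumptions the scattered reads cost a bit more than twice as many clock cycles per access as the sequential writes, and the slack per line is under 2 us, so any extra latency per transaction in the AXI path would need to be hidden by pipelining.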

 

Thanks,

Akshay

 

 

0 Kudos
bwiec
Xilinx Employee
Xilinx Employee
23,905 Views
Registered: ‎08-02-2011
There shouldn't be any issue. AXI is a high-performance, high-throughput interface. As long as you set it up correctly (i.e. full crossbar, appropriate data width, appropriate clock rate, etc.), it shouldn't be a bottleneck.
www.xilinx.com


eli_smertenko
Visitor
Visitor
18,002 Views
Registered: ‎07-18-2011

Hello Akshay!

I'm also trying to implement video rotation with DDR.

I have a problem when reading from random addresses in the DDR.

Could you please share the algorithm you implemented to do the rotation?

Thank you!

0 Kudos
klindseth
Observer
Observer
13,376 Views
Registered: ‎03-07-2008

I think the issue is not one of AXI bandwidth, but finding a way to write the vertical stripes to memory without storing a whole frame in the FPGA and not having to do single pixel writes to disparate addresses in memory.  Who would issue such transfers assuming the Xilinx AXI could handle the random accesses at a sufficient speed, which I am still not convinced of by the way?  I don't think there is any Xilinx IP that does that. 

 

I wonder if there is a way to run the AXI packets through a lookup table that would scatter the pixels around in memory, even though the AXI thinks it is writing a continuous burst.

0 Kudos
bwiec
Xilinx Employee
Xilinx Employee
4,775 Views
Registered: ‎08-02-2011

Yeah, AXI only supports INCR, FIXED, and WRAP type bursts, so scattering it would mean a bunch of 1-beat bursts, which is obviously not ideal.

Your idea would require a memory controller capable of 'descrambling' the data after it has been decoded from AXI. Of course, MIG doesn't do that. Then again, DDR bandwidth itself will still suffer from scattered memory accesses, even if you are able to get AXI bottlenecks out of the way.
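To put a number on the cost of 1-beat bursts, the sketch below counts AXI transactions per frame for scattered single-pixel accesses versus 64-beat INCR bursts. It assumes one pixel per data beat, and the 64-beat length is borrowed from the earlier line buffer discussion; neither is a property of a specific IP core.

```python
PIXELS_PER_FRAME = 1920 * 1080
FRAME_RATE = 60

def transactions_per_frame(beats_per_burst):
    """Number of AXI transactions needed to move one frame (ceiling division)."""
    return -(-PIXELS_PER_FRAME // beats_per_burst)

for beats in (1, 64):
    per_frame = transactions_per_frame(beats)
    print(f"{beats:>2}-beat bursts: {per_frame} per frame, "
          f"{per_frame * FRAME_RATE} per second")
```

Each transaction needs its own address phase, so the single-beat case issues 64 times as many addresses for the same pixel data, before any DDR row activation penalties are counted.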

You're probably looking at some additional linebuffers for a practical solution, as Gabor mentioned.

Anyway, I just wanted to post some resources here:
http://china.xilinx.com/publications/archives/xcell/Xcell51.pdf
http://www.xilinx.com/products/logicore/dsp/rotation_resize.pdf

www.xilinx.com
0 Kudos
pgrangeray
Explorer
Explorer
3,728 Views
Registered: ‎05-31-2017
Hi @Anonymous and @bwiec ,

I am looking at this topic with the same question (rotation for a mobile display). I don't really understand how to do this with VDMA. How do I tell the VDMA to read column by column rather than line by line?

Thank you
0 Kudos
marcowu
Observer
Observer
3,626 Views
Registered: ‎08-29-2016

I don't think VDMA can be used in image rotation because you can't write or read the image data randomly from/to DDR devices.

0 Kudos
pgrangeray
Explorer
Explorer
3,460 Views
Registered: ‎05-31-2017

Hi @pupxxx 

 

I am reposting about rotation.

 

I will need to rotate 1920*1080 video.

On my board (custom), I have an Artix7 and DDR3 SDRAM 1Gb, 8Meg * 16 bits * 8 banks (MT41J64M16).

 

At the beginning, the SDRAM was used for pictogram inlay (1 bank is enough).

But now, I need to use the SDRAM for rotation too.

 

Is it possible? Rotation and inlay (at the same time).
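As a first sanity check on capacity only, the arithmetic below compares the device size quoted above with the frame storage a rotation buffer would need. The 3 bytes per pixel and the double buffering are assumptions for illustration. Whether the memory bandwidth is sufficient for rotation and inlay together is a separate question that this sketch does not address.

```python
# Organisation quoted in the post: 8 Meg x 16 bits x 8 banks.
capacity_bytes = 8 * 2**20 * 16 * 8 // 8

BYTES_PER_PIXEL = 3          # assumption: 24-bit RGB
FRAMES_BUFFERED = 2          # assumption: write one frame while reading another
frame_bytes = 1920 * 1080 * BYTES_PER_PIXEL
rotation_bytes = frame_bytes * FRAMES_BUFFERED

print(f"device capacity : {capacity_bytes / 2**20:.0f} MiB")
print(f"one frame       : {frame_bytes / 2**20:.2f} MiB")
print(f"rotation buffers: {rotation_bytes / 2**20:.2f} MiB")
print(f"fits            : {rotation_bytes < capacity_bytes}")
```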

0 Kudos
pgrangeray
Explorer
Explorer
3,395 Views
Registered: ‎05-31-2017

Hi @marcowu @anshpmrl

 

VDMA is not compatible with rotation, OK.

What is the best solution? Frame buffer, DMA?

 

Thanks

0 Kudos