04-27-2014 10:07 PM
1. We need to rotate a 1920x1080 @ 60 input video (landscape) into a 1080x1920 @ 60 video (portrait) to display it on a portrait LCD.
2. We are planning to use the architecture below:
video_in (landscape) -> Video_in_to_Stream -> axi_vdma -> Stream_to_video_out -> LCD (portrait)
3. Is it possible to implement the rotation algorithm using axi_vdma,
by reading pixel data in a different order from DDR memory,
or by controlling the read address of the read master of axi_vdma?
06-12-2014 06:31 PM
04-28-2014 05:47 AM
The problem with that approach is that when you read back the image, each pixel you need will be stored at a widely spaced address from the previous pixel. For example, when you store the pixels in the original raster order, the first pixels of each line (which become the first line of the output video) will be spaced 1920 locations apart. This sort of random read will go much slower than the successive-location read you normally get with streaming DMA. You can speed up the read process a bit if you arrange it so that each successive input row goes into the next bank of memory. Configure the memory interface to allow the maximum number of simultaneously open banks. Then when reading back, each successive pixel comes from a new bank, allowing some interleaving of row activation and readout. Still, that will be very slow compared to burst access.
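To make the address spacing concrete, here is a minimal sketch of the read addresses a rotated readout would generate. It assumes the frame is stored in plain raster order with one pixel per address location and that the rotation is 90 degrees clockwise; both are assumptions, and the function name is hypothetical.

```python
WIDTH, HEIGHT = 1920, 1080  # landscape input frame

def rotated_read_address(out_row, out_col, width=WIDTH, height=HEIGHT):
    """Source address (in pixels) of output pixel (out_row, out_col)
    for a 90-degree clockwise rotation of a raster-stored frame."""
    src_row = height - 1 - out_col   # output columns walk up the input rows
    src_col = out_row                # output rows walk across the input columns
    return src_row * width + src_col

# Addresses of the first four pixels of the first output line
addrs = [rotated_read_address(0, c) for c in range(4)]
strides = [b - a for a, b in zip(addrs, addrs[1:])]
print(addrs, strides)
```

Every step along an output line moves the read address by a full input line (1920 locations), so no two consecutive reads fall in the same burst.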
The right way to do a 90 degree rotation would be to buffer enough lines that the first pixel of all the buffered lines make up a full burst of data to the DDR memory. If you don't have enough BRAM in the part to do this, then an external static memory is the next best thing. Then write the data into DDR building each burst from a vertical stripe through the buffered lines. Readback is still in a different order, but it will go much faster because you get a full burst of pixel data at a time, rather than just a single pixel.
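The line-buffer idea can be modelled in software roughly as follows. This is only a sketch under stated assumptions: it picks one possible address mapping, in which each burst is written straight to its final position in the rotated raster (so the writes jump between output rows and the readback becomes linear), whereas the post leaves the exact mapping open. The function name and the flat `ddr` list standing in for memory are hypothetical.

```python
def rotate_with_line_buffers(frame, burst_len):
    """Rotate `frame` (a list of rows) 90 degrees clockwise, writing the
    result into a flat 'DDR' list one burst at a time."""
    height, width = len(frame), len(frame[0])
    assert height % burst_len == 0, "sketch assumes height is a multiple of burst_len"
    ddr = [None] * (width * height)   # rotated image: `width` rows of `height` pixels
    bursts = []                       # (start_address, length) of every write burst
    for top in range(0, height, burst_len):
        line_buffers = frame[top:top + burst_len]   # burst_len buffered input lines
        for col in range(width):
            # Vertical stripe through the buffered lines, bottom line first,
            # which is left-to-right order within the rotated output row.
            stripe = [line_buffers[i][col] for i in reversed(range(burst_len))]
            start = col * height + (height - top - burst_len)
            ddr[start:start + burst_len] = stripe
            bursts.append((start, burst_len))
    return ddr, bursts

# Small usage example: 4 lines of 6 pixels, bursts of 2 pixels
frame = [[r * 6 + c for c in range(6)] for r in range(4)]
ddr, bursts = rotate_with_line_buffers(frame, burst_len=2)
rotated = [ddr[r * 4:(r + 1) * 4] for r in range(6)]
```

With this mapping every memory access carries `burst_len` useful pixels instead of one. The line buffers have to be addressable, because each stripe is read across all buffered lines at the same column.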
If you find you need to go to external static RAM, then it may make sense to do the whole image rotation that way instead of a few lines at a time. This assumes the external SRAM is large enough to hold the entire image, or more likely two images since you will probably be writing the next frame while reading out the current frame.
05-14-2014 09:20 AM
Can you elaborate on this?
I think you are saying to have as many line buffers as needed by your DDR burst size (64 line buffers = burst of 64), then burst write a vertical stripe of data into DDR. When you burst write the next vertical stripe into DDR, is the start address in DDR the next sequential address, or are you starting a new row? If you start a new row in memory, it is easier to keep track of where the pixels are located, but you have a jump in memory locations for each burst. On the other hand, if you burst each 64 bit chunk into sequential addresses, it is more difficult to keep track of which chunk is where, but the write should be much faster because it's not jumping around in memory. Which is better? Also, the line buffers should be implemented as some addressable memory (dual-port RAM) and not a FIFO, because you need to read out the last pixels of the lines first, right? Finally, is it better to have less jumping around in memory on the reads or on the writes?
06-11-2014 02:57 AM
We have successfully implemented the system using a custom frame buffer controller for DDR,
reading data from random locations in DDR and writing it into output line buffers (dual-port RAM).
Now we are planning to port the system to an AXI bus interface model. Our custom frame buffer controller
needs to support master write/read transactions to and from axi_ddr via axi_interconnect. Is it
possible to achieve the required throughput as before using this system?
Any replies will be greatly appreciated.
06-11-2014 05:20 AM - edited 06-11-2014 05:21 AM
I'd suggest starting a new thread for this question. I only noticed it because I subscribed to the thread, and I'm not an expert on AXI interconnect.
06-11-2014 12:55 PM
06-12-2014 01:48 AM
With the previous architecture we have:
Frame buffer controller (DDR controller) data width = 512 bits (a burst of 8 x 64 bits, the native width), and clock frequency = 200 MHz.
1. We have only 14.8 us (the hsync period) to perform 120 write cycles and 120 read cycles from DDR.
2. Since the writes occur at consecutive locations, they take only 4 us, but the read cycles take 9 us, since the reads occur from random
locations. So the total time taken was 13 us (< 14.8 us), and throughput was a major concern.
3. When porting the system to AXI, is it possible to achieve this by replacing the existing frame buffer controller (with a wrapper for the DDR controller generated with coregen, having a native interface) with an AXI master frame buffer controller which performs the reads and writes to axi_ddr via the AXI interconnect?
4. Does the AXI interconnect or axi_ddr introduce any bottleneck compared to the previous design?
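The timing budget quoted above can be checked with a few lines of arithmetic. This is a sketch, not a measurement: the 1125 total lines per frame is an assumption (standard 1080p60 timing) that happens to reproduce the 14.8 us line period, and the 4 us and 9 us figures are taken from the post as given.

```python
V_TOTAL_LINES = 1125      # assumed total lines per frame for 1080p60 timing
FRAME_RATE = 60
line_period_us = 1e6 / (V_TOTAL_LINES * FRAME_RATE)   # about 14.8 us

ACCESS_BITS = 512         # one controller access: burst of 8 x 64 bits
ACCESSES_PER_LINE = 120   # writes per line, and the same number of reads
CLK_MHZ = 200

bits_per_line = ACCESS_BITS * ACCESSES_PER_LINE
bits_per_pixel = bits_per_line / 1920                 # implied pixel width

write_us, read_us = 4.0, 9.0                          # figures from the post
margin_us = line_period_us - (write_us + read_us)

avg_read_ns = read_us * 1000 / ACCESSES_PER_LINE      # per random read access
avg_read_cycles = avg_read_ns * CLK_MHZ / 1000        # in 200 MHz clock cycles

print(f"line period {line_period_us:.2f} us, margin {margin_us:.2f} us")
print(f"{bits_per_pixel:.0f} bits/pixel, {avg_read_cycles:.0f} cycles per read")
```

Under these assumptions the margin is under 2 us per line, so any extra latency per access that an interconnect adds to the 120 random reads would have to fit within roughly that slack.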
06-12-2014 06:31 PM
11-19-2014 01:09 AM
I'm also trying to implement video rotation with DDR.
I have a problem while reading from random addresses in the DDR.
Could you please share the algorithm you have implemented to do the rotation?
04-02-2015 10:27 AM
I think the issue is not one of AXI bandwidth, but finding a way to write the vertical stripes to memory without storing a whole frame in the FPGA and not having to do single pixel writes to disparate addresses in memory. Who would issue such transfers assuming the Xilinx AXI could handle the random accesses at a sufficient speed, which I am still not convinced of by the way? I don't think there is any Xilinx IP that does that.
I wonder if there is a way to run the AXI packets through a lookup table that would scatter the pixels around in memory, even though the AXI thinks it is writing a continuous burst.
04-02-2015 11:32 AM - edited 04-02-2015 11:35 AM
Yeah, AXI only supports INCR, FIXED, and WRAP type bursts, so scattering it would mean a bunch of 1-beat bursts, which is obviously not ideal.
Your idea would require a memory controller capable of 'descrambling' the data AFTER decoded from AXI. Of course, MIG doesn't do that. Then again, DDR bandwidth itself will still suffer from scattered memory accesses, even if you are able to get AXI bottlenecks out of the way.
You're probably looking at some additional linebuffers for a practical solution, as Gabor mentioned.
Anyway, I just wanted to post some resources here:
11-15-2017 04:54 AM
02-01-2018 12:41 AM
I am reposting about rotation.
I need to rotate 1920x1080 video.
On my (custom) board, I have an Artix-7 and a 1 Gb DDR3 SDRAM, 8 Meg x 16 bits x 8 banks (MT41J64M16).
At the beginning, the SDRAM was used for pictogram inlay (1 bank is enough).
But now I need to use the SDRAM for rotation too.
Is it possible to do rotation and inlay at the same time?
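A rough capacity and bandwidth estimate may help frame this question. The sketch below assumes 32-bit pixels and a 60 Hz frame rate, neither of which is stated in the post, and it says nothing about how much bandwidth the memory actually delivers under scattered accesses.

```python
DDR_BITS = 8 * 1024 * 1024 * 16 * 8     # 8 Meg x 16 bits x 8 banks = 1 Gb
ddr_bytes = DDR_BITS // 8               # 128 MiB in total
bank_bytes = ddr_bytes // 8             # 16 MiB per bank

BYTES_PER_PIXEL = 4                     # assumption: 32-bit pixels
frame_bytes = 1920 * 1080 * BYTES_PER_PIXEL
double_buffer_bytes = 2 * frame_bytes   # write one frame while reading another

FRAME_RATE = 60                         # assumption
# Each pixel is written once and read once per frame.
rotation_bandwidth = 2 * frame_bytes * FRAME_RATE   # bytes per second

print(f"double buffer: {double_buffer_bytes / 2**20:.1f} MiB "
      f"of {ddr_bytes / 2**20:.0f} MiB")
print(f"rotation traffic: {rotation_bandwidth / 1e6:.0f} MB/s")
```

Under these assumptions capacity is not the limit, since two full frames take less than one bank's worth of storage. The open question is whether the memory bandwidth, shared with the inlay traffic, can sustain about 1 GB/s when part of that traffic is non-sequential.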