Observer richkeefe
Observer
2,465 Views
Registered: ‎01-24-2011

How to synchronize multiple VDMA MM2S cores

Hi,

 

I'd like to load a number (e.g., 2) of image frames using VDMA into separate buffers and then unload them in lock step in order to perform pixel-wise processing (sum, difference, compare, etc.) before propagating the result as a single frame. Each VDMA in the series starts with a different buffer number and sequences through the same 3 buffers in staggered fashion. Once begun, this process will continue frame-by-frame perpetually.

 

Are there native features of the VDMA read channel that will accomplish this pixel-wise synchronization? For example, once the line buffers of all 3 VDMAs begin outputting pixels, will they all stream without any breaks? Or, owing to arbitration for the DDR, are asymmetric stalls something that should be accounted for? (Perhaps the VDMA can be configured to load the line buffer before it is unloaded?)

 

Thanks.


Moderator
Moderator
2,374 Views
Registered: ‎11-09-2015

Re: How to synchronize multiple VDMA MM2S cores

Hi @richkeefe,

 

I am not sure what you are asking. But the VDMA will not write into a buffer while you are reading from it, so you can continue to read from it.

 

If this does not answer your question, please clarify (maybe with a screenshot?).

 

Regards,


Florent
Product Application Engineer - Xilinx Technical Support EMEA
Observer richkeefe
Observer
2,366 Views
Registered: ‎01-24-2011

Re: How to synchronize multiple VDMA MM2S cores

Hi Florent,

 

Thanks for the reply. To clarify my question: I need to perform real-time processing of pixels from 3 different frames, all from the same streaming source but at different points in time. Each of these frames is written into its own frame buffer by a write-only VDMA. I plan to use 3 read-only VDMA instances to extract the frames in raster order and then perform the pixel processing. So, I am trying to determine whether the read operations of these 3 read-only VDMA instances can be conducted in lock-step with each other. A screenshot is below. Does this clarify my question?

 

Regards,

Rich

 

[Attached image: Untitled.jpg]

 

Moderator
Moderator
2,342 Views
Registered: ‎11-09-2015

Re: How to synchronize multiple VDMA MM2S cores

Hi @richkeefe,

 

I do not see a way to do it directly with the VDMA (but this does not mean that there is no way). However, I think you should be able to write a small piece of logic that does it by controlling the tready signals.

 

Regards,


Florent
Product Application Engineer - Xilinx Technical Support EMEA
Explorer
Explorer
2,323 Views
Registered: ‎07-18-2011

Re: How to synchronize multiple VDMA MM2S cores

As Florent mentioned, the easiest way to do this is to write a custom AXI4-Stream IP block that controls the reads from memory. There may be better ways to do this, but I'll describe the method I use, which always works for me.

 

Since you have three (or more) VDMAs accessing the same MIG controller in a round-robin fashion, along with possibly a MicroBlaze and/or other IP blocks sharing time slots, you can be sure that your three VDMAs will not all have their initial data available at the same time.

 

To ensure synchronization and valid data in each of the streams, use a small FIFO, say 16 deep, for each VDMA input source. Reset all three FIFOs simultaneously at the start of a frame, then pull each tready bit high to request data from the three VDMAs. Enable your initial FIFO writes only when a valid SOF has arrived on that path; otherwise keep tready high and the FIFO write enable low, which will discard invalid or non-SOF pixels and flush any data from a MIG read burst left over from the last frame read. This ensures that your FIFO data starts with the first pixel of a new video frame. Once the first SOF has been written, you can write all valid pixels thereafter, pausing only when the input FIFO full flag goes high (you may need to use the almost-full flag, or a programmable full flag, if you have some pipeline delays or input latches ahead of your FIFO, to guard against overflow).
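The input-path rule above can be sketched as a small behavioral model. This is Python rather than HDL and is only an illustration of the idea, not the poster's actual IP: the class name `SofGatedFifo`, its methods, and the choice to drop tready as soon as the FIFO is full (rather than at an almost-full threshold) are all assumptions made for this example.

```python
from collections import deque


class SofGatedFifo:
    """Hypothetical model of one input path: drop beats until SOF, then buffer."""

    def __init__(self, depth=16):
        self.depth = depth
        self.fifo = deque()
        self.seen_sof = False

    def reset(self):
        # Called for all paths together at the start of a frame.
        self.fifo.clear()
        self.seen_sof = False

    @property
    def tready(self):
        # Before SOF: always ready, so stale beats are accepted and dropped.
        # After SOF: ready only while the FIFO has room.
        return (not self.seen_sof) or len(self.fifo) < self.depth

    def clock(self, tvalid, tuser_sof, tdata):
        """One clock cycle. Returns True if the beat was accepted from the stream."""
        if not (tvalid and self.tready):
            return False
        if tuser_sof:
            self.seen_sof = True
        if self.seen_sof:
            # FIFO write enable is active only from the SOF pixel onward.
            self.fifo.append(tdata)
        return True
```

A beat that arrives before the first SOF is accepted (so the VDMA keeps moving) but never stored, which is what flushes leftover data from the previous frame.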

 

You can start simultaneous processing of all three paths as soon as all three input FIFO empty flags are low, and continue processing as long as none of the three empty flags goes high. If any of the three empty flags goes high, pause processing until all three are low again. Since each FIFO holds only valid data that started with the SOF for its path, every read across the three FIFOs returns pixels from the same position in the frame.
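As a rough illustration of that read rule, the sketch below pops one pixel from every FIFO only when none of them is empty. It is a Python behavioral model under assumed names (`lockstep_read`, `combine`); the FIFOs are plain `deque` objects standing in for the hardware FIFOs.

```python
from collections import deque


def lockstep_read(fifos, combine):
    """One clock of the processing side (hypothetical helper).

    Pops one pixel from every FIFO only if none is empty, so the pixels
    always come from the same position in the frame.
    Returns (valid, result); result is None when processing is paused.
    """
    if any(len(f) == 0 for f in fifos):
        return False, None  # at least one empty flag is high: pause
    pixels = [f.popleft() for f in fifos]
    return True, combine(pixels)


# Usage: sum three streams pixel by pixel.
a, b, c = deque([1, 2]), deque([10]), deque([100, 200])
first = lockstep_read([a, b, c], sum)   # all three have data
second = lockstep_read([a, b, c], sum)  # b is empty, so nothing is popped
```

Here `first` carries the combined pixel and `second` reports a pause; nothing is consumed from `a` or `c` while `b` is empty.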

 

To ensure there are no processing "bubbles" in your AXI4-Stream video processing, you want to avoid having to wait when the MIG buffer empties. The best way to accomplish this is to add an AXI4-Stream master output section with a decent-sized output FIFO. After the initial reset, hold off setting the output tvalid signal until this output FIFO is full, or almost full, then allow tvalid to go high to signal downstream IP that valid data is ready. This maximizes data throughput and guards against processing bubbles, because the output FIFO will hold enough data to coast through any pauses for new data fetches from the MIG. The input-side processing should strive to keep this output FIFO full at all times, or at least keep up with the data requests from the downstream IP. You can use the FIFO empty flag to control the valid handshaking with downstream IP.
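The prefill behavior can be modeled in a few lines. This is a hedged Python sketch, not an implementation of any Xilinx core: `PrefilledOutputFifo`, `start_level`, and the default depth of 64 are invented for the example, and the real depth would depend on how long the MIG stalls actually are.

```python
from collections import deque


class PrefilledOutputFifo:
    """Hypothetical output stage: tvalid stays low until the FIFO has filled once."""

    def __init__(self, depth=64, start_level=None):
        self.depth = depth
        # Assumption: start streaming when completely full unless told otherwise.
        self.start_level = depth if start_level is None else start_level
        self.fifo = deque()
        self.streaming = False

    @property
    def full(self):
        return len(self.fifo) >= self.depth

    @property
    def tvalid(self):
        # After the prefill, tvalid simply follows the (inverted) empty flag.
        return self.streaming and len(self.fifo) > 0

    def write(self, pixel):
        """Write side, driven by the pipeline's valid flag. Returns False if full."""
        if self.full:
            return False
        self.fifo.append(pixel)
        if len(self.fifo) >= self.start_level:
            self.streaming = True
        return True

    def read(self, tready):
        """Stream side: hand out one pixel when both tvalid and tready are high."""
        if tready and self.tvalid:
            return self.fifo.popleft()
        return None
```

Passing a smaller `start_level` models the "almost full" variant; `streaming` would be cleared again on reset, which is not shown here.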

 

Moderator
Moderator
2,290 Views
Registered: ‎11-09-2015

Re: How to synchronize multiple VDMA MM2S cores

Hi @richkeefe,

 

If everything is clear for you, please close the topic by marking a reply as accepted solution.

Thanks and Regards,


Florent
Product Application Engineer - Xilinx Technical Support EMEA
Observer richkeefe
Observer
2,246 Views
Registered: ‎01-24-2011

Re: How to synchronize multiple VDMA MM2S cores

Hi reaiken,

 

Thanks for your thoughtful and comprehensive answer. In the dwell time between my initial post and your answer, I have considered a similar approach. I have been experimenting with the VDMA example design and testbench. It is the case that the VDMA core already includes a line store FIFO, right? It seems that I can throttle the data flow at the VDMA streaming interface with the TREADY signal. In the simulation, the VDMA responds immediately by pausing data transitions. TVALID remains high, but no new data appears on the bus. Have you any experience using these VDMA controls directly to manage data flow?

 

 

Explorer
Explorer
3,017 Views
Registered: ‎07-18-2011

Re: How to synchronize multiple VDMA MM2S cores

@richkeefe,

 

Yes, you should be able to eliminate the small input FIFOs I use and directly throttle the data flow using the tready, tvalid, and tuser (SOF) bits. 

 

You still have to have a method for synchronization between all of your different sources, and you should have a way to gracefully exit a reset, change of output parameters, pause of data flow, and things like that.

 

Since you can't guarantee continuous data on every clock from all three VDMAs (a tvalid may be low when you need a new pixel, or you may be waiting for synchronization), you need a way to process only valid pixels in your continuously-clocked processing pipeline. If you derive a clock/data enable signal from all three tvalid and tready signals on the input side, you can use it as a "valid" signal to enable your processing pipeline, propagate it along with the data through the pipeline, and ultimately use it as the write enable for the output FIFO in your AXI4-Stream master section, ensuring that only valid pixels are sent out of the AXI4-Stream master.
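To make the "valid travels with the data" idea concrete, here is a small Python model of a fixed-latency pipeline. It is only a sketch: `run_pipeline` and the three-stage depth are assumptions, and the stages pass the data through unchanged where a real design would do its pixel arithmetic.

```python
def run_pipeline(beats, stages=3):
    """Shift (valid, data) pairs through a fixed-latency pipeline (hypothetical model).

    beats: list of (valid, data) tuples, one per clock on the input side.
    Returns the data words that would be written to the output FIFO,
    i.e. only those whose valid flag is set when they leave the last stage.
    """
    pipe = [(False, None)] * stages
    written = []
    # Extra idle cycles at the end flush whatever is still in flight.
    for beat in list(beats) + [(False, None)] * stages:
        out_valid, out_data = pipe[-1]
        if out_valid:
            written.append(out_data)  # valid acts as the FIFO write enable
        pipe = [beat] + pipe[:-1]     # advance every stage by one clock
    return written


# A stall (valid low) in the middle leaves no gap in the written data.
result = run_pipeline([(True, 1), (False, None), (True, 2)])
```

The pipeline itself never stops clocking; the bubble from the stalled cycle simply arrives at the output with valid low and is not written.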

 

As for synchronization between your three VDMA data streams, you can use the same approach as before: hold off processing any video pixels at the start of a new read frame until tuser (SOF) and tvalid are both high on each of the three data streams. Once this is the case, process all three pixels simultaneously into your pipeline, setting the "valid" bit each time all three are valid, and stop processing whenever any one of the three tvalid signals is low, waiting for that stream to catch up. If the output FIFO is appropriately sized, and you wait until it is full before initially setting the output tvalid signal at the start of a new frame, it should be able to coast through any pauses from the three VDMAs and provide continuous data to the downstream IP blocks with no processing bubbles.
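One way to picture the FIFO-less alignment is as a per-clock decision function. The Python sketch below is an assumed formulation, not taken from the VDMA documentation: `sync_step` and the `aligned` flag are hypothetical, and the policy of draining non-SOF beats while holding any stream that already presents its SOF is one possible reading of the description above.

```python
def sync_step(aligned, tvalid, sof):
    """Decide tready and pipeline valid for one clock (hypothetical helper).

    aligned: True once all streams have started the frame together.
    tvalid, sof: lists of bools, one entry per VDMA stream.
    Returns (tready_list, pipeline_valid, next_aligned).
    """
    n = len(tvalid)
    if not aligned:
        at_sof = [v and s for v, s in zip(tvalid, sof)]
        if all(at_sof):
            # Every stream presents its first pixel: take them all together.
            return [True] * n, True, True
        # Hold streams already at SOF; keep tready high elsewhere so stale
        # beats are accepted and discarded.
        return [not a for a in at_sof], False, False
    # Aligned: advance only when every stream has a pixel to offer.
    go = all(tvalid)
    return [go] * n, go, True


# Stream 0 is at SOF and is held; stream 1 still carries stale data.
tready, valid, aligned = sync_step(False, [True, True, False], [True, False, False])
```

A real block would also clear `aligned` on reset or at the end of each frame; that bookkeeping is left out to keep the sketch short.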

 

 

 

 

 

 


Explorer
Explorer
2,219 Views
Registered: ‎07-18-2011

Re: How to synchronize multiple VDMA MM2S cores

@richkeefe,

 

One more thing: you could design your IP without an output FIFO, but having one makes the AXI4-Stream master control design much easier.

 

Also, since you are merging three video sources from three separate VDMAs into a single output video stream, an output FIFO may allow you to use smaller line buffers in each of your VDMAs, since the output FIFO can be sized to accommodate any buffering that may be necessary, rather than duplicating the same length buffer three times in the VDMA line buffers. This may allow you to reduce resources, depending on your line buffer size requirements.