hezi
Contributor
Registered: ‎09-24-2020

AIE - data movement bandwidth


The AXI4-Stream interconnect is 32 bits wide. Suppose I want more bandwidth per stream than 32 Gbit/s, for example 64 Gbit/s. Can I drive a 256-bit-wide bus from the PL with a 250 MHz clock and have it converted in the AIE domain into 2 parallel 32-bit streams with a 1 GHz clock, aggregating 64 Gbit/s? Will the tile know to take both streams and combine them into one 64-bit-wide word in the tile? Otherwise, is it true to say the input sample rate is limited to 32 Gbit/s?
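The rate matching asked about here can be checked with a few lines of arithmetic. This is only a sketch of the raw numbers from the question, assuming one transfer per clock on every interface and no protocol overhead; the function name `gbit_per_s` is made up for illustration.

```python
# Raw throughput check using the figures quoted in the question.
PL_WIDTH_BITS = 256      # PL-side bus width
PL_CLOCK_HZ = 250e6      # PL clock
AIE_STREAM_BITS = 32     # width of one AXI4-Stream in the AIE array
AIE_CLOCK_HZ = 1e9       # AIE clock assumed in the question
NUM_STREAMS = 2          # parallel streams into one tile


def gbit_per_s(width_bits, clock_hz, lanes=1):
    """Raw throughput in Gbit/s, assuming one transfer per clock per lane."""
    return width_bits * clock_hz * lanes / 1e9


pl_rate = gbit_per_s(PL_WIDTH_BITS, PL_CLOCK_HZ)
one_stream_rate = gbit_per_s(AIE_STREAM_BITS, AIE_CLOCK_HZ)
two_stream_rate = gbit_per_s(AIE_STREAM_BITS, AIE_CLOCK_HZ, NUM_STREAMS)

print(f"PL side:        {pl_rate} Gbit/s")
print(f"one AIE stream: {one_stream_rate} Gbit/s")
print(f"two AIE streams: {two_stream_rate} Gbit/s")
```

Under these assumptions the 256-bit PL bus at 250 MHz and two 32-bit streams at 1 GHz carry the same raw rate, while a single stream carries half of it.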


Accepted Solutions
derekh
Xilinx Employee
Registered: ‎08-06-2018

Each AI Engine tile can read two 32-bit AXI stream direct inputs. Given a typical 1 GHz AIE clock, the maximum direct stream input bandwidth you can achieve is 64 Gbit/s.

Note that it takes 4 clock cycles to convert from native 32-bit AXIS to the 128-bit register/memory alignment when using direct streams. You can either interleave the stream inputs to do a 128-bit update of the register, which effectively refreshes it every second clock cycle, or concatenate both stream ports every 4th clock cycle to get a 256-bit update.
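The cycle counts above follow from dividing the update size by the bits arriving per clock. A minimal sketch of that average-rate model, assuming each port delivers one 32-bit word per clock; it ignores stalls and back-pressure, and `cycles_per_update` is a hypothetical helper, not an AIE API.

```python
WORD_BITS = 32  # one stream word per clock per port


def cycles_per_update(num_ports, update_bits):
    """Average clock cycles needed to gather update_bits from num_ports streams."""
    bits_per_cycle = WORD_BITS * num_ports
    return update_bits // bits_per_cycle


single_port_128 = cycles_per_update(1, 128)   # one stream filling a 128-bit register
interleaved_128 = cycles_per_update(2, 128)   # two streams, 128-bit updates interleaved
concatenated_256 = cycles_per_update(2, 256)  # two streams concatenated into 256 bits

print(single_port_128, interleaved_128, concatenated_256)
```

Both dual-port options move the same 64 bits per clock on average; they differ only in whether the register is refreshed in 128-bit or 256-bit steps.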

Also note that if you use dual AXIS, you will need to arrange the input streams alternating 4 consecutive samples on AXIS port 0 and port 1 respectively.
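The sample arrangement described here can be sketched in plain Python. This is an illustration of the ordering only, assuming 32-bit samples so that 4 samples fill one 128-bit update; the function names are hypothetical and the linked tutorial remains the reference for the real kernel code.

```python
def split_dual_stream(samples, block=4):
    """Distribute samples over two ports, alternating blocks of `block` samples."""
    port0, port1 = [], []
    for start in range(0, len(samples), block):
        chunk = samples[start:start + block]
        if (start // block) % 2 == 0:
            port0.extend(chunk)   # even-numbered blocks go to port 0
        else:
            port1.extend(chunk)   # odd-numbered blocks go to port 1
    return port0, port1


def merge_dual_stream(port0, port1, block=4):
    """Rebuild the original sample order from the two port sequences."""
    merged = []
    for start in range(0, max(len(port0), len(port1)), block):
        merged.extend(port0[start:start + block])
        merged.extend(port1[start:start + block])
    return merged


samples = list(range(16))
port0, port1 = split_dual_stream(samples)
print(port0)  # [0, 1, 2, 3, 8, 9, 10, 11]
print(port1)  # [4, 5, 6, 7, 12, 13, 14, 15]
```

The split would be done on the producer side (for example in the PL) and the merge corresponds to how the consuming kernel reads the two ports in turn.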

This is explained in this guide:

https://github.com/Xilinx/Vitis-In-Depth-Tutorial/tree/master/AI_Engine_Development/Design_Tutorials/02-super_sampling_rate_fir/DualStreamSSR

 

Derek
SAE DSP and AI Engine, Xilinx Sweden/EMEA

4 Replies
florentw
Moderator
Registered: ‎11-09-2015

Hi @hezi 

I recommend reading the very detailed answer from my colleague @ludovica on the following topic:
https://forums.xilinx.com/t5/Versal-and-UltraScale/Versal-FPGA-AIE-to-PL-logic-bit-width-rate-matching-and/m-p/1169826#M15351

Let me know if anything is still unclear after reading it.


Florent
Product Application Engineer - Xilinx Technical Support EMEA
hezi
Contributor
Registered: ‎09-24-2020

Thanks, this doesn't exactly address my question, since I want a virtual stream of 64 Gbit/s mapped onto 2 physical 32 Gbit/s streams in the AIE.

This means I need a 256-bit PL interface running at 250 MHz, using 4 native 64-bit PL-AIE stream interfaces connected to 2 internal 32-bit interconnect streams that are concatenated in the AIE tile into one 64-bit stream.

The AIE is capable of performing 2 loads of 128 bits per clock. If the above is not possible, the AIE tile can use its full bandwidth only as a consumer of neighboring tiles (producing 128/256-bit results), but not as a consumer from the PL (since streams are only 32 bits wide).

Is my understanding correct? In other words, how can I have a single input to the AIE array with bandwidth greater than 4 GB/s?
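To put the figures in this post side by side, here is a rough conversion to GB/s. It assumes a 1 GHz AIE clock and uses the load width stated above (2 loads of 128 bits per clock) as given; `gbyte_per_s` is a made-up helper name.

```python
AIE_CLOCK_HZ = 1e9  # assumed AIE clock


def gbyte_per_s(bits_per_cycle, clock_hz=AIE_CLOCK_HZ):
    """Raw bandwidth in GB/s for a given number of bits moved per clock."""
    return bits_per_cycle * clock_hz / 8 / 1e9


one_stream = gbyte_per_s(32)          # single 32-bit stream input
two_streams = gbyte_per_s(2 * 32)     # both stream inputs of a tile
memory_loads = gbyte_per_s(2 * 128)   # 2 loads of 128 bits per clock, as stated above

print(one_stream, two_streams, memory_loads)
```

Under these assumptions a single stream corresponds to the 4 GB/s mentioned in the question, and the load path is several times wider than the stream inputs.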
