bhanu27
Contributor
1,038 Views
Registered: ‎05-10-2019

AXI Performance Monitor: Read Latency Count

Hi,

I am trying to use the AXI Performance Monitor.

I am using it in Profile mode, with one of the slots connected to the AXI VDMA MM2S channel.

I am observing that, for the read channel, the Read Latency count shows an unusually high number.

I am following the "Programming Sequence" section for Profile mode from the spec:

 https://www.xilinx.com/support/documentation/ip_documentation/axi_perf_mon/v5_0/pg037_axi_perf_mon.pdf

Any hints on how to correctly use the Read Latency Count metric?

6 Replies
dgisselq
Scholar
1,027 Views
Registered: ‎05-21-2015

@bhanu27,

What do you consider to be an "unusually high" number?  And what is the data path, to include any and all clock rate conversions, data width conversions, and the slave at the end of the path?

Dan

bhanu27
Contributor
924 Views
Registered: ‎05-10-2019

Hi,

Please find a snapshot of how the performance monitor is connected in the design.

Also, I ran a simulation and observed outstanding read transactions on the HP0 port, to which Slot1 of the Performance Monitor is attached.

In the Performance Monitor, read latency is calculated as the time from the start of read address issuance to the end of the read data transaction. With outstanding read transactions, the addresses of two transactions are issued a few cycles apart, so the read latency is larger for the second transaction, because its address is issued well before its data starts to return. To get the real read latency, it seems I have to divide the read latency calculated by the Performance Monitor by 2.
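The effect described above can be illustrated with a small Python sketch. It assumes the monitor adds up a separate address-to-last-data latency for every transaction, and compares that sum with the number of cycles during which at least one read is actually in flight. The function names and the cycle numbers are hypothetical and are not taken from the spec or from the simulation.

```python
def summed_latency(transactions):
    """Sum of (end - start) over every transaction, in clock cycles."""
    return sum(end - start for start, end in transactions)


def busy_cycles(transactions):
    """Cycles during which at least one transaction is in flight."""
    total = 0
    cur_start, cur_end = None, None
    for start, end in sorted(transactions):
        if cur_end is None or start > cur_end:
            # No overlap with the current interval: close it, open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping (outstanding) transaction: extend the interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Hypothetical timings: (address issued, last data beat), in cycles.
# The second address is issued 4 cycles after the first, but its data
# can only start returning after the first transaction has finished.
txns = [(0, 20), (4, 36)]
print(summed_latency(txns))  # 52
print(busy_cycles(txns))     # 36
```

With overlapping transactions the summed figure exceeds the elapsed time, and the size of the gap depends on how much the transactions overlap, so a fixed division by 2 is only an approximation.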

The AXI Performance Monitor spec (page 17) mentions:

"Supports a maximum outstanding transaction depth of 32. The write/read latency metrics are affected by the outstanding transactions"

ReadLatencyPerfMon.png
block_diagram.png
dgisselq
Scholar
902 Views
Registered: ‎05-21-2015

@bhanu27,

Yes, that latency is horrible.

My first question, though, would be: why isn't RREADY held high?  That signal is produced within your logic, and it should be something that you can control (somewhat--the interconnect can impact it as well.)

My next question would be whether or not the AXI performance monitor was set up to collect statistics on all ID's, or did only some of your ID's get monitored.

Dan

bhanu27
Contributor
872 Views
Registered: ‎05-10-2019

Hi Dan,

Thank you for looking into this issue.

I have the following logic in the design:

CUSTOM_BLOCK -> VDMA -> ZYNQ_HP0

The Performance Monitor is used in Profile mode, and its Slot1 is connected to the HP0 port.

The custom block reads image data from DDR using the MM2S channel of the VDMA. The VDMA MM2S channel is configured for 640 HSIZE and 480 VSIZE to read the image data from DDR memory. Each pixel is 16 bits.

The custom block can load and process only one row at a time, so it de-asserts READY to the VDMA after one row of 640 pixels is read. On the MM2S interface of the VDMA, READY therefore goes low after each row is read and goes high again once the row has been processed.

ID-based filtering is only available in Advanced mode. In my case the Performance Monitor is used in Profile mode, so it ignores the ID for metric calculation.

What I observe in simulation is that there is one outstanding read transaction. I am using a PYNQ-based setup to validate my application on the board. In validation, I observe that if I take the Read Latency value I get for a frame, divide it by 2, and multiply by my clock period, I get a number that seems reasonable for the time it takes to read a frame from DDR memory.
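The conversion described above can be written as a short Python sketch. The function name, the `overlap_divisor` parameter, and the counter value are hypothetical; the divisor of 2 is only the empirical correction for one outstanding transaction used in this post, not something taken from the spec. The 200 MHz clock is the frequency of this design.

```python
def latency_count_to_seconds(count, clock_hz, overlap_divisor=2):
    """Convert an accumulated latency count (in clock cycles) to seconds.

    overlap_divisor is the empirical correction for overlapping
    (outstanding) transactions; use 1 to disable the correction.
    """
    corrected_cycles = count / overlap_divisor
    return corrected_cycles / clock_hz


# Hypothetical counter value read back for one frame.
read_latency_count = 400_000
seconds = latency_count_to_seconds(read_latency_count, clock_hz=200_000_000)
print(f"{seconds * 1e3:.3f} ms")  # 1.000 ms
```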

Best Regards
dgisselq
Scholar
863 Views
Registered: ‎05-21-2015

@bhanu27,

How wide is the data bus of your custom block?  Did you match the interconnect at 128-bits?  Also, are you matching the clock rate of the PS?  My guess is a no on the second, since video can be notoriously difficult to match sample rates with, but how about the data width?
Dan

bhanu27
Contributor
856 Views
Registered: ‎05-10-2019

The custom block's streaming interface to the VDMA is 64 bits wide. The VDMA AXI master interface is also 64 bits wide (so the MM2S channel is 64 bits).

The AXI interconnect through which the VDMA is connected to Zynq HP0 does the 64-bit to 128-bit conversion.

The clock frequency is 200 MHz.
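As a rough sanity check for the frame read time, the figures in this thread (640 pixels per row, 480 rows, 16-bit pixels, 64-bit MM2S bus, 200 MHz) give a lower bound that can be worked out in Python. This is a sketch that assumes one data beat per clock cycle and ignores the per-row READY stalls, DDR latency, and interconnect overhead, so the real time will be longer. The function and constant names are hypothetical.

```python
PIXELS_PER_ROW = 640
ROWS_PER_FRAME = 480
BYTES_PER_PIXEL = 2        # 16-bit pixels
BUS_BYTES = 8              # 64-bit MM2S data bus
CLOCK_HZ = 200_000_000     # 200 MHz


def frame_transfer_estimate(pixels_per_row, rows, bytes_per_pixel,
                            bus_bytes, clock_hz):
    """Return (frame bytes, data beats, ideal transfer time in seconds)."""
    frame_bytes = pixels_per_row * rows * bytes_per_pixel
    beats = -(-frame_bytes // bus_bytes)  # ceiling division
    seconds = beats / clock_hz            # assumes one beat per cycle
    return frame_bytes, beats, seconds


frame_bytes, beats, seconds = frame_transfer_estimate(
    PIXELS_PER_ROW, ROWS_PER_FRAME, BYTES_PER_PIXEL, BUS_BYTES, CLOCK_HZ)
print(frame_bytes)                 # 614400
print(beats)                       # 76800
print(f"{seconds * 1e6:.0f} us")   # 384 us
```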
