Visitor
Visitor
5,255 Views
Registered: ‎11-24-2014

Is AXI Chip2Chip throughput greater than its theoretical maximum?


Hello,

 

I am working to increase the throughput in the application note design xapp1216 (AXI Chip2Chip/Aurora for real-time video). I vary the throughput of the Test Pattern Generator by increasing the resolution and pixel clock.

This reference design writes a video test pattern to DDR3 on a slave KC705 board and reads it back before sending it out on the HDMI interface of the master KC705 board.

 

The master board contains a MicroBlaze that prints write and read byte counts to the UART terminal. The performance monitor counts write and read bytes at the Master Chip2Chip AXI4 interface. The AXI perfmon derives the counters' stop time from the clock-dependent value in the sample interval register. According to the Chip2Chip product guide, the maximum theoretical throughput can be calculated as: (1-0.03125)*64*125E6 / 2 = 3.875 Gbit/s = 484.375 MByte/s
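
For anyone who wants to redo this arithmetic with a different PHY clock, here is a minimal Python sketch of the formula as quoted above. The function and parameter names are my own and do not come from PG067. The overhead is taken as 0.03125 (1/32), which is the value that reproduces the 484.375 MByte/s result, and the final division by 2 is simply copied from the quoted formula without assuming what it stands for.

```python
def c2c_max_throughput_mbytes(phy_clk_hz, phy_width_bits=64,
                              overhead=0.03125, divisor=2):
    """Evaluate the quoted throughput formula and return MByte/s.

    All names here are hypothetical; only the arithmetic follows the post.
    """
    # The formula yields bits per second.
    bits_per_second = (1 - overhead) * phy_width_bits * phy_clk_hz / divisor
    # Convert bits/s to MByte/s (8 bits per byte, 1e6 bytes per MByte).
    return bits_per_second / 8 / 1e6


if __name__ == "__main__":
    # 125 MHz is the PHY clock value used in the calculation above.
    print(c2c_max_throughput_mbytes(125e6))
```

With the 125 MHz clock used in the post this evaluates to 484.375 MByte/s; a different `phy_clk_hz` scales the result linearly.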

 

My problem: When I set the video test pattern generator to increased resolutions, the perfmon measures throughputs that exceed the theoretical maximum of Chip2Chip.

 

What I ruled out so far:

- I double-checked the core clock of perfmon and the sample interval register, to ensure the counters do not count longer than they should. I also checked the UART printed global count which does not exceed the expected figure.

- I verified the pixel clock of the video pipeline is correct. KC705 HDMI output is operating at 60 frames per second (input status in monitor menu).

- I double-checked that the perfmon registers are configured correctly in the MicroBlaze application (slot ID, metric selection, metric counter number).

 

The measured figures are in the attached image.*

 

Q1: Could there be a factor skewing the throughput figures?

Q2: If not, then why are the figures greater than the Chip2Chip theoretical maximum throughput?

 

*invalid clock: the pixel clock must be greater than 50MHz (AXI-lite clk) and lower than 150MHz (perfmon core clk).

ss_userforms_post_figures.png

Accepted Solutions
Highlighted
Visitor
Visitor
8,665 Views
Registered: ‎11-24-2014

Found the solution myself. I made two mistakes. The first was using the wrong value for the PHY frequency. It should have been the frequency of the clock connected to the core interface pin c2c_phy_clk. This clock is essentially passed on by the Aurora core from the GT transceiver logic.

The second, and most annoying, mistake is that I misread the product guide. To somebody like me who has little experience with FPGAs, this line in PG067 sounded like it meant write+read rather than write or read:

 

"The following emperical formula provides guidance on max theoretical throughput that a core can provide for the AXI Read/Write channel with a known overhead."

 

The theoretical maximum for the write and read channels combined was in fact 756.87 MB/s all along. This is in line with my observation that 1080p60 video, at a measured throughput of 745.2 MB/s, was not displaying.
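
As a rough cross-check of the 1080p60 figure, the sketch below estimates the combined write+read bandwidth of a video stream. It rests on assumptions that are not stated in the thread: 3 bytes per pixel (24-bit colour), active pixels only, and each frame written once and read once. The function name and parameters are hypothetical.

```python
def video_bandwidth_mbytes(width, height, fps, bytes_per_pixel=3, passes=2):
    """Estimate video bandwidth in MByte/s.

    Assumptions (not from the thread): 3 bytes per pixel, active pixels only,
    and passes=2 meaning one DDR3 write plus one DDR3 read per frame.
    """
    bytes_per_second = width * height * fps * bytes_per_pixel * passes
    return bytes_per_second / 1e6


if __name__ == "__main__":
    needed = video_bandwidth_mbytes(1920, 1080, 60)
    limit = 756.87  # combined write+read maximum quoted in the post
    print(f"estimated {needed:.3f} MB/s, {needed / limit:.1%} of the limit")
```

Under these assumptions 1080p60 needs about 746.5 MB/s, which is close to the measured 745.2 MB/s and leaves very little headroom below 756.87 MB/s.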

3 Replies
Highlighted
Visitor
Visitor
5,185 Views
Registered: ‎11-24-2014

UPDATE

 

I measured the bandwidth on the KC705 slave board with the performance monitor at exactly the same settings. Surprisingly, I measure the same numbers before and after the Chip2Chip link. I am starting to think that either the formula provided by Xilinx is wrong (see the attachment in my previous post) or I am making a fundamental mistake.

I would really appreciate some assistance on this subject, as xapp1216 uses this formula in the same way and may have interpreted it wrong.

Highlighted
Xilinx Employee
Xilinx Employee
5,153 Views
Registered: ‎10-24-2013
Hi,
Moving to Embedded Processor System Design board.
Thanks, Vijay