Registered: ‎04-02-2019

Timing measurement on read/write to Shared Memory

We measured execution time on both the A53 side and the R5 side of a ZCU102 board and got the following results:

Buffer size = 1024 bytes, loop 1000 times

A53 Release (usec)               Min     Max  Running Avg
Fill the buffer byte by byte    1717    1786         1720
Local memcpy                     180     231          180
shrmem memcpy - write            180     190          180
shrmem memcpy - read             180     190          180

Buffer size = 1024 bytes, loop 1000 times

R5 Release (usec)                Min     Max  Running Avg
Fill the buffer byte by byte   10260   10260        10260
Local memcpy                    1188    1190         1188
shrmem memcpy - write           4156    4228         4157
shrmem memcpy - read           29175   30407        29189
shrmem blockwrite               4288    4290         4288
shrmem blockread               29688   29739        29694

 

We have the following questions:

  1. Why is “shrmem memcpy - write” so much slower on the R5 than on the A53 (4157 vs. 180 usec)? Here we copy a 1024-byte buffer from local memory to shared memory, and we repeat this 1000 times.

The execution time is measured as follows:

start_time = <get current time>
memcpy()
end_time = <get current time>
execution_time = end_time - start_time

On the A53, we use clock_gettime() to get the start and end times.
On the R5, we use XTime_GetTime() to get the start and end times.

  2. Why is “shrmem memcpy - read” so much slower than “shrmem memcpy - write” on the R5 (29189 vs. 4157 usec), whereas read and write are symmetric on the A53 (180 vs. 180 usec)?
  3. Do these timing measurements seem reasonable to you?
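
For reference, here is a minimal sketch of how the A53-side measurement can be structured. It is not our exact code: the names timing_stats, now_us and measure_memcpy are made up for illustration, and it assumes that CLOCK_MONOTONIC is the clock in use and that each of the 1000 iterations is timed individually, with Min, Max and Running Avg taken over those iterations. On the R5 the two timestamps would come from XTime_GetTime() instead.

```c
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime() in strict C modes */
#include <assert.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE   1024u  /* bytes per copy, as in the tables */
#define LOOP_COUNT 1000u  /* timed iterations, as in the tables */

/* Hypothetical container for the three reported statistics (microseconds). */
typedef struct {
    double min_us;
    double max_us;
    double avg_us;
} timing_stats;

/* Current monotonic time in microseconds (assumed clock: CLOCK_MONOTONIC). */
static double now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e6 + (double)ts.tv_nsec / 1e3;
}

/* Time `iterations` copies of `len` bytes; each copy is timed on its own. */
timing_stats measure_memcpy(void *dst, const void *src, size_t len,
                            unsigned iterations)
{
    timing_stats s = { 0.0, 0.0, 0.0 };
    assert(iterations > 0);

    for (unsigned i = 0; i < iterations; ++i) {
        double start = now_us();
        memcpy(dst, src, len);
        double elapsed = now_us() - start;

        if (i == 0 || elapsed < s.min_us) s.min_us = elapsed;
        if (i == 0 || elapsed > s.max_us) s.max_us = elapsed;
        /* Incremental (running) mean over the iterations seen so far. */
        s.avg_us += (elapsed - s.avg_us) / (double)(i + 1);
    }
    return s;
}
```

For the "shrmem memcpy - write" row, dst would point into the shared-memory region and src at a local buffer; for the "read" row the two are swapped. Note that the two timestamp calls are inside the measured interval, so their own cost is included in every sample.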

 

Thank you

 

Andrew
