lmaxeniro
Adventurer
187 Views
Registered: ‎09-09-2019

Parallel task execution timing profile: difference between HW-EMU and HW?

Dear support team,

I have a project that instantiates multiple compute units (CUs) from a single C kernel design (sha256 from the Vitis library).

When I run the test under HW emulation, the tasks are enqueued to multiple CUs as expected; see the chart below:

lmaxeniro_0-1614909737579.png

When running on HW, I get the chart below from vitis_analyzer. My questions are:

1. I cannot find the meaning of "row0", "row1", etc. How are these rows defined?

2. Most of the kernel enqueues happen on row0 and only a few on row1. With enough tasks I also see row2 and row3, but they occur much less often. Is that the correct behavior?

3. The kernel execution efficiency does not look very high (there are large time gaps between kernel executions). Is that because most kernel enqueues land on the same row, as described in #2? What can be done to improve this?

lmaxeniro_0-1614922450549.png

Thanks a lot for helping!

2 Replies
lmaxeniro
Adventurer
110 Views
Registered: ‎09-09-2019

Nobody has offered a suggestion yet, so I am bumping this thread in the hope of getting a reply.

nutang
Moderator
30 Views
Registered: ‎08-20-2018

Hi @lmaxeniro 

In Vitis Analyzer, interpreting the guidance data is a key part of the analysis. The guidance view places each entry in a separate row. Each row may contain the name of the guidance rule, the threshold value, the actual value, and a brief but specific description of the rule.

In the case of kernel execution, the number of rows depends on the number of overlapping kernel executions. Overlapping kernel enqueues should not be mistaken for actual parallel execution on the device, as the process might not be ready to execute right away.
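
To illustrate how the row count can follow from overlap, here is a minimal sketch of a greedy layout in which each enqueue interval is placed on the lowest-numbered row that is free at its start time. This is only an assumption about how such a timeline could be drawn, not the documented behavior of vitis_analyzer; the function name `assign_rows` and the sample intervals are hypothetical.

```python
def assign_rows(intervals):
    """Place each (start, end) interval on the lowest-numbered free row.

    Hypothetical illustration only: this is NOT taken from vitis_analyzer.
    Returns a list of (start, end, row) tuples ordered by start time.
    """
    row_end = []   # row_end[i] = end time of the last interval placed on row i
    placed = []
    for start, end in sorted(intervals):
        for row, busy_until in enumerate(row_end):
            if busy_until <= start:        # row is free again, reuse it
                row_end[row] = end
                placed.append((start, end, row))
                break
        else:                              # every existing row is still busy
            row_end.append(end)
            placed.append((start, end, len(row_end) - 1))
    return placed


# Made-up enqueue intervals (arbitrary time units).
enqueues = [(0, 10), (2, 5), (6, 9), (11, 15), (12, 14)]
for start, end, row in assign_rows(enqueues):
    print(f"row{row}: {start}..{end}")
```

Under this layout, a new row appears only while all lower rows are still occupied, so higher-numbered rows fill up less often than row0. That would be consistent with seeing most enqueues on row0 and only a few on row1 or beyond, without implying that the device ran that many kernels in parallel.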

Best Regards,
Nutan