09-13-2020 07:06 PM
Dear experts!
I'm working with DNNDK on MPSoC (2020.1) on a ZCU102 board. I have a question I need to understand clearly; could you please help me get a deeper understanding of the following:
When I run the adas_detection example I do not see the Arm NEON unit being used, but when I run the resnet50 example I found that Arm NEON was used. Could you please tell me why Arm NEON is used there? Why don't we use the DPU in that case? What are the limitations of the DPU?
I'm looking forward to receiving your support,
Thank you in advance~~
09-13-2020 07:19 PM
Hi @vanhuong.do ,
There is no need to worry about that.
As far as I know, NEON can be used when your application links against particular libraries, or it can be invoked automatically with particular compiler configurations.
Do you have any particular requirement to enable or disable NEON?
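To illustrate the "particular compile configurations" point: on an AArch64 GCC toolchain (assumed here; the file name mycode.c is a placeholder), NEON is enabled by default and the optimizer can auto-vectorize plain C loops onto it without any intrinsics in the source:

```shell
# AArch64 GCC enables NEON (Advanced SIMD) by default; -O3 turns on
# auto-vectorization, and -fopt-info-vec reports which loops were vectorized.
aarch64-linux-gnu-gcc -O3 -fopt-info-vec -c mycode.c -o mycode.o

# To confirm NEON is actually used, disassemble and look for the
# vector registers v0-v31 in the generated instructions:
aarch64-linux-gnu-objdump -d mycode.o | grep -E 'v[0-9]+\.'
```

Conversely, passing -O0 (or -fno-tree-vectorize) generally keeps the compiler from emitting NEON instructions on its own.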
09-14-2020 11:47 PM
Thanks @jasonwu for your reply. At the moment I do not have a requirement to enable or disable NEON. I'll post further inquiries in case I need your support!
Thank you so much~
09-17-2020 04:29 PM
Hi @vanhuong.do
NEON instructions are typically invoked when SIMD/vector intrinsics are used in your application code. For example, the following function will be pipelined to run on the NEON processor:
int8x8_t vadd_s8 (int8x8_t a, int8x8_t b) -- this intrinsic adds two 8-lane vectors of int8_t elements and returns the vector of per-lane sums.
The difference you are seeing in ARM NEON usage may be due to the underlying models the ADAS detection and Resnet50 applications are based on. For example, in the Vitis AI library, ADAS detection is implemented using YOLOv3, which is different from resnet50. If you are seeing NEON instructions being invoked by resnet50, that is likely because its implementation uses vector intrinsics.