Registered: ‎03-05-2019

UltraZed-EG DPU 3.0 problems with Tensorflow and Caffe

Hi,

I'm trying to run the DPU AI-accelerated examples from Edge-AI-Tutorials and the model zoo. I'm using an UltraZed-EG board with the UltraZed-EG IO Carrier Card, which is not supported by the TRD or the precompiled examples.

I successfully ran the face_detection and resnet50 applications on a design based on Vivado 2018.2, Petalinux 2018.2 and DPU 2.0, using the precompiled neural networks from the examples.

Now I want to switch to DPU 3.0 with Vivado 2019.1 and Petalinux 2019.1 (and the DPU kernel driver for DPU 3.0) and compile the example trained neural networks myself using the DNNDK 3.1 tools.

I have two problems with this DPU:

  • First I used the TensorFlow MNIST example. I successfully trained the models, froze them and compiled them. Then I created an application to run this network, and it gives me seemingly random results:

 

####################################################
Warning:                                            
The DPU in this TRD can only work 8 hours each time!
Please consult Sales for more details about this!   
####################################################

total image : 6

Load image: x_test_445.png
[Top 0] prob = 0.587278  name = 5
[Top 1] prob = 0.403630  name = 3
[Top 2] prob = 0.004484  name = 7

Load image: x_test_123.png
[Top 0] prob = 0.999276  name = 5
[Top 1] prob = 0.000488  name = 3
[Top 2] prob = 0.000123  name = 2

Load image: x_test_5709.png
[Top 0] prob = 0.983238  name = 3
[Top 1] prob = 0.015893  name = 5
[Top 2] prob = 0.000424  name = 9

Load image: x_test_198.png
[Top 0] prob = 0.760100  name = 5
[Top 1] prob = 0.192183  name = 3
[Top 2] prob = 0.042882  name = 7

Load image: x_test_307.png
[Top 0] prob = 0.927000  name = 3
[Top 1] prob = 0.059261  name = 8
[Top 2] prob = 0.007078  name = 5

Load image: x_test_27.png
[Top 0] prob = 0.365752  name = 5
[Top 1] prob = 0.322775  name = 2
[Top 2] prob = 0.284848  name = 8
[Time]23790us
[FPS]252.207

####################################################
Warning:                                            
The DPU in this TRD can only work 8 hours each time!
Please consult Sales for more details about this!   
####################################################

Compared with the TensorFlow model run on the same test images, the results are completely different.

 

  • Then I switched to the Caffe examples, compiled the resnet50 example network (224x224x3 input) and tried to run it on the board. It gives me these logs:
Load image : PIC_001.jpg

Run DPU Task for ResNet50 ...
[  846.193679] [DPU][2738][PID 2738][taskID 2]Core 0 Run timeout,failed to get finish interrupt!
[  846.202216] [DPU][2738][DPU debug info]
[  846.202216] level = 9
[  846.208318] [DPU][2738]Core 0 schedule  counter: 10
[  846.213199] [DPU][2738]Core 0 interrupt counter: 9
[  846.217992] [DPU][2738][DPU Registers]
[  846.221730] [DPU][2738]VER           : 0x0d9c13f1
[  846.226165] [DPU][2738]RST           : 0x000000ff
[  846.230600] [DPU][2738]ISR           : 0x00000000
[  846.235036] [DPU][2738]IMR           : 0x00000000
[  846.239471] [DPU][2738]IRSR          : 0x00000000
[  846.243907] [DPU][2738]ICR           : 0x00000000
[  846.248341] [DPU][2738]
[  846.250781] [DPU][2738]DPU Core      : 0
[  846.254263] [DPU][2738]HP_CTL        : 0x07070f0f
[  846.258524] [DPU][2738]ADDR_IO       : 0x00000000
[  846.262786] [DPU][2738]ADDR_WEIGHT   : 0x00000000
[  846.267309] [DPU][2738]ADDR_CODE     : 0x0006fce8
[  846.271657] [DPU][2738]ADDR_PROF     : 0x00000000
[  846.276006] [DPU][2738]PROF_VALUE    : 0x00000000
[  846.280442] [DPU][2738]PROF_NUM      : 0x00000000
[  846.284704] [DPU][2738]PROF_EN       : 0x00000000
[  846.288965] [DPU][2738]START         : 0x00000001
[  846.293227] [DPU][2738]COM_ADDR_L0   : 0x70200000
[  846.297750] [DPU][2738]COM_ADDR_H0   : 0x00000000
[  846.302272] [DPU][2738]COM_ADDR_L1   : 0x71b00000
[  846.306794] [DPU][2738]COM_ADDR_H1   : 0x00000000
[  846.311317] [DPU][2738]COM_ADDR_L2   : 0x6fce0000
[  846.315839] [DPU][2738]COM_ADDR_H2   : 0x00000000
[  846.320361] [DPU][2738]COM_ADDR_L3   : 0x00000000
[  846.324884] [DPU][2738]COM_ADDR_H3   : 0x00000000
[  846.329406] [DPU][2738]COM_ADDR_L4   : 0x00000000
[  846.333928] [DPU][2738]COM_ADDR_H4   : 0x00000000
[  846.338451] [DPU][2738]COM_ADDR_L5   : 0x00000000
[  846.342973] [DPU][2738]COM_ADDR_H5   : 0x00000000
[  846.347495] [DPU][2738]COM_ADDR_L6   : 0x00000000
[  846.352017] [DPU][2738]COM_ADDR_H6   : 0x00000000
[  846.356540] [DPU][2738]COM_ADDR_L7   : 0x00000000
[  846.361062] [DPU][2738]COM_ADDR_H7   : 0x00000000
[  846.365583] [DPU][2738]
[DNNDK] DPU timeout while execute DPU Task [resnet50_0-2] of Node [res2a_branch2c]
  • After running the resnet50 example, my mnist network doesn't even load and returns the same error (the DPU has hung):
####################################################
Warning:                                            
The DPU in this TRD can only work 8 hours each time!
Please consult Sales for more details about this!   
####################################################

total image : 6
[  875.473799] [DPU][2742][PID 2742][taskID 3]Core 0 Run timeout,failed to get finish interrupt!
[  875.482337] [DPU][2742][DPU debug info]
[  875.482337] level = 9
[  875.488441] [DPU][2742]Core 0 schedule  counter: 11
[  875.493326] [DPU][2742]Core 0 interrupt counter: 9
[  875.498124] [DPU][2742][DPU Registers]
[  875.501859] [DPU][2742]VER           : 0x0d9c13f1
[  875.506295] [DPU][2742]RST           : 0x000000ff
[  875.510730] [DPU][2742]ISR           : 0x00000000
[  875.515165] [DPU][2742]IMR           : 0x00000000
[  875.519601] [DPU][2742]IRSR          : 0x00000000
[  875.524036] [DPU][2742]ICR           : 0x00000000
[  875.528470] [DPU][2742]
[  875.530910] [DPU][2742]DPU Core      : 0
[  875.534392] [DPU][2742]HP_CTL        : 0x07070f0f
[  875.538653] [DPU][2742]ADDR_IO       : 0x00000000
[  875.542915] [DPU][2742]ADDR_WEIGHT   : 0x00000000
[  875.547438] [DPU][2742]ADDR_CODE     : 0x0006fcd8
[  875.551786] [DPU][2742]ADDR_PROF     : 0x00000000
[  875.556135] [DPU][2742]PROF_VALUE    : 0x00000000
[  875.560571] [DPU][2742]PROF_NUM      : 0x00000000
[  875.564833] [DPU][2742]PROF_EN       : 0x00000000
[  875.569095] [DPU][2742]START         : 0x00000001
[  875.573357] [DPU][2742]COM_ADDR_L0   : 0x70200000
[  875.577879] [DPU][2742]COM_ADDR_H0   : 0x00000000
[  875.582402] [DPU][2742]COM_ADDR_L1   : 0x6fce0000
[  875.586923] [DPU][2742]COM_ADDR_H1   : 0x00000000
[  875.591446] [DPU][2742]COM_ADDR_L2   : 0x6fcd8000
[  875.595968] [DPU][2742]COM_ADDR_H2   : 0x00000000
[  875.600490] [DPU][2742]COM_ADDR_L3   : 0x00000000
[  875.605013] [DPU][2742]COM_ADDR_H3   : 0x00000000
[  875.609535] [DPU][2742]COM_ADDR_L4   : 0x00000000
[  875.614057] [DPU][2742]COM_ADDR_H4   : 0x00000000
[  875.618580] [DPU][2742]COM_ADDR_L5   : 0x00000000
[  875.623102] [DPU][2742]COM_ADDR_H5   : 0x00000000
[  875.627624] [DPU][2742]COM_ADDR_L6   : 0x00000000
[  875.632146] [DPU][2742]COM_ADDR_H6   : 0x00000000
[  875.636669] [DPU][2742]COM_ADDR_L7   : 0x00000000
[  875.641191] [DPU][2742]COM_ADDR_H7   : 0x00000000
[  875.645712] [DPU][2742]
[DNNDK] DPU timeout while execute DPU Task:mnist-3
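
For comparing dumps like the two above, a small script can pull the counters and register values out of the kernel log. This is only a sketch: the regular expressions are based solely on the log lines shown in this post, the names `parse_dpu_dump` and `unfinished_tasks` are made up for illustration, and reading "schedule counter minus interrupt counter" as the number of tasks that were started without a finish interrupt is an assumption, not documented driver behaviour.

```python
import re

# Matches lines such as "[  846.208318] [DPU][2738]Core 0 schedule  counter: 10"
COUNTER_RE = re.compile(r"Core (\d+) (schedule|interrupt)\s+counter:\s*(\d+)")
# Matches lines such as "[  846.221730] [DPU][2738]VER           : 0x0d9c13f1"
REGISTER_RE = re.compile(r"\]([A-Z][A-Z0-9_]*)\s*:\s*0x([0-9a-fA-F]+)\s*$")


def parse_dpu_dump(text):
    """Return (counters, registers) parsed from a DPU timeout dump."""
    counters = {}   # (core index, "schedule" | "interrupt") -> int
    registers = {}  # register name -> int value
    for line in text.splitlines():
        match = COUNTER_RE.search(line)
        if match:
            core, kind, value = match.groups()
            counters[(int(core), kind)] = int(value)
            continue
        match = REGISTER_RE.search(line)
        if match:
            registers[match.group(1)] = int(match.group(2), 16)
    return counters, registers


def unfinished_tasks(counters, core=0):
    """Assumed meaning: tasks scheduled on a core that never raised an interrupt."""
    return counters[(core, "schedule")] - counters[(core, "interrupt")]


# A few lines copied from the first dump in this post.
SAMPLE = """\
[  846.208318] [DPU][2738]Core 0 schedule  counter: 10
[  846.213199] [DPU][2738]Core 0 interrupt counter: 9
[  846.221730] [DPU][2738]VER           : 0x0d9c13f1
[  846.254263] [DPU][2738]HP_CTL        : 0x07070f0f
[  846.267309] [DPU][2738]ADDR_CODE     : 0x0006fce8
"""

if __name__ == "__main__":
    counters, registers = parse_dpu_dump(SAMPLE)
    print("unfinished tasks on core 0:", unfinished_tasks(counters))
    for name, value in sorted(registers.items()):
        print(f"{name:<12} 0x{value:08x}")
```

Feeding it the full output of `dmesg` for both runs makes it easier to see that the interrupt counter stays at 9 while the schedule counter goes from 10 to 11, and to diff the remaining registers between the two dumps.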

 

Is there something I'm doing wrong? I've seen posts about using dnnc (compiler) 1.4.0 or 1.4.0.1 for low and high DPU RAM usage, but I only use 1.4.0, which should compile networks for a DPU with low RAM usage. I also can't find where to obtain dnnc 1.4.0.1.

In the linked attachment I've included the Vivado project (.tcl to generate the project), the .hdf, the .dcf used for compiling the networks, and a prebuilt BOOT SD card for my board with the compiled resnet50 and mnist examples. It also contains the C++ source code that uses the compiled networks with the DNNDK tools on the UltraZed board. I've also added network_models in TensorFlow (.pb) and Caffe (.caffemodel and .prototxt) formats, together with the .sh scripts I used to run the dnnc tool with the proper arguments.

 

>>> IMPORTANT ATTACHMENT (CODES, NNs, VIVADO etc.) <<<

Moderator
Registered: ‎03-27-2013

Hi klukomski@fp-instruments.com,

 

I ran into a similar problem when trying to deploy my custom network with DNNDK 3.0, and I was told that the TensorFlow flow had problems in DNNDK 3.0. So I waited for the 3.1 update, and it now works fine on 3.1. I strongly recommend updating to DNNDK 3.1 (and all other Xilinx tools to 2019.1) and trying again.

By the way, the other thing I can think of is that the preprocessing operations in your training, calibration and deployment code need to be exactly the same. That is a little difficult here, because they are spread across Keras code, TensorFlow low-level APIs and ARM C code.
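
As a rough illustration of why the preprocessing has to match, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function names `preprocess` and `quantize_int8`, the 1/255 normalization, and the fixed-point scale of 64 are not taken from the attached project. On the board, the scale would be whatever the runtime reports for the input tensor of the compiled model.

```python
def preprocess(pixels, mean=0.0, scale=1.0 / 255.0):
    """Map raw 8-bit grayscale pixels to floats: (p - mean) * scale.

    The same mean and scale must be used for training, calibration
    and in the application running on the board.
    """
    return [(p - mean) * scale for p in pixels]


def quantize_int8(values, input_scale):
    """Convert floats to signed 8-bit fixed point, saturating at the limits.

    input_scale is a hypothetical stand-in for the fixed-point scale of
    the DPU input tensor.
    """
    quantized = []
    for value in values:
        q = int(round(value * input_scale))
        quantized.append(max(-128, min(127, q)))
    return quantized


if __name__ == "__main__":
    raw = [0, 128, 255]   # three example pixel values
    input_scale = 64      # assumed value, for illustration only

    # Same normalization as (assumed) training: values stay in range.
    print(quantize_int8(preprocess(raw), input_scale))

    # Normalization skipped in the deployed code: bright pixels saturate.
    print(quantize_int8(preprocess(raw, scale=1.0), input_scale))
```

With these assumed numbers, the unnormalized input saturates to 127 for every non-zero pixel, which is the kind of mismatch that can make a correctly trained network look random on the board.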

I am tidying up my example of training and deploying a custom ConvNet on DNNDK 3.1 + Vivado/PetaLinux 2019.1.

I will let you know when it is ready. :-)

Best Regards,
Jason
Registered: ‎03-05-2019

I am already using tools from DNNDK 3.1.

To make clear which tools I am using and having problems with:

https://www.xilinx.com/products/design-tools/ai-inference/ai-developer-hub.html#edge

I downloaded xilinx_dnndk_v3.1_190809.tar.gz for the compiler and optimizer tools on the host machine (tested on Ubuntu 18.04 and Debian bullseye).

To build my DPU I used the ZCU102 TRD with DPU 3.0 (zcu102-dpu-trd-2019-1-190809.zip). I then used this DPU in Vivado 2019.1, together with the BSP for the ZCU102 Petalinux board, to create a Petalinux 2019.1 project with the DPU 3.0 drivers for my UltraZed board.
