Contributor
421 Views
Registered: ‎07-15-2019

Ultra96-V2 reboots when running DNNDK resnet50 examples...

 

I saw that someone ran into this kind of problem before, caused by a power supply issue on the Ultra96-V2 board:

https://forums.xilinx.com/t5/Deephi-DNNDK/Edge-AI-Platform-Tutorials-System-reboots-on-ultra96-v2/m-p/1007308#M1824

 

But my case is different:

First, I have already changed the power supply to a 4A type.

Second, on my Ultra96-V2 board, the resnet50 sample works fine with the B1152 DPU. With the B2304 DPU, however, the system reboots when executing "_T(dpuSetInputImage2(taskConv, CONV_INPUT_NODE, img));" in function "run_resnet_50(...)". I also found that the dpu_resnet50_0.elf file for B2304 has no debug info, so I can't get more information when the system crashes.

 

Has anyone run into this problem before?

 

 

4 Replies
Contributor
408 Views
Registered: ‎07-15-2019

Re: Ultra96-V2 reboots when running DNNDK resnet50 examples...


BTW:

I also ran the face_detection example with the B2304 DPU, and the system crashed there too, when executing "_T(dpuSetInputImage2(taskConv, CONV_INPUT_NODE, img));" in function "run_resnet_50(...)".

Xilinx Employee
370 Views
Registered: ‎01-21-2014

Re: Ultra96-V2 reboots when running DNNDK resnet50 examples...

Reduce the clock rate of the DPU and see if it starts working. I'm not yet convinced that the 4A power supply totally solves the problem without a change to the power monitoring circuit on the V2, as mentioned in this thread: 

https://www.element14.com/community/thread/72736/l/ultra96-v2-errata

 

Terry

 

Contributor
352 Views
Registered: ‎07-15-2019

Re: Ultra96-V2 reboots when running DNNDK resnet50 examples...

 



 

Thanks for your reply.

I have read the errata doc and changed the B2304 DPU clock to 500MHz as it proposed, but the system still reboots when executing the resnet50 examples. Below is the DPU info:

root@ultra96v2_dpu:~# dexplorer -w
[DPU IP Spec]
IP Timestamp : 2019-08-22 10:15:00
DPU Core Count : 1

[DPU Core List]
DPU Core : #0
DPU Enabled : Yes
DPU Arch : B2304F
DPU Target : v1.4.0
DPU Freqency : 250 MHz
DPU Features : Avg-Pooling, LeakyReLU/ReLU6, Depthwise Conv
root@ultra96v2_dpu:~#
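When comparing boards or bitstreams, it can help to pull the architecture and clock out of that report programmatically. The following is only a minimal sketch: it assumes the output of `dexplorer -w` has already been captured into a string, and the helper names `field_value` and `dpu_frequency_mhz` are hypothetical, not part of DNNDK. It matches the field label exactly as printed above, including the "Freqency" spelling.

```cpp
#include <exception>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper: returns the text after "key :" on the first line of
// `report` that starts with `key`, with surrounding whitespace trimmed.
std::optional<std::string> field_value(const std::string& report,
                                       const std::string& key) {
    std::istringstream in(report);
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, key.size(), key) != 0) continue;  // not this field
        const auto colon = line.find(':', key.size());
        if (colon == std::string::npos) continue;
        const auto start = line.find_first_not_of(" \t", colon + 1);
        if (start == std::string::npos) return std::string{};  // empty value
        const auto end = line.find_last_not_of(" \t\r");
        return line.substr(start, end - start + 1);
    }
    return std::nullopt;
}

// Hypothetical helper: parses the leading integer of the frequency field,
// e.g. "250 MHz" gives 250. The key uses the spelling shown in the report.
std::optional<int> dpu_frequency_mhz(const std::string& report) {
    const auto value = field_value(report, "DPU Freqency");
    if (!value) return std::nullopt;
    try {
        return std::stoi(*value);
    } catch (const std::exception&) {
        return std::nullopt;  // field present but not numeric
    }
}
```

For the report above, `field_value(report, "DPU Arch")` would yield `B2304F`, and `dpu_frequency_mhz(report)` would yield the integer part of the frequency line.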



I hope Avnet can fix this issue soon; there is really a lot of work that needs to be done on it.

 

 

Voyager
164 Views
Registered: ‎10-01-2007

Re: Ultra96-V2 reboots when running DNNDK resnet50 examples...

I want to assure you that Avnet is working on the issue. The Ultra96-V2 PL Vccint regulator is capable of 4A steady-state. The Fault Threshold trips as soon as a peak current above it is detected.

The first 3K boards were built with Fault = 3A. This passed our testing, which included a PL stress test, but it's obvious now that the DPU is much more demanding than that test. This fault threshold was much too conservative and well below the capability of the circuit design.

  • Fault = 3A, DPU Max Frequency = 195 MHz

We felt it safe to adjust the Fault to 4A. This then passed the example that we were running. However, not all of the models were executed. When we expanded the test to cover all the examples, we found again that the board reboots when peak current exceeds 4A.

  • Fault = 4A, DPU Max Frequency = 230 MHz

We are now performing testing with Fault = 5.5A. We are actively monitoring the average current to ensure that it does not exceed 4A. We have also found that improving the cooling allows the PMICs to operate more efficiently, reducing the peak currents. Using an external desk fan blowing on the hardware will help. We are also designing a new heatsink that greatly improves the thermal conditions for both the MPSoC and the PMICs. We are testing now, looping the most rigorous DPU tests in a thermal chamber at high temperatures with the new heatsink. We focused our testing on similar performance to Ultra96-V1, which had a max DPU frequency of 255 MHz. These tests have passed. We are now investigating how much higher we can raise the DPU frequency while keeping the board stable.

  • Fault = 5.5A, DPU Max Frequency = 255+ MHz, testing still ongoing
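The three figures above can be read as a small lookup from DPU frequency to the fault threshold a board would need. The sketch below encodes only the numbers quoted in this post; the names `FaultSetting`, `kReported`, and `min_fault_threshold_for` are made up for illustration, and it assumes the quoted maximum frequencies apply to the DPU configuration being run, which the post does not state. The 5.5A entry is a lower bound, since that testing is still ongoing.

```cpp
#include <array>
#include <optional>

// One reported data point: PL Vccint fault threshold and the highest DPU
// frequency reported stable at that threshold.
struct FaultSetting {
    double fault_amps;
    int max_dpu_mhz;
};

// Values quoted in the post above, in ascending order of threshold.
// The last entry is "255+ MHz, testing still ongoing", so 255 is a floor.
constexpr std::array<FaultSetting, 3> kReported{{
    {3.0, 195},
    {4.0, 230},
    {5.5, 255},
}};

// Returns the smallest reported fault threshold (in amps) whose stable
// frequency covers `dpu_mhz`, or nullopt if no reported setting covers it.
std::optional<double> min_fault_threshold_for(int dpu_mhz) {
    for (const auto& setting : kReported) {
        if (dpu_mhz <= setting.max_dpu_mhz) return setting.fault_amps;
    }
    return std::nullopt;
}
```

Under these assumptions, the 250 MHz reported by `dexplorer -w` earlier in the thread is above the 230 MHz figure for Fault = 4A, so `min_fault_threshold_for(250)` lands on the 5.5A setting. That is consistent with the reboots persisting even with a 4A power supply, since the supply rating and the fault threshold are separate limits.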

Bryan
