richsheep (Observer) | Registered: 03-14-2019

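DPU: resnet50 program times out, no interrupt returned

On an xczu2eg-sbva484-1-e device with Vivado 2019.1, I instantiated the DPU IP core (taken from zcu102-dpu-trd-2019-1-190809.tar.gz).

The Summary and the output of dexplorer -w on the running system are shown in the images below: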


[Images: Selection_138.png, Selection_139.png]

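Using DNNDK v3.1, I compiled the resnet50 model ELF file and the executable for my hardware design. The program is the resnet50 sample from the ZedBoard folder.

Execution times out with no interrupt returned, as shown in the image:

[Image: Selection_140.png]

Any help would be appreciated!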

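Xilinx Employee | Registered: 03-27-2013

Re: DPU: resnet50 program times out, no interrupt returned

Hi @richsheep,

The most common cause of this is an interrupt that is not configured correctly. Could you send your complete boot log and the log from running the program?

Please send them as text so I can check them first.

Also, please confirm that you are using the 2019.1 versions of Vivado/PetaLinux/IP/driver together with DNNDK 3.1.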

Best Regards,
Jason
-----------------------------------------------------------------------------------------------
Please mark the Answer as "Accept as solution" if the information provided is helpful.

Give Kudos to a post which you think is helpful and reply oriented.
-----------------------------------------------------------------------------------------------
richsheep (Observer) | Registered: 03-14-2019

Re: DPU: resnet50 program times out, no interrupt returned

Hi Jason,

Thank you for the quick response.

The boot dmesg output is attached.
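richsheep (Observer) | Registered: 03-14-2019

Re: DPU: resnet50 program times out, no interrupt returned

The log from running the program is below, in text form. I have also confirmed that the tool versions used for the project and the DNNDK version are correct.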

# ./resnet50

Load image : airplane1.png

Run DPU Task for ResNet50 ...
[ 226.271874] [DPU][2398][PID 2398][taskID 1]Core 0 Run timeout,failed to get finish interrupt!
[ 226.280408] [DPU][2398][DPU debug info]
[ 226.280408] level = 9
[ 226.286498] [DPU][2398]Core 0 schedule counter: 1
[ 226.291285] [DPU][2398]Core 0 interrupt counter: 0
[ 226.296073] [DPU][2398][DPU Registers]
[ 226.299814] [DPU][2398]VER : 0x0d9c13f1
[ 226.304248] [DPU][2398]RST : 0x000000ff
[ 226.308674] [DPU][2398]ISR : 0x00000000
[ 226.313102] [DPU][2398]IMR : 0x00000000
[ 226.317529] [DPU][2398]IRSR : 0x00000001
[ 226.321956] [DPU][2398]ICR : 0x00000000
[ 226.326382] [DPU][2398]
[ 226.328814] [DPU][2398]DPU Core : 0
[ 226.332287] [DPU][2398]HP_CTL : 0x07070f0f
[ 226.336540] [DPU][2398]ADDR_IO : 0x00000000
[ 226.340794] [DPU][2398]ADDR_WEIGHT : 0x00000000
[ 226.345308] [DPU][2398]ADDR_CODE : 0x00070058
[ 226.349648] [DPU][2398]ADDR_PROF : 0x00000000
[ 226.353989] [DPU][2398]PROF_VALUE : 0x00000000
[ 226.358416] [DPU][2398]PROF_NUM : 0x00000000
[ 226.362670] [DPU][2398]PROF_EN : 0x00000000
[ 226.366923] [DPU][2398]START : 0x00000001
[ 226.371177] [DPU][2398]COM_ADDR_L0 : 0x70300000
[ 226.375691] [DPU][2398]COM_ADDR_H0 : 0x00000000
[ 226.380205] [DPU][2398]COM_ADDR_L1 : 0x71c00000
[ 226.384719] [DPU][2398]COM_ADDR_H1 : 0x00000000
[ 226.389233] [DPU][2398]COM_ADDR_L2 : 0x70058000
[ 226.393747] [DPU][2398]COM_ADDR_H2 : 0x00000000
[ 226.398261] [DPU][2398]COM_ADDR_L3 : 0x00000000
[ 226.402775] [DPU][2398]COM_ADDR_H3 : 0x00000000
[ 226.407289] [DPU][2398]COM_ADDR_L4 : 0x00000000
[ 226.411803] [DPU][2398]COM_ADDR_H4 : 0x00000000
[ 226.416317] [DPU][2398]COM_ADDR_L5 : 0x00000000
[ 226.420831] [DPU][2398]COM_ADDR_H5 : 0x00000000
[ 226.425345] [DPU][2398]COM_ADDR_L6 : 0x00000000
[ 226.429859] [DPU][2398]COM_ADDR_H6 : 0x00000000
[ 226.434373] [DPU][2398]COM_ADDR_L7 : 0x00000000
[ 226.438887] [DPU][2398]COM_ADDR_H7 : 0x00000000
[ 226.443401] [DPU][2398]
[DNNDK] DPU timeout while execute DPU Task [resnet_v1_50_0-1] of Node [resnet_v1_50_conv1_Conv2D]
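The telling detail in the log above is that core 0 reports a schedule counter of 1 but an interrupt counter of 0: a task was started, yet no finish interrupt was ever counted. As a minimal sketch, assuming the driver keeps printing counters in the "Core N schedule counter: X" / "Core N interrupt counter: Y" form seen here, the following Python helper (the function name is hypothetical, not part of DNNDK) flags cores whose interrupts appear to be missing:

```python
import re

# Matches lines such as "[DPU][2398]Core 0 schedule counter: 1".
# The exact wording is taken from the log in this thread and may
# differ in other driver versions (assumption).
COUNTER_RE = re.compile(r"Core (\d+) (schedule|interrupt) counter: (\d+)")


def cores_missing_interrupts(log_text):
    """Return the sorted list of core indices that were scheduled
    more often than they raised a finish interrupt."""
    counters = {}
    for core, kind, value in COUNTER_RE.findall(log_text):
        counters.setdefault(int(core), {})[kind] = int(value)
    return sorted(
        core
        for core, c in counters.items()
        if c.get("schedule", 0) > c.get("interrupt", 0)
    )
```

Feeding it the dmesg text from this post would return a list containing core 0, which points at the interrupt path rather than at the model or the application.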

DPU status information:

# ./dexplorer -s
[DPU cache]
Enabled

[DPU mode]
normal

[DPU timeout limitation (in seconds)]
100

[DPU Debug Info]
Debug level	: 9
Core 0 schedule : 1
Core 0 interrupt: 0

[DPU Resource]
DPU Core  	: 0
State     	: Idle
PID       	: 2398
TaskID    	: 1
Start     	: 184888922278
End       	: 0

[DPU Registers]
VER       	: 0x0d9c13f1
RST       	: 0x000000ff
ISR       	: 0x00000000
IMR       	: 0x00000000
IRSR      	: 0x00000000
ICR       	: 0x00000000

DPU Core	: 0
HP_CTL  	: 0x07070f0f
ADDR_IO 	: 0x00000000
ADDR_WEIGHT	: 0x00000000
ADDR_CODE	: 0x00070058
ADDR_PROF	: 0x00000000
PROF_VALUE	: 0x00000000
PROF_NUM	: 0x00000000
PROF_EN 	: 0x00000000
START   	: 0x00000000
COM_ADDR_L0	: 0x70300000
COM_ADDR_H0	: 0x00000000
COM_ADDR_L1	: 0x71c00000
COM_ADDR_H1	: 0x00000000
COM_ADDR_L2	: 0x70058000
COM_ADDR_H2	: 0x00000000
COM_ADDR_L3	: 0x00000000
COM_ADDR_H3	: 0x00000000
COM_ADDR_L4	: 0x00000000
COM_ADDR_H4	: 0x00000000
COM_ADDR_L5	: 0x00000000
COM_ADDR_H5	: 0x00000000
COM_ADDR_L6	: 0x00000000
COM_ADDR_H6	: 0x00000000
COM_ADDR_L7	: 0x00000000
COM_ADDR_H7	: 0x00000000

[Memory Resource]
MemInUse	:        0 MB

DPU architecture information:

# ./dexplorer -w
[DPU IP Spec]
IP  Timestamp            : 2019-11-26 19:30:00
DPU Core Count           : 1

[DPU Core Configuration List]
DPU Core                 : #0
DPU Enabled              : Yes
DPU Arch                 : B800
DPU Target Version       : v1.4.0
DPU Freqency             : 325 MHz
Ram Usage                : High
DepthwiseConv            : Enabled
DepthwiseConv+Relu6      : Enabled
Conv+Leakyrelu           : Enabled
Conv+Relu6               : Enabled
Channel Augmentation     : Enabled
Average Pool             : Enabled

Thanks!
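Xilinx Employee | Registered: 03-27-2013 | Accepted solution

Re: DPU: resnet50 program times out, no interrupt returned

Hi @richsheep,

Thanks for the feedback; I have received the logs. It looks like the interrupt is configured (there are no related error messages).

Your log shows: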

[ 55.126774] [DPU][2384]Found DPU signature addr = 0xb1000000 in device-tree
[ 55.133752] [DPU][2384]Checking DPU signature at addr = 0xb1f00000, 
[ 55.140170] [DPU][2384]DPU signature checking done!

For comparison, the log from one of my own custom designs shows:

[ 19.058661] [DPU][2233]Found DPU signature addr = 0xa0000000 in device-tree
[ 19.065619] [DPU][2233]Checking DPU signature at addr = 0xa0f00000,
[ 19.072004] [DPU][2233]DPU signature checking done!
[ 19.077910] [DPU][2233]Init SMFC IP...
[ 19.081733] [DPU][2233]Request SMFC IRQ 60 successful.
[ 19.081738] [DPU][2233]Init SMFC IP done

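The extra lines are all softmax configuration and should be unrelated.

As a next step, I suggest checking whether the Linux interrupt number configuration matches the hardware connection:

1. How the DPU core interrupt is connected in IPI

2. The configuration of the DPU node in the DTS

3. The output of cat /proc/interrupts after boot, paying particular attention to the dpu_isr entries. I have two cores, so my result is: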

 54:          0          0          0          0     GICv2 121 Level     dpu_isr
 55:          0          0          0          0     GICv2 122 Level     dpu_isr
 56:          0          0          0          0     GICv2  97 Level     xhci-hcd:usb1
 59:          0          0          0          0  zynq-gpio  22 Edge      sw19
 60:          0          0          0          0     GICv2 123 Level     dpu_smfc

See https://github.com/Xilinx/Edge-AI-Platform-Tutorials/tree/master/docs/DPU-Integration for the description of the interrupt nodes.

If you can post these, I can check them with you.
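Step 3 above can be automated with a short script. The following is only a sketch: it assumes the /proc/interrupts layout shown in this reply (IRQ number, one count per CPU, controller name, hardware IRQ, trigger type, handler name), and the function name is made up for illustration.

```python
def find_irq_lines(proc_interrupts_text, name="dpu_isr"):
    """Return (linux_irq, total_count, hwirq, trigger) for every line of
    /proc/interrupts whose handler name equals `name`."""
    results = []
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[-1] != name:
            continue
        if not fields[0].endswith(":") or not fields[0][:-1].isdigit():
            continue
        linux_irq = int(fields[0][:-1])
        # Sum the per-CPU counters; their number depends on the CPU count.
        i, total = 1, 0
        while i < len(fields) and fields[i].isdigit():
            total += int(fields[i])
            i += 1
        # What remains should be: controller, hwirq, trigger, handler name.
        if len(fields) - i != 4 or not fields[i + 1].isdigit():
            continue
        results.append((linux_irq, total, int(fields[i + 1]), fields[i + 2]))
    return results
```

On a target board the input would typically come from open("/proc/interrupts").read(). A total count that stays at 0 after a DPU task has run suggests that the interrupt line registered in Linux is not the one the hardware drives.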

 

Best Regards,
Jason

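richsheep (Observer) | Registered: 03-14-2019

Re: DPU: resnet50 program times out, no interrupt returned

Hi @jasonwu,

Thanks for the reply.

Following the information you provided, and the description of the interrupt nodes at https://github.com/Xilinx/Edge-AI-Platform-Tutorials/tree/master/docs/DPU-Integration:

PS Interface       GIC IRQ #    Linux IRQ #
PL_PS_IRQ1[7:0]    143:136      111:104
PL_PS_IRQ0[7:0]    128:121      96:89

I checked the Vivado project and the device tree. In the current Vivado project the DPU interrupt is connected to port PL_PS_IRQ1[0], which corresponds to Linux IRQ 104, but the interrupt number entered in the current Linux device tree is 106, so the two do not match.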
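The lookup in the table above is simple arithmetic, sketched here in Python. The base values 121 and 136 and the offset of 32 between the GIC IRQ number and the number written in the device tree are read off the table itself; the function and constant names are hypothetical.

```python
# Offset between the GIC interrupt ID and the SPI number used in the
# device tree, as implied by the table (136 - 104 = 121 - 89 = 32).
GIC_SPI_OFFSET = 32

# GIC IRQ number of bit 0 of each PL-to-PS interrupt bank, from the table.
PL_PS_IRQ_GIC_BASE = {0: 121, 1: 136}


def pl_ps_irq_numbers(bank, bit):
    """Return (gic_irq, device_tree_irq) for PL_PS_IRQ<bank>[<bit>]."""
    if bank not in PL_PS_IRQ_GIC_BASE:
        raise ValueError("bank must be 0 or 1")
    if not 0 <= bit <= 7:
        raise ValueError("bit must be in the range 0..7")
    gic_irq = PL_PS_IRQ_GIC_BASE[bank] + bit
    return gic_irq, gic_irq - GIC_SPI_OFFSET
```

For example, pl_ps_irq_numbers(1, 0) gives GIC IRQ 136 and device-tree number 104, while pl_ps_irq_numbers(1, 2) gives 138 and 106, the value used in the TRD device tree.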

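I have just tried changing the interrupt number in the device tree to 104 and rebuilding the PetaLinux project.

Loading the driver now fails with the following error: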

# modprobe dpu
[   35.786901] [DPU][2397]Found DPU signature addr = 0xb1000000 in device-tree
[   35.793887] [DPU][2397]Checking DPU signature at addr = 0xb1f00000, 
[   35.800300] [DPU][2397]DPU signature checking done!
[   35.807153] irq: type mismatch, failed to map hwirq-136 for interrupt-controller@f9010000!
[   35.815429] [DPU][2397]Request IRQ 0 failed!
[   35.819728] dpu: probe of amba:dpu failed with error -22

Does the connection in Vivado have to be identical to the TRD, i.e. wired to port PL_PS_IRQ1[2] (Linux interrupt number 106)?
Xilinx Employee | Registered: 03-27-2013

Re: DPU: resnet50 program times out, no interrupt returned

Hi @richsheep,

Could you select level-high as the interrupt type, i.e. set it to 0x4 in the device tree, and try again?

Best Regards,
Jason
richsheep (Observer) | Registered: 03-14-2019

Re: DPU: resnet50 program times out, no interrupt returned

Hi @jasonwu,

Thanks for the reply.

The interrupt is currently configured following dpu.dtsi from the BSP project, as follows:

                    dpucore {
                        compatible = "xilinx,dpucore";
                        interrupt-parent = <&gic>;
                        interrupts = <0x0 106 0x1>;
                        core-num = <0x1>;
                    };

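Should it be changed to 0x4 (level-high trigger)? What is the reason for this change?

                    dpucore {
                        compatible = "xilinx,dpucore";
                        interrupt-parent = <&gic>;
                        interrupts = <0x0 106 0x4>;
                        core-num = <0x1>;
                    };

BTW, I am currently regenerating the Vivado project with the DPU interrupt connected to PL_PS_IRQ1[2] (i.e. 106 in the device tree), to confirm whether 106 is mandatory for the DPU core 0 interrupt.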
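Xilinx Employee | Registered: 03-27-2013

Re: DPU: resnet50 program times out, no interrupt returned

Hi @richsheep,

Yes, that is how the trigger type is changed.

This IP is level-triggered, which is why the change is needed. Some documents may differ on this point; that will be corrected later. Since the value is entered by hand for now, simply change it to 0x4.

It does not have to be 106, but it must match your hardware so that the driver listens on the correct interrupt and can respond to it.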
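To make the three cells of the interrupts property concrete, here is a small Python sketch that decodes them according to the standard ARM GIC device-tree binding (first cell 0 = SPI, 1 = PPI; second cell the interrupt number; low four bits of the third cell the trigger type). The function name is hypothetical, and the parser only handles a single three-cell entry such as the ones quoted in this thread.

```python
import re

# Trigger-type encoding from the ARM GIC device-tree binding.
IRQ_TRIGGERS = {
    1: "edge-rising",
    2: "edge-falling",
    4: "level-high",
    8: "level-low",
}


def decode_gic_interrupts(prop):
    """Decode a property like 'interrupts = <0x0 106 0x4>;'."""
    match = re.search(r"<([^>]*)>", prop)
    if match is None:
        raise ValueError("no <...> cell list found")
    cells = [int(cell, 0) for cell in match.group(1).split()]
    if len(cells) != 3:
        raise ValueError("expected exactly three cells")
    kind, number, flags = cells
    if kind not in (0, 1):
        raise ValueError("first cell must be 0 (SPI) or 1 (PPI)")
    # SPIs start at GIC interrupt ID 32, PPIs at ID 16.
    hwirq = number + (32 if kind == 0 else 16)
    return {
        "kind": "SPI" if kind == 0 else "PPI",
        "number": number,
        "hwirq": hwirq,
        "trigger": IRQ_TRIGGERS.get(flags & 0xF, "unknown"),
    }
```

With this, <0x0 106 0x1> decodes to SPI 106 (hardware IRQ 138) with a rising-edge trigger, and <0x0 106 0x4> to the same line with a level-high trigger, which is the setting recommended above.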

Best Regards,
Jason
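richsheep (Observer) | Registered: 03-14-2019

Re: DPU: resnet50 program times out, no interrupt returned

Hi @jasonwu,

Last night I tried connecting the DPU interrupt to PL_PS_IRQ1[2] and setting the corresponding device-tree entry to <0x0 106 0x1>.

It worked: the DPU runs the resnet50 program and produces inference results.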

......
Load image: PIC_083.jpg
[Top 0] prob = 0.305291  name = fig,
[Top 1] prob = 0.237761  name = strawberry,
[Top 2] prob = 0.144209  name = lemon,
[Top 3] prob = 0.112310  name = orange,
[Top 4] prob = 0.032177  name = wooden spoon,
[Time]4602184us
[FPS]21.5115

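The interrupt status at this point is:

# cat /proc/interrupts
....
45: 207 0 0 0 GICv2 61 Level zynqmp_ipi1
46: 11484 0 0 0 GICv2 138 Edge dpu_isr
....

I then tried again with the interrupt connected to PL_PS_IRQ1[0] and the device tree changed to <0x0 104 0x1>.

Loading the driver still fails: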

# modprobe dpu
[  101.390356] [DPU][2453]Found DPU signature addr = 0xb1000000 in device-tree
[  101.397336] [DPU][2453]Checking DPU signature at addr = 0xb1f00000, 
[  101.403758] [DPU][2453]DPU signature checking done!
[  101.410533] irq: type mismatch, failed to map hwirq-136 for interrupt-controller@f9010000!
[  101.418825] [DPU][2453]Request IRQ 0 failed!
[  101.423124] dpu: probe of amba:dpu failed with error -22

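It now appears that the DPU can also work with rising-edge triggering. I will try changing it to level-high later to see whether it still works properly. The interrupt connection problem, however, still baffles me.

In any case, the original timeout problem is solved. Thank you for your help!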