Observer
Registered: ‎09-25-2016

dpuSetExceptionMode() does not honor N2CUBE_EXCEPTION_MODE_RET_ERR_CODE

By default, the N2Cube Core library uses the N2CUBE_EXCEPTION_MODE_PRINT_AND_EXIT exception handling mode, where the DPU application is terminated when an error occurs.

Using dpuSetExceptionMode(N2CUBE_EXCEPTION_MODE_RET_ERR_CODE) should change the exception handling mode to return an error code rather than terminating the DPU application, but this change is not honored and the default behavior remains in effect.

 

I added the following snippet to the beginning of my DPU application (tested both before and after the call to dpuOpen()). Its output indicates that N2CUBE_EXCEPTION_MODE_RET_ERR_CODE should be in effect:

 

// code
dpuSetExceptionMode(N2CUBE_EXCEPTION_MODE_RET_ERR_CODE);
ret = dpuGetExceptionMode();
printf("dpuGetExceptionMode: %d %d : %d\n", N2CUBE_EXCEPTION_MODE_PRINT_AND_EXIT, N2CUBE_EXCEPTION_MODE_RET_ERR_CODE, ret);

// output
dpuGetExceptionMode: 0 1 : 1
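
For context, the calling pattern that the return-error-code mode is meant to enable looks roughly like the sketch below. It is only an illustration: dpu_call_fn, run_with_retry() and flaky_stub() are hypothetical names, not part of N2Cube, and the sketch assumes that a DPU call reports success as 0 and failure as a nonzero code. A real application would wrap its own call (for example dpuRunTask()) in an adapter with this shape.

```c
#include <stdio.h>

/* Hypothetical stand-in for a DPU call: returns 0 on success and a
 * nonzero error code on failure (an assumption, not the N2Cube API). */
typedef int (*dpu_call_fn)(void *task);

/* Calls `run` up to `max_attempts` times. Returns 0 on the first success,
 * otherwise the error code of the last failed attempt. */
int run_with_retry(dpu_call_fn run, void *task, int max_attempts)
{
    int rc = -1; /* reported if max_attempts <= 0 */
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        rc = run(task);
        if (rc == 0) {
            return 0;
        }
        /* Control is back in the application: log, reset state, retry. */
        fprintf(stderr, "DPU call failed (attempt %d/%d), code %d\n",
                attempt, max_attempts, rc);
    }
    return rc;
}

/* Demo stub: fails with code 7 while the counter behind `task` is > 0. */
int flaky_stub(void *task)
{
    int *remaining_failures = task;
    if (*remaining_failures > 0) {
        --*remaining_failures;
        return 7;
    }
    return 0;
}
```

With a counter initialised to 2, run_with_retry(flaky_stub, &counter, 3) returns 0 on the third attempt. This kind of recovery is only possible if the library returns instead of exiting.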

However, when an error occurs, N2Cube still prints the following error report and terminates the DPU application:

[ 20.098533] [DPU][2010][PID 2010][taskID 3]Core 0 Run timeout,failed to get finish interrupt!
[ 20.107066] [DPU][2010][DPU debug info]
[ 20.107066] level = 9
[ 20.113161] [DPU][2010]Core 0 schedule counter: 3
[ 20.117948] [DPU][2010]Core 0 interrupt counter: 0
[ 20.122737] [DPU][2010][DPU Registers]
[ 20.126476] [DPU][2010]VER : 0x06aec4c8
[ 20.130912] [DPU][2010]RST : 0x000000ff
[ 20.135347] [DPU][2010]ISR : 0x00000000
[ 20.139783] [DPU][2010]IMR : 0x00000000
[ 20.144219] [DPU][2010]IRSR : 0x00000001
[ 20.148655] [DPU][2010]ICR : 0x00000000
[ 20.153090] [DPU][2010]
[ 20.155529] [DPU][2010]DPU Core : 0
[ 20.159003] [DPU][2010]HP_CTL : 0x07070f0f
[ 20.163265] [DPU][2010]ADDR_IO : 0x00000000
[ 20.167527] [DPU][2010]ADDR_WEIGHT : 0x00000000
[ 20.172050] [DPU][2010]ADDR_CODE : 0x0006fc80
[ 20.176399] [DPU][2010]ADDR_PROF : 0x00000000
[ 20.180748] [DPU][2010]PROF_VALUE : 0x00000000
[ 20.185184] [DPU][2010]PROF_NUM : 0x00000000
[ 20.189447] [DPU][2010]PROF_EN : 0x00000000
[ 20.193709] [DPU][2010]START : 0x00000001
[ 20.197971] [DPU][2010]COM_ADDR_L0 : 0x70280000
[ 20.202494] [DPU][2010]COM_ADDR_H0 : 0x00000000
[ 20.207017] [DPU][2010]COM_ADDR_L1 : 0x70300000
[ 20.211540] [DPU][2010]COM_ADDR_H1 : 0x00000000
[ 20.216062] [DPU][2010]COM_ADDR_L2 : 0x6fc80000
[ 20.220585] [DPU][2010]COM_ADDR_H2 : 0x00000000
[ 20.225108] [DPU][2010]COM_ADDR_L3 : 0x00000000
[ 20.229630] [DPU][2010]COM_ADDR_H3 : 0x00000000
[ 20.234153] [DPU][2010]COM_ADDR_L4 : 0x00000000
[ 20.238676] [DPU][2010]COM_ADDR_H4 : 0x00000000
[ 20.243199] [DPU][2010]COM_ADDR_L5 : 0x00000000
[ 20.247721] [DPU][2010]COM_ADDR_H5 : 0x00000000
[ 20.252244] [DPU][2010]COM_ADDR_L6 : 0x00000000
[ 20.256767] [DPU][2010]COM_ADDR_H6 : 0x00000000
[ 20.261290] [DPU][2010]COM_ADDR_L7 : 0x00000000
[ 20.265812] [DPU][2010]COM_ADDR_H7 : 0x00000000
[ 20.270334] [DPU][2010]
[DNNDK] DPU timeout while execute DPU Task [app_0-3] of Node [ConvNd_1]

In our application domain (aerospace), it is highly desirable that N2Cube return control to the DPU application so that the application can handle the error itself. I'd appreciate any suggestions. Thanks!
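
One possible stopgap while the mode is not honored is to run the DPU work in a child process, so that a library-side exit() ends only the child and the supervising process keeps control. The following is a minimal, untested-on-DPU sketch using only POSIX fork() and waitpid(). The names dpu_job_fn, run_isolated(), exiting_stub() and ok_stub() are hypothetical, and whether the DPU driver can be opened and used cleanly from a forked child is an assumption that would need checking on the target.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical job signature: performs the DPU work and returns 0 on
 * success. It may never return if the library calls exit() internally. */
typedef int (*dpu_job_fn)(void *arg);

/* Runs `job` in a child process. Returns the child's exit status (0-255),
 * or -1 if fork/waitpid failed or the child was killed by a signal. */
int run_isolated(dpu_job_fn job, void *arg)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: a library-side exit() terminates only this process. */
        int rc = job(arg);
        _exit(rc == 0 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Demo stub that mimics print-and-exit behavior by ending the process. */
int exiting_stub(void *arg)
{
    (void)arg;
    exit(3);
}

/* Demo stub that completes normally. */
int ok_stub(void *arg)
{
    (void)arg;
    return 0;
}
```

Here run_isolated(exiting_stub, NULL) returns 3 in the parent, and run_isolated(ok_stub, NULL) returns 0. Inference results would still have to be passed back to the parent through a pipe or shared memory, which this sketch leaves out.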

 

Two ways to "inject" a DPU error:

  • temporarily assign the DPU a different IRQ number in the DTS, so that the finish interrupt is never received
  • run dexplorer -t 1 to set the DPU timeout to 1 second, then run a model that takes longer than 1 second

Resources:

 
