trived76
Visitor
150 Views
Registered: ‎10-03-2019

Difficulty in passing input to the second kernel when we have two DPU kernels with new APIs from vitis-ai


Hi Everyone, 

I have quantized and compiled a custom Resnet classifier. At the end of the process, I got 4 kernels:

  1. Convolutional kernel (main network - 1st DPU kernel)
  2. Global Average Pooling (on CPU)
  3. Fully connected layer (2nd DPU kernel)
  4. Softmax (on CPU)

Now, I am able to get the output tensor from the 1st DPU kernel (i.e. the convolutional layers) and to compute the global average pooling with Xilinx's "globalAvePool()". This returns a 1D vector of 64 features.

Next, I want to pass this Global Average Pooling output to the 2nd DPU kernel (i.e. the fully connected layer). The input this kernel needs is 1x1x64. I was able to construct a cv::Mat of size 1x1x64 and planned to pass it to either "setMeanScaleBGR()" or "setMeanScaleRGB()". But I know that, because the image I constructed does not have 3 channels, these APIs fail to pass the input to the second kernel.

The error message I get is:

New image size: [1 x 1] & channels: 64 for the second Fully-Connected DPU kernel
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0418 00:12:01.883733 16188 dpu_task_imp.cpp:181] Check failed: inputs.size() > 0 (0 vs. 0)
*** Check failure stack trace: ***
Aborted

I know that in Xilinx's pose_detect example, which uses the older DPU APIs, we were able to pass inputs to more than one kernel. I would really appreciate it if someone could help me here.


Accepted Solutions
trived76
Visitor
62 Views
Registered: ‎10-03-2019

Xilinx's "getInputTensor()" can also be used to set the data. For the second kernel, the data does not have to pass through "setImageBGR()" or "setImageRGB()". As @zhipengl also pointed out, the output of Xilinx's globalAvePool() is already the type of data that I needed for the second kernel. The name "getInputTensor()" confused me: I thought it was only a getter method, but when I read through the documentation more closely, I found that it can be used to set the input tensor as well.
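To illustrate the idea for anyone who finds this later, here is a minimal sketch of "a getter that lets you set the data". It does not use the Vitis AI headers: `FakeInputTensor` and `fill_input` are hypothetical names made up for this example, standing in for whatever descriptor and buffer pointer the real `getInputTensor()` hands back. Check the documentation of your Vitis AI version for the actual tensor fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in (NOT the Vitis AI type): describes the shape of an
// input tensor and exposes a pointer to its writable buffer.
struct FakeInputTensor {
  std::size_t height;
  std::size_t width;
  std::size_t channel;
  std::int8_t* data;  // buffer owned by the runtime, not by this struct
};

// Copy the pooled features straight into the input buffer, after checking
// that the element count matches height * width * channel.
void fill_input(const FakeInputTensor& tensor,
                const std::vector<std::int8_t>& features) {
  const std::size_t expected = tensor.height * tensor.width * tensor.channel;
  if (features.size() != expected) {
    throw std::invalid_argument("feature count does not match input tensor size");
  }
  std::memcpy(tensor.data, features.data(), expected);
}
```

For the 1x1x64 input in this thread, the tensor would be described as `{1, 1, 64, buffer}` and `features` would hold the 64 pooled values. No cv::Mat or 3-channel image is involved.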

One more caveat: if anyone uses more than 2 kernels with a shared object file, the kernels need to be combined into one shared object file.


2 Replies
zhipengl
Xilinx Employee
78 Views
Registered: ‎03-21-2021

Hello,

The output data of "Global Average Pooling (on CPU)" is the input data of "Fully connected layer (2nd DPU kernel)".

Just pay attention to the order and size of the data.
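As a rough illustration of what "order and size" means here, the sketch below averages each channel of an H x W x C buffer. It is an independent example, not Xilinx's globalAvePool(): the function name `global_average_pool_hwc`, the float output, and the assumption that the channel index varies fastest (HWC order) are all mine, so verify the layout against your own kernel's output tensor.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Average each channel over all spatial positions of an H x W x C buffer.
// Assumes HWC order: element (h, w, c) sits at (h * width + w) * channels + c.
std::vector<float> global_average_pool_hwc(const std::vector<std::int8_t>& src,
                                           std::size_t height,
                                           std::size_t width,
                                           std::size_t channels) {
  if (src.empty() || src.size() != height * width * channels) {
    throw std::invalid_argument("buffer size does not match height * width * channels");
  }
  const std::size_t positions = height * width;
  std::vector<float> out(channels, 0.0f);
  for (std::size_t p = 0; p < positions; ++p) {
    for (std::size_t c = 0; c < channels; ++c) {
      out[c] += static_cast<float>(src[p * channels + c]);
    }
  }
  for (float& value : out) {
    value /= static_cast<float>(positions);
  }
  return out;
}
```

The result has exactly `channels` elements (64 in this thread), which is the size the 1x1x64 input of the fully connected kernel expects. If the buffer were read in a different order, the averages would mix values from different channels.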
