jsi_wmz (Contributor)

ERROR: Faild to load section from hybrid DPU file.


In the AI SDK, I modified the sample called "test_customer_provided_model_yolov3.cpp". This sample shows how to use a customer-provided model library for inference. After I had done everything the instructions said, this error occurred.

root@zcu102:/usr/share/XILINX_AI_SDK/samples/yolov3# ./test_jpeg_yolov3_voc_416x416_deconv ~/JPEGImages/000032.jpg
[DNNDK] Faild to load section from hybrid DPU file. section:METADATA, hybrid ELF:/usr/lib/libdpumodelyolov3_voc_deconv.so
    1- Specified DPU Kernel name "yolov3_voc_deconv" is right
    2- DPU Kernel "yolov3_voc_deconv" is compiled and linked into "/usr/lib/libdpumodelyolov3_voc_deconv.so" as expected

The yolov3_voc_deconv model is my custom one. I first trained a YOLOv3 model on the VOC dataset in the standard way, and that trained model has good accuracy. After converting it to a caffemodel, I followed the documented steps (decent, then dnnc) to generate the .elf file. I then built the .elf into a .so and placed it in /usr/lib. The prototxt file name, the kernel name, and the parameter are all the same.

After that, I modified the sample to test the model, but I don't know why this error happens.

Because upsample is not supported in DNNDK (it always causes problems when using decent?), I replaced the upsample layers with deconvolution. All other layers are unchanged.

In short, I have two questions: 1. What does this error message mean, and what is the probable cause? 2. Are the steps I followed correct?

Finally, here is the sample code; I only changed the model name from "yolov3_voc_416" to "yolov3_voc_deconv".

#include <xilinx/yolov3/yolov3.hpp>
#include <iostream>
#include <opencv2/opencv.hpp>
#include <map>
#include <string>
using namespace std;
using namespace cv;

/*
  This is an example of how to use a customer-provided model with the yolov3 lib.
  The "yolov3_voc_416" parameter passed to create_ex() below is assumed to be
  the model provided by the customer.
  This parameter must be the same as the kernel name in the prototxt
  configuration file under /etc/XILINX_AI_SDK.conf.d/.
  In other words, the file name of the .prototxt, the kernel name, and this
  parameter must all be the same.

  Detailed steps:
    Note:
      * replace the ... in the directories below with your own paths
      * set the correct parameters when running the tools below; refer to the
        corresponding SDK documentation for details on each tool.

  1. Prepare your own customer-provided model (trained with Caffe or TensorFlow).
  2. Convert your model to the Xilinx model format with the conversion tool
     provided by the SDK.
  3. Use the dnnc tool to build your model into an .elf file.
     3.1. Caffe model:
      dnnc --prototxt=/home/.../yolov3_voc_416/deploy.prototxt 
        --caffemodel=/home/..../deploy.caffemodel --output_dir=/home/.... 
        --net_name=yolov3_voc_416 --dpu=4096FA --cpu_arch=arm64 --mode=normal
     3.2. TensorFlow model:
      dnnc --parser=tensorflow --frozen_pb=/home/.../deploy.pb --output_dir=/home/... 
        --net_name=ssd_your_kern_name --dpu=4096FA --cpu_arch=arm64 --mode=normal
  4. Build your model into a library. This step needs the cross-compilation toolchain:
      /home/.../aarch64-linux-gnu-g++ -nostdlib -fPIC -shared 
        /home/.../dpu_your_own_model.elf -o 
        libdpumodelssd_your_own_model.so
  5. Place the built library in /usr/lib or another library path that can be accessed.
  6. Prepare your own prototxt file and place it in /etc/XILINX_AI_SDK.conf.d/.
     Please refer to the documentation on how to modify this file.

  test pic: use "sample_yolov3.jpg" for this test.
*/

int main(int argc, char *argv[])
{
  if (argc < 2) {
    std::cout << "usage : " << argv[0] << " <img_url> " << std::endl;
    return 1;  // an image path is required
  }

  Mat img = cv::imread(argv[1]);
  if(img.empty()) {
      cerr << "cannot load " << argv[1] << endl;
      abort();
  }
  auto yolo = xilinx::yolov3::YOLOv3::create_ex("yolov3_voc_deconv", true);
  auto results = yolo->run(img);
  std::cout << "results.size " << results.bboxes.size() << " " //
            << std::endl;
  for(auto &box : results.bboxes){
      int label = box.label;
      float xmin = box.x * img.cols + 1;
      float ymin = box.y * img.rows + 1;
      float xmax = xmin + box.width * img.cols;
      float ymax = ymin + box.height * img.rows;
      if(xmin < 0.) xmin = 1.;
      if(ymin < 0.) ymin = 1.;
      if(xmax > img.cols) xmax = img.cols;
      if(ymax > img.rows) ymax = img.rows;
      float confidence = box.score;

      cout << "RESULT: " << label << "\t" << xmin << "\t" << ymin << "\t"
           << xmax << "\t" << ymax << "\t" << confidence << "\n";
      rectangle(img, Point(xmin, ymin), Point(xmax, ymax), Scalar(0, 255, 0),
                  1, 1, 0);
  }
  imwrite("sample_yolov3_customer_provided_result.jpg", img);
  return 0;
}

8 Replies
liangxinyu86 (Contributor)

I have the same problem and don't know how to solve it.

anz162112 (Contributor)

I am facing the same problem as well. Any help on this?

Regards,
Shikha Goel
(Ph.D., IIT Delhi)
anz162112 (Contributor)

Any help on this?

I have been stuck on it for the past few days. Please help and provide a solution.

Regards,
Shikha Goel
(Ph.D., IIT Delhi)
jsi_wmz (Contributor)

I don't know either. I have tried different ways to generate the model file and build it into a .so, but this problem always occurs. Whenever I don't use one of the .so files that the AI SDK itself provides, I hit this error!

Is there anyone from Xilinx who knows about this issue, or who can at least explain what the error message means?

easmith5 (Participant) [Accepted Solution]

I had this error as well; I believe it was fixed by changing the kernel name passed into dpuLoadKernel(). I have a my_model_0.elf, and I needed to load the kernel named 'my_model_0'; I was originally using just 'my_model'. I am still getting other errors, so I'm not sure whether this will fix it for everyone. A rough sketch of what I mean is below.
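
For reference, here is a minimal sketch of the N2Cube loading sequence I mean (the "my_model_0" name is just my model's kernel name from dnnc; input setup and error handling are omitted):

#include <dnndk/dnndk.h>

int main() {
  dpuOpen();  // attach to the DPU driver

  // The name passed here must be the kernel name that dnnc generated
  // (for me the net_name with a "_0" suffix); otherwise dpuLoadKernel()
  // is where the "Faild to load section from hybrid DPU file" error
  // shows up for me.
  DPUKernel *kernel = dpuLoadKernel("my_model_0");

  DPUTask *task = dpuCreateTask(kernel, T_MODE_NORMAL);
  // ... set the input tensors and call dpuRunTask(task) here ...

  dpuDestroyTask(task);
  dpuDestroyKernel(kernel);
  dpuClose();
  return 0;
}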

jsi_wmz (Contributor)

I have solved this problem according to the answers in https://forums.xilinx.com/t5/Deephi-DNNDK/Custom-NN-Error-when-running-executable/m-p/983192#M1186. It is exactly a name-mismatch problem. The key point is that when you generate the .elf with dnnc, the "net_name" must match the prototxt, the lib*.so, and the kernel name you load in the script, as sketched below.
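
For illustration, here is a stripped-down version of the sample with comments marking every place the same name has to appear (the create_ex() call, the result fields, and the test image name are taken from the sample above; the paths in the comments are just where things end up on my board):

#include <xilinx/yolov3/yolov3.hpp>
#include <opencv2/opencv.hpp>

// Where the name "yolov3_voc_deconv" has to be consistent:
//   dnnc:        --net_name=yolov3_voc_deconv       -> dpu_yolov3_voc_deconv.elf
//   cross g++:   ... dpu_yolov3_voc_deconv.elf -o libdpumodelyolov3_voc_deconv.so
//                (placed in /usr/lib)
//   AI SDK conf: /etc/XILINX_AI_SDK.conf.d/yolov3_voc_deconv.prototxt, whose
//                kernel name must be the one dnnc actually generated
//   application: the same string passed to create_ex() below
int main() {
  cv::Mat img = cv::imread("sample_yolov3.jpg");
  auto yolo = xilinx::yolov3::YOLOv3::create_ex("yolov3_voc_deconv", true);
  auto results = yolo->run(img);
  return results.bboxes.empty() ? 1 : 0;
}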

anz162112 (Contributor)

But this cannot be changed when using the AI SDK; it is only possible when using DNNDK directly to run the model. In the AI SDK we simply write this:

auto det = xilinx::classification::Classification::create_ex("resnet_50");

Also,

setenv("DPU_COMPILATIONMODE", "1",1). is set when using DNNDK not using AI SDK.

Regards,
Shikha Goel
(Ph.D., IIT Delhi)
w_qs (Visitor)

After many tries, I think the problem is as follows.

Look at the prototxt files under /etc/XILINX_AI_SDK.conf.d/:

cat /etc/XILINX_AI_SDK.conf.d/resnet_50.prototxt
model {
  name : "resnet_50"
  kernel {
    name: "resnet_50"
    input: "conv1"
    output: "fc1000"
    mean: 104.0
    mean: 107.0
    mean: 123.0
    scale: 1.0
    scale: 1.0
    scale: 1.0
  }
  model_type : CLASSIFICATION
  classification_param {
    top_k : 5
    test_accuracy : false
  }
}

You can see there is a model name "resnet_50" and a kernel name "resnet_50". I think the model name must match the *.so name produced by gcc, and the kernel name must match the *.elf name produced by dnnc.

My .elf is named resnet_50_0.elf, so I changed the kernel name to "resnet_50_0", and that seemed to solve the problem at this stage.
