
DNNDK v3.1 and DPU 3.0: Input image datatype

Hello:

I'm trying to run several model-zoo networks on a ZedBoard using the DNNDK v3.1 package, with a PetaLinux project that includes the DPU 3.0. So far, I have tried tf_resnet_v1_50 and tf_inception_v1, v3 and v4.

I'm able to run the resnet50 model with no problems. My workflow consists of pre-processing (bgr2rgb conversion, central crop and resizing to 224x224, roughly as sketched below), quantization of the model with 8-bit weight and activation bit widths, compilation for arm32 and execution on the board. The pre-processing for resnet is specified in the readme file of the .zip you download from the model-zoo repository. The model runs and is accurate.
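A minimal sketch of what that pre-processing looks like, assuming OpenCV (the function name and the plain central-crop-then-resize steps are just for illustration, following the readme):

#include <algorithm>
#include <opencv2/opencv.hpp>

using namespace cv;

// bgr2rgb conversion, central crop and resize to 224x224, as described above
Mat preprocess_resnet50(const Mat& bgr) {
  Mat rgb;
  cvtColor(bgr, rgb, COLOR_BGR2RGB);

  int side = std::min(rgb.rows, rgb.cols);             // central square crop
  Rect roi((rgb.cols - side) / 2, (rgb.rows - side) / 2, side, side);

  Mat resized;
  resize(rgb(roi), resized, Size(224, 224));           // network input size
  return resized;
}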

On the other hand, the inception networks are not working. The main difference from the previous workflow is that you have to add normalization to the range -1 to 1 in the image pre-processing step, as specified in their readme files. The quantization works fine, but when running the models on the board the accuracy is poor. I believe the problem is the normalization step: to perform it, the pixel datatype of the images has to be changed to floating point, and the network doesn't classify the images correctly when I feed it this format. Is this because the DPU uses signed integers internally for all the variables it handles?

I have tried skipping the normalization step in the board application, but this doesn't improve the accuracy. I have also tried normalization between -1 and 1 while keeping the datatype as signed integer, which results in images filled only with the values -1, 0 and 1 (see the snippet just below). This doesn't work either.
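Concretely, what I mean by that is something like the following (pixel being an 8-bit value between 0 and 255):

#include <cstdint>

// Normalizing to [-1, 1] but storing straight into int8 truncates away almost
// all of the information: every pixel collapses to -1, 0 or 1.
int8_t normalize_to_int8(uint8_t pixel) {
  return (int8_t)((pixel / 255.0f - 0.5f) * 2.0f);
}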

My question is: what datatype do the pixels of the input images need to be in when fed to the DPU? Do they have to be signed integers, which is the datatype the DPU works with, or can they be floats? It makes sense to me that they have to be 8-bit signed integers, but then, as far as I can tell, it wouldn't be possible to run any inception model with proper accuracy. Am I missing something?

Any help is welcome, thank you in advance!

 

Moderator

Hi xrancano@alumnos.uvigo.es ,

 

The [-1, 1] range normalization should be supported. Please find the reference code here:

https://github.com/Xilinx/Vitis-AI-Tutorials/blob/Keras-GoogleNet-ResNet/files/target_zcu102/cifar10/miniGoogleNet/src/top5_tf_main.cc

void normalize_image(const Mat& image, int8_t* data, float scale, float* mean) {
  // Maps each 8-bit pixel to [-1, 1], multiplies by the DPU input tensor scale
  // and stores the result in the int8 input buffer. The mean parameter is unused here.
  for (int i = 0; i < 3; ++i) {
    for (int j = 0; j < image.rows; ++j) {
      for (int k = 0; k < image.cols; ++k) {
        //data[j*image.rows*3+k*3+2-i] = (float(image.at<Vec3b>(j,k)[i])/255.0 ) * scale; //DB: original code
        data[j*image.cols*3+k*3+i] = (float(image.at<Vec3b>(j,k)[i])/255.0f - 0.5f)*2 * scale;
      }
    }
  }
}
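For context, a rough sketch of how that function is typically fed from the DNNDK API (kernel, INPUT_NODE and image_path are placeholders from your own application; the scale is the value returned by dpuGetInputTensorScale):

DPUTask* task  = dpuCreateTask(kernel, 0);
int8_t*  input = dpuGetInputTensorAddress(task, INPUT_NODE);  // int8 input buffer
float    scale = dpuGetInputTensorScale(task, INPUT_NODE);    // scale chosen by the quantizer

Mat img = imread(image_path);
resize(img, img, Size(224, 224));                             // match the model input size
normalize_image(img, input, scale, /*mean=*/nullptr);         // fill the buffer as above
dpuRunTask(task);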
Best Regards,
Jason

Hello @jasonwu and thank you for the reply:

I have tried running tf_inception_v1 using that function for the normalization, but the accuracy is still bad. The network doesn't generate good predictions.

I have tried the inception_v3 model with the Caffe framework and there is no problem. The pre-processing for that framework doesn't require the normalization step; that is the only difference I found.

This makes me believe that the problem might be that normalization between -1 and 1 is meant to be done with a float datatype, so that the pixels can take values across that range, but since the normalization is applied to int8_t the pixels can only take the values -1, 0 and 1. The problem is that if you use the float type the network doesn't work either, and that might be because the DPU represents its variables as signed 8-bit integers.

This might not be the actual issue, but I cannot find any other explanation.

Regards.

Moderator

Hi xrancano@alumnos.uvigo.es ,

 

Yes, locating the root cause may take a lot of time.

BTW, normalization to (0, 1) or (-1, 1) should not have a big impact on the final accuracy.

If possible, you could try (0, 1) normalization (a variant like the one sketched below). If that works it could be a workaround.
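For reference, that would look roughly like the normalize_image() code above, just without the -0.5/x2 shift (a sketch, not tested):

// Variant of normalize_image() above that maps pixels to (0, 1) instead of (-1, 1)
void normalize_image_0_1(const Mat& image, int8_t* data, float scale) {
  for (int i = 0; i < 3; ++i)
    for (int j = 0; j < image.rows; ++j)
      for (int k = 0; k < image.cols; ++k)
        data[j*image.cols*3 + k*3 + i] =
            (int8_t)(float(image.at<Vec3b>(j, k)[i]) / 255.0f * scale);
}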

Best Regards,
Jason
Participant

Hi,

I think I'm running into a similar issue (see https://forums.xilinx.com/t5/AI-and-Vitis-AI/preprocessing-for-tf-ssdmobilenetv2-coco-300-300-3-75G/td-p/1122584)! I also can't wrap my head around how this is supposed to work... If I run

float inputTensorScale = dpuGetInputTensorScale(task_conv, (char *)CONV_INPUT_NODE);
cout << inputTensorScale << endl;

It prints 64, which is strange, because then the DPU would only use half its range (-64 to 64 instead of the full -128 to 127 range).
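My current reading, which may well be wrong, is that the scale is simply the fixed-point factor applied when filling the input buffer, something like:

#include <cmath>
#include <cstdint>

// If the quantizer chose 6 fractional bits, the reported scale is 2^6 = 64, so a
// value normalized to [-1, 1] only ever maps to int8 values in [-64, 64].
int8_t quantize_pixel(uint8_t pixel, float inputTensorScale /* 64 here */) {
  float normalized = pixel / 255.0f * 2.0f - 1.0f;             // [-1, 1]
  return (int8_t)std::lround(normalized * inputTensorScale);   // [-64, 64]
}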
If anyone has more info on this, I would be really happy to hear it.

Cheers

Participant

@jasonwu I have also tried the code you provided (and many other variations) with no success. Do you have any info on why the input scale for the mobilenetv2 SSD is only 64 if it is stated that the input should be normalized between -1 and 1?

Moderator

Hi @rbriegel ,

 

Thanks for your input.

The input scale value is determined during quantization, so you need to read the value out during deployment.

From the original post, you are working on a classification application with a different normalization.

And MobileNet-SSD is more of a detection-plus-classification application; this is a more complex network.

Have you resolved the previous problem? If not, I would suggest working with a simple network first.

And BTW, there are critical updates from DNNDK 3.1 to Vitis AI 1.1. Given the current release status and support policy, I would suggest selecting MPSoC as the platform and updating to Vitis AI 1.1 if possible.

 

Best Regards,
Jason