05-06-2020 01:38 AM
As the new yolov4 from Darknet is available, does Xilinx support deploying the model on an FPGA using the DNNDK tools?
In the Darknet2caffe conversion tutorial, I can see it only supports up to yolov3. Can you guide me on how to get the new yolov4 version running on an FPGA using the DNNDK tools?
05-06-2020 05:33 PM
I have not tried deploying Yolov4 on Vitis-AI (or DNNDK) yet, but there will be one issue to be aware of if converting to Caffe.
There is a mismatch between Caffe and Darknet for maxpool layers of stride 1. You will get a data blob error in Caffe.
Yolov3 does not have such a layer, but it looks like Yolov4 has 3 of these. Yolov3Tiny has the same issue BTW.
### SPP ###
This means that if you want to convert to Caffe you are going to have to modify these layers to use a different stride, or remove them altogether, and then retrain in Darknet.
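To see where the mismatch comes from, here is a rough sketch comparing the two output-size formulas (the Yolov3Tiny size-2, stride-1 maxpool on a 13x13 blob is used as the example; the formulas are the standard Darknet/Caffe ones):

```python
import math

def caffe_pool_out(h, kernel, stride=1, pad=0):
    # Caffe pooling: pooled = ceil((H + 2*pad - kernel) / stride) + 1
    return math.ceil((h + 2 * pad - kernel) / stride) + 1

def darknet_pool_out(h, kernel, stride=1):
    # Darknet pads a stride-1 maxpool so the output keeps the input size
    return h if stride == 1 else h // stride

h = 13
print(darknet_pool_out(h, kernel=2))       # 13
print(caffe_pool_out(h, kernel=2, pad=0))  # 12 -- too small
print(caffe_pool_out(h, kernel=2, pad=1))  # 14 -- too large
```

No symmetric Caffe pad reproduces Darknet's same-size output for an even kernel with stride 1, which is why Caffe reports a data blob size error.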
There is a way to convert Darknet to Keras, and this avoids the maxpool issue: https://github.com/qqwweee/keras-yolo.
I have used this before with Vitis-AI, DNNDK, and Yolov3, and it works. You can then convert to TensorFlow and freeze the model. There is an example of how to do the Keras-to-TensorFlow conversion at https://github.com/Xilinx/Vitis-AI-Tutorials
I don't know, however, if the conversion will work for Yolov4. It's something I want to try myself soon.
The AlexeyAB site https://github.com/AlexeyAB/darknet has 2 links for converting from Darknet to TensorFlow directly, but I don't have any experience with those, or whether they would work for Yolov4.
05-06-2020 06:27 PM
After looking more closely I see that Yolov4 is using a mish activation layer.
This is not supported on the DPU. Only ReLU, ReLU6, or leaky ReLU are currently supported.
You would have to change all the activation layers to ReLU and retrain. I don't know what effect this would have on accuracy.
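For reference, here is a plain-Python sketch of the two activations (standard definitions; Darknet's leaky slope of 0.1 is assumed), which shows why they are not drop-in equivalents:

```python
import math

def mish(x):
    # mish(x) = x * tanh(softplus(x)), the activation Yolov4 uses
    return x * math.tanh(math.log1p(math.exp(x)))

def leaky_relu(x, slope=0.1):
    # Darknet's leaky activation
    return x if x > 0 else slope * x

# Both approach the identity for large positive x, but mish is smooth
# and takes small negative values for negative inputs instead of a
# fixed 0.1 slope, so retraining is needed after the swap.
print(round(mish(-0.5), 4), leaky_relu(-0.5))
```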
05-07-2020 12:58 PM - edited 05-07-2020 12:59 PM
Thanks for the support.
I have found this git repo to convert from yolov4 to keras: https://github.com/Ma-Dan/keras-yolo4
The above repo is built based on the repo you mentioned in the previous message: https://github.com/qqwweee/keras-yolo.
On my PC I have 2019.1. Can you please try this on your side and check whether it works with Vitis?
If it works, then I can upgrade to 2019.2 and choose the method of converting yolov4 to Keras instead of yolov4 to Caffe.
05-08-2020 11:20 AM
There is another issue with 2 of the max_pooling layers. The maximum kernel size supported by the DPU is 8, but there are 2 kernels in the #SPP section of sizes 9 and 13.
These will either have to be removed, or the kernel sizes changed, and then the model retrained in Darknet.
I am going to try changing the sizes to 8, and then train against the Pascal data set to see what accuracy I can get.
Retraining will take a long time, but I can try converting with an early version of the weights to see if it can be compiled in Vitis-AI.
05-08-2020 12:59 PM - edited 05-12-2020 12:24 PM
Using either of the conversion repos gives me a similar error:
File "convert.py", line 173, in <module>
yolo4_model = Yolo4(score, iou, anchors_path, classes_path, model_path, weights_path)
File "convert.py", line 157, in __init__
File "convert.py", line 130, in load_yolo
buffer=weights_file.read(weights_size * 4))
TypeError: buffer is too small for requested array
Not sure if it's an issue with my weights file or the code, and it will probably be a while before I can look into this some more.
Update: this was due to an error in pointing to coco_classes.txt instead of voc_classes.txt. I was training on VOC.
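For anyone hitting the same "buffer is too small" error: the class count sets the number of filters in each YOLO head, so pointing the converter at the wrong classes file makes it expect a differently sized weights buffer. A rough sketch of the arithmetic (3 anchors per head assumed, as in the standard cfg):

```python
def head_filters(num_classes, anchors_per_head=3):
    # Each YOLO head predicts (x, y, w, h, objectness) + class scores
    # per anchor, so filters = anchors * (classes + 5)
    return anchors_per_head * (num_classes + 5)

print(head_filters(80))  # COCO
print(head_filters(20))  # VOC
# A weights file trained for 20 classes is smaller than the converter
# expects when told there are 80, hence the numpy buffer error.
```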
05-12-2020 12:17 PM - edited 05-12-2020 12:24 PM
I was able to successfully quantize and compile a modified version of Yolov4 in the Vitis-AI 1.0 tools.
I did not build a sw application and run on the DPU, because the weights I used are from early in the training stage and are not very accurate.
Here is the procedure I used.
1. Modified cfg/yolov4.cfg
Changed all mish activation layers to leaky.
Changed the size of any max_pool layer with size greater than 8 to 8. There were 3 layers that had this.
Disclaimer: At this point I do not know what effect on accuracy these changes will have. I am in the process of training on Pascal VOC, and it is not complete yet.
2. Retrain in Darknet
3. Use https://github.com/qqwweee/keras-yolo to transform to keras
4. Use https://github.com/Xilinx/Vitis-AI-Tutorials/tree/Keras-Freeze-with-Vitis-AI method 1 to convert to TensorFlow and freeze model.
5. Brought the frozen model into the Yolov3 tutorial https://github.com/Xilinx/Vitis-AI-Tutorials/tree/ML-at-Edge-yolov3 and quantized and compiled. As I mentioned, I did not go through the process of building a sw application.
I believe that the AlexeyAB version of Darknet does not implement letterboxing by default, so if you train without letterboxing you will need to do the appropriate image resizing when you deploy.
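The cfg edits in step 1 could be scripted rather than done by hand. This is a minimal sketch I have not run against a full yolov4.cfg, so treat it as illustrative only (the section tracking is there so that size= lines in [convolutional] blocks are left alone):

```python
def patch_cfg(cfg_text, max_kernel=8):
    """Change mish activations to leaky and clamp maxpool kernel sizes."""
    out, section = [], None
    for line in cfg_text.splitlines():
        s = line.strip()
        if s.startswith("["):
            section = s                      # track the current cfg section
        elif s == "activation=mish":
            line = "activation=leaky"
        elif section == "[maxpool]" and s.startswith("size="):
            size = int(s.split("=", 1)[1])
            if size > max_kernel:            # DPU kernel limit
                line = "size=%d" % max_kernel
        out.append(line)
    return "\n".join(out)

sample = "[convolutional]\nsize=3\nactivation=mish\n\n[maxpool]\nsize=13\nstride=1"
print(patch_cfg(sample))
```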
The above ML-at-Edge-yolov3 Tutorial assumes that letterboxing was done during the resize.
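For reference, letterbox resizing scales the image to fit the network input while preserving aspect ratio, then pads the remainder; a sketch of the geometry (416x416 network input assumed):

```python
def letterbox_dims(img_w, img_h, net_w=416, net_h=416):
    # Scale so the image fits inside the network input, then center it
    scale = min(net_w / img_w, net_h / img_h)
    new_w, new_h = int(img_w * scale), int(img_h * scale)
    pad_x, pad_y = (net_w - new_w) // 2, (net_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y

# A 640x480 image scales to 416x312 with 52 rows of padding top and
# bottom; predicted boxes must be shifted/scaled back accordingly.
print(letterbox_dims(640, 480))
```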
05-12-2020 01:16 PM
Thanks for the update. Then I will start training my yolov4 model with the changes you mentioned in the yolov4.cfg file along with Vitis-AI tools.
I am very much interested in accuracy. When your training is finished, can you please post the accuracy results compared with the original yolov4.cfg?
Regarding the maxpool layers, "Changed the size of any max_pool layer with size greater than 8 to 8": does this mean I have to keep size=8 in all three layers?
And also, in upcoming Vitis-AI tools will there be any support for layers like mish and these maxpool sizes?
05-12-2020 06:42 PM
You do not have to keep the size at 8; it's just that the kernel size cannot be larger than 8. You could try removing the layers, or reducing the kernel size to smaller than 8.
Here is what I did:
### SPP ###
I will post my accuracy results when they are complete; I am currently training on the Pascal data set. At present I only have access to a low-performance GPU machine, so it's going to take a long time.
Future support for these layers is being looked at, but there is no current schedule for when this would be available.
06-02-2020 04:56 PM - edited 06-03-2020 11:00 AM
My training is not yet done on Pascal VOC 416x416, but I wanted to report my results so far.
I used the pre-loaded yolov4.weights to initialize, with cfg mods described in my prior post.
Current best mAP is 85.26%.
06-10-2020 03:27 AM
Thank you very much for sharing the mAP results. In addition to yolov4, I am also working on a yolov3-tiny implementation using 2019.1 (DNNDK tools).
Yolov3-tiny also has maxpool layers, so can you please suggest how to modify the cfg file to make it work with good accuracy? I have already trained the model and tried to implement it on the FPGA, but the number of detections is very poor compared to the detections on the GPU.
For reference I have attached my "yolov3-tiny-pellet.cfg" file to this message. Thanks in advance.
06-16-2020 04:51 PM
You can convert TinyYolov3 directly to Keras, and then use TensorFlow in Vitis-AI. You do not have to make any layer modifications if you go this route.
06-30-2020 05:50 AM
I managed to convert the yolov4 to Keras ("yolov4.h5").
But while freezing the model with following command:
freeze_graph --input_graph=./method1/tf_infer_graph.pb \
I am getting an error like:
AssertionError: activation_4/Softmax is not in graph
Result of my keras2tf command is as shown below:
/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/keras/engine/saving.py:310: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '
Keras model information:
Input names : [<tf.Tensor 'input_1:0' shape=(?, ?, ?, 3) dtype=float32>]
Output names: [<tf.Tensor 'conv2d_94/BiasAdd:0' shape=(?, ?, ?, 18) dtype=float32>, <tf.Tensor 'conv2d_102/BiasAdd:0' shape=(?, ?, ?, 18) dtype=float32>, <tf.Tensor 'conv2d_110/BiasAdd:0' shape=(?, ?, ?, 18) dtype=float32>]
WARNING:tensorflow:From keras_2_tf.py:76: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
WARNING:tensorflow:From keras_2_tf.py:97: The name tf.train.write_graph is deprecated. Please use tf.io.write_graph instead.
Checkpoint saved as: ./method1/tf_chkpt.ckpt
Graph saved as : ./method1/tf_infer_graph.pb
Do I need to change the --output_node_names to 'conv2d_94', 'conv2d_102', 'conv2d_110'?
If yes, should the command look like this?
freeze_graph --input_graph=./method1/tf_infer_graph.pb \
--output_node_names='conv2d_94', 'conv2d_102', 'conv2d_110'
07-03-2020 06:57 PM
Yes, the output nodes need to be: output_node_names=conv2d_94/BiasAdd,conv2d_102/BiasAdd,conv2d_110/BiasAdd
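In general, the node names for freeze_graph are the Keras output tensor names from the keras2tf log with the ':0' tensor index stripped. A trivial helper to illustrate (names taken from the log above):

```python
def tensors_to_node_arg(tensor_names):
    # 'conv2d_94/BiasAdd:0' -> 'conv2d_94/BiasAdd';
    # join with commas for the --output_node_names argument
    return ",".join(name.split(":")[0] for name in tensor_names)

print(tensors_to_node_arg(
    ["conv2d_94/BiasAdd:0", "conv2d_102/BiasAdd:0", "conv2d_110/BiasAdd:0"]))
# conv2d_94/BiasAdd,conv2d_102/BiasAdd,conv2d_110/BiasAdd
```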
One thing to be aware of is that I am seeing accuracy issues after converting to Keras; it looks like the box sizes are not correct. So I am having doubts about using this flow.
You may need to use the Caffe Conversion flow, using the converter included from: https://www.xilinx.com/bin/public/openDownload?filename=caffe-xilinx-1.1.zip
If using the Caffe flow you will need to modify the 3 max pooling layers of stride 1, since there is a mismatch in how Caffe and Darknet handle this.
I do not know the best way to modify these layers yet. I am doing a test with just removing them, but my training is not yet done so I don't know how much accuracy I will lose. I should have results next week.
07-04-2020 04:51 AM
Converted YOLOv4 to tf following the keras->tf path with the modifications you stated, although I didn't retrain as I wanted to see if I managed to deploy the model to my zcu104 first.
I noticed the resulting graph is not very accurate compared to yolov3 (I should also say I only tested inference with a single image, just to see if the postprocessing code worked). I didn't care too much, as I thought training would solve it, but now that I see your post I'm worried it's not going to work even after retraining.
You also say you get inaccuracies with bbox sizes; I happen to get correct sizes, but fewer and wrongly labeled bboxes (I attached some test results after conversion to tf). Is this something you can expect to fix with training?
Thanks for the updates and the insight!
07-06-2020 04:15 PM
I think you are seeing the same issues that I had. I saw accuracy differences between Darknet and Keras in several areas. In some cases the box sizes were wrong, and in some cases they were close. The confidence levels seemed to go down a lot as well, across the board.
I think the Darknet to Caffe flow is the best way to go here. To make the model compatible with Caffe I just commented out the 3 maxpooling layers as follows:
### SPP ###
### End SPP ###
I am training against the VOC dataset, and so far the mAP is almost 86%, so removing the maxpooling layers is not lowering the accuracy too much. Training should be done tomorrow, and I will update after I go through the conversion process and run on actual ZCU104 hw.
07-22-2020 08:34 AM
Any update on the yolov4 testing? How much did the accuracy drop after the Darknet-to-Caffe conversion was done?
In addition to this, if I want to use yolov4-tiny, what is the best way to make changes in the .cfg file? I see that yolov4-tiny is using leaky whereas yolov4-full is using mish layers.
Here is the link to the cfg file of the yolov4-tiny I want to use: https://github.com/AlexeyAB/darknet/blob/master/cfg/yolov4-tiny-3l.cfg
Does this help us in achieving good accuracy after the Darknet-to-Caffe conversion is done?
Is the darknet2caffe conversion process available in Vitis AI 2019.2 or 2020.1? I am still using 2019.1.
07-23-2020 02:49 PM
I am seeing accuracy issues after quantization that may be related to how I modified the SPP section of the original Yolov4 model.
Instead of commenting the maxpooling layers out as I originally did, I am rerunning training with them back in, but with the kernel size set to 1 to be compatible with Caffe. Training will be done tomorrow, so I should have results tomorrow or Monday.
The Xilinx Model Zoo (https://github.com/Xilinx/AI-Model-Zoo) has a Xilinx Caffe distribution that you can download and install. This contains the Darknet-to-Caffe converter.
To run it you have to do the following:
Build Caffe: make -j
08-05-2020 03:36 PM
After converting to Caffe and quantizing with Vitis-AI 1.2, I am seeing good accuracy. The quantized mAP for VOC is 81.93%.
The mAP reported in Darknet was 82.55%; this was measured using the -letter_box and -points 11 options to be compatible with the Caffe processing.
I have tried a few different variations of modifying the maxpooling layers to be compatible with Caffe. The best results were seen by just commenting out the 3 max pooling layers in the SPP section.
I will test the accuracy on hw soon.
08-08-2020 12:41 AM
Thanks for the update; the mAP is at a decent level after conversion.
Can you please share how and when to use the -letter_box and -points 11 options?
In addition to this, can you share the modified cfg file as well?