
ML Suite v1.4 Released!


ML Suite v1.4 was recently released. This update brings many upgrades and new features, and it begins the Cloud-to-Edge unification process: ML Suite now uses Decent_q quantization, and support for the xfDNN quantizer is deprecated. Version 1.4 also brings back Docker support to facilitate easier proof-of-concept evaluations in the Data Center, and it improves the runtime APIs.

ML Suite v1.4 has also deprecated support for xDNNv2 and now supports only xDNNv3 overlays.

Platform Support

ML Suite v1.4 initially supports the following platforms:

  • AWS
  • Nimbix
  • Alveo U200
  • Alveo U250
  • VCU1525

Coming soon:

  • Alveo U280

 

Introducing Decent_q Support

ML Suite previously used the xfDNN quantizer. This Python-based quantizer performed a recalibration quantization strategy; while it was fast at quantizing a 32-bit model to int8, in some cases it produced int8 models with more than 1-2% accuracy loss relative to the base model.
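As background, int8 quantization maps floating-point values onto the 8-bit integer range via a scale factor. The sketch below shows the general idea in its simplest symmetric form; it is purely illustrative and is not the xfDNN or Decent_q algorithm:

```python
def quantize_int8(values):
    """Toy symmetric int8 quantization (illustrative only): scale by the
    max magnitude, round, and clamp to the int8 range [-128, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    quantized = [max(-128, min(127, round(v / scale))) for v in values]
    return quantized, scale

q, scale = quantize_int8([0.5, -1.27, 0.02])
# Dequantizing (q[i] * scale) approximately recovers the original values.
```

Real calibration-based quantizers refine the scale using activation statistics from sample data, which is where the small accuracy differences between quantizers come from.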

Decent_q has been used primarily with the DPU (part of the Edge AI Platform) and in embedded use cases, but it is now available within ML Suite, targeting the xDNN overlay. It delivers lower accuracy loss relative to 32-bit models. The table below lists some common models and the accuracy difference between the original model and the quantized int8 model.

 

Networks              Float32 Top1   Float32 Top5   Int8 Top1   ΔTop1     Int8 Top5   ΔTop5
Inception_v1          66.90%         87.68%         66.62%      -0.28%    87.58%      -0.10%
Inception_v2          72.78%         91.04%         72.40%      -0.38%    90.82%      -0.23%
Inception_v3          77.01%         93.29%         76.56%      -0.45%    93.00%      -0.29%
Inception_v4          79.74%         94.80%         79.42%      -0.32%    94.64%      -0.16%
ResNet-50             74.76%         92.09%         74.59%      -0.17%    91.95%      -0.14%
VGG16                 70.97%         89.85%         70.77%      -0.20%    89.76%      -0.09%
Inception-ResNet-v2   79.95%         95.13%         79.45%      -0.51%    94.97%      -0.16%

 
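The Δ columns above are the quantized accuracy minus the Float32 baseline, in percentage points (small discrepancies can come from rounding in the published figures). A quick sanity check in Python, using values copied from the table:

```python
# model: (Float32 baseline Top1, int8 quantized Top1), in percent,
# copied from the accuracy table above.
TOP1 = {
    "Inception_v1": (66.90, 66.62),
    "Inception_v3": (77.01, 76.56),
    "ResNet-50": (74.76, 74.59),
}

def top1_delta(baseline, quantized):
    """Accuracy delta in percentage points (negative = accuracy loss)."""
    return round(quantized - baseline, 2)

for name, (fp32, int8) in TOP1.items():
    print(f"{name}: {top1_delta(fp32, int8):+.2f} points")
```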

ML Suite and xfDNN have been updated to natively support Decent_q, so no extra steps are required to quantize and deploy a model. ML Suite v1.4 supports Decent_q only for Caffe models; TensorFlow support is coming in the next release.

Enhanced Model Support

ML Suite traditionally included only basic reference models to help users run through a few examples and Jupyter notebook tutorials. Version 1.4 adds a new set of application-specific models that bring new AI/ML users closer to their end applications and showcase AI/ML capabilities in those applications. The newly enabled models are listed below.

 

Application       Function                              Algorithm
Face              Face detection                        SSD, DenseBox
Face              Landmark localization                 Coordinates regression
Face              Face recognition                      ResNet + triplet / A-softmax loss
Face              Face attributes recognition           Classification and regression
Pedestrian        Pedestrian detection (crowd volume)   SSD
Pedestrian        Pose estimation                       Coordinates regression
Pedestrian        Person re-identification              ResNet + loss fusion
Video Analytics   Object detection                      SSD, RefineDet
Video Analytics   Pedestrian attributes recognition     GoogLeNet
Video Analytics   Car attributes recognition            GoogLeNet
Video Analytics   Car logo detection                    DenseBox
Video Analytics   Car logo recognition                  GoogLeNet + loss fusion
Video Analytics   License plate detection               Modified DenseBox
Video Analytics   License plate recognition             GoogLeNet + multi-task learning

 

Along with these new application models, reference models for ResNet-50, Inception v1/3/4, SSD, and YOLOv2 are also included.

 

xfDNN Runtime Enhancements – Support for pycaffe

The xfDNN runtime now supports pycaffe. This addition makes it easier to deploy custom models with layers that are not fully supported in xDNN. With the addition of the xfDNN subgraph tool, xfDNN now automatically splits the graph, separating the xDNN and CPU parts; the layers that run on the CPU are executed through pycaffe's APIs. When the compiler is run, its outputs are fed into the xfDNN subgraph tool, which parses the network and creates subgraphs wherever unsupported layers are encountered, so those layers can be run through pycaffe on the CPU. This feature makes it easier to deploy more networks without manually executing CPU layers such as Softmax or FC.
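The graph-splitting step can be pictured as a partition over the ordered layer list: consecutive layers supported by xDNN are grouped into an FPGA subgraph, and unsupported runs fall back to the CPU. The sketch below is hypothetical (the supported-layer set and function name are illustrative, not the actual xfDNN subgraph API):

```python
# Illustrative subset of layer types the accelerator might handle;
# the real supported set is defined by the xDNN overlay, not this list.
XDNN_SUPPORTED = {"Convolution", "Pooling", "ReLU", "Eltwise"}

def partition(layers):
    """Split an ordered layer-type list into (device, [layers]) subgraphs,
    grouping consecutive layers that run on the same device."""
    subgraphs = []
    for layer_type in layers:
        device = "FPGA" if layer_type in XDNN_SUPPORTED else "CPU"
        if subgraphs and subgraphs[-1][0] == device:
            subgraphs[-1][1].append(layer_type)
        else:
            subgraphs.append((device, [layer_type]))
    return subgraphs

net = ["Convolution", "ReLU", "Pooling", "InnerProduct", "Softmax"]
print(partition(net))
# The trailing FC (InnerProduct) and Softmax land in a CPU subgraph,
# which is what pycaffe executes in the real flow.
```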

In code, the flow looks like:

    Quantize(prototxt, caffemodel)
    Compile()
    Cut(prototxt)
    Infer()

This flow also includes flags to run only on the CPU, allowing FPGA/CPU accuracy and speed comparisons for benchmarking and evaluation, with both single-image and streaming-image support.
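A side-by-side speed comparison only needs the same timing harness applied to the CPU-only and FPGA inference paths. The harness below is a generic, hypothetical example and is not part of the ML Suite API:

```python
import time

def mean_latency_ms(infer_fn, images, runs=10):
    """Time an inference callable over a batch of images and return the
    mean per-image latency in milliseconds."""
    start = time.perf_counter()
    for _ in range(runs):
        for img in images:
            infer_fn(img)
    elapsed = time.perf_counter() - start
    return elapsed / (runs * len(images)) * 1000.0

# Pass the CPU-only and FPGA inference callables (stand-in shown here)
# to the same harness to get comparable numbers.
cpu_ms = mean_latency_ms(lambda img: sum(img), [[1, 2, 3]] * 4)
```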

Get started with ML Suite 1.4 at the GitHub Page