11-19-2018 11:58 AM
After quantizing the model, I get int8 or int16 (.json) files.
What is the importance of the quantizer in the Xilinx ml-suite, since we can get the same int8 or int16 model without quantizing?
If there is a difference between them, how can I find it?
11-19-2018 11:38 PM
In addition to the above post, I would like to add some more details.
-> Say I have an int8 TensorFlow model.
-> I want to use this int8 model directly, without quantization.
-> Why should I perform the quantize step if the input model to the quantizer is already int8?
-> If the quantization step must happen, then I want to know the quantizer's role in regenerating the new int8 quantized model.
11-20-2018 01:28 PM
The quantizer in ml-suite that quantizes based on calibration does not change your model file, i.e., the weights remain unchanged. Rather, the quantizer prepares parameters necessary to execute the provided model using the specified target datatypes, i.e., int8 or int16.
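To illustrate the idea of calibration without weight modification, here is a minimal sketch in NumPy. It is not the actual ml-suite implementation; the function name and the power-of-two-free scaling are assumptions for illustration only. The point is that calibration observes the dynamic range of each layer on representative inputs and emits per-layer scale parameters, while the weights themselves are never rewritten.

```python
import numpy as np

def calibrate_layer_scale(activations, num_bits=8):
    """Pick a scale so the observed activation range fits the signed
    integer range without overflow. Illustrative only -- the real
    ml-suite quantizer is considerably more involved."""
    max_abs = np.max(np.abs(activations))
    int_max = 2 ** (num_bits - 1) - 1  # 127 for int8
    return int_max / max_abs

# Calibration pass: run representative inputs, record per-layer scales.
# The model's weights are left untouched; only scales are emitted.
rng = np.random.default_rng(0)
layer_outputs = {
    "conv1": rng.normal(0, 3.0, 1000),   # wide dynamic range
    "conv2": rng.normal(0, 0.5, 1000),   # narrow dynamic range
}
scales = {name: calibrate_layer_scale(act) for name, act in layer_outputs.items()}
```

A layer with a narrower activation range gets a larger scale, so it uses more of the int8 range; this is what the per-layer calibration buys you over a single global scale.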
I understand your desire not to requantize something that's already been quantized by a third party, but that is not officially supported at this moment because the ml-suite's quantizer takes into consideration ml-suite hardware and software in ways that third parties do not.
My recommendation is to start from your existing quantized model's parent floating-point model.
11-25-2018 10:25 PM
Thank you for your reply. I need some more clarification regarding quantizer in ml-suite.
1. After quantizing the model, since the weights remain the same, what changes does the ml-suite quantizer perform on the model to support the ml-suite hardware and software?
2. What is the effect on the accuracy of the quantized model, due to quantizing the model to support the ml-suite hardware and software?
11-26-2018 10:25 AM
Testing will give you the answer to the accuracy question, as one size does not fit all. If you can indeed get the same accuracy with tighter quantization, that would be best, as it reduces hardware usage and potentially improves speed.
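One quick way to get a feel for the accuracy cost before touching hardware is to round-trip the float weights through int8 and measure the error. This is a generic quantize-dequantize sketch, not an ml-suite tool; the helper name is made up for illustration.

```python
import numpy as np

def quant_dequant(x, scale):
    """Round-trip float values through int8 to expose quantization error
    (illustrative; a real flow would also quantize activations)."""
    q = np.clip(np.round(x * scale), -128, 127)
    return q / scale

rng = np.random.default_rng(1)
weights = rng.normal(0, 1.0, 10_000)
scale = 127 / np.max(np.abs(weights))
err = np.max(np.abs(weights - quant_dequant(weights, scale)))
# err is bounded by half a quantization step, i.e. 0.5 / scale
```

If the end-to-end accuracy of the network holds up under this kind of perturbation on every layer, tighter quantization is likely to be viable.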
Hope that helps
11-27-2018 01:57 AM
1> Thank you for your reply, but my desire is not to check accuracy directly on the FPGA.
2> My actual intention is to get a .pb or .caffemodel file after quantizing the model, so that I can reuse the quantized model in the TensorFlow or Caffe framework.
3> The Xilinx ml-suite does not provide a .pb or .caffemodel file after quantizing the model, so I cannot reuse the quantized model in the TensorFlow or Caffe framework.
11-27-2018 02:24 AM
In addition to the above reply, I would like to add one more point.
4> After quantizing the model, since the weights remain the same, what changes does the ml-suite quantizer perform on the model to support the ml-suite hardware and software?
01-14-2019 01:00 PM
The quantizer generates a set of scalar multipliers per layer, which are passed to the xfdnn runtime. As the runtime executes each layer, it will write these parameters to the hardware registers. The hardware will perform the multiplications necessary to prevent overflows. Today, you can't use the quantization parameters without the FPGA. We are considering releasing some emulation software in the future that would allow you to make use of these parameters without the FPGA. However, these parameters are organized in a way specific to the XDNN hardware accelerator, and are not generic.
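The scalar-multiplier mechanism described above can be sketched in software as follows. This is a generic fixed-point pattern, not the XDNN register interface; the function name and the `multiplier`/`shift` parameterization are assumptions for illustration.

```python
import numpy as np

def int8_layer(x_q, w_q, multiplier, shift):
    """Illustrative fixed-point layer: int8 inputs and weights are
    accumulated in a wide int32 register, then a per-layer scalar
    multiplier plus right shift rescales the result back into int8
    range, preventing overflow in the next layer."""
    acc = x_q.astype(np.int32) @ w_q.astype(np.int32)  # wide accumulator
    scaled = (acc * multiplier) >> shift               # hardware-style rescale
    return np.clip(scaled, -128, 127).astype(np.int8)

# A single 2-input, 1-output "layer" with a per-layer rescale of 1/16.
x = np.array([[100, -50]], dtype=np.int8)
w = np.array([[30], [40]], dtype=np.int8)
y = int8_layer(x, w, multiplier=1, shift=4)
```

The per-layer `(multiplier, shift)` pair plays the role of the scalar parameters the quantizer emits: the int32 accumulator here reaches 1000, well outside int8 range, and the rescale brings it back in bounds before the next layer consumes it.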