Visitor i_am_here
Registered: ‎04-23-2019

decent_q in parallel (CPU)

Since the (official) Anaconda repo doesn't have cuDNN version 7.4.1 (needed by DNNDK on Ubuntu 18.04), I have to fall back to the CPU version of Xilinx TensorFlow 1.12 and the decent_q program. As it is, quantizing my ResNet101 model takes about 14 hours on a single thread. I happen to have a PC with more cores than I care to count, which should be really useful in this case. So, is there a way to make decent_q quantize run in parallel on several cores?

Design-wise, it should be straightforward, since decent_q already uses orthogonal batches and (supposedly) the only part that needs to be serialized is the update of the resulting compressed model.
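
To make the idea concrete, here is a minimal Python sketch of the pattern I have in mind: independent batches are processed by a pool of workers, and only the final merge is serialized. This is not how decent_q is implemented (I have no view into its internals). The names `batch_range`, `merge_ranges`, and `calibrate` are hypothetical, and the min/max range is only a stand-in for whatever per-batch statistics the quantizer actually collects.

```python
import concurrent.futures as cf


def batch_range(batch):
    """Per-batch work (hypothetical stand-in): the (min, max) of the values."""
    return min(batch), max(batch)


def merge_ranges(ranges):
    """Serialized step: fold the per-batch results into one global range."""
    lows, highs = zip(*ranges)
    return min(lows), max(highs)


def calibrate(batches, workers=4, executor_cls=cf.ProcessPoolExecutor):
    """Process independent batches in parallel, then merge in the parent."""
    # The batches do not depend on each other, so they can be mapped
    # over a pool of worker processes in any order.
    with executor_cls(max_workers=workers) as pool:
        per_batch = list(pool.map(batch_range, batches))
    # Only this merge runs serially, in the calling process.
    return merge_ranges(per_batch)
```

In a script, `calibrate(batches, workers=8)` should be called from under an `if __name__ == "__main__":` guard so that worker processes can be started safely on every platform. Whether decent_q's real workload splits this cleanly is exactly the open question.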
