Visitor i_am_here
Registered: ‎04-23-2019

decent_q in parallel (CPU)

Since the (official) Anaconda repo doesn't have cuDNN version 7.4.1 (needed by DNNDK on Ubuntu 18.04), I have to fall back to the CPU version of Xilinx TensorFlow 1.12 and the decent_q program. As it stands, quantizing my ResNet101 model takes about 14 hours on a single thread. I happen to have a PC with more cores than I care to count, which should be really useful in this case. So, is there a way to make decent_q quantize run in parallel on several cores?

Design-wise, it should be straightforward, since decent_q already uses orthogonal (independent) batches and (supposedly) the only part that needs to be serialized is the update of the resulting compressed model.
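
To illustrate the pattern I have in mind, here is a minimal sketch of the generic fan-out/fan-in idea. It is not how decent_q is implemented (I don't know its internals). The function names and the min/max statistic are made up for illustration: each batch is processed independently on its own worker, and only the final merge runs serially.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def batch_range(batch):
    """Hypothetical per-batch work: depends on this batch only."""
    return min(batch), max(batch)


def merge_ranges(ranges):
    """Serial step: fold the per-batch results into one global range."""
    lo, hi = float("inf"), float("-inf")
    for b_lo, b_hi in ranges:
        lo, hi = min(lo, b_lo), max(hi, b_hi)
    return lo, hi


def calibrate(batches, workers=4, executor_cls=ProcessPoolExecutor):
    # Fan out: independent batches are distributed over the workers.
    with executor_cls(max_workers=workers) as pool:
        per_batch = list(pool.map(batch_range, batches))
    # Fan in: only this merge has to run serially.
    return merge_ranges(per_batch)


if __name__ == "__main__":
    batches = [[0.5, -1.25, 3.0], [2.0, 4.5, -0.75], [1.0, 0.0, -2.5]]
    parallel = calibrate(batches, workers=3)
    # Same reduction on threads, as an in-process sanity check.
    check = calibrate(batches, workers=3, executor_cls=ThreadPoolExecutor)
    print(parallel, check)
```

Whether something like this could be applied to decent_q depends on whether its per-batch work really is independent, which is only my assumption.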
