07-29-2020 08:55 PM
07-31-2020 02:22 AM
Hi @natheesan ,
It seems that "vitis-ai-pytorch" environment can only be used on a GPU docker kernel.
I am afraid my GPU machine is tied up with a long training run. I will try to borrow a GPU card and test this on my side.
08-02-2020 09:36 PM
Hi @jasonwu ,
Thanks for the response.
I have used PyTorch (1.4.0) using the conda environment and have not faced any issues like this ("Illegal instruction (core dumped)") on my PC.
I am only facing this issue in your Vitis-AI-PyTorch environment.
Do you know of anyone else who could respond to this issue quickly? A resolution is currently required.
08-03-2020 02:16 AM
Hi @natheesan ,
You may contact a Xilinx FAE to check whether they have bandwidth for quick support.
I am afraid that, as far as I know, several of my colleagues are on long vacations, which may be why we can't respond quickly.
08-03-2020 12:01 PM
Thank you for this info, @jasonwu . A separate forum thread is also exploring the "illegal instruction" error arising in the vai-q-pytorch environment running on the provided Docker container: https://forums.xilinx.com/t5/AI-and-Vitis-AI/Pytorch-Illegal-instruction-core-dumped/td-p/1134124
@natheesan, please keep us in the loop on any feedback you get from the Xilinx FAE, as I also need an expedient solution on my end. Thanks!
08-03-2020 07:57 PM
I just finished the test of importing torch on my side. It works, except that installing the Nvidia-440 driver destroyed my desktop display.
Please check the test log below:
wuxian@wuxian-ubuntu1804-sw:/workspace$ conda activate vitis-ai-pytorch
(vitis-ai-pytorch) wuxian@wuxian-ubuntu1804-sw:/workspace$ python3
Python 3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> exit
Use exit() or Ctrl-D (i.e. EOF) to exit
>>> exit()
(vitis-ai-pytorch) wuxian@wuxian-ubuntu1804-sw:/workspace$ python
Python 3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 23:51:54)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> exit()
(vitis-ai-pytorch) wuxian@wuxian-ubuntu1804-sw:/workspace$ pip3 list
Package       Version
------------- ------------------------
certifi       2020.6.20
cffi          1.14.0
mkl-fft       1.1.0
mkl-random    1.1.1
mkl-service   2.3.0
numpy         1.17.2
olefile       0.46
Pillow        7.2.0
pip           20.1.1
protobuf      3.11.4
pybind11      2.5.0
pycparser     2.20
pytorch-nndct 0.1.0-a5f1f45-torch1.1.0
scipy         1.3.1
setuptools    49.2.0.post20200714
six           1.15.0
torch         1.1.0
torchvision   0.3.0
tqdm          4.47.0
wheel         0.34.2
I am using the VAI 1.2 tag to run the test:
wuxian@wuxian-ubuntu1804-sw:~/wu_software/Vitis-AI$ git branch
* (HEAD detached at v1.2)
  master
And the docker image is generated via:
cd ./docker
./docker_build_gpu.sh
08-03-2020 11:59 PM
Thank you for testing this on your end, @jasonwu. I was also able to successfully run import torch previously, so potentially the "illegal instruction" issue encountered with resnet18_quant.py quantization is separate.
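Since import torch succeeds while the quantization script crashes, one way to narrow this down is to run each step in a child process and check which one is killed by a signal. The following is only a minimal sketch and not part of Vitis AI; the helper name run_isolated is hypothetical. It relies on the documented behaviour of Python's subprocess module on POSIX, where a negative return code -N means the child was terminated by signal N.

```python
import signal
import subprocess
import sys


def run_isolated(snippet):
    """Run a Python snippet in a child interpreter.

    Returns the name of the signal that killed the child (for example
    "SIGILL"), or None if the child exited on its own.
    """
    proc = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, returncode == -N means "terminated by signal N".
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return None
```

Inside the container, run_isolated("import torch") would be expected to return None if the import itself is fine, while a snippet that reaches the faulting code path would return "SIGILL".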
Regarding running the vai-q-pytorch container with GPU support, ensure that you're using the vitis-ai-gpu Docker environment (the instructions mention vitis-ai
Also, to indicate how many GPUs are used by Docker, add
to the docker_run.sh script (substitute "all" for however many GPUs you want allocated, or leave as-is to dedicate all GPU resources to Docker):
elif [[ $IMAGE_NAME == *"gpu"* ]]; then
docker run \
-v /opt/xilinx/dsa:/opt/xilinx/dsa \
-v /opt/xilinx/overlaybins:/opt/xilinx/overlaybins \
-e USER=$user -e UID=$uid -e GID=$gid \
-v $HERE:/workspace \
-v /dev/shm:/dev/shm \
-w /workspace \
python resnet18_quant.py --quant_mode 1 --subset_len 200
Your help on this thread would be greatly appreciated: https://forums.xilinx.com/t5/AI-and-Vitis-AI/Pytorch-Illegal-instruction-core-dumped/td-p/1134124
08-05-2020 12:32 AM
Thanks for your help. I did it the same way you did, but I am still getting that error.
I do not get the error if I use the source code version of vai_q_pytorch (https://github.com/Xilinx/Vitis-AI/tree/master/Vitis-AI-Quantizer/vai_q_pytorch). I only face this issue when I use the Docker environment.
I am not sure where the actual error is located.
08-05-2020 05:37 AM
Thank you, @natheesan , I also only see the "illegal instruction" error within the docker environment.
I am attempting to build and run vai_q_pytorch outside of docker and have followed the README instructions to recreate the environment. Unfortunately, this results in a pytorch-nndct syntax error when running the "import pytorch_nndct" from the instructions:
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/pytorch_nndct/apis/quant_api.py", line 78
    GLOBAL_MAP.set_map(NNDCT_KEYS.QUANT_MODE, quant_mode)
    ^
SyntaxError: invalid syntax
The docker environment uses a custom pytorch-nndct version, 0.1.0-a5f1f45-torch1.1.0 .
@jasonwu, do you know how this version of pytorch-nndct can be retrieved/reproduced from outside the Vitis AI docker env? Thanks!
08-06-2020 08:22 AM
Update: There is indeed a typo in the pytorch_nndct/apis/quant_api.py file shipped with the latest release (1.2.82):
NndctScreenLogger().info(f('GLOBAL_MAP set_map quant_mode')
With this incorrect '(' removed, it is possible to perform quantization (resnet18_quant.py --quant_mode 1) outside the Docker container. However, Xmodel generation fails (resnet18_quant.py --quant_mode 2) as the XIR package cannot be found. From this thread, https://forums.xilinx.com/t5/AI-and-Vitis-AI/Where-is-vai-c-xir-command/td-p/1129301, it seems that, as XIR is not publicly released, it can only be accessed through the Docker environment.
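For anyone wanting to confirm this kind of typo without importing the package, a file can be parsed without being executed. The sketch below uses Python's standard ast module; the function name find_syntax_error and the two sample strings are illustrative stand-ins modelled on the line quoted above, not the real quant_api.py contents. Because an unclosed '(' lets the expression continue onto the next line, older interpreters such as 3.6 tend to report the error on the following line, which would be consistent with the traceback pointing at the GLOBAL_MAP.set_map call.

```python
import ast


def find_syntax_error(source, filename="<string>"):
    """Return (line, message) for the first syntax error, or None if the source parses."""
    try:
        # ast.parse only parses; nothing in the source is executed.
        ast.parse(source, filename=filename)
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


# Stand-in lines modelled on the quoted snippet (assumption: not the real file).
broken = "logger.info(f('GLOBAL_MAP set_map quant_mode')\nset_map(1, 2)\n"
fixed = "logger.info(f'GLOBAL_MAP set_map quant_mode')\nset_map(1, 2)\n"
```

Reading the installed quant_api.py with open(...).read() and passing the text to find_syntax_error should return None once the stray '(' is gone.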
So the attempt to build vai_q_pytorch locally has come back to resolving the "illegal instruction" error encountered in the Docker env: https://forums.xilinx.com/t5/AI-and-Vitis-AI/Pytorch-Illegal-instruction-core-dumped/m-p/1134124
Thread 1 "python" received signal SIGILL, Illegal instruction.
0x00007f8325a31cdb in mkldnn::impl::scales_t::set(int, int, float const*) () from /opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.6/site-packages/torch/lib/libcaffe2.so
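The backtrace places the SIGILL inside an mkldnn routine in libcaffe2.so. One common cause of an illegal instruction in prebuilt numerical libraries is a binary compiled for CPU instruction-set extensions that the host CPU lacks; nothing in this thread confirms that this is the cause here, so treat the following as a diagnostic sketch only. It parses /proc/cpuinfo-style text on x86 Linux, and the list of flags checked is an assumption chosen for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


# Illustrative list only (assumption): the thread does not state which
# extensions the container's PyTorch build actually requires.
SIMD_FLAGS = ("sse4_1", "sse4_2", "avx", "avx2", "avx512f")


def simd_report(cpuinfo_text):
    """Map each flag in SIMD_FLAGS to whether the CPU reports it."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in SIMD_FLAGS}
```

On the host, passing the contents of /proc/cpuinfo to simd_report gives a quick view of which of these extensions are available, which can then be compared between a machine that crashes and one that does not.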
08-09-2020 07:19 PM
Hi @natheesan ,
As I said before, the machine I was using ran into a display card driver problem.
So I installed a fresh Ubuntu 18.04 and started with a clean system.
I installed the GPU-related software from here:
Then I installed the Docker environment (I am using the VAI 1.1 guide because it is more detailed):
Then I followed the flow on the VAI 1.2 tag to download and install the image. I still can't reproduce your issue.
And @philreiter ,
I am afraid I haven't tried the resnet example on PyTorch yet, as it needs image data for further testing.
I will run the test on my side once the image data is ready.
08-10-2020 12:41 AM
Hi @philreiter ,
Thanks for your update. As an update on my side, I still can't find the proper dataset, so I am just using 100 validation images, and I can't reproduce the issue:
(vitis-ai-pytorch) wuxian@wuxian-Ubuntu1804:/workspace/Vitis-AI-Quantizer/vai_q_pytorch/example$ python resnet18_quant.py --quant_mode 1 --subset_len 100
[NNDCT_NOTE]: Loading NNDCT kernels...
-------- Start resnet18 test
[NNDCT_NOTE]: Quantization calibration process start up...
[NNDCT_NOTE]: =>Parsing ResNet...
[NNDCT_NOTE]: =>Quantizable module is generated.(quantize_result/ResNet.py)
100%|#############################################| 4/4 [00:01<00:00, 2.54it/s]
loss: 0.687199
top-1 / top-5 accuracy: 0 / 0
[NNDCT_NOTE]: =>Exporting quant config.(quantize_result/quant_info.json)
-------- End of resnet18 test
(vitis-ai-pytorch) wuxian@wuxian-Ubuntu1804:/workspace/Vitis-AI-Quantizer/vai_q_pytorch/example$