NickolaySchebne
Visitor
Registered: 06-22-2021

Vitis AI Optimizer Analysis Error


Hello, I am trying to run model analysis with vai_p_tensorflow on my TensorFlow YOLOv3 model, but I always get out-of-memory errors. There are several types of errors; I am attaching the latest logs:

WARNING:tensorflow:From /opt/vitis_ai/conda/envs/vitis-ai-optimizer_tensorflow/lib/python3.6/site-packages/tensorflow_core/python/tools/vai_p_tensorflow.py:157: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead.

WARNING:tensorflow:From /opt/vitis_ai/conda/envs/vitis-ai-optimizer_tensorflow/lib/python3.6/site-packages/tensorflow_core/python/tools/vai_p_tensorflow.py:133: The name tf.gfile.IsDirectory is deprecated. Please use tf.io.gfile.isdir instead.

W0623 08:10:25.154484 140295811592384 module_wrapper.py:139] From /opt/vitis_ai/conda/envs/vitis-ai-optimizer_tensorflow/lib/python3.6/site-packages/tensorflow_core/python/tools/vai_p_tensorflow.py:133: The name tf.gfile.IsDirectory is deprecated. Please use tf.io.gfile.isdir instead.

Collect Dataset and Preprocessing Images:: 100%|██████████| 100/100 [00:00<00:00, 126.16it/s]
WARNING:tensorflow:From /opt/project/optimization/vitis_optimization/model_eval.py:56: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

W0623 08:10:25.995886 140295811592384 module_wrapper.py:139] From /opt/project/optimization/vitis_optimization/model_eval.py:56: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /opt/project/optimization/vitis_optimization/model_eval.py:59: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

W0623 08:10:25.996070 140295811592384 module_wrapper.py:139] From /opt/project/optimization/vitis_optimization/model_eval.py:59: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-06-23 08:10:25.996435: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2021-06-23 08:10:26.020187: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3792975000 Hz
2021-06-23 08:10:26.020990: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f0f73400f0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-06-23 08:10:26.021004: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-06-23 08:10:26.021841: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-06-23 08:10:26.039700: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:26.040056: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
name: GeForce RTX 2070 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.8
pciBusID: 0000:08:00.0
2021-06-23 08:10:26.040215: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-06-23 08:10:26.040794: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-06-23 08:10:26.041389: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2021-06-23 08:10:26.041535: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2021-06-23 08:10:26.042205: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2021-06-23 08:10:26.042741: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2021-06-23 08:10:26.044632: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-23 08:10:26.044715: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:26.045102: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:26.045414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2021-06-23 08:10:26.045433: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-06-23 08:10:26.112613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-23 08:10:26.112629: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      0 
2021-06-23 08:10:26.112632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0:   N 
2021-06-23 08:10:26.112845: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:26.113259: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:26.113819: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:26.114423: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7176 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 SUPER, pci bus id: 0000:08:00.0, compute capability: 7.5)
2021-06-23 08:10:26.116691: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55f0f5e51780 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-06-23 08:10:26.116703: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2070 SUPER, Compute Capability 7.5
WARNING:tensorflow:From /opt/project/optimization/vitis_optimization/model_eval.py:60: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.

W0623 08:10:26.117948 140295811592384 module_wrapper.py:139] From /opt/project/optimization/vitis_optimization/model_eval.py:60: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.

WARNING:tensorflow:From /opt/vitis_ai/conda/envs/vitis-ai-optimizer_tensorflow/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
W0623 08:10:26.133961 140295811592384 deprecation.py:506] From /opt/vitis_ai/conda/envs/vitis-ai-optimizer_tensorflow/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
2021-06-23 08:10:29.954866: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-23 08:10:30.517702: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Not found: ./bin/ptxas not found
Relying on driver to perform ptx compilation. This message will be only logged once.
2021-06-23 08:10:30.611210: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-06-23 08:10:31.157199: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 3.51G (3768103680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
INFO:tensorflow:using GPU: ['0']
I0623 08:10:32.286929 140295811592384 __init__.py:228] using GPU: ['0']
2021-06-23 08:10:32.287088: I tensorflow/tools/pruning/model_pruning.cc:418] 
channel_batch: 2
exclude: conv2d_58/BiasAdd,conv2d_58/Conv2D,conv2d_57/Conv2D
input_ckpt: /opt/project/weights/optimized/vitis/yolov3_tf1.ckpt
input_graph: /opt/project/weights/optimized/vitis/yolov3_tf1.pbtxt
input_node_shapes: 1,416,416,3
input_nodes: image_input
is_skip_view: true
output_nodes: conv2d_58/BiasAdd,conv2d_66/BiasAdd,conv2d_74/BiasAdd
workspace: /opt/project/weights/optimized/vitis/analys

INFO:tensorflow:ana_total_steps: 1
I0623 08:10:32.414357 140295811592384 __init__.py:234] ana_total_steps: 1
INFO:tensorflow:begin step: 0 step_num_parallel: 1
I0623 08:10:32.414508 140295811592384 __init__.py:242] begin step: 0 step_num_parallel: 1
2021-06-23 08:10:33.554342: I tensorflow/tools/pruning/model_pruning.cc:418] 
channel_batch: 2
exclude: conv2d_58/BiasAdd,conv2d_58/Conv2D,conv2d_57/Conv2D
input_ckpt: /opt/project/weights/optimized/vitis/yolov3_tf1.ckpt
input_graph: /opt/project/weights/optimized/vitis/yolov3_tf1.pbtxt
input_node_shapes: 1,416,416,3
input_nodes: image_input
output_ckpt: /opt/project/weights/optimized/vitis/analys/tmp.ckpt-0
output_nodes: conv2d_58/BiasAdd,conv2d_66/BiasAdd,conv2d_74/BiasAdd
step: 0
workspace: /opt/project/weights/optimized/vitis/analys

2021-06-23 08:10:33.798230: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2021-06-23 08:10:33.820188: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3792975000 Hz
2021-06-23 08:10:33.820953: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x561b14a6a5e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-06-23 08:10:33.820968: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-06-23 08:10:33.821800: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-06-23 08:10:33.828025: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:33.828318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
name: GeForce RTX 2070 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.8
pciBusID: 0000:08:00.0
2021-06-23 08:10:33.828501: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-06-23 08:10:33.829444: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-06-23 08:10:33.830381: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2021-06-23 08:10:33.830630: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2021-06-23 08:10:33.831756: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2021-06-23 08:10:33.832573: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2021-06-23 08:10:33.834235: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-06-23 08:10:33.834319: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:33.834551: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-23 08:10:33.834717: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2021-06-23 08:10:33.834739: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-06-23 08:10:33.878336: E tensorflow/core/common_runtime/session.cc:78] Failed to create session: Internal: CUDA runtime implicit initialization on GPU:0 failed. Status: out of memory


Is there a way to analyze and prune YOLOv3 on my machine (a GPU with 8 GB of memory)?

1 Solution

Accepted Solutions
gguasti
Moderator
Registered: 11-29-2007

Hello,

The error is coming from the GPU:

failed to allocate 3.51G (3768103680 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

Out-of-memory errors can happen for many reasons, so unfortunately the suggestions may seem generic:

First, check how much memory is available on your GPU, for example with the nvidia-smi command:

gguasti_0-1626356345958.png (nvidia-smi output screenshot)

This also shows whether someone else is using the same GPU resource; if so, you can pick the GPU with more free memory (GPU 1 in my example).
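As a minimal sketch (the device index 1 here is just an example, matching the screenshot above; pick whichever GPU has the most free memory on your machine), you can inspect per-GPU memory and then restrict TensorFlow to a single device before launching vai_p_tensorflow:

```shell
# Inspect per-GPU memory usage (requires the NVIDIA driver tools;
# uncomment on a machine with a GPU):
# nvidia-smi --query-gpu=index,memory.used,memory.free --format=csv

# Expose only GPU 1 to TensorFlow/CUDA for this shell session,
# so the analysis cannot touch a GPU that is already in use.
export CUDA_VISIBLE_DEVICES=1
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
```

Any vai_p_tensorflow run started from the same shell will then only see the selected device.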

  • Try to understand how much memory your task is allocating. You might also simplify your model or shrink the input data size.
  • Reduce the amount of memory used, for example by reducing the batch size.
  • Run the evaluation on a separate GPU if one is available, or suspend the training binary while running the evaluation on the same GPU.
  • Try to load data in batches. Refer to this post, but you can find many other valid examples.
  • Consider upgrading your GPU: in UG133 we recommend a Tesla P100 or Tesla V100.
  • Memory utilization has improved in the new Vitis AI release (1.4 should be public in a few days, by the end of this month), so another option is to try Vitis AI 1.4 and the new Docker image.
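The "load data in batches" suggestion above can be sketched in plain Python. Everything here is a placeholder sketch, not the actual eval script: `load_image` stands in for real decoding/preprocessing (e.g. resizing to 416x416 with OpenCV or PIL), and the file names are made up. The point is only that a generator keeps at most one batch of images in memory at a time, instead of materializing the whole calibration set:

```python
# Sketch of batched data loading: yield images a few at a time
# instead of reading the entire calibration set into memory at once.

def load_image(path):
    # Placeholder for actual image decoding/preprocessing
    # (in a real script: read the file, resize, normalize, ...).
    return path

def batches(paths, batch_size):
    """Yield lists of at most `batch_size` loaded images."""
    for start in range(0, len(paths), batch_size):
        yield [load_image(p) for p in paths[start:start + batch_size]]

# Hypothetical file list; only one batch is ever resident in memory.
image_paths = [f"img_{i}.jpg" for i in range(10)]
for batch in batches(image_paths, batch_size=4):
    print(len(batch))  # prints 4, 4, 2
```

In the evaluation function you would feed each yielded batch to the session instead of a single giant tensor, which bounds the host-side memory footprint by the batch size.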


2 Replies
NickolaySchebne
Visitor
Registered: 06-22-2021

Thanks for your answer. I feel like the NVIDIA GeForce RTX 2070 SUPER can't cope with the workload, but I want to try your solutions. Reducing the batch size didn't help; loading data in the TensorFlow style is a good idea and I will try it later. But I don't quite understand how to implement suspending the training binary?

I will look forward to the release of the new version Vitis AI, thanks for help.
