Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice

Hi, I have a question related to the linked topic, but I am using Ubuntu 22.04.

After a fresh installation (I was using 18.04 until a cataclysmic event), I seem to be having a similar issue.
I am trying to use deepxde with the tensorflow backend.

(A lot of what follows is unnecessary background.)
I am trying to run one of their examples, and my attention is drawn to the portion of the output where "nvvm" is mentioned.

chaztikov@priority:~/git/deepxde/examples/pinn_forward$ python3 Burgers_RAR.py
Using backend: tensorflow

Enable just-in-time compilation with XLA.

2022-09-10 23:10:14.449892: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-10 23:10:15.157727: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Overriding orig_value setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2022-09-10 23:10:15.157784: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 10113 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
Compiling model...
'compile' took 0.000428 s

Training model…

WARNING:tensorflow:AutoGraph could not transform <function at 0x7f736f697250> and will run it as-is.
Cause: could not parse the source code of <function at 0x7f736f697250>: no matching AST found among candidates:

# coding=utf-8

lambda x, on: np.array([on_boundary(x[i], on[i]) for i in range(len(x))])
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING:tensorflow:AutoGraph could not transform <function at 0x7f736f697490> and will run it as-is.
Cause: could not parse the source code of <function at 0x7f736f697490>: no matching AST found among candidates:

# coding=utf-8

lambda x, on: np.array([on_boundary(x[i], on[i]) for i in range(len(x))])
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
2022-09-10 23:10:16.768843: I tensorflow/compiler/xla/service/service.cc:170] XLA service 0x55a2ebd8d3a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-09-10 23:10:16.768872: I tensorflow/compiler/xla/service/service.cc:178] StreamExecutor device (0): NVIDIA GeForce GTX 1080 Ti, Compute Capability 6.1
2022-09-10 23:10:16.797603: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:263] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2022-09-10 23:10:17.433963: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-10 23:10:17.434943: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-10 23:10:17.434971: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn’t get ptxas version string: INTERNAL: Couldn’t invoke ptxas --version
2022-09-10 23:10:17.435713: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-10 23:10:17.435784: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location. This message will be only logged once.
2022-09-10 23:10:17.440252: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-10 23:10:17.440280: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn't get ptxas version string: INTERNAL: Couldn't invoke ptxas --version
2022-09-10 23:10:17.441066: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-10 23:10:17.441134: W tensorflow/compiler/xla/service/gpu/buffer_comparator.cc:640] INTERNAL: Failed to launch ptxas
Relying on driver to perform ptx compilation. Setting XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda or modifying $PATH can be used to set the location of ptxas.
This message will only be logged once.
2022-09-10 23:10:17.534116: W tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can’t find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
/usr/local/cuda-11.2
/usr/local/cuda
.
You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2022-09-10 23:10:17.721031: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-10 23:10:17.721072: W tensorflow/stream_executor/gpu/asm_compiler.cc:80] Couldn’t get ptxas version string: INTERNAL: Couldn’t invoke ptxas --version
2022-09-10 23:10:17.721699: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2022-09-10 23:10:17.722104: F tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:456] ptxas returned an error during compilation of ptx to sass: ‘INTERNAL: Failed to launch ptxas’ If the error message indicates that a file could not be written, please verify that sufficient filesystem space is provided.
Aborted (core dumped)
chaztikov@priority:~/git/deepxde/examples/pinn_forward$

I did not run "conda init" because I found that doing so interfered with an otherwise successful installation of NVIDIA CUDA, cudatoolkit, etc.
(Though I am afraid to break anything, I welcome suggestions on this point, and on all points.)

I see that I do have nvvm, as shown below:

chaztikov@priority:~/git/deepxde/examples/pinn_forward$ locate /nvvm/libdevice
/home/chaztikov/anaconda3/nvvm/libdevice
/home/chaztikov/anaconda3/nvvm/libdevice/libdevice.10.bc
/home/chaztikov/anaconda3/pkgs/cuda-nvcc-11.7.99-0/nvvm/libdevice
/home/chaztikov/anaconda3/pkgs/cuda-nvcc-11.7.99-0/nvvm/libdevice/libdevice.10.bc
chaztikov@priority:~/git/deepxde/examples/pinn_forward$

so I set the following in ~/.bashrc:
export CUDA_DIR="/home/chaztikov/anaconda3/pkgs/cuda-nvcc-11.7.99-0/"

I tried the above, but it didn't work; I am still getting the same error message.

Note: both before and after that change (export CUDA_DIR, etc.) to my ~/.bashrc, TensorFlow still seems to locate the GPU.
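
The log's own hint mentions XLA_FLAGS rather than CUDA_DIR, so presumably something like this sketch, set before importing tensorflow, is what it wants (using the Anaconda location that locate found above):

import os

# Set before importing tensorflow, since XLA reads the flag when it initializes.
# The directory must contain nvvm/libdevice (it does here, per the locate output),
# not point at libdevice.10.bc itself.
os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=/home/chaztikov/anaconda3"

import tensorflow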

chaztikov@priority:~/git/deepxde/examples/pinn_forward$ python3
Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] on linux
Type “help”, “copyright”, “credits” or “license” for more information.

import tensorflow
tensorflow.device(‘GPU’)
2022-09-11 21:07:57.423925: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-11 21:07:57.966877: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9858 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1
<tensorflow.python.eager.context._EagerDeviceContext object at 0x7fbd3557d9c0>

As a side note, torch also seems to work:

chaztikov@priority:~/git/deepxde/examples/pinn_forward$ python3
Python 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch; torch.cuda.is_available()
True

EDIT: I re-read the output when importing tensorflow. Is this indicating that the GPU is found but somehow not used? It doesn't seem that way; it seems to be indicating that this TensorFlow binary was compiled without certain CPU optimization flags, which is informational rather than an error.


Please help; I'd really appreciate it, as this is holding up progress on a time-sensitive project. Thanks!

Could you share the output of this snippet?

import tensorflow as tf
tf.config.list_physical_devices('GPU')

For what it's worth, I solved the same problem by first creating an nvvm/libdevice folder inside my Conda environment's lib folder, and then copying the libdevice.10.bc file into that directory.

Next, I set

export XLA_FLAGS=--xla_gpu_cuda_data_dir=/home/<user>/miniconda3/envs/<env>/lib

in Bash, within my activated Conda environment.

Based on the error message snippet Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice., it is not sufficient to point XLA_FLAGS at the file itself; the folder structure must match.
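
In other words, something like this sketch (paths are hypothetical; substitute your own environment and the libdevice location that locate reports on your machine):

import os, shutil

# Hypothetical Conda env path; substitute your own.
env = os.path.expanduser("~/miniconda3/envs/tf")
src = os.path.join(env, "nvvm", "libdevice", "libdevice.10.bc")
dst = os.path.join(env, "lib", "nvvm", "libdevice")

os.makedirs(dst, exist_ok=True)  # recreate the expected nvvm/libdevice structure
shutil.copy(src, dst)            # put libdevice.10.bc inside it

# Point XLA at the directory that *contains* nvvm/, not at the .bc file:
os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=" + os.path.join(env, "lib")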

This procedure is more or less described in numerous other similar issues.


After fixing my libdevice issue, I ran into the Couldn't invoke ptxas --version issue, still using Conda, as recommended by the official docs. I solved that by running

conda install -c nvidia cuda-nvcc

in my activated Conda/TensorFlow environment. I got that recipe from this GitHub issue:
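
A quick sanity check that ptxas is now visible (a sketch; assumes the environment is activated):

import shutil
import subprocess

# After `conda install -c nvidia cuda-nvcc`, ptxas should be on the
# activated environment's PATH.
ptxas = shutil.which("ptxas")
print(ptxas)  # expect something like .../envs/<env>/bin/ptxas
if ptxas:
    subprocess.run([ptxas, "--version"])  # the call TensorFlow failed to make above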


Hi guys, I've done two fresh installs (following "Install TensorFlow with pip") on two laptops: an RTX one (Ubuntu/Anaconda/Jupyter) and a GTX one (Ubuntu/Miniconda/Jupyter).

Both have the same issue. nvidia-smi works but shows CUDA 12.0.

import tensorflow as tf; print(tf.config.list_physical_devices('GPU')) → works.

At model.fit(x_train, y_train, epochs=5) I get the same issue:

tensorflow/compiler/xla/service/gpu/nvptx_helper.cc:56] Can’t find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
Searched for CUDA in the following directories:
./cuda_sdk_lib
/usr/local/cuda-11.2
/usr/local/cuda
.
You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.

I've run both

sudo cp -r /home/steve/anaconda3.2202.10/pkgs/cudatoolkit-11.2.2-hbe64b41_10/lib/ /usr/local/cuda

and

export XLA_FLAGS=--xla_gpu_cuda_data_dir=~/anaconda3.2202.10/pkgs/cudatoolkit-11.2.2-hbe64b41_10/lib/libdevice.10.bc

and I am still getting the same error message.

TensorFlow 2.10 on native Windows still works great.

I would really like to move to the Linux platform to keep using the latest TF features. Any advice?

OK guys, on WSL2, tf==2.10 works.

Using this code:

import tensorflow as tf
tf.config.list_physical_devices('GPU')
sys_details = tf.sysconfig.get_build_info()
cuda = sys_details["cuda_version"]
cudnn = sys_details["cudnn_version"]
print(cuda, cudnn)

this confirms "11.2 8", whereas for tf==2.11 it gives "64_112 64_8".

Let's hope this is not a monopolistic attack by Google against Microsoft.


OK, I found this solution for tf==2.11:

optimizer=tf.keras.optimizers.legacy.Adam()

NOT optimizer="adam"

So, as a self-taught coder, it is interesting to see which part of TensorFlow calls the GPU.
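
A minimal sketch of the workaround in context (the model and loss here are placeholders, not from any example above):

import tensorflow as tf

# Placeholder model; the point is the optimizer argument.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# Passing the legacy class instead of the string "adam" avoids the new Keras
# optimizer path, which seems to be what triggers the XLA/libdevice lookup.
model.compile(optimizer=tf.keras.optimizers.legacy.Adam(), loss="mse")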


@Steven_Cohen this fix worked! I was trying to reproduce the official DDPM example from the Keras website on TF 2.11/Python 3.10 using Miniconda3. Do you have any insight as to why the legacy optimizer worked?

I just followed the installation instructions for tf 2.12 ("Install TensorFlow with pip"), and when I run a simple optimizers.Adam() I get:

2023-04-09 18:02:22.272960: W tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:530] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.

Also, when I look in the directory, there is no nvcc file downloaded by NVIDIA.

I've tried looking through \\wsl.localhost\Ubuntu\home\steve\anaconda3.23.03tf12\lib\python3.10\site-packages\keras\optimizers\adam.py for a solution, but it's a bit above my pay grade.

I have also noticed that on native Windows, with tf.keras.optimizers.experimental.Adam(), the same error occurs:

InternalError: Graph execution error: … Node: 'StatefulPartitionedCall_2'
libdevice not found at ./libdevice.10.bc
[[{{node StatefulPartitionedCall_2}}]] [Op:__inference_train_function_739]

But at least I can find C:\Users\sjc52\anaconda3.2022.10\pkgs\cuda-nvcc-11.7.99-0.

Thank you for your response! If you do figure out why, please share!