Getting error after installing TensorFlow and GPU setup on Ubuntu 22.04 (after latest Ubuntu updates)

Georgia_Ch29 · June 5, 2023, 2:48am

I am trying to set up TensorFlow to work with my GPU in my Miniconda environment, but I cannot get it to work.
I suspect there is an issue with the newest version of Ubuntu, since the TensorFlow GPU setup broke after I recently installed some Ubuntu updates.

I followed the instructions in the TensorFlow guide over and over again, but when I check at the last step whether the GPU setup was successful, I get these errors:

2023-06-01 18:21:18.752684: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-06-01 18:21:19.805241: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))
2023-06-01 18:21:33.528426: I tensorflow/compiler/xla/stream_executor/cuda/cuda_gpu_executor.cc:996] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at linux/sysfs-bus-pci at v6.0 · torvalds/linux · GitHub
2023-06-01 18:21:33.613683: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1956] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at GPU support | TensorFlow for how to download and setup the required libraries for your platform.
Skipping registering GPU devices…
[]
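
The "Cannot dlopen some GPU libraries" warning means the dynamic loader could not find at least one CUDA or cuDNN shared library. A rough way to see which ones are missing is to try loading them directly. This is only a sketch: the library names below are an assumption based on a CUDA 11.8 / cuDNN 8.x install and may differ on other versions.

```python
import ctypes

# Assumed sonames for a CUDA 11.8 / cuDNN 8.x setup; adjust for other versions.
ASSUMED_LIBS = [
    "libcudart.so.11.0",
    "libcublas.so.11",
    "libcublasLt.so.11",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.11",
    "libcusparse.so.11",
    "libcudnn.so.8",
]


def can_dlopen(name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


def check_libs(names=ASSUMED_LIBS):
    """Map each library name to whether it could be loaded."""
    return {name: can_dlopen(name) for name in names}


if __name__ == "__main__":
    for name, ok in check_libs().items():
        print(f"{'found  ' if ok else 'MISSING'}  {name}")
```

Run it inside the same conda environment and shell where TensorFlow is started, since the result depends on that shell's library search path.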

PC/OS specs:
Device: HP Pavilion Gaming Laptop
OS: Ubuntu 22.04.2 LTS
Memory: 16 GB
GPU: NVIDIA GeForce GTX 1660 Ti

nvidia-smi output:
| NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0 |
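
For reference, the two version fields can be pulled out of that header line with a small helper. This is a sketch; parse_smi_header is a hypothetical name, and the regular expression assumes the header layout shown above. As far as I know, the "CUDA Version" field reports the highest CUDA version the driver supports, which is not necessarily the toolkit version installed in the conda environment.

```python
import re


def parse_smi_header(line):
    """Extract (driver_version, cuda_version) from an nvidia-smi header line.

    Returns None if the line does not match the assumed layout.
    """
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", line
    )
    if match is None:
        return None
    return match.group(1), match.group(2)


header = "| NVIDIA-SMI 525.105.17 Driver Version: 525.105.17 CUDA Version: 12.0 |"
print(parse_smi_header(header))  # ('525.105.17', '12.0')
```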

I was wondering if I am doing anything wrong, or if there is any issue regarding the latest Ubuntu updates (since I did not face any similar errors before).

Kiran_Sai_Ramineni · June 5, 2023, 10:46am

Hi @Georgia_Ch29, I can see that your CUDA version is 12.0, but TensorFlow 2.12 supports CUDA 11.8.

Could you please install the appropriate versions of CUDA and cuDNN, as mentioned in the documentation, using:

conda install -c conda-forge cudatoolkit=11.8.0
pip install nvidia-cudnn-cu11==8.6.0.163

After installing the correct version of CUDA and cuDNN please let us know if the issue is resolved or not. Thank You.
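
One way to confirm which CUDA and cuDNN versions the installed TensorFlow wheel was built against is tf.sysconfig.get_build_info(). The sketch below wraps it in a hypothetical helper, summarize_build, and assumes the keys "is_cuda_build", "cuda_version" and "cudnn_version", which may be absent on CPU-only builds.

```python
def summarize_build(build_info):
    """Format the CUDA-related entries of TensorFlow's build info mapping."""
    is_cuda = build_info.get("is_cuda_build", False)
    cuda = build_info.get("cuda_version", "unknown")
    cudnn = build_info.get("cudnn_version", "unknown")
    return f"CUDA build: {is_cuda}, CUDA {cuda}, cuDNN {cudnn}"


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        # Versions the wheel was compiled against, not what is installed locally.
        print(summarize_build(tf.sysconfig.get_build_info()))
        # An empty list here means no GPU was registered.
        print(tf.config.list_physical_devices("GPU"))
```

If the reported build versions differ from what conda and pip installed, that mismatch is a likely place to look first.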

Georgia_Ch29 · June 15, 2023, 3:26pm

Thank you for your response, and sorry for my late answer. Unfortunately, when I attempt to downgrade CUDA to 11.8, the NVIDIA driver is also downgraded to version 520.

More specifically, my attempts to update cudatoolkit using the commands you suggested were unsuccessful, so I took the following steps instead:

Removed and reinstalled the CUDA and NVIDIA packages:
$ sudo apt-get --purge -y remove 'cuda*'
$ sudo apt-get --purge -y remove 'nvidia*'
Then installed NVIDIA CUDA 11.8 and set the environment variables.
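
Since the exact environment variables are not listed here, a quick sanity check is to print every directory on LD_LIBRARY_PATH and whether it exists. This is a sketch; the helper names are hypothetical, and it assumes LD_LIBRARY_PATH is the variable that was set.

```python
import os


def split_search_path(value):
    """Split a PATH-style string into its non-empty entries."""
    return [entry for entry in value.split(os.pathsep) if entry]


def report_search_path(value):
    """Pair each entry with a flag saying whether it is an existing directory."""
    return [(entry, os.path.isdir(entry)) for entry in split_search_path(value)]


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    if not current:
        print("LD_LIBRARY_PATH is empty or not set")
    for entry, exists in report_search_path(current):
        print(f"{'ok     ' if exists else 'MISSING'}  {entry}")
```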

However, with CUDA 11.8, NVIDIA driver version 520 (instead of 525, which is suggested for my system) was automatically installed.

If I first install the NVIDIA driver that is suitable for my Ubuntu version (v. 525.105.17), the CUDA version that is automatically installed is 12.0.

Is there any way of downgrading the CUDA version without changing the NVIDIA driver version?

Kiran_Sai_Ramineni · June 16, 2023, 7:18am

Hi @Georgia_Ch29, the change in CUDA version is because Linux NVIDIA drivers >= 525 support CUDA 12.x.

I don't think you can downgrade CUDA without changing the NVIDIA driver version. Thank you.