TensorFlow 2.15 install is driving me CRAZY!

I’m trying to get TensorFlow 2.15 to work with a CUDA GPU on WSL2 Ubuntu-22.04. When I import tensorflow in a Jupyter notebook I get the following error:

2024-02-07 14:39:49.640124: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-07 14:39:49.640166: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-07 14:39:49.640799: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered

I installed the CUDA toolkit 12.3 into Ubuntu-22.04, then installed my Python libraries, including 'pip install tensorflow[and-cuda]'. I then got the above error and tried to uninstall the CUDA toolkit 12.3, because I read that tensorflow[and-cuda] should install the CUDA dependencies for me, but the error persists. Any help is appreciated!
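
For what it’s worth, nothing beyond the import is needed to reproduce this; a notebook cell with just this line triggers the messages:

import tensorflow as tf  # the three "Unable to register ... factory" lines print here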

Hi @Ian_Lawrence, could you please confirm whether you are running under WSL2 on Windows or on native Ubuntu? Thank you.

WSL2. Funny you should ask… I was going to add WSL2 in, but there’s no edit button while the post is pending.

I feel your pain. I have spent all day trying to get TF+GPU working (WSL/Ubuntu on Windows).
It really shouldn’t be this hard.

Did you fix it? If not, I have it running on WSL2 and all is good; maybe I can help.

Definitely not fixed. Any ideas?

I have a few questions first.

Do you have CUDA, cuDNN, and TensorRT installed properly?

Make a new env in conda.
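For example (a minimal sketch; the env name tf-gpu and Python 3.11 are just placeholder choices):

conda create -n tf-gpu python=3.11
conda activate tf-gpu
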
Then install the basic stuff with conda, then with pip:

pip install tensorflow tensorflow-datasets tfx

If CUDA is installed, TF should see it.
If you did not install it properly, you can run:

pip install tensorflow[and-cuda]

Don’t install anything else, and use a test script:

import tensorflow as tf

# Check available GPUs
gpus = tf.config.experimental.list_physical_devices('GPU')
print("Num GPUs Available: ", len(gpus))
for gpu in gpus:
    print("Name:", gpu.name, " Type:", gpu.device_type)

# Test GPU utilization
if gpus:
    try:
        # Set memory growth to avoid memory allocation errors
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)

        # Simple computation to test GPU utilization
        with tf.device('/GPU:0'):  # Change this if you have more than one GPU
            a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
            b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
            c = tf.matmul(a, b)
            print(c)
    except RuntimeError as e:
        print(e)
else:
    print("No GPU available")

So technically, even with those errors, it is finding my GPU and using CUDA cores. However, I’m still reasonably sure it’s not working 100% as it should, hence the errors. It’s also printing this NUMA info message when I run some other code:

2024-02-09 21:52:43.747526: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

Do you have CUDA, cuDNN, and TensorRT installed properly?
So I don’t have them installed. From what I read, TensorRT is an optional library (do you agree?), so I haven’t been trying to use it yet, but I think I probably will in the future. As for CUDA and cuDNN, I’m a little confused by the installation documentation. From what I have read, as of TensorFlow 2.15, ‘pip install tensorflow[and-cuda]’ installs the necessary CUDA and cuDNN dependencies for me, so I don’t think I need to install them separately. Clearly it’s not perfect, because it gave errors. Having pip install the dependencies sounds like a new thing, and most of the documentation describes the old way of installing CUDA and cuDNN separately/manually. (I might be totally wrong on all of this.)
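
One quick sanity check here (informational only, not a fix): tf.sysconfig.get_build_info() reports which CUDA/cuDNN versions the pip wheel was built against, which you can compare with the nvidia-* packages pip pulled in:

import tensorflow as tf

info = tf.sysconfig.get_build_info()
print("is CUDA build:", info["is_cuda_build"])
print("built against CUDA:", info["cuda_version"])
print("built against cuDNN:", info["cudnn_version"])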

As for conda, I haven’t been using it, not for any particular reason; I just didn’t think it was a requirement. Is it required for this to work?

I will try making a new installation on Monday with conda. Then I’ll test the code you gave me.

This is the example I’ve been using and its printouts:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate some random classification data
X, y = make_classification(n_samples=10000, n_features=20, n_classes=2, random_state=42)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Convert labels to one-hot encoding
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

# Define a simple Keras model
model = Sequential([
    Dense(64, activation='relu', input_shape=(20,)),
    Dense(32, activation='relu'),
    Dense(2, activation='softmax')
])

# Compile the model
model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])

# Check if a GPU is available
physical_devices = tf.config.list_physical_devices('GPU')
if len(physical_devices) > 0:
    print("GPU available. Training on GPU.")
else:
    print("No GPU available. Training on CPU.")

# Train the model
history = model.fit(X_train, y_train, epochs=2, batch_size=64, validation_data=(X_test, y_test))

# Evaluate the model
test_loss, test_acc = model.evaluate(X_test, y_test)
print("Test Accuracy:", test_acc)

2024-02-09 21:52:40.810870: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-09 21:52:40.811052: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-09 21:52:40.825636: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-09 21:52:40.966027: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-09 21:52:41.886546: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-02-09 21:52:43.747526: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.098519: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.098574: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.103297: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.103341: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.103357: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.292290: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.292343: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.292348: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2022] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-02-09 21:52:44.292370: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-09 21:52:44.292567: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 2249 MB memory: -> device: 0, name: NVIDIA T600 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 7.5
GPU available. Training on GPU.
Epoch 1/2
2024-02-09 21:52:45.014338: I external/local_tsl/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2024-02-09 21:52:45.087930: I external/local_xla/xla/service/service.cc:168] XLA service 0x7f2caa859120 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-02-09 21:52:45.087960: I external/local_xla/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA T600 Laptop GPU, Compute Capability 7.5
2024-02-09 21:52:45.108520: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2024-02-09 21:52:45.158805: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8904
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1707537165.268011 29229 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
125/125 [==============================] - 2s 7ms/step - loss: 0.3940 - accuracy: 0.8370 - val_loss: 0.2797 - val_accuracy: 0.8945
Epoch 2/2
125/125 [==============================] - 1s 7ms/step - loss: 0.2872 - accuracy: 0.8934 - val_loss: 0.2653 - val_accuracy: 0.8985
63/63 [==============================] - 0s 5ms/step - loss: 0.2653 - accuracy: 0.8985
Test Accuracy: 0.8985000252723694
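
For anyone following along: a simple way to confirm the GPU is actually being exercised during model.fit is to watch utilization from a second terminal (plain nvidia-smi, nothing TensorFlow-specific):

watch -n 1 nvidia-smi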

Here are all of the libraries I currently have installed:
pip list

Package Version


absl-py 2.1.0
asttokens 2.4.1
astunparse 1.6.3
blinker 1.4
borb 2.1.21
cachetools 5.3.2
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
comm 0.2.1
command-not-found 0.3
contourpy 1.2.0
cryptography 42.0.2
cycler 0.12.1
dbus-python 1.2.18
debugpy 1.8.0
decorator 5.1.1
distro 1.7.0
distro-info 1.1+ubuntu0.2
exceptiongroup 1.2.0
executing 2.0.1
flatbuffers 23.5.26
fonttools 4.48.1
gast 0.5.4
google-auth 2.27.0
google-auth-oauthlib 1.2.0
google-pasta 0.2.0
grpcio 1.60.1
h5py 3.10.0
httplib2 0.20.2
idna 3.6
importlib-metadata 4.6.4
ipykernel 6.29.1
ipython 8.21.0
jedi 0.19.1
jeepney 0.7.1
joblib 1.3.2
jupyter_client 8.6.0
jupyter_core 5.7.1
keras 2.15.0
keyring 23.5.0
kiwisolver 1.4.5
launchpadlib 1.10.16
lazr.restfulclient 0.14.4
lazr.uri 1.0.6
libclang 16.0.6
lxml 5.1.0
Markdown 3.5.2
MarkupSafe 2.1.5
matplotlib 3.8.2
matplotlib-inline 0.1.6
ml-dtypes 0.2.0
more-itertools 8.10.0
nest-asyncio 1.6.0
netifaces 0.11.0
numpy 1.26.4
nvidia-cublas-cu12 12.2.5.6
nvidia-cuda-cupti-cu12 12.2.142
nvidia-cuda-nvcc-cu12 12.2.140
nvidia-cuda-nvrtc-cu12 12.2.140
nvidia-cuda-runtime-cu12 12.2.140
nvidia-cudnn-cu12 8.9.4.25
nvidia-cufft-cu12 11.0.8.103
nvidia-curand-cu12 10.3.3.141
nvidia-cusolver-cu12 11.5.2.141
nvidia-cusparse-cu12 12.1.2.141
nvidia-nccl-cu12 2.16.5
nvidia-nvjitlink-cu12 12.2.140
oauthlib 3.2.0
opt-einsum 3.3.0
packaging 23.2
pandas 2.2.0
parso 0.8.3
patsy 0.5.6
pexpect 4.9.0
pillow 10.2.0
pip 24.0
platformdirs 4.2.0
prompt-toolkit 3.0.43
protobuf 4.23.4
psutil 5.9.8
ptyprocess 0.7.0
pure-eval 0.2.2
pyasn1 0.5.1
pyasn1-modules 0.3.0
pycparser 2.21
Pygments 2.17.2
PyGObject 3.42.1
PyJWT 2.3.0
pyparsing 2.4.7
pypng 0.20220715.0
python-apt 2.4.0+ubuntu2
python-barcode 0.15.1
python-dateutil 2.8.2
pytz 2024.1
PyYAML 5.4.1
pyzmq 25.1.2
qrcode 7.4.2
requests 2.31.0
requests-oauthlib 1.3.1
rsa 4.9
scikit-learn 1.4.0
scipy 1.12.0
seaborn 0.13.2
SecretStorage 3.3.1
setuptools 59.6.0
six 1.16.0
stack-data 0.6.3
statsmodels 0.14.1
systemd-python 234
tensorboard 2.15.1
tensorboard-data-server 0.7.2
tensorflow 2.15.0.post1
tensorflow-estimator 2.15.0
tensorflow-io-gcs-filesystem 0.36.0
termcolor 2.4.0
threadpoolctl 3.2.0
tornado 6.4
traitlets 5.14.1
typing_extensions 4.9.0
tzdata 2023.4
ubuntu-advantage-tools 8001
ufw 0.36.1
unattended-upgrades 0.1
urllib3 2.2.0
wadllib 1.3.6
wcwidth 0.2.13
Werkzeug 3.0.1
wheel 0.37.1
wrapt 1.14.1
zipp 1.0.0

You don’t need NUMA unless you run on a Linux server where you can change the kernel, so it’s all good.

If you want very good performance, install cuDNN and TensorRT.
It’s not hard: you just download both files and then install them.

cuDNN and TensorRT need to be installed when you install CUDA on the system, not in the env.
It’s a delicate process you do before everything else.
Then use envs for your work, since you can pip install as much as you want without messing up the system.

To install cuDNN and TensorRT, remember to also update your bash config so your system will see them. I’ll pass you my script to make it easy:

nano ~/.bashrc

You should fix your paths, but it will look like this:

# CUDA and TensorRT section

export PATH=/usr/local/cuda-12.3/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib64:$LD_LIBRARY_PATH

# TensorRT

export LD_LIBRARY_PATH=~/TensorRT-8.6.1.6/lib:$LD_LIBRARY_PATH

# cuDNN

export LD_LIBRARY_PATH=/home/min0/anaconda3/envs/AIFlow_Lab/lib/python3.9/site-packages/nvidia/cudnn/lib:$LD_L>export LD_LIBRARY_PATH=/usr/local/cuda-12.3/lib:$LD_LIBRARY_PATH

# NVIDIA Driver and CUDA Version

echo "NVIDIA Driver and CUDA Version:"
nvidia-smi --query-gpu=driver_version --format=csv,noheader,nounits | awk '{print "Driver Version:", $1}' | he>nvidia-smi | grep "CUDA Version"
# NVIDIA GPU Information
echo "NVIDIA GPU Information:"
nvidia-smi --query-gpu=gpu_name,index,temperature.gpu,utilization.gpu,power.draw --format=csv | awk -F, 'NR==1>

# GPU Identifiers for TensorFlow

echo "GPU Identifiers for TensorFlow:"
nvidia-smi --query-gpu=gpu_name,index --format=csv,noheader,nounits

After adding it, run

source ~/.bashrc

to apply it. Generally, also restart WSL or Linux.
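
A quick way to check the paths took effect after sourcing (the tensorrt import assumes you also installed the TensorRT Python bindings, e.g. python3-libnvinfer; skip it otherwise):

nvcc --version
python3 -c "import tensorrt; print(tensorrt.__version__)"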

Once TensorRT is on, you will see better performance in training and overall.
On the WSL2 machine I use now, it automatically uses both GPUs for training, since there are two 4090s.


Have fun mate


What versions of CUDA, cuDNN, and TensorRT should I be installing?

Currently I’m trying:

WSL:
WSL version: 2.0.9.0
Kernel version: 5.15.133.1-1
WSLg version: 1.0.59
MSRDC version: 1.2.4677
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19045.3996

Ubuntu:
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy

CUDA-Toolkit 12.3
cuDNN 9.0.0
TensorRT 8.6

I can’t tell if the version matrix charts are just out of date and everything actually works, or if they really mean these versions don’t support each other.

Also, I’m supposed to install a GPU driver on Windows, not in Ubuntu 22.04, right? Does the GPU driver go on the Windows side or the WSL2 Ubuntu side of things?

When I was trying to install TensorRT I got this (I think maybe TensorRT doesn’t support CUDA 12.3?):

root@ilawrence:~# sudo dpkg -i nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0_1.0-1_amd64.deb
Selecting previously unselected package nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0.
(Reading database … 90657 files and directories currently installed.)
Preparing to unpack nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0_1.0-1_amd64.deb …
Unpacking nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0 (1.0-1) …
Setting up nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0 (1.0-1) …

The public nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0 GPG key does not appear to be installed.
To install the key, run this command:
sudo cp /var/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0/nv-tensorrt-local-42B2FC56-keyring.gpg /usr/share/keyrings/

root@ilawrence:~# sudo cp /var/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0/nv-tensorrt-local-42B2FC56-keyring.gpg /usr/share/keyrings/
root@ilawrence:~# sudo apt-get update
Get:1 file:/var/cuda-repo-wsl-ubuntu-12-3-local InRelease [1572 B]
Get:1 file:/var/cuda-repo-wsl-ubuntu-12-3-local InRelease [1572 B]
Get:2 file:/var/cudnn-local-repo-ubuntu2204-9.0.0 InRelease [1572 B]
Get:3 file:/var/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0 InRelease [1572 B]
Get:2 file:/var/cudnn-local-repo-ubuntu2204-9.0.0 InRelease [1572 B]
Get:3 file:/var/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0 InRelease [1572 B]
Get:4 file:/var/nv-tensorrt-local-repo-ubuntu2204-8.6.1-cuda-12.0 Packages [6443 B]
Get:5 h_ttp://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB]
Hit:6 h_ttp://archive.ubuntu.com/ubuntu jammy InRelease
Get:7 h_ttp://archive.ubuntu.com/ubuntu jammy-updates InRelease [119 kB]
Hit:8 h_ttp://archive.ubuntu.com/ubuntu jammy-backports InRelease
Fetched 229 kB in 1s (169 kB/s)
Reading package lists… Done
root@ilawrence:~# sudo apt-get install tensorrt
Reading package lists… Done
Building dependency tree… Done
Reading state information… Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
libnvinfer-dev : Depends: libcudnn8-dev but it is not installable
libnvinfer-samples : Depends: cuda-nvcc-12-1 but it is not installable or
cuda-nvcc-12-0 but it is not installable
libnvinfer8 : Depends: libcudnn8 but it is not installable
E: Unable to correct problems, you have held broken packages.

(I made http be h_ttp so I could post this without hitting the max-link error, i.e., I can only have 2 links in a post.)

So I think I got it working… well, sorta, mostly.

Used:
CUDA-Toolkit 12.0
cuDNN: 8.9.7.29
tensorRT: 8.6.1

I also made a new conda Python environment and installed TensorFlow with pip.
I tested this program from @Igor_Lessio:

import tensorflow as tf

# Check available GPUs
gpus = tf.config.experimental.list_physical_devices('GPU')
print('Num GPUs Available: ', len(gpus))
for gpu in gpus:
    print('Name:', gpu.name, ' Type:', gpu.device_type)

# Test GPU utilization
if gpus:
    try:
        # Set memory growth to avoid memory allocation errors
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)

        # Simple computation to test GPU utilization
        with tf.device('/GPU:0'):  # Change this if you have more than one GPU
            a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
            b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
            c = tf.matmul(a, b)
            print(c)
    except RuntimeError as e:
        print(e)
else:
    print('No GPU available')

results:
2024-02-14 15:40:01.456866: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-02-14 15:40:01.456922: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-02-14 15:40:01.457739: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-02-14 15:40:01.462687: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Num GPUs Available: 1
Name: /physical_device:GPU:0 Type: GPU
tf.Tensor(
[[22. 28.]
[49. 64.]], shape=(2, 2), dtype=float32)
2024-02-14 15:40:03.204330: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.209111: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.209141: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.213173: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.213199: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.213209: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.345983: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.346025: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.346030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2022] Could not identify NUMA node of platform GPU id 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-02-14 15:40:03.346048: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-02-14 15:40:03.346063: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 2249 MB memory: -> device: 0, name: NVIDIA T600 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 7.5

So I still have the same errors I started with, but I got TensorRT working, so I guess that’s a plus. Are all of these E and I printouts expected, or is something still incorrect?

The NUMA message is only about the kernel. It works now, and with RT you also get better performance.

Enjoy

Thanks for your help!


Hi Igor,

Thanks for the excellent guidance. I finally understood that I needed to install TensorRT outside the environment; that fixed the "couldn't find TensorRT" issue. Although my code runs, I'm still faced with "... Unable to register {cuDNN, cuFFT, cuBLAS} factory ..." errors that I'm trying to resolve. At the moment, I'm trying to implement my version of your bashrc file, but yours seems incomplete, as it appears to have been cut off on some lines (see the '>' characters in your post). Would you mind clarifying the "NVIDIA Driver and CUDA Version" section?

But I noticed that you're pointing to the cuDNN library in your anaconda path instead of the system path. I don't use anaconda, but I do use a Python environment, so I'll try that. But shouldn't Python already search those environment paths first anyway?

Thanks again for your help!

I would like clarification on this as well, but from what I found, "Unable to register {cuDNN, cuFFT, cuBLAS} factory: Attempting to register factory for plugin {cuDNN, cuFFT, cuBLAS} when one has already been registered" is a red herring. My code would still run on the GPU with CUDA cores in use, and I couldn't find a solution, so I'm assuming getting them registered "properly" isn't necessary. (It also sounds like they are registered, but maybe not the version TF is looking for, I don't know. Or maybe they're already registered correctly and it's trying to register them again, but a check stops it. Again, I don't know.) I'm not an expert, so if someone has more knowledge on this, please correct me.
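
One way to double-check the GPU really is doing the work despite those messages (a small sketch; tf.debugging.set_log_device_placement just makes TF print which device each op runs on):

import tensorflow as tf

tf.debugging.set_log_device_placement(True)  # log the device of every op
a = tf.random.uniform((1000, 1000))
b = tf.random.uniform((1000, 1000))
c = tf.matmul(a, b)  # should log an op placed on .../device:GPU:0
print(c.device)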

I wrote this for myself so I could remember how to do this all again… maybe it will help.
I don't think I had to do anything with bashrc or environment variables in the end. I think the installers take care of that for you (but I'm not an expert).
Steps, in order, for setting up WSL2 (for GPU usage):
What I used:
WSL:
WSL version: 2.0.9.0
Kernel version: 5.15.133.1-1
WSLg version: 1.0.59
MSRDC version: 1.2.4677
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.19045.3996

    Ubuntu:
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description: Ubuntu 22.04.3 LTS
    Release: 22.04
    Codename: jammy

    Python: 3.11.5
    CUDA-Toolkit: 12.0     (I think I could have used 12.2, but at this time (2/20/2024) 12.3 wasn't compatible)
    cuDNN: 8.9.7.29
    tensorRT: 8.6.1

install GPU Driver on Windows side
install WSL2 (you need WSL2, *NOT* WSL1)
install Ubuntu 22.04
look at TensorRT for which versions of CUDA-toolkit and cuDNN you need

in Ubuntu 22.04:
install the WSL-Ubuntu version of the CUDA-toolkit (this one doesn't include a GPU driver for Linux; we already installed the GPU driver on Windows. BAD: if you use the non-WSL version, i.e. the plain Ubuntu one, it includes a Linux GPU driver that will overwrite the Windows GPU driver we want to keep... or something like that, from what I read)
install cuDNN
install tensorRT
install conda
add these packages to conda (I think this is all, but I'm not sure I remember):
try to upgrade pip:
    python -m pip install --upgrade pip
pip install tensorflow
pip install borb
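
After these steps, a quick sanity check from inside Ubuntu (nvidia-smi is exposed to WSL2 by the Windows driver):

nvidia-smi
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"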

Other notes:

  • 1/11/2024: tensorflow needs python <= 3.11 … if using Windows PowerShell, do 'py -3.11 -m pip install packagename'

  • pip install tensorflow[and-cuda] # is supposed to include the CUDA dependencies; however, it's still better to install CUDA, cuDNN, and TensorRT manually (mostly because you need TensorRT, which isn't included in [and-cuda]). !! CHECK THIS FIRST !! TensorRT defines the compatible versions of CUDA and cuDNN. Then use 'pip install tensorflow' (without [and-cuda]). (Yes, you need TensorRT for the best GPU optimization.)

  • these seem to not matter:
    2024-02-20 10:21:35.340121: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
    2024-02-20 10:21:35.340173: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
    2024-02-20 10:21:35.370841: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
    2024-02-20 10:21:35.440379: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
    To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.

    Also any printout related to NUMA

  • if you want to know what the current installed packages are
    pip list

  • os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' # or any of {'0', '1', '2'} # this removes all logs… probably best to set it to '0' (or comment it out completely) while setting up your environment; see the sketch after this list
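
A minimal sketch of that last note; the key detail is that the variable must be set before tensorflow is imported, or it has no effect:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # 0 = all logs, 1 = hide INFO, 2 = also hide WARNING, 3 = also hide ERROR
import tensorflow as tf  # import only after setting the variable

print(tf.config.list_physical_devices('GPU'))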

cuDNN is system-wide, so like TensorRT you have to install it and point to it in the bash config.

cuDNN also has a nice installer now, so it's easy.

Anything specific you did in conda for this?

I created a new environment in conda, then upgraded pip and used pip to install tensorflow. You don't need borb; that's just something I was using to make PDF files. Did that answer your question?

Yes it did; I just wanted to confirm whether it was standard stuff or some specific steps for adding packages to conda.
I got it working with CUDA 12.1; I didn't go with 12.2 or later because TensorRT's latest supported version on the support matrix is still 12.1 as of 2/29. Also, I was confused about whether I should do the PyPI install for TensorRT or the Deb; eventually I did the Deb and didn't need to do any PATH config.

I think I got similar warnings to what you had listed, and I'm going to ignore them.

python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

2024-03-01 10:10:55.283322: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-03-01 10:10:55.430548: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-03-01 10:10:55.430589: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-03-01 10:10:55.450826: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-03-01 10:10:55.494728: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-01 10:10:56.935453: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-01 10:10:56.971304: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-03-01 10:10:56.971347: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


You are probably working in WSL, whose kernel has no NUMA support.
If you run on native Ubuntu 22.04, for example, NUMA is active.
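
For anyone curious, this is the file TF is trying to read; you can check it by hand, using the PCI address from the logs above:

cat /sys/bus/pci/devices/0000:01:00.0/numa_node
# On native Linux this typically prints a node number (or -1); under the WSL2 kernel the file is usually absent.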