Tensorflow:latest-gpu docker image tf.config not detecting GPU

PROBLEM:

tensorflow:latest-gpu docker image tf.config not detecting GPU

docker run --env TF_ENABLE_ONEDNN_OPTS=0 -it --rm tensorflow/tensorflow:latest-gpu  python -c "import tensorflow as tf; print(\"Num GPUs Available: \", len(tf.config.list_physical_devices('GPU')))"

2023-05-06 09:45:58.721496: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-06 09:45:59.416662: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:266] failed call to cuInit: UNKNOWN ERROR (34)
Num GPUs Available: 0

Docker GPU is detectable:
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu20.04 nvidia-smi
Sat May 6 09:39:11 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf           Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090 L…    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   42C    P0              32W /  N/A |      6MiB / 16376MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=========================================================================================|
+---------------------------------------------------------------------------------------+

running on:
- Asus ROG STRIX 18 with RTX 4090
- Ubuntu 22.04 LTS

specs

  • installed
    • nvidia-driver-530
    • nvidia-cuda-toolkit
    • nvidia-container-toolkit-base
    • docker
    • nvidia-docker2
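
Since nvidia-docker2 / nvidia-container-toolkit are what register the NVIDIA runtime with the Docker daemon, a quick sanity check (a suggested command only, not output from this machine) is:

docker info | grep -i runtime        # the list of runtimes should include "nvidia"
cat /etc/docker/daemon.json          # typically contains a "runtimes": { "nvidia": ... } entry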

diagnostics

hardware:

cat /proc/cpuinfo | grep "model name" -m 1
model name : 13th Gen Intel(R) Core™ i9-13980HX

dmidecode -t 2
SMBIOS 3.5.0
Base Board Information
Manufacturer: ASUSTeK COMPUTER INC.
Product Name: G834JY

lspci | grep -e VGA
0000:00:02.0 VGA compatible controller: Intel Corporation Device a788 (rev 04)
0000:01:00.0 VGA compatible controller: NVIDIA Corporation Device 2757 (rev a1)

nvidia-settings -q CUDACores -t
9728

nvidia-smi
Sat May 6 12:17:10 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf           Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090 L…    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   39C    P0              N/A /  N/A |      6MiB / 16376MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=========================================================================================|
|    0   N/A  N/A      2783      G   /usr/lib/xorg/Xorg                             4MiB |
+---------------------------------------------------------------------------------------+

software:

lsmod | grep nvidia
nvidia_wmi_ec_backlight 16384 0
nvidia_uvm 1433600 0
nvidia_drm 77824 2
nvidia_modeset 1273856 3 nvidia_drm
nvidia 55750656 99 nvidia_uvm,nvidia_modeset
drm_kms_helper 200704 3 drm_display_helper,nvidia_drm,i915
drm 581632 19 drm_kms_helper,drm_display_helper,nvidia,drm_buddy,nvidia_drm,i915,ttm
wmi 32768 4 nvidia_wmi_ec_backlight,asus_wmi,wmi_bmof,mfd_aaeon

docker version
Version: 23.0.5
API version: 1.42
OS/Arch: linux/amd64

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Thu_Nov_18_09:45:30_PST_2021
Cuda compilation tools, release 11.5, V11.5.119
Build cuda_11.5.r11.5/compiler.30672275_0

It seems you are not passing the --gpus all argument in the first command:

docker run --env TF_ENABLE_ONEDNN_OPTS=0 -it --rm tensorflow/tensorflow:latest-gpu python -c "import tensorflow as tf; print(\"Num GPUs Available: \", len(tf.config.list_physical_devices('GPU')))"

Could you add --gpus all to this command and check whether it still finds no GPUs?
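
For reference, this would be the same one-liner with only --gpus all added (keeping TF_ENABLE_ONEDNN_OPTS=0 from the original command):

docker run --gpus all --env TF_ENABLE_ONEDNN_OPTS=0 -it --rm tensorflow/tensorflow:latest-gpu python -c "import tensorflow as tf; print(\"Num GPUs Available: \", len(tf.config.list_physical_devices('GPU')))"

Note that the nvidia-smi check that works above does pass --gpus all, which would explain why the plain CUDA container sees the GPU while the TensorFlow container does not.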