Building from source

I am trying to build TensorFlow from source on Ubuntu 20.04 LTS. I need to build from source because I am running on a Westmere CPU: the binaries available through the package manager are built with AVX instructions, which Westmere doesn’t support.
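
A quick way to confirm the AVX limitation before building is to read the CPU flags the kernel reports. The sketch below assumes a Linux `/proc/cpuinfo` layout; `cpu_flags` and `SAMPLE` are illustrative names, and the sample flag line is abbreviated and made up for the example, not copied from a real Westmere machine.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the flag set from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


# Abbreviated, made-up sample in the /proc/cpuinfo format (assumption).
SAMPLE = "model name\t: Example CPU\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 aes\n"

if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    # Fall back to the sample when /proc/cpuinfo is not available.
    text = path.read_text() if path.exists() else SAMPLE
    print("AVX supported:", "avx" in cpu_flags(text))
```

If `avx` is missing from the flags, a wheel built with AVX would be expected to fail on that machine, which matches the reason given above for building from source.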

My procedure:

Cloned the repository from GitHub (tensorflow/tensorflow)
Checked out the r2.11 branch (yes, I know it isn’t the latest; it needs to match Keras and is good enough)
Installed Bazel
Ran configure, selected no for GPU
Compiled it: bazel build --copt="-march=westmere" //tensorflow/tools/pip_package:build_pip_package

This builds successfully.

Next I tried to rebuild, this time with GPU support enabled.
I used conda to create a new environment and used pip to install the CUDA libraries into it.
I am running driver version 535 with CUDA version 12.2.

I re-ran configure, this time choosing to enable GPU support and pointing it to the library path inside my conda environment. This is how I answered the CUDA questions:

Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 11]: 12 (I also tried 12.2)
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 2]: (chose the default)

Please specify the locally installed NCCL version you want to use. [Leave empty to use http://github.com/nvidia/nccl]: (chose the default)

Please specify the comma-separated list of base paths to look for CUDA libraries and headers. [Leave empty to use the default]: /home/myname/anaconda3/envs/tfgpu/include,/home/myname/anaconda3/envs/tfgpu/lib (order seems to be important)

I then get this error:
Could not find any nvcc matching version '12.2' in any subdirectory:
''
'bin'
'local/cuda/bin'
of:
'/home/myname/anaconda3/envs/tfgpu/include'
'/home/myname/anaconda3/envs/tfgpu/lib'

How do I install nvcc? I couldn’t find a pip or apt package for it.
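
The error above lists the subdirectories that were tried under each base path. A minimal sketch of that kind of search shows why base paths ending in `include` and `lib` cannot match, while an environment root containing `bin/nvcc` would. `find_nvcc` and `demo` are hypothetical helpers written for illustration, not TensorFlow's actual configure code; the subdirectory list is taken from the error message, and the demo builds a throwaway directory rather than touching a real conda environment.

```python
import tempfile
from pathlib import Path

# Subdirectories copied from the error message above.
NVCC_SUBDIRS = ["", "bin", "local/cuda/bin"]


def find_nvcc(base_paths):
    """Return every existing <base>/<subdir>/nvcc, in search order."""
    hits = []
    for base in base_paths:
        for sub in NVCC_SUBDIRS:
            candidate = Path(base) / sub / "nvcc"
            if candidate.is_file():
                hits.append(candidate)
    return hits


def demo():
    """Build a throwaway prefix with bin/nvcc and compare two base-path choices."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp)
        for name in ("bin", "include", "lib"):
            (prefix / name).mkdir()
        (prefix / "bin" / "nvcc").write_text("#!/bin/sh\n")  # placeholder file
        narrow = find_nvcc([prefix / "include", prefix / "lib"])
        wide = find_nvcc([prefix])
        return len(narrow), len(wide)


if __name__ == "__main__":
    print(demo())
```

Under these assumptions, passing the environment root (here /home/myname/anaconda3/envs/tfgpu) as the base path is what lets a search like this reach `bin/nvcc`. Whether the packages installed into the environment ship nvcc at all is a separate question.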


Hi @lazarus_long

Welcome to the TensorFlow Forum!

There is a version mismatch between the CUDA, cuDNN and NVIDIA driver versions installed on your system and the versions this TensorFlow release needs for GPU support. Please check this tested build configuration to find the correct CUDA and cuDNN versions for your TensorFlow version, install them, and set the paths to these libraries to enable GPU support.

You can follow the same link to build TensorFlow from source, or check the TF install page to install TF with GPU support in a conda environment.

Please try again and let us know if the issue still persists. Thank you.


I am using TF 2.11, so it looks like I need CUDA 11.2 and cuDNN 8.1.

I created a new conda environment:
conda create -n tfgpu python=3.7

I installed cuda:
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0

I ran configure.py.

I selected these options:
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 11]: 11

Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 2]: 8

I still get errors:
Could not find any cuda.h matching version '11' in any subdirectory:
''
'include'
'include/cuda'
'include/*-linux-gnu'
'extras/CUPTI/include'
'include/cuda/CUPTI'
'local/cuda/extras/CUPTI/include'
of:
'/home/myname/anaconda3/envs/tfgpu/include'
'/home/myname/anaconda3/envs/tfgpu/lib'

I checked /home/myname/anaconda3/envs/tfgpu/include and there is no cuda.h. I thought installing cudatoolkit would install it. What do I need to install?
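
The phrase "matching version" in the error suggests the header is read for its version once it is found. In CUDA's `cuda.h` the version is encoded in a `#define CUDA_VERSION` line as major * 1000 + minor * 10. The sketch below decodes that value; `parse_cuda_version` is an illustrative name, the one-line sample header is written for the example, and nothing here claims to reproduce what configure.py does internally.

```python
import re


def parse_cuda_version(header_text):
    """Return (major, minor) from a '#define CUDA_VERSION NNNNN' line, or None."""
    match = re.search(
        r"^\s*#\s*define\s+CUDA_VERSION\s+(\d+)", header_text, re.MULTILINE
    )
    if match is None:
        return None
    value = int(match.group(1))
    # CUDA encodes the version as major * 1000 + minor * 10.
    return value // 1000, (value % 1000) // 10


# One-line stand-in for the real header (assumption for the example).
SAMPLE_HEADER = "#define CUDA_VERSION 11020\n"

if __name__ == "__main__":
    print(parse_cuda_version(SAMPLE_HEADER))
```

A check like this only helps once a toolkit that includes headers is installed. The missing cuda.h reported above is consistent with a package that provides runtime libraries only, though that is an inference from the symptom rather than something verified here.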

@lazarus_long, could you please verify that the hardware/software requirements mentioned in this link are satisfied, and also ensure that all the step-by-step instructions have been followed correctly?

I’ve since moved on to Ubuntu 22.04 and found that I can move to TensorFlow 2.13, so that is what I’m trying to set up now. I was able to install cuda-toolkit 11.8, but cudnn was not found. I need the following:

Version           | Python version | Compiler     | Build tools | cuDNN | CUDA
tensorflow-2.13.0 | 3.8-3.11       | Clang 16.0.0 | Bazel 5.3.0 | 8.6   | 11.8
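
The version pairs quoted so far in this thread can be written down as a small lookup, which makes it easy to compare what is installed against what a given TensorFlow release was tested with. This is only a sketch: `TESTED_CONFIGS` holds just the two rows mentioned in this thread (the full table is on the TensorFlow build-from-source page), and `mismatches` is a hypothetical helper that compares major.minor only.

```python
# Values copied from this thread only; the full table lives in the TensorFlow docs.
TESTED_CONFIGS = {
    "2.11": {"cuda": "11.2", "cudnn": "8.1"},
    "2.13": {"cuda": "11.8", "cudnn": "8.6"},
}


def mismatches(tf_version, installed):
    """Return {component: (expected, found)} where major.minor differs."""
    expected = TESTED_CONFIGS[tf_version]
    problems = {}
    for component, want in expected.items():
        have = installed.get(component)
        # Compare on major.minor so that "8.6.0" still matches "8.6".
        have_mm = ".".join(have.split(".")[:2]) if have else None
        if have_mm != want:
            problems[component] = (want, have)
    return problems


if __name__ == "__main__":
    # First attempt in this thread: TF 2.11 with CUDA 12.2, cuDNN version unknown.
    print(mismatches("2.11", {"cuda": "12.2"}))
```

An empty result means the installed versions agree with the tested row; anything else names the component to fix first.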

My machine meets the hardware requirements. I have a GeForce GTX 1060 (Pascal architecture). The following test runs successfully:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

Windowed mode
Simulation data stored in video memory
Single precision floating point simulation
1 Devices used for simulation
GPU Device 0: "Pascal" with compute capability 6.1

Compute 6.1 CUDA device: [NVIDIA GeForce GTX 1060 6GB]
10240 bodies, total time for 10 iterations: 7.831 ms
= 133.909 billion interactions per second
= 2678.175 single-precision GFLOP/s at 20 flops per interaction
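
The two summary figures in this benchmark output follow from the body count, iteration count and elapsed time. The sketch below redoes that arithmetic, assuming the sample counts every ordered pair of bodies once per iteration, which is consistent with the printed numbers; the function names are illustrative.

```python
def interactions_per_second(bodies, iterations, total_ms):
    """All-pairs interactions (bodies squared per iteration) over elapsed seconds."""
    return bodies * bodies * iterations / (total_ms / 1000.0)


def gflops(interactions_per_s, flops_per_interaction=20):
    """Convert an interaction rate into GFLOP/s."""
    return interactions_per_s * flops_per_interaction / 1e9


if __name__ == "__main__":
    rate = interactions_per_second(10240, 10, 7.831)
    print(f"{rate / 1e9:.3f} billion interactions/s, {gflops(rate):.1f} GFLOP/s")
```

The result differs from the printed values only in the last digits, which is expected because the elapsed time in the output is rounded to three decimals.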

nvidia-smi shows the following:

Sat Oct 14 09:58:10 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1060 6GB    Off | 00000000:08:00.0 Off |                  N/A |
|  0%   38C    P8               7W / 200W |      6MiB /  6144MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1336      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+

I am trying to install my development environment in a conda environment.

I was able to install the cudatoolkit 11.8, but can’t install cudnn 8.6:

conda install -c conda-forge cudnn=8.6.*

PackagesNotFoundError: The following packages are not available from current channels:

  • cudnn=8.6

How do I install cudnn?

I want to compile GPU-enabled TensorFlow inside the Docker container tensorflow/tensorflow:devel-gpu.
My environment is:
Ubuntu 22.04
RTX 4090
Python 3.10.13

Inside the container I followed the guide, which lists for tensorflow-2.15.0: Python 3.9-3.11, Clang 16.0.0, Bazel 6.1.0, cuDNN 8.8. I checked out the release with git checkout tags/v2.15.0 and then ran the build command, which failed:

bazel build --config=cuda //tensorflow_src/tools/pip_package:build_pip_package

The error message:

2023/12/05 07:46:17 Downloading https://releases.bazel.build/6.1.0/release/bazel-6.1.0-linux-x86_64...
INFO: Reading 'startup' options from /tensorflow_src/.bazelrc: --windows_enable_symlinks
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=182
INFO: Reading rc options for 'build' from /tensorflow_src/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /tensorflow_src/.bazelrc:
  'build' options: --define framework_shared_object=true --define tsl_protobuf_header_only=true --define=use_fast_cpp_protos=true --define=allow_oversize_protos=true --spawn_strategy=standalone -c opt --announce_rc --define=grpc_no_ares=true --noincompatible_remove_legacy_whole_archive --features=-force_no_whole_archive --enable_platform_specific_config --define=with_xla_support=true --config=short_logs --config=v2 --define=no_aws_support=true --define=no_hdfs_support=true --experimental_cc_shared_library --experimental_link_static_libraries_once=false --incompatible_enforce_config_setting_visibility
INFO: Reading rc options for 'build' from /tensorflow_src/.tf_configure.bazelrc:
  'build' options: --action_env PYTHON_BIN_PATH=/usr/bin/python3 --action_env PYTHON_LIB_PATH=/usr/lib/python3/dist-packages --python_path=/usr/bin/python3 --config=tensorrt
INFO: Found applicable config definition build:short_logs in file /tensorflow_src/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:v2 in file /tensorflow_src/.bazelrc: --define=tf_api_version=2 --action_env=TF2_BEHAVIOR=1
INFO: Found applicable config definition build:tensorrt in file /tensorflow_src/.bazelrc: --repo_env TF_NEED_TENSORRT=1
INFO: Found applicable config definition build:cuda in file /tensorflow_src/.bazelrc: --repo_env TF_NEED_CUDA=1 --crosstool_top=@local_config_cuda//crosstool:toolchain --@local_config_cuda//:enable_cuda
INFO: Found applicable config definition build:linux in file /tensorflow_src/.bazelrc: --host_copt=-w --copt=-Wno-all --copt=-Wno-extra --copt=-Wno-deprecated --copt=-Wno-deprecated-declarations --copt=-Wno-ignored-attributes --copt=-Wno-array-bounds --copt=-Wunused-result --copt=-Werror=unused-result --copt=-Wswitch --copt=-Werror=switch --copt=-Wno-error=unused-but-set-variable --define=PREFIX=/usr --define=LIBDIR=$(PREFIX)/lib --define=INCLUDEDIR=$(PREFIX)/include --define=PROTOBUF_INCLUDE_PATH=$(PREFIX)/include --cxxopt=-std=c++17 --host_cxxopt=-std=c++17 --config=dynamic_kernels --experimental_guard_against_concurrent_changes
INFO: Found applicable config definition build:dynamic_kernels in file /tensorflow_src/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
ERROR: Traceback (most recent call last):
	File "/tensorflow_src/WORKSPACE", line 70, column 13, in <toplevel>
		install_deps()
	File "/root/.cache/bazel/_bazel_root/43801f1e35f242fb634ebbc6079cf6c5/external/pypi/requirements.bzl", line 49, column 20, in install_deps
		whl_library(
Error in repository_rule: invalid repository name '@pypi_<': repo names may contain only A-Z, a-z, 0-9, '-', '_', '.' and '~' and must not start with '~'
ERROR: Error computing the main repository mapping: at /tensorflow_src/tensorflow/workspace2.bzl:19:6: at /tensorflow_src/third_party/llvm/setup.bzl:3:6: Encountered error while reading extension file 'utils/bazel/configure.bzl': no such package '@llvm-raw//utils/bazel': error loading package 'external': Could not load //external package
Loading: 

Please help. Thank you.
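
The first error in the log states the naming rule that was violated. The sketch below transcribes that rule into a regular expression so candidate names can be checked quickly. It reflects only what the error message says, not Bazel's full validation; `is_valid_repo_name` is an illustrative helper and `@pypi_numpy` is a made-up name for contrast.

```python
import re

# Character rule transcribed from the Bazel error message above:
# only A-Z, a-z, 0-9, '-', '_', '.' and '~', and no leading '~'.
_REPO_NAME = re.compile(r"(?!~)[A-Za-z0-9._~-]+")


def is_valid_repo_name(name):
    """True if `name` satisfies the quoted rule (a leading '@' is ignored)."""
    return _REPO_NAME.fullmatch(name.lstrip("@")) is not None


if __name__ == "__main__":
    for name in ("@pypi_<", "@pypi_numpy"):
        print(name, is_valid_repo_name(name))
```

The `<` in `@pypi_<` hints that something other than a plain package name reached `whl_library` from the generated requirements.bzl. That is a reading of the traceback, not a confirmed diagnosis.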

I want to add some colour, as I have experienced something similar, and perhaps @Renu Patel can clear up the confusion here. I followed TensorFlow’s compatibility matrix for 2.12.0 and tried to install it with the right version of cuDNN. Please see the output of nvcc:
(tf_gpu) abhimehrish@linux-machine:~/tensorflow$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

However, the output of nvidia-smi shows CUDA Version: 12.3.
It is truncated in the output below:

(tf_gpu) abhimehrish@linux-machine:~/tensorflow$ nvidia-smi
Sun Dec 17 23:30:15 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 ...    On  | 00000000:01:00.0 Off |                  N/A |
| N/A   39C    P8              16W /  80W |     58MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1752      G   /usr/lib/xorg/Xorg                           52MiB |
+---------------------------------------------------------------------------------------+

I don’t understand! I see the same problems. In the past, I have not seen any problems.
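
The two numbers measure different things: nvcc reports the version of the installed CUDA toolkit, while the "CUDA Version" in the nvidia-smi header is the highest CUDA version the installed driver supports. The sketch below pulls both values out of the outputs quoted in this post and compares them; the function names are illustrative, and the parsing assumes the output formats shown here.

```python
import re


def driver_cuda_version(nvidia_smi_text):
    """Return (major, minor) from the 'CUDA Version: X.Y' header of nvidia-smi."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_cuda_version(nvcc_text):
    """Return (major, minor) from the 'release X.Y' line of `nvcc -V`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


# Lines copied from the outputs quoted in this post.
SMI_LINE = "| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |"
NVCC_LINE = "Cuda compilation tools, release 11.8, V11.8.89"

if __name__ == "__main__":
    driver = driver_cuda_version(SMI_LINE)
    toolkit = toolkit_cuda_version(NVCC_LINE)
    # A toolkit no newer than the driver-supported version is the usual expectation.
    print(driver, toolkit, toolkit <= driver)
```

Seen this way, a toolkit at 11.8 alongside a driver that reports 12.3 is not by itself a mismatch, so the build failure probably has another cause.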

One more detail: my Python virtual environment works well not only with the GPU but also with TensorRT 8.6.1. It was only after making sure that the release Python wheels work that I decided to attempt a more optimised custom C++ build. I don’t use Docker.