Problem running TensorFlow BERT tutorial on GPU

Hello everybody.

I was trying to run the TensorFlow BERT tutorial for text classification (Classify text with BERT  |  Text  |  TensorFlow), but it does not want to run on my GPU. Has anyone faced the same issue?

Thank you.

Hi @Stefano_Di_Pietro , I am training the model with GPU enabled in Colab (in the cloud) without issues (notebook: Google Colab → Edit > Notebook settings > GPU).

So, this may be specific to your setup. Have you had issues with training other TensorFlow 2.x models on your GPU? Can you share your OS, GPU model, cuDNN/CUDA versions?

Hardware requirements: GPU support  |  TensorFlow
Software requirements: GPU support  |  TensorFlow
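
If it helps, you can collect most of those details from Python. A minimal sketch (it assumes nvidia-smi is on your PATH; the query flags are standard nvidia-smi options):

import platform
import subprocess
import tensorflow as tf

# OS and TensorFlow version
print("OS:", platform.platform())
print("TensorFlow:", tf.version.VERSION)

# GPU model and driver version, as reported by the NVIDIA driver
print(subprocess.check_output(
    ["nvidia-smi", "--query-gpu=name,driver_version",
     "--format=csv,noheader"]).decode().strip())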


Thank you @8bitmp3 for the reply. I have already run plenty of TensorFlow models on my computer.

This model is running on Ubuntu 20.10.
The graphics card is a GeForce GTX 1060, driver version 460.80.
CUDA version: 11.2
cuDNN: 7.6.5

What kind of error do you have?

I don’t have any error, but the model is not using my GPU.

What is the value of this in your setup:

https://www.tensorflow.org/api_docs/python/tf/test/is_gpu_available
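
For example, a minimal check (note that tf.test.is_gpu_available is deprecated in TF 2.x in favor of tf.config.list_physical_devices, so it can be worth printing both):

import tensorflow as tf

# Deprecated but still functional in TF 2.x; returns True/False
print(tf.test.is_gpu_available())

# Preferred API: lists the GPUs TensorFlow can actually see
print(tf.config.list_physical_devices('GPU'))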

Oh my god … it says False.
But I see a Python process under Anaconda (I’m using Anaconda, I forgot to mention it):

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       909      G   /usr/lib/xorg/Xorg                456MiB |
|    0   N/A  N/A      1505      G   /usr/bin/kded5                     24MiB |
|    0   N/A  N/A      1510      G   /usr/bin/kwin_x11                  45MiB |
|    0   N/A  N/A      1558      G   /usr/bin/plasmashell               24MiB |
|    0   N/A  N/A      1622      G   /usr/lib/firefox/firefox           11MiB |
|    0   N/A  N/A      2066      G   …AAAAAAAAA= --shared-files        128MiB |
|    0   N/A  N/A     75968      G   …AAAAAAAAA= --shared-files         30MiB |
|    0   N/A  N/A    242507      C   …conda3/envs/tf/bin/python         61MiB |
+-----------------------------------------------------------------------------+

For TF v2.5 the minimum cuDNN version appears to be 8.1 (GPU support  |  TensorFlow). Do you think this may be causing the issue, or are you running your setup with TF v2.4 or lower?
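
One quick check (a sketch; tf.sysconfig.get_build_info() is available in recent TF 2.x releases, I believe since 2.3) is to see which CUDA/cuDNN versions your installed TensorFlow wheel was built against:

import tensorflow as tf

# These are the versions the wheel was compiled against, not what is
# installed on the system; a mismatch here often explains a False
# result from the GPU availability check above.
info = tf.sysconfig.get_build_info()
print("CUDA:", info.get("cuda_version"))
print("cuDNN:", info.get("cudnn_version"))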

Alternatively, maybe trying TF v2.5 in a Docker container (as @bhack might recommend) can help resolve this issue: Docker  |  TensorFlow.

Actually, my Anaconda environment was somehow broken. I tried with a newly created one and it works fine.
I will try to figure it out.
Thank you for the help.
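
For anyone who hits the same problem: a quick sanity check for a fresh environment is to enable device placement logging and run a small op pinned to the GPU (a minimal sketch; with TF 2.x's default soft placement it silently falls back to CPU if no GPU is visible, but the log will show where the matmul actually ran):

import tensorflow as tf

# Log where each op is placed; the output appears on stderr
tf.debugging.set_log_device_placement(True)

with tf.device('/GPU:0'):
    a = tf.random.normal([2, 2])
    b = tf.matmul(a, a)
print(b)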

Hi, try with this:

Tutorial