How to manage GPU memory allocation properly

Hello all,

I have been running into issues with TensorFlow 2.6.1 where my GPU memory is exhausted by relatively small models. I have found plenty of documentation on using GPUs and limiting their memory allocation, but those approaches haven't worked for me, and they felt more like patchwork solutions anyway.

How can I begin writing more GPU-optimal code for TensorFlow? What parameters determine the amount of memory used? Is GPU memory allocated in full at the start, or is it allocated per batch as needed? Is memory then reallocated?

The error occurs at the beginning of training, with a model that has four Conv2D layers followed by three fully connected layers. Around 300 images are used, each 512×512 with 3 channels.

I'm training on my home computer, which has 64 GB of system RAM and an Nvidia RTX 3070 Ti with 8 GB of VRAM.
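For reference, here is a quick back-of-the-envelope calculation of how much memory the raw dataset occupies once decoded to float32 (images are usually uint8 on disk, so a quarter of this, but most pipelines convert to float32 before training):

```python
# Back-of-the-envelope: memory footprint of the dataset described above,
# assuming all 300 images are held in memory as float32 at once.
num_images = 300
height, width, channels = 512, 512, 3
bytes_per_float32 = 4

dataset_bytes = num_images * height * width * channels * bytes_per_float32
print(f"{dataset_bytes / 2**30:.2f} GiB")  # ≈ 0.88 GiB
```

So the data alone is close to a gigabyte in float32, before any weights, activations, or gradients are accounted for.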

I'm eager to get answers to these questions and to learn more about this area. Let me know if you need any more information from me in order to assist.


Have you already tried the following?
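By default, TensorFlow reserves nearly all of the GPU's memory at startup, regardless of model size. A sketch of the two standard mitigations, assuming TF 2.x (note that either must run before the GPU is first used):

```python
# Option 1: let TensorFlow allocate GPU memory on demand ("memory growth")
# instead of grabbing almost all VRAM up front.
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    # Must be called before any op initializes the GPU.
    tf.config.experimental.set_memory_growth(gpu, True)

# Option 2: put a hard cap on how much VRAM TensorFlow may claim
# (the 6144 MiB limit here is just an example value):
# tf.config.set_logical_device_configuration(
#     gpus[0],
#     [tf.config.LogicalDeviceConfiguration(memory_limit=6144)],
# )
```

Memory growth answers part of your question: with it enabled, allocation happens incrementally as the model and batches need it, rather than in full at the start.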

Also, please check your model's memory requirements for your specific input size:
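A rough way to do that is to estimate the activation memory per layer, since with 512×512 inputs the feature maps, not the weights, usually dominate. The filter counts, "same" padding, and 2×2 pooling below are assumptions; substitute your actual architecture and batch size:

```python
# Rough estimate of forward-pass activation memory for a conv stack.
# Assumptions: float32 activations, 'same' padding, and 2x2 pooling
# halving the spatial dimensions after each conv layer.
BYTES_PER_FLOAT32 = 4

def conv_activation_bytes(h, w, channels, batch_size):
    """Memory held by one conv layer's output feature map."""
    return h * w * channels * batch_size * BYTES_PER_FLOAT32

def estimate_activations(input_hw=512, conv_filters=(32, 64, 128, 256),
                         batch_size=32):
    """Sum activation memory across the conv layers."""
    total = 0
    h = w = input_hw
    for filters in conv_filters:
        total += conv_activation_bytes(h, w, filters, batch_size)
        h //= 2  # pooling halves height
        w //= 2  # pooling halves width
    return total

if __name__ == "__main__":
    gib = estimate_activations() / 2**30
    print(f"~{gib:.2f} GiB of forward activations at batch size 32")
    # ≈ 1.88 GiB; training roughly doubles this for gradients.
```

With these example numbers, the forward pass alone needs almost 2 GiB at batch size 32, and backprop stores gradients on top of that, so an 8 GB card fills up quickly at 512×512. Dropping the batch size (or the input resolution) is usually the most effective lever.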