Out of GPU memory when training CNN with large images

I want to train a CNN for video frame prediction. My input shape is 16 × 480 × 1440 × 3 (sequence length, height, width, channels) and my output shape is 8 × 480 × 1440. My network has about 4M trainable parameters, and when I estimate the GPU memory needed for training the weights it comes to about 5 GB. However, even on a GPU with 40 GB of memory I run out of memory, even with a batch size of 1. I think the problem is the number of samples I am loading for training, since I am not using any input pipeline such as a generator. (P.S. When I use only 100 samples there is no memory problem.) I want to know if my suspicion makes sense and, if so, whether using a data generator would save a considerable amount of GPU memory.

Can someone give me a useful link on how to write a dataloader function? Currently I am trying to use this link to run my function, but the problem with it is that the loss does not improve over the epochs.

Thank you in advance for your help.

There are a lot of things in your application that can consume a lot of memory. Just as you have realised, your dataset may have a huge number of samples, judging from the fact that just 100 samples does not throw any out-of-memory error. You may want to consider reducing the number of training samples, though of course that also means losing some information relevant to your task.
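As a quick check (a minimal sketch, assuming you already have a `tf.data.Dataset` yielding (frames, target) pairs; the sample count below is just a placeholder), you can cap how many samples the pipeline ever produces with `take()` and see whether the memory problem goes away:

```python
import tensorflow as tf

# Assuming `full_ds` is a tf.data.Dataset of (frames, target) pairs.
# take() limits how many samples the pipeline will yield, which is a quick
# way to confirm whether the sample count is what blows up your memory.
subset_ds = full_ds.take(1000)   # placeholder count; tune to your hardware
subset_ds = subset_ds.batch(1)
```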

I would suggest you take a look at your image sizes. Naturally, larger images mean extra processing and more memory consumption, since there is more information to extract. I recommend reducing your image size: start with 512x512 and keep reducing until you reach a point that works for you.
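As a rough illustration (a sketch only, assuming your samples flow through a tf.data pipeline as (frames, target) pairs with the shapes from your post), downscaling can be done inside the pipeline with `tf.image.resize`:

```python
import tensorflow as tf

def resize_pair(frames, target, size=(512, 512)):
    # frames: (seq_len, H, W, 3), target: (out_len, H, W) -- shapes assumed from the post.
    frames = tf.image.resize(frames, size)
    # tf.image.resize expects a channel axis, so add one to the target and drop it again.
    target = tf.image.resize(target[..., tf.newaxis], size)[..., 0]
    return frames, target

# ds = ds.map(resize_pair, num_parallel_calls=tf.data.AUTOTUNE)
```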

Try using an efficient input pipeline, as that will also greatly improve memory usage. Try this notebook I wrote some time ago utilising tf.data.Dataset and TFRecords. It's not pretty, but I hope it gives you some insight into writing an input pipeline.
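In case it helps while you look at the notebook, here is a minimal sketch of such a pipeline using `tf.data.Dataset.from_generator`. The per-sample `.npy` file layout, paths, and shapes are assumptions for illustration, not taken from your setup:

```python
import numpy as np
import tensorflow as tf

# Hypothetical layout: one .npy file per input sequence and per target sequence.
def sample_generator(input_paths, target_paths):
    for x_path, y_path in zip(input_paths, target_paths):
        x = np.load(x_path).astype(np.float32)   # (16, 480, 1440, 3)
        y = np.load(y_path).astype(np.float32)   # (8, 480, 1440)
        yield x, y

def make_dataset(input_paths, target_paths, batch_size=1):
    ds = tf.data.Dataset.from_generator(
        lambda: sample_generator(input_paths, target_paths),
        output_signature=(
            tf.TensorSpec(shape=(16, 480, 1440, 3), dtype=tf.float32),
            tf.TensorSpec(shape=(8, 480, 1440), dtype=tf.float32),
        ),
    )
    # Only a small buffer of samples is held at once; prefetch overlaps
    # loading with training instead of materialising the whole dataset.
    return ds.shuffle(32).batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

Because samples are loaded lazily from disk, only the batches currently in flight occupy memory rather than the entire dataset, which is where most of the savings over loading everything up front comes from.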
