Dataflow for image preprocessing & training with GPU optimization

Hi, hopefully this is a common problem with a common practice solution. In summary, what is the best way to preprocess images and train a neural network with TF/Keras such that GPU usage is optimized during training? Assume the image dataset does not fit into memory.

The current implementation uses ImageDataGenerators for on-the-fly preprocessing during model training, but the CPU is maxed out and the GPU is hardly used. Is this due to the ImageDataGenerators? If so, what is the typical way to get around this?

The images are JPEGs and the preprocessing consists of resizing to 150 x 150 and dividing the pixel values by 255. One approach would be to do the preprocessing separately on the CPU and store the results to disk. That would be fine. But what would be the storage format? E.g. NumPy arrays in a CSV? And what would be the way to stream the preprocessed images into the fit() method so that GPU usage is optimized and not bottlenecked by the CPU? E.g. a tf.data Dataset?
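For context, here is a rough sketch of the kind of tf.data pipeline I have in mind as an alternative to ImageDataGenerator, decoding, resizing and rescaling in parallel and prefetching so the GPU isn't starved. The file-path/label lists, batch size and model name are placeholders, not from an actual implementation:

```python
import tensorflow as tf

IMG_SIZE = 150

def load_and_preprocess(path, label):
    # Read and decode the JPEG, then apply the same preprocessing as before.
    image = tf.io.read_file(path)
    image = tf.io.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
    image = image / 255.0  # rescale pixel values to [0, 1]
    return image, label

# image_paths and labels are placeholder lists of JPEG paths and targets.
ds = tf.data.Dataset.from_tensor_slices((image_paths, labels))
ds = ds.shuffle(buffer_size=1000)
ds = ds.map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
ds = ds.batch(32)
ds = ds.prefetch(tf.data.AUTOTUNE)  # overlap CPU preprocessing with GPU training

# model.fit(ds, epochs=10)
```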

For what it is worth, this is all on Kaggle.

All thoughts are welcome - thanks!

We are collecting many image preprocessing layers at:

More generally, on CPU preprocessing and model compute occupancy, see our thread at:
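As a quick illustration of what such preprocessing layers look like when folded into the model itself (so resizing and rescaling run as part of the graph), here is a minimal sketch; the surrounding architecture is made up for illustration, and in recent TF 2.x these layers live under tf.keras.layers (older releases used tf.keras.layers.experimental.preprocessing):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Resizing(150, 150),     # resize inside the model
    tf.keras.layers.Rescaling(1.0 / 255),   # rescale pixel values to [0, 1]
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```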

Update before closing this one.

Resizing the images prior to training cut the training time approximately in half even though ImageDataGenerator is still used for rescaling the pixel values.
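For anyone landing here later, a rough sketch of that setup: resize once up front (Pillow used here for illustration) and keep ImageDataGenerator only for the 1/255 rescaling. Directory names and hyperparameters are placeholders:

```python
from pathlib import Path

from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

SRC_DIR = Path("train_jpegs")     # original images (placeholder path)
DST_DIR = Path("train_resized")   # resized copies, written once before training

# One-off resize pass on the CPU, done before training starts.
for src in SRC_DIR.rglob("*.jpg"):
    dst = DST_DIR / src.relative_to(SRC_DIR)
    dst.parent.mkdir(parents=True, exist_ok=True)
    Image.open(src).convert("RGB").resize((150, 150)).save(dst, "JPEG")

# ImageDataGenerator now only rescales; no per-epoch resizing work remains.
datagen = ImageDataGenerator(rescale=1.0 / 255)
train_gen = datagen.flow_from_directory(
    DST_DIR, target_size=(150, 150), batch_size=32, class_mode="binary"
)
# model.fit(train_gen, epochs=10)
```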

The CPU is still maxed out and bottlenecking the GPU, but further optimization of the dataflow is not necessary given the improvement that has already been achieved.