Proper use of Keras ImageDataGenerator: Create Masks for Segmentation and sample_weight parameter

Hello all,

I want to do image data augmentation for a semantic segmentation task. I want to use the ImageDataGenerator from Keras together with the flow() method, because my data is in NumPy arrays and does not need to be loaded from a folder. Since this is a segmentation task, I need to augment each image and its corresponding mask in the same way. I do this by following the last example in the ImageDataGenerator API reference, using two separate generators for image and mask with the same data_gen_args. I only want to rotate, flip and shift my images, so I use the arguments rotation_range, width_shift_range, height_shift_range, horizontal_flip and vertical_flip.
I expect to get masks that are 8-bit images of shape (128,128,1), like the input mask, and that contain only the classes of the input mask (all integer values). This is exactly where the problem lies: the masks I get are 32-bit floats and contain non-integer values. Even when I specify the argument dtype="uint8", the code still returns only float32 masks. I have not found an example that fixes this problem. Is there a trick that can be used?

Another problem with the ImageDataGenerator concerns sample_weight. My dataset is quite unbalanced, so I would like to use sample weights. In a segmentation task, I think the sample_weight parameter of the flow() method would have to be another mask that contains, for each pixel, the class_weight of that pixel's class in the original mask. If I do it this way, I do get sample_weight back, but these weights, like the masks, do not seem to be correct either, because my UNet no longer trains well with them. For now I use a third ImageDataGenerator just for the sample_weight, and training works better that way, but I doubt this is the right approach. I have not found an example of the correct usage, so I hope the community can help me with their experience.

Thank you.

Kind regards,
Jonas

Hi Jonas

ImageDataGenerator has been superseded by Keras Preprocessing Layers for data preprocessing, to be used together with the tf.data API. However, at this time, you cannot yet do joint preprocessing of the image and mask using Keras Preprocessing Layers so I cannot recommend that route yet.

In my experience, the following data augmentation frameworks support image segmentation use cases directly:

Your best option for now is to use one of these libraries and then format your dataset as a Python generator (or as a tf.data.Dataset through tf.data.Dataset.from_generator).
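To make the generator idea concrete, here is a minimal sketch that uses only NumPy rather than any particular augmentation library. It is an illustration under assumptions, not a recommended implementation: the names augment_pair and segmentation_generator are hypothetical, images are assumed to be square (H, W, C) arrays, masks (H, W, 1) integer arrays, and class_weights a list indexed by class id. It also restricts itself to flips and 90-degree rotations, because those only reorder pixels; arbitrary-angle rotations and sub-pixel shifts need interpolation, which can blend neighbouring class labels into non-integer values.

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Apply the same random 90-degree rotation and flips to image and mask."""
    k = int(rng.integers(0, 4))           # number of quarter turns
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    if rng.random() < 0.5:                # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                # vertical flip
        image, mask = image[::-1], mask[::-1]
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


def segmentation_generator(images, masks, class_weights, batch_size, seed=0):
    """Yield (image_batch, mask_batch, weight_batch) tuples forever."""
    n = len(images)
    if batch_size > n:
        raise ValueError("batch_size must not exceed the number of samples")
    rng = np.random.default_rng(seed)
    # Lookup table: class id -> weight, used to build per-pixel weights.
    weight_lut = np.asarray(class_weights, dtype=np.float32)
    while True:
        order = rng.permutation(n)
        for start in range(0, n - batch_size + 1, batch_size):
            xs, ys = [], []
            for i in order[start:start + batch_size]:
                x, y = augment_pair(images[i], masks[i], rng)
                xs.append(x)
                ys.append(y)
            x_batch = np.stack(xs).astype(np.float32)
            y_batch = np.stack(ys)                 # dtype of the input masks is kept
            w_batch = weight_lut[y_batch[..., 0]]  # weights follow the augmented mask
            yield x_batch, y_batch, w_batch
```

Because the weight map is derived from the already augmented mask, it cannot drift out of alignment with it, so no separate generator for the weights is needed. A generator like this could then be wrapped with tf.data.Dataset.from_generator and a matching output_signature; the weight shape your loss expects may differ from the (batch, H, W) used here.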

The limitation of these approaches is that they do the data transformations in Python rather than in TF operations, and therefore the transformations cannot be saved to a SavedModel and deployed in production.

Until we have a segmentation-compatible Keras Preprocessing Layer implemented with TF ops, I advise you to special-case inference in your model setup. You can use Python libraries for data preprocessing during training and evaluation, but implement the minimal necessary inference-time data transformations (JPEG decompression, size, scale, …) using TF functions and Keras Preprocessing Layers, for example tf.io.decode_image and tf.keras.layers.Resizing.
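As a rough sketch of what such an inference-only path could look like, the snippet below decodes an encoded image, scales it and resizes it using TF ops only. The function name preprocess_for_inference, the 128x128 target size, the three input channels and the division by 255 are assumptions made for illustration; they should mirror whatever the training pipeline actually did.

```python
import tensorflow as tf

IMG_SIZE = 128  # assumed target size, matching the (128, 128, 1) masks above

# Keras preprocessing layer used for the resize step.
resize = tf.keras.layers.Resizing(IMG_SIZE, IMG_SIZE)


def preprocess_for_inference(image_bytes):
    """Decode an encoded image (JPEG, PNG, ...) and return a (1, H, W, 3) float batch."""
    # expand_animations=False keeps the result 3-D even for GIF input.
    image = tf.io.decode_image(image_bytes, channels=3, expand_animations=False)
    image = tf.cast(image, tf.float32) / 255.0   # assumed scaling to [0, 1]
    return resize(image[tf.newaxis, ...])        # add a batch axis, then resize
```

Keeping this function free of Python-side libraries is the point of the split: the training-time augmentation can stay in Python, while the steps needed at serving time remain expressible as TF operations.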


Do we have a small example at: