I want to do image data augmentation for a semantic segmentation task. I want to use Keras's ImageDataGenerator together with the flow() method, because my data is already in NumPy arrays and does not need to be loaded from a folder. Since this is a segmentation task, I need to augment each image and its corresponding mask in the same way. I do this by following the last example in the API reference (ImageDataGenerator), using two separate generators for the images and the masks with the same data_gen_args. I only want to rotate, flip, and shift my images, so I use the arguments rotation_range, width_shift_range, height_shift_range, horizontal_flip, and vertical_flip.
I expect to get masks that are 8-bit images of shape (128, 128, 1), like the input masks, and that contain only the classes present in the input masks (all integer values). This is exactly where the problem lies: the masks I get are 32-bit floats that do not contain integer values at all. Even when I specify the argument dtype="uint8", the code returns only float32 masks. I have not found an example that fixes this problem. Is there a trick that can be used?
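To make the symptom concrete, here is a minimal NumPy-only sketch of one possible workaround. It does not call Keras. The non-integer values most likely come from interpolation during rotation and shifting, so the sketch snaps each pixel back to the nearest valid class id and casts to uint8. The helper name `restore_integer_masks` and the `valid_classes` argument are hypothetical, not part of any Keras API. This repairs only the dtype and the value set: a boundary pixel blended between two classes can still end up with the wrong label. Recent Keras versions document an `interpolation_order` argument on ImageDataGenerator; setting it to 0 (nearest neighbour) for the mask generator may avoid the blending in the first place, if the installed version supports it.

```python
import numpy as np


def restore_integer_masks(mask_batches, valid_classes):
    """Yield mask batches cast back to uint8 (hypothetical helper, not a Keras API).

    mask_batches: any iterator of float arrays, e.g. the mask generator.
    valid_classes: class ids that occur in the original masks.
    """
    valid = np.asarray(sorted(valid_classes), dtype=np.float32)
    for batch in mask_batches:
        # Distance of every pixel to every valid class id, then pick the closest.
        nearest = np.abs(batch[..., np.newaxis] - valid).argmin(axis=-1)
        yield valid[nearest].astype(np.uint8)


# Stand-in for one augmented float32 mask batch of shape (1, 2, 2, 1).
fake_batches = iter(
    [np.array([[[[0.0], [0.4]], [[1.6], [2.0]]]], dtype=np.float32)]
)
fixed = next(restore_integer_masks(fake_batches, valid_classes=[0, 1, 2]))
print(fixed.dtype, fixed.ravel())  # uint8 [0 0 2 2]
```

In the real pipeline the mask generator would take the place of `fake_batches`, with the image generator zipped alongside it as before.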
Another problem with the ImageDataGenerator concerns sample_weight. My dataset is quite unbalanced, so I would like to use sample weights. In a segmentation task, I think the sample_weight parameter of the flow() method would have to be another mask that holds, for each pixel, the class_weight of that pixel's class in the original mask. If I do it this way, flow() does return sample_weight as well, but these weights seem to be incorrect, much like the masks, because my UNet no longer trains well with them. For now I use a third ImageDataGenerator only for the sample_weight, and training works better that way, but I doubt this is the right approach. I have not found an example of the correct usage, so I hope the community can help me with their experience.
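The per-pixel weight mask described above can be sketched as a simple lookup. This is a NumPy-only illustration under stated assumptions: `class_weight_map` is a hypothetical helper, and the weight values are made up. One thing worth checking is whether flow() applies the random transform to sample_weight at all; from reading the iterator code, it appears to index sample_weight per batch without transforming it, which would leave the weights misaligned with the augmented masks. An alternative that may avoid this is to augment only images and masks, and then derive the weight map from each augmented integer mask.

```python
import numpy as np


def class_weight_map(mask, class_weight):
    """Return a float32 array holding the weight of each pixel's class.

    mask: integer array of class ids, e.g. shape (N, 128, 128, 1).
    class_weight: dict {class_id: weight}; the values used below are illustrative.
    """
    lookup = np.zeros(max(class_weight) + 1, dtype=np.float32)
    for class_id, weight in class_weight.items():
        lookup[class_id] = weight
    # Integer-array indexing replaces every class id by its weight.
    return lookup[mask]


mask = np.array([[0, 1], [2, 1]], dtype=np.uint8)
weights = class_weight_map(mask, {0: 0.5, 1: 2.0, 2: 4.0})
print(weights)
```

The lookup requires an integer mask, so it would run after the augmented masks have been converted back to integer class ids.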